[squeak-dev] Can I create a 75 Gb Image?

gettimothy gettimothy at zoho.com
Mon Oct 18 15:52:37 UTC 2021


Hi Marcel,



Thanks for the reply.



I have no clue about what happens regarding memory/disk. That the entire Smalltalk image runs in memory is a completely new concept to me.



My perception was that the image was "storage and initial state" and that, like a DB, the stuff resides on disk to be referenced and pulled into memory when needed.



Honestly, I have no clue.



(R, the most important statistics app on the planet, is image-based too. I wonder if that is pulled into memory as well.)


What I am going to do is load the 75 GB into the image as objects that wrap XMLDocument (the file has many, many XMLDocuments in it) and see what happens.

If that fails, I will go the PostgresDB route and store the data there and build objects on the fly from database calls.



I was really hoping to deal with just objects as the geek in me thinks that would be totally cool.



Anyhoo, it will be a fun experiment.



I will be sidelined for several days. I have a job interview coming up (pure Smalltalk, with Dolphin, VisualWorks(?) and a port to Pharo), so I need to install and study Dolphin and do the Pharo MOOC in prep.





cordially,



t






---- On Mon, 18 Oct 2021 11:14:44 -0400 Marcel Taeumel <marcel.taeumel at hpi.de> wrote ----



Hi Timothy --



I am not aware of basic support for automatic paging/swapping of the object space in the OSVM. Wouldn't the GC become more challenging to implement, too? Yet, I recall that there has been some PhD work on that topic ... probably using a pre-Spur VM ... Well, you can use (Native)ImageSegment (primitives 98 and 99) to manually manage part of the object space, right?



Best,

Marcel







On 18.10.2021 13:45:16, gettimothy via Squeak-dev <mailto:squeak-dev at lists.squeakfoundation.org> wrote:

Sorry for the formatting; my mail reader doesn't format replies "correctly".





Thank you for the reply.



When you write...



> The bad news is that you need more than 73GB RAM as well.
> I presume even 256GB will not suffice in practice because objects probably
> use more memory than the xml text.



Are you stating that every time I run Squeak, the entirety of the image is loaded into RAM?

I am/was operating under the presupposition that the Squeak image is rather like Linux swap space or a Postgres DB, both of which live on disk.

And that Cog/Spur would pull the portions it needs from the image on disk and put them in memory as needed.





re:



> #timeProfile is your friend.







Thanks! Added, and I will run it on both Squeak and Pharo to compare the results.



re:



> It's because you use the wrong priority. Use #userBackgroundPriority
> instead of #lowIOPriority.






I have another (I am sure, wrong) preconception that a higher priority results in higher execution speed.

I suspect that it just manages interrupts, etc., and that if there are none, then the execution speed is unaffected.

Is the latter true?

Thanks for your patience.

t






On Sun, 17 Oct 2021, gettimothy via Squeak-dev wrote: 

> Eliot, 

> 

> Thank you for the reply 

> 

> 

> That sounds like fun. To be clear, I can set these VM parameters on the running image? Or do I need to build a VM with those parameters pre-set? 

> 

> 

> 

> Space is no problem:

> 

>       bash-4.3$ df -h /bulkstorage/ 

> Filesystem      Size  Used Avail Use% Mounted on 

> /dev/sdd1       3.6T   86G  3.4T   3% /bulkstorage 

> 

> That is where the 73 GB XML file is.



The bad news is that you need more than 73GB RAM as well. 

I presume even 256GB will not suffice in practice because objects probably 

use more memory than the xml text. 
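
The RAM estimate above is simple arithmetic, and a rough sketch may make it concrete. The Python snippet below multiplies the XML file size by an expansion factor to guess how much heap the parsed objects would need. The function names and the factors tried are illustrative assumptions, not measurements from Squeak; the real factor depends on how the XMLDocument objects are built.

```python
def estimated_heap_gb(xml_gb: float, expansion_factor: float) -> float:
    """Rough in-memory size, in GB, of the objects built from the XML text.

    expansion_factor is an assumed ratio of object size to text size.
    """
    if xml_gb < 0 or expansion_factor <= 0:
        raise ValueError("sizes must be non-negative and the factor positive")
    return xml_gb * expansion_factor


def fits_in_ram(xml_gb: float, expansion_factor: float, ram_gb: float) -> bool:
    """True if the estimated heap fits in the given amount of RAM."""
    return estimated_heap_gb(xml_gb, expansion_factor) <= ram_gb


if __name__ == "__main__":
    # 73 GB file and 256 GB RAM come from the thread; the factors are guesses.
    for factor in (1.0, 2.0, 4.0):
        heap = estimated_heap_gb(73, factor)
        print(f"factor {factor}: about {heap:.0f} GB, "
              f"fits in 256 GB: {fits_in_ram(73, factor, 256)}")
```

Under these assumptions, an expansion factor of 4 already exceeds 256 GB, which is the concern raised above.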



> 

> 

> Squeak 5.3 is very slow compared to Pharo 8.

> 

> I have a pristine image of both running that SAX parse on the same file. 

> 

> Pharo:

> 

>       ping: two million elements.  Time: 0:00:04:12.13878234600296 

> 

> 

> 

> Squeak 5.3 (I raised the priority too):

> 

>       fs roughly 100000 elements in 5 minutes  



#timeProfile is your friend. 



> 

> 

> For some reason, Pharo lets me run the thing and still interact with the system, while on Squeak the process has priority over my inputs.



It's because you use the wrong priority. Use #userBackgroundPriority 

instead of #lowIOPriority. 
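
To make the priority advice concrete, here is a toy model in Python, not Squeak code: a scheduler that always runs the highest-priority process that still has work left. The numbers 60, 40 and 30 are assumed here to stand for #lowIOPriority, the UI process's priority, and #userBackgroundPriority; `Proc` and `run` are hypothetical names, and the model ignores time slicing and real I/O.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str
    priority: int   # higher number wins the CPU
    remaining: int  # units of work still to do


def run(procs):
    """Run one unit of work per tick, always for the highest-priority
    process that has work left. Returns the tick at which each finished."""
    finished_at = {}
    tick = 0
    while any(p.remaining > 0 for p in procs):
        tick += 1
        current = max((p for p in procs if p.remaining > 0),
                      key=lambda p: p.priority)
        current.remaining -= 1
        if current.remaining == 0:
            finished_at[current.name] = tick
    return finished_at


if __name__ == "__main__":
    # Worker above the UI: the UI is starved until the worker is done.
    print(run([Proc("ui", 40, 2), Proc("worker", 60, 5)]))
    # prints {'worker': 5, 'ui': 7}

    # Worker below the UI: the UI is served first, the worker uses the rest.
    print(run([Proc("ui", 40, 2), Proc("worker", 30, 5)]))
    # prints {'ui': 2, 'worker': 7}
```

In both runs the worker needs the same five ticks of CPU; in this model, priority only decides who goes first, not how fast the work itself runs.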





Levente