[squeak-dev] Can I create a 75 Gb Image?

gettimothy gettimothy at zoho.com
Sun Oct 17 16:46:28 UTC 2021


Thank you for the reply.

That sounds like fun. To be clear, I can set these VM parameters on the running image? Or do I need to build a VM with those parameters pre-set?

Space is no problem:

bash-4.3$ df -h /bulkstorage/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdd1       3.6T   86G  3.4T   3% /bulkstorage

That is where the 73 GB XML file is.

Squeak 5.3 is very slow compared to Pharo 8.

I have a pristine image of both running that SAX parse on the same file.

Pharo:

ping: two million elements.  Time: 0:00:04:12.13878234600296

Squeak 5.3 (I upped the priority too):

fs roughly 100000 elements in 5 minutes 

For some reason, Pharo lets me run the thing and still interact with the system, while on Squeak the process has priority over my inputs.

Looking at the process browser on both shows differences; maybe it's related to that.


---- On Sun, 17 Oct 2021 12:14:41 -0400 Eliot Miranda <eliot.miranda at gmail.com> wrote ----

Hi Timothy,

On Oct 17, 2021, at 8:41 AM, gettimothy via Squeak-dev <squeak-dev at lists.squeakfoundation.org> wrote:

Hi Folks,

I am thinking about the result of parsing that 73 GB XML file and storing many of its elements in Squeak as live objects.

Assuming a roughly 1:1 relationship between that file and the resultant image, can Squeak handle that?

Can I tell Squeak to "SmalltalkImage getLarge:73Gb"?

I think it would be a fun experiment if this is possible.

thx in advance.

In theory there should be no problem other than a rather sedate snapshot and startup time.  If your machine has enough RAM and disk, Spur will save and restore an image of that size.  You may find that increasing the size of new space/eden to 128, 256, or 512 MB, increasing the old space segment size to 1 GB, and especially changing the growth-to-size GC ratio decreases build time substantially.

About Squeak is your friend.  Under VM parameters you’ll find info on sizes, and rooting around you’ll find the levers for setting these sizes through vmParameterAt:put:.  See parameters 25, 45, and 55.  Parameter 25 is badly described; it is actually the minimum old space segment size.  Setting this to e.g. 1 GB means a lot fewer segments to traverse than with the default 16 MB.
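
A minimal workspace sketch of what setting those three parameters might look like, assuming a 64-bit Spur VM. The sizes are just the examples mentioned above, not tuned recommendations, and the ratio given for parameter 55 is a placeholder chosen for illustration. The comments follow the descriptions in this message; check About Squeak for what your own VM reports. A changed eden size may only take effect after the image is saved and restarted.

```smalltalk
"Illustrative values only; assumes a 64-bit Spur VM."
Smalltalk vmParameterAt: 45 put: 256 * 1024 * 1024.   "desired eden size, in bytes"
Smalltalk vmParameterAt: 25 put: 1024 * 1024 * 1024.  "minimum old space segment size, in bytes"
Smalltalk vmParameterAt: 55 put: 1.0.                 "growth-to-size GC ratio (placeholder value)"

"Read the settings back to see what the VM accepted."
#(25 45 55) collect: [:index | index -> (Smalltalk vmParameterAt: index)].
```

Evaluating the last line with "print it" shows the three parameters as index -> value pairs.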

I’d be interested to see the VM stats at the end of a build. Those will help you tune and find out how much time is being spent in GC.
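
One way to collect a few of those numbers at the end of a build is sketched below. It assumes the GC counters are exposed as VM parameters 7 to 10 (full GC count and milliseconds, scavenge count and milliseconds), as listed under VM parameters in About Squeak. Verify the indices against your VM before relying on them.

```smalltalk
"Assumed parameter indices: 7/8 = full GC count/ms, 9/10 = scavenge count/ms."
| fullGCs fullMs scavenges scavengeMs |
fullGCs := Smalltalk vmParameterAt: 7.
fullMs := Smalltalk vmParameterAt: 8.
scavenges := Smalltalk vmParameterAt: 9.
scavengeMs := Smalltalk vmParameterAt: 10.
Transcript
	show: 'full GCs: ', fullGCs printString, ' in ', fullMs printString, ' ms'; cr;
	show: 'scavenges: ', scavenges printString, ' in ', scavengeMs printString, ' ms'; cr.
```

Comparing these totals with the wall-clock time of the build gives a rough idea of how much of the run went to GC.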

_,,,^..^,,,_ (phone)