As some of you know I from time to time do serious GC tuning in VisualWorks. http://www.smalltalkconsulting.com/papers/GCPaper/GCTalk%202001.htm
For years now (yes years) I've been thinking about doing serious tuning in Squeak, but yet things just weren't interesting enough. However with the release of Croquet, thoughts on TK4, and mumblings from folks I turned my eye upon what the Squeak GC was doing. First thanks to Ted Kaehler (among others) for writing this GC 10 odd years back which we've run with little change over the years until we added the ability to shrink/grow the memory space a few years back.
First I'll need some help from macintosh users that have interesting images and can run a pending 3.8.4 mac VM with instrumentation and then send me some diagnostic log. People who are willing to help should email me directly, and I'll get you a VM and changeset, and we'll see if we can fill up my gmail account with log data. I'm very interested in getting Croquet data, having a test case you can run before/after would be best.
So lets review how does the Squeak GC work:
A simplistic view is that your image (25MB) is loaded into Old Space, and 4MB (set in Smalltalk) is allocated for Young Space. After allocating 4000 (if they fit) new object (limit set in smalltalk) we do an mark/sweep/compacting GC on young space using the VM roots plus roots identified from Old Space to Young Space references. Normally this means looking only at a few thousand objects, then occasionally we might do a full GC across all 300K object based on various conditions, but that's rare.
I'll note the Old Space to Young Space remember table, remembers the Object containing the reference, not the reference so that if you have 100,000 element collection we interate over all 100,000 entries looking for the 1 or more old to young references, a cause for performance concern.
If after completing the mark/sweep/compaction and over 2000 objects (set in Smalltalk) are survivors we "Tenure" them to old space by moving the old/young pointer boundary, then start allocating objects again, mind if the young space exceeds now 8MB free (set in smalltalk) we given memory back and reset things back to the 4MB boundary.
Now object allocation can end early because we are allocating a big object, in that case we do an young GC, followed by a full GC (very expensive) followed by growing young space so the big object can fit. Also of course if we run out of root table remember space, we end the allocation process early and think about Tenuring things.
Or it can end if we drop below a minimum amount of memory (200K set by smalltalk) which makes the image grow, which if it fails signals the low space semaphore (and brings up a low space dialog which no-one really see and have bitterly complained about) Yes I now know why that occurs too...
The Problem:
Last weekend I built a new VM which has instrumentation to describe exactly what the GC is doing, also to trigger a semaphore when an GC finishes, and to allow you to poke at more interesting things that control GC activity.
What I found was an issue which we hadn't realized is there, well I'm sure people have seen it, but don't know why... What happens is that as we are tenuring objects we are decreasing the young space from 4MB to Zero.
Now as indicated in the table below if conditions are right (a couple of cases in the macrobenchmarks) why as you see the number of objects we can allocate decreases to zero, and we actually don't tenure anymore once the survivors fall below 2000. The rate at which young space GC activity occurs goes from say 8 per second towards 1000 per second, mind on fast machines the young space ms accumulation count doesn't move much because the time taken to do this is under 1 millisecond, or 0, skewing those statistics and hiding the GC time.
AllocationCount Survivors 4000 5400 3209 3459 2269 2790 1760 1574 1592 2299 1105 1662 427 2355 392 2374 123 1472 89 1478 79 2 78 2 76 2 76 2
Note how we allocate 76 objects, do a young space GC, then have two survivors, finally we reach the 200K minimum GC threshold and do a full GC followed by growing young space. However this process is very painful. Also it's why the low space dialog doesn't appear in a timely manner because we are attempting to approach the 200K limit and trying really hard by doing thousands of young space GCed to avoid going over that limit. If conditions are right, then we get close but not close enough...
What will change in the future.
a) A GC monitoring class (new) will look at mark/sweep/Root table counts and decide when to do a tenure operation if iterating over the root table objects takes too many iterations. A better solution would be to remember old objects and which slot has the young reference but that is harder to do.
b) A VM change will consider that after a tenure if the young space is less than 4MB then growth will happen to make young space greater than 4MB plus a calculated slack. Then after we've tenured N MB we will do a full GC, versus doing a full GC on every grow operation, this will trigger a shrink if required. For example we'll tenure at 75% and be bias to grow to 16MB before doing full GC.
c) To solve hitting the hard boundary when we can not allocate more space we need to rethink when the low semaphore is signaled and the rate of young space GC activity, signaling the semaphore earlier will allow a user to take action before things grind to a halt. I'm not quite sure how to do that yet.
Some of this might be back-ported to earlier VM, I think so, yet I won't know until we gather more data and try a few things.
-- ======================================================================== === John M. McIntosh johnmci@smalltalkconsulting.com 1-800-477-2659 Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com ======================================================================== ===
I've posted a change set, a macintosh VM, and a readme to
http://homepage.mac.com/johnmci
look in the experimenta/GC work directory
On Oct 29, 2004, at 2:14 PM, John M McIntosh wrote:
As some of you know I from time to time do serious GC tuning in VisualWorks. http://www.smalltalkconsulting.com/papers/GCPaper/GCTalk%202001.htm
For years now (yes years) I've been thinking about doing serious tuning in Squeak--
======================================================================== === John M. McIntosh johnmci@smalltalkconsulting.com 1-800-477-2659 Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com ======================================================================== ===
squeak-dev@lists.squeakfoundation.org