[Vm-dev] Is it possible to suspend the garbage collector?

John McIntosh johnmci at smalltalkconsulting.com
Tue Jan 12 18:47:41 UTC 2016


Lots of questions.
I had a memory policy that would gather data; you can find it here:
https://drive.google.com/file/d/0BzM4orHA3iGxd1JPcDFxZExYZE0/view?usp=sharing

It also sets up targets, such as trying to get an incremental GC to happen
within N milliseconds, by adjusting the VM parameters.
If you search the archive for SophieMemoryPolicy or MemoryPolicy you'll
find more discussion.
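
The general idea, as a minimal sketch rather than the actual policy code,
assuming the usual interpreter-VM parameters (9 = incremental GC count
since startup, read-only; 5 = allocations between GCs, read-write) - check
the vmParameterAt: comment in your image before trusting the indices:

| targetMs sampleSeconds gcs0 t0 gcs elapsed interval allocs |
targetMs := 20.                 "aim for one incremental GC roughly every 20 ms"
sampleSeconds := 5.
gcs0 := Smalltalk vmParameterAt: 9.
t0 := Time millisecondClockValue.
(Delay forSeconds: sampleSeconds) wait.
gcs := (Smalltalk vmParameterAt: 9) - gcs0.
elapsed := Time millisecondClockValue - t0.
gcs > 0 ifTrue: [
	interval := elapsed // gcs.
	allocs := Smalltalk vmParameterAt: 5.
	interval < targetMs
		ifTrue: [Smalltalk vmParameterAt: 5 put: allocs * 2]                 "too frequent: allow more allocations per cycle"
		ifFalse: [Smalltalk vmParameterAt: 5 put: (allocs // 2 max: 4000)]]  "too rare: trigger the next GC sooner"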


On Tue, Jan 12, 2016 at 8:39 AM, Max Leske <maxleske at gmail.com> wrote:

> Hi John,
>
> Thanks for the explanation.
>
> As far as I understand, these settings won’t help in my particular case (I
> experimented with turning the bias on and off before allocating the
> ByteArray but didn’t see any influence). Still good to know about them.
>
> I still have a few questions regarding garbage collection.
>
> 1. When I allocate a ByteArray, as Eliot suggested, like so:
>
> Smalltalk vmParameterAt: 24 put: 150*1024*1024.
> ByteArray new: 70*1024*1024.
>
> then I see that the young space has grown a lot, as expected, and free
> space is low:
>
> before allocation:
> memory 57,032,988 bytes
> old 44,259,392 bytes (77.6%)
> young 5,824 bytes (0.0%)
> used 44,265,216 bytes (77.6%)
> free 12,767,772 bytes (22.4%)
>
> after allocation:
> memory 117,463,180 bytes
> old 39,666,076 bytes (33.8%)
> young 76,523,392 bytes (65.1%)
> used 116,189,468 bytes (98.9%)
> free 1,273,712 bytes (1.1%)
>
>
> If I now force an incremental GC with #garbageCollectMost then *nothing
> changes*, i.e. young space is still full and free space is low.
> Since incremental GC, as I understand it, marks and sweeps young space, I
> would expect young space to become small and free space to become large
> (the memory allocated for the ByteArray moves from young space to free
> space):
>
> after #garbageCollectMost:
> memory 117,463,180 bytes
> old 39,666,076 bytes (33.8%)
> young 75,339,048 bytes (64.1%)
> used 115,005,124 bytes (97.9%)
> free 2,458,056 bytes (2.1%)
>
> Only once I do a full garbageCollect does the ByteArray get collected:
> memory 121,025,356 bytes
> old 39,703,404 bytes (32.8%)
> young 3,265,428 bytes (2.7%)
> used 42,968,832 bytes (35.5%)
> free 78,056,524 bytes (64.5%)
>
> Why do I see this behaviour? In Cog I see the expected outcome (free space
> big, young space small) even without triggering an incremental GC manually.
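>
> Condensed, the experiment is essentially this (a rough sketch; I’m
> assuming #garbageCollectMost and #garbageCollect both answer the number of
> bytes available afterwards, as their comments suggest):
>
> | afterIncr afterFull |
> Smalltalk vmParameterAt: 24 put: 150*1024*1024. "raise the shrink threshold"
> ByteArray new: 70*1024*1024. "not referenced anywhere, so immediately garbage"
> afterIncr := Smalltalk garbageCollectMost. "incremental GC"
> afterFull := Smalltalk garbageCollect. "full GC"
> Transcript show: 'free after incremental GC: ', afterIncr printString; cr.
> Transcript show: 'free after full GC: ', afterFull printString; cr.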
>
>
> 2. There’s also still the question about where those 500ms disappear to
> (see last e-mail), which are unaccounted for in the MessageTally output.
>
>
> Cheers,
> Max
>
>
> On 11 Jan 2016, at 20:03, John McIntosh <johnmci at smalltalkconsulting.com>
> wrote:
>
> OK, as the author of setGCBiasToGrowGCLimit & setGCBiasToGrow I went
> looking for a paper I gave at the OOPSLA 2005 Squeakfest on this, but I
> can't find it. However, the original note from January 18th, 2005 explains:
> Mac 3.8.6b4
>
> Smalltalk setGCBiasToGrowGCLimit: 16*1024*1024. "Set growth limit before full GC to 16MB"
> Smalltalk setGCBiasToGrow: 1. "Set bias to grow up to the GC limit"
>
> This can avoid a problem where we attempt to avoid growing but end up with
> thousands of incremental GC events as we approach a knee in the curve of
> space used versus the growth/compaction decision.
>
> Plus I added this:
> "A VM change will consider that, after a tenure, if the young space is less
> than 4MB then growth will happen to make young space greater than 4MB plus
> a calculated slack. Then, after we've tenured N MB, we will do a full GC
> (versus doing a full GC on every grow operation), which will trigger a
> shrink if required. For example, we'll tenure at 75% and be biased to grow
> to 16MB before doing a full GC."
>
>
>
>
>> The Problem:
>>
>> Last weekend I built a new VM which has instrumentation to describe
>> exactly what the GC is doing, to trigger a semaphore when a GC finishes,
>> and to allow you to poke at more interesting things that control GC
>> activity.
>>
>> What I found was an issue which we hadn't realized was there. Well, I'm
>> sure people have seen it, but didn't know why...
>> What happens is that as we tenure objects we decrease the young space
>> from 4MB to zero.
>>
>> As indicated in the table below, if conditions are right (a couple of
>> cases in the macrobenchmarks), the number of objects we can allocate
>> decreases to zero, and we actually stop tenuring once the survivors fall
>> below 2000. The rate of young space GC activity goes from, say, 8 per
>> second towards 1000 per second. Mind you, on fast machines the young
>> space ms accumulation count doesn't move much, because the time taken for
>> each collection is under 1 millisecond, or 0, skewing those statistics
>> and hiding the GC time.
>>
>> AllocationCount    Survivors
>> 4000               5400
>> 3209               3459
>> 2269               2790
>> 1760               1574
>> 1592               2299
>> 1105               1662
>> 427                2355
>> 392                2374
>> 123                1472
>> 89                 1478
>> 79                 2
>> 78                 2
>> 76                 2
>> 76                 2
>>
>> Note how we allocate 76 objects, do a young space GC, then have two
>> survivors, until finally we reach the 200K minimum GC threshold and do a
>> full GC followed by growing young space. However, this process is very
>> painful. It's also why the low space dialog doesn't appear in a timely
>> manner: as we approach the 200K limit we try really hard, by doing
>> thousands of young space GCs, to avoid going over it. If conditions are
>> right, then we get close but not close enough...
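>>
>> (For reference, these counters can be read from the image; a minimal
>> sketch, assuming the usual interpreter-VM statistics parameters: 7 = full
>> GC count, 8 = ms in full GCs, 9 = incremental GC count, 10 = ms in
>> incremental GCs, 11 = tenures.)
>>
>> Transcript
>> 	show: 'full GCs: ', (Smalltalk vmParameterAt: 7) printString,
>> 		' (', (Smalltalk vmParameterAt: 8) printString, ' ms)'; cr;
>> 	show: 'incr GCs: ', (Smalltalk vmParameterAt: 9) printString,
>> 		' (', (Smalltalk vmParameterAt: 10) printString, ' ms)'; cr;
>> 	show: 'tenures: ', (Smalltalk vmParameterAt: 11) printString; cr.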
>>
>> What will change in the future:
>>
>> a) A GC monitoring class (new) will look at mark/sweep/root table counts
>> and decide when to do a tenure operation if iterating over the root table
>> objects takes too many iterations. A better solution would be to remember
>> which old objects hold a young reference, and in which slot, but that is
>> harder to do.
>>
>> b) A VM change will consider that, after a tenure, if the young space is
>> less than 4MB then growth will happen to make young space greater than
>> 4MB plus a calculated slack. Then, after we've tenured N MB, we will do a
>> full GC (versus doing a full GC on every grow operation), which will
>> trigger a shrink if required. For example, we'll tenure at 75% and be
>> biased to grow to 16MB before doing a full GC.
>>
>> c) To solve hitting the hard boundary when we cannot allocate more space,
>> we need to rethink when the low-space semaphore is signaled and the rate
>> of young space GC activity; signaling the semaphore earlier will allow a
>> user to take action before things grind to a halt. I'm not quite sure how
>> to do that yet.
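>>
>> (In the meantime, the point at which the existing low-space semaphore
>> fires can be raised from the image; a minimal sketch, assuming the stock
>> Squeak entry points lowSpaceThreshold and primSignalAtBytesLeft: are
>> available. Note that the image's own low-space watcher may re-arm the
>> primitive with lowSpaceThreshold when it restarts.)
>>
>> Smalltalk lowSpaceThreshold. "current trigger point, in bytes"
>> Smalltalk primSignalAtBytesLeft: 4 * 1024 * 1024. "ask for the signal at 4MB left"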
>>
>
>
> On Mon, Jan 11, 2016 at 3:21 AM, Max Leske <maxleske at gmail.com> wrote:
>
>>
>>
>> On 10 Jan 2016, at 20:22, vm-dev-request at lists.squeakfoundation.org
>> wrote:
>>
>> Hi Max,
>>
>>    pre-Spur to avoid GC one has to a) grow memory by enough to do all the
>> processing you're going to do and b) change the shrinkage parameter so the
>> VM won't shrink the heap back down before the processing is complete. To
>> do b) I suggest you modify setGCParameters. VM parameter 24 sets the
>> shrinkage threshold; see the comment for vmParameterAt:put:: "24 memory
>> threshold above which to shrink object memory (read-write)". As for
>> growing memory, hmmm, I had thought that there's a growMemoryBy: primitive
>> in v3, but it appears there isn't. So simply allocate a ByteArray of the
>> desired size and then GC to get rid of it. That should leave that much
>> free space, and then your load should proceed without needing to GC.
>>
>> Anyway, it's worth a try.
>>
>>
>> Thanks Eliot.
>>
>> Setting the memory threshold helped. I’m still seeing one full GC, which
>> I’m trying to avoid. I’ve experimented with #setGCBiasToGrow: and
>> #setGCBiasToGrowGCLimit: but I don’t fully understand what they do.
>> #setGCBiasToGrow: seems to turn memory growth on and off. But if this is
>> turned off, how can the VM then allocate more memory?
>> #setGCBiasToGrowGCLimit: seems to control whether the growth should
>> trigger a full GC, which seems pretty much like what I need.
>>
>> Unfortunately, while setting these options seems to have an influence, I
>> can’t quite see the pattern, and that one full GC is still there. Maybe you
>> could explain how these options work exactly?
>>
>> One other question: the MessageTally output seems to be missing about 50%
>> of the running time. Summing up full GC + incremental GC + time spent in
>> the tree leaves about 500ms unaccounted for. Do you have any idea where
>> that half second goes missing?
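>>
>> (One way to quantify the gap is to compare the wall clock against the GC
>> counters around the run; a rough sketch, assuming vm parameters 8 and 10
>> are the cumulative milliseconds spent in full and incremental GCs:)
>>
>> | fullMs0 incrMs0 wall |
>> fullMs0 := Smalltalk vmParameterAt: 8.
>> incrMs0 := Smalltalk vmParameterAt: 10.
>> wall := Time millisecondsToRun: [ "<load snapshot> goes here" ].
>> Transcript
>> 	show: 'wall clock: ', wall printString, ' ms'; cr;
>> 	show: 'full GC: ', ((Smalltalk vmParameterAt: 8) - fullMs0) printString, ' ms'; cr;
>> 	show: 'incr GC: ', ((Smalltalk vmParameterAt: 10) - incrMs0) printString, ' ms'; cr.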
>>
>>
>> Here’s the code I use experimentally:
>>
>> MessageTally spyOn: [
>> 	| size shrinkThreshold |
>> 	size := (self settings segmentsDirectory fileNamed: 'snapshot.bin') size.
>> 	shrinkThreshold := Smalltalk vmParameterAt: 24.
>> 	Smalltalk vmParameterAt: 24 put: shrinkThreshold + (size * 2). "8MB + twice the file size"
>> 	Smalltalk setGCBiasToGrowGCLimit: shrinkThreshold + (size * 2).
>> 	Smalltalk setGCBiasToGrow: 1. "enable growth??"
>> 	ByteArray new: size * 2.
>> 	"incremental GC should take care of collecting the ByteArray,
>> 	so I’m not doing anything manually here"
>> 	<load snapshot> ].
>>
>>
>>
>>
>> Cheers,
>> Max
>>
>>
>>
>> Output from current MessageTally:
>>
>>  - 1123 tallies, 1125 msec.
>>
>> **Tree**
>> --------------------------------
>> Process: (40s)  123994112: nil
>> --------------------------------
>> 12.7% {143ms} CBImageSegment class(NSImageSegment class)>>basicSnapshot:from:do:
>>   12.6% {141ms} CBImageSegment class(NSImageSegment class)>>installSegmentFrom:andDo:
>>     12.6% {141ms} CBImageSegment class(NSImageSegment class)>>readSegmentFrom:
>>       12.6% {141ms} NSSegmentStream>>readObject
>>         12.6% {141ms} SmartRefStream>>nextAndClose
>>           12.6% {141ms} SmartRefStream>>next
>>             12.3% {138ms} SmartRefStream(ReferenceStream)>>next
>>               12.3% {138ms} SmartRefStream(DataStream)>>next
>>                 10.6% {119ms} CBImageSegment(ImageSegment)>>comeFullyUpOnReload:
>>                   |10.6% {119ms} CBImageSegment(NSImageSegment)>>restoreEndiannessAndRehash
>>                   |  5.5% {62ms} Dictionary>>rehash
>>                   |    |2.8% {31ms} Dictionary>>associationsDo:
>>                   |    |  |2.2% {25ms} Array(SequenceableCollection)>>do:
>>                   |    |1.7% {19ms} Dictionary>>noCheckAdd:
>>                   |    |  1.7% {19ms} Dictionary(HashedCollection)>>findElementOrNil:
>>                   |    |    1.2% {13ms} Dictionary>>scanFor:
>>                   |  4.5% {51ms} primitives
>>                 1.2% {13ms} SmartRefStream(DataStream)>>readArray
>>                   1.2% {13ms} SmartRefStream>>next
>>                     1.2% {13ms} SmartRefStream(ReferenceStream)>>next
>>                       1.2% {13ms} SmartRefStream(DataStream)>>next
>> **Leaves**
>>
>> **Memory**
>> old +94,031,228 bytes
>> young -9,207,660 bytes
>> used +84,823,568 bytes
>> free +90,024,824 bytes
>>
>> **GCs**
>> full 1 totalling 85ms (8.0% uptime), avg 85.0ms
>> incr 15 totalling 271ms (24.0% uptime), avg 18.0ms
>> tenures 10 (avg 1 GCs/tenure)
>> root table 0 overflows
>>
>>
>>
>> On Sat, Jan 9, 2016 at 3:03 AM, Max Leske wrote:
>>
>>
>> Hi,
>>
>> I have a rather annoying problem. I’m running a time-critical piece of
>> code that reads a big (~90MB) image segment from a file. I’ve optimized
>> loading as far as possible and now GC takes far longer than the loading
>> itself (see the MessageTally output below).
>> I’m wondering if there’s any possibility to defer garbage collection
>> during the load.
>>
>> For completeness, here’s the use case: the process is socket activated,
>> which means that the first request coming in will start the process. When
>> the image starts it will load the segment to restore the last state of the
>> application and, once that’s done, serve the request. The critical time
>> includes vm startup, image startup, starting the server in the image and
>> loading the snapshot. With a big snapshot the loading time of the snapshot
>> is the most significant contributor.
>>
>> Maybe I could preallocate the needed memory to prevent the garbage
>> collector from running?
>>
>> I’d appreciate any ideas you have.
>>
>>
>> Cheers,
>> Max
>>
>>
>> PS: This needs to run on a Squeak 4.0.3 VM (no JIT)
>>
>>
>>
>>
>> Output from MessageTally:
>>
>> - 1624 tallies, 1624 msec.
>>
>> **Tree**
>> --------------------------------
>> Process: (40s)  592969728: nil
>> --------------------------------
>> 4.4% {72ms} CBImageSegment class(NSImageSegment class)>>basicSnapshot:from:do:
>>  4.4% {72ms} CBImageSegment class(NSImageSegment class)>>installSegmentFrom:andDo:
>>    4.4% {72ms} CBImageSegment class(NSImageSegment class)>>readSegmentFrom:
>>      4.4% {72ms} NSSegmentStream>>readObject
>>        4.4% {72ms} SmartRefStream>>nextAndClose
>>          4.4% {72ms} SmartRefStream>>next
>>            4.3% {70ms} SmartRefStream(ReferenceStream)>>next
>>              4.3% {70ms} SmartRefStream(DataStream)>>next
>>                3.2% {52ms} NSImageSegment(ImageSegment)>>comeFullyUpOnReload:
>>                  3.2% {52ms} NSImageSegment>>restoreEndiannessAndRehash
>> **Leaves**
>> 3.2% {52ms} NSImageSegment>>restoreEndiannessAndRehash
>>
>> **Memory**
>>        old                     +92,704,656 bytes
>>        young           -8,008,252 bytes
>>        used            +84,696,404 bytes
>>        free            +1,287,768 bytes
>>
>> **GCs**
>>        full            2 totalling 954ms (59.0% uptime), avg 477.0ms
>>        incr            5 totalling 165ms (10.0% uptime), avg 33.0ms
>>        tenures         1 (avg 5 GCs/tenure)
>>        root table      0 overflows
>>
>>
>>
>>
>>
>
>


-- 
===========================================================================
John M. McIntosh. Corporate Smalltalk Consulting Ltd
https://www.linkedin.com/in/smalltalk
===========================================================================