[Pharo-fuel] [Vm-dev] Fwd: Possible collections problem

Max Leske maxleske at gmail.com
Thu May 16 17:55:39 UTC 2013


I nearly went insane because of Ruby. Debian might be stable but you never get the new stuff… The result was several hours of trying to run "bundle install; rake build" until I finally just tried running "build_interpreter_vm.sh" and HURRAY! I had a working 4.12.4-2729 version.

So thanks Frank :)

The problem persists with this version too. During debugging I noticed that the image call stack was inside the weak finalization process nearly all the time (finalization -> GC, finalization -> GC, etc.). So what I'll try now is to kill the weak finalization process in the forked image. A preliminary test was successful and I'll now see how that works in production (it's an internal application, so not that much of a problem).
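
For reference, here is a sketch of what killing the weak finalization process in the forked child could look like. The process name used below is an assumption (check Process allInstances in your own image to see how the finalizer is actually registered):

```smalltalk
"Sketch only: terminate the weak finalization process in the forked child.
The process name is an assumption and varies between Squeak/Pharo versions;
inspect Process allInstances to find the real one."
| finalizer |
finalizer := Process allInstances
	detect: [:p | p name = 'WeakArray Finalization Process']
	ifNone: [nil].
finalizer ifNotNil: [finalizer terminate]
```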

BTW: The graph I serialize does not contain any weak references, I double checked.

Cheers,
Max


On 16.05.2013, at 10:10, Frank Shearar <frank.shearar at gmail.com> wrote:

> On 16 May 2013 08:29, Max Leske <maxleske at gmail.com> wrote:
>> 
>> Hi
>> 
>> I'm forwarding this because I'd like to rule out a VM problem. Short summary:
>> I fork a Squeak image and then serialize objects with Fuel. In roughly 40% of the cases the fork suddenly locks up and consumes 100% CPU. The trace I most often see with gdb in that case is the one with
>> "#0  0x08060453 in updatePointersInRootObjectsFromto ()" at the top.
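
The fork-then-serialize workflow reads roughly like this (a sketch, not the actual code: the OSProcess selector's return-value convention in the child is an assumption, graphToSave is a placeholder, and the file API is Fuel's standard FLSerializer entry point):

```smalltalk
"Sketch of the fork + serialize workflow. Assumptions: #forkSqueak is
OSProcess's image-forking selector, and it is assumed here to answer nil in
the child image (consult the OSProcess documentation for the real
convention); graphToSave stands in for the actual object graph."
| child |
child := UnixProcess forkSqueak.
child ifNil: [
	FLSerializer serialize: self graphToSave toFileNamed: 'graph.fuel'.
	Smalltalk snapshot: false andQuit: true ]
```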
>> 
>> The object being processed when the lockup occurs is always of class TimeStamp, although that doesn't necessarily mean anything. Maybe it's more about the number of objects.
>> 
>> I'm working on Debian, 32-bit, and I can reproduce the problem with SqueakVM 4.4.7-2364 and 4.0.3-2202 (the newer ones won't run because of glibc). I haven't tried Cog yet.
> 
> If it's convenient, running `bundle install; rake build` in a checkout
> of https://github.com/frankshearar/squeak-ci will build you a
> very-latest Interpreter VM (4.10.2.2614; executable at
> target/Squeak-4.10.2.2614-src-32/bld/squeak.sh) that will link against
> whatever glibc you have. (It works happily on my Ubuntu Lucid machine,
> which has glibc 2.11.) It works on OS X and Linux, but not on FreeBSD.
> 
> frank
> 
>> I also just checked that the problem occurs even if I don't serialize any timestamps (nor Process, Delay, Monitor, Semaphore; just to be sure).
>> 
>> So if anyone has a clue as to what might be going on I'd really appreciate the help.
>> 
>> Cheers,
>> Max
>> 
>> Begin forwarded message:
>> 
>> From: Mariano Martinez Peck <marianopeck at gmail.com>
>> Subject: Re: [Pharo-fuel] Possible collections problem
>> Date: 15 May 2013 16:53:10 CEST
>> To: The Fuel Project <pharo-fuel at lists.gforge.inria.fr>
>> Reply-To: The Fuel Project <pharo-fuel at lists.gforge.inria.fr>
>> 
>> I cannot see anything in particular. Too much GC stuff.
>> So I wouldn't spend more time trying to debug. I would try the non-large collections. Then I would try with the latest Cog and the latest StackVM.
>> 
>> 
>> 
>> On Wed, May 15, 2013 at 11:47 AM, Max Leske <maxleske at gmail.com> wrote:
>>> 
>>> I've had several forks hanging around just now. Apart from one, all of these were locked. I attached gdb and generated the C stack for all of them. Not sure if there's anything really interesting in there, although clearly a lot of time is spent in GC and in object creation. That doesn't have to mean anything though.
>>> 
>>> I haven't yet tried your suggestion, Mariano.
>>> 
>>> Cheers,
>>> Max
>>> 
>>> 
>>> [Thread debugging using libthread_db enabled]
>>> 0x08060453 in updatePointersInRootObjectsFromto ()
>>> 
>>> #0  0x08060453 in updatePointersInRootObjectsFromto ()
>>> #1  0x08060a77 in mapPointersInObjectsFromto ()
>>> #2  0x08060bb0 in incCompBody ()
>>> #3  0x08065fa7 in incrementalGC ()
>>> #4  0x080661a4 in sufficientSpaceAfterGC ()
>>> #5  0x08069420 in primitiveNew ()
>>> #6  0x0806de15 in interpret ()
>>> #7  0x08073dfe in main ()
>>> 
>>> 
>>> 
>>> [Thread debugging using libthread_db enabled]
>>> 0x08060453 in updatePointersInRootObjectsFromto ()
>>> 
>>> #0  0x08060453 in updatePointersInRootObjectsFromto ()
>>> #1  0x08060a77 in mapPointersInObjectsFromto ()
>>> #2  0x08060bb0 in incCompBody ()
>>> #3  0x08065fa7 in incrementalGC ()
>>> #4  0x080661a4 in sufficientSpaceAfterGC ()
>>> #5  0x08069420 in primitiveNew ()
>>> #6  0x0806de15 in interpret ()
>>> #7  0x08073dfe in main ()
>>> 
>>> 
>>> 
>>> 
>>> [Thread debugging using libthread_db enabled]
>>> 0x08060453 in updatePointersInRootObjectsFromto ()
>>> 
>>> #0  0x08060453 in updatePointersInRootObjectsFromto ()
>>> #1  0x08060a77 in mapPointersInObjectsFromto ()
>>> #2  0x08060bb0 in incCompBody ()
>>> #3  0x08065fa7 in incrementalGC ()
>>> #4  0x080661a4 in sufficientSpaceAfterGC ()
>>> #5  0x0806fed2 in clone ()
>>> #6  0x08070095 in primitiveClone ()
>>> #7  0x0806de15 in interpret ()
>>> #8  0x08073dfe in main ()
>>> 
>>> 
>>> [Thread debugging using libthread_db enabled]
>>> 0x08060453 in updatePointersInRootObjectsFromto ()
>>> 
>>> #0  0x08060453 in updatePointersInRootObjectsFromto ()
>>> #1  0x08060a77 in mapPointersInObjectsFromto ()
>>> #2  0x08060bb0 in incCompBody ()
>>> #3  0x08065fa7 in incrementalGC ()
>>> #4  0x080661a4 in sufficientSpaceAfterGC ()
>>> #5  0x08069270 in primitiveNewWithArg ()
>>> #6  0x0806de15 in interpret ()
>>> #7  0x08073dfe in main ()
>>> 
>>> 
>>> [Thread debugging using libthread_db enabled]
>>> 0xb76f0f68 in select () from /lib/libc.so.6
>>> 
>>> #0  0xb76f0f68 in select () from /lib/libc.so.6
>>> #1  0x08070880 in aioPoll ()
>>> #2  0xb762419e in ?? () from /usr/lib/squeak/4.0.3-2202//so.vm-display-X11
>>> #3  0x08073595 in ioRelinquishProcessorForMicroseconds ()
>>> #4  0x08061f24 in primitiveRelinquishProcessor ()
>>> #5  0x0806de15 in interpret ()
>>> #6  0x08073dfe in main ()
>>> 
>>> 
>>> [Thread debugging using libthread_db enabled]
>>> 0x08060453 in updatePointersInRootObjectsFromto ()
>>> 
>>> #0  0x08060453 in updatePointersInRootObjectsFromto ()
>>> #1  0x08060a77 in mapPointersInObjectsFromto ()
>>> #2  0x08060bb0 in incCompBody ()
>>> #3  0x08065fa7 in incrementalGC ()
>>> #4  0x080661a4 in sufficientSpaceAfterGC ()
>>> #5  0x08069420 in primitiveNew ()
>>> #6  0x0806de15 in interpret ()
>>> #7  0x08073dfe in main ()
>>> 
>>> 
>>> 
>>> [Thread debugging using libthread_db enabled]
>>> 0x08064e7e in markAndTrace ()
>>> 
>>> #0  0x08064e7e in markAndTrace ()
>>> #1  0x0806593a in markPhase ()
>>> #2  0x08065f60 in incrementalGC ()
>>> #3  0x080661a4 in sufficientSpaceAfterGC ()
>>> #4  0x0806fed2 in clone ()
>>> #5  0x08070095 in primitiveClone ()
>>> #6  0x0806de15 in interpret ()
>>> #7  0x08073dfe in main ()
>>> 
>>> 
>>> 
>>> 
>>> On 15.05.2013, at 13:59, Mariano Martinez Peck <marianopeck at gmail.com> wrote:
>>> 
>>> Ok. So the first thing you should try is to replace the uses of LargeIdentityDictionary with IdentityDictionary, and LargeIdentitySet with IdentitySet.
>>> If the problem disappears, then yes, there is something wrong with the large collections. If there is a problem with them, try updating the VM, since they use a particular primitive.
>>> Let us know!
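
Concretely, the swap can be tried at the creation sites (the class and method below are hypothetical, for illustration only; locate the real ones with a senders-of query on FLLargeIdentityDictionary in your image):

```smalltalk
"Hypothetical creation site, for illustration: wherever Fuel instantiates
its large collections, substitute the standard classes and re-run the
serialization."
FLEncoder >> newObjectsIndexes
	^ IdentityDictionary new  "was: FLLargeIdentityDictionary new"
```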
>>> 
>>> 
>>> On Tue, May 14, 2013 at 9:29 AM, Max Leske <maxleske at gmail.com> wrote:
>>>> 
>>>> 
>>>> On 14.05.2013, at 13:52, Mariano Martinez Peck <marianopeck at gmail.com> wrote:
>>>> 
>>>> Hi Max. Question, are you able to reproduce the problem?
>>>> 
>>>> 
>>>> Yes, but not "on purpose". The situation usually happens once or twice a day, with consistent log entries each time. That's why I want to use gdb the next time it happens.
>>>> 
>>>> 
>>>> 
>>>> On Tue, Apr 30, 2013 at 3:57 PM, Max Leske <maxleske at gmail.com> wrote:
>>>>> 
>>>>> Hi guys
>>>>> 
>>>>> I have a problem serializing a graph. Sometimes (not always) the image will consume +/- 100% CPU and stop responding. I was able to pin the problem down a bit:
>>>>> - it always fails in FLIteratingCluster>>registerIndexesOn: when called from FLFixedObjectCluster with class TimeStamp (this might not actually be relevant but it's consistent)
>>>>> - the problem *might* be in FLLargeIdentityDictionary>>at:put: (or further up the stack)
>>>>> 
>>>>> I've done extensive logging to a file but even with flushing after every write the results are not consistent. Sometimes the image locks after leaving #at:put:, sometimes it locks somewhere in the middle or in #registerIndexesOn: (but remember: the logging might not be precise).
>>>>> 
>>>>> It's probably not the size of the objects in the cluster (the graph is big but not overly large), since there are other clusters with more objects.
>>>>> 
>>>>> What I did find is that the #grow operation for HashedCollections can be *very* slow, up to 20 seconds or more, while at other times the snapshot runs through in no time.
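
One way to isolate this is a probe that fills an IdentityDictionary, forcing repeated #grow operations (a sketch; the element count is arbitrary, and #timeToRun is assumed to answer milliseconds as on Squeak-lineage images):

```smalltalk
"Probe: time repeated dictionary growth in isolation. Run it with both
IdentityDictionary and FLLargeIdentityDictionary on the same VM and
compare the timings."
| d ms |
d := IdentityDictionary new.
ms := [1 to: 500000 do: [:i | d at: i put: i]] timeToRun.
Transcript show: 'grow-heavy fill took ', ms printString, ' ms'; cr
```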
>>>>> 
>>>>> So here's my theory: there might be a VM problem with HashedCollections.
>>>>> Now, the VM is a rather old one and I haven't had the possibility to test this with a newer one (but I'll probably have to). The version is Squeak4.0.3-2202 running on 32-bit Debian Squeeze.
>>>>> 
>>>>> I'll try some more but if anyone has any ideas I'd be very happy :)
>>>>> 
>>>>> Cheers,
>>>>> Max
>>>>> _______________________________________________
>>>>> Pharo-fuel mailing list
>>>>> Pharo-fuel at lists.gforge.inria.fr
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-fuel
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Mariano
>>>> http://marianopeck.wordpress.com
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 


