[Pharo-fuel] [Vm-dev] Fwd: Possible collections problem
Max Leske
maxleske at gmail.com
Thu May 16 17:55:39 UTC 2013
I nearly went insane because of Ruby. Debian may be stable, but you never get the new stuff… The result was several hours of trying to run "bundle install; rake build", until I finally just tried running "build_interpreter_vm.sh" and HURRAY! I had a working 4.12.4-2729 version.
So thanks Frank :)
The problem persists with this version too. During debugging I noticed that the image call stack was inside the weak finalization process nearly all the time (finalization -> GC, finalization -> GC, etc.). So what I'll try now is to kill the weak finalization process in the forked image. A preliminary test was successful and I'll now see how that works in production (it's an internal application, so not that much of a problem).
BTW: The graph I serialize does not contain any weak references, I double checked.
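Terminating the finalization process in the forked child could look roughly like the sketch below. This is an illustration only: the finalization process isn't exposed through a public accessor in every Squeak version, so locating it by scanning Process instances for a context whose receiver is WeakArray is an assumption to be adapted to the image at hand.

```smalltalk
"Sketch (assumption: the finalization loop runs with WeakArray as the
receiver somewhere on its stack). Terminate any matching process in the
forked child before serializing."
| finalizers |
finalizers := Process allInstances select: [:p |
	p suspendedContext notNil and: [
		(p suspendedContext findContextSuchThat: [:ctx |
			ctx receiver == WeakArray]) notNil]].
finalizers do: [:p | p terminate]
```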
Cheers,
Max
On 16.05.2013, at 10:10, Frank Shearar <frank.shearar at gmail.com> wrote:
> On 16 May 2013 08:29, Max Leske <maxleske at gmail.com> wrote:
>>
>> Hi
>>
>> I'm forwarding this because I'd like to rule out a VM problem. Short summary:
>> I fork a Squeak image and then serialize objects with Fuel. In roughly 40% of the cases the fork suddenly locks up and consumes 100% CPU. The trace I most often see with gdb in that case is the one with
>> "#0 0x08060453 in updatePointersInRootObjectsFromto ()" at the top.
>>
>> The object being processed when the lockup occurs is always of class TimeStamp, although that doesn't necessarily mean anything. Maybe it's more about the number of objects.
>>
>> I'm working on Debian, 32-bit, and I can reproduce the problem with SqueakVM 4.4.7-2364 and 4.0.3-2202 (the newer ones won't run because of glibc). I haven't tried Cog yet.
>
> If it's convenient, running `bundle install; rake build` in a checkout
> of https://github.com/frankshearar/squeak-ci will build you a
> very-latest Interpreter VM (4.10.2.2614; executable at
> target/Squeak-4.10.2.2614-src-32/bld/squeak.sh) that will link against
> whatever glibc you have. (It works happily on my Ubuntu Lucid machine,
> which has glibc 2.11.) It works on OS X and Linux, but not on FreeBSD.
>
> frank
>
>> I also just checked that the problem occurs even if I don't serialize any timestamps (nor Process, Delay, Monitor or Semaphore instances; just to be sure).
>>
>> So if anyone has a clue as to what might be going on I'd really appreciate the help.
>>
>> Cheers,
>> Max
>>
>> Begin forwarded message:
>>
>> From: Mariano Martinez Peck <marianopeck at gmail.com>
>> Subject: Re: [Pharo-fuel] Possible collections problem
>> Date: 15 May 2013 16:53:10 CEST
>> To: The Fuel Project <pharo-fuel at lists.gforge.inria.fr>
>> Reply-To: The Fuel Project <pharo-fuel at lists.gforge.inria.fr>
>>
>> I cannot see anything in particular. Too much GC stuff.
>> So I wouldn't spend more time trying to debug. I would try the non-large collections. Then I would try the latest Cog and the latest StackVM.
>>
>>
>>
>> On Wed, May 15, 2013 at 11:47 AM, Max Leske <maxleske at gmail.com> wrote:
>>>
>>> I've had several forks hanging around just now. Apart from one, all of them were locked. I attached gdb and captured the C stack for each of them. I'm not sure there's anything really interesting in there, although clearly a lot of time is spent in GC and in object creation. That doesn't have to mean anything though.
>>>
>>> I haven't yet tried your suggestion, Mariano.
>>>
>>> Cheers,
>>> Max
>>>
>>>
>>> [Thread debugging using libthread_db enabled]
>>> 0x08060453 in updatePointersInRootObjectsFromto ()
>>>
>>> #0 0x08060453 in updatePointersInRootObjectsFromto ()
>>> #1 0x08060a77 in mapPointersInObjectsFromto ()
>>> #2 0x08060bb0 in incCompBody ()
>>> #3 0x08065fa7 in incrementalGC ()
>>> #4 0x080661a4 in sufficientSpaceAfterGC ()
>>> #5 0x08069420 in primitiveNew ()
>>> #6 0x0806de15 in interpret ()
>>> #7 0x08073dfe in main ()
>>>
>>>
>>>
>>> [Thread debugging using libthread_db enabled]
>>> 0x08060453 in updatePointersInRootObjectsFromto ()
>>>
>>> #0 0x08060453 in updatePointersInRootObjectsFromto ()
>>> #1 0x08060a77 in mapPointersInObjectsFromto ()
>>> #2 0x08060bb0 in incCompBody ()
>>> #3 0x08065fa7 in incrementalGC ()
>>> #4 0x080661a4 in sufficientSpaceAfterGC ()
>>> #5 0x08069420 in primitiveNew ()
>>> #6 0x0806de15 in interpret ()
>>> #7 0x08073dfe in main ()
>>>
>>>
>>>
>>>
>>> [Thread debugging using libthread_db enabled]
>>> 0x08060453 in updatePointersInRootObjectsFromto ()
>>>
>>> #0 0x08060453 in updatePointersInRootObjectsFromto ()
>>> #1 0x08060a77 in mapPointersInObjectsFromto ()
>>> #2 0x08060bb0 in incCompBody ()
>>> #3 0x08065fa7 in incrementalGC ()
>>> #4 0x080661a4 in sufficientSpaceAfterGC ()
>>> #5 0x0806fed2 in clone ()
>>> #6 0x08070095 in primitiveClone ()
>>> #7 0x0806de15 in interpret ()
>>> #8 0x08073dfe in main ()
>>>
>>>
>>> [Thread debugging using libthread_db enabled]
>>> 0x08060453 in updatePointersInRootObjectsFromto ()
>>>
>>> #0 0x08060453 in updatePointersInRootObjectsFromto ()
>>> #1 0x08060a77 in mapPointersInObjectsFromto ()
>>> #2 0x08060bb0 in incCompBody ()
>>> #3 0x08065fa7 in incrementalGC ()
>>> #4 0x080661a4 in sufficientSpaceAfterGC ()
>>> #5 0x08069270 in primitiveNewWithArg ()
>>> #6 0x0806de15 in interpret ()
>>> #7 0x08073dfe in main ()
>>>
>>>
>>> [Thread debugging using libthread_db enabled]
>>> 0xb76f0f68 in select () from /lib/libc.so.6
>>>
>>> #0 0xb76f0f68 in select () from /lib/libc.so.6
>>> #1 0x08070880 in aioPoll ()
>>> #2 0xb762419e in ?? () from /usr/lib/squeak/4.0.3-2202//so.vm-display-X11
>>> #3 0x08073595 in ioRelinquishProcessorForMicroseconds ()
>>> #4 0x08061f24 in primitiveRelinquishProcessor ()
>>> #5 0x0806de15 in interpret ()
>>> #6 0x08073dfe in main ()
>>>
>>>
>>> [Thread debugging using libthread_db enabled]
>>> 0x08060453 in updatePointersInRootObjectsFromto ()
>>>
>>> #0 0x08060453 in updatePointersInRootObjectsFromto ()
>>> #1 0x08060a77 in mapPointersInObjectsFromto ()
>>> #2 0x08060bb0 in incCompBody ()
>>> #3 0x08065fa7 in incrementalGC ()
>>> #4 0x080661a4 in sufficientSpaceAfterGC ()
>>> #5 0x08069420 in primitiveNew ()
>>> #6 0x0806de15 in interpret ()
>>> #7 0x08073dfe in main ()
>>>
>>>
>>>
>>> [Thread debugging using libthread_db enabled]
>>> 0x08064e7e in markAndTrace ()
>>>
>>> #0 0x08064e7e in markAndTrace ()
>>> #1 0x0806593a in markPhase ()
>>> #2 0x08065f60 in incrementalGC ()
>>> #3 0x080661a4 in sufficientSpaceAfterGC ()
>>> #4 0x0806fed2 in clone ()
>>> #5 0x08070095 in primitiveClone ()
>>> #6 0x0806de15 in interpret ()
>>> #7 0x08073dfe in main ()
>>>
>>>
>>>
>>>
>>> On 15.05.2013, at 13:59, Mariano Martinez Peck <marianopeck at gmail.com> wrote:
>>>
>>> Ok. So, the first thing you should try is to replace the uses of LargeIdentityDictionary with IdentityDictionary, and LargeIdentitySet with IdentitySet.
>>> If the problem disappears, then yes, there is something wrong with the large collections. If there is a problem with them, try updating the VM, since they use a particular primitive.
>>> Let us know!
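The swap is mechanical; timing a snapshot before and after it makes the comparison concrete. A minimal sketch ("graphRoot" and the file name are placeholders, and the serializer entry point may differ between Fuel versions):

```smalltalk
"Sketch: time a Fuel snapshot of the graph. 'graphRoot' and the file
name are placeholders for the application's own root object and output
file; adapt the FLSerializer call to the Fuel version in use."
| ms |
ms := Time millisecondsToRun: [
	FLSerializer serialize: graphRoot toFileNamed: 'snapshot.fuel'].
Transcript show: 'snapshot took ', ms printString, ' ms'; cr
```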
>>>
>>>
>>> On Tue, May 14, 2013 at 9:29 AM, Max Leske <maxleske at gmail.com> wrote:
>>>>
>>>>
>>>> On 14.05.2013, at 13:52, Mariano Martinez Peck <marianopeck at gmail.com> wrote:
>>>>
>>>> Hi Max. Question, are you able to reproduce the problem?
>>>>
>>>>
>>>> Yes, but not "on purpose". The situation usually happens once or twice a day and then with consistent log entries. That's why I want to use gdb the next time it happens.
>>>>
>>>>
>>>>
>>>> On Tue, Apr 30, 2013 at 3:57 PM, Max Leske <maxleske at gmail.com> wrote:
>>>>>
>>>>> Hi guys
>>>>>
>>>>> I have a problem serializing a graph. Sometimes (not always) the image will consume close to 100% CPU and stop responding. I was able to pin the problem down a bit:
>>>>> - it always fails in FLIteratingCluster>>registerIndexesOn: when called from FLFixedObjectCluster with class TimeStamp (this might not actually be relevant, but it's consistent)
>>>>> - the problem *might* be in FLLargeIdentityDictionary>>at:put: (or further up the stack)
>>>>>
>>>>> I've done extensive logging to a file, but even with flushing after every write the results are not consistent. Sometimes the image locks up after leaving #at:put:, sometimes somewhere in the middle or in #registerIndexesOn: (but remember: the logging might not be precise).
>>>>>
>>>>> It's probably not the size of the objects in the cluster (the graph is big but not overly large), since there are other clusters with more objects.
>>>>>
>>>>> What I did find is that the #grow operation for HashedCollections can be *very* slow, up to 20 seconds or more, while at other times the snapshot runs through in no time.
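That pattern would fit the cost model of #grow: growing allocates a larger underlying array and rehashes every element into it, so a single grow of a dictionary with millions of entries can take seconds. If the number of objects in the graph is roughly known beforehand, pre-sizing sidesteps most of the rehashing. A sketch ("expectedObjectCount" is a placeholder for an estimate of the graph size):

```smalltalk
"Sketch: pre-size the identity dictionary so #grow is rarely triggered
while registering objects. 'expectedObjectCount' is a placeholder."
| indexes |
indexes := IdentityDictionary new: expectedObjectCount.
```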
>>>>>
>>>>> So here's my theory: there might be a VM problem with HashedCollections.
>>>>> Now, the VM is rather old and I haven't had the chance to test this with a newer one (but I'll probably have to). The version is Squeak4.0.3-2202 running on 32-bit Debian Squeeze.
>>>>>
>>>>> I'll try some more but if anyone has any ideas I'd be very happy :)
>>>>>
>>>>> Cheers,
>>>>> Max
>>>>> _______________________________________________
>>>>> Pharo-fuel mailing list
>>>>> Pharo-fuel at lists.gforge.inria.fr
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-fuel
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Mariano
>>>> http://marianopeck.wordpress.com
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Mariano
>>> http://marianopeck.wordpress.com
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> Mariano
>> http://marianopeck.wordpress.com
>>
>>
>>
>