[Vm-dev] Linux 4.4.7.2357 VM crash under memory pressure

Bert Freudenberg bert at freudenbergs.de
Mon May 21 09:14:39 UTC 2012


On 21.05.2012, at 04:32, David T. Lewis wrote:

> 
> On Sun, May 20, 2012 at 08:55:18PM +0200, Bert Freudenberg wrote:
>> 
>> 
>> On 20.05.2012, at 19:35, David T. Lewis wrote:
>> 
>>> 
>>> On Sun, May 20, 2012 at 03:08:20PM +0200, Bert Freudenberg wrote:
>>>> 
>>>> Hilaire discovered that his newest DrGeo segfaults on the XO-1. It works fine elsewhere, including the XO-1.5, which has pretty much the same OS.
>>>> 
>>>> We narrowed down the problem to the XO-1 having only 256 MB of RAM and no swap space. I can reproduce the crash in a virtual Ubuntu 12 with 768 MB RAM (!) but no swap. Top reports:
>>>> 
>>>> Mem:    766204k total,   601588k used,   164616k free,    45624k buffers
>>>> Swap:        0k total,        0k used,        0k free,   277024k cached
>>>> 
>>>> but DrGeo still crashes. Etoys runs fine using the same Squeak VM on the same system (and on XO-1). DrGeo is based on Pharo 1.4, using a closure image. Etoys is still pre-closure.
>>>> 
>>> 
>>> I recall some recent discussion on the Pharo list about some "strange objects"
>>> that had entered the image for a period of time. It was something to do with
>>> a mismatch in the number of instance variable slots. The VM crash is happening
>>> in a method that is stepping through the fields of an object, so if something
>>> was out of whack there it might well lead to problems.
>>> 
>>> The discussion started here:
>>> 
>>> http://lists.gforge.inria.fr/pipermail/pharo-project/2012-May/064539.html
>>> 
>>> It would be worth checking if the DrGeo image might have this issue, in
>>> case those objects might for some reason be interacting badly with the
>>> garbage collector.
>>> 
>>> Dave
>> 
>> Interesting. However, it's strange that this would manifest only on machines with less memory - shouldn't the VM topple over at the first GC no matter what?
>> 
> 
> I don't see any obvious reason why the small memory would make a difference
> either. I'm really just trying to think of things that might be different in
> this case. The failure seems to be happening in GC code that has not changed
> in recent years, but it's happening after calling a method that depends on
> object header format (#lastPointerWhileForwarding:), and it is happening in
> code that iterates over the fields of an object. Beyond that I'm just totally
> guessing.
> 
> Dumb question - do you know if any other closure-enabled images have this
> problem on small memory systems? I'd have thought someone would have noticed
> by now, but maybe not.
> 
> Dave

Reportedly Pharo 1.3 works fine. Only Pharo 1.4 has been seen crashing so far:

	http://lists.sugarlabs.org/archive/sugar-devel/2012-May/037490.html

And I just tried Squeak 4.3. No problem.

- Bert -


