[Seaside] Fwd: [Vm-dev] VM stability issue on unix

Wed Sep 6 07:56:23 UTC 2006

Begin forwarded message:

> From: John M McIntosh <johnmci at smalltalkconsulting.com>
> Date: September 5, 2006 11:18:17 PM GMT+02:00
> To: The general-purpose Squeak developers list <squeak-dev at lists.squeakfoundation.org>
> Cc: adi at netstyle.ch
> Subject: Re: [Vm-dev] VM stability issue on unix
> Reply-To: johnmci at smalltalkconsulting.com
>
> Ok, if you can forward this to the Seaside list that would be good.
>
> It's possible that you have run into the GC bug that I talk about in
>
> http://minnow.cc.gatech.edu/squeak/3710
>
> Basically, when the GC logic approaches a decision to grow memory, it
> first does a full GC. If that GC returns just enough memory, the VM
> does not grow the memory it uses. However, the amount of space
> recovered is insufficient to do anything, so after a few message sends
> we again try to grow memory, do a full GC, and recover just enough
> bytes not to force the grow. Repeat a few million times... The CPU
> goes to 100%, always in GC logic, and no real work happens. Normally
> it only iterates a few thousand, or 10 or 100 thousand times, sucking
> CPU uselessly...
>
> First of all you need a new Unix VM build with the latest VM maker  
> that has the needed code and primitive API.
>
> Then look at my change sets.
>
> Somewhere at startup time you need to invoke:
>
> Smalltalk setGCBiasToGrowGCLimit: 16*1024*1024.
> Smalltalk setGCBiasToGrow: 1.
> GCMonitor runActive.
>
> setGCBiasToGrow: alters the VM to grow memory first, versus doing a
> full GC and then deciding whether to grow.
>
> setGCBiasToGrowGCLimit: sets the limit at which a full GC is forced:
> after we grow by this much, a full GC runs, to ensure growth is not
> unbounded.
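
The two settings described above can be illustrated with a toy model. This is a minimal Python sketch, not VM code: the function `simulate`, its parameters, and all the numbers are hypothetical, and the model ignores incremental GC and heap shrinking. It only shows why "full GC first" can thrash when each full GC frees just enough for the next allocation, and how growing first, with a limit that eventually forces a full GC, avoids that.

```python
def simulate(bias_to_grow, steps=10_000, capacity=1_000, live=990,
             request=10, grow_by=500, gc_limit=2_000):
    """Toy heap model (all units and defaults are made up).

    live     -- bytes that survive every full GC
    request  -- bytes of short-lived garbage allocated per step
    gc_limit -- with bias_to_grow, force a full GC once the heap
                has grown by this much since the last full GC
    Returns (number_of_full_gcs, final_capacity).
    """
    garbage = 0
    full_gcs = 0
    grown_since_gc = 0
    for _ in range(steps):
        if live + garbage + request > capacity:
            if bias_to_grow and grown_since_gc + grow_by <= gc_limit:
                # Bias to grow: take more memory without a full GC.
                capacity += grow_by
                grown_since_gc += grow_by
            else:
                # Full GC reclaims all garbage in this toy model.
                garbage = 0
                full_gcs += 1
                grown_since_gc = 0
                if live + request > capacity:
                    # Only grow if the GC did not free enough.
                    capacity += grow_by
        garbage += request
    return full_gcs, capacity


if __name__ == "__main__":
    print("full GC first:", simulate(bias_to_grow=False))
    print("bias to grow: ", simulate(bias_to_grow=True))
```

In this model the "full GC first" policy runs a full GC on almost every allocation and never grows, while the biased policy runs only a handful of full GCs at the cost of a larger heap. Real VM behaviour depends on details the sketch leaves out.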
>
> The GCMonitor class allows you to collect statistical data from the  
> GC, either at the end of each GC cycle, or on a timer.
> This data can then be saved to a file for review, to allow one to
> intelligently adjust the GC parameters.
> The supplied GCMonitor is more a template than a finished product.
>
>
> More importantly, much like VisualWorks, it looks at some of the data
> and makes runtime decisions like:
>
> a) Force a tenure if we find we are doing too much root table
> scanning. This happens if you allocate a large collection and it
> gets put into the root table: when looking for intergenerational
> references we scan the entire million-entry object, since the GC
> knows the object contains a pointer to an object that is a root,
> but which one? Historically, people would force a GC after
> allocating a large collection to avoid this problem. However, if I say
>
> (statMarkCount ) > (statAllocationCount*2)
> 		ifTrue: [Smalltalk forceTenure].  "Tenure if we think too much  
> root table marking is going on"
>
> where we look at the mark count total versus the allocation count
> total, we can decide whether we need to force a tenure, and so solve
> this problem automatically.
>
> The code also has some example code (not used) to alter the size
> of the allocation and tenure targets to adjust a GC cycle to 1
> millisecond.
> Those values were picked, oh, 10 years back for 25 MHz machines;
> I'd guess 3 GHz machines can incrementally GC much more memory.
>
> 	(statIGCDeltaTime = 0) ifTrue:
> 		[target _ (Smalltalk vmParameterAt: 5)+21.
> 		Smalltalk vmParameterAt: 5 put: target.  "do an incremental GC  
> after this many allocations"
> 		Smalltalk vmParameterAt: 6 put: target*3//4.  "tenure when more  
> than this many objects survive the GC"].
> 	(statIGCDeltaTime > 0) ifTrue:
> 		[target _ ((Smalltalk vmParameterAt: 5)-27) max: 2000.
> 		Smalltalk vmParameterAt: 5 put: target.  "do an incremental GC  
> after this many allocations"
> 		Smalltalk vmParameterAt: 6 put: target*3//4.  "tenure when more  
> than this many objects survive the GC"].
>
> 	
> 	(statIGCDeltaTime < 1) ifTrue:
> 		[target _ (Smalltalk vmParameterAt: 5)+21.
> 		Smalltalk vmParameterAt: 5 put: target.  "do an incremental GC  
> after this many allocations"
> 		Smalltalk vmParameterAt: 6 put: target*3//4.  "tenure when more  
> than this many objects survive the GC"].
> 	(statIGCDeltaTime > 1) ifTrue:
> 		[target _ ((Smalltalk vmParameterAt: 5)-27) max: 4000.
> 		Smalltalk vmParameterAt: 5 put: target.  "do an incremental GC  
> after this many allocations"
> 		Smalltalk vmParameterAt: 6 put: target*3//4.  "tenure when more  
> than this many objects survive the GC"].! !
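
The feedback rule in the example code above can be tried outside the image with a small model. This is a hedged Python sketch, not the GCMonitor code: `adjust_target`, `tenure_threshold`, `tune`, and the cost model (incremental GC time as `target // objects_per_ms`) are assumptions made for illustration. Only the +21 / -27 steps, the floor of 2000, and the `target*3//4` tenure ratio come from the first variant quoted above.

```python
def adjust_target(target, igc_millis, floor=2000):
    """One tuning step for the allocation target (VM parameter 5).

    If the last incremental GC took 0 ms, allow more allocations
    between GCs; otherwise back off, but never below `floor`.
    """
    if igc_millis == 0:
        return target + 21
    return max(target - 27, floor)


def tenure_threshold(target):
    """Tenure threshold (VM parameter 6) derived from the target."""
    return target * 3 // 4


def tune(start, objects_per_ms, rounds):
    """Run the rule against a made-up cost model.

    Assumes an incremental GC over `target` allocations takes
    target // objects_per_ms whole milliseconds (hypothetical).
    """
    target = start
    for _ in range(rounds):
        igc_millis = target // objects_per_ms
        target = adjust_target(target, igc_millis)
    return target


if __name__ == "__main__":
    final = tune(start=4000, objects_per_ms=5000, rounds=200)
    print(final, tenure_threshold(final))
```

Under this cost model the target climbs until an incremental GC first registers as 1 ms, then hovers around that point, which is the intent of adjusting a GC cycle to about 1 millisecond.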
>
>
> I'll further note that Sophie now has a SophieMemoryPolicy to tune
> Sophie's GC behavior. Perhaps Seaside requires a SeasideMemoryPolicy
> to best tune the GC at runtime, as more sophisticated VMs like
> VisualWorks do?
>
> Lastly, I'd welcome a statistical file or two to look at from a
> large Seaside image, just to understand what's happening. And, as
> always, for a fee I can perform a GC memory audit on any large-scale
> VW or Squeak application.
>
>
> On 5-Sep-06, at 3:06 AM, Adrian Lienhard wrote:
>
>> I've recently brought up the following issue on the VM mailing  
>> list, but got no reply so far for whatever reason...
>>
>> In short, the problem is that the unix VM blocks after the memory
>> consumption exceeds about 120MB. I think it's a critical bug,
>> likely affecting many Seaside users who deploy on unix systems.
>>
>> I've filed the following Mantis report:
>> http://bugs.impara.de/view.php?id=4709
>>
>> Adrian
>>
>>
>> Begin forwarded message:
>>
>>> From: Adrian Lienhard <adi at netstyle.ch>
>>> Date: August 31, 2006 8:23:20 AM GMT+02:00
>>> To: vm-dev at lists.squeakfoundation.org
>>> Subject: [Vm-dev] VM stability issue on unix
>>> Reply-To: Squeak Virtual Machine Development Discussion <vm-dev at lists.squeakfoundation.org>
>>>
>>> Hi VM maintainers,
>>>
>>> We have run into the following problem with 3.7/3.9 unix VMs (but  
>>> not with version 3.6). The VM hogs the CPU and does not respond
>>> anymore after consuming more than about 120MB of memory. The  
>>> problem is reproducible independently of the image version  
>>> (simply by instantiating enough objects).
>>>
>>> This is the VM version we are using:
>>>
>>> 3.7-7 #1 Don Okt 20 11:25:27 CEST 2005 gcc-Version (Debian  
>>> Squeak3.7 of '4 September 2004' [latest update: #5989] Linux  
>>> 2.6.10 #1 Tue Dec 28 21:16:21 CET 2004 i686 GNU/Linux
>>>
>>> Inspecting the Squeak process stacks with gdb does not show  
>>> anything unusual, however, one process does not seem to get back  
>>> from calling the new: primitive.
>>>
>>> The call stack of the VM looks like this:
>>>
>>> #0  updatePointersInRangeFromto (memStart=231866373,  
>>> memEnd=2138759076)
>>>     at gnu-interp.c:21562
>>> #1  0x0805d500 in incCompBody () at gnu-interp.c:4650
>>> #2  0x0805d2ed in fullGC () at gnu-interp.c:4500
>>> #3  0x0806d576 in sufficientSpaceAfterGC (minFree=202536) at gnu-interp.c:21275
>>> #4  0x08067d5e in primitiveNewWithArg () at gnu-interp.c:16045
>>> #5  0x080614fa in interpret () at gnu-interp.c:7249
>>> #6  0x0805a5fb in main (argc=0, argv=0xbfeff8a4, envp=0x0)
>>>     at /usr/src/Squeak-3.7-7/platforms/unix/vm/sqUnixMain.c:1367
>>>
>>> It looks like the same issue has already been discussed in the
>>> following thread:
>>> http://lists.squeakfoundation.org/pipermail/seaside/2005-October/005897.html
>>> The proposed workaround of explicitly setting -memory with a high  
>>> enough value works, i.e., the vm does not stop working when the  
>>> memory consumption exceeds 120MB.
>>>
>>> Since in Seaside applications images often grow up to 200MB or  
>>> even more, this is a real show stopper...
>>>
>>> Cheers,
>>> Adrian
>>
>>
>
> --
> ===========================================================================
> John M. McIntosh <johnmci at smalltalkconsulting.com>
> Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
> ===========================================================================
>
>