[Vm-dev] I think I've found a bug in the 64-bit Squeak VM memory compactor

John M McIntosh johnmci at smalltalkconsulting.com
Tue Jun 15 07:49:51 UTC 2010


David, I built the Squeak 64-32 5.4b2 which is part of the objective-c rewrite last year for the os-x VM and merging of the iPhone/iPad source tree. 

This gets bug report 
http://bugs.squeak.org/view.php?id=7549


The SQ_FAKE_MEMORY_OFFSET is only used in the unix builds for debugging, it ensures that 16 bytes is subtracted from the mmap address and that sqMemoryBase is set to 16 in order to calculate the same address to debug the use of memory address calculations. 
It's ignored for windows and the os-x builds. 

Now the Squeak 64-32 5.4b2  is a fat binary for 32 and 64bit CPUS using a 32bit image.  If you are running as a 64bit macintel machine under 10.6 then the VM is 64bit mode, and oops memory address are calculated using the 64bit sqMemoryBase plus 32bit oops address to calculate a 64bit memory address. The memory offset start sqMemoryBase is usually 0x117000000. mmap chooses this to supply a memory address > 4GB so that issues with 32bit versus 64bit address become apparent as the first 4GB of memory isn't read/writeable.  

In 64bit with a 64bit image sqMemoryBase then is zero and the 0x117000000 address is used to represent "zero" in the Oops space which is the start of the image. 

I note that GCC does not handle the   
static inline char *pointerForOop(usqInt oop)			{ return sqMemoryBase + oop; }
very well where sqMemoryBase is a non-zero constant or dynamic value. 

This results in a significant slowdown. 
Your choice is to use the Intel compiler, or to compile just for 32bit mode. 

Note that sqMemoryBase is either a zero constant if in 32bit VM and 32bit image mode then compiler optimized away, or 0x117000000 in 64bit vm, 32bit image or a dynamic value of zero if in 64bit vm, 64 image mode

So given all that we end up with 

sqMemoryBase equal to 0x117000000 and the start of the oops space a 32bit value of zero. 

Mmm freeChunk is a signed integer, but when it runs the "freeChunk := oop - hdrBytes." does freechunk then equal  zero? 

> ifFalse: ["start a new free chunk"
> 							freeChunk := oop - hdrBytes.
> 							"chunk may start 4 or 8 bytes before oop"
> 							freeChunkSize := oopSize + (oop - freeChunk).
> 							"adjust size for possible extra header bytes"
> 							firstFree = nil ifTrue: [firstFree := freeChunk]]]


Likely of course this signage issue is a concern because what if freeChunk :=  0x9000000 making a negative number then later it does the freeChunkSize := oopSize + (oop - freeChunk) and I think the math would be wrong, well assuming the (oop - freeChunk) is a signed operatin, I'd guess both firstFree and freeChunk should be unsigned ints just to clarify. 

Er isn't freeChunkSize := oopSize + (oop - freeChunk).  actually  freeChunkSize := oopSize + hdrBytes.  Or am I confused tonight? 

Anyway the problem is that freeChunk = zero is also being used as a test for freeChunk has no value versus freeChunk is zero is a valid value. 

Therefore to fix I think we have to set freeChunk to 0xFFFFFFFF then check for freeChunk ~= 0xFFFFFFFF
Also we need to set firstFree to 0xFFFFFFFF and check both firstFree = 0xFFFFFFFF ifTrue:

In 64bit mode we need to use 0xFFFFFFFFFFFFFFFF   

That would allow 0x0 as a valid value and where 0xFFFFFFFF... is a non-value. 

On 2010-06-14, at 11:31 PM, ungar at mac.com wrote:

> 
> Hello Dave,
> 
> You are welcome. I am indeed reading  in a snapshot with 32-bit oops into the stock ST Mac OSX VM I downloaded from the website. The file name is: Squeak 64-32 5.4b2. I don't know what its setting of SQ_FAKE_MEMORY_OFFSET is, since I am not building it, just using it. I do know that the first word of the heap is called "0" when running in 64-bit mode. (It is a fat binary that will run on an Intel CPU either in 32 or 64 bit mode.)
> 
> I am not (perhaps yet) working with a 64-bit oops size, but may do so in the future, and if so, your pointer will be helpful. Thank you!
> 
> Do you know who might be in a position to look at the bug I found and decide what is the right thing to do for the Squeak community? I am trying to figure out who "holds the baton" for that system.
> 
> Thank you,
> 
> - David
> 
> 
> On Jun 14, 2010, at 12:06 PM, David T. Lewis wrote:
> 
>> 
>> Hello David and thank you for reporting this.
>> 
>> I am assuming that you are working with a 32-bit word size object
>> memory (not the Squeak 64-bit object memory), and that "64 bit VM"
>> refers to compiling for 64-bit systems (64-bit machine pointer size).
>> If so, the difference that you are seeing between the 32-bit and 64-bit
>> version may be due to the setting of SQ_FAKE_MEMORY_OFFSET in the
>> platforms/Cross/vm/sqMemoryAccess.h header file. This is set to a non-zero
>> value for the the combination of 32-bit object work and 64-bit host machine,
>> specifically to help flush out bugs like the one you have found here.
>> 
>> I suspect that if you set SQ_FAKE_MEMORY_OFFSET to 0, the problem may
>> appear to go away.
>> 
>> If you are also working with 64 bit object memories, then you will want
>> to be aware of the (probably unrelated) issue that is documented at
>> <http://bugs.squeak.org/view.php?id=7455>. The bug is harmless at the
>> moment, but if you are making any changes to the image snapshot file
>> header you will want to keep it in mind. The summary is:
>> 
>> 7455: VM overwrites extraVmMemory value with junk on 64 bit image 
>> A 64 bit image (wordSize 8) is currently using a 128 byte image file header,
>> with all the integer values in the header stored as 8 bit ints. The VM is
>> writing the image data at position 64 rather than 128, resulting the the
>> value of extraVmMemory being overwritten with garbage. ImageTracer64
>> produces the correct 128 byte offset, but this is undone by the VM after
>> the first image save.
>> 
>> Dave
>> 
>> On Mon, Jun 14, 2010 at 10:00:48AM -0700, ungar at mac.com wrote:
>>> 
>>> Friends, Romans, and SqueakVMers,
>>> 
>>> As some of you know, I am working on my own Squeak VM, and I recently ran into a strange situation where the 64-bit Squeak VM would loop forever reading one of my snapshots. The 32-bit version--that is, just running it in 32-bit mode--worked fine. I spent some time tracking this down, and believe that the problem was caused by a bug in the 64-bit Squeak VM GC code that is only excited when the first word in the heap is the start of a free chunk. I don't know where I should be sending this report, so I'm sending it to you, in hopes that you may be able to forward it to the right place and/or people.
>>> 
>>> 
>>> Here is the explanation:
>>> 
>>> 
>>> 
>>> Since my object format uses extra words per object preceding the standard ST header, when I write out a snapshot, I write those words out as "Free chunks", the official ST way of saying there are X words not part of any object.
>>> In the 32-bit version of the real Squeak VM, Oops are (tagged) addresses of the target objects.
>>> The 64-bit version of the real Squeak VM, each 32-bit  oop in the image is added to an offset to product the (64-bit) address of the object. Thus, oops in the image are relative to the start of the heap.
>>> 
>>> The sweep phase of the Squeak VM GC is this:
>>> 
>>> 
>>>> sweepPhase
>>>> 	"Sweep memory from youngStart through the end of memory. Free all
>>>> 	inaccessible objects and coalesce adjacent free chunks. Clear the mark
>>>> 	bits of accessible objects. Compute the starting point for the first pass of
>>>> 	incremental compaction (compStart). Return the number of surviving
>>>> 	objects. "
>>>> 	"Details: Each time a non-free object is encountered, decrement the
>>>> 	number of available forward table entries. If all entries are spoken for
>>>> 	(i.e., entriesAvailable reaches zero), set compStart to the last free
>>>> 	chunk before that object or, if there is no free chunk before the given
>>>> 	object, the first free chunk after it. Thus, at the end of the sweep
>>>> 	phase, compStart through compEnd spans the highest collection of
>>>> 	non-free objects that can be accomodated by the forwarding table. This
>>>> 	information is used by the first pass of incremental compaction to
>>>> 	ensure that space is initially freed at the end of memory. Note that
>>>> 	there should always be at least one free chunk--the one at the end of
>>>> 	the heap."
>>>> 	| entriesAvailable survivors freeChunk firstFree oop oopHeader oopHeaderType hdrBytes oopSize freeChunkSize endOfMemoryLocal |
>>>> 	self inline: false.
>>>> 	self var: #oop type: 'usqInt'.
>>>> 	self var: #endOfMemoryLocal type: 'usqInt'.
>>>> 	entriesAvailable := self fwdTableInit: self bytesPerWord * 2.
>>>> 	survivors := 0.
>>>> 	freeChunk := nil.
>>>> 	firstFree := nil.
>>>> 	"will be updated later"
>>>> 	endOfMemoryLocal := endOfMemory.
>>>> 	oop := self oopFromChunk: youngStart.
>>>> 	[oop<  endOfMemoryLocal]
>>>> 		whileTrue: ["get oop's header, header type, size, and header size"
>>>> 			statSweepCount := statSweepCount + 1.
>>>> 			oopHeader := self baseHeader: oop.
>>>> 			oopHeaderType := oopHeader bitAnd: TypeMask.
>>>> 			hdrBytes := headerTypeBytes at: oopHeaderType.
>>>> 			(oopHeaderType bitAnd: 1) = 1
>>>> 				ifTrue: [oopSize := oopHeader bitAnd: self sizeMask]
>>>> 				ifFalse: [oopHeaderType = HeaderTypeSizeAndClass
>>>> 						ifTrue: [oopSize := (self sizeHeader: oop) bitAnd: self longSizeMask]
>>>> 						ifFalse: ["free chunk" oopSize := oopHeader bitAnd: self longSizeMask]].
>>>> 			(oopHeader bitAnd: self markBit) = 0
>>>> 				ifTrue: ["object is not marked; free it"
>>>> 					"<-- Finalization support: We need to mark each oop chunk as free -->"
>>>> 					self longAt: oop - hdrBytes put: HeaderTypeFree.
>>>> 					freeChunk ~= nil
>>>> 						ifTrue: ["enlarge current free chunk to include this oop"
>>>> 							freeChunkSize := freeChunkSize + oopSize + hdrBytes]
>>>> 						ifFalse: ["start a new free chunk"
>>>> 							freeChunk := oop - hdrBytes.
>>>> 							"chunk may start 4 or 8 bytes before oop"
>>>> 							freeChunkSize := oopSize + (oop - freeChunk).
>>>> 							"adjust size for possible extra header bytes"
>>>> 							firstFree = nil ifTrue: [firstFree := freeChunk]]]
>>>> 				ifFalse: ["object is marked; clear its mark bit and possibly adjust
>>>> 					the compaction start"
>>>> 					self longAt: oop put: (oopHeader bitAnd: self allButMarkBit).
>>>> 					"<-- Finalization support: Check if we're running about a weak class -->"
>>>> 					(self isWeakNonInt: oop) ifTrue: [self finalizeReference: oop].
>>>> 					entriesAvailable>  0
>>>> 						ifTrue: [entriesAvailable := entriesAvailable - 1]
>>>> 						ifFalse: ["start compaction at the last free chunk before this object"
>>>> 							firstFree := freeChunk].
>>>> 					freeChunk ~= nil "BUG cannot handle free start--dmu! ***********************************************"
>>>> 						ifTrue: ["record the size of the last free chunk"
>>>> 							self longAt: freeChunk put: ((freeChunkSize bitAnd: self longSizeMask) bitOr: HeaderTypeFree).
>>>> 							freeChunk := nil].
>>>> 					survivors := survivors + 1].
>>>> 			oop := self oopFromChunk: oop + oopSize].
>>>> 	freeChunk ~= nil
>>>> 		ifTrue: ["record size of final free chunk"
>>>> 			self longAt: freeChunk put: ((freeChunkSize bitAnd: self longSizeMask) bitOr: HeaderTypeFree)].
>>>> 	oop = endOfMemory
>>>> 		ifFalse: [self error: 'sweep failed to find exact end of memory'].
>>>> 	firstFree = nil
>>>> 		ifTrue: [self error: 'expected to find at least one free object']
>>>> 		ifFalse: [compStart := firstFree].
>>>> 
>>>> 	^ survivors
>>>> 
>>> 
>>> Notice the line with the comment I added, it has a lot of asterisks. It is trying to go back to the last free chunk found and update its length, now that its found non-free chunk.
>>> But, the test works by initializing the "freeChunk" variable to zero (called "nil" in this code)!!
>>> If the first free chunk is at the start of the heap, it's "oop" will be zero in the 64-bit case, and it's length will NEVER be initialized.
>>> As a result, when the compaction algorithm scans the heap, it gets stuck on this zero-length free chunk and loops forever.
>>> (I think the other nil tests are troublemakers, too, BTW).
>>> 
>>> The workaround for my VM is simply to skip those first free words, so that the first word in the heap is a non-free object. (The snapshot code does a GC so all real free objects are gone by this time).
>>> 
>>> However, you might want to redo the tests so the algorithm tolerates a free chunk at the start of the heap.
>>> 
>>> Thank you for all your efforts!
>>> 
>>> - David
>>> 
>>> 
>>> 
>>> 
> 

--
===========================================================================
John M. McIntosh <johnmci at smalltalkconsulting.com>   Twitter:  squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================




-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2445 bytes
Desc: not available
Url : http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20100615/244d0da1/smime.bin


More information about the Vm-dev mailing list