[ENH][VM] HashBits, a lazy way

Wed Jul 16 01:09:59 UTC 2003

Hi John,

I've been looking over your changes and my feeling is that the complexity of
the implementation of having objects "magically aquire" a hash when they are
moved from young to old space just isn't worth the trouble of finding the
bugs if any mistake is ever made here.

Perhaps as an enlightening example, you have (at least) one bug in the hash
implementation. In #newObjectHash you try to avoid zero hashes, however, you
do this based on a 16 bit counter where we only use 12 bits later on. In
other words, if the lastHash is computed to 16rX000 then the hash actually
stored *will* be zero (btw, surprising you didn't find this - it happens all
4096 allocations or so ;-) The kind of extremely subtle problems we're
potentially introducing here is *way* beyound what I consider reasonable,
even given the potential speed advantages. In short, if any such change is
ever made, then we really should do so on a single consistent model (zero
hash means unassigned) and not based on a mix of space location, with the
need to assign those hashes when something moves between here and there. My
opinion: Not in VI3.

Here's another thought on this. Not sure if you realize this but our "random
number generator" has a periodicity of 4096, e.g., the values are spread
evenly so it's essentially nothing but a shuffled list of values from zero
to 4096. This, in turn, means that you don't have to do any computation if
you don't want to - you could precompute all the hashes and keep them in a
lookup table with the bits at the right position already. I don't know what
the implications on speed are (considering cache size effects etc.) but it
may very well be worth a try and it avoids the nastyness of the above. E.g.,
#newObjectHash might just look like:

	^hashTable at: (hashIndex := hashIndex + 1 bitAnd: 4095)

About the other changes, the refactoring for allocating unfilled objects
look reasonable but I don't understand why you changed the "other
primitives" to use the unfilled allocation. There is no reason for it except
from an (irrelevant) tiny increment in speed. It seems to me that we should
only optimize places where we can show that they have practical implications
- none of which is true for any of the "other primitives". Allocating filled
objects is the safe, recommended way and using unfilled allocations should
really only be done in places where we expect people to understand that they
are time-critical (such as the #floatObjectOf: or
#positive32BitIntegerFor:).

BTW, your measures below are all micro-benchmarks, did you ever try any of
the macros to see what the results are in practice? In particular, the
impact on tenuring would be interesting in the "assign hashes on tenure"
scheme. I'd like to see if there's a real-world improvement besides the
theoretical case. And, I would actually expect any measurable improvements
come from the unfilled allocations - any chance to factor those out and run
macros on them separately?

Cheers,
  - Andreas

PS. Here's the code I used for periodicity checking. I'm sure the effect can
be proven analytically:

bag := Bag new.
period := 4096.
0 to: 16rFFFF do:[:initialHash|
	(initialHash bitAnd: 16r0F) = 0 
		ifTrue:[initialHash printString displayAt: 0 at 0].
	lastHash := initialHash.
	"Compute period new hash values"
	1 to: period do:[:i| lastHash _ 13849 + (27181 * lastHash) bitAnd:
65535].
	"See if the last value is the same as the initial one."
	repeat := (lastHash bitAnd: 4095) = (initialHash bitAnd: 4095).
	bag add: repeat.
].
"If all of the initial values repeat it follows that the
generator repeats at (at most) period for the range of values used."
bag sortedCounts

=> a SortedCollection(65536->true)

> -----Original Message-----
> From: squeak-dev-bounces at lists.squeakfoundation.org 
> [mailto:squeak-dev-bounces at lists.squeakfoundation.org] On 
> Behalf Of John M McIntosh
> Sent: Tuesday, July 15, 2003 11:35 PM
> To: squeak-dev at lists.squeakfoundation.org
> Subject: [ENH][VM] HashBits, a lazy way 
> 
> 
> Attached is the changeset to:
> 
> a) Defer creating the identity hash until it is actually  needed by  
> someone calling the identity hash primitive.   This removes it from  
> clone:, instantiate class and instantiate small class.
> 
> b) Cleanup of the small class allocation, and instantiate class  
> allocation logic to streamline the process.
> 
> c) no fill of most small class objects (Points,float, context, etc)  
> Plus some other objects we immediate fill with data, such as the VM  
> clipboard, vmparms, vm image name, etc...
> 
> d) A test class to iterate over the possible (I think) set of  
> allocation requests to ensure before/after correctness.
> 
> This change set improves the allocation rate {based on allocating 50  
> objects in a loop}.
> 
> 1 asFloat, which now avoids generating a hash, and filling 
> the object  
> with zero bits before filling with the data.
> before 3.5.2b1 -> 133,153 allocations per second, versus now ->   
> 150,459 per second. 13% better.
> 
> 2 at 3
> before 3.5.2b1 -> 165,856, after -> 183,413. 11% better
> 
> Array new: 4, which now avoids generating a hash, plus a bit of  
> streamlining.
> before 3.5.2b1 -> 129,045, versus new 138,067 per second, 7% better.
> 
> VM folks should check all this to see if there is some issue I've  
> overlooked asking the questions...
> 
> a) Are the calls where we allocate small class objects with no-fill,  
> viable, and not likely to raise a core dump because
> of bad pointers? All calls, except for 
> Interpreter>>primitiveNewMethod,  
> and ObjectMemory>>instantiateContext:sizeInBytes:
> allocate bytes, those two allocate object. However  
> instantiateContext:sizeInBytes: did not do a fill before so 
> it should  
> be ok based on historical evidence.
> 
> b) Will objects with an identity hash of zero  leak into old space.  
> I've attempted to find all occurrences. The expected ones
> are at tenure time in the incremental GC, the tenure when 
> doing a full  
> GC. Keep in mind the rule is that Old space may already contain  
> identity hash values of zero, and image segments we load 
> might contain  
> hash values of zero. Also segments we create should of course have  
> non-zero hash values for all objects we store into that segment.
> 
> Other more exotic ones can be at
> ObjectMemory>>restoreHeadersAfterBecoming: with:
> ObjectMemory>>restoreHeadersAfterForwardBecome
> {For these I don't believe I need to calculate the identity hash at  
> this point, but I'm being paranoid, however someone can
> confirm this decision and perhaps the code can be removed from these  
> two methods)
> 
> Then just to complicate things we can expect issues when we store or  
> load an image segment.  IE load/save a project.
> To avoid issues here what I have done is in
> Interpreter>>primitiveLoadImageSegment
> Interpreter>>primitiveStoreImageSegment
> 
> I invoke a incremental GC and force the tenure of all the objects to  
> oldspace which triggers doing an identity hash
> creation on all survivors. Then the primitiveStoreImageSegment which  
> copies everything to the supplied chunk of memory  will of 
> course find  
> only Old Space objects with non-zero identity hash values.  The  
> primitiveLoadImageSegment will end up remapping
> the segment that contains the loaded objects into a set of 
> real objects  
> and as far as I know it doesn't invoke an allocation, also it
> will of course find the segment in oldspace because of the earlier  
> forced tenure event, and if it brings into objects with a 
> hash value of  
> zero they won't be converted.
> 
> Confirmation that I've managed to make this bit of code work 
> correctly  
> because I'm not an expert on how the image segment loading/unloading  
> works would be welcome. Missing in my test suite is  a before after  
> check on hash identity zero counts... Oh well tomorrow...
> 
> Note when we migrate to v4.x and have an image format change we can  
> switch to all identity hash values
> being non-zero when used, and remove the logic to hash when moving  
> objects out of youngSpace which saves a bit
> of CPU time and avoids complexity, by falling back on the rule of  
> identity hash = 0 is a marker to invoke the hash.
> 
> 
> PS some rewriting of New: might occur later this month, because I  
> noticed we check for space in youngSpace for the new,
> force a igc/fullgc etc if not enough, then if ok we invoke the same  
> space checks & logic before allocating the object. Seem we
> could do the check once, by changing the failure mode in the 
> allocation  
> logic to decide if the failure is a VM core dump, or a
> primitive fail.
> --
> ==============================================================
> ========== 
> ===
> John M. McIntosh <johnmci at smalltalkconsulting.com> 1-800-477-2659
> Corporate Smalltalk Consulting Ltd.  
> http://www.smalltalkconsulting.com
> 
> ==============================================================
> ========== 
> ===
>