[Vm-dev] Primitive to set an identityHash

Eliot Miranda eliot.miranda at gmail.com
Wed Jan 18 21:02:57 UTC 2012


On Wed, Jan 18, 2012 at 12:54 PM, Henrik Johansen <
henrik.s.johansen at veloxit.no> wrote:

>
>
> On Jan 18, 2012, at 8:02 50PM, Eliot Miranda wrote:
>
>
>
> On Wed, Jan 18, 2012 at 2:25 AM, Henrik Johansen <
> henrik.s.johansen at veloxit.no> wrote:
>
>>
>> I really don't see what good could come of it being available in general…
>>
>
> I can think of one good use, which my file tried to illustrate.  If Symbol
> instances' identity hashes were derived from their string hash then they
> would be hashed the same in all images.  One can take advantage of this in
> e.g. method dictionary layout and hence binary class loading.  This happens
> in two steps.
>
> With modern machines, where linear search through a dictionary is fast,
> and with a JIT with inline caching (and even an interpreter with a large
> method lookup cache), where method dictionaries are not looked at much, one
> can save a significant amount of space by making method dictionaries flat
> pair-wise arrays of selector, method.  Most method dictionaries are small
> and linear search is faster than fetching the Symbol's identity hash and
> doing a hash probe.  By ordering method dictionaries by selector
> identityHash, very large method dictionaries such as Object's are indexed
> using binary search.  We saved about 8% of the image size in VisualWorks by
> moving to this representation (one saves on eliminating the nils in the
> selector vector and the value vector, and in eliminating the value
> vector/method array, hence saving its header space; you still need the
> space for the method).  [The savings in Squeak look to be much less; I just
> found that the same overhead in the 4.3 trunk image is only ~1.7%].
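
As a rough illustration of the flat layout described above, here is a
minimal sketch in Python rather than Smalltalk, so it can be run outside
an image. Everything in it is hypothetical: selector_hash (FNV-1a masked
to 12 bits) merely stands in for a selector's identityHash, and
LINEAR_LIMIT is an arbitrary cutoff, not a value taken from VisualWorks
or Squeak.

```python
LINEAR_LIMIT = 8  # hypothetical cutoff (in pairs) between linear and binary search


def selector_hash(selector: str) -> int:
    """Stand-in for a selector's identityHash: FNV-1a, masked to 12 bits."""
    h = 0x811C9DC5
    for byte in selector.encode("utf-8"):
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h & 0xFFF


def build_flat(methods: dict) -> list:
    """Flatten {selector: method} into [sel, meth, sel, meth, ...],
    ordered by selector hash, with no empty slots."""
    flat = []
    for sel in sorted(methods, key=lambda s: (selector_hash(s), s)):
        flat += [sel, methods[sel]]
    return flat


def lookup(flat: list, selector: str):
    """Return the method stored for selector, or None if it is absent."""
    n = len(flat) // 2
    if n <= LINEAR_LIMIT:
        # Small dictionary: scan the selectors without computing any hash.
        for i in range(0, len(flat), 2):
            if flat[i] == selector:
                return flat[i + 1]
        return None
    # Large dictionary: binary search for the first pair whose hash >= target.
    target = selector_hash(selector)
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if selector_hash(flat[2 * mid]) < target:
            lo = mid + 1
        else:
            hi = mid
    # Different selectors may share a hash, so scan the run of equal hashes.
    while lo < n and selector_hash(flat[2 * lo]) == target:
        if flat[2 * lo] == selector:
            return flat[2 * lo + 1]
        lo += 1
    return None
```

For example, lookup(build_flat({"at:": 1, "size": 2}), "size") takes the
linear path and returns 2. The final loop is needed because a binary
search on hashes can only narrow the search to a run of equal hashes,
not to a single selector.
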
>
> Now, if in addition selector identityHashes are deterministic, derived
> from their string hash, then one does not need to rehash/reorder a method
> dictionary when loading it from a binary stream (e.g. Fuel), which is again
> a win.
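
To make the loading argument concrete, here is a small Python sketch
(illustration only, not Fuel or VM code). It assumes a toy model in
which each image hands out identity hashes in symbol creation order;
real images assign them differently, but the point is only that they
differ from image to image. The names string_hash, flatten and
needs_reorder are made up for this example.

```python
def string_hash(selector: str) -> int:
    """Deterministic hash of the selector's characters (FNV-1a, 32-bit)."""
    h = 0x811C9DC5
    for byte in selector.encode("utf-8"):
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h


def flatten(methods: dict, hash_of) -> list:
    """Write a method dictionary as [sel, meth, ...] ordered by hash_of(sel)."""
    flat = []
    for sel in sorted(methods, key=hash_of):
        flat += [sel, methods[sel]]
    return flat


def needs_reorder(flat: list, hash_of) -> bool:
    """True if the loading image must re-sort the pairs before searching them."""
    hashes = [hash_of(flat[i]) for i in range(0, len(flat), 2)]
    return hashes != sorted(hashes)


methods = {"at:": "m1", "at:put:": "m2", "size": "m3", "do:": "m4"}

# Toy per-image identity hashes: image A and image B created the same
# symbols in opposite orders, so the same symbol hashes differently in each.
image_a = {sel: i for i, sel in enumerate(methods)}
image_b = {sel: i for i, sel in enumerate(reversed(list(methods)))}

written_by_a = flatten(methods, image_a.get)
written_with_string_hash = flatten(methods, string_hash)
```

Here needs_reorder(written_by_a, image_b.get) is True, so image B would
have to re-sort what image A wrote, whereas
needs_reorder(written_with_string_hash, string_hash) is False in every
image, because the ordering depends only on the selector's characters.
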
>
> Now, I'm not suggesting we do either of these things now, but making
> Symbol identity hashes deterministic, derived from their string hash, can
> enable significant optimisation further down the road.
>
>
> And I wasn't suggesting it is a bad thing to use in all cases, but rather
> protesting having a method in the image screaming "use me if you need to
> change identityHash for whatever reason!" There's just no good general
> comment to put there on when it might be a good idea, and "Don't use this
> unless you know what you're doing!" never seems to stop anyone (well,
> speaking on my own behalf at least…)
>
> I'd rather see usage defined on a case-by-case basis, where you can more
> explicitly comment why using it in this particular case is a good idea,
> like you wrote above, and what Mariano mentioned for proxies.
>
> So rather than:
>
> Object >> setIdentityHashTo: aNumber
>     <primitive: 161>
>
> you have:
>
> Symbol >> initialize
>     self deriveIdentityHashFrom: self hash
>
> Symbol >> deriveIdentityHashFrom: aNumber
>     "This should ONLY be called as part of object initialization!"
>     "Symbols benefit from not using the default identityHash by *insert
>     Eliot's explanation here*"
>
> and similar for Mariano's Proxy class.
>
> Cheers,
> Henry
>

Good idea!

-- 
best,
Eliot