[squeak-dev] true hash

Chris Muller asqueaker at gmail.com
Thu May 10 19:15:06 UTC 2012


Gentlemen.  Models extend outside the image and that's the context
I've been describing the issue from the start of the thread.  The
example for Bert was just a "sub-case" of the more general problem; I
was just trying to be illustrative with a closer-to-home scenario.

Please tell me what your solution is for the case where the legacy
persistent model is a MagmaDictionary hosted on-line with 60M elements
in it -- running right now in production?  Rehashing is no solution even
if it were possible and feasible, because each of hundreds of
*clients* might have different hash values for true, so the shared
model is being corrupted.  I argue they should be consistent.
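
To make the failure mode concrete, here is a rough sketch in Python of
two clients sharing one hashed collection.  It is only a toy model:
SharedTable, its linear probing, and the capacity of 16 are assumptions
made up for illustration and say nothing about how MagmaDictionary is
actually implemented.  The two hash values are the ones quoted below
for true before and after 12/1/2009.

```python
class SharedTable:
    """Toy open-addressing table with linear probing (illustration only)."""

    def __init__(self, capacity=16):
        self.slots = [None] * capacity

    def _probe(self, key, hash_value):
        # Start at hash mod capacity, then walk forward until we reach
        # either the key or the first empty slot.
        n = len(self.slots)
        index = hash_value % n
        for _ in range(n):
            entry = self.slots[index]
            if entry is None or entry[0] == key:
                return index
            index = (index + 1) % n
        raise RuntimeError("table is full")

    def at_if_absent_put(self, key, value, hash_value):
        # The caller supplies the hash, standing in for "each client
        # computes its own hash for the same key".
        index = self._probe(key, hash_value)
        if self.slots[index] is None:
            self.slots[index] = (key, value)
        return self.slots[index][1]

    def occurrences_of(self, key):
        return sum(1 for e in self.slots if e is not None and e[0] == key)


def simulate_two_clients():
    table = SharedTable()
    # Client running an older image: true hash is 2950.
    table.at_if_absent_put(True, "written by old client", 2950)
    # Client running a newer image: true hash is 773324800.
    table.at_if_absent_put(True, "written by new client", 773324800)
    return table.occurrences_of(True)
```

In this toy model the second client starts probing at a different slot,
hits an empty one, and stores a second entry for the "same" key, so
simulate_two_clients() reports two occurrences of true.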

I accept the counter-argument that this "probably wouldn't ever
happen" -- but I'm also concerned about the *severity* of the
punishment were it to occur, which I've already laid out.  The
solution is painless; I do not understand your objections.

Smalltalk is breaking out of the local image, expanding into the
network, so we should accept the idea of a universal "value" of true,
not just the true object local to the running image.  For that idea to
be safer, we should NOT continue to depend on true's #identityHash --
which, as Bert said, is a bad idea.  We should instead allow the
identityHash to vary independently from its #hash, a constant, in case
it needs to change again (as it did on 12/1/2009).
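
A minimal sketch of that idea, again in Python rather than Smalltalk
and with invented names throughout: IdentityHashedTrue stands for a
true whose hash follows its identity hash, ConstantHashedTrue for a
true whose hash is a fixed constant.  The constant 1 and the capacity
of 16 are arbitrary assumptions; nothing here is actual Squeak code.

```python
CAPACITY = 16  # assumed table size, for illustration only


def start_slot(hash_value, capacity=CAPACITY):
    # Slot where a hashed collection would begin searching for the key.
    return hash_value % capacity


class IdentityHashedTrue:
    """Hypothetical `true` whose hash is derived from its identity hash."""

    def __init__(self, identity_hash):
        self.identity_hash = identity_hash  # may differ between images

    def hash(self):
        return self.identity_hash


class ConstantHashedTrue(IdentityHashedTrue):
    """Hypothetical `true` whose hash is the same constant everywhere."""

    STABLE_HASH = 1  # arbitrary value chosen for this sketch

    def hash(self):
        return self.STABLE_HASH


# Two images whose `true` hashes differ, using the values from this thread.
old_identity = IdentityHashedTrue(2950)
new_identity = IdentityHashedTrue(773324800)
old_constant = ConstantHashedTrue(2950)
new_constant = ConstantHashedTrue(773324800)

identity_slots = (start_slot(old_identity.hash()), start_slot(new_identity.hash()))
constant_slots = (start_slot(old_constant.hash()), start_slot(new_constant.hash()))
```

With the identity-derived hash the two images begin searching at
different slots; with the constant hash they agree, whatever happens
to the identity hash underneath.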




On Thu, May 10, 2012 at 12:38 PM, Bert Freudenberg <bert at freudenbergs.de> wrote:
>
> On 10.05.2012, at 19:22, Eliot Miranda wrote:
>
>
>
> On Thu, May 10, 2012 at 10:16 AM, Chris Muller <asqueaker at gmail.com> wrote:
>>
>> >> I should add, it's already happened.  In 2009 Levente changed
>> >> Object>>#identityHash to answer the scaledIdentityHash.
>> >
>> > Not in Squeak. Our IdentityDictionary uses scaledIdentityHash nowadays,
>> > but identityHash itself is left alone, answering the primitive value
>> > directly.
>>
>> I meant to say Object>>#hash, not #identityHash.
>>
>> So, before 12/1/2009:
>>
>>     true hash  "2950"
>>
>> but after 12/1/2009
>>
>>     true hash  "773324800"
>>
>> So, any saved persistent EToys ReferenceStream object-model files
>> with true involved in the calculation of #hash prior to 2009 will now
>> be goofed up unless you remember to rehash all regular Dictionaries
>> after loading them.  The properties of this bug are:
>>
>>  - it is hidden, you had no idea it was there because no SUnit test
>> can possibly catch it.  It didn't show until production.
>>  - it is image-specific -- you load the file in an image from before
>> Levente's change and everything seems fine.  What's going on?
>>  - it is "intermittent" because there's a small possibility that, if
>> the Dictionary were small, you might get lucky with a "hit" anyway
>> when calculating the slot to start searching at
>>  - it could lead to corrupt data model, because perhaps the app does
>> something like #at:ifAbsentPut:, and maybe even on an
>> otherwise-equivalent object, so you end up with TWO of the "same"
>> object in the dictionary.  What a disaster!
>>
>> Now does it make sense?
>
>
> No.  One *always* has to rehash on loading binary since one cannot guarantee
> that identityHashes will be the same in the loading environment as the
> saving environment.  It's a non-issue.
>
> --
> best,
> Eliot
>
>
> Yep. See e.g. ImageSegment>>restoreEndianness (which does a bit more than
> the name suggests).
>
> - Bert -
>
>
>
>

