Hi Chris,

On Fri, Nov 16, 2018 at 3:11 PM Chris Muller <ma.chris.m@gmail.com> wrote:
> Chris,
> If/when we implement this, the idea is to rehash the collection using:
>     HashedCollection rehashAll
> Will this adversely affect your MagmaDictionaries?

No, but it won't help them at all, either.

Does anyone know if ReferenceStream automatically rehashes Dictionaries
and Sets when it materializes them?  If not, then the impact of
changing the hash calculation is much higher than if it does.

It has to.  Since Object>>#= and Object>>#hash fall back on #== and #identityHash, Dictionary, Set et al. instances are potentially affected by identity, and an object's identityHash is in general not the same once it has been materialized.  Therefore, on reconstructing a Dictionary, Set et al. instance, an unpickler must rehash (*).


(*) unless it verifies that all elements implement their own #= and #hash, which is intractable in practice; the only ways I can see of verifying that an object does not use #== or #identityHash in its #= and #hash methods are

a) to analyze the code (impossible for a non-AI unpickler) or,
b) to construct a shallow copy of an object (since lots of #= implementations short-cut via "^self == other or: [...") and simulate #= and #hash, to see if #== or #identityHash is sent

Either of these would slow down unpickling enormously; rehashing invokes #= and #hash anyway, but at full execution speed.
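
A minimal workspace sketch of why the rehash is needed, assuming a Squeak image.  It does not pickle anything; a mutated key merely stands in for an element whose #hash answers a different value than it did when the Set was built, which is the situation an unpickler faces:

```smalltalk
| key set |
"Stand-in for an element whose #hash differs from the value used at insertion."
key := OrderedCollection with: 1 with: 2.
set := Set new.
set add: key.

"Mutating the key changes its #hash, but the Set still holds it in the
 slot chosen from the old hash."
key add: 3.
Transcript show: (set includes: key) printString; cr.
"usually false: the lookup starts probing from the new hash"

"rehash re-inserts every element using its current #hash."
set rehash.
Transcript show: (set includes: key) printString; cr.
"true"
```

The first lookup can still succeed by accident if the old and new hashes map to the same slot, which is why it is only "usually" false.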


Magma does, so no action should be needed for regular Sets and Dictionaries.

> Or will those be handled some other way?

However, Magma has another special dictionary called a MagmaDictionary
which is designed to be larger than RAM.  It maintains a reference to
its 'session', which communicates with the server to access the large
dictionary contents straight from the disk.  If I have Intervals in
any of these, I'll have to manually plan to rebuild them from scratch
with a utility script(s), because they don't support rehashing the way
small, in-memory HashedCollections do.

 - Chris



--
_,,,^..^,,,_
best, Eliot