Hi Chris,
On Fri, Nov 16, 2018 at 3:11 PM Chris Muller ma.chris.m@gmail.com wrote:
Chris, if/when we implement this, the idea is to rehash the collections using:
    HashedCollection rehashAll
Will this adversely affect your MagmaDictionaries?
No, but it won't help them at all, either.
Does anyone know if ReferenceStream automatically rehashes Dictionaries and Sets when it materializes them? If it does not, the impact of changing the hash calculation is much higher than if it does.
It has to. Since Object>>#= and Object>>#hash fall back on #== and #identityHash, Dictionary, Set et al instances are potentially affected by identity, and an identityHash generally differs once the objects have been rebuilt in another image. Therefore, on reconstructing a Dictionary, Set et al instance, an unpickler must rehash (*).
(*) unless it verifies that all elements implement their own #= and #hash, which is intractable in practice; the only ways I can see of verifying that an object does not use #== or #identityHash in its #= and #hash methods are
a) to analyze the code (impossible for a non-AI unpickler), or
b) to construct a shallow copy of an object (since lots of #= implementations short-cut via "^self == other or: [...") and simulate #= and #hash, to see if #== or #identityHash is sent
Either of these would slow down unpickling enormously; rehashing invokes #= and #hash anyway, but at full execution speed.
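To make the need for a rehash concrete, here is a minimal workspace sketch. It is not taken from ReferenceStream or Magma: it assumes a current Squeak image, and it simulates a changed hash by mutating an Array (whose #hash depends on its contents) after it has been added to a Set. A changed hash calculation or a fresh identityHash after unpickling leaves the collection in the same state: the element sits in a slot chosen under the old hash.

```smalltalk
"Workspace sketch: a stale hash slot, then a rehash."
| key set |
key := Array with: 1 with: 2.   "Array>>hash is computed from the elements"
set := Set with: key.           "key is stored in the slot chosen by its current hash"
Transcript show: (set includes: key) printString; cr.   "true"

key at: 1 put: 99.              "the hash of key changes; its slot in the set does not"
Transcript show: (set includes: key) printString; cr.   "usually false, since the lookup starts from the new hash"

set rehash.                     "re-insert every element under its current hash"
Transcript show: (set includes: key) printString; cr.   "true"
```

The middle lookup can still succeed by coincidence if the old and new hashes map to the same slot, which is why no fixed answer is claimed for it. After the rehash the element is always found again.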
Magma does, so no action should be needed for regular Sets and Dictionaries.
Or will those be handled some other way?
However, Magma has another special dictionary called a MagmaDictionary, which is designed to be larger than RAM. It maintains a reference to its 'session', which communicates with the server to access the large dictionary contents straight from the disk. If I have Intervals in any of these, I'll have to plan to rebuild them manually from scratch with utility script(s), because they don't support rehashing the way small, in-memory HashedCollections do.
- Chris