[squeak-dev] [Vm-dev] [Pharo-dev] Byte & String collection hash performance; a modest proposal for change.

Martin McClure martin at hand2mouse.com
Tue May 2 01:06:42 UTC 2017


On 05/01/2017 11:55 AM, Eliot Miranda wrote:
> 
> 
> On Mon, May 1, 2017 at 10:37 AM, Tony Garnock-Jones
> <tonyg at ccs.neu.edu> wrote:
> 
>     On 5/1/17 1:26 PM, Levente Uzonyi wrote:
>     > I presume that a general purpose in-image solution would be more
>     > complex. String already has too many subclasses (6 in Squeak), while at
>     > the same time other kinds of new subclasses would be welcome too, e.g.
>     > Strings with 2-byte characters.
>     > Since these properties are orthogonal, there would be many new
>     > subclasses to cover all cases.
> 
>     A classic motivating case for Traits, right?
> 
> 
> Alas no.  The problem with String's subclasses is that they're binary
> objects.  They have no inst vars in which one can cache a hash.  They're
> just a flat vector of bytes.  If one added a trait to them the system
> would fail because there's no way to add an inst var to a binary object
> in the current Smalltalk object representation.  Hence Levente's very
> clever idea of hiding the hash in a hidden header word.
> 

That is a clever idea. I wouldn't actually *implement* that solution,
though. It would add considerable complexity, since strings are mutable
and their hashes can change. And the problem that this would solve is
hypothetical, isn't it? At least, I haven't seen anyone point to any
actual applications that need to hash very large strings.
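
To make the mutability concern concrete, here is a minimal sketch in
Python rather than Smalltalk. Everything in it is an assumption made for
illustration: the class name CachedHashBytes, the _cached_hash field
(standing in for the hidden header word), and the simple multiplicative
hash, which is not the algorithm Squeak or Pharo uses. The point is only
that every mutating operation has to remember to invalidate the cache.

```python
class CachedHashBytes:
    """Hypothetical mutable byte string that caches its hash value."""

    def __init__(self, data: bytes) -> None:
        self._bytes = bytearray(data)
        # Stands in for the hidden header word; None means "not computed".
        self._cached_hash = None

    def _compute_hash(self) -> int:
        # Illustrative hash that visits every byte (cost grows with length).
        h = 0
        for b in self._bytes:
            h = (h * 31 + b) & 0xFFFFFFF
        return h

    def hash_value(self) -> int:
        # Compute once, then reuse until the contents change.
        if self._cached_hash is None:
            self._cached_hash = self._compute_hash()
        return self._cached_hash

    def __setitem__(self, index: int, value: int) -> None:
        self._bytes[index] = value
        self._cached_hash = None  # every write must invalidate the cache

    def replace_from(self, start: int, data: bytes) -> None:
        self._bytes[start:start + len(data)] = data
        self._cached_hash = None  # ...including bulk writes


s = CachedHashBytes(b"abc")
before = s.hash_value()
s[0] = ord("x")              # mutate in place
after = s.hash_value()       # recomputed, because the cache was cleared
print(before != after)
```

If any one write path (a primitive, a bulk copy, a become-style
operation) skipped the invalidation step, the object would keep
reporting a stale hash, which is the kind of complexity at issue here.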

Regards,

-Martin


