[squeak-dev] The Trunk: Collections-ul.798.mcz

Chris Muller asqueaker at gmail.com
Tue Jul 3 02:36:38 UTC 2018


Do existing Dictionary's need to be rehashed due to this?

On Mon, Jul 2, 2018 at 1:55 PM,  <commits at source.squeak.org> wrote:
> Levente Uzonyi uploaded a new version of Collections to project The Trunk:
> http://source.squeak.org/trunk/Collections-ul.798.mcz
>
> ==================== Summary ====================
>
> Name: Collections-ul.798
> Author: ul
> Time: 2 July 2018, 8:52:05.858045 pm
> UUID: 8ed17235-fb2c-4453-9e6e-977d6d838cac
> Ancestors: Collections-cmm.797
>
> HashedCollection changes:
> - added 3, 7 and 13 to #goodPrimes. These will mainly be used during compaction as the default grow sequence is 5, 11, 17, etc..
> - improved performance of #goodPrimeAtLeast: by reordering comparisons
> - finally deprecated #findElementOrNil: and #fullCheck
>
> SequenceableCollection changes:
> - improved performance of #findBinary* methods by reordering comparisons and postponing index calculation
> - did the same in SortedCollection >> #indexForInserting: along with using #ifNil:ifNotNil:
>
> =============== Diff against Collections-cmm.797 ===============
>
> Item was changed:
>   ----- Method: HashedCollection class>>goodPrimeAtLeast: (in category 'sizing') -----
>   goodPrimeAtLeast: lowerLimit
>         "Answer the next good prime >= lowerlimit.
>         If lowerLimit is larger than the largest known good prime,
>         just make it odd."
>
> +       | low mid high prime primes |
> -       | primes low mid high prime |
>         primes := self goodPrimes.
> -       low := 1.
>         high := primes size.
>         lowerLimit > (primes at: high) ifTrue: [
>                 ^lowerLimit bitOr: 1 ].
> +       low := 1.
>         [ high - low <= 1 ] whileFalse: [
>                 mid := high + low // 2.
>                 prime := primes at: mid.
> +               lowerLimit < prime
> +                       ifTrue: [ high := mid ]
> +                       ifFalse: [
> +                               lowerLimit > prime
> +                                       ifTrue: [ low := mid ]
> +                                       ifFalse: [ ^prime ] ] ].
> -               prime = lowerLimit ifTrue: [ ^prime ].
> -               prime < lowerLimit
> -                       ifTrue: [ low := mid ]
> -                       ifFalse: [ high := mid ] ].
>         (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>         ^primes at: high!
>
> Item was changed:
>   ----- Method: HashedCollection class>>goodPrimes (in category 'sizing') -----
>   goodPrimes
>         "Answer a sorted array of prime numbers less than one billion that make good
>         hash table sizes. Should be expanded as needed.  See comments below code"
>
> +       ^#(3 5 7 11 13 17 23 31 43 59 79 107 149 199 269 359 479 641 857 1151 1549 2069
> -       ^#(
> -               5 11 17 23 31 43 59 79 107 149 199 269 359 479 641 857 1151 1549 2069
>                 2237 2423 2617 2797 2999 3167 3359 3539 3727 3911
>                 4441 4787 5119 5471 5801 6143 6521 6827 7177 7517 7853
>                 8783 9601 10243 10867 11549 12239 12919 13679 14293 15013 15731
>                 17569 19051 20443 21767 23159 24611 25847 27397 28571 30047 31397
>                 35771 38201 40841 43973 46633 48989 51631 54371 57349 60139 62969
>                 70589 76091 80347 85843 90697 95791 101051 106261 111143 115777 120691 126311
>                 140863 150523 160969 170557 181243 190717 201653 211891 221251 232591 242873 251443
>                 282089 300869 321949 341227 362353 383681 401411 422927 443231 464951 482033 504011
>                 562621 605779 647659 681607 723623 763307 808261 844709 886163 926623 967229 1014617
>                 1121987 1201469 1268789 1345651 1429531 1492177 1577839 1651547 1722601 1800377 1878623 1942141 2028401
>                 2242727 2399581 2559173 2686813 2836357 3005579 3144971 3283993 3460133 3582923 3757093 3903769 4061261
>                 4455361 4783837 5068529 5418079 5680243 6000023 6292981 6611497 6884641 7211599 7514189 7798313 8077189
>                 9031853 9612721 10226107 10745291 11338417 11939203 12567671 13212697 13816333 14337529 14938571 15595673 16147291
>                 17851577 18993941 20180239 21228533 22375079 23450491 24635579 25683871 26850101 27921689 29090911 30153841 31292507 32467307
>                 35817611 37983761 40234253 42457253 44750177 46957969 49175831 51442639 53726417 55954637 58126987 60365939 62666977 64826669
>                 71582779 76039231 80534381 84995153 89500331 93956777 98470819 102879613 107400389 111856841 116365721 120819287 125246581 129732203
>                 143163379 152076289 161031319 169981667 179000669 187913573 196826447 205826729 214748357 223713691 232679021 241591901 250504801 259470131
>                 285162679 301939921 318717121 335494331 352271573 369148753 385926017 402603193 419480419 436157621 453034849 469712051 486589307 503366497 520043707
>                 570475349 603929813 637584271 671138659 704693081 738247541 771801929 805356457 838910803 872365267 905919671 939574117 973128521 1006682977 1040137411
>                 1073741833)
>
>   "The above primes past 2069 were chosen carefully so that they do not interact badly with 1664525 (used by hashMultiply), and so that gcd(p, (256^k) +/- a) = 1, for 0<a<=32 and 0<k<=8.  See Knuth's TAOCP for details."
>
>   "The above primes also try to map the values of ((0 to: 4095) collect: [ :each | each << 18 \\ prime ]) sort to an equidistant sequence of numbers. This helps to avoid the collision of chains in identity-based hashed collections. To do that  they were chosen to return a low value when the following block is evaluated with them as argument:
>    [ :prime |
> +       | n slots cost optimalDistance |
> +       n := 1 bitShift: 22.
> +       slots := Array new: n + 1.
> +       0 to: n - 1 do: [ :ea | slots at: ea + 1 put: (ea bitShift: 8) \\ prime ].
> +       slots at: n + 1 put: prime.
> -       | slots cost optimalDistance previous |
> -       slots := Array new: 4097.
> -       0 to: 4095 do: [ :ea | slots at: ea + 1 put: ea *  262144 \\ prime ].
> -       slots at: 4097 put: prime.
>         slots sort.
>         cost := 0.
> +       optimalDistance := prime // n.
> +       2 to: n + 1 do: [ :index |
> -       optimalDistance := prime // 4096.
> -       2 to: 4097 do: [ :index |
>                 | newCost |
>                 newCost := optimalDistance - ((slots at: index) - (slots at: index - 1)).
>                 newCost > cost ifTrue: [ cost := newCost ] ].
> +       result add: prime -> cost ]
> +
> + The shifts in the block relate to the numer of bits the #identityHash consists of (22) and the number of bits #scaledIdentityHash shifts it (8)"!
> -       cost ]."!
>
> Item was removed:
> - ----- Method: HashedCollection>>findElementOrNil: (in category 'compatibility') -----
> - findElementOrNil: anObject
> -       "This method has been superseeded by #scanFor:
> -       It is here for compatibility with external packages only."
> -       ^self scanFor: anObject!
>
> Item was removed:
> - ----- Method: HashedCollection>>fullCheck (in category 'compatibility') -----
> - fullCheck
> -       "This is a private method, formerly implemented in Set, that is no longer required.
> -       It is here for compatibility with external packages only."
> -       "Keep array at least 1/4 free for decent hash behavior"
> -
> -       array size * 3 < (tally * 4) ifTrue: [ self grow ]!
>
> Item was changed:
>   ----- Method: SequenceableCollection>>findBinaryIndex:do:ifNone: (in category 'enumerating') -----
>   findBinaryIndex: aBlock do: actionBlock ifNone: exceptionBlock
>         "Search for an element in the receiver using binary search.
>         The argument aBlock is a one-element block returning
>                 0       - if the element is the one searched for
>                 <0      - if the search should continue in the first half
>                 >0      - if the search should continue in the second half
>         If found, evaluate actionBlock with the index as argument
>         If no matching element is found, evaluate exceptionBlock,
>         with the indexes of the 'bounding' elements as optional
>         arguments.      Warning: Might give invalid indexes, see
>         examples below.
>         Examples:
>                 #(1 3 5 7 11 15 23)
>                         findBinaryIndex: [ :arg | 11 - arg ]
>                         do: [ :found | found ]
>                         ifNone: [ :a :b | ('between: ', {a. b} printString)]
>                 #(1 3 5 7 11 15 23)
>                         findBinaryIndex: [ :arg | 12 - arg ]
>                         do: [ :found | found ]
>                         ifNone: [ :a :b | ('between: ', {a. b} printString) ]
>                 #(1 3 5 7 11 15 23) d
>                         findBinaryIndex: [ :arg | 0.5 - arg ]
>                         do: [ :found | found ]
>                         ifNone: [ :a :b | ('between: ', {a. b} printString) ]
>                 #(1 3 5 7 11 15 23)
>                         findBinaryIndex: [ :arg | 25 - arg ]
>                         do: [ :found | found ]
>                         ifNone: [ :a :b | ('between: ',{a. b} printString) ]
>         "
> +       | index low high test |
> -       | index low high |
>         low := 1.
>         high := self size.
> +       [ high < low ] whileFalse: [
> +               index := high + low // 2.
> +               (test := aBlock value: (self at: index)) < 0
> +                       ifTrue: [ high := index - 1 ]
> +                       ifFalse: [
> +                               0 < test
> -       [
> -               index := high + low // 2.
> -               low > high ] whileFalse: [
> -                       | test |
> -                       test := aBlock value: (self at: index).
> -                       test = 0
> -                               ifTrue: [ ^actionBlock value: index ]
> -                               ifFalse: [ test > 0
>                                         ifTrue: [ low := index + 1 ]
> +                                       ifFalse: [ "test = 0"
> +                                               ^actionBlock value: index ] ] ].
> -                                       ifFalse: [ high := index - 1 ] ] ].
>         ^exceptionBlock cull: high cull: low!
>
> Item was changed:
>   ----- Method: SortedCollection>>indexForInserting: (in category 'private') -----
>   indexForInserting: newObject
>
>         | index low high |
>         low := firstIndex.
>         high := lastIndex.
> +       sortBlock
> +               ifNil: [
> +                       [ low > high ] whileFalse: [
> +                               index := (high + low) // 2.
> +                               (array at: index) <= newObject
> +                                       ifTrue: [ low := index + 1 ]
> +                                       ifFalse: [ high := index - 1 ] ] ]
> +               ifNotNil: [
> +                       [ low > high ] whileFalse: [
> +                               index := (high + low) // 2.
> +                               (sortBlock value: (array at: index) value: newObject)
> +                                       ifTrue: [ low := index + 1 ]
> +                                       ifFalse: [ high := index - 1 ] ] ].
> -       sortBlock isNil
> -               ifTrue: [[index := high + low // 2.  low > high]
> -                       whileFalse:
> -                               [((array at: index) <= newObject)
> -                                       ifTrue: [low := index + 1]
> -                                       ifFalse: [high := index - 1]]]
> -               ifFalse: [[index := high + low // 2.  low > high]
> -                       whileFalse:
> -                               [(sortBlock value: (array at: index) value: newObject)
> -                                       ifTrue: [low := index + 1]
> -                                       ifFalse: [high := index - 1]]].
>         ^low!
>
>


More information about the Squeak-dev mailing list