[squeak-dev] The Trunk: Collections-ul.798.mcz

Levente Uzonyi leves at caesar.elte.hu
Tue Jul 3 11:02:49 UTC 2018


No.

Levente

On Mon, 2 Jul 2018, Chris Muller wrote:

> Do existing Dictionary's need to be rehashed due to this?
>
> On Mon, Jul 2, 2018 at 1:55 PM,  <commits at source.squeak.org> wrote:
>> Levente Uzonyi uploaded a new version of Collections to project The Trunk:
>> http://source.squeak.org/trunk/Collections-ul.798.mcz
>>
>> ==================== Summary ====================
>>
>> Name: Collections-ul.798
>> Author: ul
>> Time: 2 July 2018, 8:52:05.858045 pm
>> UUID: 8ed17235-fb2c-4453-9e6e-977d6d838cac
>> Ancestors: Collections-cmm.797
>>
>> HashedCollection changes:
>> - added 3, 7 and 13 to #goodPrimes. These will mainly be used during compaction as the default grow sequence is 5, 11, 17, etc..
>> - improved performance of #goodPrimeAtLeast: by reordering comparisons
>> - finally deprecated #findElementOrNil: and #fullCheck
>>
>> SequenceableCollection changes:
>> - improved performance of #findBinary* methods by reordering comparisons and postponing index calculation
>> - did the same in SortedCollection >> #indexForInserting: along with using #ifNil:ifNotNil:
>>
>> =============== Diff against Collections-cmm.797 ===============
>>
>> Item was changed:
>>   ----- Method: HashedCollection class>>goodPrimeAtLeast: (in category 'sizing') -----
>>   goodPrimeAtLeast: lowerLimit
>>         "Answer the next good prime >= lowerlimit.
>>         If lowerLimit is larger than the largest known good prime,
>>         just make it odd."
>>
>> +       | low mid high prime primes |
>> -       | primes low mid high prime |
>>         primes := self goodPrimes.
>> -       low := 1.
>>         high := primes size.
>>         lowerLimit > (primes at: high) ifTrue: [
>>                 ^lowerLimit bitOr: 1 ].
>> +       low := 1.
>>         [ high - low <= 1 ] whileFalse: [
>>                 mid := high + low // 2.
>>                 prime := primes at: mid.
>> +               lowerLimit < prime
>> +                       ifTrue: [ high := mid ]
>> +                       ifFalse: [
>> +                               lowerLimit > prime
>> +                                       ifTrue: [ low := mid ]
>> +                                       ifFalse: [ ^prime ] ] ].
>> -               prime = lowerLimit ifTrue: [ ^prime ].
>> -               prime < lowerLimit
>> -                       ifTrue: [ low := mid ]
>> -                       ifFalse: [ high := mid ] ].
>>         (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>>         ^primes at: high!
>>
>> Item was changed:
>>   ----- Method: HashedCollection class>>goodPrimes (in category 'sizing') -----
>>   goodPrimes
>>         "Answer a sorted array of prime numbers less than one billion that make good
>>         hash table sizes. Should be expanded as needed.  See comments below code"
>>
>> +       ^#(3 5 7 11 13 17 23 31 43 59 79 107 149 199 269 359 479 641 857 1151 1549 2069
>> -       ^#(
>> -               5 11 17 23 31 43 59 79 107 149 199 269 359 479 641 857 1151 1549 2069
>>                 2237 2423 2617 2797 2999 3167 3359 3539 3727 3911
>>                 4441 4787 5119 5471 5801 6143 6521 6827 7177 7517 7853
>>                 8783 9601 10243 10867 11549 12239 12919 13679 14293 15013 15731
>>                 17569 19051 20443 21767 23159 24611 25847 27397 28571 30047 31397
>>                 35771 38201 40841 43973 46633 48989 51631 54371 57349 60139 62969
>>                 70589 76091 80347 85843 90697 95791 101051 106261 111143 115777 120691 126311
>>                 140863 150523 160969 170557 181243 190717 201653 211891 221251 232591 242873 251443
>>                 282089 300869 321949 341227 362353 383681 401411 422927 443231 464951 482033 504011
>>                 562621 605779 647659 681607 723623 763307 808261 844709 886163 926623 967229 1014617
>>                 1121987 1201469 1268789 1345651 1429531 1492177 1577839 1651547 1722601 1800377 1878623 1942141 2028401
>>                 2242727 2399581 2559173 2686813 2836357 3005579 3144971 3283993 3460133 3582923 3757093 3903769 4061261
>>                 4455361 4783837 5068529 5418079 5680243 6000023 6292981 6611497 6884641 7211599 7514189 7798313 8077189
>>                 9031853 9612721 10226107 10745291 11338417 11939203 12567671 13212697 13816333 14337529 14938571 15595673 16147291
>>                 17851577 18993941 20180239 21228533 22375079 23450491 24635579 25683871 26850101 27921689 29090911 30153841 31292507 32467307
>>                 35817611 37983761 40234253 42457253 44750177 46957969 49175831 51442639 53726417 55954637 58126987 60365939 62666977 64826669
>>                 71582779 76039231 80534381 84995153 89500331 93956777 98470819 102879613 107400389 111856841 116365721 120819287 125246581 129732203
>>                 143163379 152076289 161031319 169981667 179000669 187913573 196826447 205826729 214748357 223713691 232679021 241591901 250504801 259470131
>>                 285162679 301939921 318717121 335494331 352271573 369148753 385926017 402603193 419480419 436157621 453034849 469712051 486589307 503366497 520043707
>>                 570475349 603929813 637584271 671138659 704693081 738247541 771801929 805356457 838910803 872365267 905919671 939574117 973128521 1006682977 1040137411
>>                 1073741833)
>>
>>   "The above primes past 2069 were chosen carefully so that they do not interact badly with 1664525 (used by hashMultiply), and so that gcd(p, (256^k) +/- a) = 1, for 0<a<=32 and 0<k<=8.  See Knuth's TAOCP for details."
>>
>>   "The above primes also try to map the values of ((0 to: 4095) collect: [ :each | each << 18 \\ prime ]) sort to an equidistant sequence of numbers. This helps to avoid the collision of chains in identity-based hashed collections. To do that  they were chosen to return a low value when the following block is evaluated with them as argument:
>>    [ :prime |
>> +       | n slots cost optimalDistance |
>> +       n := 1 bitShift: 22.
>> +       slots := Array new: n + 1.
>> +       0 to: n - 1 do: [ :ea | slots at: ea + 1 put: (ea bitShift: 8) \\ prime ].
>> +       slots at: n + 1 put: prime.
>> -       | slots cost optimalDistance previous |
>> -       slots := Array new: 4097.
>> -       0 to: 4095 do: [ :ea | slots at: ea + 1 put: ea *  262144 \\ prime ].
>> -       slots at: 4097 put: prime.
>>         slots sort.
>>         cost := 0.
>> +       optimalDistance := prime // n.
>> +       2 to: n + 1 do: [ :index |
>> -       optimalDistance := prime // 4096.
>> -       2 to: 4097 do: [ :index |
>>                 | newCost |
>>                 newCost := optimalDistance - ((slots at: index) - (slots at: index - 1)).
>>                 newCost > cost ifTrue: [ cost := newCost ] ].
>> +       result add: prime -> cost ]
>> +
>> + The shifts in the block relate to the numer of bits the #identityHash consists of (22) and the number of bits #scaledIdentityHash shifts it (8)"!
>> -       cost ]."!
>>
>> Item was removed:
>> - ----- Method: HashedCollection>>findElementOrNil: (in category 'compatibility') -----
>> - findElementOrNil: anObject
>> -       "This method has been superseeded by #scanFor:
>> -       It is here for compatibility with external packages only."
>> -       ^self scanFor: anObject!
>>
>> Item was removed:
>> - ----- Method: HashedCollection>>fullCheck (in category 'compatibility') -----
>> - fullCheck
>> -       "This is a private method, formerly implemented in Set, that is no longer required.
>> -       It is here for compatibility with external packages only."
>> -       "Keep array at least 1/4 free for decent hash behavior"
>> -
>> -       array size * 3 < (tally * 4) ifTrue: [ self grow ]!
>>
>> Item was changed:
>>   ----- Method: SequenceableCollection>>findBinaryIndex:do:ifNone: (in category 'enumerating') -----
>>   findBinaryIndex: aBlock do: actionBlock ifNone: exceptionBlock
>>         "Search for an element in the receiver using binary search.
>>         The argument aBlock is a one-element block returning
>>                 0       - if the element is the one searched for
>>                 <0      - if the search should continue in the first half
>>                 >0      - if the search should continue in the second half
>>         If found, evaluate actionBlock with the index as argument
>>         If no matching element is found, evaluate exceptionBlock,
>>         with the indexes of the 'bounding' elements as optional
>>         arguments.      Warning: Might give invalid indexes, see
>>         examples below.
>>         Examples:
>>                 #(1 3 5 7 11 15 23)
>>                         findBinaryIndex: [ :arg | 11 - arg ]
>>                         do: [ :found | found ]
>>                         ifNone: [ :a :b | ('between: ', {a. b} printString)]
>>                 #(1 3 5 7 11 15 23)
>>                         findBinaryIndex: [ :arg | 12 - arg ]
>>                         do: [ :found | found ]
>>                         ifNone: [ :a :b | ('between: ', {a. b} printString) ]
>>                 #(1 3 5 7 11 15 23) d
>>                         findBinaryIndex: [ :arg | 0.5 - arg ]
>>                         do: [ :found | found ]
>>                         ifNone: [ :a :b | ('between: ', {a. b} printString) ]
>>                 #(1 3 5 7 11 15 23)
>>                         findBinaryIndex: [ :arg | 25 - arg ]
>>                         do: [ :found | found ]
>>                         ifNone: [ :a :b | ('between: ',{a. b} printString) ]
>>         "
>> +       | index low high test |
>> -       | index low high |
>>         low := 1.
>>         high := self size.
>> +       [ high < low ] whileFalse: [
>> +               index := high + low // 2.
>> +               (test := aBlock value: (self at: index)) < 0
>> +                       ifTrue: [ high := index - 1 ]
>> +                       ifFalse: [
>> +                               0 < test
>> -       [
>> -               index := high + low // 2.
>> -               low > high ] whileFalse: [
>> -                       | test |
>> -                       test := aBlock value: (self at: index).
>> -                       test = 0
>> -                               ifTrue: [ ^actionBlock value: index ]
>> -                               ifFalse: [ test > 0
>>                                         ifTrue: [ low := index + 1 ]
>> +                                       ifFalse: [ "test = 0"
>> +                                               ^actionBlock value: index ] ] ].
>> -                                       ifFalse: [ high := index - 1 ] ] ].
>>         ^exceptionBlock cull: high cull: low!
>>
>> Item was changed:
>>   ----- Method: SortedCollection>>indexForInserting: (in category 'private') -----
>>   indexForInserting: newObject
>>
>>         | index low high |
>>         low := firstIndex.
>>         high := lastIndex.
>> +       sortBlock
>> +               ifNil: [
>> +                       [ low > high ] whileFalse: [
>> +                               index := (high + low) // 2.
>> +                               (array at: index) <= newObject
>> +                                       ifTrue: [ low := index + 1 ]
>> +                                       ifFalse: [ high := index - 1 ] ] ]
>> +               ifNotNil: [
>> +                       [ low > high ] whileFalse: [
>> +                               index := (high + low) // 2.
>> +                               (sortBlock value: (array at: index) value: newObject)
>> +                                       ifTrue: [ low := index + 1 ]
>> +                                       ifFalse: [ high := index - 1 ] ] ].
>> -       sortBlock isNil
>> -               ifTrue: [[index := high + low // 2.  low > high]
>> -                       whileFalse:
>> -                               [((array at: index) <= newObject)
>> -                                       ifTrue: [low := index + 1]
>> -                                       ifFalse: [high := index - 1]]]
>> -               ifFalse: [[index := high + low // 2.  low > high]
>> -                       whileFalse:
>> -                               [(sortBlock value: (array at: index) value: newObject)
>> -                                       ifTrue: [low := index + 1]
>> -                                       ifFalse: [high := index - 1]]].
>>         ^low!
>>
>>


More information about the Squeak-dev mailing list