[squeak-dev] The Trunk: Collections-ul.798.mcz
Chris Muller
asqueaker at gmail.com
Tue Jul 3 02:36:38 UTC 2018
Do existing Dictionary's need to be rehashed due to this?
On Mon, Jul 2, 2018 at 1:55 PM, <commits at source.squeak.org> wrote:
> Levente Uzonyi uploaded a new version of Collections to project The Trunk:
> http://source.squeak.org/trunk/Collections-ul.798.mcz
>
> ==================== Summary ====================
>
> Name: Collections-ul.798
> Author: ul
> Time: 2 July 2018, 8:52:05.858045 pm
> UUID: 8ed17235-fb2c-4453-9e6e-977d6d838cac
> Ancestors: Collections-cmm.797
>
> HashedCollection changes:
> - added 3, 7 and 13 to #goodPrimes. These will mainly be used during compaction as the default grow sequence is 5, 11, 17, etc..
> - improved performance of #goodPrimeAtLeast: by reordering comparisons
> - finally deprecated #findElementOrNil: and #fullCheck
>
> SequenceableCollection changes:
> - improved performance of #findBinary* methods by reordering comparisons and postponing index calculation
> - did the same in SortedCollection >> #indexForInserting: along with using #ifNil:ifNotNil:
>
> =============== Diff against Collections-cmm.797 ===============
>
> Item was changed:
> ----- Method: HashedCollection class>>goodPrimeAtLeast: (in category 'sizing') -----
> goodPrimeAtLeast: lowerLimit
> "Answer the next good prime >= lowerlimit.
> If lowerLimit is larger than the largest known good prime,
> just make it odd."
>
> + | low mid high prime primes |
> - | primes low mid high prime |
> primes := self goodPrimes.
> - low := 1.
> high := primes size.
> lowerLimit > (primes at: high) ifTrue: [
> ^lowerLimit bitOr: 1 ].
> + low := 1.
> [ high - low <= 1 ] whileFalse: [
> mid := high + low // 2.
> prime := primes at: mid.
> + lowerLimit < prime
> + ifTrue: [ high := mid ]
> + ifFalse: [
> + lowerLimit > prime
> + ifTrue: [ low := mid ]
> + ifFalse: [ ^prime ] ] ].
> - prime = lowerLimit ifTrue: [ ^prime ].
> - prime < lowerLimit
> - ifTrue: [ low := mid ]
> - ifFalse: [ high := mid ] ].
> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
> ^primes at: high!
>
> Item was changed:
> ----- Method: HashedCollection class>>goodPrimes (in category 'sizing') -----
> goodPrimes
> "Answer a sorted array of prime numbers less than one billion that make good
> hash table sizes. Should be expanded as needed. See comments below code"
>
> + ^#(3 5 7 11 13 17 23 31 43 59 79 107 149 199 269 359 479 641 857 1151 1549 2069
> - ^#(
> - 5 11 17 23 31 43 59 79 107 149 199 269 359 479 641 857 1151 1549 2069
> 2237 2423 2617 2797 2999 3167 3359 3539 3727 3911
> 4441 4787 5119 5471 5801 6143 6521 6827 7177 7517 7853
> 8783 9601 10243 10867 11549 12239 12919 13679 14293 15013 15731
> 17569 19051 20443 21767 23159 24611 25847 27397 28571 30047 31397
> 35771 38201 40841 43973 46633 48989 51631 54371 57349 60139 62969
> 70589 76091 80347 85843 90697 95791 101051 106261 111143 115777 120691 126311
> 140863 150523 160969 170557 181243 190717 201653 211891 221251 232591 242873 251443
> 282089 300869 321949 341227 362353 383681 401411 422927 443231 464951 482033 504011
> 562621 605779 647659 681607 723623 763307 808261 844709 886163 926623 967229 1014617
> 1121987 1201469 1268789 1345651 1429531 1492177 1577839 1651547 1722601 1800377 1878623 1942141 2028401
> 2242727 2399581 2559173 2686813 2836357 3005579 3144971 3283993 3460133 3582923 3757093 3903769 4061261
> 4455361 4783837 5068529 5418079 5680243 6000023 6292981 6611497 6884641 7211599 7514189 7798313 8077189
> 9031853 9612721 10226107 10745291 11338417 11939203 12567671 13212697 13816333 14337529 14938571 15595673 16147291
> 17851577 18993941 20180239 21228533 22375079 23450491 24635579 25683871 26850101 27921689 29090911 30153841 31292507 32467307
> 35817611 37983761 40234253 42457253 44750177 46957969 49175831 51442639 53726417 55954637 58126987 60365939 62666977 64826669
> 71582779 76039231 80534381 84995153 89500331 93956777 98470819 102879613 107400389 111856841 116365721 120819287 125246581 129732203
> 143163379 152076289 161031319 169981667 179000669 187913573 196826447 205826729 214748357 223713691 232679021 241591901 250504801 259470131
> 285162679 301939921 318717121 335494331 352271573 369148753 385926017 402603193 419480419 436157621 453034849 469712051 486589307 503366497 520043707
> 570475349 603929813 637584271 671138659 704693081 738247541 771801929 805356457 838910803 872365267 905919671 939574117 973128521 1006682977 1040137411
> 1073741833)
>
> "The above primes past 2069 were chosen carefully so that they do not interact badly with 1664525 (used by hashMultiply), and so that gcd(p, (256^k) +/- a) = 1, for 0<a<=32 and 0<k<=8. See Knuth's TAOCP for details."
>
> "The above primes also try to map the values of ((0 to: 4095) collect: [ :each | each << 18 \\ prime ]) sort to an equidistant sequence of numbers. This helps to avoid the collision of chains in identity-based hashed collections. To do that they were chosen to return a low value when the following block is evaluated with them as argument:
> [ :prime |
> + | n slots cost optimalDistance |
> + n := 1 bitShift: 22.
> + slots := Array new: n + 1.
> + 0 to: n - 1 do: [ :ea | slots at: ea + 1 put: (ea bitShift: 8) \\ prime ].
> + slots at: n + 1 put: prime.
> - | slots cost optimalDistance previous |
> - slots := Array new: 4097.
> - 0 to: 4095 do: [ :ea | slots at: ea + 1 put: ea * 262144 \\ prime ].
> - slots at: 4097 put: prime.
> slots sort.
> cost := 0.
> + optimalDistance := prime // n.
> + 2 to: n + 1 do: [ :index |
> - optimalDistance := prime // 4096.
> - 2 to: 4097 do: [ :index |
> | newCost |
> newCost := optimalDistance - ((slots at: index) - (slots at: index - 1)).
> newCost > cost ifTrue: [ cost := newCost ] ].
> + result add: prime -> cost ]
> +
> + The shifts in the block relate to the numer of bits the #identityHash consists of (22) and the number of bits #scaledIdentityHash shifts it (8)"!
> - cost ]."!
>
> Item was removed:
> - ----- Method: HashedCollection>>findElementOrNil: (in category 'compatibility') -----
> - findElementOrNil: anObject
> - "This method has been superseeded by #scanFor:
> - It is here for compatibility with external packages only."
> - ^self scanFor: anObject!
>
> Item was removed:
> - ----- Method: HashedCollection>>fullCheck (in category 'compatibility') -----
> - fullCheck
> - "This is a private method, formerly implemented in Set, that is no longer required.
> - It is here for compatibility with external packages only."
> - "Keep array at least 1/4 free for decent hash behavior"
> -
> - array size * 3 < (tally * 4) ifTrue: [ self grow ]!
>
> Item was changed:
> ----- Method: SequenceableCollection>>findBinaryIndex:do:ifNone: (in category 'enumerating') -----
> findBinaryIndex: aBlock do: actionBlock ifNone: exceptionBlock
> "Search for an element in the receiver using binary search.
> The argument aBlock is a one-element block returning
> 0 - if the element is the one searched for
> <0 - if the search should continue in the first half
> >0 - if the search should continue in the second half
> If found, evaluate actionBlock with the index as argument
> If no matching element is found, evaluate exceptionBlock,
> with the indexes of the 'bounding' elements as optional
> arguments. Warning: Might give invalid indexes, see
> examples below.
> Examples:
> #(1 3 5 7 11 15 23)
> findBinaryIndex: [ :arg | 11 - arg ]
> do: [ :found | found ]
> ifNone: [ :a :b | ('between: ', {a. b} printString)]
> #(1 3 5 7 11 15 23)
> findBinaryIndex: [ :arg | 12 - arg ]
> do: [ :found | found ]
> ifNone: [ :a :b | ('between: ', {a. b} printString) ]
> #(1 3 5 7 11 15 23) d
> findBinaryIndex: [ :arg | 0.5 - arg ]
> do: [ :found | found ]
> ifNone: [ :a :b | ('between: ', {a. b} printString) ]
> #(1 3 5 7 11 15 23)
> findBinaryIndex: [ :arg | 25 - arg ]
> do: [ :found | found ]
> ifNone: [ :a :b | ('between: ',{a. b} printString) ]
> "
> + | index low high test |
> - | index low high |
> low := 1.
> high := self size.
> + [ high < low ] whileFalse: [
> + index := high + low // 2.
> + (test := aBlock value: (self at: index)) < 0
> + ifTrue: [ high := index - 1 ]
> + ifFalse: [
> + 0 < test
> - [
> - index := high + low // 2.
> - low > high ] whileFalse: [
> - | test |
> - test := aBlock value: (self at: index).
> - test = 0
> - ifTrue: [ ^actionBlock value: index ]
> - ifFalse: [ test > 0
> ifTrue: [ low := index + 1 ]
> + ifFalse: [ "test = 0"
> + ^actionBlock value: index ] ] ].
> - ifFalse: [ high := index - 1 ] ] ].
> ^exceptionBlock cull: high cull: low!
>
> Item was changed:
> ----- Method: SortedCollection>>indexForInserting: (in category 'private') -----
> indexForInserting: newObject
>
> | index low high |
> low := firstIndex.
> high := lastIndex.
> + sortBlock
> + ifNil: [
> + [ low > high ] whileFalse: [
> + index := (high + low) // 2.
> + (array at: index) <= newObject
> + ifTrue: [ low := index + 1 ]
> + ifFalse: [ high := index - 1 ] ] ]
> + ifNotNil: [
> + [ low > high ] whileFalse: [
> + index := (high + low) // 2.
> + (sortBlock value: (array at: index) value: newObject)
> + ifTrue: [ low := index + 1 ]
> + ifFalse: [ high := index - 1 ] ] ].
> - sortBlock isNil
> - ifTrue: [[index := high + low // 2. low > high]
> - whileFalse:
> - [((array at: index) <= newObject)
> - ifTrue: [low := index + 1]
> - ifFalse: [high := index - 1]]]
> - ifFalse: [[index := high + low // 2. low > high]
> - whileFalse:
> - [(sortBlock value: (array at: index) value: newObject)
> - ifTrue: [low := index + 1]
> - ifFalse: [high := index - 1]]].
> ^low!
>
>
More information about the Squeak-dev
mailing list
|