[squeak-dev] The Trunk: Collections-ul.798.mcz
Levente Uzonyi
leves at caesar.elte.hu
Tue Jul 3 11:02:49 UTC 2018
No.
Levente
On Mon, 2 Jul 2018, Chris Muller wrote:
> Do existing Dictionary's need to be rehashed due to this?
>
> On Mon, Jul 2, 2018 at 1:55 PM, <commits at source.squeak.org> wrote:
>> Levente Uzonyi uploaded a new version of Collections to project The Trunk:
>> http://source.squeak.org/trunk/Collections-ul.798.mcz
>>
>> ==================== Summary ====================
>>
>> Name: Collections-ul.798
>> Author: ul
>> Time: 2 July 2018, 8:52:05.858045 pm
>> UUID: 8ed17235-fb2c-4453-9e6e-977d6d838cac
>> Ancestors: Collections-cmm.797
>>
>> HashedCollection changes:
>> - added 3, 7 and 13 to #goodPrimes. These will mainly be used during compaction as the default grow sequence is 5, 11, 17, etc..
>> - improved performance of #goodPrimeAtLeast: by reordering comparisons
>> - finally deprecated #findElementOrNil: and #fullCheck
>>
>> SequenceableCollection changes:
>> - improved performance of #findBinary* methods by reordering comparisons and postponing index calculation
>> - did the same in SortedCollection >> #indexForInserting: along with using #ifNil:ifNotNil:
>>
>> =============== Diff against Collections-cmm.797 ===============
>>
>> Item was changed:
>> ----- Method: HashedCollection class>>goodPrimeAtLeast: (in category 'sizing') -----
>> goodPrimeAtLeast: lowerLimit
>> "Answer the next good prime >= lowerlimit.
>> If lowerLimit is larger than the largest known good prime,
>> just make it odd."
>>
>> + | low mid high prime primes |
>> - | primes low mid high prime |
>> primes := self goodPrimes.
>> - low := 1.
>> high := primes size.
>> lowerLimit > (primes at: high) ifTrue: [
>> ^lowerLimit bitOr: 1 ].
>> + low := 1.
>> [ high - low <= 1 ] whileFalse: [
>> mid := high + low // 2.
>> prime := primes at: mid.
>> + lowerLimit < prime
>> + ifTrue: [ high := mid ]
>> + ifFalse: [
>> + lowerLimit > prime
>> + ifTrue: [ low := mid ]
>> + ifFalse: [ ^prime ] ] ].
>> - prime = lowerLimit ifTrue: [ ^prime ].
>> - prime < lowerLimit
>> - ifTrue: [ low := mid ]
>> - ifFalse: [ high := mid ] ].
>> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>> ^primes at: high!
>>
>> Item was changed:
>> ----- Method: HashedCollection class>>goodPrimes (in category 'sizing') -----
>> goodPrimes
>> "Answer a sorted array of prime numbers less than one billion that make good
>> hash table sizes. Should be expanded as needed. See comments below code"
>>
>> + ^#(3 5 7 11 13 17 23 31 43 59 79 107 149 199 269 359 479 641 857 1151 1549 2069
>> - ^#(
>> - 5 11 17 23 31 43 59 79 107 149 199 269 359 479 641 857 1151 1549 2069
>> 2237 2423 2617 2797 2999 3167 3359 3539 3727 3911
>> 4441 4787 5119 5471 5801 6143 6521 6827 7177 7517 7853
>> 8783 9601 10243 10867 11549 12239 12919 13679 14293 15013 15731
>> 17569 19051 20443 21767 23159 24611 25847 27397 28571 30047 31397
>> 35771 38201 40841 43973 46633 48989 51631 54371 57349 60139 62969
>> 70589 76091 80347 85843 90697 95791 101051 106261 111143 115777 120691 126311
>> 140863 150523 160969 170557 181243 190717 201653 211891 221251 232591 242873 251443
>> 282089 300869 321949 341227 362353 383681 401411 422927 443231 464951 482033 504011
>> 562621 605779 647659 681607 723623 763307 808261 844709 886163 926623 967229 1014617
>> 1121987 1201469 1268789 1345651 1429531 1492177 1577839 1651547 1722601 1800377 1878623 1942141 2028401
>> 2242727 2399581 2559173 2686813 2836357 3005579 3144971 3283993 3460133 3582923 3757093 3903769 4061261
>> 4455361 4783837 5068529 5418079 5680243 6000023 6292981 6611497 6884641 7211599 7514189 7798313 8077189
>> 9031853 9612721 10226107 10745291 11338417 11939203 12567671 13212697 13816333 14337529 14938571 15595673 16147291
>> 17851577 18993941 20180239 21228533 22375079 23450491 24635579 25683871 26850101 27921689 29090911 30153841 31292507 32467307
>> 35817611 37983761 40234253 42457253 44750177 46957969 49175831 51442639 53726417 55954637 58126987 60365939 62666977 64826669
>> 71582779 76039231 80534381 84995153 89500331 93956777 98470819 102879613 107400389 111856841 116365721 120819287 125246581 129732203
>> 143163379 152076289 161031319 169981667 179000669 187913573 196826447 205826729 214748357 223713691 232679021 241591901 250504801 259470131
>> 285162679 301939921 318717121 335494331 352271573 369148753 385926017 402603193 419480419 436157621 453034849 469712051 486589307 503366497 520043707
>> 570475349 603929813 637584271 671138659 704693081 738247541 771801929 805356457 838910803 872365267 905919671 939574117 973128521 1006682977 1040137411
>> 1073741833)
>>
>> "The above primes past 2069 were chosen carefully so that they do not interact badly with 1664525 (used by hashMultiply), and so that gcd(p, (256^k) +/- a) = 1, for 0<a<=32 and 0<k<=8. See Knuth's TAOCP for details."
>>
>> "The above primes also try to map the values of ((0 to: 4095) collect: [ :each | each << 18 \\ prime ]) sort to an equidistant sequence of numbers. This helps to avoid the collision of chains in identity-based hashed collections. To do that they were chosen to return a low value when the following block is evaluated with them as argument:
>> [ :prime |
>> + | n slots cost optimalDistance |
>> + n := 1 bitShift: 22.
>> + slots := Array new: n + 1.
>> + 0 to: n - 1 do: [ :ea | slots at: ea + 1 put: (ea bitShift: 8) \\ prime ].
>> + slots at: n + 1 put: prime.
>> - | slots cost optimalDistance previous |
>> - slots := Array new: 4097.
>> - 0 to: 4095 do: [ :ea | slots at: ea + 1 put: ea * 262144 \\ prime ].
>> - slots at: 4097 put: prime.
>> slots sort.
>> cost := 0.
>> + optimalDistance := prime // n.
>> + 2 to: n + 1 do: [ :index |
>> - optimalDistance := prime // 4096.
>> - 2 to: 4097 do: [ :index |
>> | newCost |
>> newCost := optimalDistance - ((slots at: index) - (slots at: index - 1)).
>> newCost > cost ifTrue: [ cost := newCost ] ].
>> + result add: prime -> cost ]
>> +
>> + The shifts in the block relate to the numer of bits the #identityHash consists of (22) and the number of bits #scaledIdentityHash shifts it (8)"!
>> - cost ]."!
>>
>> Item was removed:
>> - ----- Method: HashedCollection>>findElementOrNil: (in category 'compatibility') -----
>> - findElementOrNil: anObject
>> - "This method has been superseeded by #scanFor:
>> - It is here for compatibility with external packages only."
>> - ^self scanFor: anObject!
>>
>> Item was removed:
>> - ----- Method: HashedCollection>>fullCheck (in category 'compatibility') -----
>> - fullCheck
>> - "This is a private method, formerly implemented in Set, that is no longer required.
>> - It is here for compatibility with external packages only."
>> - "Keep array at least 1/4 free for decent hash behavior"
>> -
>> - array size * 3 < (tally * 4) ifTrue: [ self grow ]!
>>
>> Item was changed:
>> ----- Method: SequenceableCollection>>findBinaryIndex:do:ifNone: (in category 'enumerating') -----
>> findBinaryIndex: aBlock do: actionBlock ifNone: exceptionBlock
>> "Search for an element in the receiver using binary search.
>> The argument aBlock is a one-element block returning
>> 0 - if the element is the one searched for
>> <0 - if the search should continue in the first half
>> >0 - if the search should continue in the second half
>> If found, evaluate actionBlock with the index as argument
>> If no matching element is found, evaluate exceptionBlock,
>> with the indexes of the 'bounding' elements as optional
>> arguments. Warning: Might give invalid indexes, see
>> examples below.
>> Examples:
>> #(1 3 5 7 11 15 23)
>> findBinaryIndex: [ :arg | 11 - arg ]
>> do: [ :found | found ]
>> ifNone: [ :a :b | ('between: ', {a. b} printString)]
>> #(1 3 5 7 11 15 23)
>> findBinaryIndex: [ :arg | 12 - arg ]
>> do: [ :found | found ]
>> ifNone: [ :a :b | ('between: ', {a. b} printString) ]
>> #(1 3 5 7 11 15 23) d
>> findBinaryIndex: [ :arg | 0.5 - arg ]
>> do: [ :found | found ]
>> ifNone: [ :a :b | ('between: ', {a. b} printString) ]
>> #(1 3 5 7 11 15 23)
>> findBinaryIndex: [ :arg | 25 - arg ]
>> do: [ :found | found ]
>> ifNone: [ :a :b | ('between: ',{a. b} printString) ]
>> "
>> + | index low high test |
>> - | index low high |
>> low := 1.
>> high := self size.
>> + [ high < low ] whileFalse: [
>> + index := high + low // 2.
>> + (test := aBlock value: (self at: index)) < 0
>> + ifTrue: [ high := index - 1 ]
>> + ifFalse: [
>> + 0 < test
>> - [
>> - index := high + low // 2.
>> - low > high ] whileFalse: [
>> - | test |
>> - test := aBlock value: (self at: index).
>> - test = 0
>> - ifTrue: [ ^actionBlock value: index ]
>> - ifFalse: [ test > 0
>> ifTrue: [ low := index + 1 ]
>> + ifFalse: [ "test = 0"
>> + ^actionBlock value: index ] ] ].
>> - ifFalse: [ high := index - 1 ] ] ].
>> ^exceptionBlock cull: high cull: low!
>>
>> Item was changed:
>> ----- Method: SortedCollection>>indexForInserting: (in category 'private') -----
>> indexForInserting: newObject
>>
>> | index low high |
>> low := firstIndex.
>> high := lastIndex.
>> + sortBlock
>> + ifNil: [
>> + [ low > high ] whileFalse: [
>> + index := (high + low) // 2.
>> + (array at: index) <= newObject
>> + ifTrue: [ low := index + 1 ]
>> + ifFalse: [ high := index - 1 ] ] ]
>> + ifNotNil: [
>> + [ low > high ] whileFalse: [
>> + index := (high + low) // 2.
>> + (sortBlock value: (array at: index) value: newObject)
>> + ifTrue: [ low := index + 1 ]
>> + ifFalse: [ high := index - 1 ] ] ].
>> - sortBlock isNil
>> - ifTrue: [[index := high + low // 2. low > high]
>> - whileFalse:
>> - [((array at: index) <= newObject)
>> - ifTrue: [low := index + 1]
>> - ifFalse: [high := index - 1]]]
>> - ifFalse: [[index := high + low // 2. low > high]
>> - whileFalse:
>> - [(sortBlock value: (array at: index) value: newObject)
>> - ifTrue: [low := index + 1]
>> - ifFalse: [high := index - 1]]].
>> ^low!
>>
>>
More information about the Squeak-dev
mailing list
|