[squeak-dev] Re: Hash changes

Levente Uzonyi leves at elte.hu
Mon Nov 30 23:54:15 UTC 2009


On Mon, 30 Nov 2009, Andreas Raab wrote:

>> | test array |
>> Transcript open; cr.
>> test := Dictionary new.
>
> "There used to be an old issue with growth of Sets that were initially
> sized with goodSizeFor: 3. Try working around it by starting at 5."
> test := Dictionary new: 5.
>

This shouldn't be an issue in this case. The capacity of 
new HashedCollections are already modified to be carefully selected 
primes. AFAIK only hashMultiply doesn't like 5 as capacity but it's not 
involved here. Anyway I added this to the new test.

> "Having sequences of objects with strictly consecutive hashes is not 
> representative for real systems. Instead, create 4096 objects and choose 100 
> atRandom."
>
> objSet := Array new: 4096 streamContents: [ :stream |
> 	4096 timesRepeat: [ stream nextPut: Object new ] ].
> array := (1 to: 100) collect:[:i| objSet atRandom].
>

Good point but the hashes are not consecutive (at least with my vm). Since 
we need 10000 different Objects I added this idea to the test with some 
modifications. 10000 Objects are created, then shuffled and used when needed.

The modified test:

| test array objects |
Transcript open; cr.
test := Dictionary new: 5.
objects := ((1 to: 10000) collect: [ :each | Object new ]) shuffled readStream.
[ test size >= 10000 ] whileFalse: [
 	Smalltalk garbageCollect.
 	array := Array new: 100 streamContents: [ :stream |
 		100 timesRepeat: [ stream nextPut: objects next  ] ]. "Create 100 new Objects"
 	Transcript
 		print: [ "Add the 100 new Objects to the Dictionary as keys with nil values"
 			1 to: array size do: [ :index |
 				test at: (array at: index) put: nil ] ] timeToRun;
 		tab;
 		print: test size;
 		tab;
 		print: [ "Access the added Objects 1000 times"
 			1 to: 1000 do: [ :iteration |
 				1 to: array size do: [ :index |
 					test at: (array at: index) ] ] ] timeToRun;
 		tab;
 		print: [ "This should be substracted from the previous value"
 			1 to: 1000 do: [ :iteration |
 				1 to: array size do: [ :index |
 					array at: index ] ] ] timeToRun;
 		cr;
 		flush ]

The results:
Updated image:
0	100	93	4
0	200	144	4
0	300	125	4
0	400	105	3
0	500	111	4
0	600	150	4
0	700	115	3
-- snip --
0	9600	297	3
0	9700	296	4
0	9800	299	5
1	9900	312	4
0	10000	309	4

Values in the third column are increasing somewhat, but not 
significantly, no spikes etc.

Current image:
0	100	76	3
1	200	119	4
0	300	105	4
0	400	80	4
0	500	112	4
0	600	111	3
1	700	98	4
-- snip --
1	3700	1279	4
2	3800	1998	4
3	3900	3196	4
5	4000	4473	4
11	4100	11301	4
19	4200	18823	4
36	4300	36391	3
93	4400	45294	4
47	4500	46674	3
...

Around 4000 things get really slow. There are also smaller spikes near 
2500. I didn't wait for 4600. As you can see in the first column #at:put: 
is consuming more time every iteration.


Levente



More information about the Squeak-dev mailing list