[ENH] BehaviorHashEnh

Stephan Rudlof sr at evolgo.de
Fri Oct 1 18:04:31 UTC 2004


Hello Stef,

I've now tested with Squeak3.8alpha #6285, and the results remain the
same for me as with Squeak3.7! Whether I fileIn the cs or doIt the
postscript in the cs directly: always the same.

I don't understand your results.


Please give me your results exactly as I've done them below (two doIts
before and the same two again after fileIn of the cs; note that the
*first* and the *second* doIt are *different*!).

I have tried to be very explicit in my explanations, since I have the
impression that we are testing different things.


Any other takers?


Greetings,
Stephan


My results:

*first* doIt
	| allClasses allClassesSet block |
	allClasses := Smalltalk allClasses.
	block _ [allClassesSet _ allClasses asSet.
			[allClasses do: [:class | allClassesSet remove: class]] timeToRun].
	{block value. block value. block value}
 #(6840 6863 6823) #(6988 6907 6862) "before"
 #(199 204 200) #(203 219 206) "after fileIn"
	
*second* doIt
	| allClasses allClassesSet block |
	allClasses := Smalltalk allClasses.
	block _ [allClassesSet _ allClasses asIdentitySet.
			[allClasses do: [:class | allClassesSet remove: class]] timeToRun].
	{block value. block value. block value}.
 #(3112 3166 3108) #(3152 3160 3129) "before"
 #(3196 3225 3186) #(3193 3181 3163) "after fileIn"
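
If the two images really differ, one way to compare them would be a
doIt like the following. It is only a sketch and not part of the cs;
it assumes that CompiledMethod>>timeStamp is available in the image.
It answers whether Behavior>>hash is installed, and the version stamp
of the String>>hash currently in the image.

```smalltalk
"Is Behavior>>hash installed? (expected true only after the cs or its postscript has run)"
{Behavior includesSelector: #hash.
 "Version stamp of the String>>hash currently in the image"
 (String compiledMethodAt: #hash) timeStamp}
```

Running it before and after fileIn in both images should show whether
the same methods are being tested.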


My cs has been (you can also doIt the postscript manually):

'From Squeak3.7beta of ''1 April 2004'' [latest update: #5963] on 22
June 2004 at 8:51:57 pm'!
"Change Set:		BehaviorHashEnh v1.1
Date:			22 June 2004
Author:			Stephan Rudlof

Improves the default Object>>hash for Behaviors by installing
Behavior>>hash. String>>hash has been changed a little to avoid infinite
recursion (without changing its semantics).
All is done in the postscript.

Important
-----------
This is a special changeset: Do not export and import this changeset
again after importing it the first time!! The methods would then no
longer be installed solely in the postscript, leading to serious
problems!!
-----------

Rationale: Object>>hash calling ProtoObject>>identityHash gives poor
results for Behaviors. Therefore a new Behavior>>hash using Symbol>>hash
or String>>hash (the latter slightly changed to avoid infinite
recursion) will be installed.

Consequences:
- It speeds up Set/Dictionary operations with Behaviors a lot (see below).
- The main consequence for objects other than Behaviors seems to be a
changed hash if they use
	self species hash
as a start value for computing their hash.
But AFAICS this doesn't hurt, since in most cases (non meta classes as
species) it maps to Symbol>>hash, which is fast.

Test:
doIt
	| allClasses allClassesSet block |
	allClasses := Smalltalk allClasses.
	block _ [allClassesSet _ allClasses asSet.
			[allClasses do: [:class | allClassesSet remove: class]] timeToRun].
	{block value. block value. block value}
before and after filing in this cs.
To see the problem again just doIt
	| allClasses allClassesSet block |
	allClasses := Smalltalk allClasses.
	block _ [allClassesSet _ allClasses asIdentitySet.
			[allClasses do: [:class | allClassesSet remove: class]] timeToRun].
	{block value. block value. block value}.

Future: Best would probably be a better identityHash with more bits
(possibly in V4?).

PS: I stumbled over BrowserEnvironmentTest>>testAllClassesDo,
which is dog slow without this cs.

History
--------
- v1.1: changed cs comment
- without v no: original version
--------"!

"Postscript:
Compile String and Behavior >>hash here, since compilation has to be
tied together with rehashing Sets possibly containing objects with
changed >>hash."

String compile:
'hash
	"#hash is implemented, because #= is implemented"

	^ByteArray
		hashBytes: self
		startingWith: self species identityHash'.
			
Set quickRehashAllSets.

Behavior compile:
'hash
	^ self name hash'.
	
Set quickRehashAllSets.
!
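
The rationale in the cs comment can be checked directly in a workspace.
The following doIt is only a sketch and not part of the cs; the
variable names are made up for illustration. It counts how many
distinct values identityHash and name hash produce over all classes.
If far fewer distinct identityHash values than classes show up, many
classes collide in a Set, which would be consistent with the slow
remove: timings.

```smalltalk
| allClasses distinctIdentityHashes distinctNameHashes |
allClasses := Smalltalk allClasses.
"Distinct values of the default hash for Behaviors (Object>>hash -> identityHash)."
distinctIdentityHashes := (allClasses collect: [:class | class identityHash]) asSet size.
"Distinct values of the hash the cs installs (Behavior>>hash -> self name hash)."
distinctNameHashes := (allClasses collect: [:class | class name hash]) asSet size.
{allClasses size. distinctIdentityHashes. distinctNameHashes}
```

The doIt answers three numbers: the number of classes, the number of
distinct identityHash values, and the number of distinct name hash
values. Distinct counts are only a rough indicator; the actual probe
lengths also depend on the Set's capacity.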



stéphane ducasse wrote:
> hi Stefan
> 
> with 3.8 6273
> 
> I get | allClasses allClassesSet block |
> 	allClasses := Smalltalk allClasses.
> 	block _ [allClassesSet _ allClasses asSet.
> 			[allClasses do: [:class | allClassesSet remove: class]] timeToRun].
> 	{block value. block value. block value}
>   #(51 48 53)
> 
> and nearly the same in 6208
> 
> Stef
> 
> could you check because something certainly changed.
> 
> On 1 oct. 04, at 04:21, Stephan Rudlof wrote:
> 
> 
>>Hello Stef,
>>
>>nice to see this reviewed!
>>
>>For me there is a dramatic improvement with the cs, copied from a workspace:
>>
>>Test:
>>doIt
>>	| allClasses allClassesSet block |
>>	allClasses := Smalltalk allClasses.
>>	block _ [allClassesSet _ allClasses asSet.
>>			[allClasses do: [:class | allClassesSet remove: class]] timeToRun].
>>	{block value. block value. block value}
>> #(7127 7135 7120) #(7102 7161 7140) #(7341 7353 7300) "before fileIn"	
>> #(251 250 255) #(257 255 250) "after fileIn"
>>	
>>before and after filing in this cs.
>>To see the problem again just doIt
>>	| allClasses allClassesSet block |
>>	allClasses := Smalltalk allClasses.
>>	block _ [allClassesSet _ allClasses asIdentitySet.
>>			[allClasses do: [:class | allClassesSet remove: class]] timeToRun].
>>	{block value. block value. block value}.
>> #(3677 3681 3664) #(3650 3688 3654) #(3656 3695 3648) "before"
>> #(3662 3686 3663) #(3645 3710 3647) "after"
>>
>>
>>I'm using Squeak3.7#5988.
>>
>>ducasse at iam.unibe.ch wrote:
>>
>>>can somebody else have a look at this enh.
>>>
>>>I run the "tests"
>>>
>>>| allClasses allClassesSet block |
>>>	allClasses := Smalltalk allClasses.
>>>	block _ [allClassesSet _ allClasses asSet.
>>>			[allClasses do: [:class | allClassesSet remove: class]] timeToRun].
>>>	{block value. block value. block value}
>>>	
>>>
>>>before and after filing in this cs.
>>>To see the problem again just doIt
>>>	| allClasses allClassesSet block |
>>>	allClasses := Smalltalk allClasses.
>>>	block _ [allClassesSet _ allClasses asIdentitySet.
>>>			[allClasses do: [:class | allClassesSet remove: class]] timeToRun].
>>>	{block value. block value. block value}.
>>>	
>>>	
>>>after loading I got
>>>	-> #(64 57 63)
>>>
>>>To see the problem again just doIt	
>>>	-> #(34 33 39)
>>>	
>>>
>>
>>>So I'm confused because I thought this would be the opposite so I may be
>>>too tired.
>>
>>For me it is the opposite: e.g. 251 to 3662!
>>
>>1. Observations:
>>  - My numbers for the first doIt after fileIn (trying to show the
>>improvement) seem to correspond to your numbers of *both* doIts in
>>magnitude, if you have a GHz machine (I have a slow 366MHz Pentium).
>>  - The proportion of your numbers *after* fileIn is similar to the
>>proportion of mine *before*, but then mine are *much* slower.
>>
>>2. Since you have been tired: the doIts differ, and the cs comment of
>>v1.1 says *not* to fileOut the cs to fileIn it later again.
>>
>>3. Questions:
>>  - What have you gotten before fileIn for both doIts?
>>  - What is the previous version stamp of String>>hash in your test
>>image? For me it is SqR 8/13/2002 10:52.
>>
>>4. Thoughts:
>>  - Has there been a change of SystemDictionary>>allClasses? For me it
>>returns the classes and not their names (symbols).
>>  - Not very probable: Do you have a better identityHash?
>>
>>My VM:
>>squeak -version
>>3.7b-5 #1 Mon Jun 14 18:05:35 CEST 2004 gcc 3.3.3
>>Squeak3.7beta of '1 April 2004' [latest update: #5954]
>>Linux karl 2.4.26sr #1 Sun May 23 19:28:24 CEST 2004 i686 GNU/Linux
>>default plugin location: /usr/local/lib/squeak/3.7b-5/*.so
>>
>>
>>Hope that helps,
>>Stephan
>>
>>
>>>Stef
>>>
>>
>>-- 
>>Stephan Rudlof (sr at evolgo.de)
>>   "Genius doesn't work on an assembly line basis.
>>    You can't simply say, 'Today I will be brilliant.'"
>>    -- Kirk, "The Ultimate Computer", stardate 4731.3
>>
> 
> 
> 
> 

-- 
Stephan Rudlof (sr at evolgo.de)
   "Genius doesn't work on an assembly line basis.
    You can't simply say, 'Today I will be brilliant.'"
    -- Kirk, "The Ultimate Computer", stardate 4731.3


