Thue-Morse and performance: Squeak vs. Strongtalk vs. VisualWorks

Sun Dec 17 19:32:51 UTC 2006

Klaus D. Witzel wrote:
>> The claim is about "Smalltalk code", not about "Klaus Witzel 
>> Benchmarks" (the difference between the two should be obvious).
> 
> Not that I see any difference, I posted Smalltalk code (perhaps you 
> meant something else?)

Yes, clearly you don't see the difference, and this seems to be at the 
heart of the problem. You are running a micro-benchmark with specific 
performance characteristics that are not typical of Smalltalk code in 
the large. Of course, you are free to make up your own performance 
characteristics and measure these, but that's what I call "Klaus Witzel 
Benchmarks" - code that has been chosen because it has performance 
characteristics that you want to measure, not the performance 
characteristics that "Smalltalk code" *typically* has.

The Strongtalk claims are about *typical* Smalltalk performance 
characteristics. Nobody has ever claimed that Strongtalk would run any 
code with any performance characteristic that anyone could ever come up 
with faster than other Smalltalks. In particular, there is no claim 
about "faster polymorphic send performance than any other Smalltalk".

Nevertheless, solely based on this benchmark (which, again, does not 
reflect typical Smalltalk performance characteristics) you are making 
outrageous claims like: "I'm sorry to tell that Strongtalk is NOT that 
fast." or "I'm disappointed, Strongtalk was always advertised as being 
the fastest Smalltalk available "...executes Smalltalk much faster than 
any other Smalltalk implementation...", and now it shows to be in almost 
the same class as Squeak is".

That's what I object to. Your benchmark is absolutely no basis for such 
far-reaching and (once you do some real benchmarking) obviously false 
claims. A single micro-benchmark is simply not enough to judge overall 
performance.

>> And as I am saying in the above the *actual* code has "Smalltalk 
>> performance characteristics" whereas your made-up micro-benchmark 
>> doesn't.
> 
> C'mon. Sending messages to elements of collections _is_ characteristic 
> for the Smalltalks.

Yes, sending messages to elements of collections is characteristic. But 
sending messages to elements of *highly polymorphic* collections (which 
you specifically constructed for the benchmark) is not.
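
To make the distinction concrete, here is a minimal workspace sketch. The 
two collections are made up purely for illustration (they are not taken 
from your benchmark); the block simply counts how many distinct receiver 
classes a send inside #do: would encounter.

```smalltalk
| mono poly classCount |
"Hypothetical collections, made up for illustration only."
mono := OrderedCollection withAll: #(1 2 3 4).
poly := OrderedCollection new.
poly add: 1; add: 'two'; add: #three; add: 4.0; add: nil.

"Answer the number of distinct classes among the elements of coll."
classCount := [:coll | (coll collect: [:each | each class]) asSet size].

classCount value: mono.  "1 - every element is a SmallInteger"
classCount value: poly.  "5 - every element has a different class"
```

A send site iterating over the first collection sees one receiver class 
and stays monomorphic; the second forces a different class on every 
iteration.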

Fortunately, it is very easy to show just how non-characteristic your 
choice of collection is by looking at an actual image:

	lastObj := Object new.
	nextObj := nil someObject.
	bag := Bag new.
	[nextObj == lastObj] whileFalse:[
		nextObj isCollection ifTrue:[
			set := Set new.
			nextObj do:[:each| set add: each class].
			bag add: set size.
		].
		nextObj := nextObj nextObject.
	].
	max := bag size.
	bag sortedCounts do:[:assoc|
		Transcript crtab; show: assoc key.
		Transcript show: ' (', ((100.0 * assoc key / max) truncateTo: 0.01) asString, '%): '.
		Transcript show: assoc value.
	].

The result of which is (in a Croquet image I'm doing my work in):

	306384 (85.12%): 1
	31278 (8.69%): 2
	19377 (5.38%): 0
	2487 (0.69%): 3
	178 (0.04%): 4
	51 (0.01%): 5
	38 (0.01%): 6
	18 (0.0%): 10
	17 (0.0%): 7
	14 (0.0%): 8
	8 (0.0%): 9
	[...etc...]

In other words, more than 90% of all the collections (some 360,000, so 
it's a nice big sample) have at most a single receiver type. Almost 9% 
have two receiver types. Everything else is noise. If you keep in mind 
that a good amount of those two-type collections are monomorphic 
collections backed by Arrays that use nil to indicate empty slots, the 
practical percentage of monomorphic collections is probably somewhere 
between 95% and 98%.
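
For anyone who wants to check the arithmetic, here is a small sketch that 
adds up the shares printed in the listing above. The variable names are 
mine, and the values are kept as integer hundredths of a percent so that 
no floating-point rounding gets in the way.

```smalltalk
| share atMostOne atMostTwo |
"Shares copied from the listing above, in hundredths of a percent,
 keyed by the number of distinct element classes."
share := Dictionary new.
share at: 0 put: 538; at: 1 put: 8512; at: 2 put: 869.

"Collections with zero or one element class."
atMostOne := (share at: 0) + (share at: 1).   "9050, i.e. 90.50%"

"Adding the two-class collections as well."
atMostTwo := atMostOne + (share at: 2).       "9919, i.e. 99.19%"
```

The 95-98% estimate lies between these two figures because it assumes, 
without measuring it, that a large part of the two-class collections 
contain nil plus one real element class.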

So no, your benchmark is not characteristic of Smalltalk code.

Cheers,
   - Andreas