[Vm-dev] Fwd: MiscPrimitive plugin: plan for next 6 months, please read

Levente Uzonyi leves at caesar.elte.hu
Sat Dec 23 21:49:43 UTC 2017


On Fri, 22 Dec 2017, Clément Bera wrote:

> Hi all,

snip

> 1. translate: aString from: start  to: stop  table: table
>
> This primitive seems to be unused. I suggest we move it from a primitive
> to plain Smalltalk code.

That primitive is currently used to convert ByteStrings to lower and 
upper case, and it's also used to convert between cr and lf line endings. 
The primitive fails in Pharo for some reason, but it's still used in the 
6.0 image I have.
It works properly in Squeak and Cuis.
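
For illustration, here is a rough workspace sketch of what a table-driven translation does, written in plain Smalltalk rather than calling the primitive. The table built here (ASCII-only lowercasing) and the variable names are assumptions made for the example; the tables actually used in the image may differ.

```smalltalk
| table s |
"Hypothetical table: the entry at (code + 1) is the replacement for the character with that code."
table := String new: 256.
0 to: 255 do: [ :code |
	table at: code + 1 put: (Character value: code) ].
"Map A-Z to a-z; every other character maps to itself."
$A asInteger to: $Z asInteger do: [ :code |
	table at: code + 1 put: (Character value: code + 32) ].
"Translate a copy in place, one table lookup per character."
s := 'Hello World' copy.
1 to: s size do: [ :i |
	s at: i put: (table at: (s at: i) asInteger + 1) ].
s
```

The same loop with a different table would swap cr for lf or the other way around, which is why one primitive can cover both uses.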

>
> 2. stringHash: aString initialHash: speciesHash
>
> Since we now have hashMultiply as a primitive, the stringHash primitive
> is now faster for very small strings with a Smalltalk version using
> hashMultiply, and slower (2x) on medium to large strings. I suggest we
> move it to plain Smalltalk code.

In my 64-bit Squeak Spur image, the primitive is faster than the pure 
Smalltalk code (which uses the new hashMultiply primitive) for strings of 
length 4 and longer. The longer the string, the more significant the
difference becomes, probably due to the more frequent use of the
hashMultiply primitive (for a string of length 1000, it's used 1000 
times).

Here's the benchmark I wrote:

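"Time the primitive against the pure Smalltalk version for a range of string sizes.
 Each result row is { size. primitive time. smalltalk time }, with the empty-loop cost subtracted."
data := #(0 1 2 3 4 5 10 20 50 100 200 500 1000) collect: [ :size |
	| s primitive smalltalk overhead iterations |
	Smalltalk garbageCollect.
	s := String new: size withAll: $a.
	"Scale the iteration count so every size hashes roughly the same number of characters."
	iterations := 100000000 // (size max: 1).
	overhead := [ 1 to: iterations do: [ :i | ] ] timeToRun.
	primitive := [ 1 to: iterations do: [ :i | ByteString stringHash: s initialHash: 1 ] ] timeToRun.
	"#stringHash2:initialHash: is the pure Smalltalk version that uses hashMultiply."
	smalltalk := [ 1 to: iterations do: [ :i | ByteString stringHash2: s initialHash: 1 ] ] timeToRun.
	{ size. primitive - overhead. smalltalk - overhead } ].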
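
The code of #stringHash2:initialHash: is not included in this message. A minimal sketch of what such a pure Smalltalk version could look like, assuming it simply adds each character value to the running hash and sends hashMultiply once per character, is shown as a workspace block below. The block name and its exact shape are assumptions, not the code that was benchmarked.

```smalltalk
| stringHash2 result |
"Assumed shape of the pure Smalltalk hash; not the method that was actually measured."
stringHash2 := [ :aString :speciesHash |
	| hash |
	"Keep the running hash within 28 bits."
	hash := speciesHash bitAnd: 16r0FFFFFFF.
	1 to: aString size do: [ :pos |
		"One hashMultiply send per character."
		hash := (hash + (aString at: pos) asInteger) hashMultiply ].
	hash ].
result := stringHash2 value: 'abc' value: 1.
result
```

Written this way, the number of hashMultiply sends grows linearly with the string length.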

So, I suggest the primitive be kept in some form.


What I always wanted to see is a linear search primitive. Something like 
primitiveIndexOfAsciiInString, but more general:
- it should use #== for comparison
- it should return a number according to the following rules:
   - return the (one-based) index of the first indexable field containing
the value, if such a field exists
   - return the (one-based) index of the first pointer field containing the
value, multiplied by -1, if such a field exists
   - return 0 otherwise
We actually have a primitive for this, primitive 132, but it returns a
boolean value instead of the actual index, which makes it far less useful
than it could be.
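
To make the rules above concrete, here is a plain Smalltalk sketch of the proposed return values, written as a workspace block. It rests on two assumptions that the description leaves open: that "pointer field" means a named instance variable, and that indexable fields are checked before pointer fields. The block name indexOfIdentical is made up for the example.

```smalltalk
| indexOfIdentical |
"Hypothetical reference behaviour for the proposed primitive."
indexOfIdentical := [ :anObject :value |
	| result |
	result := 0.
	"Indexable fields first: answer the positive one-based index of the first match."
	1 to: anObject basicSize do: [ :i |
		(result = 0 and: [ (anObject basicAt: i) == value ])
			ifTrue: [ result := i ] ].
	"Then named instance variables (assumed meaning of 'pointer fields'): answer the negated index."
	1 to: anObject class instSize do: [ :i |
		(result = 0 and: [ (anObject instVarAt: i) == value ])
			ifTrue: [ result := i negated ] ].
	"Still 0 if nothing matched."
	result ].
indexOfIdentical value: #(#a #b #c) value: #b
```

With these assumptions, searching the literal array above for #b answers 2, searching 3 @ 4 for 4 answers -2 (the second instance variable of the Point), and searching Object new for any value answers 0.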

Levente

