[Vm-dev] Fwd: MiscPrimitive plugin: plan for next 6 months, please read
Levente Uzonyi
leves at caesar.elte.hu
Sat Dec 23 21:49:43 UTC 2017
On Fri, 22 Dec 2017, Clément Bera wrote:
> Hi all,
snip
> 1. translate: aString from: start to: stop table: table
>
> This primitive seems to be unused. I suggest we move it from a primitive
> to plain Smalltalk code.
That primitive is currently used to convert ByteStrings to lower and
upper case, and it's also used to convert between cr and lf line endings.
The primitive fails in Pharo for some reason, but it's still used in the
6.0 image I have.
It works properly in Squeak and Cuis.
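To illustrate what the primitive computes, here is a minimal workspace sketch of the same per-character table lookup in plain Smalltalk, applied to lower-casing. It is not the code from any image; the variable names and the way the table is built are assumptions made for the example.

```smalltalk
"Sketch only: a plain Smalltalk loop with the same effect as
 translate:from:to:table:, using a lower-casing table.
 The names table and s are illustrative."
| table s |
"256-entry table: the entry at (code + 1) holds the replacement
 character for the byte value code."
table := String new: 256.
0 to: 255 do: [ :code |
    table at: code + 1 put: (Character value: code) asLowercase ].
"Copy the literal so the string being modified is mutable."
s := 'Hello World' copy.
1 to: s size do: [ :i |
    s at: i put: (table at: (s at: i) asciiValue + 1) ].
s "'hello world'"
```

A cr/lf conversion has the same shape: only the table contents change.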
>
> 2. stringHash: aString initialHash: speciesHash
>
> Since we now have hashMultiply as a primitive, the stringHash primitive
> is now faster for very small strings with a Smalltalk version using
> hashMultiply, and slower (2x) on medium to large strings. I suggest we
> move it to plain Smalltalk code.
In my 64-bit Squeak Spur image, the primitive is faster than the pure
Smalltalk code (which uses the new hashMultiply primitive) for strings of
length 4 and longer. The longer the string, the more significant the
difference becomes, probably because the Smalltalk version sends the
hashMultiply primitive once per character (1000 times for a string of
length 1000).
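For readers who have not seen the pure Smalltalk variant, the loop below is a rough workspace sketch of its likely shape. It is an assumption for illustration, not a copy of the method that was measured; the point is only that hashMultiply is sent once per character.

```smalltalk
"Sketch only: a string hash loop written in plain Smalltalk on top of
 hashMultiply. The exact code being compared may differ."
| s hash |
s := 'abc'.
"Start from the initial hash, truncated to 28 bits."
hash := 1 bitAnd: 16rFFFFFFF.
1 to: s size do: [ :i |
    "One hashMultiply send per character."
    hash := (hash + (s at: i) asciiValue) hashMultiply ].
hash
```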
Here's the benchmark I wrote:
data := #(0 1 2 3 4 5 10 20 50 100 200 500 1000) collect: [ :size |
    | s primitive smalltalk overhead iterations |
    Smalltalk garbageCollect.
    s := String new: size withAll: $a.
    "Fewer iterations for longer strings, so each size does similar total work."
    iterations := 100000000 // (size max: 1).
    "Time the empty loop so it can be subtracted below."
    overhead := [ 1 to: iterations do: [ :i | ] ] timeToRun.
    primitive := [ 1 to: iterations do: [ :i | ByteString stringHash: s initialHash: 1 ] ] timeToRun.
    "stringHash2:initialHash: is the pure Smalltalk variant."
    smalltalk := [ 1 to: iterations do: [ :i | ByteString stringHash2: s initialHash: 1 ] ] timeToRun.
    { size. primitive - overhead. smalltalk - overhead } ].
So, I suggest the primitive be kept in some form.
What I always wanted to see is a linear search primitive. Something like
primitiveIndexOfAsciiInString, but more general:
- it should use #== for comparison
- it should return a number according to the following rules:
  - return the (one-based) index of the first indexable field containing
    the value, if one exists
  - return the negated (one-based) index of the first pointer field
    containing the value, if one exists
  - return 0 otherwise
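The return convention above can be sketched with ordinary reflection. This is only an illustration of the proposed results, not a proposed implementation; reading "pointer field" as a fixed (named) instance variable is an assumption, and the block name search is hypothetical.

```smalltalk
"Sketch only: the proposed result convention expressed with reflection.
 Treating 'pointer field' as a named instance variable is an assumption."
| search |
search := [ :object :value | | result |
    result := 0.
    "Fixed (named) fields: scan backwards so the first match wins,
     and report the negated one-based index."
    object class instSize to: 1 by: -1 do: [ :i |
        (object instVarAt: i) == value ifTrue: [ result := i negated ] ].
    "Indexable fields: a match here takes precedence and is reported
     as a positive one-based index."
    object basicSize to: 1 by: -1 do: [ :i |
        (object basicAt: i) == value ifTrue: [ result := i ] ].
    result ].
search value: #(#a #b #c) value: #b "2"
```

A single signed result lets the caller tell the two kinds of field apart without a second call.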
We actually have a primitive for this, primitive 132, but it returns a
boolean value instead of the actual index, which makes it far less useful
than it could be.
Levente