Slang question (was: Re: [squeak-dev] The Trunk: Collections-ul.685.mcz)

David T. Lewis lewis at mail.msen.com
Sun Apr 3 22:00:13 UTC 2016


Many people will not be familiar with this optimization, so a few words of
explanation are in order for background:

There is a mechanism for translating certain methods in the image directly
into primitives for performance. This is normally not noticed, but you will
find occasional refernces such at this one in the image:

  findSubstring: key in: body startingAt: start matchTable: matchTable
       <primitive: 'primitiveFindSubstring' module: 'MiscPrimitivePlugin'>

This is a method that first tries to call a version of itself that has
been translated to C. If the primitive fails, the normal Smalltalk method
is executed. The hidden tricky bit is that MiscPrimitivePlugin, which
lives in the VMMaker package, arranges for the #findSubstring:in:startingAt:matchTable:
method in the image to be translated to C and installed as a primitive.
This means that the generated VM is actually governed partly my methods
in the main image (not in the VMMaker package).

It also means that if one of these methods is changed in the image, it
might accidentally result in unexpected (and possibly wrong) code being
generated for the plugin in the VM. That is the reason that Levente is
asking about the slang code generation in this case.

Dave


On Sun, Apr 03, 2016 at 07:45:13PM +0200, Levente Uzonyi wrote:
> On Sun, 3 Apr 2016, commits at source.squeak.org wrote:
> 
> >Levente Uzonyi uploaded a new version of Collections to project The Trunk:
> >http://source.squeak.org/trunk/Collections-ul.685.mcz
> >
> >==================== Summary ====================
> >
> >Name: Collections-ul.685
> >Author: ul
> >Time: 3 April 2016, 3:21:51.035808 am
> >UUID: db8fd391-2306-4ccc-87d5-b5dee96a78ab
> >Ancestors: Collections-eem.684
> >
> >Use Spur's new character comparison abilities in some String methods.
> >
> >=============== Diff against Collections-eem.684 ===============
> >
> >Item was changed:
> > ----- Method: ByteString>>findSubstring:in:startingAt:matchTable: (in 
> > category 'comparing') -----
> > findSubstring: key in: body startingAt: start matchTable: matchTable
> > 	"Answer the index in the string body at which the substring key 
> > 	first occurs, at or beyond start.  The match is determined using 
> > 	matchTable, which can be used to effect, eg, case-insensitive 
> > 	matches.  If no match is found, zero will be returned.
> >
> > 	The algorithm below is not optimum -- it is intended to be 
> > 	translated to C which will go so fast that it wont matter."
> > 	| index |
> > 	<primitive: 'primitiveFindSubstring' module: 'MiscPrimitivePlugin'>
> > 	<var: #key declareC: 'unsigned char *key'>
> > 	<var: #body declareC: 'unsigned char *body'>
> > 	<var: #matchTable declareC: 'unsigned char *matchTable'>
> >
> > 	key size = 0 ifTrue: [^ 0].
> >+ 	matchTable ifNil: [
> >+ 		start to: body size - key size + 1 do: [ :startIndex |
> >+ 			index := 0.
> >+ 			[ (body at: startIndex + index) == (key at: (index 
> >:= index + 1)) ] whileTrue: [
> >+ 				index = key size ifTrue: [ ^startIndex ] ] ].
> >+ 		^0 ].
> 
> I wonder if the #ifNil: check is correct in Slang. The generated code is
> 
> ...
>     unsigned char *matchTable;
> ...
>         matchTable = arrayValueOf(stackValue(0));
>         matchTable -= 1;
> ...
>         if (!(matchTable)) {
> ...
> 
> So, will the value of matchTable actually be NULL or not, when nil is 
> passed to the primitive. That -= 1 makes me think that it won't be NULL.
> 
> Another thing is that the loop condition, while it's order of evaluation 
> is well defined in Smalltalk, will compile to C code containing undefined 
> behavior:
> 
>                         while ((body[startIndex + index]) == (key[index += 
>                         1])) {
> 
> Since in C the evaluation of the left and right side of == can happen in 
> any order, the generated machine code may or may not be correct.
> Is this is a difference between Smalltalk and Slang, or is it just a slip 
> in the code generator?
> 
> Levente
> 
> > 	(start max: 1) to: body size - key size + 1 do:
> > 		[:startIndex |
> > 		index := 1.
> > 			[(matchTable at: (body at: startIndex+index-1) 
> > 			asciiValue + 1)
> > 				= (matchTable at: (key at: index) asciiValue 
> > 				+ 1)]
> > 				whileTrue:
> > 				[index = key size ifTrue: [^ startIndex].
> > 				index := index+1]].
> > 	^ 0
> > "
> > ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 1 matchTable: 
> > CaseSensitiveOrder 1
> > ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 2 matchTable: 
> > CaseSensitiveOrder 7
> > ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 8 matchTable: 
> > CaseSensitiveOrder 0
> > ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable: 
> > CaseSensitiveOrder 0
> > ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable: 
> > CaseInsensitiveOrder 7
> > "!


More information about the Squeak-dev mailing list