Slang question (was: Re: [squeak-dev] The Trunk: Collections-ul.685.mcz)

Levente Uzonyi leves at caesar.elte.hu
Thu Apr 7 21:19:26 UTC 2016


Hi Dave,

On Tue, 5 Apr 2016, David T. Lewis wrote:

> I just got around to checking this for code generation for the interpreter VM
> (VMMaker, not VMMaker.oscog).  The #ifNil: is not translated at all, so I replaced
> it with this:

That's pretty interesting. The Cog branch behaves differently

 	matchTable ifNil: [

will become

 	if (!matchTable) {

while

 	matchTable = nil ifTrue: [
and
 	matchTable == nil ifTrue: [

will both compile to

 	if (matchTable == null) {

just like in the main branch.
All of these are correct, because null is just a macro to 0, but it's a 
bit surprising that "ifNil:" and "== nil ifTrue:" are not the same in 
Slang, while they are in Squeak.

My main concern is how matchTable is initialized. The generated code is

         matchTable = arrayValueOf(stackValue(0));
         matchTable -= 1;
         if (failed()) {
                 return null;
         }

which makes me think that matchTable will not be the C null pointer at 
all when the argument nil is passed to the primitive, because

 	matchTable -= 1;

would not yield an unsigned char* with 0 value unless arrayValueOf 
returns a pointer with the value 1 (assuming that sizeof(unsigned char) is 
1).
I haven't found where arrayValueOf was defined, so I can't tell if it has 
a chance to work at all.

Levente

P.S.: Since we're talking about C here, I CC'd the vm-dev list.

>
> 	matchTable = nil ifTrue:
>
> This produces the expected result, and can be compiled.
>
> There are other issues in the resulting VM, but they do not seem to be
> related to Collections-ul.685 (I reverted to Collections-ul.684 with the
> same results). I don't know the cause of those issues, but I will follow
> up in a separate thread as needed.
>
> With respect to Collections-ul.685, replacing the ifNil: test as above
> seems like the right thing to do.
>
> Dave
>
>
> On Sun, Apr 03, 2016 at 06:00:13PM -0400, David T. Lewis wrote:
>> Many people will not be familiar with this optimization, so a few words of
>> explanation are in order for background:
>>
>> There is a mechanism for translating certain methods in the image directly
>> into primitives for performance. This is normally not noticed, but you will
>> find occasional refernces such at this one in the image:
>>
>>   findSubstring: key in: body startingAt: start matchTable: matchTable
>>        <primitive: 'primitiveFindSubstring' module: 'MiscPrimitivePlugin'>
>>
>> This is a method that first tries to call a version of itself that has
>> been translated to C. If the primitive fails, the normal Smalltalk method
>> is executed. The hidden tricky bit is that MiscPrimitivePlugin, which
>> lives in the VMMaker package, arranges for the #findSubstring:in:startingAt:matchTable:
>> method in the image to be translated to C and installed as a primitive.
>> This means that the generated VM is actually governed partly my methods
>> in the main image (not in the VMMaker package).
>>
>> It also means that if one of these methods is changed in the image, it
>> might accidentally result in unexpected (and possibly wrong) code being
>> generated for the plugin in the VM. That is the reason that Levente is
>> asking about the slang code generation in this case.
>>
>> Dave
>>
>>
>> On Sun, Apr 03, 2016 at 07:45:13PM +0200, Levente Uzonyi wrote:
>>> On Sun, 3 Apr 2016, commits at source.squeak.org wrote:
>>>
>>>> Levente Uzonyi uploaded a new version of Collections to project The Trunk:
>>>> http://source.squeak.org/trunk/Collections-ul.685.mcz
>>>>
>>>> ==================== Summary ====================
>>>>
>>>> Name: Collections-ul.685
>>>> Author: ul
>>>> Time: 3 April 2016, 3:21:51.035808 am
>>>> UUID: db8fd391-2306-4ccc-87d5-b5dee96a78ab
>>>> Ancestors: Collections-eem.684
>>>>
>>>> Use Spur's new character comparison abilities in some String methods.
>>>>
>>>> =============== Diff against Collections-eem.684 ===============
>>>>
>>>> Item was changed:
>>>> ----- Method: ByteString>>findSubstring:in:startingAt:matchTable: (in
>>>> category 'comparing') -----
>>>> findSubstring: key in: body startingAt: start matchTable: matchTable
>>>> 	"Answer the index in the string body at which the substring key
>>>> 	first occurs, at or beyond start.  The match is determined using
>>>> 	matchTable, which can be used to effect, eg, case-insensitive
>>>> 	matches.  If no match is found, zero will be returned.
>>>>
>>>> 	The algorithm below is not optimum -- it is intended to be
>>>> 	translated to C which will go so fast that it wont matter."
>>>> 	| index |
>>>> 	<primitive: 'primitiveFindSubstring' module: 'MiscPrimitivePlugin'>
>>>> 	<var: #key declareC: 'unsigned char *key'>
>>>> 	<var: #body declareC: 'unsigned char *body'>
>>>> 	<var: #matchTable declareC: 'unsigned char *matchTable'>
>>>>
>>>> 	key size = 0 ifTrue: [^ 0].
>>>> + 	matchTable ifNil: [
>>>> + 		start to: body size - key size + 1 do: [ :startIndex |
>>>> + 			index := 0.
>>>> + 			[ (body at: startIndex + index) == (key at: (index
>>>> := index + 1)) ] whileTrue: [
>>>> + 				index = key size ifTrue: [ ^startIndex ] ] ].
>>>> + 		^0 ].
>>>
>>> I wonder if the #ifNil: check is correct in Slang. The generated code is
>>>
>>> ...
>>>     unsigned char *matchTable;
>>> ...
>>>         matchTable = arrayValueOf(stackValue(0));
>>>         matchTable -= 1;
>>> ...
>>>         if (!(matchTable)) {
>>> ...
>>>
>>> So, will the value of matchTable actually be NULL or not, when nil is
>>> passed to the primitive. That -= 1 makes me think that it won't be NULL.
>>>
>>> Another thing is that the loop condition, while it's order of evaluation
>>> is well defined in Smalltalk, will compile to C code containing undefined
>>> behavior:
>>>
>>>                         while ((body[startIndex + index]) == (key[index +=
>>>                         1])) {
>>>
>>> Since in C the evaluation of the left and right side of == can happen in
>>> any order, the generated machine code may or may not be correct.
>>> Is this is a difference between Smalltalk and Slang, or is it just a slip
>>> in the code generator?
>>>
>>> Levente
>>>
>>>> 	(start max: 1) to: body size - key size + 1 do:
>>>> 		[:startIndex |
>>>> 		index := 1.
>>>> 			[(matchTable at: (body at: startIndex+index-1)
>>>> 			asciiValue + 1)
>>>> 				= (matchTable at: (key at: index) asciiValue
>>>> 				+ 1)]
>>>> 				whileTrue:
>>>> 				[index = key size ifTrue: [^ startIndex].
>>>> 				index := index+1]].
>>>> 	^ 0
>>>> "
>>>> ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 1 matchTable:
>>>> CaseSensitiveOrder 1
>>>> ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 2 matchTable:
>>>> CaseSensitiveOrder 7
>>>> ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 8 matchTable:
>>>> CaseSensitiveOrder 0
>>>> ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable:
>>>> CaseSensitiveOrder 0
>>>> ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable:
>>>> CaseInsensitiveOrder 7
>>>> "!
>
>


More information about the Squeak-dev mailing list