[squeak-dev] [Vm-dev] Sign-bit bug in character literals > 16r7FFF ... related to SistaV1?

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Wed Mar 9 12:13:27 UTC 2022


VM fix (workaround) proposed in VMMakerInbox/VMMaker.oscog-nice.3174
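
For readers who want to check the arithmetic outside the image, here is a minimal Python sketch of the decoding quoted further down in this thread. It is not VM code. The function names are made up for illustration, the 30-bit mask is only an assumption about how a 32-bit VM truncates the immediate value, and the unsigned variant merely illustrates the idea of an unsigned extension; it is not the change contained in VMMaker.oscog-nice.3174.

```python
def decode_ext_b(ext_b: int, ext_byte: int) -> int:
    """Model of the extB accumulation quoted from #interpretNextSistaV1InstructionFor:."""
    if ext_b == 0 and ext_byte > 127:
        return ext_byte - 256          # first extension byte is read as signed
    return (ext_b << 8) + ext_byte


def pushed_character_value(ext_b: int, byte: int) -> int:
    """Value passed to `Character value:` for bytecode 233, as quoted in the thread."""
    return (ext_b << 8) + byte


def unsigned_character_value(ext_byte: int, byte: int) -> int:
    """Hypothetical unsigned reading of the same extension byte (illustration only)."""
    return ((ext_byte & 0xFF) << 8) + byte


code_point = 0x8000
ext_byte, byte = code_point >> 8, code_point & 0xFF    # 0x80 and 0x00

ext_b = decode_ext_b(0, ext_byte)                      # sign-extended extension
signed_value = pushed_character_value(ext_b, byte)     # what the decoder computes
masked_30_bits = signed_value & 0x3FFFFFFF             # assumed 32-bit truncation
unsigned_value = unsigned_character_value(ext_byte, byte)

print(ext_b, signed_value, hex(masked_30_bits), hex(unsigned_value))
# prints: -128 -32768 0x3fff8000 0x8000
```

Under these assumptions the signed reading gives -16r8000 (the 64-bit symptom) and its low 30 bits give 16r3FFF8000 (the 32-bit symptom), while an unsigned reading gives back 16r8000.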

On Wed, Mar 9, 2022 at 10:25, Nicolas Cellier <
nicolas.cellier.aka.nice at gmail.com> wrote:

> IOW, the Character value being unsigned, it would be preferable to use
> extend A rather than extend B in #genPushCharacter:
> My understanding is that this would require a VM change too...
>
> On Wed, Mar 9, 2022 at 10:08, Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com> wrote:
>
>> Oops, forgot to respond to squeak-dev too...
>>
>> In #interpretNextSistaV1InstructionFor: we see that extB is interpreted as
>> a signed char:
>>
>> extB := (extB = 0 and: [extByte > 127])
>> ifTrue: [extByte - 256]
>> ifFalse: [(extB bitShift: 8) + extByte]
>>
>> Then in interpretNext2ByteSistaV1Instruction: bytecode for: client extA:
>> extA extB: extB startPC: startPC
>>
>> ^client pushSpecialConstant: (Character value: (extB bitShift: 8) + byte)
>>
>> In our case, extA=0, extB=-128, bytecode=233
>>
>> On Wed, Mar 9, 2022 at 10:02, Nicolas Cellier <
>> nicolas.cellier.aka.nice at gmail.com> wrote:
>>
>>> Hi Marcel,
>>> yes, I agree, the bug is in bytecode encoding/decoding of immediate
>>> Character value,
>>> I stepped into (Compiler evaluate: (String with: $$ with: (Character
>>> value: 16r8000))), and if we step into executeMethod, we can inspect what
>>> is going on.
>>>
>>>
>>> On Wed, Mar 9, 2022 at 08:39, Marcel Taeumel <marcel.taeumel at hpi.de>
>>> wrote:
>>>
>>>>
>>>> Hi Nicolas --
>>>>
>>>> There is a bug in the EncoderForSistaV1. The behavior is okay for
>>>> EncoderForV3PlusClosures. We can discuss this on squeak-dev now, I suppose.
>>>>
>>>> CompiledCode preferredBytecodeSetEncoderClass: EncoderForSistaV1.
>>>> CompiledCode preferredBytecodeSetEncoderClass: EncoderForV3PlusClosures.
>>>>
>>>> If you do send #halt instead of #asInteger, you get another interesting
>>>> debugger when trying to start debugging:
>>>>
>>>>
>>>>
>>>> Best,
>>>> Marcel
>>>>
>>>> On 09.03.2022 at 08:34:11, Nicolas Cellier <
>>>> nicolas.cellier.aka.nice at gmail.com> wrote:
>>>> Ah OK, I see it on macos too
>>>> It remains to determine which operation exactly is involved...
>>>> The TextMorph holding the printed result is correct - a WideString,
>>>> whose last Character is (Character value: 32768).
>>>>
>>>> On Wed, Mar 9, 2022 at 08:08, Marcel Taeumel <marcel.taeumel at hpi.de>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> Hi Dave, hi Nicolas --
>>>>>
>>>>> I am working in Windows 10.
>>>>>
>>>>> > I cannot reproduce on Linux 64 bit either:
>>>>> > (Character value: 16r8000) asInteger hex ==> '16r8000'
>>>>>
>>>>> That's not how you would reproduce it. The bug affects character
>>>>> literals, not character objects/instances. You have to evaluate code on
>>>>> that character literal.
>>>>>
>>>>> Maybe this picture helps:
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>> Marcel
>>>>>
>>>>>
>>>>> On 08.03.2022 at 18:56:09, David T. Lewis <lewis at mail.msen.com> wrote:
>>>>>
>>>>> I cannot reproduce on Linux 64 bit either:
>>>>>
>>>>> (Character value: 16r8000) asInteger hex ==> '16r8000'
>>>>>
>>>>> Dave
>>>>>
>>>>>
>>>>> On Tue, Mar 08, 2022 at 06:45:23PM +0100, Nicolas Cellier wrote:
>>>>> >
>>>>> > Hi Marcel,
>>>>> > which OS ?
>>>>> > I cannot reproduce on macos 64,
>>>>> >
>>>>> > Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.3172]
>>>>> > 5.20211023.2003
>>>>> > Mac OS X built on Mar 6 2022 15:31:16 CET Compiler: 4.2.1 Compatible
>>>>> > Apple LLVM 10.0.1 (clang-1001.0.46.4)
>>>>> > platform sources revision VM: 202110232003
>>>>> >
>>>>> > On Tue, Mar 8, 2022 at 17:57, Marcel Taeumel wrote:
>>>>> >
>>>>> > >
>>>>> > > Hi Eliot, hi all --
>>>>> > >
>>>>> > > I think we have a sign-bit bug for character literals with
>>>>> > > code points > 16r7FFF.
>>>>> > >
>>>>> > > Steps to reproduce:
>>>>> > >
>>>>> > > 1. Print it: "Character value: 16r8000"
>>>>> > > 2. Inspect the result by evaluating the character literal or send
>>>>> > > #asInteger to it. It will most likely not render in a standard Squeak
>>>>> > > and show up like "$? asInteger".
>>>>> > >
>>>>> > > In a 32-bit VM, I will get the (positive) integer value 16r3FFF8000.
>>>>> > > In a 64-bit VM, I will get the (negative) integer value '-16r8000'.
>>>>> > >
>>>>> > > Somehow, counting from bit 0, the bits 16 to 29 flip from 0 to 1. In
>>>>> > > 64-bit, this means a negative number. Not sure about bits 30 and 31 here.
>>>>> > >
>>>>> > > Is there a bug in the upper tag bits of immediate characters?
>>>>> > > Is this related to the 2-byte or 3-byte byte codes in SistaV1?
>>>>> > >
>>>>> > > Works fine up to 16r7FFF. (This is unrelated to #leadingChar. Mine
>>>>> > > was 0 in this experiment.)
>>>>> > >
>>>>> > > VM: 202112201228 (VMMaker.oscog-eem.3116)
>>>>> > >
>>>>> > > Best,
>>>>> > > Marcel
>>>>> > >
>>>>>
>>>>>
>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 14513 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20220309/06f03501/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 90427 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20220309/06f03501/attachment-0003.png>

