[Vm-dev] Sign-bit bug in character literals > 16r7FFF ... related to SistaV1?

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Wed Mar 9 14:41:30 UTC 2022


Or just restrict EncoderForSistaV1>>#genPushCharacter:

...snip...
     (code < 0 or: [code > 16r7FFF]) ifTrue:
         [^self outOfRangeError: 'character' index: code range: 0 to: 16r7FFF].
...snip...
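
For illustration, here is a minimal workspace sketch of why 16r7FFF looks like the natural cutoff, assuming (this is not confirmed anywhere in the thread) that the character operand ends up being read back as a signed 16-bit value. The block name signExtend16 and the 30-bit mask are hypothetical, chosen only to mirror the two reported results:

```smalltalk
"Hypothetical helper: reinterpret the low 16 bits of a code point as a signed 16-bit integer."
| signExtend16 |
signExtend16 := [:code | | low |
	low := code bitAnd: 16rFFFF.
	low >= 16r8000
		ifTrue: [low - 16r10000]
		ifFalse: [low]].

signExtend16 value: 16r7FFF.	"32767: the top bit is clear, nothing changes"
signExtend16 value: 16r8000.	"-32768: matches the -16r8000 reported on 64-bit"

"Assumption: a 32-bit image keeps only 30 value bits of the result."
(signExtend16 value: 16r8000) bitAnd: 16r3FFFFFFF.	"16r3FFF8000: matches the 32-bit report"
```

If that assumption holds, every code point from 0 to 16r7FFF survives unchanged, which is exactly the range the check above accepts.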

On Wed, 9 Mar 2022 at 14:16, Marcel Taeumel <marcel.taeumel at hpi.de>
wrote:

>
> Hi Nicolas --
>
> Thanks! Also for the proposed workaround in VMMaker.oscog-nice.3174.
>
> For what it's worth, one can always replace the character-literal syntax
> with string access:
>
> $x.
> 'x' first.
>
> Or store the code point if the optical appearance is not relevant:
>
> Character value: 16r78.
>
> Best,
> Marcel
>
> On 09.03.2022 at 10:02:46, Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com> wrote:
> Hi Marcel,
> yes, I agree: the bug is in the bytecode encoding/decoding of the
> immediate Character value.
> I stepped into (Compiler evaluate: (String with: $$ with: (Character
> value: 16r8000))); stepping further into executeMethod, we can inspect
> what is going on.
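
A minimal way to run this reproduction from a workspace might look like the sketch below. It builds the source text so that it contains a real character literal, which is what matters here. The variable names are illustrative only, and the results mentioned in the comments are the ones reported in this thread, not something the snippet guarantees:

```smalltalk
| wide source |
wide := Character value: 16r8000.

"Character instance: this path was reported to work."
wide asInteger.	"32768"

"Character literal: the source is $ plus the wide character, followed by ' asInteger'."
source := (String with: $$ with: wide), ' asInteger'.
Compiler evaluate: source.	"reported with EncoderForSistaV1: -16r8000 on 64-bit, 16r3FFF8000 on 32-bit"
```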
>
>
> On Wed, 9 Mar 2022 at 08:39, Marcel Taeumel wrote:
>
> >
> > Hi Nicolas --
> >
> > There is a bug in EncoderForSistaV1. The behavior is okay for
> > EncoderForV3PlusClosures. We can discuss this on squeak-dev now, I
> > suppose.
> >
> > CompiledCode preferredBytecodeSetEncoderClass: EncoderForSistaV1.
> > CompiledCode preferredBytecodeSetEncoderClass: EncoderForV3PlusClosures.
> >
> > If you do send #halt instead of #asInteger, you get another interesting
> > debugger when trying to start debugging:
> >
> >
> >
> > Best,
> > Marcel
> >
> > On 09.03.2022 at 08:34:11, Nicolas Cellier <
> > nicolas.cellier.aka.nice at gmail.com> wrote:
> > Ah OK, I see it on macOS too.
> > It remains to determine exactly which operation is involved...
> > The TextMorph holding the printed result is correct: a WideString
> > whose last Character is (Character value: 32768).
> >
> > On Wed, 9 Mar 2022 at 08:08, Marcel Taeumel wrote:
> >
> > >
> > > Hi Dave, hi Nicolas --
> > >
> > > I am working in Windows 10.
> > >
> > > > I cannot reproduce on Linux 64 bit either:
> > > > (Character value: 16r8000) asInteger hex ==> '16r8000'
> > >
> > > That's not how you would reproduce it. The bug affects character
> > > literals, not character objects/instances. You have to evaluate
> > > code on that character literal.
> > >
> > > Maybe this picture helps:
> > >
> > >
> > >
> > > Best,
> > > Marcel
> > >
> > > On 08.03.2022 at 18:56:09, David T. Lewis <lewis at mail.msen.com> wrote:
> > >
> > > I cannot reproduce on Linux 64 bit either:
> > >
> > > (Character value: 16r8000) asInteger hex ==> '16r8000'
> > >
> > > Dave
> > >
> > >
> > > On Tue, Mar 08, 2022 at 06:45:23PM +0100, Nicolas Cellier wrote:
> > > >
> > > > Hi Marcel,
> > > > Which OS?
> > > > I cannot reproduce on 64-bit macOS:
> > > >
> > > > Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.3172]
> > > > 5.20211023.2003
> > > > Mac OS X built on Mar 6 2022 15:31:16 CET Compiler: 4.2.1 Compatible
> > > > Apple LLVM 10.0.1 (clang-1001.0.46.4)
> > > > platform sources revision VM: 202110232003
> > > >
> > > > On Tue, 8 Mar 2022 at 17:57, Marcel Taeumel wrote:
> > > >
> > > > >
> > > > > Hi Eliot, hi all --
> > > > >
> > > > > I think we have a sign-bit bug for character literals with code
> > > > > points > 16r7FFF.
> > > > >
> > > > > Steps to reproduce:
> > > > >
> > > > > 1. Print it: "Character value: 16r8000"
> > > > > 2. Inspect the result by evaluating the character literal or by
> > > > > sending #asInteger to it. It will most likely not render in a
> > > > > standard Squeak and will show up like "$? asInteger".
> > > > >
> > > > > In a 32-bit VM, I will get the (positive) integer value
> > > > > 16r3FFF8000.
> > > > > In a 64-bit VM, I will get the (negative) integer value
> > > > > '-16r8000'.
> > > > >
> > > > > Somehow, counting from bit 0, bits 16 to 29 flip from 0 to 1. In
> > > > > 64-bit, this means a negative number. Not sure about bits 30 and
> > > > > 31 here.
> > > > >
> > > > > Is there a bug in the upper tag bits of immediate characters?
> > > > > Is this related to the 2-byte or 3-byte bytecodes in SistaV1?
> > > > >
> > > > > Works fine up to 16r7FFF. (This is unrelated to #leadingChar. Mine
> > > > > was 0 in this experiment.)
> > > > >
> > > > > VM: 202112201228 (VMMaker.oscog-eem.3116)
> > > > >
> > > > > Best,
> > > > > Marcel
> > > > >
> > >
> > >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 14513 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20220309/7f0ed6e2/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 90427 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20220309/7f0ed6e2/attachment-0003.png>

