Or just restrict EncoderForSistaV1>>#genPushCharacter:
...snip...
(code < 0 or: [code > 16r7FFF]) ifTrue:
    [^self outOfRangeError: 'character' index: code range: 0 to: 16r7FFF].
...snip...
On Wed, 9 Mar 2022 at 14:16, Marcel Taeumel <marcel.taeumel@hpi.de> wrote:
Hi Nicolas --
Thanks! Also for the proposed workaround in VMMaker.oscog-nice.3174.
For what it's worth, one can always replace the character-literal syntax with string access:
$x.
'x' first.
Or store the code point if the optical appearance is not relevant:
Character value: 16r78.
Best, Marcel
On 09.03.2022 10:02:46, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
Hi Marcel,
yes, I agree: the bug is in the bytecode encoding/decoding of the immediate Character value. I stepped into (Compiler evaluate: (String with: $$ with: (Character value: 16r8000))); if we step into executeMethod, we can inspect what is going on.
On Wed, 9 Mar 2022 at 08:39, Marcel Taeumel wrote:
Hi Nicolas --
There is a bug in EncoderForSistaV1. The behavior is okay for EncoderForV3PlusClosures. We can discuss this on squeak-dev now, I suppose.
CompiledCode preferredBytecodeSetEncoderClass: EncoderForSistaV1.
CompiledCode preferredBytecodeSetEncoderClass: EncoderForV3PlusClosures.
If you send #halt instead of #asInteger, you get another interesting debugger when trying to start debugging:
Best, Marcel
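A workspace sketch of the comparison described above, assuming that Compiler evaluate: picks up the preferred encoder for the code it compiles (the thread implies this but does not spell it out). The temporary name source is made up for illustration. Print each evaluate: line separately:

```smalltalk
"Build the source text '$<char> asInteger' for the code point 16r8000."
| source |
source := (String with: $$ with: (Character value: 16r8000)), ' asInteger'.

"Compile and run the character literal with the SistaV1 encoder."
CompiledCode preferredBytecodeSetEncoderClass: EncoderForSistaV1.
Compiler evaluate: source.   "reported in this thread as wrong"

"Same source, compiled with the V3PlusClosures encoder."
CompiledCode preferredBytecodeSetEncoderClass: EncoderForV3PlusClosures.
Compiler evaluate: source.   "reported in this thread as okay"
```

The last setter leaves the image on EncoderForV3PlusClosures, so switch back afterwards if the image was using SistaV1 before.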
On 09.03.2022 08:34:11, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
Ah OK, I see it on macOS too.
It remains to determine which operation exactly is involved...
The TextMorph holding the printed result is correct: a WideString whose last Character is (Character value: 32768).
On Wed, 9 Mar 2022 at 08:08, Marcel Taeumel wrote:
Hi Dave, hi Nicolas --
I am working in Windows 10.
I cannot reproduce on Linux 64 bit either: (Character value: 16r8000) asInteger hex ==> '16r8000'
That's not how you would reproduce it. The bug affects character literals, not character objects/instances. You have to evaluate code on that character literal.
Maybe this picture helps:
Best, Marcel
On 08.03.2022 18:56:09, David T. Lewis <lewis@mail.msen.com> wrote:
I cannot reproduce on Linux 64 bit either:
(Character value: 16r8000) asInteger hex ==> '16r8000'
Dave
On Tue, Mar 08, 2022 at 06:45:23PM +0100, Nicolas Cellier wrote:
Hi Marcel,
which OS? I cannot reproduce on macOS 64-bit,
Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.3172] 5.20211023.2003
Mac OS X built on Mar 6 2022 15:31:16 CET Compiler: 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.46.4)
platform sources revision VM: 202110232003
On Tue, 8 Mar 2022 at 17:57, Marcel Taeumel wrote:
Hi Eliot, hi all --
I think we have a sign-bit bug for character literals with code points > 16r7FFF.
Steps to reproduce:
- Print it: "Character value: 16r8000"
- Inspect the result by evaluating the character literal or send #asInteger to it. It will most likely not render in a standard Squeak and show up like "$? asInteger".
In a 32-bit VM, I will get the (positive) integer value 16r3FFF8000.
In a 64-bit VM, I will get the (negative) integer value '-16r8000'.
Somehow, starting at bit 0, the bits 16 to 29 flip from 0 to 1. In 64-bit, this means a negative number. Not sure about bits 30 and 31 here.
Is there a bug in the upper tag bits of immediate characters? Is this related to the 2-byte or 3-byte byte codes in SistaV1?
Works fine up to 16r7FFF. (This is unrelated to #leadingChar. Mine was 0 in this experiment.)
VM: 202112201228 (VMMaker.oscog-eem.3116)
Best, Marcel
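The two values reported above are what one would get if the 16-bit operand 16r8000 were treated as a signed number somewhere between encoding and decoding. That mechanism is an assumption, not something confirmed in this message; the block name signExtend16 and the 30-bit mask for a 32-bit image are likewise illustrative. A workspace sketch of the arithmetic:

```smalltalk
"Hypothetical helper: read an unsigned 16-bit operand as a signed 16-bit value."
| signExtend16 extended |
signExtend16 := [:u | u >= 16r8000 ifTrue: [u - 16r10000] ifFalse: [u]].

extended := signExtend16 value: 16r8000.

"Taken as is, the sign-extended value matches the 64-bit report."
extended = -16r8000.

"Truncated to 30 value bits (assumed width in a 32-bit image), it matches the 32-bit report."
(extended bitAnd: 16r3FFFFFFF) = 16r3FFF8000.

"Operands up to 16r7FFF are not changed by the helper, which fits 'works fine up to 16r7FFF'."
(signExtend16 value: 16r7FFF) = 16r7FFF.
```

Each of the three comparisons should print true. Bit 15 is already set in 16r8000, so under this reading it is bits 16 to 29 that turn from 0 to 1, which is the pattern described above.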