[squeak-dev] Sign-bit bug in character literals > 16r7FFF ... related to SistaV1?

Marcel Taeumel marcel.taeumel at hpi.de
Wed Mar 9 07:47:31 UTC 2022


Hi Dave --

> The issue will not be related to the upper tag bits of immediate
> characters, and it will not be related to the 2-byte or 3-byte byte
> codes in SistaV1. It's just some sort of type declaration issue in
> the VM code, that's all.


Well, it seems to be a bug in EncoderForSistaV1, as it does not occur in EncoderForV3PlusClosures. :-)

> And it may be an issue related to integer data types in Windows versus unix-based systems.

Maybe there are two different bugs here? =)

Best,
Marcel
On 09.03.2022 00:39:51, David T. Lewis <lewis at mail.msen.com> wrote:
On Tue, Mar 08, 2022 at 05:57:32PM +0100, Marcel Taeumel wrote:
> Hi Eliot, hi all --
>
> I think we have a sign-bit bug for character literals with code points > 16r7FFF.
>
> Steps to reproduce:
>
> 1. Print it: "Character value: 16r8000"
> 2. Inspect the result by evaluating the character literal or sending #asInteger to it. It will most likely not render in a standard Squeak and will show up as "$? asInteger".
>
> In a 32-bit VM, I will get the (positive) integer value 16r3FFF8000.
> In a 64-bit VM, I will get the (negative) integer value '-16r8000'.
>
> Somehow, starting at bit 0, the bits 16 to 29 flip from 0 to 1. In 64-bit, this means a negative number. Not sure about bits 30 and 31 here.
>
> Is there a bug in the upper tag bits of immediate characters?
> Is this related to the 2-byte or 3-byte byte codes in SistaV1?
>
> Works fine up to 16r7FFF. (This is unrelated to #leadingChar. Mine was 0 in this experiment.)
>
> VM: 202112201228 (VMMaker.oscog-eem.3116)
>
> Best,
> Marcel
>

Hi Marcel,

This integer type declaration stuff is enough to give anybody a headache,
so here is a tip to make it slightly less obscure without leaving the
comfort of the Squeak image.

First load package TwosComplement, either from SqueakMap or from
http://www.squeaksource.com/TwosComplement.

Then take a look at the two suspicious integer values that you get from
your 32-bit VM and 64-bit VM, rendering them as 32-bit two's complement
(the common case for C int on most platforms):

{ 16r3FFF8000 asRegister: 32 . -16r8000 asRegister: 32} inspect.

or the simpler version (since 32 bits is the default):

{ 16r3FFF8000 asRegister . -16r8000 asRegister} inspect.

This shows the low order 16 bits (the actual character value) as valid
in both cases, and the high order 16 bits as garbage related to integer
type declaration and/or sign extension in the VM.
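
The same point can be checked without the package, using plain bit masks.
This is only a minimal workspace sketch: it assumes nothing beyond
Integer>>bitAnd: (which treats negative integers as two's complement), the
variable names are made up for illustration, and the 30-bit mask is an
assumption about the positive SmallInteger range in a 32-bit image.

```smalltalk
"Workspace sketch: evaluate with 'print it', one expression at a time."
| from32 from64 |
from32 := 16r3FFF8000.   "value reported on the 32-bit VM"
from64 := -16r8000.      "value reported on the 64-bit VM"

"Low 16 bits: the intended code point survives in both results."
(from32 bitAnd: 16rFFFF) = 16r8000.      "true"
(from64 bitAnd: 16rFFFF) = 16r8000.      "true"

"Truncating the 64-bit result to 30 bits reproduces the 32-bit result,
 which suggests one sign extension seen through two word sizes."
(from64 bitAnd: 16r3FFFFFFF) = from32.   "true"
```

If that reading is right, both VMs sign-extend a 16-bit quantity and differ
only in how many of the extended bits fit into a SmallInteger.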

Very likely this will turn out to be an issue in primitive 171,
InterpreterPrimitives>>primitiveImmediateAsInteger. And it may be an
issue related to integer data types in Windows versus unix-based systems.

The issue will not be related to the upper tag bits of immediate
characters, and it will not be related to the 2-byte or 3-byte byte
codes in SistaV1. It's just some sort of type declaration issue in
the VM code, that's all.

Dave

