[Vm-dev] 3 Bugs in LargeInteger primitives

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Wed Aug 29 16:38:01 UTC 2012


I checked: unsigned integer overflow is well defined by the standard
in both C and C++, and the result wraps modulo
2^(sizeof(uint_type)*CHAR_BIT).
So I'm now convinced that sign/magnitude decomposition is the way to go.
Anyway, the current code already inspects the operand signs to check
for overflow in a post-condition, so we already pay the same price as a
sign/magnitude solution, except that the code currently relies on the
broken C signed arithmetic model.
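
To make the idea concrete, here is a minimal sketch of a sign/magnitude multiply that never performs signed arithmetic on values that could overflow. It is only an illustration, not the VM's code: magnitude64 and mul64_checked are hypothetical names, and it uses uint64_t from <stdint.h> rather than whatever typedef the VM sources use. The overflow test is done on the magnitudes before multiplying, so it does not depend on undefined behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: magnitude of a signed 64-bit value.
   Converting to uint64_t is defined (modulo 2^64), and unsigned
   subtraction wraps, so INT64_MIN is never negated as a signed value. */
static uint64_t magnitude64(int64_t v)
{
    return v < 0 ? (uint64_t)0 - (uint64_t)v : (uint64_t)v;
}

/* Hypothetical helper: compute a * b.
   Returns true and stores the product in *result when it fits in
   an int64_t; returns false (leaving *result untouched) otherwise. */
static bool mul64_checked(int64_t a, int64_t b, int64_t *result)
{
    assert(result != NULL);

    bool negative = (a < 0) != (b < 0);
    uint64_t ma = magnitude64(a);
    uint64_t mb = magnitude64(b);

    /* Largest magnitude the result may have: 2^63 if negative,
       2^63 - 1 if positive. */
    uint64_t limit = negative ? ((uint64_t)1 << 63) : (uint64_t)INT64_MAX;

    /* Pre-condition on the magnitudes: ma * mb <= limit,
       tested by division so the product itself cannot wrap. */
    if (ma != 0 && mb > limit / ma)
        return false;

    uint64_t m = ma * mb;
    if (!negative)
        *result = (int64_t)m;          /* m <= INT64_MAX here */
    else if (m == ((uint64_t)1 << 63))
        *result = INT64_MIN;           /* the one magnitude with no positive twin */
    else
        *result = -(int64_t)m;         /* m <= INT64_MAX, so negation is safe */
    return true;
}
```

With this sketch, mul64_checked(INT64_MIN, -1, &r) reports failure instead of overflowing, while mul64_checked(INT64_MIN, 1, &r) succeeds. The division in the check is not free, so a real primitive might prefer a cheaper test, but the point is that everything happens in unsigned arithmetic.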

I may post corrected primitives for the basic arithmetic ops when I have time.
But I won't be frustrated if a true VM hacker, or someone more
available than me, does it first. I don't even know our own little name
for an unsigned 64-bit int: usqLong?

Nicolas

2012/8/29 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> For the (1<<63) negated bug itself, one very simple solution would be
> to just refuse it as a valid int64...
> We spent much effort to handle it in #signed64BitValueOf: , and our
> reward is many bugs popping up in the primitives where it is used.
> Regarding efficiency, we will have to protect code with
> inefficient-UB-broken defensive if or very-inefficient-portable-C, so
> the best choice is to just filter it out right at the beginning...
> I also wonder if handling a sign-magnitude wouldn't just be easier in
> that case (except maybe for + and -).
>
> Of course the other UB are remaining, but one thing at a time.
>
> Nicolas
>
> 2012/8/29 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>> Originally, C was very close to machine instructions and was conceived
>> as a generic assembler.
>> But it's more and more distant and becoming abstract.
>>
>> Unfortunately, the abstract arithmetic model is completely broken by
>> UB: it's unreliable.
>> So what purpose does this kind of abstraction serve?
>> Really, it makes me wonder...
>>
>> If portable C code becomes both long and inefficient, I guess that
>> paying the cost of maintaining assembler for the parts that we know
>> are broken is an option indeed.
>>
>> I also see that Andreas had a hard time with Slang, some intermediate
>> operations being cast to uint32 instead of int64; he finally had to
>> use many cCode: hacks... Going through both Slang and C as intermediates
>> sounds like too much work and too little safety for implementing basic
>> arithmetic (obviously we didn't, and can't easily, model broken C
>> behaviour in Slang, it's too complex!).
>>
>> Nicolas
>>
>> 2012/8/29 Stefan Marr <smalltalk at stefan-marr.de>:
>>>
>>> Hi Nicolas:
>>>
>>> On 29 Aug 2012, at 12:18, Nicolas Cellier wrote:
>>>
>>>>
>>>> Besides these bugs, when I read the code, I'm quite sure it's a nest of
>>>> future bugs because there are many other attempts to catch overflow in
>>>> a post-condition (like testing whether the sum of two positives is
>>>> negative, which signals an overflow) that technically rely on explicitly
>>>> Undefined Behaviour (UB).
>>>
>>> I guess http://forum.world.st/Is-bytecodePrimMultiply-correct-td3869580.html
>>> is related too.
>>> I am not sure whether that got changed in the VMs, but sounds very much like the same kind of problem. (undefined behavior and overflows)
>>>
>>> Since C is undefined in that regard, what are the options?
>>> Hand-crafted assembly for all relevant platforms?
>>> Are there libraries that abstract from these things?
>>>
>>> I think Clang has a compiler switch to warn at compile-time, or trigger a runtime warning/error for these issues with undefined behavior. That might help for a thorough sweep through the code.
>>>
>>> Best regards
>>> Stefan
>>>
>>>
>>> --
>>> Stefan Marr
>>> Software Languages Lab
>>> Vrije Universiteit Brussel
>>> Pleinlaan 2 / B-1050 Brussels / Belgium
>>> http://soft.vub.ac.be/~smarr
>>> Phone: +32 2 629 2974
>>> Fax:   +32 2 629 3525
>>>

