[Vm-dev] 3 Bugs in LargeInteger primitives

Thu Aug 30 08:24:25 UTC 2012

Also note that I introduced one future-flaw by hardcoding a 64 bits constant...

1) 16rFFFFFFFFFFFFFFFF should be replace with a proper macro
(USQLONG_MAX or something) in pre-condition tests

I did not correct the other flaws already spreaded in code:
- SmallInteger maxVal/linVal are hardcoded everywhere
And we have these implicit requirements:
- SmallInteger maxVal/minVal has to fit in a sqInt
- (0 - SmallInteger minVal) has to fit in a sqInt too (currently
possible, it has only 31 bits)

2012/8/30 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> 2012/8/30 David T. Lewis <lewis at mail.msen.com>:
>>
>> On Wed, Aug 29, 2012 at 01:24:39PM +0200, Stefan Marr wrote:
>>>
>>> Hi Nicolas:
>>>
>>> On 29 Aug 2012, at 12:18, Nicolas Cellier wrote:
>>>
>>> >
>>> > Beside these bugs, when I read the code, I'm quite sure it's a nest of
>>> > future bugs because there are many other attempts to catch overflow in
>>> > post-condition (like testing that addition of two positive is negative
>>> > when an underflow occurs) that technically rely on explicitely
>>> > Undefined Behaviour (UB).
>>
>> See below. Tests such as this are essential, and they they do *not* rely on
>> undefined behavior if the C variables are properly declared.
>>
>>>
>>> I guess http://forum.world.st/Is-bytecodePrimMultiply-correct-td3869580.html
>>> is related too.
>>> I am not sure whether that got changed in the VMs, but sounds very much like the same kind of problem. (undefined behavior and overflows)
>>>
>>> Since C is undefined in that regard, what are the options?
>>> Hand-crafted assembly for all relevant platforms?
>>> Are there libraries that abstract from these things?
>>
>> A good general solution is to perform the arithmetic using variables declared
>> as, or cast to, unsigned. The ambiguity in C language pertains only to signed
>> twos complement arithmetic, so if the operations are performed on twos complement
>> values that are declared unsigned, then no compiler optimization is possible and
>> the results are unambiguous regardless of compiler behavior. Results of the
>> unsigned operations may be tested for overflow, then cast back to signed integer
>> if the result is intended to be interpreted as a signed integer.
>>
>> Dave
>>
>
> For LargeInt, I prefer proper sign/unsigned magnitude handling to
> signed arithmetic hacks for several reasons:
>
> With signed arithmetic:
> 1) in conversion to/from LargeInt we have to discuss sign anyway and
> we have that magnitude fits into INT_MIN to INT_MAX
> 2) in overflow post-condition test, we have to discuss sign again -
> for example, prim(Add/Subtract/Div/Quo/Mod)LargeIntegers
> 3) we still need defensive protection against (0 - INT_MIN)
> 4) all casts rely on 2-complement representation which is
> implementation defined (though universal nowadays)
>
> with sign/magnitude:
> 1) conversion from/to is simplified
> 2) overflow are easy to test in pre-condition like a > (UINT_MAX - b)
> for addition, a > (UINT_MAX/b) for *, and this is absolutely safe even
> with a bogus compiler, and as efficient as broken post-condition tests
> 3) no protection needed against INT_MIN
> 4) no reliance on 2-complement at all
> 5) primitives work 1 bit further
> 6) code is simpler (less if)
>
> The only thing missing in my proposition (see attachment in
> http://code.google.com/p/cog/issues/detail?id=92) is a function
> retrieving sign and magnitude in a single function call (it's easy in
> C, but I don't know slang enough, how to return a struct or pass a
> pointer is beyond my knowledge).
>
> In a word, sign/magnitude is easier/cleaner and not less performant
> (once single function call is resolved)
>
> For SmallInt, computing result in signed 64 bits as proposed by Stefan
> is the simplest thing we can do IMHO.
> I'm quite sure it won't ruin efficiency.
>
> Nicolas