# [Vm-dev] 3 Bugs in LargeInteger primitives

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Wed Aug 29 21:09:21 UTC 2012

```Here is an exemple of what I propose:

"Primitive arithmetic operations for large integers in 64 bit range"
| a b result oopResult aIsNegative bIsNegative resultIsNegative
oopArg oopRcvr |
<export: true>
<var: 'a' type: 'usqLong'>
<var: 'b' type: 'usqLong'>
<var: 'result' type: 'usqLong'>
<var: 'oopArg' type: 'oop'>
<var: 'oopRcvr' type: 'oop'>
<var: 'oopResult' type: 'oop'>
<var: 'aIsNegative' type: 'usqInt'>
<var: 'bIsNegative' type: 'usqInt'>
<var: 'resultIsNegative' type: 'usqInt'>

oopArg := self stackValue: 0.
oopRcvr := self stackValue: 1.
aIsNegative := self isNegativeIntegerValueOf: oopRcvr.
bIsNegative := self isNegativeIntegerValueOf: oopArg.
a := self magnitude64BitValueOf: oopRcvr.
b := self magnitude64BitValueOf: oopArg.
successFlag ifFalse:[^nil].
(aIsNegative == bIsNegative)
ifTrue:
["Protect against overflow"
a > (16rFFFFFFFFFFFFFFFF - b) ifTrue: [self primitiveFail. ^nil].
result := a + b.
resultIsNegative := aIsNegative]
ifFalse:
[(a >= b)
ifTrue:
[result := a - b.
resultIsNegative := aIsNegative]
ifFalse:
[result := b - a.
resultIsNegative := bIsNegative]].
oopResult := self magnitude64BitIntegerFor: result neg: resultIsNegative.
successFlag ifTrue:[self pop: 2 thenPush: oopResult].

IMHO, code is not much worse than previous one:

"Primitive arithmetic operations for large integers in 64 bit range"
| integerRcvr integerArg result oopResult |
<export: true>
<var: 'integerRcvr' type: 'sqLong'>
<var: 'integerArg' type: 'sqLong'>
<var: 'result' type: 'sqLong'>

integerArg := self signed64BitValueOf: (self stackValue: 0).
integerRcvr := self signed64BitValueOf: (self stackValue: 1).
successFlag ifFalse:[^nil].

"Compute the preliminary result (which may overflow)"
result := integerRcvr + integerArg.

"Now check overflow conditions. First is whether rcvr and arg are of
the same sign.
If they are we need to check for overflow more carefully."
(integerRcvr bitXor: integerArg) < 0 ifFalse:[
"Second is whether rcvr and result are of the same sign. If not, we
have an overflow."
(integerRcvr bitXor: result) < 0 ifTrue:[self primitiveFail]].

successFlag ifFalse:[^nil].

oopResult := self signed64BitIntegerFor: result.
successFlag ifTrue:[self pop: 2 thenPush: oopResult].

And conversion with sign/magnitude is simpler...

isNegativeIntegerValueOf: oop
"Answer 1 if integer object is negative, 0 otherwise.
Fail if object pointed by oop i not an integer."
| largeClass smallInt |
<returnTypeC: #usqInt>
<var: #smallInt type: #sqInt>
(self isIntegerObject: oop)
ifTrue:
[smallInt := self integerValueOf: oop.
^smallInt < 0].
largeClass := self fetchClassOfNonInt: oop.
largeClass = self classLargePositiveInteger ifTrue: [^0].
largeClass = self classLargePositiveInteger ifTrue: [^0].
^self primitiveFail

magnitude64BitValueOf: oop
"Convert the given object into an integer value.
The object may be either a ST integer or a eight-byte LargeInteger."
| sz value largeClass szsqLong smallIntValue |
<inline: false>
<returnTypeC: #usqLong>
<var: #value type: #usqLong>
<var: #smallIntValue type: #sqInt>
(self isIntegerObject: oop)
ifTrue:
[smallIntValue := self integerValueOf: oop.
smallIntValue < 0 ifTrue: [smallIntValue := 0 - smallIntValue].
^self cCoerce: smallIntValue to: #usqLong].
largeClass := self fetchClassOfNonInt: oop.
largeClass = self classLargePositiveInteger
ifFalse:[largeClass = self classLargeNegativeInteger
ifFalse:[^self primitiveFail]].
szsqLong := self sizeof: #usqLong asSymbol.
sz := self lengthOf: oop.
sz > szsqLong
ifTrue: [^ self primitiveFail].
value := 0.
0 to: sz - 1 do: [:i |
value := value + ((self cCoerce: (self fetchByte: i ofObject: oop)
to: #usqLong) <<  (i*8))].
^value

magnitude64BitIntegerFor: integerValue neg: isNegative
"Return a Large Integer object for the given integer magnitude and sign"
| newLargeInteger largeClass intValue highWord sz isSmall smallVal |
<inline: false>
<var: 'integerValue' type: 'usqLong'>
<var: 'isNegative' type: 'usqInt'>
<var: 'highWord' type: 'usqInt'>
<var: 'isSmall' type: 'usqInt'>
<var: 'smallVal' type: 'sqInt'>

isSmall := isNegative
ifTrue: [integerValue <= 16r40000000]
ifFalse: [integerValue < 16r40000000].
isSmall
ifTrue:
[smallVal := self cCoerce: integerValue to: #sqInt.
isNegative	ifTrue: [smallVal := 0 - smallVal].
^self integerObjectOf: smallVal].
isNegative < 0
ifTrue:[	largeClass := self classLargeNegativeInteger]
ifFalse:[	largeClass := self classLargePositiveInteger].
highWord := self cCode: 'magnitude >> 32'. "shift is coerced to
usqInt otherwise"
highWord = 0
ifTrue: [sz := 4]
ifFalse:[
sz := 5.
(highWord := highWord >> 8) = 0 ifFalse:[sz := sz + 1].
(highWord := highWord >> 8) = 0 ifFalse:[sz := sz + 1].
(highWord := highWord >> 8) = 0 ifFalse:[sz := sz + 1].
].
newLargeInteger := self instantiateClass: largeClass indexableSize:  sz.
0 to: sz-1 do: [:i |
intValue := self cCode: '(magnitude >> (i * 8)) & 255'.
self storeByte: i ofObject: newLargeInteger withValue: intValue].
^ newLargeInteger

Nicolas

2012/8/29 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> I checked, unsigned int overflow behaviour is a well defined standard
> in both C/C++ and result in a modulo 2^ (sizeof( uint_type )
> *CHAR_BIT)
> So I'm now convinced that sign/magnitude decomposition is the way to go.
> Anyway, you can currently observe sign dissertation to check overflow
> in post condition, so we already pay the same price as sign/magnitude
> solution, except that code is currently relying on broken C signed
> arithmetic model.
>
> I may post corrected primitives for basic arithmetic ops when I have time,..
> But I won't have any frustration if a true VM hacker or someone more
> available than me could do it, I don't even know our own little name
> for an unsigned int 64, usqLong ?
>
> Nicolas
>
> 2012/8/29 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>> For the (1<<63) negated bug itself, one very simple solution would be
>> to just refuse it as a valid int64...
>> We spent many effort to handle it in #signed64BitValueOf: , and our
>> rewards are many bug popping out in the primitives where it is used.
>> Regarding efficiency, we will have to protect code with
>> inefficient-UB-broken defensive if or very-inefficient-portable-C, so
>> the best choice is to just filter it out right at the beginning...
>> I also wonder if handling a sign-magnitude wouldn't just be easier in
>> that case (except maybe for + and -).
>>
>> Of course the other UB are remaining, but one thing at a time.
>>
>> Nicolas
>>
>> 2012/8/29 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>>> Originally, C was very close to machine instructions and was conceived
>>> as a generic assembler.
>>> But it's more and more distant and becoming abstract.
>>>
>>> Unfortunately, the abstract arithmetic model is completely broken with
>>> UB: it's un-reliable.
>>> So what does this kind of abstraction serves?
>>> Really, it makes me wonder...
>>>
>>> If portable C code becomes both long and inefficient, I guess that
>>> cost of maintaining assembler for some part that we know are broken is
>>> an option indeed.
>>>
>>> I also see that Andreas had a hard time with slang, some intermediate
>>> operations being cast to uint32 instead of int64, he finally had to
>>> use many cCode: hacks... Going both thru slang and C intermediate
>>> sounds like too much work and too few safety for implementation basic
>>> arithmetic (obviously we didn't and we can't easily model broken C
>>> behaviour in Slang, it's too complex !).
>>>
>>> Nicolas
>>>
>>> 2012/8/29 Stefan Marr <smalltalk at stefan-marr.de>:
>>>>
>>>> Hi Nicolas:
>>>>
>>>> On 29 Aug 2012, at 12:18, Nicolas Cellier wrote:
>>>>
>>>>>
>>>>> Beside these bugs, when I read the code, I'm quite sure it's a nest of
>>>>> future bugs because there are many other attempts to catch overflow in
>>>>> post-condition (like testing that addition of two positive is negative
>>>>> when an underflow occurs) that technically rely on explicitely
>>>>> Undefined Behaviour (UB).
>>>>
>>>> I guess http://forum.world.st/Is-bytecodePrimMultiply-correct-td3869580.html
>>>> is related too.
>>>> I am not sure whether that got changed in the VMs, but sounds very much like the same kind of problem. (undefined behavior and overflows)
>>>>
>>>> Since C is undefined in that regard, what are the options?
>>>> Hand-crafted assembly for all relevant platforms?
>>>> Are there libraries that abstract from these things?
>>>>
>>>> I think Clang has a compiler switch to warn at compile-time, or trigger a runtime warning/error for these issues with undefined behavior. That might help for a thorough sweep through the code.
>>>>
>>>> Best regards
>>>> Stefan
>>>>
>>>>
>>>> --
>>>> Stefan Marr
>>>> Software Languages Lab
>>>> Vrije Universiteit Brussel
>>>> Pleinlaan 2 / B-1050 Brussels / Belgium
>>>> http://soft.vub.ac.be/~smarr
>>>> Phone: +32 2 629 2974
>>>> Fax:   +32 2 629 3525
>>>>
```