[Vm-dev] 3 Bugs in LargeInteger primitives

Wed Aug 29 10:18:28 UTC 2012

As posted on squeak-dev
http://lists.squeakfoundation.org/pipermail/squeak-dev/2012-August/165608.html
I found 3 bugs in LargeInteger primitives

(1<<63) negated quo: -1.
(1<<63) negated / -1.
(1<<63) negated * -1.

They are all related to the impossible task of taking absolute value
of INT_MIN (or more exactly it's 64 bits equivalent).
Currently, it takes the form (0 - INT_MIN) whose behaviour is
undefined according to C standards but generally answer INT_MIN.
See for example
http://stackoverflow.com/questions/2539178/why-is-abs0x80000000-0x80000000

Surprisingly this one works:
(1<<63) negated // -1.

Most probably because gcc has a license to ignore undefined behaviour
and perform some optimizations that don't take overflow side effects
into account.
For example 0 - (0 - a)/b can be simplified into a/b, UB case of
INT_MIN apart...

---------------------------------

Beside these bugs, when I read the code, I'm quite sure it's a nest of
future bugs because there are many other attempts to catch overflow in
post-condition (like testing that addition of two positive is negative
when an underflow occurs) that technically rely on explicitely
Undefined Behaviour (UB).
OK, by now many Arithmetic Units do behave like exploited in these
post-conditions, though it's not strictly future-proof.
But we unfortunately rely on optimizing C compilers, and its behaviour
is much more fragile than hardware...

I invite every VM hacker to read
http://stackoverflow.com/questions/199333/best-way-to-detect-integer-overflow-in-c-c
And various links like
https://www.securecoding.cert.org/confluence/display/seccode/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow?showComments=false

For example, in large integer subtract, we have a protection against
(0 - INT_MIN) like:
   y = 0 - x;
   if ( y==x ) { primitiveFail(); }
an optimizing compiler having a licence to ignore INT_MIN Undefined
Behaviour case could mathematically solve the equation as x==0, y==0
and transform code into
  if( ! (y=0-x) ) { primitiveFail(); }
(directly use a jz and save a comparison)

or if we have such branch
c = a + b;
if( a >0) {
  if(b > 0) {
     if (c < 0 ) { primitiveFail(); }
  }
}
Again, a good compiler could remove the if( c < 0) test, since it does
not have to care about UB...

OK, pragmatically, most of these post-condition hacks are fast and
work with some version of gcc, but think about portability (llvm ?)
and future pain (you can debug such code only at asembler level).

Do it right > do it fast.

Nicolas