[Vm-dev] 3 Bugs in LargeInteger primitives

Sat Sep 1 15:52:17 UTC 2012

I'm not sure, but I would not expect any differences for jit. The changes
here are in the primitives themselves, so I would not expect any impact
on jitted methods.

It would be good to take a similar measurement with Cog of course. If
there is any real difference in the primitive performance due to these
changes, it should be easier to measure with Cog because relatively
less time would be spent in the method execution as opposed to the
primitives themselves.

Dave

On Sat, Sep 01, 2012 at 05:41:44PM +0200, Igor Stasenko wrote:
>  
> i wonder, if any changes are needed for jit, because i suppose some of
> semantics for bigints is jitted.
> 
> On 1 September 2012 16:44, Levente Uzonyi <leves at elte.hu> wrote:
> >
> > On Sat, 1 Sep 2012, David T. Lewis wrote:
> >
> >>
> >> I have integrated these fixes into trunk VMMaker for testing. The
> >> new #testMinimumNegativeIntegerArithmetic unit test passes, and a VM
> >> compiled in 64-bit mode no longer crashed. I can find no significant
> >> change in performance with these changes applied. This looks really
> >> good to me, and if there are no other comments or issues I will go
> >> ahead and post the update to trunk VMMaker.
> >
> >
> > That sounds great, thanks Dave.
> >
> >
> >>
> >> Here are the performance measurements that I did to compare VM performance
> >> before and after applying the changes. This is done with an interpreter
> >> VM compiled in 64 bit mode on Linux:
> >>
> >>
> >>  testBlock := [(1 to: 1000000) do: [:e | | largeNegativeInt |
> >>                 largeNegativeInt := -9223372036854775808 + e.
> >>                 largeNegativeInt >> 3.
> >>                 largeNegativeInt + 1.
> >>                 largeNegativeInt - -1.
> >>                 largeNegativeInt // -1.
> >>                 largeNegativeInt \\ -1.
> >>                 largeNegativeInt rem: -1.
> >>                 largeNegativeInt quo: -1.
> >>                 largeNegativeInt * -1.
> >>                 largeNegativeInt / -1]].
> >
> >
> > What's the reason for using #to: and #do: instead of #to:do:? The latter
> > seems to be much more appropriate in such tests, because it doesn't create
> > any objects and message sends, so garbage collection has less effect on the
> > measurements.
> >
> >
> > Levente
> >
> >
> >>
> >> Before Nicolas' changes:
> >>
> >>  (LargeNegativeIntegerTest selector:
> >> #testMinimumNegativeIntegerArithmetic) run
> >>     ==> 1 run, 0 passes, 0 expected failures, 1 failures, 0 errors, 0
> >> unexpected passes
> >>
> >>  Time millisecondsToRun: [(1 to: 1000000) do: [:e | e +
> >> -9223372036854775808 // -1]] ==> 1565
> >>  Time millisecondsToRun: [(1 to: 1000000) do: [:e | e +
> >> -9223372036854775808 // -1]] ==> 1550
> >>  Time millisecondsToRun: [(1 to: 1000000) do: [:e | e +
> >> -9223372036854775808 // -1]] ==> 1544
> >>
> >>  0 tinyBenchmarks ==> '437981180 bytecodes/sec; 14163414 sends/sec'
> >>  0 tinyBenchmarks ==> '442141623 bytecodes/sec; 14141708 sends/sec'
> >>  0 tinyBenchmarks ==> '442141623 bytecodes/sec; 14120068 sends/sec'
> >>  0 tinyBenchmarks ==> '440240756 bytecodes/sec; 14163414 sends/sec'
> >>  0 tinyBenchmarks ==> '442141623 bytecodes/sec; 14823236 sends/sec'
> >>
> >>  Time millisecondsToRun: testBlock ==> 4698
> >>  Time millisecondsToRun: testBlock ==> 4671
> >>  Time millisecondsToRun: testBlock ==> 4668
> >>  Time millisecondsToRun: testBlock ==> 4612
> >>  Time millisecondsToRun: testBlock ==> 4657
> >>
> >> After applying the fixes:
> >>
> >>  (LargeNegativeIntegerTest selector:
> >> #testMinimumNegativeIntegerArithmetic) run
> >>     ==> 1 run, 1 passes, 0 expected failures, 0 failures, 0 errors, 0
> >> unexpected passes
> >>
> >>  Time millisecondsToRun: [(1 to: 1000000) do: [:e | e +
> >> -9223372036854775808 // -1]] ==> 1616
> >>  Time millisecondsToRun: [(1 to: 1000000) do: [:e | e +
> >> -9223372036854775808 // -1]] ==> 1581
> >>  Time millisecondsToRun: [(1 to: 1000000) do: [:e | e +
> >> -9223372036854775808 // -1]] ==> 1577
> >>
> >>  0 tinyBenchmarks ==> '456735057 bytecodes/sec; 14002222 sends/sec'
> >>  0 tinyBenchmarks ==> '461677186 bytecodes/sec; 14023502 sends/sec'
> >>  0 tinyBenchmarks ==> '462929475 bytecodes/sec; 14002222 sends/sec'
> >>  0 tinyBenchmarks ==> '459192825 bytecodes/sec; 14002222 sends/sec'
> >>  0 tinyBenchmarks ==> '458781362 bytecodes/sec; 13970423 sends/sec'
> >>
> >>  Time millisecondsToRun: testBlock ==> 4680
> >>  Time millisecondsToRun: testBlock ==> 4672
> >>  Time millisecondsToRun: testBlock ==> 4722
> >>  Time millisecondsToRun: testBlock ==> 4675
> >>  Time millisecondsToRun: testBlock ==> 4679
> >>
> >> Dave
> >>
> >>
> >> On Thu, Aug 30, 2012 at 01:14:54AM +0200, Nicolas Cellier wrote:
> >>>
> >>>
> >>> See also http://code.google.com/p/cog/issues/detail?id=92 where I
> >>> attached a fix for large int
> >>> It's untested yet and to review carefully !
> >>>
> >>> As Stefan told, there is UB-reliance in SmallInteger primitives too,
> >>> but I did not fix them.
> >>> We should simply compute result as signed 64 bits as proposed by
> >>> Stefan (except bitShift)
> >>>
> >>> Nicolas
> >>>
> >>> 2012/8/30 David T. Lewis <lewis at mail.msen.com>:
> >>>>
> >>>>
> >>>> This is on Mantis at http://bugs.squeak.org/view.php?id=7705
> >>>>
> >>>> Note last comment in related issue 6987.
> >>>>
> >>>> This issue will crash the VM when compiled for 64-bit platforms.
> >>>>
> >>>> Dave
> >>>>
> >>>> On Wed, Aug 29, 2012 at 12:18:28PM +0200, Nicolas Cellier wrote:
> >>>>>
> >>>>>
> >>>>> As posted on squeak-dev
> >>>>>
> >>>>> http://lists.squeakfoundation.org/pipermail/squeak-dev/2012-August/165608.html
> >>>>> I found 3 bugs in LargeInteger primitives
> >>>>>
> >>>>> (1<<63) negated quo: -1.
> >>>>> (1<<63) negated / -1.
> >>>>> (1<<63) negated * -1.
> >>>>>
> >>>>> They are all related to the impossible task of taking absolute value
> >>>>> of INT_MIN (or more exactly it's 64 bits equivalent).
> >>>>> Currently, it takes the form (0 - INT_MIN) whose behaviour is
> >>>>> undefined according to C standards but generally answer INT_MIN.
> >>>>> See for example
> >>>>>
> >>>>> http://stackoverflow.com/questions/2539178/why-is-abs0x80000000-0x80000000
> >>>>>
> >>>>> Surprisingly this one works:
> >>>>> (1<<63) negated // -1.
> >>>>>
> >>>>> Most probably because gcc has a license to ignore undefined behaviour
> >>>>> and perform some optimizations that don't take overflow side effects
> >>>>> into account.
> >>>>> For example 0 - (0 - a)/b can be simplified into a/b, UB case of
> >>>>> INT_MIN apart...
> >>>>>
> >>>>> ---------------------------------
> >>>>>
> >>>>> Beside these bugs, when I read the code, I'm quite sure it's a nest of
> >>>>> future bugs because there are many other attempts to catch overflow in
> >>>>> post-condition (like testing that addition of two positive is negative
> >>>>> when an underflow occurs) that technically rely on explicitely
> >>>>> Undefined Behaviour (UB).
> >>>>> OK, by now many Arithmetic Units do behave like exploited in these
> >>>>> post-conditions, though it's not strictly future-proof.
> >>>>> But we unfortunately rely on optimizing C compilers, and its behaviour
> >>>>> is much more fragile than hardware...
> >>>>>
> >>>>> I invite every VM hacker to read
> >>>>>
> >>>>> http://stackoverflow.com/questions/199333/best-way-to-detect-integer-overflow-in-c-c
> >>>>> And various links like
> >>>>>
> >>>>> https://www.securecoding.cert.org/confluence/display/seccode/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow?showComments=false
> >>>>>
> >>>>> For example, in large integer subtract, we have a protection against
> >>>>> (0 - INT_MIN) like:
> >>>>>    y = 0 - x;
> >>>>>    if ( y==x ) { primitiveFail(); }
> >>>>> an optimizing compiler having a licence to ignore INT_MIN Undefined
> >>>>> Behaviour case could mathematically solve the equation as x==0, y==0
> >>>>> and transform code into
> >>>>>   if( ! (y=0-x) ) { primitiveFail(); }
> >>>>> (directly use a jz and save a comparison)
> >>>>>
> >>>>> or if we have such branch
> >>>>> c = a + b;
> >>>>> if( a >0) {
> >>>>>   if(b > 0) {
> >>>>>      if (c < 0 ) { primitiveFail(); }
> >>>>>   }
> >>>>> }
> >>>>> Again, a good compiler could remove the if( c < 0) test, since it does
> >>>>> not have to care about UB...
> >>>>>
> >>>>> OK, pragmatically, most of these post-condition hacks are fast and
> >>>>> work with some version of gcc, but think about portability (llvm ?)
> >>>>> and future pain (you can debug such code only at asembler level).
> >>>>>
> >>>>> Do it right > do it fast.
> >>>>>
> >>>>> Nicolas
> >>
> >>
> >
> 
> 
> 
> -- 
> Best regards,
> Igor Stasenko.