[Vm-dev] vm problem on cog and stack spur

Mon Mar 28 02:05:06 UTC 2016

Nicolas,

I updated VMM trunk with all of your changes for LargeIntegersPlugin copied from
the latest oscog branch. All number tests pass, and your ArbitraryPrecisionFloat
example works properly now.

Dave

On Mon, Mar 28, 2016 at 12:09:37AM +0200, Nicolas Cellier wrote:
>  
> OK, I think I get it.
> The problem lies in:
> 
>                 "Correct overestimate of q.
>                 Max of 2 iterations through loop -- see Knuth vol. 2"
>                 j < 3
>                     ifTrue: [r3 := 0]
>                     ifFalse: [r3 := pRem at: j - 3].
> 
>                 [(t < hi
>                     or: [t = hi and: [r3 < lo]])
>                     ifTrue:
>                         ["i.e. (t,r3) < (hi,lo)"
>                         q := q - 1.
>                         lo < dnh
>                             ifTrue:
>                                 [hi := hi - 1.
>                                 lo := lo + 16r100 - dnh]
>                             ifFalse:
>                                 [lo := lo - dnh].
>                         cond := hi >= dh]
>                     ifFalse: [cond := false].
>                 cond]
>                     whileTrue: [hi := hi - dh]].
> 
> What happens here is that hi may become negative.
> But since I declared it unsigned, t<hi and cond := hi >= dh remains false
> quite a long time and the quotient q is decremented way too many times...
> 
> Here we divide (r1,r2,r3,...) by (dh,dnh,...)
> we are trying to guess q based on 2 limbs
> (r1,r2) = q*dh + t    that is   (r1,r2-t) = q*dh
> (hi,lo) =q*dnh
> 
> Thus q*(dh,dnh)=(r1,r2-t+hi,lo)
> 
> If t<hi then q*(dh,dnh) is greater than (r1,r2) and we have an overestimate
> of q
> Same if t=hi and lo<r3.
> 
> Here we have the tricky case when t=0, r3=0 and lo>0, and hi=dnh+1
> In the first loop, h gets decremented, thus hi=dnh.
> Then it goes to 0, and gets decremented a second time...
> 
> The same algorithm is used with 32bits limbs, so it may also happen in my
> own VM branch. I believe it must be even rarest an event though (2^32
> possible values for hi makes hi=dnh+1 less probable than with 2^8
> combinations).
> 
> I will fix it tonight or tomorrow.
> 
> 
> 2016-03-27 19:03 GMT+02:00 Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com>:
> 
> > Thanks David.
> >
> > By running the snippet, instrumenting erf, printing each and every
> > intermediate value to a text file, and diffing:
> > {
> > (10 asArbitraryPrecisionFloatNumBits: 33) erf.
> > (10 asArbitraryPrecisionFloatNumBits: 34) erf.
> > }
> >
> > I reduced the failing test to:
> >
> >  a :=
> > 1598335257761788022467377781654101148543282249044465229239888363328190330275719997501596724768507889233831388734160190922469363547795602076820594918.
> > b := 49612.
> > self assert: a - ((a quo: b)*b) < b
> >
> > So it must be some nasty sign promotion (or lack of?) in division.
> > Normally, this could be simulated, but with signedness, I'm never sure,
> > maybe my next step will be a breakpoint in xcode/gdb...
> >
> >
> > Curiously, (a rem: b) is correct !
> >
> > 2016-03-27 17:36 GMT+02:00 David T. Lewis <lewis at mail.msen.com>:
> >
> >>
> >> I confirm the same results for your ArbitraryPrecisionFloat example below,
> >> testing on Linux with a 64 bit interpreter VM and 32 bit image.
> >>
> >>   Image
> >>   -----
> >>   /home/lewis/squeak/Squeak4.6/squeak.37.image
> >>   Squeak4.6
> >>   latest update: #15723
> >>   Current Change Set: Unnamed2
> >>   Image format 6504 (32 bit)
> >>
> >>   Virtual Machine
> >>   ---------------
> >>   /usr/local/lib/squeak/4.15.4-3634/squeakvm
> >>   Squeak4.5 of 10 December 2015 [latest update: #1195]
> >>   Unix built on Mar 25 2016 13:27:51 Compiler: 4.9.2
> >>   platform sources revision 3634
> >>   VMMaker versionString 4.15.4
> >>
> >> Dave
> >>
> >>
> >> On Sun, Mar 27, 2016 at 03:25:58PM +0200, Nicolas Cellier wrote:
> >> >
> >> > Hi Levente, thank you.
> >> > Here is a follow up:
> >> >
> >> > I was able to reproduce the same symptom on the classical interpreter
> >> VM.
> >> > <rant> It's just that for MacOSX, there is no recent VM available, and
> >> that
> >> > the MacOS build is somehow abandonware, so it took much more time than
> >> > necessary :( </rant>
> >> >
> >> > So the changes that have been integrated on Interpreter Vm in july 2014
> >> and
> >> > that I thought correct were not.
> >> > It's certainly related to a LargeInteger plugin primitive.
> >> > Those are triggered when nbits of magnitude > 64.
> >> > (otherwise, <= 64 bits, the LargeInteger numbered primitives do their
> >> job).
> >> >
> >> > Now the job is to produce/isolate an elementary failing LargeInteger
> >> test.
> >> > But you need to generate the VM from VMMaker.oscog head for COG/STACK
> >> > V3/SPUR 32/64 bits
> >> > or use a sufficiently recent interpreter VM (> july 2014) to trigger
> >> such
> >> > failure.
> >> >
> >> >
> >> > --------------------------------------------
> >> >
> >> > Here is my starting point:
> >> >
> >> > If you load ArbitraryPrecisionFloat and Tests from
> >> > http://www.squeaksource.com/ArbitraryPrecisionFl
> >> > then you'll see this strange pattern:
> >> >
> >> > (53 to: 70) collect: [:i |
> >> >     i -> (10 asArbitraryPrecisionFloatNumBits: i) erf ]
> >> >
> >> >  {53->(ArbitraryPrecisionFloat readFrom: '0.9999998981732067' readStream
> >> > numBits: 53) .
> >> >  54->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 54) .
> >> >  55->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 55) .
> >> >  56->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 56) .
> >> >  57->(ArbitraryPrecisionFloat readFrom: '0.99999989817320666' readStream
> >> > numBits: 57) .
> >> >  58->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 58) .
> >> >  59->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 59) .
> >> >  60->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 60) .
> >> >  61->(ArbitraryPrecisionFloat readFrom: '0.9999998981732066655'
> >> readStream
> >> > numBits: 61) .
> >> >  62->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 62) .
> >> >  63->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 63) .
> >> >  64->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 64) .
> >> >  65->(ArbitraryPrecisionFloat readFrom: '0.99999989817320666564'
> >> readStream
> >> > numBits: 65) .
> >> >  66->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 66) .
> >> >  67->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 67) .
> >> >  68->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 68) .
> >> >  69->(ArbitraryPrecisionFloat readFrom: '0.999999898173206665646'
> >> > readStream numBits: 69) .
> >> >  70->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits: 70)}
> >> >
> >> > The value seems incorrect every four numbits
> >> > The exact value is near 1- 2.0e-45, and given the insufficient precision
> >> > requested, the result should be rounded to 1.
> >> > We can confirm with online high precision erf calculator like found for
> >> > example with http://keisan.casio.com/exec/system/1180573449
> >> >
> >> > Or we can check with
> >> > (10 asArbitraryPrecisionFloatNumBits: 200) erf (ArbitraryPrecisionFloat
> >> > readFrom:
> >> '0.999999999999999999999999999999999999999999997911512416237455'
> >> > readStream numBits: 200)
> >> >
> >> > -----------------------------
> >> >
> >> > Note that when I change the LargeInteger digitLength to 32bits instead
> >> of 8
> >> > (my own VM branch) then the problem magically disappear.
> >> >
> >> > I plan to introduce these change in the VMMaker.oscog, so I can as well
> >> > ignore the problem and proceed, however I much prefer to isolate it, and
> >> > understand it for gain of knowledge :)
> >> >
> >> >
> >> > 2016-03-27 4:27 GMT+02:00 Levente Uzonyi <leves at caesar.elte.hu>:
> >> >
> >> > >
> >> > > Hi Nicolas,
> >> > >
> >> > > If you give us a bit more detailed instructions about what to test,
> >> then
> >> > > we might be able to help you get this sorted out.
> >> > >
> >> > > Levente
> >> > >
> >> > >
> >>
> >>
> >