[Vm-dev] vm problem on cog and stack spur

Mon Mar 28 20:37:14 UTC 2016

Hi Nicolas,

On Mon, Mar 28, 2016 at 12:43 PM, Nicolas Cellier <
nicolas.cellier.aka.nice at gmail.com> wrote:

>
> Thanks David, good!
> I'm really happy about the decision to introduce changes by small steps.
> And we are lucky that I recently added the erf function in
> ArbitraryPrecisionFloat.
> Otherwise, with 32 bits limbs, the division bug could have been un-noticed
> for years.
>
> When I contemplate the division code, I wonder how to test such beast...
> Long method, high cyclomatic complexity...
> Given the very low probability of failure, testing with random values is
> not appropriate.
>

In general, testing with random values is unhelpful.  A success proves
nothing.  A failure is potentially unreproducible.  Surely one helpful
approach is constructing test cases that stress edge cases.  Another is
adding asserts that test internal invariants.

> I wonder if some kind of formal proof would be possible.
> The first thing would be to describe precisely the invariants.
> If I have an occasion, I'll try to submit the case to the frama-C guys (
> http://frama-c.com/about.html).
>

A formal proof would be great, but we don't have the necessary foundation,
replying on arbitrary C compilers, do we?  But adding asserts to the code
can allow you to verify any invariants you expect.

2016-03-28 4:05 GMT+02:00 David T. Lewis <lewis at mail.msen.com>:
>
>>
>> Nicolas,
>>
>> I updated VMM trunk with all of your changes for LargeIntegersPlugin
>> copied from
>> the latest oscog branch. All number tests pass, and your
>> ArbitraryPrecisionFloat
>> example works properly now.
>>
>> Dave
>>
>>
>> On Mon, Mar 28, 2016 at 12:09:37AM +0200, Nicolas Cellier wrote:
>> >
>> > OK, I think I get it.
>> > The problem lies in:
>> >
>> >                 "Correct overestimate of q.
>> >                 Max of 2 iterations through loop -- see Knuth vol. 2"
>> >                 j < 3
>> >                     ifTrue: [r3 := 0]
>> >                     ifFalse: [r3 := pRem at: j - 3].
>> >
>> >                 [(t < hi
>> >                     or: [t = hi and: [r3 < lo]])
>> >                     ifTrue:
>> >                         ["i.e. (t,r3) < (hi,lo)"
>> >                         q := q - 1.
>> >                         lo < dnh
>> >                             ifTrue:
>> >                                 [hi := hi - 1.
>> >                                 lo := lo + 16r100 - dnh]
>> >                             ifFalse:
>> >                                 [lo := lo - dnh].
>> >                         cond := hi >= dh]
>> >                     ifFalse: [cond := false].
>> >                 cond]
>> >                     whileTrue: [hi := hi - dh]].
>> >
>> > What happens here is that hi may become negative.
>> > But since I declared it unsigned, t<hi and cond := hi >= dh remains
>> false
>> > quite a long time and the quotient q is decremented way too many
>> times...
>> >
>> > Here we divide (r1,r2,r3,...) by (dh,dnh,...)
>> > we are trying to guess q based on 2 limbs
>> > (r1,r2) = q*dh + t    that is   (r1,r2-t) = q*dh
>> > (hi,lo) =q*dnh
>> >
>> > Thus q*(dh,dnh)=(r1,r2-t+hi,lo)
>> >
>> > If t<hi then q*(dh,dnh) is greater than (r1,r2) and we have an
>> overestimate
>> > of q
>> > Same if t=hi and lo<r3.
>> >
>> > Here we have the tricky case when t=0, r3=0 and lo>0, and hi=dnh+1
>> > In the first loop, h gets decremented, thus hi=dnh.
>> > Then it goes to 0, and gets decremented a second time...
>> >
>> > The same algorithm is used with 32bits limbs, so it may also happen in
>> my
>> > own VM branch. I believe it must be even rarest an event though (2^32
>> > possible values for hi makes hi=dnh+1 less probable than with 2^8
>> > combinations).
>> >
>> > I will fix it tonight or tomorrow.
>> >
>> >
>> > 2016-03-27 19:03 GMT+02:00 Nicolas Cellier <
>> > nicolas.cellier.aka.nice at gmail.com>:
>> >
>> > > Thanks David.
>> > >
>> > > By running the snippet, instrumenting erf, printing each and every
>> > > intermediate value to a text file, and diffing:
>> > > {
>> > > (10 asArbitraryPrecisionFloatNumBits: 33) erf.
>> > > (10 asArbitraryPrecisionFloatNumBits: 34) erf.
>> > > }
>> > >
>> > > I reduced the failing test to:
>> > >
>> > >  a :=
>> > >
>> 1598335257761788022467377781654101148543282249044465229239888363328190330275719997501596724768507889233831388734160190922469363547795602076820594918.
>> > > b := 49612.
>> > > self assert: a - ((a quo: b)*b) < b
>> > >
>> > > So it must be some nasty sign promotion (or lack of?) in division.
>> > > Normally, this could be simulated, but with signedness, I'm never
>> sure,
>> > > maybe my next step will be a breakpoint in xcode/gdb...
>> > >
>> > >
>> > > Curiously, (a rem: b) is correct !
>> > >
>> > > 2016-03-27 17:36 GMT+02:00 David T. Lewis <lewis at mail.msen.com>:
>> > >
>> > >>
>> > >> I confirm the same results for your ArbitraryPrecisionFloat example
>> below,
>> > >> testing on Linux with a 64 bit interpreter VM and 32 bit image.
>> > >>
>> > >>   Image
>> > >>   -----
>> > >>   /home/lewis/squeak/Squeak4.6/squeak.37.image
>> > >>   Squeak4.6
>> > >>   latest update: #15723
>> > >>   Current Change Set: Unnamed2
>> > >>   Image format 6504 (32 bit)
>> > >>
>> > >>   Virtual Machine
>> > >>   ---------------
>> > >>   /usr/local/lib/squeak/4.15.4-3634/squeakvm
>> > >>   Squeak4.5 of 10 December 2015 [latest update: #1195]
>> > >>   Unix built on Mar 25 2016 13:27:51 Compiler: 4.9.2
>> > >>   platform sources revision 3634
>> > >>   VMMaker versionString 4.15.4
>> > >>
>> > >> Dave
>> > >>
>> > >>
>> > >> On Sun, Mar 27, 2016 at 03:25:58PM +0200, Nicolas Cellier wrote:
>> > >> >
>> > >> > Hi Levente, thank you.
>> > >> > Here is a follow up:
>> > >> >
>> > >> > I was able to reproduce the same symptom on the classical
>> interpreter
>> > >> VM.
>> > >> > <rant> It's just that for MacOSX, there is no recent VM available,
>> and
>> > >> that
>> > >> > the MacOS build is somehow abandonware, so it took much more time
>> than
>> > >> > necessary :( </rant>
>> > >> >
>> > >> > So the changes that have been integrated on Interpreter Vm in july
>> 2014
>> > >> and
>> > >> > that I thought correct were not.
>> > >> > It's certainly related to a LargeInteger plugin primitive.
>> > >> > Those are triggered when nbits of magnitude > 64.
>> > >> > (otherwise, <= 64 bits, the LargeInteger numbered primitives do
>> their
>> > >> job).
>> > >> >
>> > >> > Now the job is to produce/isolate an elementary failing
>> LargeInteger
>> > >> test.
>> > >> > But you need to generate the VM from VMMaker.oscog head for
>> COG/STACK
>> > >> > V3/SPUR 32/64 bits
>> > >> > or use a sufficiently recent interpreter VM (> july 2014) to
>> trigger
>> > >> such
>> > >> > failure.
>> > >> >
>> > >> >
>> > >> > --------------------------------------------
>> > >> >
>> > >> > Here is my starting point:
>> > >> >
>> > >> > If you load ArbitraryPrecisionFloat and Tests from
>> > >> > http://www.squeaksource.com/ArbitraryPrecisionFl
>> > >> > then you'll see this strange pattern:
>> > >> >
>> > >> > (53 to: 70) collect: [:i |
>> > >> >     i -> (10 asArbitraryPrecisionFloatNumBits: i) erf ]
>> > >> >
>> > >> >  {53->(ArbitraryPrecisionFloat readFrom: '0.9999998981732067'
>> readStream
>> > >> > numBits: 53) .
>> > >> >  54->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 54) .
>> > >> >  55->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 55) .
>> > >> >  56->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 56) .
>> > >> >  57->(ArbitraryPrecisionFloat readFrom: '0.99999989817320666'
>> readStream
>> > >> > numBits: 57) .
>> > >> >  58->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 58) .
>> > >> >  59->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 59) .
>> > >> >  60->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 60) .
>> > >> >  61->(ArbitraryPrecisionFloat readFrom: '0.9999998981732066655'
>> > >> readStream
>> > >> > numBits: 61) .
>> > >> >  62->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 62) .
>> > >> >  63->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 63) .
>> > >> >  64->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 64) .
>> > >> >  65->(ArbitraryPrecisionFloat readFrom: '0.99999989817320666564'
>> > >> readStream
>> > >> > numBits: 65) .
>> > >> >  66->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 66) .
>> > >> >  67->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 67) .
>> > >> >  68->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 68) .
>> > >> >  69->(ArbitraryPrecisionFloat readFrom: '0.999999898173206665646'
>> > >> > readStream numBits: 69) .
>> > >> >  70->(ArbitraryPrecisionFloat readFrom: '1.0' readStream numBits:
>> 70)}
>> > >> >
>> > >> > The value seems incorrect every four numbits
>> > >> > The exact value is near 1- 2.0e-45, and given the insufficient
>> precision
>> > >> > requested, the result should be rounded to 1.
>> > >> > We can confirm with online high precision erf calculator like
>> found for
>> > >> > example with http://keisan.casio.com/exec/system/1180573449
>> > >> >
>> > >> > Or we can check with
>> > >> > (10 asArbitraryPrecisionFloatNumBits: 200) erf
>> (ArbitraryPrecisionFloat
>> > >> > readFrom:
>> > >> '0.999999999999999999999999999999999999999999997911512416237455'
>> > >> > readStream numBits: 200)
>> > >> >
>> > >> > -----------------------------
>> > >> >
>> > >> > Note that when I change the LargeInteger digitLength to 32bits
>> instead
>> > >> of 8
>> > >> > (my own VM branch) then the problem magically disappear.
>> > >> >
>> > >> > I plan to introduce these change in the VMMaker.oscog, so I can as
>> well
>> > >> > ignore the problem and proceed, however I much prefer to isolate
>> it, and
>> > >> > understand it for gain of knowledge :)
>> > >> >
>> > >> >
>> > >> > 2016-03-27 4:27 GMT+02:00 Levente Uzonyi <leves at caesar.elte.hu>:
>> > >> >
>> > >> > >
>> > >> > > Hi Nicolas,
>> > >> > >
>> > >> > > If you give us a bit more detailed instructions about what to
>> test,
>> > >> then
>> > >> > > we might be able to help you get this sorted out.
>> > >> > >
>> > >> > > Levente
>> > >> > >
>> > >> > >
>> > >>
>> > >>
>> > >
>>
>>
>
>

-- 
_,,,^..^,,,_
best, Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20160328/af60007e/attachment-0001.htm