2017-03-10 15:14 GMT+01:00 David T. Lewis <lewis@mail.msen.com>:

On Fri, Mar 10, 2017 at 09:45:07AM +0100, Nicolas Cellier wrote:
>
> 2017-03-10 3:47 GMT+01:00 David T. Lewis <lewis@mail.msen.com>:
>
> >
> > On Thu, Mar 09, 2017 at 12:41:44PM -0800, Eliot Miranda wrote:
> > >
> > > Hi David,
> > >
> > > On Mon, Mar 6, 2017 at 5:50 PM, David T. Lewis <lewis@mail.msen.com>
> > wrote:
> > >
> > > >
> > > > In the VM, the millisecond clock wraps within the 32 bit integer range:
> > > >
> > > > #define MillisecondClockMask 0x1FFFFFFF
> > > >
> > > > In the Cuis image, Delay class>>handleTimerEvent does this:
> > > >
> > > > nextTick := nextTick min: SmallInteger maxVal.
> > > >
> > > > On a 64-bit Spur image, SmallInteger maxVal is 16rFFFFFFFFFFFFFFF, but
> > on
> > > > a 32-bit image it is 16r3FFFFFFF.
> > > >
> > > > Could that be it?
> > > >
> > > > I don't really know how to test in Squeak. As you say, Squeak is now
> > > > using the microsecond clock in #handleTimerEvent. I do not see anything
> > > > in primitiveSignalAtMilliseconds that would behave any differently on
> > > > a 64 bit versus 32 bit image or VM, but I do not know how to test to
> > > > be sure.
> > > >
> > >
> > > I bet that the following code from primitiveSignalAtMilliseconds ends up
> > > wrapping when given
> > > Delay primSignal: s atMilliseconds: Time primMillisecondClock - 10.
> > >
> > > deltaMsecs := msecs - (self ioMSecs bitAnd: MillisecondClockMask).
> > > deltaMsecs < 0 ifTrue:
> > > [deltaMsecs := deltaMsecs + MillisecondClockMask + 1].
> > > nextWakeupUsecs := self ioUTCMicroseconds + (deltaMsecs * 1000).
> > >
> > > and I bet you'll find that the VM will wake up in about 6 days and 5
> > hours
> > > ;-)
> > >
> >
> > No. The failure is specific to the 64 bit VM. Source code for the primitive
> > is the same in either case.
> >
> >
> >
> > > I suppose we could fix this, but I'm *much* happier to simply not use
> > > primitiveSignalAtMilliseconds and stay with the simpler and
> > wrapping-immune
> > > primitiveSignalAtUTCMicroseconds
> >
> > Fair enough, but given that there is a demonstrated bug that affects only
> > the 64-bit VM, and given that it expresses itself intermittently and in
> > ways
> > that affect only someone who is attempting to migrate their existing V3
> > image
> > to Spur, then I would say that it makes very good sense to take the time
> > to fix it if we are able to do so. After all, there may be other people who
> > will want to migrate V3 images to Spur, and there is no point in making the
> > process needlessly difficult.
> >
> > I do not have the solution, but maybe someone else can help. So I am asking
> > for help here. Can someone with a working 64-bit build environment please
> > check and see if what I said in an earlier email might make a difference:
> >
> > I see that ioMSecs() is declared as signed long (32 bits), but it is used
> > in expression with a 64 bit usqInt. So maybe it needs a cast, or maybe
> > the variables like msecs and deltaMsecs in primitiveSignalAtMilliseconds
> > should be declared as 32 bit long and unsigned long to match the actual
> > usage.
> >
> > Thanks,
> >
> > Dave
> >
> >
>
> Hi David
> I just reviewed the generated code for primitiveSignalAtMilliseconds
> andhere is what i found:
>
> The primitive will simply fail if fed with a negative integer, thanks to:
>
> msecs = positive32BitValueOf(msecsObj);
>
> msecs is allways positive, since declared as
> usqInt msecs;
>
> So far, so good.
>
> Then
> deltaMsecs = msecs - ((ioMSecs()) &
> MillisecondClockMask);
> if (deltaMsecs < 0) {
> deltaMsecs = (deltaMsecs +
> MillisecondClockMask) + 1;
> }
> GIV(nextWakeupUsecs) = (ioUTCMicroseconds()) +
> (deltaMsecs * 1000);
>
> Due to the 29 bits Mask, the bitAnd: (&) operation will result into a
> positive int.
> 0 <= ((ioMSecs()) & MillisecondClockMask) <= 0x1FFFFFFF
>
> However, when we do the Roll-over thing:
> deltaMsecs = (deltaMsecs +
> MillisecondClockMask) + 1;
> Then we create a giant delay (the 6 days mentionned by Eliot) which is
> HIGHLY QUESTIONNABLE whatever 32 or 64 bits!
>
> The difference between 32 and 64 bits is in the next line:
> GIV(nextWakeupUsecs) = (ioUTCMicroseconds()) +
> (deltaMsecs * 1000);
> Because
> sqInt deltaMsecs;
>
> is 32 bits for a 32bits spur, 64bits for 64bits spur, then (deltaMsecs *
> 1000) will do two different things:
> - in 32 bits it will overflow leading to a negative delay and a
> nextWakeupUsecs in the past (but that's invoking Undefined Behavior)
> - in 64 bits it won't overflow leading to he 6 days delay
>
> Does that explain?
>
> if (deltaMsecs < 0) I would either fail the primitive or set the deltaMsecs
> to zero.
> What do you think?

Hi Nicolas,

Thanks very much for looking into this and for the explanation. I am away
right now so I cannot look at it now, but it does seem likely to be a sign
extension issue at least in part.

If you have a chance to try recompiling with changes to the generated code,
Juan's test condition is just this:

s := Semaphore new.
Delay primSignal: s atMilliseconds: Time primMillisecondClock - 10.
s wait.

This locks the image on 64-bits, and not on 32-bits.

Thanks a lot!
Dave

Hi Dave,

No, it's not a sign extension problem.

The code is broken.
The rollover protection is undue because we inject (clock+delay) and then subtract next_clock,

In case of clock rollover, next_clock falls to zero and we will have a big delay, not a negative one.

The rollover protection also transform a small negative delay into big delay.

Things seems to work in 32 bits just because we have an integer overflow when we convert msecs to usecs.

big_delay_in_msec * 1000 => overflow => negative_delay_in_usecs