Hi Levente, Hi Bert, Hi All,

On Mon, Feb 15, 2016 at 3:39 PM, Levente Uzonyi <leves@caesar.elte.hu> wrote:
On Mon, 15 Feb 2016, Bert Freudenberg wrote:


On 15.02.2016, at 10:17, marcel.taeumel <Marcel.Taeumel@hpi.de> wrote:

Hi Bert,

this was just a regression. There has always been this check in the past for
Morphic projects and still today for MVC projects.

Ah, so we lost the check at some point?

If you would have used VM or OS startup time, this would still be
problematic after an overflow. (Hence the comment about snapshotting). So,
this fix does not directly address the discussion about synching
#millisecondClockValue to wall clock.

I still think it should answer milliseconds since startup. Why would we change that?

Eliot changed it recently. Probably to avoid the rollover issues. The correct fix would be to use the UTC clock instead of the local one in Time class >> #millisecondClockValue.

I changed it for simplicity.  Alas it turns out to be a much more complex issue.  Here's a discussion I'm having with Ryan Macnak, which covers what his team did with the Dart VM.  Please read, it's interesting.

 On Sun, Feb 14, 2016 at 12:08 AM, Ryan Macnak <rmacnak@gmail.com> wrote:
On Sat, Feb 13, 2016 at 5:32 PM, Eliot Miranda <eliot.miranda@gmail.com> wrote:
Hi Ryan,

On Sat, Feb 13, 2016 at 11:21 AM, Ryan Macnak <rmacnak@gmail.com> wrote:
On Thu, Feb 11, 2016 at 10:46 PM, Eliot Miranda <eliot.miranda@gmail.com> wrote:
    Further back Ryan wrote: 
5) Travis found an assertion failure. Unfortunately the assertions fail to include paths with the line numbers.

(newUtcMicrosecondClock >= utcMicrosecondClock 124)

It's easy to track down.  Just grep for the string.  You'll find it in sqUnixHeartbeat.c.  I've seen this from time to time, and have yet to understand it. What OS are you seeing this on?

Linux. Looking at the comment above this assert, I see Cog is using the wrong clock. One should not rely on the realtime clock (gettimeofday) to move steadily forward. It can jump around due to NTP syncs, the machine sleeping or the user changing the time settings. Programs running at startup on the Raspberry Pi in particular can see very large jumps because it has no hardware clock (battery too expensive) so the first NTP sync will be a very large correction. We fixed this in the Dart VM a few months ago. Timers need to be scheduled using the monotonic clock (Linux clock_gettime, Mac mach_absolute_time).
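To make the distinction concrete, here is a minimal sketch in Python rather than in the VM's C code. It assumes Python's time.monotonic() as a stand-in for the monotonic clock and time.time() as a stand-in for gettimeofday; the helper name wait_until_elapsed is hypothetical and is not taken from Cog or Dart.

```python
import time


def wait_until_elapsed(seconds, poll=0.001):
    """Block for at least `seconds`, measured on the monotonic clock.

    Hypothetical helper: the deadline is computed from time.monotonic(),
    so stepping the wall clock (NTP sync, user change) cannot make the
    wait end early or run for a day.
    """
    deadline = time.monotonic() + seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return
        time.sleep(min(poll, remaining))


start_mono = time.monotonic()
start_wall = time.time()
wait_until_elapsed(0.01)
# Difference of two monotonic readings: never negative.
elapsed_mono = time.monotonic() - start_mono
# Difference of two wall-clock readings: can be negative or very large
# if the system time was adjusted in between.
elapsed_wall = time.time() - start_wall
```

Only elapsed_mono is safe to use for scheduling; elapsed_wall is shown purely for contrast.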

Yes, this isn't satisfactory either.  One needs the VM to answer something that is close to wall time, not drift over time.  I think there needs to be some clever averaging algorithm that has the property of always advancing the clock but trying to converge on wall time.

One can imagine that every time the VM updates its notion of the time, it reads both clock_gettime and gettimeofday and computes an offset: some fraction of the delta between the current and previous clock_gettime readings, multiplied by the difference between the two clocks.  So the VM time is always monotonic, but hunts towards wall time as answered by gettimeofday.
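One possible reading of this idea, sketched in Python with the clock readings passed in as plain numbers so the behaviour is easy to check. The class name SlewingClock and the max_slew parameter are assumptions made for illustration; this is not what Cog or the Dart VM does, and the correction rule is only one of several that would fit the description.

```python
class SlewingClock:
    """Hypothetical clock that always advances but converges on wall time."""

    def __init__(self, mono_now, wall_now, max_slew=0.1):
        if not 0 <= max_slew < 1:
            raise ValueError("max_slew must be in [0, 1)")
        self.max_slew = max_slew      # assumed bound on the rate correction
        self.last_mono = mono_now
        self.value = wall_now         # start in agreement with wall time

    def update(self, mono_now, wall_now):
        elapsed = mono_now - self.last_mono
        self.last_mono = mono_now
        # How far off we would be if we simply followed the monotonic clock.
        error = wall_now - (self.value + elapsed)
        # Correct by at most max_slew * elapsed in either direction, so the
        # clock advances by at least elapsed * (1 - max_slew), never backwards.
        limit = self.max_slew * elapsed
        correction = max(-limit, min(limit, error))
        self.value += elapsed + correction
        return self.value


# Simulated readings: the wall clock is stepped back by 50 units
# just before the second update.
clock = SlewingClock(mono_now=0, wall_now=1000)
readings = []
for k in range(1, 8):
    step_back = 50 if k >= 2 else 0
    readings.append(clock.update(mono_now=100 * k,
                                 wall_now=1000 + 100 * k - step_back))
```

In this run the answered time keeps increasing across the step and has caught up with the adjusted wall clock by the last update.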

Thanks. I was unaware of clock_gettime & mach_absolute_time.  Given these two it shouldn't be too hard to concoct something that works.  Or is that the approach you've taken in Dart?  Or are there standard algorithms out there?  I'll take a look.

I'm not seeing why it needs to be close to wall time. The VM needs to make both a wall clock and a monotonic clock available to the image.

That's one way, but it's complex.  I think having a clock that is flexible, that will deviate by no more than a specified percentage from clock_gettime in approaching wall time is simpler for the user albeit more complex for the VM implementor.  It therefore seems to me to be in the Smalltalk tradition.

In Dart, there are three uses of time:
- Stopwatch measures durations (BlockClosure timeToRun). It uses the monotonic clock.
- Timer schedules a future notification (Delay wait). It uses the monotonic clock.
- DateTime gets a timestamp (DateAndTime now). It uses the wall clock.
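The same three-way split can be sketched in Python. The function names below are hypothetical and only mirror the roles listed above; they are not the Dart or Squeak APIs.

```python
import time
from datetime import datetime, timezone


def time_to_run(fn):
    """Stopwatch role: measure a duration on the monotonic clock."""
    start = time.monotonic()
    fn()
    return time.monotonic() - start


def deadline_after(seconds):
    """Timer role: compute a deadline on the monotonic clock."""
    return time.monotonic() + seconds


def has_expired(deadline):
    """Timer role: compare a monotonic deadline with the monotonic clock."""
    return time.monotonic() >= deadline


def timestamp_now():
    """DateTime role: read the wall clock, here as an aware UTC datetime."""
    return datetime.now(timezone.utc)
```

Only timestamp_now() is affected when the system time is changed; the other two keep working off the monotonic clock.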

Makes sense, at the cost of having two clocks.
 
Smalltalk has the additional complication of handling in-flight Delays or timeToRuns as an image moves across processes. There will be a discontinuity in both clocks, and both of them can move backwards. The logic to deal with the discontinuity must already exist for Delays, though I suspect no one has bothered for timeToRun. If I create a thousand Delays spaced apart by a minute, snapshot, move the system time forward a day, then resume, they remain evenly spaced. If I do this while the image is still running, they all fire at once and the VM becomes unresponsive, which is what using the monotonic clock would fix.
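The running-image case can be simulated with made-up clock readings. This is only an illustration in Python: the starting values are arbitrary, and due_timers is a hypothetical helper, not VM code.

```python
def due_timers(deadlines, now):
    """Return the deadlines that have been reached at clock reading `now`."""
    return [d for d in deadlines if d <= now]


MINUTE = 60
DAY = 24 * 60 * MINUTE

wall_start = 1_000_000   # arbitrary wall-clock reading, in seconds
mono_start = 500         # arbitrary monotonic reading, in seconds

# A thousand timers spaced one minute apart, scheduled against each clock.
wall_deadlines = [wall_start + i * MINUTE for i in range(1, 1001)]
mono_deadlines = [mono_start + i * MINUTE for i in range(1, 1001)]

# One minute of real time passes, and the system time is also
# moved forward by a day. Only the wall clock sees the jump.
wall_now = wall_start + MINUTE + DAY
mono_now = mono_start + MINUTE

fired_wall = len(due_timers(wall_deadlines, wall_now))   # all 1000 at once
fired_mono = len(due_timers(mono_deadlines, mono_now))   # just the first one
```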

Yes, but there is another way.  Delays can be implemented to function as durations, not deadlines.  This is orthogonal to clocks.  If Delays are deadlines then it is correct that on start-up they all fire.  If they are durations, it is not.
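A rough Python sketch of the difference, with clock readings passed in explicitly. DurationDelay and its save/resume methods are hypothetical names chosen for illustration; they do not describe how Squeak's Delay is actually implemented.

```python
class DurationDelay:
    """Hypothetical delay that remembers how long is left, not when it ends."""

    def __init__(self, duration, now):
        self.remaining = duration
        self.started_at = now

    def remaining_at(self, now):
        return max(0, self.remaining - (now - self.started_at))

    def save(self, now):
        # Before a snapshot: freeze the time still to wait.
        self.remaining = self.remaining_at(now)
        self.started_at = None

    def resume(self, now):
        # After start-up: the clock may hold any value, so restart from it.
        self.started_at = now

    def expired(self, now):
        return self.remaining_at(now) == 0


# A 600-unit delay created at clock reading 1000, snapshotted at 1100.
delay = DurationDelay(600, now=1000)
deadline = 1000 + 600            # the same delay expressed as a deadline
delay.save(now=1100)

# The image resumes in a process whose clock reads 90_000_000.
resumed_at = 90_000_000
delay.resume(now=resumed_at)

deadline_fires_immediately = resumed_at >= deadline
duration_fires_immediately = delay.expired(resumed_at)
left_after_resume = delay.remaining_at(resumed_at)
```

With these numbers the deadline version fires as soon as the image starts, while the duration version still has 500 units left to wait.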

_,,,^..^,,,_
best, Eliot

 
Currently this change also affects performance (down to 8-10% of the previous implementation), because of the creation of multiple LargeIntegers.

This is no longer an issue in 64-bits ;-).  But even if answering large integers is slower it doesn't impact real applications since they spend little of their time in the delay & timing part of the code.  But I'm sure that Nicolas & I can do something about large integer performance.

 
Levente


- Bert -








--
_,,,^..^,,,_
best, Eliot