[Vm-dev] Time primitive precision versus accuracy (was: [Pharo-dev] strange time / delay problems on pharo-contribution-linux64-4.ci.inria.fr)

David T. Lewis lewis at mail.msen.com
Tue Mar 11 19:30:43 UTC 2014


Hi Eliot,

Yes that makes perfect sense and I agree on all points.

IIRC, there are some differences in the support code (and hence primitive
implementations) for the gettimeofday calls. I think they are equivalent
from a performance point of view (though I'm not looking at the code
right now to check). If you are able to use the implementations in trunk
without a performance hit, that would help get rid of some accidental
branching differences.

Thanks,
Dave

>  Hi David,
>
>     forgive the lack of response.  I needed to think about this...
>
>
> On Wed, Mar 5, 2014 at 4:59 PM, David T. Lewis <lewis at mail.msen.com>
> wrote:
>
>>
>> On Tue, Mar 04, 2014 at 03:13:26PM -0800, Eliot Miranda wrote:
>> >
>> > On Tue, Mar 4, 2014 at 2:37 PM, Sven Van Caekenberghe <sven at stfx.eu> wrote:
>> > >
>> > > There is a big difference in how DateAndTime class>>#now works in 2.0
>> > > vs 3.0. The former uses #millisecondClockValue while the latter uses
>> > > #microsecondClockValue. Furthermore, 2.0 does all kinds of crazy stuff
>> > > with a loop and Delays to try to improve the accuracy; we threw all
>> > > that out.
>> > >
>> > > Yeah, I guess the clock is just being updated slowly, maybe under heavy
>> > > load. The question is where that happens. I think in the virtualised OS.
>> > >
>> >
>> > That makes sense.  In the VM the microsecond clock is updated on every
>> > heartbeat and the heartbeat should be running at about 500Hz.  Can anyone
>> > with hardware confirm that in 3.0 the time on linux does indeed increase
>> > at about 500Hz?
>> >
>> > Note that even in 2.0 the time is being derived from the same basis.  In
>> > Cog, the effective time basis is the 64-bit microsecond clock maintained
>> > by the heartbeat and even the secondClock and millisecondClock are derived
>> > from this.  So it's still confusing why there should be such a difference
>> > between 2.0 and 3.0.
>> >
>>
>> The separate thread for timing may be good for profiling, but it is not
>> such a good idea for the time primitives. When the image asks for "time
>> now" it means now, not whatever time it was when the other thread last took
>> a sample. By reporting the sampled time, we get millisecond accuracy (or
>> worse) reported with microsecond precision.
>>
>
> Yes, you're quite right that the lack of accuracy is a bug.  You're
> not right about the heartbeat thread though; here's why, and this took me
> some time to remember.
>
> gettimeofday is quite slow.  It takes a timezone argument which needs to be
> looked up.  So if at all possible calls to gettimeofday should be
> minimised.  Of course, using it to get the time at the Smalltalk level is
> fine.  Where it /isn't/ fine is in polling for events.  The VM can enquire
> of the time at a very high frequency, and if it calls gettimeofday every
> time through its checkForEvents loop this can add up to significant time.
> One of the uses here is in seeing if the next scheduled delay fire time has
> been reached. Hence, prompted by Andreas, I implemented time update in the
> heartbeat thread.  The idea here is that the time is accessed only at
> heartbeat frequency (say 500Hz), not at interrupt check frequency, which is
> heartbeat frequency + stack page overflow frequency, which can be 250KHz
> (!!).
>
> For example here's data from profiling 10 timesRepeat: [0 tinyBenchmarks]
>
> **Events**
> Process switches 729 (10 per second)
> ioProcessEvents calls 3613 (50 per second)
> Interrupt checks 38635 (530 per second)
> Event checks 42300 (580 per second)
> Stack overflows 17461252 (239346 per second)  << deep recursion
> Stack page divorces 0 (0 per second)
>
>
> Deep recursion from the benchFib method causes a high rate of stack
> overflows, and the VM uses stack overflow to check for events.  The
> heartbeat forces a stack overflow by setting the stack limit suitably.  So
> if the check for events used the time now, rather than the time set in the
> heartbeat, it would be calling gettimeofday at 240KHz, not 500Hz, a big
> difference.  The heartbeat time is then used to reduce the frequency of
> ioProcessEvents calls to 50Hz.
>
> But it wasn't until today in the shower that I realised that of course the
> VM /doesn't/ need to use the heartbeat time for the time accessed at the
> Smalltalk level.  So primUTCMicrosecondClock should use
> ioUTCMicrosecondsNow while the event checking code should remain unchanged.
> The effect of this is to increase the accuracy of Smalltalk-level time,
> while keeping the Delay accuracy at the heartbeat frequency, or a jitter of
> 2ms or so.  And of course it's simple to modify the code to use a flag so
> people can experiment.
>
> But for me the right compromise is to access the current time in the time
> primitives but continue to use the heartbeat time for checking for events.
>  Does that make sense to you David?
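
A minimal C sketch of the compromise described above, not the actual Cog
source: the heartbeat caches the clock, the event check reads only the cached
value, and the time primitive reads the clock directly unless an experiment
flag says otherwise. All names here (usecs_now, heartbeat_tick, delay_is_due,
prim_utc_microseconds, prim_uses_heartbeat_time, clock_reads) are
hypothetical, and the real heartbeat runs in its own thread, which this
single-threaded sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical names throughout; this is an illustration, not VM code. */

static uint64_t heartbeat_usecs;               /* time last sampled by the heartbeat */
static unsigned long clock_reads;              /* number of gettimeofday calls made */
static bool prim_uses_heartbeat_time = false;  /* experiment flag */

/* Read the OS clock directly: accurate, but costs one gettimeofday call. */
uint64_t usecs_now(void)
{
    struct timeval tv;
    int rc = gettimeofday(&tv, NULL);  /* NULL: no timezone requested */
    assert(rc == 0);
    (void)rc;
    clock_reads++;
    return (uint64_t)tv.tv_sec * 1000000u + (uint64_t)tv.tv_usec;
}

/* Meant to be called at heartbeat frequency (about 500Hz in the text above). */
void heartbeat_tick(void)
{
    heartbeat_usecs = usecs_now();
}

/* Event check: may run at stack-overflow frequency, so it never reads the
   OS clock and only compares against the cached heartbeat time. */
bool delay_is_due(uint64_t next_wakeup_usecs)
{
    return heartbeat_usecs >= next_wakeup_usecs;
}

/* Time primitive: reads the clock now, unless the flag selects the old
   heartbeat-sampled behaviour for comparison. */
uint64_t prim_utc_microseconds(void)
{
    return prim_uses_heartbeat_time ? heartbeat_usecs : usecs_now();
}
```

With this split, any number of event checks between two heartbeats adds no
gettimeofday calls, while each call from the primitive costs exactly one.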
>
>
>> If you collect primUTCMicrosecondClock in a loop with the primitive as
>> implemented in Cog, then plot the result, you get a staircase with long
>> runs of the same time value, and sudden jumps every one and a half
>> milliseconds or so. The equivalent plot for the primitive as implemented
>> in the interpreter VM is a smoothly increasing slope.
>>
>> To illustrate, if you run the example below with Cog to collect as many
>> "time now" data points as possible within a one second time period, you
>> find a large number of data points at microsecond precision, but only a
>> small number of distinct time values within that period.
>>
>>   "Cog VM"
>>   oc := OrderedCollection new.
>>   now := Time primUTCMicrosecondClock.
>>   [(oc add: Time primUTCMicrosecondClock - now) < 1000000] whileTrue.
>>   oc size. ==> 2621442
>>   oc asSet size. ==> 333
>>
>> In contrast, an interpreter VM with no separate timer thread collects
>> fewer samples (because it is slower) but the sample values are distinct
>> and increase monotonically with no stair stepping effect.
>>
>>   "Interpreter VM"
>>   oc := OrderedCollection new.
>>   now := Time primUTCMicrosecondClock.
>>   [(oc add: Time primUTCMicrosecondClock - now) < 1000000] whileTrue.
>>   oc size. ==> 246579
>>   oc asSet size. ==> 246579
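
The two measurements above can be mimicked with a small deterministic
simulation in C. This is only a sketch under assumed numbers: a 2000
microsecond tick (the 500Hz heartbeat mentioned earlier in the thread) and
made-up per-read costs. simulate_sampling and sample_stats are hypothetical
names, and nothing here measures a real VM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t samples;   /* how many clock reads fit in the window */
    size_t distinct;  /* how many different values those reads returned */
} sample_stats;

/* Simulate reading a clock in a tight loop for window_usecs of true time,
   where each read takes read_cost_usecs. If tick_usecs is non-zero the
   reported value only changes on tick boundaries (a sampled clock);
   if it is zero the reported value is the true time (a direct read). */
sample_stats simulate_sampling(uint64_t window_usecs,
                               uint64_t read_cost_usecs,
                               uint64_t tick_usecs)
{
    sample_stats s = {0, 0};
    uint64_t last = UINT64_MAX;  /* sentinel: no value seen yet */

    assert(read_cost_usecs > 0); /* otherwise true time would never advance */

    for (uint64_t t = 0; t < window_usecs; t += read_cost_usecs) {
        uint64_t reported = tick_usecs ? t - (t % tick_usecs) : t;
        s.samples++;
        /* Reported values never decrease, so counting changes counts
           distinct values. */
        if (reported != last) {
            s.distinct++;
            last = reported;
        }
    }
    return s;
}
```

Under these assumptions, simulate_sampling(1000000, 1, 2000) yields one
million samples but only 500 distinct values, while
simulate_sampling(1000000, 4, 0) yields 250000 samples that are all distinct.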
>>
>> Dave
>>
>>
>
>
> --
> best,
> Eliot
>



