[squeak-dev] The Inbox: Chronology-Core-ul.54.mcz

David T. Lewis lewis at mail.msen.com
Sat May 9 22:49:58 UTC 2020


Hi Levente,

Some additional measurements below, this time comparing DateAndTime now
speed for three versions of the primitive, and for different clock policy
settings.

The current primitive clearly needs to be fixed to get good sampling, and
there is minimal performance impact for the fix.

The gettimeofday() version of the primitive is much slower, but produces
correct results without caching the GMT offset. The speed difference is
larger than I expected. I am not sure which is better, I guess it depends
on how much the speed of DateAndTime now matters in practice.

In any case, this gives some numbers for comparison.

Dave

On Sat, May 09, 2020 at 04:46:55PM -0400, David T. Lewis wrote:
> On Wed, May 06, 2020 at 09:31:33AM -0400, David T. Lewis wrote:
> > On Wed, May 06, 2020 at 01:03:02PM +0200, Levente Uzonyi wrote:
> > > Hi Dave,
> > > 
> > > On Tue, 5 May 2020, David T. Lewis wrote:
> > > 
> > > >Hi Levente,
> > > >
> > > >On Wed, May 06, 2020 at 02:39:28AM +0200, Levente Uzonyi wrote:
> > > >>Hi Dave,
> > > >>
> > > >>On Tue, 5 May 2020, David T. Lewis wrote:
> > > >>
> > > >>>On Sun, May 03, 2020 at 06:45:52PM +0000, commits at source.squeak.org 
> > > >>wrote:
> > > >>>>Levente Uzonyi uploaded a new version of Chronology-Core to project The 
> > > >>>>Inbox:
> > > >>>>http://source.squeak.org/inbox/Chronology-Core-ul.54.mcz
> > > >>>>
> > > >>>>==================== Summary ====================
> > > >>>>
> > > >>>>Name: Chronology-Core-ul.54
> > > >>>>Author: ul
> > > >>>>Time: 3 May 2020, 8:45:27.099465 pm
> > > >>>>UUID: 770bbc91-fb33-46c2-a283-57f7163b672e
> > > >>>>Ancestors: Chronology-Core-nice.52
> > > >>>>
> > > >>>>Automatically update the time zone cached by the VM every 30 minutes:
> > > >>>
> > > >>>
> > > >>>I am concerned that we are adding kludges on top of kludges, while
> > > >>>overlooking a very simple underlying problem.
> > > >>>
> > > >>>The intent of primitiveUtcWithOffset is to answer, in a single atomic
> > > >>>operation, the current time UTC along with the current GMT offset. It
> > > >>>should be reported directly from the operating system, and not otherwise
> > > >>>cached or manipulated in the VM. The intended use case is for getting the
> > > >>>current system time and offset to be used in the creation of DateAndTime 
> > > >>>now,
> > > >>>nothing else.
> > > >>>
> > > >>>The original and still correct implementation of ioUtcWithOffset for
> > > >>>Unix is a direct call to gettimeofday(). On any reasonably modern Unix
> > > >>>system it provides the two values directly in a single OS system call.
> > > >>>
> > > >>>/* implementation of ioLocalSecondsOffset(), defined in config.h to
> > > >>>* override default definition in src/vm/interp.h
> > > >>>*/
> > > >>>sqInt sqUnixUtcWithOffset(sqLong *microSeconds, int *offset)
> > > >>>{
> > > >>> struct timeval timeval;
> > > >>> if (gettimeofday(&timeval, NULL) == -1) return -1;
> > > >>> time_t seconds= timeval.tv_sec;
> > > >>> suseconds_t usec= timeval.tv_usec;
> > > >>> *microSeconds= seconds * 1000000 + usec;
> > > >>>#if defined(HAVE_TM_GMTOFF)
> > > >>> *offset= localtime(&seconds)->tm_gmtoff;
> > > >>>#else
> > > >>> {
> > > >>>   struct tm *local= localtime(&seconds);
> > > >>>   struct tm *gmt= gmtime(&seconds);
> > > >>>   int d= local->tm_yday - gmt->tm_yday;
> > > >>>   int h= ((d < -1 ? 24 : 1 < d ? -24 : d * 24) + local->tm_hour - 
> > > >>>   gmt->tm_hour);
> > > >>>   int m= h * 60 + local->tm_min - gmt->tm_min;
> > > >>>   *offset= m * 60;
> > > >>> }
> > > >>>#endif
> > > >>> return 0;
> > > >>>}
> > > >>>
> > > >>>The oscog code base does not have ioUtcWithOffset in 
> > > >>>platforms/Cross/vm/sq.h,
> > > >>>and the actual implementation of the primitive uses a workaround 
> > > >>involving
> > > >>>ioUTCMicroseconds and ioLocalSecondsOffset rather than the single call to
> > > >>>ioLocalSecondsOffset.
> > > >>>
> > > >>>I am personally responsible for putting that kludge into Cog in earlier 
> > > >>>days,
> > > >>>and I apologize for my lapse in judgement. It is horrible, and that's my 
> > > >>>fault.
> > > >>>
> > > >>>It now looks to me like we are trying to have the image keep track of 
> > > >>>current
> > > >>>GMT offset, and then inform the VM when it changes so that the VM can 
> > > >>>update
> > > >>>its cached offset value, which it then turns around and uses for my 
> > > >>kludgy
> > > >>>workaround for the lack of an ioUtcWithOffset implementation.
> > > >>>
> > > >>>Maybe it is time to add a proper ioUtcWithOffset, and get rid of the 
> > > >>>various
> > > >>>workarounds in both the VM and the image?
> > > >>
> > > >>Eliot always says that asking the OS about the current time zone is too 
> > > >>costly for something like DateAndTime now.
> > > >>I haven't benchmarked the C code, but I have a Smalltalk benchmark to 
> > > >>demonstrate Eliot is right:
> > > >>
> > > >>
> > > >>"primitiveUtcWithOffset uses the cached time zone offset"
> > > >>a := Array new: 2.
> > > >>[ Time primPosixMicrosecondClockWithOffset: a ] benchFor: 1 seconds. 
> > > >>'28,200,000 per second. 35.5 nanoseconds per run. 0 % GC time.'.
> > > >>
> > > >>"LocalePlugin's primitiveTimezoneOffset asks the OS about the offset via 
> > > >>localtime()
> > > >>https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/platforms/unix/plugins/LocalePlugin/sqUnixLocale.c#L675 "
> > > >>
> > > >>l := Locale current.
> > > >>[ l primTimezone ] benchFor: 1 seconds.
> > > >>'773,000 per second. 1.29 microseconds per run. 0 % GC time.'
> > > >>
> > > >>
> > > >>The latter takes 363 times longer according to this benchmark.
> > > >>
> > > >
> > > >It is OK to cheat if you don't get caught.
> > > >
> > > >On a slow interpreter VM, with the IMHO correct primitive:
> > > >
> > > >  oc := OrderedCollection new.
> > > >  stopTime := DateAndTime now + 1 second.
> > > >  [ (oc add: (Time primPosixMicrosecondClockWithOffset: DateAndTime 
> > > >  basicNew)) < stopTime]
> > > >  	whileTrue.
> > > >  oc size.  "==> 242827"
> > > >  oc asSet size. "==> 242827"
> > > >
> > > >On a fast Spur 64 VM:
> > > >
> > > >  oc := OrderedCollection new.
> > > >  stopTime := DateAndTime now + 1 second.
> > > >  [ (oc add: (Time primPosixMicrosecondClockWithOffset: DateAndTime 
> > > >  basicNew)) < stopTime]
> > > >  	whileTrue.
> > > >  oc size.  "==> 3098713"
> > > >  oc asSet size. "==> 106"
> > > >
> > > >So yes it is fast if you cache the values, but the results are wrong.
> > > >That in turn leads to the perceived need for the logic in
> > > >Time class>>posixMicrosecondClockWithOffset: to artificially generate
> > > >unique time values. Those values are even more incorrect, and the overall
> > > >process is almost certainly slower than just doing it right in the first 
> > > >place.
> > > 
> > > 
> > > Right, that looks really bad. But, primitve 240 seems to do the right 
> > > thing:
> > > 
> > > oc := OrderedCollection new.
> > > stopTime := Time utcMicrosecondClock + 1000000.
> > > [ (oc add: Time utcMicrosecondClock) < stopTime ] whileTrue.
> > > { oc size. oc asSet size }. #(10485762 489582).
> > > 
> > > If I preallocate the right sized OrderedCollection to avoid GCs, it's 
> > > almost perfect in the sense that almost all possible values are there:
> > > 
> > > oc := OrderedCollection new: 23000000.
> > > stopTime := Time utcMicrosecondClock + 1000000.
> > > [ (oc add: Time utcMicrosecondClock) < stopTime ] whileTrue.
> > > { oc size. oc asSet size }.
> > >  #(22899376 996417)
> > > 
> > > The missing values are probably the VM doing something else. Most of them 
> > > come in 2-3 microsecond long gaps starting exactly 4000 microseconds apart 
> > > from each other. The longest gap was 33 microseconds long when I measured.
> > > 
> > > So, it seems reasonable to make primitiveUtcWithOffset do the same thing 
> > > primitive 240 does to get the clock value.
> > >
> > 
> > That might be just a matter of calling ioUTCMicrosecondsNow() rather than
> > ioUTCMicroseconds() in primitiveUtcWithOffset.
> > 
> > I am travelling today so I can't follow up to confirm.
> > 
> > It would also be worth measuring the performance of the gettimeofday()
> > implementation, which handles the GMT offset properly. My expectation
> > is that it will be plenty fast enough, but I have not measured it on Spur.
> >
> 
> OK, here are some actual measurements. I built a Spur64 VM locally, generating
> the VMMaker code, and adding two new primitives.
> 
> The first primitive that I added is primitiveUtcWithOffsetNOW, which is identical
> to primitiveUtcWithOffset except that it calls ioUTCMicrosecondsNow rather
> than ioUTCMicroseconds. This is the minimal fix to produce correct sampling
> for DateAndTime now.
> 
> The second primitive that I added is primitiveUtcWithOffsetDTL, which is an
> exact copy of my original implementation in interpreter VM trunk. I also added
> ioUtcWithOffset() from interpreter VM trunk to the support code in sqUnixMain.c.
> 
> Here are the results that I measured:
> 
> "Pre-allocate the storage"
> baselinePrimitve240 := OrderedCollection new: 40000000.
> usingOscogPrimitive := OrderedCollection new: 20000000.
> usingOscogPrimitiveWithFix := OrderedCollection new: 20000000.
> usingGettimeofdayPrimitive := OrderedCollection new: 20000000.
> dt := DateAndTime basicNew. "reused by the primitives, so no allocation"
> 
> Smalltalk garbageCollect.
> 
> "Baseline - call primitive 240 for utcMicrosecondClock"
> stop := Time utcMicrosecondClock + (1 second asMicroSeconds).
> [ (baselinePrimitve240 add: Time utcMicrosecondClock) < stop ] whileTrue.
> 
> "Call the oscog primitive saving only small integers in the arrays"
> stop := DateAndTime now + 1 second.
> [ ( usingOscogPrimitive add: (Time primPosixMicrosecondClockWithOffset: dt ) utcMicroseconds ). dt < stop ] whileTrue.
> 
> "Call the oscog primitive with fix for NOW saving only small integers in the arrays"
> stop := DateAndTime now + 1 second.
> [ ( usingOscogPrimitiveWithFix add: (Time primPosixMicrosecondClockWithOffsetNow: dt ) utcMicroseconds ). dt < stop ] whileTrue.
> 
> "Call the gettimeofday primitive from interpreter VM saving only small integers in the arrays"
> stop := DateAndTime now + 1 second.
> [ ( usingGettimeofdayPrimitive add: (Time primPosixMicrosecondClockWithOffsetDTL: dt ) utcMicroseconds ). dt < stop ] whileTrue.
> 
> "Look at only the saved values"
> baselinePrimitve240 := baselinePrimitve240 select: [:e | e notNil].
> usingOscogPrimitive := usingOscogPrimitive select: [:e | e notNil].
> usingOscogPrimitiveWithFix := usingOscogPrimitiveWithFix select: [:e | e notNil].
> usingGettimeofdayPrimitive := usingGettimeofdayPrimitive select: [ :e | e notNil].
> 
> "Compare the results"
> results :={
> #baselinePrimitive240 -> { baselinePrimitve240 size . baselinePrimitve240 asSet size } .
> #primPosixMicrosecondClockWithOffset: -> { usingOscogPrimitive size . usingOscogPrimitive asSet size }.
> #primPosixMicrosecondClockWithOffsetNow: -> { usingOscogPrimitiveWithFix size . usingOscogPrimitiveWithFix asSet size }.
> #primPosixMicrosecondClockWithOffsetDTL: -> { usingGettimeofdayPrimitive size . usingGettimeofdayPrimitive asSet size }
> }.
> 
> results. 
> "==> {
> 	#baselinePrimitive240->#(
> 		21426822
> 		959518
> 	) .
> 	#primPosixMicrosecondClockWithOffset:->#(
> 		18542112
> 		496
> 	) .
> 	#primPosixMicrosecondClockWithOffsetNow:->#(
> 		13032095
> 		959842
> 	) .
> 	#primPosixMicrosecondClockWithOffsetDTL:->#(
> 		692791
> 		692791
> 	)
> }"
> 
> 

Additional measurements comparing DateAndTime now speed for three versions
of the primitive, and for different clock policy settings:

"With current primPosixMicrosecondClockWithOffset:"

Time clockPolicy: #monotonicAllowDuplicates.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]]. "==> 1022"

Time clockPolicy: #acceptPlatformTime.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]]. "==> 874"

Time clockPolicy: #monotonicForceMicrosecondIncrement.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]]. "==> 1056"

Time clockPolicy: #monotonicForceNanosecondIncrement.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]]. "==> 9348"


"Using the primitive with ioUTCMicrosecondsNow() rather than ioUTCMicroseconds() to fix sampling error:"

Time clockPolicy: #monotonicAllowDuplicates.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]]. "==> 1659"

Time clockPolicy: #acceptPlatformTime.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]]. "==> 1482"

Time clockPolicy: #monotonicForceMicrosecondIncrement.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]]. "==> 1739"

Time clockPolicy: #monotonicForceNanosecondIncrement.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]]. "==> 10195"


"Using the gettimeofday() primitive from interpreter VM trunk:"

Time clockPolicy: #monotonicAllowDuplicates.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]].  "==> 15330"

Time clockPolicy: #acceptPlatformTime.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]].  "==> 15138"

Time clockPolicy: #monotonicForceMicrosecondIncrement.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]]. "==> 15377"

Time clockPolicy: #monotonicForceNanosecondIncrement.
Time millisecondsToRun: [10000000 timesRepeat: [DateAndTime now]].  "==> 15549"


 


More information about the Squeak-dev mailing list