[Vm-dev] Time primHighResClock truncated to 32 bits in 64 bits VMs.
Das.Linux at gmx.de
Thu Dec 28 17:52:41 UTC 2017
> On 28.12.2017, at 18:06, Eliot Miranda <eliot.miranda at gmail.com> wrote:
> Hi David, Hi Juan,
> On Thu, Dec 28, 2017 at 8:56 AM, David T. Lewis <lewis at mail.msen.com> wrote:
> On Thu, Dec 28, 2017 at 09:32:46AM -0300, Juan Vuletich wrote:
> > Hi Folks,
> > In 32 bit Cog VMs, `Time primHighResClock` answers LargePositiveInteger,
> > presumably up to 64 bits. This would mean a rollover in 167 years on a
> > 3.5GHz machine.
> > But on 64 bit Cog and Stack Spur VMs, it answers a SmallInteger that is
> > truncated to 32 bits. This means a rollover in about one second.
> > I guess this is a bug. Answering a SmallInteger, truncating the CPU 64
> > bit counter to 60 bits would be ok. I think it makes sense to restrict
> > answer to SmallInteger to avoid allocation, and a rollover every 41
> > years is not too much :)
> > Thanks,
> > --
> > Juan Vuletich
> > www.cuis-smalltalk.org
> > https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
> > @JuanVuletich
> Attached is the #primHighResClock accessor for Squeak/Pharo users.
> I don't see anything obviously wrong with the primitive, although maybe it
> involves the handling of positive64BitIntegerFor: in the 64-bit VM.
> The primitive is:
> "Return the value of the high resolution clock if this system has any. The exact frequency of the high res clock is undefined specifically so that we can use processor dependent instructions (like RDTSC). The only use for the high res clock is for profiling where we can allocate time based on sub-msec resolution of the high res clock. If no high-resolution counter is available, the platform should return zero."
> <export: true>
> self pop: 1.
> self push: (self positive64BitIntegerFor: self ioHighResClock).
> And the platform support code does this:
> /* return the value of the high performance counter */
> sqLong value = 0;
> #if defined(__GNUC__) && ( defined(i386) || defined(__i386) || defined(__i386__) \
> || defined(i486) || defined(__i486) || defined (__i486__) \
> || defined(intel) || defined(x86) || defined(i86pc) )
> __asm__ __volatile__ ("rdtsc" : "=A"(value));
> #elif defined(__arm__) && (defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_7A__))
> /* tpr - do nothing for now; needs input from eliot to decide further */
> #else
> # error "no high res clock defined"
> #endif
> return value;
> Ah, OK. So this is the problem. This will answer %eax on 64-bit systems, which discards the upper 32 bits. rdtsc loads %edx:%eax with the 64-bit time stamp, but on 64 bits the in-line asm will simply move %eax to %rax. We have to rewrite that in-line assembler to construct %rax correctly from %edx and %eax.
Wikipedia points to this code: https://web.archive.org/web/20161215213659/http://www.cs.wm.edu/~kearns/001lab.d/rdtsc.html
unsigned long long int x;
unsigned a, d;
__asm__ volatile("rdtsc" : "=a" (a), "=d" (d));
return ((unsigned long long)a) | (((unsigned long long)d) << 32);
Which seems reasonable.
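Folded into a complete function, that could look roughly like the sketch below. This is not the actual VM code: the names combine_tsc and read_high_res_clock are made up for illustration, and the preprocessor guard is my assumption about which compilers and targets accept this asm. The point is only that reading %eax and %edx separately and shifting by hand gives the same result on 32 and 64 bits, whereas "=A" only means the %edx:%eax pair on 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper: join the two 32-bit halves that rdtsc leaves in
   %eax (low) and %edx (high) into one 64-bit value. */
uint64_t combine_tsc(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Hypothetical stand-in for the rdtsc branch of the platform code. */
uint64_t read_high_res_clock(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t lo, hi;
    /* Ask for each register explicitly instead of using "=A". */
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return combine_tsc(lo, hi);
#else
    /* No counter known here: answer zero, as the primitive comment asks. */
    return 0;
#endif
}
```

Whether the VM would want to mask the result so that it still fits a SmallInteger, as Juan suggests, is a separate question that this sketch does not address.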
> best, Eliot