OQO [Re: Yet another interesting bit of hardware in the Dynapad vein...]
Mike Rutenberg
mdrs at akasta.com
Tue May 21 08:58:43 UTC 2002
I suspect loop overhead is the main influence on the ratio, even though it will be the same for integer and FP on a given platform.
I also get a ratio of roughly 2.1 on my Pentium for this:
[100000 timesRepeat: [0+0]] timeToRun.
[100000 timesRepeat: [0.0+0.0]] timeToRun.
If I unroll the computation to reduce the overhead of the 100000 times
loop, I get a ratio of more like 7.7:
[100000 timesRepeat: [0+0+0+0+0+0+0+0+0+0]] timeToRun.
[100000 timesRepeat: [0.0+0.0+0.0+0.0+0.0+0.0+0.0+0.0+0.0+0.0]] timeToRun.
This might be a better test for comparing the FP/Int performance ratio on
Pentium vs. ARM.
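
Another way to take the loop overhead out of the picture is to time an empty loop and subtract it from both measurements. This is only a rough sketch: the variable names are mine, the empty-block baseline is an assumption about what counts as overhead, and with a coarse millisecond clock the differences can be small or zero, so the iteration count may need raising.

```smalltalk
| n base intTime floatTime |
n := 100000.
"Baseline: cost of the loop and block activation alone"
base := [n timesRepeat: []] timeToRun.
intTime := [n timesRepeat: [0+0]] timeToRun.
floatTime := [n timesRepeat: [0.0+0.0]] timeToRun.
"Ratio of FP to integer time with the baseline subtracted;
 answers nil if the integer measurement is lost in timer noise"
(intTime - base) > 0
    ifTrue: [((floatTime - base) / (intTime - base)) asFloat]
    ifFalse: [nil]
```

Evaluate it with "print it" in a workspace. The number is only as trustworthy as the timer resolution allows, so repeat it a few times.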
Mike
"Ohshima, Yoshiki" <Yoshiki.Ohshima at disney.com> wrote:
> Hello,
>
> > > An XScale based PocketPC is coming out in a few weeks from
> > >Toshiba. I would assume that XScale at 400MHz (max) is
> > >faster than 300MHz Geode:-)
> >
> > Well, that depends.... If you mean the PXA210, then I
> > would say that the 300 MHz Geode definitely beats the 400
> > MHz XScale. The Geode is essentially a Pentium class CPU
> > with floating point and MMX. I'd say that it is probably
> > still faster than even a 400 MHz PXA250, especially for
> > running Squeak.
>
> Hmm. Interesting.
>
> > Squeak likes hardware floating point. Compare Squeak on a
> > 200 MHz Celeron against Squeak on an iPaq. I know for
> > sure that Squeak is much more responsive and the
> > benchmarks are better on my old Pentium 133 Sharp Widenote
> > than on my Casiopeia E-105 (a 131 MHz MIPS 3 CPU with no
> > floating point hardware).
>
> Do you think this is due to the floating point hardware?
> I've been thinking this is more because of the memory
> bandwidth.
>
> On my Pentium III 800MHz laptop, the ratio of results from the
> following two lines is around 2.4 (43ms vs. 103ms).
>
> [100000 timesRepeat: [0+0]] timeToRun.
> [100000 timesRepeat: [0.0+0.0]] timeToRun.
>
> On my iPAQ, the ratio is around 2.3. I think the primitive
> callout is so slow that the actual computation is pretty
> much shadowed by the other factor. The #+ primitive first
> tries the SmallInteger version and then falls back to the Float
> version. This would explain the factor of two difference.
>
> > I don't think Squeak would readily take advantage of the
> > dual multiply-accumulate pipelines or SIMD on the PXA250,
> > just like it doesn't really benefit from MMX.
>
> Yes. Some bitblt rules, such as rule 24, can be much
> faster if we bind them with the MMX (or Intel IPP stuff).
>
> -- Yoshiki