Code Generation (was VM improvement: speeding up ...)
Marcel Weiher
marcel at metaobject.com
Mon Feb 14 09:05:29 UTC 2000
Andreas,
the difference between plain and optimized code generation is also
one that is constantly on my mind. Like Andrew, I don't see how
we can possibly match the code generation proficiency of the really
good C compilers out there, which have had years of tweaking of both
platform-independent and platform-dependent optimizations.
The only way it would make sense is if Squeak simply doesn't use the
type of code that C compilers are good at optimizing. The example
you give sort of falls into that category, but does this have wider
applicability?
To me, it looks like the sort of bottleneck function that might
always be profitable to hand-tune, or better yet, eliminate
altogether. While it is almost always possible to hand-tune a single
particular function so it performs better than compiler output, this
issue seems to be entirely unrelated to comparing the efficiency of
two automatic code generators, one driven by Squeak and one driven by
native C compilers.
The binary for egcs/gcc is 2.5 MB all by its lonesome, and that's
just one platform. While I am sure that there is some bloat, that's
a lot of code generation know-how (also shown in the sources). Is
all of that irrelevant? It may be, I don't know, but I have a
difficult time with the idea.
Marcel
> From: "Raab, Andreas" <Andreas.Raab at disney.com>
>
> > While intermediate code generation would be convenient in some ways,
> > let us not forget that intermediate generation to C-code permits
> > Squeak to leverage the quality of highly mature, highly optimizing
> > compilers built painstakingly, platform by platform, over the past
> > twenty or so years. The difference between a dull and a highly
> > optimizing compiler can be an order of magnitude in some cases -- and
> > the existence of so many high quality C compilers is a large reason
> > Squeak is so readily portable today.
>
> I agree on the point of portability but not necessarily on the point of
> efficiency. While C compilers do on the average a relatively good job there
> are certain cases where you really want to fine tune the code. Here are two
> examples that are currently in use:
> * The number of inlined temps in interpret() matches *exactly* the number of
>   temps CodeWarrior can handle
> * Ian's 'gnuify' script gives the compiler very strong and accurate hints
>   about where to put certain variables
> You may ask how much of a difference this really makes. Well, how about
> this: I very recently switched to gcc for compiling the Windows VM. Applying
> Ian's little hints resulted in a 70% increase in byte code speed and a 25%
> increase in send speed. While this doesn't necessarily show up in macro
> benchmarks it's certainly a point to keep in mind. I could imagine that the
> RTOS generator could do an average job on most parts (though this needs to
> be proven) and then one could go on and tweak the one percent that are
> really critical for speed.
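
For readers unfamiliar with the kind of hint discussed in the quoted
message, here is a minimal sketch in C of telling the compiler where to
put hot interpreter variables. Everything in it is an assumption made for
illustration: the three opcodes, the function name run_bytecodes, and the
fixed stack size are invented and are not taken from the Squeak VM or from
the 'gnuify' script, whose actual mechanism the message does not describe.
The sketch uses only the standard `register` storage class, which the
compiler is free to ignore. GCC additionally offers an extension that binds
a local variable to a named machine register; it is not shown here because
register names are platform-specific.

```c
/* Hypothetical three-opcode bytecode set, invented for this sketch. */
enum { OP_PUSH1 = 0, OP_ADD = 1, OP_HALT = 2 };

/*
 * Runs a tiny stack program and returns the value on top of the stack
 * when OP_HALT (or an unknown opcode) is reached.
 * Assumes a well-formed program: no bounds checks are made, so the
 * program must not push more than 15 values or pop below the base slot.
 */
long run_bytecodes(const unsigned char *code)
{
    long stack[16];

    /* `register` is only a hint that these variables are hot;
       the compiler may honor or ignore it. */
    register const unsigned char *ip = code;  /* instruction pointer */
    register long *sp = stack;                /* points at top of stack */

    *sp = 0;  /* base slot, returned if the program halts immediately */

    for (;;) {
        switch (*ip++) {
        case OP_PUSH1:          /* push the constant 1 */
            *++sp = 1;
            break;
        case OP_ADD:            /* replace the top two values by their sum */
            sp[-1] += sp[0];
            --sp;
            break;
        case OP_HALT:
        default:                /* stop on halt or on an unknown opcode */
            return *sp;
        }
    }
}
```

Calling run_bytecodes on the sequence PUSH1, PUSH1, ADD, HALT leaves the
sum of the two pushed constants on top of the stack. Whether such a hint
changes the generated code at all depends on the compiler and its
optimization settings, which is exactly why measurements like the ones
quoted above matter.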