Code Generation (was VM improvement: speeding up ...)

Marcel Weiher marcel at metaobject.com
Mon Feb 14 09:05:29 UTC 2000


Andreas,

The difference between plain and optimized code generation is also
one that is constantly on my mind.  Like Andrew, I don't see how
we can possibly match the code generation proficiency of the really
good C compilers out there, with years of tweaking on
platform-independent as well as platform-dependent optimizations.

The only way it would make sense is if Squeak simply doesn't use the  
type of code that C compilers are good at optimizing.  The example  
you give sort of falls into that category, but does this have wider  
applicability?

To me, it looks like the sort of bottleneck function that might
always be profitable to hand-tune, or better yet, eliminate
altogether.  While it is almost always possible to hand-tune a single
particular function so it performs better than compiler output, that
seems to be entirely unrelated to comparing the efficiency of
two automatic code generators, one driven by Squeak and one driven by
native C compilers.

The binary for egcs/gcc is 2.5 MB all by its lonesome, and that's  
just one platform.  While I am sure that there is some bloat, that's  
a lot of code generation know-how (also shown in the sources).  Is  
all of that irrelevant?  It may be, I don't know, but I have a  
difficult time with the idea.

Marcel

> From: "Raab, Andreas" <Andreas.Raab at disney.com>
>
> > While intermediate code generation would be convenient in some ways,  
> > let us not forget that intermediate generation to C-code permits  
> > Squeak to leverage the quality of highly mature, highly optimizing  
> > compilers built painstakingly, platform by platform, over the past  
> > twenty or so years.  The difference between a dull and a highly  
> > optimizing compiler can be an order of magnitude in some cases -- and
> > the existence of so many high quality C compilers is a large reason
> > Squeak is so readily portable today.
>
> I agree on the point of portability but not necessarily on the point of
> efficiency. While C compilers do on average a relatively good job, there
> are certain cases where you really want to fine-tune the code. Here are
> two examples that are currently in use:
> * The number of inlined temps in interpret() matches *exactly* the number
>   of temps CodeWarrior can handle
> * Ian's 'gnuify' script gives the compiler very strong and accurate hints
>   about where to put certain variables
> You may ask how much of a difference this really makes. Well, how about
> this: I very recently switched to gcc for compiling the Windows VM.
> Applying Ian's little hints resulted in a 70% increase in byte code speed
> and a 25% increase in send speed. While this doesn't necessarily show up
> in macro benchmarks, it's certainly a point to keep in mind. I could
> imagine that the RTOS generator could do an average job on most parts
> (though this needs to be proven) and then one could go on and tweak the
> one percent that is really critical for speed.
