Awe and horror.
John M McIntosh
johnmci at smalltalkconsulting.com
Tue Nov 1 20:36:53 UTC 2005
On 1-Nov-05, at 12:41 PM, Alan Grimes wrote:
> Instead, class variables are optimized as constants, and while
> instance variables are put in a structure
The instance variables are put into a structure because on powerpc, if
they are plain static or non-static variables at file scope, it takes
an extra memory load to dereference the data storage pointer in order
to load or store the variable. Using a structure avoids that extra
load, which is why the structure is there. Testing on Intel-based and
68K machines showed there was no impact, so along the way we made it
the default, although I think you can choose to turn it off.
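A minimal sketch of the structure approach, for illustration only. The
field names and the run_steps/reset_state helpers are hypothetical and
not taken from the generated VM source; only the foo/fum naming follows
the example in this message. The idea is that the base address of the
structure is obtained once and every variable is then reached as an
offset from it, instead of each file-scope variable needing its own
address lookup.

```c
#include <assert.h>

/* Hypothetical stand-ins for interpreter "instance variables". */
struct foo {
    long stackPointer;
    long instructionPointer;
    long successFlag;
};

/* One block of storage for all of them (the name fum is illustrative). */
struct foo fum;

/* Put the structure back into a known starting state. */
void reset_state(void)
{
    fum.stackPointer = 0;
    fum.instructionPointer = 0;
    fum.successFlag = 0;
}

/* Every access goes through one pointer, so the compiler only has to
   materialize the base address once and can keep it in a register.
   The register keyword is just a hint; the compiler may ignore it. */
long run_steps(long n)
{
    register struct foo *foo = &fum;
    long i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        foo->stackPointer += 1;
        foo->instructionPointer += 2;
    }
    foo->successFlag = 1;
    return foo->stackPointer + foo->instructionPointer;
}
```

Calling reset_state() and then run_steps(3) leaves stackPointer at 3 and
instructionPointer at 6. Whether this actually saves a load depends on
the target ABI and compiler, so it is worth checking the generated
assembly on your own platform.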
a) Usage of the
register struct foo * foo = &fum;
ensures that on powerpc the foo pointer gets into a register if and
only if two or more references are made to the structure.
b) Some variables are not in the structure because they require
initialization; this could be changed by having a method
that actually does the initialization.
c) Over the years sometimes arrays have gone into or out of the
structure on powerpc based on compiler behaviour.
d) Technically, on register-happy machines you could tell GCC to let
register 42 contain the foo pointer, provided of course that
all plugins are happy with that rule.
e) Inlining has a modification so that if an instance variable is
used in multiple routines that are then folded into
a single routine, that variable is consolidated into a locally
scoped variable. The main user of this logic is the
GC logic, where variables are shared between different methods, making
it easy to write the algorithms, but all those methods are folded
into a single C procedure.
This change made a significant improvement in GC performance on
register-happy machines.
f) The interpreter case loop has logic to scope local variable usage
to a particular case statement, versus scoping to the entire C
procedure.
By scoping to individual case statements, most compilers are much
happier to do register optimizations.
g) Lastly, gnuifying alters the case statement to use jump tables, which
is much more efficient.
h) Using the C++ inline keyword and not inlining the VM has in the past
produced lousy performance.
i) The inliner uses some rules to decide if small routines can be
inlined; otherwise it follows the hint from self inline: boolean,
provided the routine could be inlined and does not fail some rule
other than length. In the past there was a patch I did to say yes,
do the inline anyway for this procedure; not sure
if that is still in vmmaker.
j) Compiler optimizations can do ugly things with common code
elimination and the like, such as dragging part of the common send
logic into many of the individual bytecode cases. Testing across
(many) gcc versions will show you which is the best compiler for
your platform.
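To illustrate point f), here is a minimal sketch of a bytecode loop
where each temporary is declared inside the braces of the case that
uses it rather than at the top of the procedure. The opcodes, the
fixed-size stack and the interpret function are all hypothetical and
much simpler than the real interpreter; this is plain switch dispatch
and does not show the gnuified jump-table form.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_SIZE 16

long interpret(const long *code, size_t length)
{
    long stack[STACK_SIZE];
    size_t sp = 0;   /* number of values on the stack */
    size_t pc = 0;   /* index of the next opcode */

    while (pc < length) {
        switch (code[pc++]) {
        case OP_PUSH: {
            /* value only lives inside this case */
            long value;
            assert(pc < length && sp < STACK_SIZE);
            value = code[pc++];
            stack[sp++] = value;
            break;
        }
        case OP_ADD: {
            /* rcvr and arg are scoped to this case, not to the
               whole procedure */
            long rcvr, arg;
            assert(sp >= 2);
            arg = stack[--sp];
            rcvr = stack[--sp];
            stack[sp++] = rcvr + arg;
            break;
        }
        case OP_MUL: {
            long rcvr, arg;
            assert(sp >= 2);
            arg = stack[--sp];
            rcvr = stack[--sp];
            stack[sp++] = rcvr * arg;
            break;
        }
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
    /* Ran off the end without OP_HALT: return top of stack, if any. */
    return sp > 0 ? stack[sp - 1] : 0;
}
```

Because the lifetime of each temporary ends with its case, the compiler
does not need to treat it as live across the whole switch, which is the
property the register allocator benefits from. How much it helps in
practice varies by compiler, as noted in j).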
--
===========================================================================
John M. McIntosh <johnmci at smalltalkconsulting.com> 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================