Switching to use foo struct on Windows VM

sig siguctua at gmail.com
Sun Jul 15 18:55:26 UTC 2007


On 15/07/07, Bert Freudenberg <bert at freudenbergs.de> wrote:
>
> On Jul 15, 2007, at 10:51 , John M McIntosh wrote:
>
> >
> > On Jul 14, 2007, at 7:45 PM, Andreas Raab wrote:
> >
> >> This result is quite surprising. When John originally introduced
> >> this option, x86 was significantly slower when compiling with than
> >> without it. As a matter of fact, given that probably some 90+% of
> >> all Squeak platforms are now x86 I was thinking about removing it
> >> altogether (after all, it's just a pointless memory dereferencing
> >> which is only advantageous on platforms that don't have direct
> >> addressing modes).
> >>

Whenever a method uses the foo struct, the generator places the
following line in the function:

   register struct foo * foo = &fum;

and then uses foo->bar everywhere.
So the difference in the compiled code between using the foo struct
and not using it is minimal:

   mov reg, [bar]   <- using globals
   mov reg, [foo + bar_offset]  <- with foo

Of course, this depends on how well GCC optimizes the code, but in the
optimal case the difference between loading a value through a direct
pointer and through base+offset is just a few cycles. I don't think
this can cause a major speed degradation.
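
To make the comparison concrete, here is a minimal C sketch of the two
shapes of generated code. The variable and function names are made up
for illustration and are not taken from the real interp.c; only the
"register struct foo * foo = &fum;" line comes from the generator.

```c
/* Sketch only: names are illustrative, not the generated interp.c. */

/* Variant A: a plain global, as emitted by the simple CCodeGenerator. */
static int stackPointerGlobal = 0;

int bumpWithGlobals(void) {
    stackPointerGlobal += 1;          /* typically an absolute-address access */
    return stackPointerGlobal;
}

/* Variant B: globals gathered into one struct (the "foo" build). */
struct foo {
    int stackPointer;
    int instructionPointer;
};
static struct foo fum = { 0, 0 };

int bumpWithFoo(void) {
    register struct foo *foo = &fum;  /* line placed in each function */
    foo->stackPointer += 1;           /* typically base + constant offset */
    return foo->stackPointer;
}
```

Both functions do the same work; which instructions the compiler
actually picks depends on the compiler version and flags, so the
comments above describe the expected case, not a guarantee.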

The only platform that uses another level of indirection is RiscOS
(which passes 'globalStructDefined: false' to
CCodeGeneratorGlobalStructure).
With globalStructDefined: false, the generator does not emit the line
(register struct foo * foo = &fum;) in each function and uses foo
directly. It seems that 'foo' is declared somewhere in the platform
code, because CCodeGeneratorGlobalStructure omits the declaration of
foo when globalStructDefined: false.
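
A rough C sketch of the two cases, under the assumption that the
platform code supplies 'foo' as an ordinary pointer variable. The
field and function names, and 'platformStorage', are hypothetical and
only stand in for whatever the real platform code does.

```c
/* Sketch only: field and function names are hypothetical. */
struct foo {
    int successFlag;
};

/* Case 1: globalStructDefined: true.
   The generated file defines fum and each function aliases it locally. */
static struct foo fum;

int setSuccessWithLocalAlias(int value) {
    register struct foo *foo = &fum;  /* emitted in each function */
    foo->successFlag = value;         /* address of fum is known statically */
    return fum.successFlag;
}

/* Case 2: globalStructDefined: false.
   No per-function line; foo is a pointer provided elsewhere. Here it is
   defined in the same file only to keep the sketch self-contained. */
static struct foo platformStorage;
struct foo *foo = &platformStorage;

int setSuccessViaPlatformPointer(int value) {
    foo->successFlag = value;         /* load the pointer, then the field */
    return platformStorage.successFlag;
}
```

In the second case the pointer itself has to be read from memory
before the field can be reached, which is the extra level of
indirection mentioned above.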


> >>> Please let me know if my patch is acceptable; the way I
> >>> implement the VM pointers table depends on it. :)
> >>
> >> To be blunt, there are two things I don't like about it: First, it
> >> introduces the need for another dereferencing in an already
> >> register-deprived model. Second, anything containing "struct foo
> >> fum" is immediately on my list of things I never want to see in my
> >> code. Changing these names to something sensible would make it a
> >> lot easier to convince me about the changes.
> >
> > Ah, well the history why it was Foo was because I had discovered
> > that under PPC the usage of a structure would remove one
> > instruction for each read or write to a VM memory location. This
> > made a significant change to the performance of the PowerPC VM, if
> > you run 1/3 less instructions you get more work done. I set out one
> > weekend to alter the VM and named the structure Foo as a joke, and
> > then dug deep into SLang to figure out how to change it so that
> > references to global variables would refer to the Foo structure
> > because I really didn't think I was going to be able to change it.
> > However I was successful and left it named Foo as a reminder how
> > > well built slang was, oddly no one complained until tonight (took
> > years I note).  Also of course I had to make it so that you could
> > build the VM with or without the feature because as Andreas pointed
> > out it did not produce good assembler on the Intel Platform, so
> > > getting all that to work was non-trivial.
> >
> > Lurking in here also was some comments from people wanting to build
> > VMs for some special purpose CPUS where they would hang all the
> > globals off a single structure pointed to by a register versus
> > having 1000 separate globals, plus a thought about making a VM with
> > multiple VM threads that would only require a register switch to
> > change squeak VM processes.
> >
> > Other notes.
> >
> > (a) Sometimes depending on the compiler version Arrays are, or are
> > not allocated into the structure because of  how the compiler feels
> > it should generate the code.  Sometimes it does insane things,
> > other times it removed one or two instructions for PowerPC
> > references. This behaviour is tied to the compiler version.
> > > Truthfully I've not checked this on macintel to see if it makes any
> > difference, likely not.
> >
> > > (b) The other few non-foo structure variables are variables
> > > initialized to constants, these could have been moved into foo and
> > > an initialization routine used to populate them, but work on that
> > > never happened. I guess if someone wants to change the foo name then
> > those few initialized variables should be dragged into the
> > structure for completeness as part of the cleanup.
> >
> >
> > A few years back I noticed Ian was compiling the Unix Intel VM with
> > the foo structure and I asked him why? Since I had earlier noted
> > the intel performance degradation. I think Ian said he had checked
> > and there was no longer an issue and there was no harm in compiling
> > with foo for the intel platform.  I believe now what happens is
> > because it's declared as struct foo * foo = &fum; you just end up
> > with a reference into the dynamic storage area for the VM with the
> > precomputed offset being the location of the fum and the variable
> > offset. Earlier compilers I guess would first reference the storage
> > area to the pointer, then reference the variable into the structure
> > which gave the poor performance values.
> >
> > Because PowerPC is not yet dead, don't all the game consoles use
> > it? It would not be wise to abandon this feature because today all
> > mainstream platforms are Intel based register-deprived solutions,
> > someday that might change.
> > Well that and PowerPC based macintosh machines likely will still be
> > around for 5 to 7  more years given the historical longevity of
> > macintosh hardware.
> >
> >
> >> However, I can probably fix up the support code so that it's
> >> possible to compile a "struct foo VM", which I presume is your
> >> main need. Although, given that a "struct foo VM" will compile
> >> trivially without the indirection, it may be easier for you to
> >> compile Unix and Mac VMs without the extra indirection.
> >
What I would like to see is the sources unified across the different
platforms.
The situation is simple: I made modifications to the VM and everything
works fine, but only on the Win32 platform, because I was not aware
that others are using the foo struct.
Well, I can make things work regardless of whether CCodeGenerator uses
the foo struct or not.
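
One possible way to keep support code independent of that choice is to
reach VM variables only through accessor functions. The sketch below is
an assumption for illustration, not my actual patch: USE_FOO_STRUCT,
VM_VAR and the fullScreenFlag accessors are invented names.

```c
/* Sketch only: USE_FOO_STRUCT, VM_VAR and these accessors are invented. */
#ifndef USE_FOO_STRUCT
#define USE_FOO_STRUCT 1              /* 1 = struct build, 0 = plain globals */
#endif

#if USE_FOO_STRUCT
struct foo {
    int fullScreenFlag;
};
static struct foo fum;
#define VM_VAR(name) (fum.name)       /* variable lives inside the struct */
#else
static int fullScreenFlag;
#define VM_VAR(name) (name)           /* variable is an ordinary global */
#endif

/* Accessors: the only place that knows where the variable is stored. */
int getFullScreenFlag(void) {
    return VM_VAR(fullScreenFlag);
}

void setFullScreenFlag(int value) {
    VM_VAR(fullScreenFlag) = value;
}

/* Platform support code: never names foo, fum or the global directly. */
int toggleFullScreen(void) {
    setFullScreenFlag(!getFullScreenFlag());
    return getFullScreenFlag();
}
```

With this layout toggleFullScreen() compiles unchanged whether
USE_FOO_STRUCT is 0 or 1; only the accessor bodies see the difference.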

> >
> > A few years back I changed all the mac support code to avoid
> > referring to foo or fum or interp.c globals directly and use the vm
> > supplied accessors via the interpreterProxy or via interp.c
> > accessor routine.
>
> Wonder how that would affect the AMD Geode, which is a not-so-modern
> x86 processor, but still quite important for Squeak. Once we get a
> Geode LX we need to seriously measure performance ... what magic bit
> do I need to flip to disable/enable foo fum?
>
See the overridden method #createCodeGenerator:
to use foo, it uses CCodeGeneratorGlobalStructure;
to use globals, the plain CCodeGenerator.

I don't think that switching back to globals will introduce problems
in the generated code that prevent it from building. Even if it does,
the code will require only a few fixes.

> - Bert -
>
>
>


