Report from a novice VM h4x0r.
tim at sumeru.stanford.edu
Wed Mar 31 23:17:18 UTC 2004
A few quick points before I head out for a while:-
First, congratulations on getting up the nerve to play in the VM.
> Furthermore, making all integer values word-aligned (are they already?)
> will tremendously improve memory access times on many architectures.
I can't think of any non-aligned integers in use.
> 2. While the construct:
> | foo |
> foo _ self bar: baz
> foo = bat ifTrue: [Mars invade].
> will significantly help the inliner when bar is a non-trivial method, it
> becomes counterproductive when bar happens to be "integerAt:" (which it
> is in several places), which is actually a compiler macro.
How so? I can see it won't be much of a help, but how is it a problem?
> 3. Many class variables, and a few instance variables, are treated by the compiler
> as constants. Apparently this is a compiler hack which makes the C code
> more readable without affecting the binary. Unfortunately, this badly
> complicates the compiler class definition. Also, the C compiler does not
> follow the C idiom of making all constants ALLCAPS.
Never really noticed that idiom before. Besides, since _our_ idiom is
that class vars are capitalised I'd say we should leave them that way.
Basically we shouldn't look at the C code. Andreas' fiddle to use the
symbolic names simply makes it a bit less painful if we have to.
> My proposal is to edit the compiler so that it will detect degenerate
> methods (methods which only produce a constant value) and to treat these
> as constants.
Good idea. Go for it.
> 4. A great many computations involve extracting bit-fields. In many
> cases these bit fields include important scalar information such as
> "sizeBits". A useful improvement would be to make these either bytes or
> short-ints and then let the C compiler figure out how best to extract
> them from the words...
If you can think of a way to nicely express bitfield manipulation in
Smalltalk so that it can map to C bitfields, great. So long as it's
optional, since some cc/machines don't do too well and would need the
older macro type hacking.
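
To make the two styles concrete, here is a minimal C sketch of the same field read both ways. The layout (a 6-bit size field in bits 2..7 of a 32-bit word) and every name in it are made up for illustration; this is not the real object header format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only: a 6-bit size field stored in
   bits 2..7 of a 32-bit header word. Not the real Squeak object header. */
#define SIZE_SHIFT 2u
#define SIZE_MASK  0x3Fu

/* Macro-style extraction: explicit shift and mask, works with any C compiler. */
static uint32_t size_bits_macro(uint32_t header)
{
    return (header >> SIZE_SHIFT) & SIZE_MASK;
}

/* Macro-style update: clear the field, then OR in the new value. */
static uint32_t with_size_bits(uint32_t header, uint32_t size)
{
    assert(size <= SIZE_MASK);
    return (header & ~(SIZE_MASK << SIZE_SHIFT)) | (size << SIZE_SHIFT);
}

/* Bit-field style: the C compiler chooses the extraction code. How the
   fields are packed into the word is implementation-defined, so this struct
   is not guaranteed to match the macro layout bit for bit. */
struct header_fields {
    unsigned int tag  : 2;
    unsigned int size : 6;
    unsigned int rest : 24;
};

static unsigned int size_bits_field(struct header_fields h)
{
    return h.size;
}
```

The implementation-defined packing is the reason the bit-field form would have to stay optional: the shift-and-mask form gives the same answer on every compiler.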
> 5. I started experimenting with replacing long-coded block moves with
> calls to C's "memmove" and "memcpy" where appropriate. The motivation
> is that the 386 has an ungodly fast "rep movsb" compound instruction
> which does the operation at the speed of the FSB. It would be even
> better to use an inline assembly command for this but to remain portable
> I started trying to insert the C library calls mentioned.
Inline assembler and portable in one sentence? The only way is "inline
assembler is not portable". And don't forget that x86 is not the only
architecture and not all optimisations apply even across all x86 chips.
You'd probably be horrified at how much it can take to cope with all
> There are actually a number of DIFFERENT implementations of block memory
> copy in the VM source. One example in the Interpreter is called by a
> function which takes a byte count then divides it by 4 to make a word
> count for the method call. When it is processed by the method it is
> multiplied again to make the byte count used to determine the stop
> pointer for the For loop!
Hmm, yuck. Should be improved...
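
As a rough sketch of the difference, assuming nothing about the actual VM routines (both function names below are invented for illustration), a hand-coded word loop next to a library call that takes the byte count directly might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hand-coded loop of the kind described above: the caller passes a word
   count and the loop works out its own stop pointer. Hypothetical name. */
static void copy_words_loop(unsigned int *dst, const unsigned int *src,
                            size_t n_words)
{
    const unsigned int *stop = src + n_words;
    while (src < stop) {
        *dst++ = *src++;
    }
}

/* Library version: one byte count, passed straight through. memmove is used
   rather than memcpy so that overlapping ranges are handled correctly. */
static void copy_bytes_lib(void *dst, const void *src, size_t n_bytes)
{
    assert(dst != NULL && src != NULL);
    memmove(dst, src, n_bytes);
}
```

Whether the library call is actually faster depends on the platform's C library and the chip, so it would need measuring rather than assuming.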
> 7. The compiler emits many unnecessary gotos.
Part of the not-terribly neat inlining we do. I have an untested theory
that it would be better for us to NOT textually inline but mark the
code with _inline_ and see what CC will do for us.
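
A minimal sketch of what that could look like in the generated C, assuming a compiler that accepts the C99 "static inline" spelling (older compilers use their own keywords). The tagging scheme and function names are hypothetical, not taken from the VM source.

```c
#include <assert.h>

/* Hypothetical tag test, for illustration: low bit set means "small integer". */
static inline int is_small_int(long oop)
{
    return (int)(oop & 1);
}

/* Hypothetical untagging helper; the precondition is checked with assert. */
static inline long small_int_value(long oop)
{
    assert(is_small_int(oop));
    return (oop - 1) / 2;
}

/* The call sites stay as ordinary calls. The C compiler decides whether to
   expand them, so the generated source needs no gotos or hoisted temporaries
   for the inlined bodies. */
static long add_small_ints(long a, long b)
{
    return small_int_value(a) + small_int_value(b);
}
```

The catch is the one the theory leaves untested: "inline" is only a hint, and what a given CC does with it varies.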
> 8. The compiler will compile:
> foo _ foo + 1.
> as foo += 1,
> Which isn't bad code but on a non-optimizing C compiler this may turn
> into an add[immediate] opcode instead of the shorter inc x opcode.
Again, very architecture dependent. And the idiom works for other
values than 1.
> 10. I came across an opcode in the interpreter with a comment that
> stated it was supposed to be obsolete after 2.6!!!
I think I've removed that for 3.7
> 11. It might be profitable to use parts of the internal smalltalk
> compiler in the cCode generator (if it doesn't already.)
Take a look at how TMethod etc are created. We use the normal compiler
to build a compiled method and convert it.
> 13. The add opcode seems to attempt the add without changing the stack
> pointer and change it only after succeeding. The logic operations (and
> some others) will change the stack pointer 3 times on success or
> failure! (doing two pops followed by a push)
Not so bad; look at the actual implementation of pop:thenPush: for
example and recall that our SP is _not_ the C sp. We don't actually
push and pop here.
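
To illustrate the point, here is a small C sketch of a pop-then-push done as one adjustment and one store. It assumes a simplified interpreter stack kept in an ordinary array; the names and layout are hypothetical, not the generated VM code.

```c
#include <assert.h>

#define STACK_SLOTS 16

/* Hypothetical interpreter stack: an ordinary array plus an index of the top
   slot. This is VM state in memory, not the hardware stack pointer. */
static long stack[STACK_SLOTS];
static int  stack_top = -1;   /* index of the top slot; -1 means empty */

static void push(long value)
{
    assert(stack_top + 1 < STACK_SLOTS);
    stack[++stack_top] = value;
}

/* Pop n items and push one result: a single index adjustment and one store,
   rather than n separate pops followed by a push. */
static void pop_then_push(int n, long value)
{
    assert(n >= 1 && n <= stack_top + 1);
    stack_top -= n - 1;
    stack[stack_top] = value;
}
```

Popping two operands and pushing one result therefore moves the index once, by one slot.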
> 14. Many of the opcodes that access the stack without pushing will use
> the method "stackValue: 0" instead of the cleaner "stackTop". This
> probably won't affect the binary but it adds a lot of constant arithmetic
> to the generated C code. It also indicates that the cCompiler relies on
> the native compiler too much to optimize out constant arithmetic...
I've cleaned out at least some of these recently.
> 16. C99 adds block contexts. The current cCompiler will take all inlined
> methods as well as loop variables and put them at the beginning of the
> generated function. Since any current C and even older C++ compilers
> will support for (int i = 0; i < x; i++), it would be much better form to
> use this instead of heaping all variables at the beginning of the block.
If we could really rely on having C99 compilers at all times this might
> 17. The last major revision to the interpreter was 5 years ago. =\
Does a big change to the primitive calling count? If so, there is one
going on right now:-)
Tim Rowledge, tim at sumeru.stanford.edu, http://sumeru.stanford.edu/tim
Useful Latin Phrases:- Braccae illae virides cum subucula rosea et tunica Caledonia-quam elenganter concinnatur! = Those green pants go so well with that pink shirt and the plaid jacket!