Programming in Bytecode?

Ian Piumarta ian.piumarta at inria.fr
Thu Aug 1 01:03:00 UTC 2002


On Wed, 31 Jul 2002, Swan, Dean wrote:

> 	Keep in mind that the Squeak VM bytecode set is fairly simple
> compared to some "modern" CPUs.  x86 assembly can be a little tricky
> to generate code from because of various modifier prefix bytes and
> relative branches that can't be computed until the code in between
> the branch and the branch target is generated, and variable sized
> op-codes depending on addressing mode, etc.

Same problem in Smalltalk bytecode.  Send, Ld and [Pop]St bytecodes come
in short and long forms, the short form being used when the
variable/selector index is small.  Jumps too, where the range in forward
conditional jumps is very limited (one occasionally sees an inverted short
conditional jump over a long unconditional jump that goes to the real
target).  Whether you'd want the assembler to fix range problems like this
for you automatically is a good question (and if so then it should
probably at least tell you what it's doing on your behalf).
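
A rough sketch of that form selection, in Python.  Everything here is
an assumption made for illustration: the limits SHORT_MAX and LONG_MAX,
the mnemonic names and the 2-byte long jump are invented, not taken
from the actual bytecode set.  The function returns the instructions it
chose plus a note, so the assembler can say what it did on your behalf.

```python
# Hypothetical limits, chosen for illustration only (not from the VM spec).
SHORT_MAX = 8       # assumed reach of a short forward jump, in bytes
LONG_MAX = 1023     # assumed reach of a long forward jump, in bytes
LONG_JUMP_SIZE = 2  # assumed encoded size of a long unconditional jump


def select_cond_jump(distance, long_cond_available=True):
    """Pick an encoding for a forward 'jump if false'.

    `distance` is measured from the end of the emitted sequence to the
    target.  Returns (instructions, note); note is None when the short
    form was enough.
    """
    if distance < 1:
        raise ValueError("only forward jumps are handled in this sketch")
    if distance <= SHORT_MAX:
        return [("jumpFalseShort", distance)], None
    if distance > LONG_MAX:
        raise ValueError("target out of range even for a long jump")
    if long_cond_available:
        return [("jumpFalseLong", distance)], "promoted to long form"
    # Fallback: an inverted short conditional that skips over a long
    # unconditional jump going to the real target.
    return (
        [("jumpTrueShort", LONG_JUMP_SIZE), ("jumpLong", distance)],
        "rewritten as inverted short jump over long jump",
    )
```

Returning the note separately leaves it up to the caller whether the
rewrite is silent, a warning, or an error.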

> 	It's not uncommon to find x86 code padded with a lot of NOPs
> that weren't in the original source.

Cowardly assembler! ;-)

The real problem is that the solution in general might require N
iterations for a piece of code with N branches -- if promoting one of them
from short to long causes the next (or, much more problematically, a
preceding) one to become long too.
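
To make the iteration concrete, here is a minimal sketch in Python of
one common way to do it: start with every jump short, lay the code out,
promote whatever is out of reach, and repeat until nothing changes.
The sizes and reach (SHORT_SIZE, LONG_SIZE, SHORT_REACH) and the item
format are assumptions for illustration only.  Because jumps are only
ever promoted and never demoted, each pass except the last promotes at
least one jump, so N jumps need at most N+1 passes.

```python
SHORT_SIZE, LONG_SIZE = 1, 2   # assumed encoded sizes in bytes
SHORT_REACH = 8                # assumed maximum short forward displacement


def relax(items):
    """items: list of ("op", size), ("label", name) or ("jump", label).

    Returns (indices of jumps that need the long form, passes used).
    """
    long_jumps = set()
    passes = 0
    while True:
        passes += 1
        # Lay out the code using the current choice of forms.
        addr, start, labels = 0, [], {}
        for i, (kind, arg) in enumerate(items):
            start.append(addr)
            if kind == "label":
                labels[arg] = addr
            elif kind == "op":
                addr += arg
            else:
                addr += LONG_SIZE if i in long_jumps else SHORT_SIZE
        # Promote every short jump whose target is out of reach.
        changed = False
        for i, (kind, arg) in enumerate(items):
            if kind == "jump" and i not in long_jumps:
                disp = labels[arg] - (start[i] + SHORT_SIZE)
                if not 0 <= disp <= SHORT_REACH:
                    long_jumps.add(i)
                    changed = True
        if not changed:
            return long_jumps, passes


# Promoting the second jump pushes label A out of reach of the first.
code = [
    ("jump", "A"), ("op", 7), ("jump", "B"),
    ("label", "A"), ("op", 9), ("label", "B"),
]
print(relax(code))   # -> ({0, 2}, 3)
```

In the example the first pass promotes only the jump to B; that extra
byte moves A from displacement 8 to 9, so the second pass promotes the
preceding jump as well, and the third pass confirms the layout.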

This is a weak form of what I suppose one might call an "oscillating
constraints" problem.  (Hands up anyone who has ever seen [La]TeX get
stuck in an endless "refs might have changed" loop because of a page or
section number at the bottom of a page that keeps moving to/from the next
page -- and in doing so immediately changes the page/section number to
which it refers? ;)

In Smalltalk you've also got a bigger problem: for very long methods the
range of even unconditional branches might not be enough, at which point
you'd have to start thinking about chaining them together.  Should the
assembler do this for you too?  Are there any simple heuristics that can
pick the best places to insert the intermediate jumps??  If the chained
jump comes from a jump over an inverted conditional jump, should the
original condition be restored and its target changed to refer to the
(suitably repositioned) intermediate unconditional jump???  Etc...

It might be interesting to make it a macro assembler too.  This could take
the pain out of "to:do:" loops and similar.
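
As a sketch of what such a macro might expand to (Python again; the
pseudo-instruction names, the tuple format and the label scheme are all
invented for illustration, and this is not how the Squeak compiler
itself emits "to:do:"):

```python
import itertools

_label_ids = itertools.count(1)   # source of unique label suffixes


def to_do(var, start, stop, body):
    """Expand a counted loop into a flat list of pseudo-instructions.

    `body` is a list of pseudo-instructions spliced in unchanged.
    """
    n = next(_label_ids)
    top, done = f"loop{n}", f"done{n}"
    return [
        ("push", start), ("popStore", var),      # var := start
        ("label", top),
        ("push", var), ("push", stop), ("send", "<="),
        ("jumpFalse", done),                     # exit when var > stop
        *body,
        ("push", var), ("push", 1), ("send", "+"), ("popStore", var),
        ("jump", top),                           # backward jump to the test
        ("label", done),
    ]


loop = to_do("i", 1, 10, [("push", "i"), ("send", "foo"), ("pop", None)])
```

The macro emits symbolic labels only; choosing short or long jump forms
is left to whatever pass resolves them.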

> [NOPs...] Sometimes they're there to pad the execution pipeline, and
> sometimes it's just because the code generator allocates a hole large
> enough for the largest form of an instruction to simplify the branch
> target calculations.

We must be talking about M$oft assemblers here. ;^p)

The usual cause of "padding" in (capable) assemblers (other than
prefetch alignment, as you point out) is dynamic linking, where the
size of call sites isn't known until link time.  (On modern systems the
distinction between assembler and linker is often quite blurred -- e.g.,
the linker typically takes over most if not all of the label resolution
work on behalf of the assembler, and on some ABIs [AIX and SysV/PPC spring
to mind] they are responsible for inserting extra glue code at call sites
to cope with cross-compilation-unit calls.  I know this has nothing
whatsoever to do with Smalltalk...)

> 	It is true with many seemingly simple things that there is
> more to it than is readily apparent to a "casual observer".  I'd
> say assemblers can fall in this category.

I couldn't agree more.  For something that seems like a fairly
straightforward program to write, creating an assembler that generates
good code (where multiple equivalent insn forms are available) quickly
(i.e., in a single pass, without wasting time resolving forward labels
before starting to generate code) is an interesting problem.

> Due to the "peculiarities"
> of some instruction sets, there are even such things as "optimizing
> assemblers"!

Assemblers for the Alpha (or was it MIPS?  It's been a *looong* time...)
make _register allocation_ decisions.

Ian




