[ANN] Exupery, yet another compiler project

Bryce Kampjes bryce at kampjes.demon.co.uk
Tue Apr 1 17:29:21 UTC 2003


Daniel Vainsencher writes:
 > Bryce Kampjes <bryce at kampjes.demon.co.uk> wrote:
 > > Currently I only target x86 on Linux. It shouldn't be hard to extend
 > > both to other operating systems and also to other CPUs.
 > 
 > Perfect on both counts - that's what I use :-)
 > <SNIP>
 > I'll tell you one thing that sounds to me like it might be both useful
 > and relatively easy to improve on - running the simulator. Since it's
 > actually Slang (no dynamic sends), it's already a subset of the full
 > Smalltalk problem (project 1.5?), and since it's currently either
 > compiled externally (clumsy) or interpreted as full Smalltalk
 > (relatively slow), you might make it more easily useful.
 > 
 > If the simulator was fast enough to comfortably run regular size images,
 > I guess people would use it more to learn about/debug problem images,
 > and study the VM. Oh, we need to fix it in the image, too, but Craig's
 > doing that...

Interesting. I'm not sure that using an alpha compiler is going to
help people learn about the VM. If something behaves strangely, which
part of the system do you blame?

The simulator is probably integer/symbol heavy, so it should look
much like the code my compiler already targets. Beyond that, special
handling of sends to self would help the simulated VM more than it
would normal code.


Andreas's idea about compiling floating point numerical code could
work early on. By adding floatAt: and floatAt:put: byte codes, it
should be easy to spot float reads and writes. Once the floats are
identified, it shouldn't be too hard to compile efficient code:
avoiding boxing and unboxing floats is similar to avoiding tagging
and untagging integers.

The only reason for the extra byte codes is to make the float reads
and writes easy for the compiler to spot. Ideally, type feedback
would allow normal sends to access floats, but that would mean
waiting until the system actually has type feedback.
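
As a rough illustration (my sketch, not Exupery code; I'm assuming a
FloatArray-like collection, and the floatAt: byte code is still
hypothetical), this is the kind of loop that pays for boxing today:

    sumFloats: floats
        "Sketch only: with ordinary sends each #+ answers a freshly
         allocated, boxed Float, so the loop allocates on every
         iteration. If the element reads compiled down to a floatAt:
         byte code, the sum could stay unboxed in a register for the
         whole loop."
        | sum |
        sum := 0.0.
        1 to: floats size do: [:i |
            sum := sum + (floats at: i)].
        ^sum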

Given that Andreas wants primitive semantics, I'm assuming he has a
lot of short methods, each of which does a small floating point
calculation. If the methods contained a long loop, the cost of
creating a context would be marginal. Optimizing for leaf methods
shouldn't be too hard.
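
For example (a hypothetical method of mine, assuming a Point-like
class with x and y instance variables, not Andreas's actual code), a
short leaf method like this spends a large share of its time building
and returning from its context rather than doing arithmetic:

    distanceTo: aPoint
        "Hypothetical short float method: a handful of arithmetic
         sends and no loop, so context creation is a large fraction
         of the total cost. A leaf method optimization could avoid
         building a full context here. Wrap the same arithmetic in a
         long loop and the context cost becomes marginal."
        ^((x - aPoint x) squared + (y - aPoint y) squared) sqrt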

Primitive semantics probably will not help much in a system with
type-feedback-based adaptive inlining, because such a system should
already have inlined all the inner-loop methods. The decision to add
leaf method optimization can be deferred until it is possible to
measure some real code.

AI Memo 421, "Fast Arithmetic in MacLisp" by Guy Steele, describes
an interesting example of using a few well-chosen language extensions
to provide essentially Fortran-style arithmetic in a high-level
language. Something similar could help in Squeak.

There are definitely some interesting problems in designing a system
that is both expressive and fast for numerical work.


None of these decisions needs to be made now. Each can be dealt with
once the compiler can compile more than a few basic methods. It is
worthwhile bouncing ideas around, though. I hadn't thought about the
VM simulator, and I had discarded numerical calculation on the
assumption that a few matrix primitives would be enough.


 > > I don't want to start trying to speed up sends until after
 > > I've got useful byte code speed ups.
 > Just curious - why? to partition it more clearly?

I'm trying to carve out the smallest useful project. Inlining
methods doesn't seem too bad, especially compared with using an alpha
compiler that could corrupt the image. The first people to use the
compiler in anger will need to have a very pressing performance
problem.

There are three different cases to handle: calling compiled code
from interpreted code, calling interpreted code from compiled code,
and calling compiled code from compiled code. All calls can go
through the current interpreter's message dispatch logic, so it is
possible to defer dealing with compiled code calling compiled code
without any loss of functionality.
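
Roughly, and with made-up selector names rather than the real
Interpreter code, the shared dispatch looks like this: every send
goes through the ordinary lookup, and only the last step chooses
between interpreting and entering compiled code.

    sendSelector: selector to: receiver
        "Sketch only, not the actual Interpreter code. Whichever side
         a send comes from, the method is found by the normal lookup;
         the branch at the end is the only place that knows whether a
         compiled version exists."
        | method |
        method := self lookupSelector: selector inClassOf: receiver.
        (self hasCompiledCodeFor: method)
            ifTrue: [self enterCompiledMethod: method]
            ifFalse: [self interpretMethod: method]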

Handling compiled-to-compiled calls at maximum speed will involve
adaptive compilation with type feedback, and there are a lot of
unknowns in getting there. While handling those calls earlier would
make a better prototype, it doesn't help in practice unless most of
the code is compilable. By delaying the decision, I can apply what I
learn along the way.

At this stage I don't want to decide when fast calls will become
important, or how much optimization they will need. That depends on
what the first uses are, and I cannot know that by myself.

Bryce


