Hi Dan and all,
Sorry for this long response, but I'll try to respond in some detail to some of the questions you posed, and offer a few suggestions, since I think it would be great if Squeak took off and could take advantage of what we've done.
-----Original Message-----
From: Dan Ingalls [mailto:Dan@SqueakLand.org]
Sent: Tuesday, July 16, 2002 7:49 PM
To: squeak-dev@lists.squeakfoundation.org
Cc: David Griswold
Subject: Re: Animorphic ST (Strongtalk) released!
<snip>
The second component is a sort of BOBW (best of both worlds) notion that, with the cool aspects of Squeak and its numerous multimedia facilities together with what is arguably the fastest execution engine going, we could at least have a lot of fun.
That would be extremely cool. Many of the things that the Squeak UI does well are things that I never had the time to work on in the Strongtalk UI. One thing that I would urge you to do, however, is to allow the Squeak UI at least to *appear* to integrate better with host window systems, which I think would help Squeak acceptance quite a bit. For example, breaking out of the world-in-a-window approach so that Squeak windows and host windows were at the same level. I know it makes portability harder, but I think it would help overcome a significant acceptance barrier.
The third component has to do with applying the Squeak philosophy to what otherwise appears to me as a daunting project. The Animorphic VM is (I would suggest) a programming tour de force. I have always been paralyzed when considering such projects (and this goes for, e.g., the SELF compiler, too), by the thought that I would burn out simply dealing with so much complexity all in C or worse (if you can imagine that ;-). The whole idea behind Squeak (well, not the whole of it, but the implementation approach) was that we could write it in the language we already knew, and it would be easy to understand and test. I don't see why we couldn't do the same thing for an engine similar to the Animorphic VM (Ian and I have also talked about doing the same kind of thing for Jitter).
When the Strongtalk VM development started, there were a number of discussions about bootstrapping the VM in Smalltalk. We all of course thought it would be very cool to do, but there were a number of reasons we didn't attempt it. We were trying to build a commercial product with a limited amount of startup capital, and it seemed too risky, since there are tight performance constraints on the compiler speed to avoid noticeable pauses, in addition to the obvious problems of trying to avoid infinitely recursive compilation. We also modelled our architecture on the Self VM, which was already in C++. The VM architecture is quite object-oriented; it is not at all like old-fashioned C-style VMs (for example, the collector asks VM objects to enumerate their internal object references, etc., so that it is easy to add new VM data types or change their formats). As you suggest, an OO architecture like this is critical to being able to do it at all; I know that Lars Bak thinks that it would have been practically impossible to do the Animorphic VM in a non-OO language.
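To make the enumeration idea concrete, here is a minimal C++ sketch of that style of collector interface. All the names here are made up for illustration; this is the general shape of such a design, not the actual Animorphic (or Self) classes:

```cpp
// Hypothetical sketch: each VM data type knows its own reference slots,
// so the collector never needs a hardwired table of object layouts.
struct Oop;  // stands in for a heap reference

struct OopClosure {                 // the collector's callback
  virtual void do_oop(Oop** slot) = 0;
  virtual ~OopClosure() {}
};

struct VMObject {                   // every VM data type implements this
  virtual void oops_do(OopClosure* f) = 0;
  virtual ~VMObject() {}
};

struct MethodDesc : VMObject {      // an example VM type with two slots
  Oop* selector;
  Oop* literals;
  void oops_do(OopClosure* f) override {
    f->do_oop(&selector);           // the object enumerates its own slots,
    f->do_oop(&literals);           // so adding a new VM type or changing a
  }                                 // format never touches the collector
};

struct CountingClosure : OopClosure {  // e.g. a marking or counting pass
  int count = 0;
  void do_oop(Oop**) override { ++count; }
};
```

A marker, scavenger, or verifier is then just another OopClosure, which is why new VM data types are cheap to add in this architecture.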
I have always thought that the fact that Squeak has a solution to the bootstrapping problem might offer a nice way to do it again in Smalltalk without the risks we tried to avoid. The question then becomes how much of a productivity advantage you would have coding in Smalltalk over the style in which the Animorphic VM was written. I haven't had time yet to look in detail at the Squeak code to see how the bootstrapping translation details work, so I'll ask some questions I think might help illuminate some of the issues and advantages associated with doing it the Squeak way. Of course, one obvious advantage is being able to debug and test in Smalltalk. Other important issues to think about:
1) How fast is the translated C code? The compiler has to be quite fast. Is dispatch as fast as a decent inline-cached dispatch? How much overhead is there for checking/handling the SmallInteger case? How much code space blowup is there in the translated C code (an inlining/deoptimizing compiler is a lot bigger than an interpreter).
2) How much of the Smalltalk coolness is usable in the translatable subset? Are integers still objects? Is there support for some kind of block closure (even if only downward-passed ones) so that control structures other than the hardwired ones like #ifTrue:ifFalse:, #whileTrue: etc. can be used? What kind of GC support is available for VM internal data structures that can't be put in the target ObjectMemory, like dynamically allocated GC data structures (if any)? These are the things that would give the most significant advantages over writing the compiler in C++, since the Animorphic VM is already fairly OO.
3) How would compiler portability work? There would obviously have to be a more complex portability infrastructure, since the compiler has to generate native code dynamically.
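On question 1, the SmallInteger overhead comes down to a tag test on the fast path of every arithmetic send. A minimal C++ sketch, using one common low-bit tagging scheme (the constants and names are illustrative, not Squeak's or Strongtalk's actual encoding):

```cpp
#include <cstdint>

// Illustrative low-bit tagging: low bit 1 => SmallInteger.
typedef intptr_t oop;

const intptr_t TAG_MASK = 1;
const intptr_t INT_TAG  = 1;

inline bool is_small_int(oop x)  { return (x & TAG_MASK) == INT_TAG; }
inline oop tag_int(intptr_t v)   { return (v << 1) | INT_TAG; }
inline intptr_t untag_int(oop x) { return x >> 1; }

// The kind of fast path a translator or inline cache would emit for '+':
// two tag tests and an add, falling back to a full send otherwise.
oop add_fast(oop a, oop b, bool* took_fast_path) {
  if (is_small_int(a) && is_small_int(b)) {
    *took_fast_path = true;
    return tag_int(untag_int(a) + untag_int(b));  // overflow check omitted
  }
  *took_fast_path = false;
  return 0;  // a real VM would dispatch the full #+ send here
}
```

How this few-instruction check compares to a decent inline-cached dispatch, and how often the translated C code pays it, is exactly the kind of measurement I'd want before committing to the approach.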
Some questions are:
Would the system benefit from being cast into Strongtalk? and how much work would this be?
There are a bunch of interrelated issues here, which deserve more in-depth conversations. Here are a few scattered observations. Obviously I think any Smalltalk would benefit from both the type system and inlining compiler, since Strongtalk is my baby. But I think it's fairly clear that the bigger a piece of software is, the more a type system helps for browsing/programming/testing, and the compiler itself is obviously a big piece of code.
Having typed "Blue Book" classes I think helps a lot, but the Animorphic ones have quite a different structure than the Squeak ones, for two reasons: 1) typing requires a different hierarchy and interface structure, and 2) they use far more sends and block closures than other Smalltalk systems because I knew all the inlining would be taking place; they would be too slow as written on Squeak without an inlining compiler. However, I think that the chances of being able to graft them or something like them into Squeak are pretty good, if you are open to wholesale rewriting of a lot of those core classes and changes to code that subclasses them.
One idea to think about: the compiler and VM itself are clearly things that would benefit from running faster while being written in a high-level way, which is what the inlining compiler can help with. While that is not necessary, it would certainly ease some of the performance concerns of writing the compiler and VM in Smalltalk. To take advantage of that would require significant changes to the bootstrapping scheme, like using something like our inlining-database to drive a more restricted kind of static inlining run to generate a faster VM and compiler (however, you can't deoptimize in that scenario, which restricts some kinds of inlining, although not all).
This raises an interesting issue: how much benefit could be derived from a much simpler inlining compiler? I have always thought about a much simpler but more restricted inlining scheme that would inline constant sends and use inheritance 'customization' to make self-sends constant, and then you could do many of the same block-closure-elimination etc. optimizations on that code. The huge advantage of such a scheme would be that it eliminates the necessity for deoptimization and all of its attendant complexity, and that in turn would also make static translation or compilation of inlined code possible.
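The effect of customization can be illustrated in a toy C++ example (the classes here are invented for illustration; they stand in for compiled method copies, not any actual Strongtalk code). Compiling a separate copy of an inherited method per concrete receiver class turns the self-send inside it into a call with a constant, statically known target, which an optimizer can then inline:

```cpp
// Generic version: 'area' contains a self-send ('unit_area'), which must
// be dispatched dynamically because the receiver class is unknown.
struct Shape {
  virtual int unit_area() = 0;
  int area_generic(int n) { return n * unit_area(); }  // dynamic self-send
};

struct Square : Shape {
  int unit_area() override { return 4; }
  // 'Customized' copy compiled specifically for Square receivers: the
  // self-send target is now a constant (Square::unit_area), so the
  // optimizer can reduce the body to 'n * 4' with no dispatch at all.
  int area_customized(int n) { return n * Square::unit_area(); }
};
```

No deoptimization is ever needed, because a customized copy is only ever run on receivers of the class it was compiled for.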
This kind of scheme wouldn't inline all the things that Strongtalk can inline, but it might still help a lot. I did some measurements on the VisualWorks image long ago, and determined that statically, about 25% of 'real' sends were self sends (I never did dynamic measurements, though). I suspect my own coding style probably has a much higher fraction of self sends dynamically than that (since leaf sends for instvar accessors etc. are often self sends), and so I suspect that the Strongtalk libraries would run nicely under such a compiler (note that you could even do this kind of optimization on bytecodes, although I think it would require bytecode changes). The upshot is that this kind of optimization would probably tend to disproportionately speed up high-level code rather than hand-optimized code.
Would anyone care if it ran 10 times faster? and how much work would this be?
Yes, yes, yes. I think the Smalltalk community has always underestimated the extent to which performance matters in the real world (and the extent to which proponents of other languages use it as a weapon against Smalltalk). Of course it is true that first you need to write your algorithms correctly, and that *most* applications don't need ultimate speed. However, it can be very hard to predict ahead of time whether your application will need some critical algorithm to run at top speed, and once you are in that situation, it can be incredibly painful to have to write some algorithms in another language, especially if by chance they need to extensively access, create, or modify shared data structures. So this makes it a gamble: are you willing to *bet* your success on the hope that your application will never be compute bound?
And yes, making Smalltalk faster is a huge amount of work. But it is the kind of work that pays off practically forever, especially when you have an open-source system like Squeak that can survive independently of any person or organization. And maybe a scheme like I outlined above could help make it simpler (although a lot more thought would be needed).
Would it be fun to do?
If you like writing cool code ;-) Although building the Animorphic system was an incredibly long, hard, risky process, it was also incredibly exciting. I suspect that you know that feeling.
Cheers, Dave