Hi Dan and all,
Sorry for this long response, but I'll try to respond in some detail to some of the questions you posed, and offer a few suggestions, since I think it would be great if Squeak took off and could take advantage of what we've done.
-----Original Message-----
From: Dan Ingalls [mailto:Dan@SqueakLand.org]
Sent: Tuesday, July 16, 2002 7:49 PM
To: squeak-dev@lists.squeakfoundation.org
Cc: David Griswold
Subject: Re: Animorphic ST (Strongtalk) released!
<snip>
The second component is a sort of BOBW (best of both worlds) notion that, with the cool aspects of Squeak and its numerous multimedia facilities together with what is arguably the fastest execution engine going, we could at least have a lot of fun.
That would be extremely cool. Many of the things that the Squeak UI does well are things that I never had the time to work on in the Strongtalk UI. One thing that I would urge you to do, however, is to allow the Squeak UI at least to *appear* to integrate better with host window systems, which I think would help Squeak acceptance quite a bit. For example, breaking out of the world-in-a-window approach so that Squeak windows and host windows were at the same level. I know it makes portability harder, but I think it would help overcome a significant acceptance barrier.
The third component has to do with applying the Squeak philosophy to what otherwise appears to me as a daunting project. The Animorphic VM is (I would suggest) a programming tour de force. I have always been paralyzed when considering such projects (and this goes for, e.g., the SELF compiler, too), by the thought that I would burn out simply dealing with so much complexity all in C or worse (if you can imagine that ;-). The whole idea behind Squeak (well, not the whole of it, but the implementation approach) was that we could write it in the language we already knew, and it would be easy to understand and test. I don't see why we couldn't do the same thing for an engine similar to the Animorphic VM (Ian and I have also talked about doing the same kind of thing for Jitter).
When the Strongtalk VM development started, there were a number of discussions about bootstrapping the VM in Smalltalk. We all of course thought it would be very cool to do, but there were a number of reasons we didn't attempt it. We were trying to build a commercial product with a limited amount of startup capital, and it seemed too risky, since there are tight performance constraints on the compiler speed to avoid noticeable pauses, in addition to the obvious problems of trying to avoid infinitely recursive compilation. We also modelled our architecture on the Self VM, which was already in C++. The VM architecture is quite object-oriented; it is not at all like old-fashioned C-style VMs (for example, the collector asks VM objects to enumerate their internal object references, etc., so that it is easy to add new VM data types or change their formats). As you suggest, an OO architecture like this is critical to being able to do it at all; I know that Lars Bak thinks that it would have been practically impossible to do the Animorphic VM in a non-OO language.
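To make the enumeration idea concrete, here is a minimal C++ sketch of that style of collector interface. All the names here are made up for illustration; this is the general shape of such a design, not the actual Animorphic (or Self) classes:

```cpp
// Hypothetical sketch: each VM data type knows its own reference slots,
// so the collector never needs a hardwired table of object layouts.
struct Oop;  // stands in for a heap reference

struct OopClosure {                 // the collector's callback
  virtual void do_oop(Oop** slot) = 0;
  virtual ~OopClosure() {}
};

struct VMObject {                   // every VM data type implements this
  virtual void oops_do(OopClosure* f) = 0;
  virtual ~VMObject() {}
};

struct MethodDesc : VMObject {      // an example VM type with two slots
  Oop* selector;
  Oop* literals;
  void oops_do(OopClosure* f) override {
    f->do_oop(&selector);           // the object enumerates its own slots,
    f->do_oop(&literals);           // so adding a new VM type or changing a
  }                                 // format never touches the collector
};

struct CountingClosure : OopClosure {  // e.g. a marking or counting pass
  int count = 0;
  void do_oop(Oop**) override { ++count; }
};
```

A marker, scavenger, or verifier is then just another OopClosure, which is why new VM data types are cheap to add in this architecture.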
I have always thought that the fact that Squeak has a solution to the bootstrapping problem might offer a nice way to do it again in Smalltalk without the risks we tried to avoid. The question then becomes how much of a productivity advantage you would have coding in Smalltalk over the style in which the Animorphic VM was written. I haven't had time yet to look in detail at the Squeak code to see how the bootstrapping translation details work, so I'll ask some questions I think might help illuminate some of the issues and advantages associated with doing it the Squeak way. Of course, one obvious advantage is being able to debug and test in Smalltalk. Other important issues to think about:
1) How fast is the translated C code? The compiler has to be quite fast. Is dispatch as fast as a decent inline-cached dispatch? How much overhead is there for checking/handling the SmallInteger case? How much code space blowup is there in the translated C code (an inlining/deoptimizing compiler is a lot bigger than an interpreter).
2) How much of the Smalltalk coolness is usable in the translatable subset? Are integers still objects? Is there support for some kind of block closure (even if only downward-passed ones) so that control structures other than the hardwired ones like #ifTrue:ifFalse:, #whileTrue: etc. can be used? What kind of GC support is available for VM internal data structures that can't be put in the target ObjectMemory, like dynamically allocated GC data structures (if any)? These are the things that would give the most significant advantages over writing the compiler in C++, since the Animorphic VM is already fairly OO.
3) How would compiler portability work? There would obviously have to be a more complex portability infrastructure, since the compiler has to generate native code dynamically.
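On question 1, the SmallInteger overhead comes down to a tag test on the fast path of every arithmetic send. A minimal C++ sketch, using one common low-bit tagging scheme (the constants and names are illustrative, not Squeak's or Strongtalk's actual encoding):

```cpp
#include <cstdint>

// Illustrative low-bit tagging: low bit 1 => SmallInteger.
typedef intptr_t oop;

const intptr_t TAG_MASK = 1;
const intptr_t INT_TAG  = 1;

inline bool is_small_int(oop x)  { return (x & TAG_MASK) == INT_TAG; }
inline oop tag_int(intptr_t v)   { return (v << 1) | INT_TAG; }
inline intptr_t untag_int(oop x) { return x >> 1; }

// The kind of fast path a translator or inline cache would emit for '+':
// two tag tests and an add, falling back to a full send otherwise.
oop add_fast(oop a, oop b, bool* took_fast_path) {
  if (is_small_int(a) && is_small_int(b)) {
    *took_fast_path = true;
    return tag_int(untag_int(a) + untag_int(b));  // overflow check omitted
  }
  *took_fast_path = false;
  return 0;  // a real VM would dispatch the full #+ send here
}
```

How this few-instruction check compares to a decent inline-cached dispatch, and how often the translated C code pays it, is exactly the kind of measurement I'd want before committing to the approach.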
Some questions are:
Would the system benefit from being cast into Strongtalk? and how much work would this be?
There are a bunch of interrelated issues here, which deserve more in-depth conversations. Here are a few scattered observations. Obviously I think any Smalltalk would benefit from both the type system and inlining compiler, since Strongtalk is my baby. But I think it's fairly clear that the bigger a piece of software is, the more a type system helps for browsing/programming/testing, and the compiler itself is obviously a big piece of code.
Having typed "Blue Book" classes I think helps a lot, but the Animorphic ones have quite a different structure than the Squeak ones, for two reasons: 1) typing requires a different hierarchy and interface structure, and 2) they use far more sends and block closures than other Smalltalk systems because I knew all the inlining would be taking place; they would be too slow as written on Squeak without an inlining compiler. However, I think that the chances of being able to graft them or something like them into Squeak are pretty good, if you are open to wholesale rewriting of a lot of those core classes and changes to code that subclasses them.
One idea to think about: the compiler and VM itself are clearly things that would benefit from running faster while being written in a high-level way, which is what the inlining compiler can help with. While that is not necessary, it would certainly ease some of the performance concerns of writing the compiler and VM in Smalltalk. To take advantage of that would require significant changes to the bootstrapping scheme, like using something like our inlining-database to drive a more restricted kind of static inlining run to generate a faster VM and compiler (however, you can't deoptimize in that scenario, which restricts some kinds of inlining, although not all).
This raises an interesting issue: how much benefit could be derived from a much simpler inlining compiler? I have always thought about a much simpler but more restricted inlining scheme that would inline constant sends and use inheritance 'customization' to make self-sends constant, and then you could do many of the same block-closure-elimination etc. optimizations on that code. The huge advantage of such a scheme would be that it eliminates the necessity for deoptimization and all of its attendant complexity, and that in turn would also make static translation or compilation of inlined code possible.
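The effect of customization can be illustrated in a toy C++ example (the classes here are invented for illustration; they stand in for compiled method copies, not any actual Strongtalk code). Compiling a separate copy of an inherited method per concrete receiver class turns the self-send inside it into a call with a constant, statically known target, which an optimizer can then inline:

```cpp
// Generic version: 'area' contains a self-send ('unit_area'), which must
// be dispatched dynamically because the receiver class is unknown.
struct Shape {
  virtual int unit_area() = 0;
  int area_generic(int n) { return n * unit_area(); }  // dynamic self-send
};

struct Square : Shape {
  int unit_area() override { return 4; }
  // 'Customized' copy compiled specifically for Square receivers: the
  // self-send target is now a constant (Square::unit_area), so the
  // optimizer can reduce the body to 'n * 4' with no dispatch at all.
  int area_customized(int n) { return n * Square::unit_area(); }
};
```

No deoptimization is ever needed, because a customized copy is only ever run on receivers of the class it was compiled for.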
This kind of scheme wouldn't inline all the things that Strongtalk can inline, but it might still help a lot. I did some measurements on the VisualWorks image long ago, and determined that statically, about 25% of 'real' sends were self sends (I never did dynamic measurements, though). I suspect my own coding style probably has a much higher fraction of self sends dynamically than that (since leaf sends for instvar accessors etc. are often self sends), and so I suspect that the Strongtalk libraries would run nicely under such a compiler (note that you could even do this kind of optimization on bytecodes, although I think it would require bytecode changes). The upshot is that this kind of optimization would probably tend to disproportionately speed up high-level code rather than hand-optimized code.
Would anyone care if it ran 10 times faster? and how much work would this be?
Yes, yes, yes. I think the Smalltalk community has always underestimated the extent to which performance matters in the real world (and the extent to which proponents of other languages use it as a weapon against Smalltalk). Of course it is true that first you need to write your algorithms correctly, and that *most* applications don't need ultimate speed. However, it can be very hard to predict ahead of time whether your application will need some critical algorithm to run at top speed, and once you are in that situation, it can be incredibly painful to have to write some algorithms in another language, especially if by chance they need to extensively access, create, or modify shared data structures. So this makes it a gamble: are you willing to *bet* your success on the hope that your application will never be compute bound?
And yes, making Smalltalk faster is a huge amount of work. But it is the kind of work that pays off practically forever, especially when you have an open-source system like Squeak that can survive independently of any person or organization. And maybe a scheme like I outlined above could help make it simpler (although a lot more thought would be needed).
Would it be fun to do?
If you like writing cool code ;-) Although building the Animorphic system was an incredibly long, hard, risky process, it was also incredibly exciting. I suspect that you know that feeling.
Cheers, Dave