[squeak-dev] compiled squeakjs

Sun Jan 18 18:38:06 UTC 2015

Hi Florin,

On Jan 18, 2015, at 9:32 AM, Florin Mateoc <florin.mateoc at gmail.com> wrote:

> Hi,
> 
> Inspired by Bert's project, I started thinking about how to get Smalltalk compiled to Javascript instead of interpreted.
> I do have previous experience in compiling Smalltalk to Java (after type inference, which we thankfully don't need
> here). But, the requirements are a bit tighter here: we have to take an unknown image, get it translated on the fly,
> completely automatically, and even allow the translated image to self-modify. Plus we cannot just decree that become:
> cannot be used
> 
> Given that the input is an image, not sources, we'd better rely on the decompiler, so I started there. I think I fixed
> it, so that it can now decompile everything correctly.
> I also implemented a few AST transformations (similar to the ones that were necessary for Java, like normalizing the
> various boolean constructs and making them statements).
> I then started to write a Javascript pretty-printer, but I stopped when I realized that there were a few things missing:
> while non-local returns and resumable exceptions can be implemented using exceptions and an explicit stack of handlers,
> preemption (and Smalltalk's processes in general) were harder. After some research I came to the conclusion that this
> was doable if, instead of doing a direct pretty-printing of the Smalltalk nodes to Javascript, we also used the
> translation process to transform the code in continuation passing style. Then non-local returns become trivial and
> preemption can be implemented with closures, without needing access to the underlying execution stack.
> An interrupted context would have a no-arg closure representing the continuation instead of a pc. In general, only
> preemption points (which all have a corresponding continuation closure) would have to be mapped, and this would happen
> at image read time as well. The exception would be the debugger - I am not sure about that one yet.
> The primitive code would be inlined in the primitive methods, followed by a preemption point and the failure code.
> Unfortunately invocation would still not be direct, but looked up (and invoking DNU if needed), but I would store all
> the translated methods directly in the class prototype, so there would be no need for explicit superclass chain lookup.
> The instvars would also be stored directly in the class prototype (but with a prefix, to not conflict with the methods
> or with reserved keywords), and they would be accessed directly (with dot notation), except for assignments, which would
> record the owner (and the index in the owner), for all non-primitive types (not sure what to do about strings).
> Every method (and formerly Smalltalk block closure) would have a single temp called "thisContext", which would be an
> owner for the actual temps. The owners info would be used for implementing become: and allReferences.
> 
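
A minimal sketch of the continuation idea described above, in TypeScript. It assumes a cooperative round-robin scheduler and hypothetical names (Continuation, countTo, runAll); none of this comes from Florin's code. Each process is represented by a no-arg closure instead of a pc, and returning such a closure marks a preemption point:

```typescript
// A continuation is a no-arg closure. Calling it runs the process up to its
// next preemption point and returns the closure for "the rest", or null when
// the process has finished.
type Continuation = () => Continuation | null;

// Hypothetical example process: a counting loop written in continuation
// passing style. Every iteration ends at a preemption point.
function countTo(name: string, limit: number, trace: string[]): Continuation {
  const step = (i: number): Continuation => () => {
    if (i > limit) return null;      // process finished
    trace.push(`${name}${i}`);
    return step(i + 1);              // preemption point: hand back the rest
  };
  return step(1);
}

// Round-robin scheduler (an assumption; a real image would use priorities).
// It never touches the JavaScript stack, it only calls stored closures.
function runAll(processes: Continuation[]): void {
  let queue = processes.slice();
  while (queue.length > 0) {
    const next: Continuation[] = [];
    for (const k of queue) {
      const rest = k();
      if (rest !== null) next.push(rest);
    }
    queue = next;
  }
}

const trace: string[] = [];
runAll([countTo("a", 2, trace), countTo("b", 3, trace)]);
console.log(trace.join(","));
```

The two loops interleave because each one returns to the scheduler after every step, which is roughly what a preemption point would look like in translated code.
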
> The ProtoObject and Object methods coming from Smalltalk would be stored in Object.prototype. Proxy classes would have
> their prototypes cleared and only contain the ProtoObject methods.
> Primitive type classes would have to be massaged a little: Number would have a union of methods from the Smalltalk
> Number subclasses, as well as the methods inherited from Magnitude.
> String would also have the methods inherited from Collection and SequenceableCollection, as well as from Character (and
> Magnitude) and Symbol - this one could be a little nastier, but I think it could be made to work.
> I would also map Array to Array, IdentitySet to Set and IdentityDictionary to Map. Weak collections are harder, because
> Javascript decided to make them not enumerable. Because of this, allInstances would also be a challenge.
> 
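
As a rough illustration of looked-up invocation with a doesNotUnderstand fallback, here is a TypeScript sketch. The base class SmalltalkObject stands in for the methods that would live in Object.prototype, the "$" instvar prefix is an assumption, and send is a hypothetical helper:

```typescript
// Stand-in for the ProtoObject/Object methods; a real translation would
// install these on Object.prototype instead of on a base class.
class SmalltalkObject {
  doesNotUnderstand(selector: string, args: unknown[]): unknown {
    return `DNU: ${selector}/${args.length}`;
  }
}

// Instvars carry a prefix ("$" here, an assumption) so that they cannot
// collide with method names stored on the same prototype chain.
class Point extends SmalltalkObject {
  $x: number;
  $y: number;
  constructor(x: number, y: number) {
    super();
    this.$x = x;
    this.$y = y;
  }
  x(): number { return this.$x; }
  plus(other: Point): Point { return new Point(this.$x + other.$x, this.$y + other.$y); }
}

// Looked-up send: the JavaScript prototype chain does the superclass walk,
// so only the "not found" case needs explicit handling.
function send(receiver: SmalltalkObject, selector: string, ...args: unknown[]): unknown {
  const method = (receiver as any)[selector];
  if (typeof method === "function") {
    return method.apply(receiver, args);
  }
  return receiver.doesNotUnderstand(selector, args);
}

const sum = send(new Point(1, 2), "plus", new Point(3, 4)) as Point;
console.log(send(sum, "x"), send(sum, "foo", 1, 2));
```

A real implementation would also have to map Smalltalk selectors containing colons to legal property names, which this sketch leaves out.
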
> I am not sure yet about the bootstrap process. I just have a fuzzy feeling that Craig's Context running under SqueakJS
> might make it easier.
> 
> I hope this gives a general idea about the approach. Please do point out weaknesses that I may have missed. For me this
> is fun and I will proceed slowly, as time permits, since I cannot do it at work.
> Of course, I am very interested to hear Bert's opinion :)

I like your approach, that of making everything work rather than taking the simple approach of translating what will work directly and disallowing the rest (as Amber, Clamato, etc. do).  You might want to talk to Ryan Macnak and Gilad Bracha about their Newspeak implementation above JavaScript (they're also doing one above Dart).

I do think Bert's approach is fun, too.  But I do feel extremely frustrated that no one is taking the obvious route of making a plugin to allow the Cog VM to be used directly, gaining much higher performance and reducing the number of execution platforms we have to support.

A plugin would use JavaScript to collect events, to render and to access the DOM (all of this code can be stolen from Bert's VM). The JavaScript component would connect to the VM via a socket.  The VM itself would be quite small (it's already only around a megabyte of executable).  For me arguments about the inconvenience and slowness of downloading and installing are not compelling given the ubiquity of Flash.  
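
To make the socket link concrete, here is a small TypeScript sketch of how the JavaScript side might frame input events for the VM. The wire format (one JSON object per line), the VmEvent fields, and the function names are all assumptions for illustration; no such protocol exists in Cog or SqueakJS as far as this message says:

```typescript
// Hypothetical event record; the field names are assumptions.
interface VmEvent {
  kind: "mouse" | "key";
  x: number;
  y: number;
  code: number;  // button mask or key code
  time: number;  // milliseconds
}

// Frame one event as a single line of JSON.
function encodeEvent(event: VmEvent): string {
  return JSON.stringify(event) + "\n";
}

// Split received text into complete events plus any trailing partial line,
// since a socket may deliver a message in several pieces.
function decodeEvents(buffer: string): { events: VmEvent[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() || "";
  const events = lines
    .filter(line => line.length > 0)
    .map(line => JSON.parse(line) as VmEvent);
  return { events, rest };
}

const click: VmEvent = { kind: "mouse", x: 10, y: 20, code: 1, time: 0 };
const wire = encodeEvent(click) + '{"kind":"key"';  // second event still incomplete
const decoded = decodeEvents(wire);
console.log(decoded.events.length, decoded.rest);
```

The sketch only covers framing; the actual transport (for example a WebSocket to a local VM process) and the rendering direction are left open.
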

And then there really is /no/ difference in the execution semantics, and /no/ performance degradation, and the code is as portable as Bert's VM, provided Cog runs on the platform.

I'd be doing this myself if I weren't working on getting Spur released, getting 64-bit Spur working, working with Clément on Sista and looking at hosting Cog over Xen.  Come on folks; someone out there must think this is useful and interesting.

> Florin