[squeak-dev] compiled squeakjs

Bert Freudenberg bert at freudenbergs.de
Mon Jan 19 15:37:15 UTC 2015


Hi Florin,

This sounds extremely interesting, in particular the part about using continuations to model execution flow; that thought had not occurred to me yet. Indeed, non-local returns and interruptibility are the hardest things to map to JS. I will have to think about this idea for a while :)
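
To make the continuation idea concrete, here is a minimal sketch. It assumes a hand-written continuation-passing version of a detect-style loop rather than anything a real translator would emit; runSlice, detect and done are names made up for this example. Each step returns a closure instead of calling onward, so a scheduler can stop between any two steps, and calling the method's return continuation from inside the loop acts as a non-local return.

```javascript
// Hypothetical sketch, not generated code: every step returns a thunk
// (the continuation) so the driver loop decides when to keep going.

// Run thunks until a non-function result appears or the budget runs out.
// Getting a function back means "preempted, resume with this later".
function runSlice(thunk, budget) {
  while (typeof thunk === "function" && budget-- > 0) {
    thunk = thunk();
  }
  return thunk;
}

// detect in continuation-passing style. `k` is the return continuation
// of the method. Invoking k from inside the loop is a non-local return:
// the remaining iterations are simply never scheduled.
function detect(items, predicate, k) {
  function loop(i) {
    if (i >= items.length) return () => k(null);
    return () => predicate(items[i])
      ? k(items[i])   // non-local return out of the loop
      : loop(i + 1);  // continuation for the next element
  }
  return loop(0);
}

// Final results are wrapped in an object so they are never mistaken for thunks.
const done = (value) => ({ value });

// A budget of 2 steps forces a preemption before the match is found.
let state = runSlice(detect([1, 3, 4, 5], (n) => n % 2 === 0, done), 2);
const wasPreempted = typeof state === "function";

// Resuming needs nothing but the saved closure, no access to the JS stack.
state = runSlice(state, 100);
console.log(wasPreempted, state.value);
```

The first slice stops with a pending closure, which plays the role a pc plays in an interpreted context. The second slice finishes with the first even element, 4.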

SqueakJS does compilation, too, by now. It has a (very simple) JIT compiler that compiles bytecodes into equivalent JavaScript, on a method-by-method basis. Read the initial comment at 
	https://github.com/bertfreudenberg/SqueakJS/blob/master/jit.js

To see it in action, open your browser's JS console and evaluate "SqueakJS.vm.method.compiled" which is the compiled version of the currently executing method. Or, if you're running a JS profiler, the generated methods will show up in that profile, too, and you can see their source.
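
As a rough illustration of what method-by-method compilation means, here is a toy sketch. It is not the code in jit.js: the three-instruction bytecode set and the compileMethod function are invented for this example, and only the idea of caching the result in a "compiled" property mirrors the text above.

```javascript
// Hypothetical three-instruction stack machine, NOT Squeak's bytecode set.
const PUSH_CONST = 0, ADD = 1, RETURN_TOP = 2;

// Translate one method's bytecodes into JavaScript source text, then let
// the JS engine turn that source into a real function.
function compileMethod(bytecodes) {
  const lines = ["const stack = [];"];
  for (let pc = 0; pc < bytecodes.length; pc++) {
    switch (bytecodes[pc]) {
      case PUSH_CONST:
        // The operand follows the opcode; Number() keeps the emitted text numeric.
        lines.push(`stack.push(${Number(bytecodes[++pc])});`);
        break;
      case ADD:
        lines.push("stack.push(stack.pop() + stack.pop());");
        break;
      case RETURN_TOP:
        lines.push("return stack.pop();");
        break;
      default:
        throw new Error(`unknown bytecode ${bytecodes[pc]} at pc ${pc}`);
    }
  }
  return new Function(lines.join("\n"));
}

// Compile once and cache the result on the method object.
const method = { bytecodes: [PUSH_CONST, 3, PUSH_CONST, 4, ADD, RETURN_TOP] };
method.compiled = compileMethod(method.bytecodes);

console.log(method.compiled());            // result of running the method
console.log(method.compiled.toString());   // the generated source
```

Because the generated function is ordinary JavaScript, it shows up in the console and in profiles like any other function.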

This is not a high-performance JIT yet, but it helps a lot compared to the simple interpreter. Here are the numbers on Chrome's V8:
with JIT: 82315112 bytecodes/sec; 902155 sends/sec
no JIT:    2775850 bytecodes/sec; 137439 sends/sec

Also interesting - no JIT, before V8 deoptimization kicks in:
          11494252 bytecodes/sec; 523121 sends/sec
With the JIT, the code is more distributed, less polymorphic, so V8 can optimize better.
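
For a sense of scale, the ratios between those figures can be computed directly. The snippet below only restates the numbers quoted above; the variable and function names are made up for illustration.

```javascript
// Figures copied from the measurements above (Chrome V8).
const withJit          = { bytecodes: 82315112, sends: 902155 };
const noJit            = { bytecodes: 2775850,  sends: 137439 };
const noJitBeforeDeopt = { bytecodes: 11494252, sends: 523121 };

// How many times faster `fast` is than `slow`, per metric.
function speedup(fast, slow) {
  return {
    bytecodes: fast.bytecodes / slow.bytecodes,
    sends: fast.sends / slow.sends,
  };
}

const jitGain = speedup(withJit, noJit);
const deoptLoss = speedup(noJitBeforeDeopt, noJit);

console.log(`JIT vs. interpreter: ${jitGain.bytecodes.toFixed(1)}x bytecodes, ${jitGain.sends.toFixed(1)}x sends`);
console.log(`Interpreter before vs. after deopt: ${deoptLoss.bytecodes.toFixed(1)}x bytecodes, ${deoptLoss.sends.toFixed(1)}x sends`);
```

So the JIT is roughly 30 times faster on bytecodes and between 6 and 7 times faster on sends, while the interpreter alone loses about a factor of 4 on both once deoptimization kicks in.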

Beware of microbenchmarks etc., but it pays off hugely to make your code "friendly" for the JS VM. Amber, for example, on the same machine in the same browser, reports '2214839.4241417497 bytecodes/sec; 229042.45283018867 sends/sec' even though it compiles directly to JavaScript and does not have full Smalltalk semantics (e.g. no real thisContext, no become). I suspect deoptimization in V8. Indeed, on Firefox it reports '3007518.796992481 bytecodes/sec; 408234.4251766217 sends/sec', which is roughly the same as SqueakJS (40327662 bytecodes/sec; 516034 sends/sec).

So since you're after the highest performance, it may pay off to do some experiments first. 

Btw, here is a very interesting talk about how to make Smalltalk-style method invocation fast on V8.
video: http://2014.jsconf.eu/speakers/vyacheslav-egorov-invokedynamic-js.html
slides: http://mrale.ph/talks/jsconfeu2014/

I did not fully understand your proposal about the object memory layout, and how become would work. Also, allInstances, weak refs and finalization aren't accounted for.

Do you intend this to be a fully compatible VM for Squeak? That was my goal with SqueakJS, performance being secondary (although not unimportant). SqueakJS does fully implement Squeak's execution semantics, including thisContext, non-local return, stack unwinding, DNU, process switching etc. and the object memory semantics too, including allObjects/allInstances, weak refs and finalization.

Your proposed mapping to JS Arrays/Maps etc. seems to imply that it would not be fully compatible, right? Rather a Smalltalk-for-the-web with fewer compromises than Amber? Or is this meant only as a deployment step, not as a fully self-hosted development environment?

- Bert -

On 18.01.2015, at 18:32, Florin Mateoc <florin.mateoc at gmail.com> wrote:
> Hi,
> 
> Inspired by Bert's project, I started thinking about how to get Smalltalk compiled to Javascript instead of interpreted.
> I do have previous experience in compiling Smalltalk to Java (after type inference, which we thankfully don't need
> here). But, the requirements are a bit tighter here: we have to take an unknown image, get it translated on the fly,
> completely automatically, and even allow the translated image to self-modify. Plus we cannot just decree that become:
> cannot be used.
> 
> Given that the input is an image, not sources, we'd better rely on the decompiler, so I started there. I think I fixed
> it, so that it can now decompile everything correctly.
> I also implemented a few AST transformations (similar to the ones that were necessary for Java, like normalizing the
> various boolean constructs and making them statements).
> I then started to write a Javascript pretty-printer, but I stopped when I realized that there were a few things missing:
> while non-local returns and resumable exceptions can be implemented using exceptions and an explicit stack of handlers,
> preemption (and Smalltalk's processes in general) were harder. After some research I came to the conclusion that this
> was doable if, instead of doing a direct pretty-printing of the Smalltalk nodes to Javascript, we also used the
> translation process to transform the code into continuation-passing style. Then non-local returns become trivial and
> preemption can be implemented with closures, without needing access to the underlying execution stack.
> An interrupted context would have a no-arg closure representing the continuation instead of a pc. In general, only
> preemption points (which all have a corresponding continuation closure) would have to be mapped, and this would happen
> at image read time as well. The exception would be the debugger - I am not sure about that one yet.
> The primitive code would be inlined in the primitive methods, followed by a preemption point and the failure code.
> Unfortunately invocation would still not be direct, but looked up (and invoking DNU if needed), but I would store all
> the translated methods directly in the class prototype, so there would be no need for explicit superclass chain lookup.
> The instvars would also be stored directly in the class prototype (but with a prefix, to not conflict with the methods
> or with reserved keywords), and they would be accessed directly (with dot notation), except for assignments, which would
> record the owner (and the index in the owner), for all non-primitive types (not sure what to do about strings).
> Every method (and formerly Smalltalk block closure) would have a single temp called "thisContext", which would be an
> owner for the actual temps. The owners info would be used for implementing become: and allReferences.
> 
> The ProtoObject and Object methods coming from Smalltalk would be stored in Object.prototype. Proxy classes would have
> their prototypes cleared and only contain the ProtoObject methods.
> Primitive type classes would have to be massaged a little: Number would have a union of methods from the Smalltalk
> Number subclasses, as well as the methods inherited from Magnitude.
> String would also have the methods inherited from Collection and SequenceableCollection, as well as from Character (and
> Magnitude) and Symbol - this one could be a little nastier, but I think it could be made to work.
> I would also map Array to Array, IdentitySet to Set and IdentityDictionary to Map. Weak collections are harder, because
> Javascript decided to make them not enumerable. Because of this, allInstances would also be a challenge.
> 
> I am not sure yet about the bootstrap process. I just have a fuzzy feeling that Craig's Context running under SqueakJS
> might make it easier.
> 
> I hope this gives a general idea about the approach. Please do point out weaknesses that I may have missed. For me this
> is fun and I will proceed slowly, as time permits, since I cannot do it at work.
> Of course, I am very interested to hear Bert's opinion :)
> 
> Florin


