The big question is whether to work on blocks next. Out of the four key missing features, blocks are the most disruptive. I'll have to create my own context objects, which is a major but small change to the interpreter. It will also make the Exupery shutdown code more important, because an unmodified VM will not understand these new context objects and will require image support code to deal with compiled contexts (say, for debugging).
The top four items are:
* blocks
* super sends
* new
* integration (making primitive inlining usable and the profiling compiler effective)
They have been chosen based on profiling slow operations including the Magma LargeInteger benchmark, the bytecode compiler, and opening explorers on large Arrays.
I suspect that with all of the above done Exupery will have a minimally useful core. It would still need a few more critical primitives but which ones will then depend on what you want to speed up. At the moment, everything needs at least three of the four features above.
At the moment, I'm leaning towards blocks next.
Any thoughts?
Bryce
Bryce Kampjes wrote:
The big question is whether to work on blocks next.
Yes please! I'm looking forward to throwing away the compiler hacks that inline ifTrue:ifFalse: and friends.
It will also make the Exupery shutdown code more important because an unmodified VM will not understand these new context objects and require image support code to deal with compiled contexts (say for debugging).
That's very interesting. Are you going to pursue a Self-style revert-to-simpler-representations-on-demand strategy?
The top four items are:
- blocks
- super sends
- new
- integration (making primitive inlining usable and the profiling compiler effective)
What's different about super sends as compared to regular sends? Isn't the statically-available information better for super sends, and otherwise they're more-or-less the same? (What am I missing? I should perhaps have my morning coffee.)
I'd vote for blocks, new, integration, super-sends to be the order.
Cheers, Tony
On 09.11.2005 at 11:50, Tony Garnock-Jones wrote:
Bryce Kampjes wrote:
The big question is whether to work on blocks next.
Yes please! I'm looking forward to throwing away the compiler hacks that inline ifTrue:ifFalse: and friends.
Eeek! Why on earth would you want to do that? I'm sure this was not what Bryce meant with "working on blocks".
If you have a simple ifTrue: in your code, what else should Exupery compile this to than a simple conditional jump? Why replace this with a full message send, block activation etc.? And loops would require recursion then, which also is a full message send. To be anywhere near useful you would at least have to do tail recursion elimination.
Or are you proposing to have Exupery inline blocks? This would certainly be possible in precisely the cases where the current compiler issues a jump bytecode instead of a full block, except you do it in a much more complicated way without any gain in speed or flexibility.
- Bert -
Bert Freudenberg wrote:
Or are you proposing to have Exupery inline blocks? This would certainly be possible in precisely the cases where the current compiler issues a jump byte code instead of full block
Yes, this is what I'm hoping Exupery will do.
, except you do it in a much more complicated way without any gain in speed or flexibility.
I disagree - it isn't quite "for free", but it's similar to the way other message-sends are inlined; and you *do* gain both speed and flexibility in terms of defining control structures in userland rather than requiring ad-hoc compiler hacks for efficiency. The Self system demonstrated the benefits of the technique quite nicely, IMO.
Cheers, Tony
Tony Garnock-Jones writes:
Bert Freudenberg wrote:
Or are you proposing to have Exupery inline blocks? This would certainly be possible in precisely the cases where the current compiler issues a jump byte code instead of full block
Yes, this is what I'm hoping Exupery will do.
What you want for that is full method inlining. That's not what I'm talking about doing next. It's a very good optimisation for Smalltalk-like languages. It will bring enough code and loops into one big inlined method to give a classical optimiser something to optimise.
There are two separate questions here:
1) Should Exupery optimise sends sufficiently so the interpreter's inlining isn't needed?
2) Should we remove the interpreter's inlining?
My plan is to optimise normal sends enough eventually. There are many loops that use do: or other constructs that cannot be optimised by the interpreter but could be optimised by dynamic inlining. Dynamic inlining is also the best way to make common sends nearly free, excluding the type test (which isn't needed for self or super sends).
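As a rough illustration of why the type test is the only cost left once a send is inlined, here is a minimal Python sketch. It is not Exupery code: `full_send`, `make_inlined_send`, `Point` and `Other` are hypothetical names invented for the example, and a real compiler would emit machine code rather than closures.

```python
def full_send(receiver, selector, *args):
    """Stand-in for an ordinary send: look the method up on the receiver at run time."""
    return getattr(receiver, selector)(*args)


def make_inlined_send(expected_class, selector, inlined_body):
    """Build a send site specialised for one receiver class (hypothetical helper)."""
    def send(receiver, *args):
        # The type test: the inlined body is only valid for the class it was compiled for.
        if type(receiver) is expected_class:
            return inlined_body(receiver, *args)
        # Guard failed, so fall back to the normal, slower send.
        return full_send(receiver, selector, *args)
    return send


class Point:
    def __init__(self, x):
        self.x = x

    def doubled(self):
        return self.x * 2


class Other:
    def doubled(self):
        return "other"


# A send site for `doubled` with the body of Point>>doubled copied in.
doubled_site = make_inlined_send(Point, "doubled", lambda p: p.x * 2)
```

Calling `doubled_site(Point(3))` takes the inlined path, while `doubled_site(Other())` fails the guard and goes through the full send. For a self or super send the receiver's class is already known where the method is compiled, which is why the guard could be dropped there.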
But I don't think we should remove the interpreter's ifTrue: optimisation even when Exupery's inlining is good enough. Exupery will always add more work to porting, and being portable is important. Also, while it's possible to dynamically inline ifTrue: like Self did, it'll mean the system runs slower until all critical ifTrue:s have been dynamically inlined, and initial speed may be important sometimes.
The ifTrue: inlining is ugly but it provides a lot of speed with a simple implementation which doesn't cause practical problems. That's still valuable. In time, Exupery will inline any ifTrue:s that are left but having good performance without Exupery will still be important.
Bryce
Tony Garnock-Jones writes:
Bryce Kampjes wrote:
The big question is whether to work on blocks next.
Yes please! I'm looking forward to throwing away the compiler hacks that inline ifTrue:ifFalse: and friends.
At the moment, I'm only talking about compiling methods that create blocks. Full method (and block) inlining is planned but later. It's a very nice optimisation but not needed to make Exupery useful. It'll make it much more useful though.
It will also make the Exupery shutdown code more important because an unmodified VM will not understand these new context objects and require image support code to deal with compiled contexts (say for debugging).
That's very interesting. Are you going to pursue a Self-style revert-to-simpler-representations-on-demand strategy?
I already deconvert contexts when shutting down or restarting Exupery. But that only means clearing out the native code program counter stored in the method context's spare slot. Contexts need to be deconverted before clearing the code cache, otherwise there'll be dangling pointers to machine code that is no longer there, which causes amusing bugs.
With ExuperyContexts I'll need to create normal interpreter contexts when deconverting the contexts. That should happen on demand to allow single stepping in the debugger.
With full method inlining it'll need to un-inline like Self, but I have no plans to have a stack cache like VisualWorks or Self (I think) because hopefully inlining will remove enough send overhead.
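To make the ordering constraint concrete, here is a small Python sketch under loud assumptions: `Context`, `CodeCache`, `deconvert`, `shutdown` and `dangling` are hypothetical names, and integers stand in for machine-code addresses. It only models the rule "deconvert first, then clear the cache", not how Exupery actually stores contexts.

```python
class Context:
    """Toy method context: a bytecode pc plus a spare slot for a native-code pc."""
    def __init__(self, method_name, bytecode_pc, native_pc=None):
        self.method_name = method_name
        self.bytecode_pc = bytecode_pc
        self.native_pc = native_pc  # None means the context is purely interpreted


class CodeCache:
    """Toy code cache mapping an address to a blob of machine code."""
    def __init__(self):
        self.code = {}

    def install(self, address, machine_code):
        self.code[address] = machine_code

    def contains(self, address):
        return address in self.code

    def clear(self):
        self.code.clear()


def deconvert(context):
    # Clear the native pc so the interpreter resumes from the bytecode pc.
    context.native_pc = None


def shutdown(contexts, cache):
    # Order matters: deconvert every context BEFORE the machine code goes away.
    for context in contexts:
        deconvert(context)
    cache.clear()


def dangling(contexts, cache):
    """Contexts whose native pc points at code that is no longer in the cache."""
    return [c for c in contexts
            if c.native_pc is not None and not cache.contains(c.native_pc)]
```

After `shutdown(contexts, cache)`, `dangling(contexts, cache)` is empty. Clearing the cache without deconverting first leaves every compiled context in that list, which is the dangling-pointer situation described above.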
The top four items are:
- blocks
- super sends
- new
- integration (making primitive inlining usable and the profiling compiler effective)
What's different about super sends as compared to regular sends? Isn't the statically-available information better for super sends, and otherwise they're more-or-less the same? (What am I missing? I should perhaps have my morning coffee.)
Super sends are basically the same as normal sends, except the lookup starts with the superclass. Lookup is done by Slang support code at the moment. Exupery can't compile them yet, but they should be fairly easy to add.
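The difference in where lookup starts can be sketched in a few lines of Python. This is only a model of Smalltalk-style method lookup, not the Slang support code: `Behavior`, `lookup`, `normal_send_target` and `super_send_target` are names made up for the example.

```python
class Behavior:
    """Hypothetical stand-in for a class: a method dictionary plus a superclass link."""
    def __init__(self, name, superclass=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.methods = methods or {}


def lookup(start_class, selector):
    """Walk up the superclass chain from start_class until the selector is found."""
    cls = start_class
    while cls is not None:
        if selector in cls.methods:
            return cls.methods[selector]
        cls = cls.superclass
    return None  # a real system would send doesNotUnderstand: here


def normal_send_target(receiver_class, selector):
    # A normal send starts in the receiver's class, which is only known at run time.
    return lookup(receiver_class, selector)


def super_send_target(method_class, selector):
    # A super send starts in the superclass of the class that defines the sending
    # method, which is known when that method is compiled.
    return lookup(method_class.superclass, selector)
```

With a chain Object, Animal, Dog that each define `printOn:`, a normal send to a Dog finds Dog's method, while a super send from a method defined in Dog finds Animal's. Because the starting class of a super send is fixed at compile time, its target does not depend on the receiver.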
I'd vote for blocks, new, integration, super-sends to be the order.
I'll probably do blocks, super-sends, then new, with integration work scattered between them. But I'll re-evaluate after doing whatever's first. Blocks are the most work (probably as much as everything else) and the most risky.
The decisions are driven by profiling. Not being able to compile methods with blocks stops the most methods from being compiled; next is not being able to compile super sends. After that, fixing each remaining cause brings only a small benefit. Object creation shows up a lot in profiles.
The two problems I see when compiling real programs with Exupery are that it can't compile enough methods for execution to stay in compiled code for long enough, and that object creation is slow. Blocks and super sends will allow almost all normal methods to be compiled.
Sends between compiled code and interpreted code are about as fast as normal interpreted calls, so in send-heavy code it's vital to stay in compiled code for most sends if compiling is going to provide a speed improvement.
Bryce
exupery@lists.squeakfoundation.org