[Vm-dev] Powerful JIT optimization

Florin Mateoc florin.mateoc at gmail.com
Wed Nov 6 04:03:11 UTC 2013


On 11/5/2013 6:30 PM, Eliot Miranda wrote:
>
> On Mon, Nov 4, 2013 at 8:42 PM, Florin Mateoc <florin.mateoc at gmail.com> wrote:
>
>      
>     On 11/4/2013 9:05 PM, Eliot Miranda wrote:
>>
>>     Hi Florin,
>>
>>     On Mon, Nov 4, 2013 at 12:30 PM, Florin Mateoc <florin.mateoc at gmail.com> wrote:
>>
>>          
>>         On 11/4/2013 3:07 PM, Eliot Miranda wrote:
>>>         Hi Florin,
>>>
>>>         On Mon, Nov 4, 2013 at 7:09 AM, Florin Mateoc <florin.mateoc at gmail.com> wrote:
>>>
>>>
>>>             Hi Eliot,
>>>
>>>             I am not sure whether this is the right moment to bring this up, when you are so busy with the new garbage
>>>             collector, but you were also talking about powerful new optimizations, and this seems a very good one. I had
>>>             been toying with the idea before but did not have the right formulation for it - I was thinking of doing it
>>>             on the image side, at the AST level, and then communicating somehow with the VM (an aspect that becomes moot
>>>             if the JIT code is generated from Smalltalk). Having now stumbled upon it on the web, I think it would be
>>>             better done inside the JIT. In Rémi Forax's formulation:
>>>
>>>             "One thing that trace-based JITs have shown is that a loop or a function is a valid optimization entry
>>>             point. So, just as you can have an inlining cache for a function at a call site, you should have a kind of
>>>             inlining cache at the start of a loop."
>>>
>>>             This was in the context of a blog entry by Cliff Click:
>>>             http://www.azulsystems.com/blog/cliff/2011-04-04-fixing-the-inlining-problem
>>>             The comments also contain other useful suggestions.
>>>
>>>             And the loop inlining cache could specialize not just on the receiver block but also on the types of the
>>>             arguments (this is true for methods as well, but, in the absence of profiling information, loops are more
>>>             likely to be "hot", and we can easily detect nested loops, which reinforce the "hotness").
>>>
>>>
>>>         AFAICT this is subsumed under adaptive optimization/speculative inlining, i.e. it is one of the potential
>>>         optimizations in an adaptive optimizing VM.  Further, I also believe that by far the best place to do this
>>>         kind of thing is indeed in the image, and to do it at the bytecode-to-bytecode level.  But I've said this
>>>         many times before and don't want to waste cycles waffling again.
>>>
>>>         thanks.
>>>         e.
>>>
>>>             Regards,
>>>             Florin
>>>
>>>
>>>         -- 
>>>         best,
>>>         Eliot
>>         This is a bit like saying that we don't need garbage collection because we can do liveness/escape analysis in
>>         the image. I think there is a place for both sides.
>>
>>
>>     No, it's not.  If you read my design sketch on bytecode-to-bytecode adaptive optimisation you'll understand that
>>     it's not.  It's simply that one can do bytecode-to-bytecode adaptive optimisation in the image, and that that's a
>>     better place to do adaptive optimisation than in the VM.  But again, I've gone into this many times before on the
>>     mailing list and I don't want to get into it again.
>>
>
>     Can't compiler technology (coupled with type inference) also be applied, in the image, to stack
>     allocation/pretenuring/automatic pool allocation... to simplify the garbage collector and potentially obviate
>     the need for a performant one in the VM?
>
>
> I doubt it.  It is already used to, e.g., create clean blocks.  This certainly isn't enough of a win to be able to live
> with a poor GC.
>  
>
>     If it can, why doesn't the same argument apply?
>
>
> It can't, so the argument doesn't apply.
>  
>
>     And why did you implement inline caches in the VM if they were better done in the image?
>
>
> Because they're not better done in the image.  In fact, adaptive optimization is heavily dependent on inline caches.
>  That's where it gets its type information from.
>
> This doesn't feel like a productive conversation.
>  

Indeed.
The garbage collector example was an afterthought and even a bit facetious, sorry about that. But the initial point
still stands. It is the same kind of optimization as inline caches, not a different kind of adaptive optimization that
is facilitated by them. If it is worth doing inline caches for method calls, it is worth doing them for block
evaluations in loops as well.
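
To make the idea concrete, here is a minimal sketch in C. All names are hypothetical, the cache is monomorphic where a
real one would likely be polymorphic, and the real thing would of course be machine code emitted by the JIT rather than
C; this only illustrates the check a loop head could perform:

#include <stdio.h>

typedef struct Block {
    int (*body)(int);            /* the block's compiled body */
} Block;

typedef struct LoopCache {
    const Block *cachedBlock;    /* block seen on previous iterations */
    int (*fastPath)(int);        /* code specialized for that block   */
} LoopCache;

/* The check the JIT would emit at the loop head: compare the incoming
   block against the cached one and take the fast path on a hit, just
   as a monomorphic send-site cache compares the receiver's class. */
static int evaluateAtLoopHead(LoopCache *c, const Block *b, int arg)
{
    if (c->cachedBlock == b)
        return c->fastPath(arg); /* hit: run the specialized code */
    c->cachedBlock = b;          /* miss: relink the cache ...    */
    c->fastPath = b->body;       /* ... stand-in for compiling a
                                    version specialized for b     */
    return b->body(arg);         /* generic (slow) evaluation     */
}

static int timesTwo(int x) { return 2 * x; }

int main(void)
{
    Block doubler = { timesTwo };
    LoopCache cache = { NULL, NULL };
    int sum = 0;
    for (int i = 0; i < 5; i++)  /* the "hot" loop */
        sum += evaluateAtLoopHead(&cache, &doubler, i);
    printf("%d\n", sum);         /* 2*(0+1+2+3+4) = 20 */
    return 0;
}

As suggested above, the cache could also be keyed on the classes of the loop's arguments, and on a miss it would relink
exactly like a send-site cache; the recorded entries are then precisely the kind of type information an adaptive
optimizer could later consume.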

Regards,
Florin

