[Vm-dev] re: Parsing Pharo syntax to C/C++

Clément Bera bera.clement at gmail.com
Tue Sep 16 09:48:46 UTC 2014

2014-09-16 1:46 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:

> Hi Ronie,
> On Mon, Sep 15, 2014 at 2:37 PM, Ronie Salgado <roniesalg at gmail.com>
> wrote:
>> Hello,
>> I am segmenting this mail into several sections.
>> ---------------------------------------------------------------
>> - On Lowcode and Cog
>> I have been working in the last week with the Cog VM, implementing the
>> Lowcode instructions in Cog.
> remember to send me code for integration.  I'm eagerly waiting to use your
> code!
>> Lowcode is currently a spec of new bytecode instructions. These
>> instructions can be used for:
>> - Implementing a C like language compiler.
>> - Making FFI calls
>> I am implementing these instructions using a feature of the new bytecode
>> set for SistaV1, which is called "inline primitives". Because of this,
>> these new instructions can be mixed freely with the standard VM bytecode
>> set. This also allows the Sista adaptive optimizer to inline FFI calls.
>> These instructions provide features for:
>> - Int32 and Int64 integer arithmetic without type checking.
>> - Pointers, with pointer arithmetic.
>> - Memory access and memory manipulation.
>> - Single and double precision floating point arithmetic.
>> - Conversion between primitive types.
>> - Boxing and unboxing of primitive types.
>> - Unchecked comparisons.
>> - Native function calls, both direct and indirect.
>> - The atomic operation compare-and-swap.
>> - Object pin/unpin (requires Spur).
>> - Releasing and grabbing the VM for threaded FFI.
>> Currently I have implemented the following backends:
>> - A C interpreter plugin.
>> - An LLVM-based backend.
>> Currently I am working on getting this working with the Cog code
>> generator. So far I am already generating code for
>> int32/pointer/float32/float64, and I am starting to generate C function
>> calls and object boxing/unboxing.
>> During this work I learned a lot about Cog. Especially that Cog is
>> missing a better Slang generator, one that would allow forcing better
>> inlining, as well as more code reviews. There is a lot of code
>> duplication in Cog that can be attributed to limitations of Slang. In my
>> opinion, if we could use Slang for more than just building the VM, we
>> would end up with a better code generator. In addition, we need more
>> people working on Cog: people who perform code reviews and write
>> documentation for Cog.
>> After these weeks, I learned that working on the Cogit is not that hard.
>> Our biggest problem is the lack of documentation. Our second problem
>> could be the lack of documentation about Slang.
Lack of documentation?

About Cog there is this documentation:
Back to the future <http://ftp.squeak.org/docs/OOPSLA.Squeak.html>
About VMMaker <http://wiki.squeak.org/squeak/2105>
Object engine
General information <http://squeakvm.org/index.html>
Blue book part 4
Deep into Pharo part 4 about blocks and exceptions
VMIL paper about Cogit
The Cog blog <http://www.mirandabanda.org/cogblog/>
About Spur: summary
object format
This post <http://clementbera.wordpress.com/2013/08/09/the-cog-vm-lookup/>
And many useful class and method comments that taught me a lot.

When I try to work with Pharo frameworks, even recent ones, it is very rare
that I see as much documentation as exists for Cog. Some frameworks are
documented in the Pharo books and a few others such as Zinc have good
documentation, but in general there is little documentation and *even fewer
people writing documentation*. The website about Cog has existed for over 6
years now. I think Cog is far from the worst-documented part of Pharo.

> Yes, and that's difficult because it's a moving target and I have been
> lazy, not writing tests, instead using the Cog VM as "the test".
> It's also difficult because the first tests to write are the hardest to

> I am so happy to have your involvement.  You and Clément bring such
> strength and competence.
> ---------------------------------------------------------------
>> - Smalltalk -> LLVM ?
>> As for having a Smalltalk -> LLVM code generator: the truth is that we
>> would not gain anything. LLVM is a C compiler, designed to optimize
>> things such as loops with lots of arithmetic; it is designed to optimize
>> large sections of code. In Smalltalk, most of our code is composed of
>> message sends, and LLVM cannot optimize a message send.
>> To optimize a message send, you have to determine which method is going
>> to respond to the message. Then you have to inline the method. Only then
>> can you start performing the actual optimizations, such as constant
>> folding, common subexpression elimination, dead branch elimination, loop
>> unrolling, and so on.
>> Because the language itself gives us no information (e.g. static types
>> à la C/C++/Java/C#) about which method a message send will actually
>> invoke, we have the following alternatives for determining it:
>> - Don't optimize anything.
>> - Perform a costly static global analysis of the whole program.
>> - Measure at run time and hope for the best.
>> - Extend the language.
>> In other words, our best bet is the work of Clément on Sista. The only
>> problem with this bet is real-time applications.
> Ah!  But!  Sista has an advantage that other adaptive optimizers don't.
>  Because it optimizes from bytecode to bytecode it can be used during a
> training phase and then switched off.

>> Real-time applications require a guaranteed upper bound on their
>> response time. In some cases, the lack of this guarantee is just an
>> annoyance, as in video games. In some mission-critical applications the
>> results can be disastrous if this time constraint is not met. Examples
>> of mission-critical systems are the flight controls of an airplane and
>> the cooling system of a nuclear reactor.
>> For these applications, it is not possible to rely on an adaptive
>> optimizer that is only triggered sometimes. In these applications you
>> have to either:
>> - Extend the language to hand-optimize some performance-critical
>> sections of code.
>> - Use another language to optimize these critical sections.
>> - Use another language for the whole project.
> The additional option is to "train" the optimizer by running the
> application before deploying and capturing the optimised methods.  Discuss
> this with Clément and he'll explain how straightforward it should be.
>  This still leaves the latency in the Cogit when it compiles from bytecode
> to machine code.  But
> a) I've yet to see anybody raise JIT latency as an issue in Cog
> b) it would be easy to extend the VM to cause the Cogit to precompile
> specified methods.  We could easily provide a "lock-down" facility that
> would prevent Cog from discarding specific machine code methods.
> And of course, you have to perform a lot of profiling.
> Early and often :-).
> Because we can have complete control over the optimizer, and because Sista
> is bytecode-to-bytecode and can hence store its results in the image in the
> form of optimized methods, I believe that Sista is well-positioned for
> real-time since it can be used before deployment.  In fact we should
> emphasise this in the papers we write on Sista.

Eliot's solution makes sense.
To write a paper about that I need benchmarks showing results on real-time
workloads, so there's quite some work to do before then.

> Greetings,
>> Ronie
>> 2014-09-15 16:38 GMT-03:00 Craig Latta <craig at netjam.org>:
>>>      Hear hear!
>>> -C
>>> [1] http://tinyurl.com/m66fx8y (original message)
>>> --
>>> Craig Latta
>>> netjam.org
>>> +31 6 2757 7177 (SMS ok)
>>> + 1 415 287 3547 (no SMS)
> --
> best,
> Eliot