<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">2016-03-12 21:19 GMT+01:00 Florin Mateoc <span dir="ltr">&lt;<a href="mailto:florin.mateoc@gmail.com" target="_blank">florin.mateoc@gmail.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <br>

  <div bgcolor="#FFFFFF" text="#000000">

    <div>On 3/12/2016 2:44 PM, Clément Bera

      wrote:<br>

    </div>

    <blockquote type="cite">


      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <div>Hello Florin,</div>

            <div><br>

            </div>

            <div><i>How is Sista being developed (is it developed in the

                open (in trunk))? Where is the source code?</i><br>

            </div>

            <div><br>

            </div>

            <div>The sista backend (JIT extension, primitive support,
              etc.) is in the trunk, in Slang. The image-level source code
              is in a separate repository on SmalltalkHub. It will be
              merged at some point, once sista is production-ready.
              Depending on the configuration, you can load the image-level
              code for Squeak or Pharo with different tool extensions.
              In Pharo the tools are slightly better, as I can use Roassal to
              display control flow graphs, but in Squeak the IDE is far
              more stable on partially broken images, for example when you are
              actually running on top of the sista VM and incorrectly
              optimized methods are installed. So you can use whichever you
              need depending on what you are doing. Typically, I use Pharo to
              implement and debug optimization passes, then use Squeak to debug
              the running system and implement the VM-image interface.</div>

            <div><br>

            </div>

            <div><i>How does one contribute? </i><br>

            </div>

            <div><br>

            </div>

            <div>That is not easy to answer briefly... Let&#39;s say first that
              we would welcome a new contributor.</div>

            <div><br>

            </div>

            <div>To contribute to the backend you need to commit to the
              trunk. There are already different things you could do
              there. Are you familiar with the JIT implementation?</div>

            <div><br>

            </div>

            <div>To contribute to the image-level code I guess you can
              commit to the SmalltalkHub repo, but ...</div>

            <div><br>

            </div>

            <div>I am currently reworking the dynamic deoptimization,
              hence the bleeding edge is not stable at all any more. In
              addition, I did lots of debugging and I need to check
              things like dependencies on other packages again and remove
              debugging code. Lastly, I hacked a lot on range optimizations,
              and the optimization process now generates more efficient
              code but is unclear and slow to run. I think I need to
              fix that part to make it easier to understand, as I did for
              inlining.</div>

            <div><br>

            </div>

            <div>A few months ago, when it was fairly stable, I could
              run the game benchmarks with this kind of result (run time
              of sista+spur compared to regular spur; those were the worst
              cases, sometimes it got 20% better):</div>

            <div>

              <div style="font-size:12.8px">nbody:  -35% </div>

              <div style="font-size:12.8px">threadring: -40%</div>

              <div style="font-size:12.8px">binarytrees: -20%</div>


              <div style="font-size:12.8px">fibonacci: -35%</div>

              <div style="font-size:12.8px">integer benchmark: -10%</div>

            </div>

            <div style="font-size:12.8px"><br>

            </div>

            <div style="font-size:12.8px">To contribute, we can chat on
              Skype so I can show you what we have and what you can do.</div>

            <div style="font-size:12.8px"><br>

            </div>

            <div style="font-size:12.8px">I am going to work on it with Eliot
              in April and we&#39;re going to introduce some big
              changes, especially to closure support. </div>

            <div style="font-size:12.8px"><br>

            </div>

            <div style="font-size:12.8px">Due to the current status
              (bleeding edge unstable + the big changes we&#39;ll make in
              April), it is currently quite difficult to contribute to
              the image-level code. So, either you wait until mid-April
              and contribute to the back-end until then, or, since you seem
              motivated, we could talk and see if you can do something
              now, though it won&#39;t be straightforward. If I had known last
              month that you wanted to contribute, I would have been more
              careful. </div>

            <div style="font-size:12.8px"><br>

            </div>

            <div style="font-size:12.8px">I don&#39;t have that much time
              right now but we could talk for an hour sometime.</div>

            <div style="font-size:12.8px"><br>

            </div>

            <div style="font-size:12.8px"><span style="font-size:small"><i>How

                  do you guys communicate (is there a separate mailing

                  list)?</i></span><br>

            </div>

          </div>

          <br>

        </div>

        <div class="gmail_extra">Well, it&#39;s mainly Eliot and me working
          on it, so we Skype and send each other mails. Sometimes we
          include Ryan too, as he is interested. We could create a
          mailing list I guess, but I am not sure it is worth it.</div>

        <div class="gmail_extra"><br>

        </div>

        <div class="gmail_extra"><font face="arial, helvetica,

            sans-serif"><i>The SSA representation and the

              transformations are AST based?</i></font><br>

        </div>

        <div class="gmail_extra"><font face="arial, helvetica,

            sans-serif"><i><br>

            </i></font></div>

        <div class="gmail_extra"><font face="arial, helvetica, sans-serif">Not really. It is also a high-level
            representation, somewhere between the AST and bytecode
            levels. The optimizer decompiles to its own IR, then runs the
            passes and generates back the optimized method. The IR </font><span style="font-family:arial,helvetica,sans-serif">is not a tree,
            as the AST is or as in gcc, but a control flow graph, similar to
            Hydrogen, the IR of V8&#39;s Crankshaft, and also similar to LLVM IR
            but at a higher level. It&#39;s not a sea of nodes as in Graal and co.
            The nodes are a bit different from those of the AST, as there are no
            temporary variables in SSA representations. A few other
            nodes exist, such as unchecked operations, phi nodes and guards.</span></div>
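        <div class="gmail_extra">To make the shape of such an IR concrete, here is a
          minimal sketch in Python of a control flow graph in SSA form with a
          phi node, a guard and an unchecked operation. It is only an
          illustration under assumptions: the names Instr, BasicBlock,
          build_example, predecessors and defined_names are hypothetical and
          are not taken from the sista code base.</div>
<pre>
```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    """One SSA instruction; `name` is the single value it defines."""
    name: str
    op: str            # hypothetical opcodes: "param", "const", "phi", "guard", ...
    args: tuple = ()


@dataclass
class BasicBlock:
    """A node of the control flow graph: straight-line code plus out-edges."""
    label: str
    instrs: list = field(default_factory=list)
    successors: list = field(default_factory=list)  # labels of successor blocks


def build_example():
    """CFG for the Smalltalk snippet:
    x := cond ifTrue: [1] ifFalse: [2]. ^ x + 3
    The temporary x disappears; each assignment becomes its own SSA value."""
    entry = BasicBlock("entry", [Instr("c", "param")], ["then", "else"])
    then_ = BasicBlock("then", [Instr("x1", "const", (1,))], ["join"])
    else_ = BasicBlock("else", [Instr("x2", "const", (2,))], ["join"])
    join = BasicBlock("join", [
        # The phi node merges the value coming from each predecessor.
        Instr("x3", "phi", (("then", "x1"), ("else", "x2"))),
        # The guard records a type assumption made by the optimizer.
        Instr("g", "guard", ("x3", "SmallInteger")),
        # Once guarded, the addition needs no type check of its own.
        Instr("r", "unchecked_add", ("x3", 3)),
    ])
    return {b.label: b for b in (entry, then_, else_, join)}


def predecessors(cfg):
    """Invert the successor edges of the graph."""
    preds = {label: [] for label in cfg}
    for block in cfg.values():
        for succ in block.successors:
            preds[succ].append(block.label)
    return preds


def defined_names(cfg):
    """All values defined in the graph; in SSA form each appears once."""
    return [i.name for b in cfg.values() for i in b.instrs]
```
</pre>
        <div class="gmail_extra">In this sketch the phi node has exactly one
          input per predecessor of its block, which is the property that
          replaces temporary variables in SSA form.</div>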

        <div class="gmail_extra"><font face="arial, helvetica,

            sans-serif"><br>

          </font></div>

        <div class="gmail_extra"><i>Is there any documentation?</i><font face="arial, helvetica, sans-serif"><br>

          </font></div>

        <div class="gmail_extra"><br>

        </div>

        <div class="gmail_extra">Well, I tried to write documentation at
          some point, but the code base changed a lot a few times, and
          it will change quite a lot again with the changes we will make
          to closures. Until the first release happens, I
          don&#39;t think there will be a lot of documentation, due to the
          high rate of change.</div>

        <div class="gmail_extra"><br>

        </div>

        <div class="gmail_extra">I believe I kept the code very well
          commented (though it depends on your standards). Each class
          should have a comment describing what its instances do and
          how, sometimes with references to the algorithm used.</div>

        <div class="gmail_extra"><br>

        </div>

        <div class="gmail_extra">The paper we wrote at ESUG can help; I
          believe it&#39;s quite well written:</div>

        <div class="gmail_extra"><a href="http://esug.org/data/ESUG2014/IWST/Papers/iwst2014_A%20bytecode%20set%20for%20adaptive%20optimizations.pdf" target="_blank">http://esug.org/data/ESUG2014/IWST/Papers/iwst2014_A%20bytecode%20set%20for%20adaptive%20optimizations.pdf</a><br>

        </div>

        <div class="gmail_extra"><br>

        </div>

        <div class="gmail_extra">Otherwise, I am writing another paper on the
          overall architecture right now. I will attempt to submit it to
          OOPSLA even though it will be written quickly. I guess I could share
          it with you at this point if you promise *not* to do something
          stupid with unreleased papers, like showing it to the wrong
          people.</div>

        <div class="gmail_extra"><br>

        </div>

        <div class="gmail_extra">Lastly, there are the Aosta (pre-sista
          project) sketches from Eliot et al. I have attached them.</div>

        <div class="gmail_extra"><br>

        </div>

        <div class="gmail_extra">My turn to ask questions :-) </div>
        <div class="gmail_extra">Are you an engineer, a student, something
          else? </div>

        <div class="gmail_extra">Would you contribute to sista a few

          hours a week, a few hours a month, ... ? </div>

        <div class="gmail_extra">Note that aside from new features,
          stabilization and optimization passes, making the code easier
          to read, writing documentation and making clever remarks are
          all valid contributions to us.</div>

        <div class="gmail_extra"><font face="arial, helvetica,

            sans-serif">I can give you a few tasks to get you started if

            you want.</font></div>

        <div class="gmail_extra"><font face="arial, helvetica,

            sans-serif"><br>

          </font></div>

      </div>

    </blockquote>

    <br>

    Hi Clément,<br>

    <br>

    Thank you for taking the time to reply.<br>

    <br>

    Unfortunately my interest and experience mostly relate to

    image-level - I am an Electrical Engineer and I climbed my way up

    from assembly to C, to C++, and up to Smalltalk (and then down to

    Java), but that was a long time ago. More recently I have worked a

    lot on type inferencing for Smalltalk, as well as on

    source-to-source translation (based on ASTs enhanced with type

    information), both Smalltalk to Smalltalk and Smalltalk to Java.<br>

    <br>

    I could contribute to sista a few hours a week.<br></div></blockquote><div><br></div><div>That would be great. </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <br>

    And now on to clever remarks :)<br>

    Are you doing type inferencing, and have you considered using SSI
    instead of SSA? For inheritance-based languages with deep
    hierarchies, such as Smalltalk, type-discrimination tests (and I
    don&#39;t just mean #isKindOf: or even #is... methods in general) are
    both common and important to take advantage of for accurate type
    inferencing.<br>

    <br></div></blockquote><div> </div><div>I do type inference, but I don&#39;t base it on the class hierarchy or on #isKindOf:.</div><div><br></div><div>For each method or block, the optimizer looks at the inline caches in the machine code to find out which types were met for the receiver of each send. This way, the optimizer speculates on the types of specific objects, performs inlining, and inserts guards that trigger dynamic deoptimization of the code if the type assumptions no longer hold. Then, specific variables are typed based on the guards. Eliot and I are also adding some constraints on literals so that they can&#39;t be the target of a become; this way the optimizer knows the literal types, and become operations require dynamic deoptimization of the specific methods using the become object as a literal.</div><div><br></div><div>For #isKindOf: and #is..., depending on what is actually inlined, the optimizer can do a good job or not. If #isKindOf: is effectively inlined to a type check, then I guess the type could be effectively inferred, but that is not currently the case, as we have not yet added a feature to forbid editing literal variables; hence the optimizer can&#39;t know for sure that the literal variable holding the class won&#39;t be mutated later.</div><div><br></div><div>Currently the main performance boost comes from method/closure inlining, array bounds check elimination and constant folding/global value numbering. Maybe later I will need more accurate types and will consider implementing SSI.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    Florin<br>

    <br>

    <br>

  </div>

<br></blockquote></div><br></div></div>
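<div dir="ltr">The type-feedback scheme described in the reply above (record
  receiver types at send sites, speculate on them, guard the speculation, and
  deoptimize when a guard fails) can be illustrated with a toy model in Python.
  This is only a sketch under assumptions: a Python set stands in for an inline
  cache, and the names Deoptimize, generic_send, optimize, run, Circle and
  Square are hypothetical and not part of sista or the VM.</div>
<pre>
```python
class Deoptimize(Exception):
    """Raised when a guard fails; the caller falls back to unoptimized code."""


def generic_send(receiver, selector, cache):
    """Unoptimized send: full lookup, and record the receiver class,
    standing in for what an inline cache would remember."""
    cache.add(type(receiver))
    return getattr(receiver, selector)()


def optimize(selector, cache):
    """Speculate only if a single receiver class was observed."""
    if len(cache) != 1:
        return None
    (expected,) = cache
    target = getattr(expected, selector)  # method bound once, ahead of time

    def optimized(receiver):
        if type(receiver) is not expected:  # the guard
            raise Deoptimize(type(receiver).__name__)
        return target(receiver)  # direct call, no lookup
    return optimized


class Circle:
    def area(self):
        return 3


class Square:
    def area(self):
        return 4


def run(receivers):
    """Send `area` to each receiver, switching between slow and fast paths."""
    cache = set()
    fast = None
    log = []
    for r in receivers:
        if fast is None:
            log.append(("slow", generic_send(r, "area", cache)))
            fast = optimize("area", cache)
        else:
            try:
                log.append(("fast", fast(r)))
            except Deoptimize:
                fast = None  # discard the optimized code
                log.append(("deopt", generic_send(r, "area", cache)))
    return log
```
</pre>
<div dir="ltr">Running it on two circles followed by two squares takes the slow
  path once, the fast path once, deoptimizes on the first square, and then stays
  on the slow path because two receiver classes have been seen.</div>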