[squeak-dev] Decompiler buggy (was: AW: [Etoys, Compiler] Help wanted: Trying to embed SyntaxMorphs into other tiles)

Eliot Miranda eliot.miranda at gmail.com
Sun Mar 29 17:49:33 UTC 2020


Hi Christoph,

    please read what I'm about to say carefully.  This message is aimed at
you :-)

On Sat, Mar 28, 2020 at 6:09 AM Nicolas Cellier <
nicolas.cellier.aka.nice at gmail.com> wrote:

> Hi Christoph,
>
> Le sam. 28 mars 2020 à 01:12, Thiede, Christoph <
> Christoph.Thiede at student.hpi.uni-potsdam.de> a écrit :
>
>> Hi Eliot, hi all,
>>
>>
>> ah, I finally found the bug, but this was a really hard hunt! :D
>>
>>
>> The solution is absolutely simple, again:
>>
>>
>> codeAnySelector: selector
>>
>>     ^SelectorNode new
>>         key: selector
>> -       index: 0
>> +       index: nil
>>         type: SendType
>>
>>
> Good find!
>
>> Seriously, did the Decompiler ever reliably produce re-generatable parse
>> trees in the past? But it should do so, shouldn't it? :-)
>>
> Maybe it did (see below). But I'm not sure that it was a feature...
> Isn't it mostly used for replacing absent source code... that will
> eventually be reparsed? (!)
>
>> Before the above patch, the following example was broken, too:
>>
>> class := Object newSubclass.
>> class compile: 'foo ^ 1 + 1'.
>> (class >> #foo) decompile generate valueWithReceiver: class new
>> arguments: #(). "SmallInteger does not understand #foo"
>>
>>
>> Now I'm wondering what are the actual semantics of the index variable.
>> Its method comment about "various uses depending on the class of the
>> receiver" is quite generic - do you know some more details about this?
>> Should we also use nil instead of 0 in
>> DecompilerConstructor >> #codeAnyLiteral:? At first glance, senders of #encodeLiteral: do not
>> appear to set it to zero manually (so they leave it nil), but unless there
>> is any documentation of the index meaning, this is speculation only, as I
>> could not find any other example where decompilation + regeneration produce
>> a method that cannot be executed properly.
>>
> It's very low level, some kind of reflection of the bytecode encoding.
> Once upon a time (< Squeak 4.0), the code was even more horrible to follow!
>
> LeafNode>>key: object index: i type: type
>     self key: object code: (self code: i type: type)
>
> LeafNode>>code: index type: type
>     index isNil
>         ifTrue: [^type negated].
>     (CodeLimits at: type) > index
>         ifTrue: [^(CodeBases at: type) + index].
>     ^type * 256 + index
>

Exactly.  This is actually obsolete genius by Dan Ingalls.  If you have a
look at the original Smalltalk-80 bytecode compiler you'll see that the
parse tree nodes both represent the parse tree *and* generate the output
bytecodes.  This was really important on 16-bit Smalltalk-80 since it meant
that the bytecode compiler was extremely compact and concise.  Objects were
in extremely short supply, 32k objects in a normal implementation (with
15-bit SmallIntegers), and 48k objects in a "stretch" implementation that
had 14-bit SmallIntegers.

Now that we have 32-bit and 64-bit implementations, this concision is obsolete
and what we need is flexibility and clarity.

I had done some reimplementation work on the bytecode compiler in 2009 to
add the closure bytecodes, and to add a proper code generation back end in
the BytecodeEncoder framework, but I never finished the cleanup. The index
and code inst vars in the LeafNode hierarchy are vestiges of the old
implementation.  It would be really good to get rid of the code inst var
altogether and to be left only with index, and index being the literal
index for literal nodes (perhaps negative indices being used for special
selectors), index being the inst var index for inst var nodes, and index
being the temp var offset for temp var nodes, etc.
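
To make the legacy encoding concrete, here is a minimal workspace sketch of
the code:type: logic quoted above. It is an illustration only: the table
values and the value of SendType are assumptions standing in for the real
CodeBases, CodeLimits and SendType, and legacyCode is a hypothetical block
name. Under those assumptions a nil index yields the negated type, which
appears to mark "no literal slot assigned yet", whereas an index of 0 yields
a concrete code that already refers to slot 0. That would be consistent with
the index: 0 versus index: nil difference in the patch.

```smalltalk
"Workspace sketch of the legacy LeafNode>>code:type: computation.
 All table values below are ASSUMPTIONS for illustration, not read from an image."
| sendType codeBases codeLimits legacyCode |
sendType := 5.                      "assumed value of SendType"
codeBases := #(0 16 32 64 208).     "assumed CodeBases, indexed by type"
codeLimits := #(16 16 32 32 16).    "assumed CodeLimits, indexed by type"

"Same three cases as the quoted method: nil index, short form, long form."
legacyCode := [:index :type |
	index isNil
		ifTrue: [type negated]
		ifFalse: [
			(codeLimits at: type) > index
				ifTrue: [(codeBases at: type) + index]
				ifFalse: [type * 256 + index]]].

Transcript
	show: (legacyCode value: nil value: sendType) printString; cr;   "-5"
	show: (legacyCode value: 0 value: sendType) printString; cr;     "208"
	show: (legacyCode value: 20 value: sendType) printString; cr.    "1300"
```

With a single index inst var per node class, none of this arithmetic would be
needed in the parse nodes; the encoder could derive the bytecode from the
node's class and index.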

But this really needs someone with fresh eyes and energy.  My plate is
full.  When I did think of doing this I realized that it is probably wise
to clone the compiler altogether and do the development and testing work in
the clone before moving it back to LeafNode et al for the first functional
commit.  This is to avoid breaking the compiler while trying to fix it.

So Christoph, do you accept my challenge and will you try and eliminate the
code inst var from LeafNode?
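
One possible safety net for such a cleanup is a decompile/regenerate round
trip, built only from the expressions already used in this thread. The
sketch below is a workspace snippet, not an existing test case; it assumes
the codeAnySelector: patch is loaded, in which case the regenerated method
should answer 2 rather than raise a doesNotUnderstand: error.

```smalltalk
"Round-trip sketch: compile, decompile, regenerate, then run the regenerated method."
| class regenerated result |
class := Object newSubclass.
class compile: 'foo ^ 1 + 1'.

"Regenerate a CompiledMethod directly from the decompiled parse tree."
regenerated := (class >> #foo) decompile generate.

result := regenerated valueWithReceiver: class new arguments: #().
Transcript show: result printString; cr.
```

Running the same check against the cloned compiler and the original one
would give a quick signal that removing the code inst var has not changed
the generated methods' behavior.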



>
> As you see, the index i is passed as the argument of the code: keyword (?!
> presumably because the keyword documents the output, not the input);
> then the parameter of code:type: shadows the index instance variable...
> And the index instance variable was never set... Kind of brainfuck.
>
> We still have code:type: and index variable shadowing in current trunk...
>
>> By the way, here is another interesting one-liner:
>>
>> (Object newSubclass environment: self environment; compile: 'foo
>> ^(ObjectTracer on: nil) class'; >> #foo) decompile generate
>> valueWithReceiver: nil arguments: #()
>>
>>
>> Interestingly, it opens a debugger - in other words, #class is sent as a
>> regular selector. The decompiler does not know anything about special
>> selectors at the moment. Is this desired behavior? I wonder whether it
>> should be the parse tree's responsibility to install this kind of
>> optimization, rather than the responsibility of the Compiler.
>> Because in reality, Compiler is not the only client that requests code
>> generation from parse trees. Etoys is a good example for a client from
>> another domain that uses this service, too. Should all these other clients
>> be denied these important optimizations of Smalltalk expressions?
>>
> After parsing, there are other compilation phases, for analyzing variable
> scope, clean blocks, etc...
> It's possible to scatter the implementation of the various phases across the
> nodes themselves, but the trend is rather to use a visitor pattern;
> it gathers the handling in some specialized classes that hold all the
> state (rather than passing it as message arguments).
> The Pharo team did a complete re-engineering of the compiler (OpalCompiler)
> that you could study.
>
>> Best,
>> Christoph
>>
>> ------------------------------
>> *From:* Thiede, Christoph
>> *Sent:* Friday, 27 March 2020, 23:16
>> *To:* The general-purpose Squeak developers list
>> *Subject:* AW: [squeak-dev] [Etoys, Compiler] Help wanted: Trying to
>> embed SyntaxMorphs into other tiles
>>
>>
>> Hi Eliot,
>>
>>
>> > It looks correct.  Can you check it against the old bytecode set too?
>> We don’t want it to break old-style blocks.
>>
>> Good point. I ran
>>
>> (Object >> #asOrderedCollection) decompile generate valueWithReceiver: 42
>> arguments: #().
>>
>>
>> for both bytecode sets, and both were fine.
>>
>> But:
>>
>> (Collection >> #asArray) decompile generate valueWithReceiver: {42}
>> asOrderedCollection arguments: #().
>>
>>
>> breaks - in both bytecode sets. This is weird.
>> I will have a look into it, maybe I can discover what's wrong.
>>
>> In addition, I propose to write tests for this. But it's not the goal of
>> the decompiler to yield exactly the same parse tree or source code as the
>> original method, is it? In that case, we will need to write a lot of
>> fixtures for the tests.
>>
>> Best,
>> Christoph
>>
>>
>>
>> ------------------------------
>> *From:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on
>> behalf of Eliot Miranda <eliot.miranda at gmail.com>
>> *Sent:* Friday, 27 March 2020, 21:33
>> *To:* The general-purpose Squeak developers list
>> *Subject:* Re: [squeak-dev] [Etoys, Compiler] Help wanted: Trying to
>> embed SyntaxMorphs into other tiles
>>
>> Hi Christoph,
>>
>> On Mar 27, 2020, at 12:45 PM, Thiede, Christoph <
>> Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
>>
>>
>>
>> Hi all! :-)
>>
>> Just an update of the decompilation question:
>>
>> Christoph Thiede wrote
>> I don't know how to use #generate: exactly, but other senders usually
>> appear to recompile a method before passing it to #generate:.
>> For comparison:
>>
>> [ (Collection >> #asArray) decompile generate: CompiledMethodTrailer
>> empty ] fails, but
>>
>> [ m := (Collection >> #asArray) decompile.
>>
>>   m := Compiler new compile: m in: Collection notifying: nil ifFail: #foo.
>>   m generate: CompiledMethodTrailer empty ] works.
>> Why is that recompilation required but decompilation is insufficient? Is
>> this some bug, or is it expected behavior?
>>
>> The general approach seems to be correct, but I think I found an error in
>> the decompilation of literal variables such as Array. I sent
>> Compiler-ct.425 to the inbox which should fix this issue.
>>
>>
>> I moved this to inbox.  It looks correct.  Can you check it against the
>> old bytecode set too?  We don’t want it to break old-style blocks.
>>
>>
>> I am going to complete the implementation of SyntaxMorph >> #parseNode :-)
>>
>> Best,
>> Christoph
>> ------------------------------
>> *From:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on
>> behalf of Thiede, Christoph
>> *Sent:* Tuesday, 15 October 2019, 21:08:24
>> *To:* squeak-dev at lists.squeakfoundation.org
>> *Subject:* [squeak-dev] [Etoys, Compiler] Help wanted: Trying to embed
>> SyntaxMorphs into other tiles
>>
>>
>> Hi all,
>>
>>
>> I'm currently trying to implement #parseNodeWith: on SyntaxMorph, in
>> order to embed SyntaxMorphs into regular tiles. (Did this ever work in
>> the past?)
>>
>> I'm afraid the attempt in the commit below does not work yet; you can
>> create a script editor, but parsing is erroneous, so you cannot execute the
>> script.
>>
>>
>> To reproduce:
>>
>> Compile the following:
>>
>> MyPlayer >> examplePlayerCode
>>     self forward: 6 * 7.
>>     self turn: (11 raisedTo: 13 modulo: 97)
>>
>> and evaluate:
>>
>> | e p |
>> p := Morph new openInWorld assuredPlayer.
>> e := (MyPlayer >> #examplePlayerCode) decompile asScriptEditorFor: p.
>> e openInHand.
>>
>>
>> In Player>>#acceptScript:for:, #generate: is called on node, and when I
>> decompile the result, I get a strange result:
>>
>>
>> examplePlayerCodeTest
>>     self forward: 6 * 7.
>>     self forward: (#forward: forward: #forward:).
>>
>>
>> I don't know how to use #generate: exactly, but other senders
>> usually appear to recompile a method before passing it to #generate:.
>>
>> For comparison:
>>
>> [ (Collection >> #asArray) decompile generate: CompiledMethodTrailer
>> empty ] fails, but
>>
>> [ m := (Collection >> #asArray) decompile.
>>   m := Compiler new compile: m in: Collection notifying: nil ifFail: #foo.
>>   m generate: CompiledMethodTrailer empty ] works.
>>
>> Why is that recompilation required but decompilation is insufficient? Is
>> this some bug, or is it expected behavior?
>>
>>
>> However, in the case of SyntaxMorph, I don't know how to recompile the
>> node beforehand, as a SyntaxMorph should be able to represent a node of an
>> arbitrary type that need not be constrained to a MessageNode. So how could
>> I solve the problem of generating code from SyntaxMorphs?
>>
>>
>> tl;dr: What is the full story of #generate: and how can it be made to
>> work in this example?
>>
>> Many thanks in advance! :-)
>>
>>
>> Best,
>>
>> Christoph
>>
>>
>> ------------------------------
>> *From:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on
>> behalf of commits at source.squeak.org <commits at source.squeak.org>
>> *Sent:* Tuesday, 15 October 2019, 14:46
>> *To:* squeak-dev at lists.squeakfoundation.org
>> *Subject:* [squeak-dev] The Inbox: EToys-ct.367.mcz
>>
>> A new version of EToys was added to project The Inbox:
>> http://source.squeak.org/inbox/EToys-ct.367.mcz
>>
>> ==================== Summary ====================
>>
>> Name: EToys-ct.367
>> Author: ct
>> Time: 15 October 2019, 2:46:24.862129 pm
>> UUID: 1394344f-b1e3-5640-a13a-70c5dffd51f4
>> Ancestors: EToys-mt.361
>>
>> Allow for embedding SyntaxMorphs into test tiles.
>>
>> =============== Diff against EToys-mt.361 ===============
>>
>> Item was added:
>> + ----- Method: SyntaxMorph>>parseNodeWith:asStatement: (in category
>> '*Etoys-Squeakland-code generation') -----
>> + parseNodeWith: encoder asStatement: aBoolean
>> +
>> +        ^ self parseNode!
>>
>>
>>
>>
>>
>

-- 
_,,,^..^,,,_
best, Eliot