About the new compiler, Part I
John Brant
brant at refactoryworkers.com
Sat Jan 19 17:11:56 UTC 2008
Marcus Denker wrote:
> 4) The annotated AST now can be used to generate Code. The NewCompiler
> uses a very nice small "Squeak Assembler" to emit
> the code. The class for Code Generation is ASTTranslator. Again a
> visitor, walks over the tree and calls IRBuilder to
> emit code.
> Here in ASTTranslator the magic happens for the inlining of
> if/and/while and so on:
>
> acceptMessageNode: aMessageNode
>
> aMessageNode isInlineIf ifTrue: [^ self emitIfNode: aMessageNode].
> aMessageNode isInlineIfNil ifTrue: [^ self emitIfNilNode:
> aMessageNode].
> aMessageNode isInlineAndOr ifTrue: [^ self emitAndOrNode:
> aMessageNode].
> aMessageNode isInlineWhile ifTrue: [^ self emitWhileNode:
> aMessageNode].
> aMessageNode isInlineToDo ifTrue: [^ self emitToDoNode:
> aMessageNode].
> aMessageNode isInlineCase ifTrue: [^ self emitCaseNode:
> aMessageNode].
> ^ self emitMessageNode: aMessageNode
>
>
> Uncomment the line with "isInlineIfNil" --> no compiler generates
> normal message send for ifNil:. Easy.
Since you have the transformations available, you could use them as
macros to eliminate some of the cases here. For example, #ifNil: can be
transformed into "isNil ifTrue:".
When I was working on #Smalltalk for .NET, I used the transformations
for all of your special cases. After the code was parsed, it would
perform the transformations on the parse tree to get a new tree that was
then compiled. Each transformation was named and could be added or
removed on a class by class basis. For the #ifNil: message, I would add
a global rule:
Name: #ifNil:
Search: ``@a ifNil: ``@b
Replace: ``@a isNil ifTrue: ``@b
If someone wanted to remove the #ifNil: transformation for a particular
class, they could add a class rule with the same name and without any
search/replace expressions.
In addition to the transformations, I also created a special message
that was evaluated in the context of the current compiler. In
#Smalltalk, if you sent a #primitive:* message to Compiler global, it
would evaluate the primitive block in the compiler's context. For
example, you could have code like:
myMethod
| a |
a := self printString.
Compiler
primitive: [:compiler :block |
1 to: block statements size by: 2
do: [:i |
compiler
emitStatement: (block statements at: i)]]
argument: [a := a , '1'. a := a , '2'. a := a , '3'].
^a
When this is compiled, it would get to the Compiler #primitive:argument:
message, and then the compiler would evaluate the #primitive: block
argument passing the compiler and the "[a := a , '1'. a := a , '2'. a :=
a , '3']" parse tree as the arguments. In this example, the primitive
block tells the compiler to emit only the odd statements in the block.
Effectively, this would generate code that would look like:
myMethod
| a |
a := self printString.
a := a , '1'.
a := a , '3'.
^a
In addition to the argument: keyword, I also have an evaluate: keyword
for the #primitive:* message send. While the argument: keyword causes
the corresponding parse tree to be passed to the primitive: block, the
evaluate: keyword causes the compiler to compile the code that pushes
its corresponding argument on the stack. For example, in #Smalltalk, I
handle a == ifTrue: pattern using:
Name: ==ifTrue:
Search: ``@a == ``@b ifTrue: ``@trueBlock `{:node | node isBlock and:
[node arguments isEmpty]}'
Replace: Compiler
primitive:
[:methodCompiler :block |
methodCompiler generateIdentityIfTrue: block ifFalse: nil]
evaluate: ``@a
evaluate: ``@b
argument: ``@trueBlock
When the primitive block is evaluated, it knows that the code to push
the ``@a and ``@b objects has been generated, so it just needs to
generate a jump if equal and the block.
By combining the transformations with the Compiler #primitive:*
messages, we can eliminate almost all of your special cases in your
method above. In #Smalltalk, I only have two special cases. One for the
Compiler #primitive:* messages and another one that handles sending
messages to non #Smalltalk objects. For Squeak, you wouldn't need to
worry about sending messages to non Squeak objects, so you could
simplify the method above to only one special case.
Once I had implemented this for #Smalltalk, it was interesting
performing tests comparing the optimized code to the unoptimized code.
The unoptimized code was much bigger since every ifTrue:, whileTrue:,
etc. block was a real block and needed all of the code for a real block.
Also, it was interesting finding a real benchmark that wouldn't crash
with a stack overflow. For example, if you have no optimizations, then
something as simple as "100000000 timesRepeat: []" will blow the stack.
John Brant
More information about the Squeak-dev
mailing list
|