CorruptVM slang (Was: Re: [squeak-dev] Anyone know the following about Slang?)

Igor Stasenko siguctua at gmail.com
Sat Jul 5 03:41:27 UTC 2008


2008/7/5 tim Rowledge <tim at rowledge.org>:
>
> On 4-Jul-08, at 6:35 AM, Eliot Miranda wrote:
>>
>> [snip]
>> Thanks Tim!  That's what I needed.  Being pointed to the right place.  It
>> has taken 20 minutes to understand the code and 20 minutes to fix it.
>>  Thanks so much!!
>
> Nice to have actually achieved something this week; it's been one of those
> weeks...
>
> Simulating simulating the VM to gather type data seems like a pretty complex
> project. I can't help feeling it would be simpler and faster to simply write
> the VM cleanly, with decent documentation and specs.
>
> Igor, if you can produce a Better Slang With Lambdas, do please share the
> code. There has to be some way of cleaning up the current mess.
>

It is already done. :)

Native methods, which fully replace primitives in CorruptVM, have a
special syntax.
Here is an example of a <native> method in CorruptVM:

lookup: selector
	<native>
	<variable: #delegate knownAs: #VTable>
	<variable: #vec knownAs: #Vector>
	<variable: #assoc knownAs: #Association>
	| vec delegate |
	
	delegate := self.
	delegate equal: nil whileFalse: [
		vec := delegate bindings.
		1 to: vec size do: [:i |  | assoc |
			assoc := vec at: i.
			assoc key equal: selector ifTrue: [
				^ assoc value
				]
			].
		delegate := delegate delegate.
	].
	^ nil

A <native> pragma tells the compiler to switch to a different logic.
The logic is simple:
all variables and values are machine words (32/64/arbitrary-bit ints),
messages like +, -, *, / and bit shifts are pointer arithmetic, and
they behave exactly like the corresponding CPU instructions.
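
As a rough illustration (Python, not CorruptVM code), the machine-word
rule can be sketched as below. The 32-bit word size, the wrap-around on
overflow and the helper names are assumptions made for this sketch; the
post only says that the operations behave like CPU instructions.

```python
# Sketch of machine-word arithmetic. WORD_BITS and all helper names are
# hypothetical; wrap-around on overflow is assumed, as on most CPUs.
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def word_add(a, b):
    """Add two words, discarding any carry out of the top bit."""
    return (a + b) & WORD_MASK


def word_sub(a, b):
    """Subtract two words, wrapping below zero."""
    return (a - b) & WORD_MASK


def word_mul(a, b):
    """Multiply two words, keeping only the low WORD_BITS bits."""
    return (a * b) & WORD_MASK


def word_shl(a, n):
    """Logical shift left; bits shifted past the top are lost."""
    return (a << n) & WORD_MASK


def word_shr(a, n):
    """Logical shift right of an unsigned word."""
    return (a & WORD_MASK) >> n
```

Unlike Smalltalk integers, nothing here promotes to a larger integer:
a result that does not fit in a word is simply truncated.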

Blocks are not allowed in the syntax, except as arguments to some
special messages like #equal:whileTrue: , #equal:ifTrue: , and their
variants (replace 'equal' with 'greater', 'less' etc), which are simply
translated to branching instructions.

To read memory you simply write:
memValue := someAddressValue readWord.  "there is also a #readByte"

To write, you type:
someAddressValue writeWord: value
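
A minimal Python model of these memory messages, for illustration only.
The 4-byte word, the little-endian byte order and the class and method
names are assumptions of this sketch, not details taken from CorruptVM.

```python
# Sketch of word/byte memory access. WORD_SIZE, the byte order and the
# Memory class are hypothetical stand-ins for #readWord, #readByte and
# #writeWord:.
WORD_SIZE = 4  # bytes per machine word (assumed)


class Memory:
    def __init__(self, size):
        self.data = bytearray(size)

    def _check(self, address, length):
        if address < 0 or address + length > len(self.data):
            raise IndexError(f"access out of range at {address}")

    def read_word(self, address):
        """Counterpart of `address readWord`."""
        self._check(address, WORD_SIZE)
        return int.from_bytes(self.data[address:address + WORD_SIZE], "little")

    def read_byte(self, address):
        """Counterpart of `address readByte`."""
        self._check(address, 1)
        return self.data[address]

    def write_word(self, address, value):
        """Counterpart of `address writeWord: value`."""
        self._check(address, WORD_SIZE)
        mask = (1 << (8 * WORD_SIZE)) - 1
        self.data[address:address + WORD_SIZE] = (value & mask).to_bytes(
            WORD_SIZE, "little"
        )
```

For example, after `mem = Memory(16)` and `mem.write_word(8, 0x11223344)`,
`mem.read_word(8)` gives back the same word, and `mem.read_byte(8)` gives
its lowest byte under the little-endian assumption.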

But if you look at the example method, it looks quite similar to
regular smalltalk. That is because it uses static inlining.
The principle is simple: whenever the compiler sees a message send that
is not a special message (selector not found in CVSpecialMessages
class), it does a method lookup and inlines the method it finds.

Pragmas, like:
<variable: #vec knownAs: #Vector>
give the compiler hints about where it should look for the message
implementation.
In the example above, I send #at: to the 'vec' variable. As a result,
the compiler inlines the #at: method found either in the Vector class
or in one of its parents.
If the class of a variable is not specified, it is assumed to be
ProtoObject by default. If no implementation is found, the compiler
raises a compile error.
You can't recursively inline the same method. There is a simple check
which throws an error when you try this.
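
The lookup-and-inline rule can be sketched in Python as follows. This is
only a toy model under stated assumptions: the class table, the
Collection class, the method bodies and all names are hypothetical, and
a method body is reduced to a list of (class hint, selector) sends.

```python
# Toy model of static inlining. SPECIAL stands in for CVSpecialMessages;
# CLASSES and every method body here are invented for illustration.
SPECIAL = {"+", "readWord", "writeWord:"}

CLASSES = {
    "ProtoObject": {"parent": None, "methods": {}},
    "Collection": {
        "parent": "ProtoObject",
        "methods": {"at:": [(None, "+"), (None, "readWord")]},
    },
    "Vector": {"parent": "Collection", "methods": {}},
}


class InlineError(Exception):
    """Raised for a missing implementation or recursive inlining."""


def lookup(class_name, selector):
    """Walk up the parent chain until an implementation is found."""
    while class_name is not None:
        cls = CLASSES[class_name]
        if selector in cls["methods"]:
            return class_name, cls["methods"][selector]
        class_name = cls["parent"]
    raise InlineError(f"no implementation of {selector}")


def inline(body, active=()):
    """Expand a body into a flat list of special messages."""
    out = []
    for hint, selector in body:
        if selector in SPECIAL:
            out.append(selector)  # emitted directly as a low-level op
            continue
        # A missing hint defaults to ProtoObject.
        owner, found = lookup(hint or "ProtoObject", selector)
        key = (owner, selector)
        if key in active:
            raise InlineError(f"recursive inlining of {owner}>>{selector}")
        out.extend(inline(found, active + (key,)))
    return out
```

Here `inline([("Vector", "at:")])` finds #at: in the parent class and
expands it to the special messages in its body, while the same send
with no class hint fails because ProtoObject has no #at: in this table.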

Messages to thisContext are treated as messages to the compiler itself,
and can therefore be used as preprocessing directives. For instance, I
found the following directive (or call it a macro) quite useful:

thisContext ifInlined: [ ... ] ifNotInlined: [ .. ].

With this, I can determine whether the current method is being compiled
for inlining or for a call from smalltalk, and provide a different
implementation for each case.
An example is tagging small integers:
the method #size returns the size of an array. When inlined, it returns
the number of array elements as a machine word; when called from
smalltalk, it returns a small integer. This avoids excessive
tagging/detagging in many places.
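
A small Python sketch of the #size example. The tagging scheme used
here (shift left by one and set the low bit, as in classic 32-bit
Squeak SmallIntegers) is an assumption, as are all function names; the
post does not say how CorruptVM tags integers.

```python
# Sketch of tagged vs. raw results. The tag layout and every name here
# are hypothetical; `inlined` plays the role of the compile-time choice
# made by `thisContext ifInlined: [...] ifNotInlined: [...]`.
def tag(n):
    """Encode a raw machine integer as a tagged small integer."""
    return (n << 1) | 1


def untag(word):
    """Decode a tagged small integer back to a raw machine integer."""
    return word >> 1


def is_small_int(word):
    """A word is a small integer when its low bit is set (assumed)."""
    return (word & 1) == 1


def size(elements, inlined):
    """Return the element count raw when inlined, tagged otherwise."""
    count = len(elements)
    return count if inlined else tag(count)
```

Native code that inlines `size` works with the raw count directly, so
it never has to tag the value only to untag it again one step later.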

So, what it is in essence:
- with native methods you can provide an implementation of any
low-level basic behavior.
- things are quite simple and you write code very similar to regular
smalltalk, except that you need to keep in mind that all message sends
in a native method are either inlined or special low-level messages.

So, in a future system, you can write a package which contains both
smalltalk code and native code. You don't need plugins or anything else
external to make the code work right after it is loaded into the image.
Imagine a BitBlt package which contains everything in one place. Once
you load it, you have bitblt, and you don't need to care about
compiling/downloading plugins.

As for translation to C:
a translator is really easy to write. You basically need a
transformation which turns the low-level lambdas into C code. This is
quite straightforward once you have the low-level lambdas.

Simulation: it took me about 2 hours to implement a basic
CVCPUSimulator. It is really dumb and you won't find any complex logic
in it.

-- 
Best regards,
Igor Stasenko AKA sig.


