[ANN] Marvin: Self for Squeak

Pavel Krivanek squeak1 at continentalbrno.cz
Sun Sep 4 18:39:18 UTC 2005


What’s Marvin?

Marvin is a Self dialect which combines characteristics of the Self 
programming language and Smalltalk-80.


What is the status of this project?

It’s bleeding edge. It’s not ready for practical use yet, but there is a 
functional implementation which can compile and execute code.


Why a new language?

Smalltalk isn’t well suited to prototype-based systems. It has no literals 
for objects and you have to use the pseudo-variable "self" extremely 
frequently. The Self programming language can hardly be integrated with 
Squeak.

In short, Marvin is Self with Squeak literals and conventions. It’s 
integrated with the Squeak environment.


Does Marvin have any special interpretation layer?

No, it is compiled directly to the native bytecodes of Squeak and 
executed by the virtual machine.


Do I need any special version of the virtual machine?

Yes, you need a virtual machine with an enhanced sending mechanism.


What are the VM modifications?

Currently Marvin adds two new primitives for defining the prototype 
class and it modifies the sending and resending mechanism. If the 
receiver of a message is an instance of the prototype class, the VM uses 
another lookup algorithm based on delegation.


Is this special virtual machine slower?

Of course, yes. This test costs about 2% of speed now, but after 
optimization there may be only one additional comparison of two 
integer values. So there will be no relevant slowdown.


What’s the prototype class?

Instances of this class (MarvinPrototype) are prototypes – objects with 
slots. Prototypes have no instance variables.


How does delegation work?

When you send a message to a prototype, the VM searches its slots. If it 
finds a matching slot, it performs the appropriate operation: it reads or 
writes the value of the slot, or it calls the compiled method stored in 
the slot. If no matching slot is found, the search continues with the 
objects referenced by the parent slots.


What’s the physical structure of a prototype?

In fact, a prototype has indexable pointer variables (like Array). It has 
four sections of slots and the sections are separated by nil. Every slot 
takes 1-3 elements depending on the slot type:

method slot:
- takes 2 elements
- the first element is a reference to the method selector (like #method)
- the second element is a reference to the compiled method
- if the selector of the sent message is the same object as the first 
element, the compiled method is executed

writable data slot:
- takes 3 elements
- the first element is a reference to the read message selector (like 
#variable)
- the second element is a reference to the write message selector (like 
#variable:)
- the third element is a reference to the slot value
- if the selector of the sent message is the same as the first element, 
the result of the message send is the third element
- if the selector of the sent message is the same as the second element, 
the VM takes the argument from the stack and stores it into the third 
element

read-only data slot:
- has the same structure as a writable data slot
- the second element refers to the same object as the first element (the 
read message selector)

parent slot:
- can be read-only or writable
- has the same structure as data slots (3 elements)

The order of slots is:
- parent slots
- method slots
- data slots
- indexable slots

For example, the Self object

( |
parent* = lobby.
method = ( 3+4).
x <- 5.
y = nil.
| )

is stored as an array of references to these objects:

01: symbol #parent
02: symbol #parent
03: object lobby
04: nil (separator)
05: symbol #method
06: compiled method
07: nil (separator)
08: symbol #x
09: symbol #x:
10: number 5
11: symbol #y
12: symbol #y
13: nil (value)
14: nil (separator)

A prototype with no slots is an array with three nils.
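
The matching rules for this layout can be modelled outside Squeak. The following Python sketch is only an illustration, not Marvin's VM code. It assumes that None stands for nil, that plain strings stand for symbols (the VM compares references, the sketch compares strings), and that slots are searched in storage order, which the text above does not state. The name local_send is hypothetical.

```python
# Illustrative model of the slot layout; not Marvin's actual VM code.
# None stands for nil, strings stand for symbols and other objects.

SLOT_SIZES = (3, 2, 3)  # cells per slot in the parent, method and data sections


def local_send(cells, selector, argument=None):
    """Look a selector up in one prototype's own slots.

    Returns ("value", v) for a read, ("stored", v) for a write,
    ("method", m) when a compiled method would be executed, and
    ("not found", None) when delegation would go on to the parents.
    """
    pos = 0
    for size in SLOT_SIZES:
        # A nil at a slot boundary is the separator that ends the section.
        while cells[pos] is not None:
            if size == 2:
                # Method slot: selector, compiled method.
                if cells[pos] == selector:
                    return ("method", cells[pos + 1])
            elif cells[pos] == selector:
                # Read selector matches: answer the stored value.
                return ("value", cells[pos + 2])
            elif cells[pos + 1] == selector:
                # Write selector matches: store the argument.
                cells[pos + 2] = argument
                return ("stored", argument)
            pos += size
        pos += 1  # skip the separator
    return ("not found", None)


# The fourteen-element array from the example above.
example = [
    "parent", "parent", "lobby", None,
    "method", "<compiled method>", None,
    "x", "x:", 5,
    "y", "y", None,
    None,
]
```

In this model a read-only slot needs no special case: its second element repeats the read selector, so a write selector such as "y:" simply never matches.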


Why do read-write data and parent slots contain both selectors and not 
only the slot name?

It’s a speed optimization. The virtual machine can simply compare 
references to selectors and it doesn’t have to concatenate strings and 
compare them.


Why don’t read-only data and parent slots take only 2 elements - slot 
name and value?

This way we don’t have to introduce two more slot types, and every slot 
type has a fixed size.


What are the indexable slots?

The delegation lookup stops at the third separator (nil). The rest of 
the prototype can contain arbitrary references, so prototypes can be used 
as collections. The only disadvantage is that we don’t know the index of 
the first element and we have to search sequentially for the position of 
the last separator.
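
A minimal Python sketch of that sequential search follows. It is a model, not Marvin's code, with None standing for nil and the hypothetical name first_indexable. It steps slot by slot rather than counting nils, on the assumption (suggested by the layout, where a slot value may itself be nil) that a plain count of nils could stop too early.

```python
# Illustrative model only; None stands for nil.

SLOT_SIZES = (3, 2, 3)  # cells per slot in the parent, method and data sections


def first_indexable(cells):
    """Return the index of the first element after the third separator."""
    pos = 0
    for size in SLOT_SIZES:
        while cells[pos] is not None:  # nil at a slot boundary ends the section
            pos += size                # jump over one whole slot
        pos += 1                       # skip the separator
    return pos


# No parents, no methods, one read-only data slot whose value is nil,
# followed by two indexable elements.
cells = [None, None, "y", "y", None, None, 10, 20]
start = first_indexable(cells)
elements = cells[start:]
```

Here start is 6 and elements holds the two trailing values; the nil stored as the value of y at index 4 is skipped as part of its slot.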


Can we create prototypes without the Marvin compiler?

Yes, they can be built directly from an array.

p := MarvinPrototype withAll: #( nil nil #x #x: 56 nil).
p x --> 56
p x: 42.
p x --> 42

Or you may use this way:

p := MarvinPrototype new.
p AddAssignSlot: #x value: 56.


Can prototypes refer to standard Smalltalk methods?

Yes, with one limitation. You may use something like:

lobby := MarvinPrototype new.
lobby AddMethodSlot: #slotNotFound: value: (MarvinPrototype class >> #lobbyDNU:).

but if this compiled method contains a super send, you have to put a 
reference to the prototype that owns the method as the last literal (in 
Smalltalk this literal is the method holder class).


Why do I have to specify the owner of the compiled method?

Because the virtual machine modifies super sends too. When the receiver 
is an instance of the prototype class, the super send bytecodes are 
interpreted as resends.


What’s a resend?

It’s a modified delegation send. The lookup doesn’t start from the 
receiver’s slots but from the parents of the object in which the 
executing method is defined.
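
The difference between an ordinary send and a resend can be sketched in Python. This is a simplified model, not the VM's implementation: slots are kept in a dictionary instead of the flat array, and the names Proto, MISSING, lookup and resend are all hypothetical.

```python
# Illustrative model only; not Marvin's VM code.

MISSING = object()  # marks "no matching slot found"


class Proto:
    def __init__(self, slots=None, parents=()):
        self.slots = dict(slots or {})
        self.parents = list(parents)


def lookup(obj, selector):
    """Ordinary delegation: own slots first, then the parents."""
    if selector in obj.slots:
        return obj.slots[selector]
    return resend(obj, selector)


def resend(holder, selector):
    """Resend: skip the holder's own slots and start with its parents."""
    for parent in holder.parents:
        found = lookup(parent, selector)
        if found is not MISSING:
            return found
    return MISSING


parent = Proto({"a": 6})
child = Proto({"a": 3, "b": 4}, parents=[parent])


def sum_method(receiver, holder):
    # Models a method slot of the form: sum = ( ^ resend a + b )
    return resend(holder, "a") + lookup(receiver, "b")


result = sum_method(child, child)
```

With these two objects, lookup(child, "a") finds the child's own slot (3), while resend(child, "a") skips it and finds the parent's (6), so result is 10.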


Does Marvin have temporary variables in methods and blocks?

No, it uses slots like Self, so if we have this Smalltalk method

sum
| a b |
a := 3.
b := 4.
^ a + b

Marvin’s equivalent method slot is:

sum = ( | a. b |
a: 3.
b: 4.
^ a + b )

However, local slots of methods and blocks are simulated by temporary 
variables.


Self methods return the result of the last expression implicitly. Why 
is there an explicit return here?

That’s because Marvin uses Squeak conventions for methods and blocks. 
Methods return the receiver implicitly and blocks return the result of 
the last expression, or nil if they are empty.


Can you specify values of local slots of blocks and methods in their 
definition?

Yes, the previous example can look like this:

sum = ( | a = [3]. b = [4] |
^ a + b )


Why these ugly square brackets?

If you specify a slot value in Self, the assigned expression is evaluated 
at compile time in the context of the lobby, so if you create this object:

( | slot = self | )

it contains one slot, named “slot”, which holds the lobby object! Square 
brackets set the expression apart visually.

Square brackets also help to have unambiguous grammar.

In the current version a slot can contain only the result of a single 
expression (like in Self), but in future versions it will be a whole 
expression sequence (including slot definitions).


Ok, but is it readable if you write object literals?

No. Look at this example (delegation demonstration, the result is 10).

(|
parent* = [ (|
a = [6]
|) ].
a = [3].
b = [4].
sum = ( ^ resend a + b)
|) sum

Maybe we will include a syntactic shortcut for the sequence of three 
characters [ ( | .


Can I use a block in slot value?

Yes, you can. An example:

(|
sum = [[ |:a. :b | a + b ]].
test = ( ^ sum value: 3 value: 4 )
|) test

Notice that this construction fails in Self because the block in the 
slot expires after compilation.


How are methods with arguments defined?

Unlike Self, Marvin has only one way to define methods (the Smalltalk way):

(|
a = [3].
sum: b = ( ^ a + b )
|) sum: 4


Can objects contain code?

No, only blocks and methods can contain code (like in Self).


Can methods contain methods?

No, nested methods are forbidden (like in Self).


Can methods and blocks contain parent slots?

No, unlike Self. Marvin uses standard Squeak blocks and it has no 
activation objects (in Self, parent slots are added to activation 
objects). But this is not an important limitation.


Are there any special naming conventions for slot names (methods)?

Marvin is fully integrated with Squeak so it has to use Squeak naming 
conventions. So it uses ifTrue:ifFalse: and not ifTrue:False: etc. The 
only convention of its own is that names of primitive methods begin with 
a capital letter.


Does Marvin have literals for characters, arrays etc.?

Yes, Marvin uses all Squeak literals except scaled decimals and 
expression arrays. Both will be added in future versions. Marvin creates 
these literals as standard Smalltalk objects.


Does Marvin have its own class for blocks?

No, even blocks are standard Smalltalk objects, so you can use all their 
capabilities, like multitasking.

(|
parent* = [ lobby ].
test = (
| number = [50]. result |
[
result: number factorial.
inform: result asString
] forkAt: Processor userBackgroundPriority )
|) test


Does Marvin have object and slot annotations like Self?

No, it doesn’t. They would be problematic to implement and they can 
simply be replaced with a more general annotation protocol analogous to 
class organizations. It’s not important to support them in the grammar.


Does Marvin have primitive methods?

If no matching slot is found during delegation, the virtual machine 
tries to find a method in the prototype class (as for any other Smalltalk 
object). That’s why prototypes fit in well with Squeak and you can 
print them, explore them etc.

Marvin doesn’t need classical primitive methods like Smalltalk because 
it can use standard Squeak infrastructure.


Can I use Squeak classes in Marvin programs?

Yes, if you wish. If your object has the lobby as its parent, it can 
access global objects in the Smalltalk system dictionary.


Can I use standard Smalltalk objects and classes as parents of prototypes?

It’s a very problematic operation. We may theoretically delegate to 
Smalltalk classes, but only if they have no instance variables. The 
current implementation only tries to resend messages to standard objects 
referred to by parent slots, and even this doesn’t work well. 
It’s maybe the most limiting aspect of Marvin’s design.


Are comments in Marvin the same as comments in Smalltalk and Self?

Marvin uses standard comments. Moreover, its lexical analyzer supports 
line comments (something like // in C++). They start with a doubled 
quotation mark.

(|
a = [3]. "" line comment
b <- [4]. "block comment"
|)


Can I use Unicode?

Yes, when you use Squeak 3.8 you can write non-ASCII characters in 
string literals, comments etc. You cannot use Unicode in identifiers 
(unlike Squeak).


Does Marvin do a full tree search during delegation?

No, it doesn’t. Unlike Self, Marvin doesn’t check for ambiguous calls 
and uses only a simple depth-first search, so the result depends on the 
order of the parent slots. However, thanks to this property Marvin is 
more flexible in redefinition of namespaces.
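
The effect of parent order can be illustrated with a small Python model. It is a sketch under assumptions, not Marvin's lookup code: slots live in a dictionary, the names Proto, visit_order and lookup are hypothetical, and cycles among parents are not handled (this document does not say how Marvin treats them).

```python
# Illustrative model only; not Marvin's VM code.


class Proto:
    def __init__(self, name, slots=None, parents=()):
        self.name = name
        self.slots = dict(slots or {})
        self.parents = list(parents)


def visit_order(obj):
    """Yield objects depth first, parents in the order of the parent slots."""
    yield obj
    for parent in obj.parents:
        yield from visit_order(parent)


def lookup(obj, selector):
    """Answer the first match and stop; no check for ambiguity is made."""
    for candidate in visit_order(obj):
        if selector in candidate.slots:
            return candidate.name, candidate.slots[selector]
    return None


first = Proto("first", {"greeting": "hello from first"})
second = Proto("second", {"greeting": "hello from second"})
child = Proto("child", parents=[first, second])

before = lookup(child, "greeting")
child.parents.reverse()          # swap the order of the parent slots
after = lookup(child, "greeting")
```

Both parents define greeting. In this model the send is not reported as ambiguous: before comes from first, and after swapping the parents, after comes from second.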


What are the benefits of Marvin?

It brings the power of classless programming to Squeak in a form that 
can be very familiar to current Squeakers, and it can combine the main 
advantages of Squeak and Self in one compact system.

Squeak gets multiple inheritance, dynamic inheritance, namespaces, 
mixins etc. It can be very useful especially for UI projects like eToys.


Is there an outliner?

No, not yet. For now you can only evaluate code (directly or using the 
SmaCC GUI).


Is there a decompiler or debugger?

No, there isn’t.


Are the examples in this document runnable?

Yes, they are.


Can we see any more advanced example?

Yes, this example (http://www.comtalk.net/Squeak/95) shows how modules 
can be implemented in Marvin. The demonstration system has two separate 
parts: the application and the kernel. The application asks the kernel 
if it can load a module (a set of traits and globals) and the kernel 
loads a copy of the module into the application’s lobby. So the 
application has its own namespace and can modify its modules without 
affecting the kernel or other applications. It’s a kind of sandbox.


Where can I find sources?

Here: http://www.comtalk.net/Squeak/95


Is there any prepared virtual machine?

Yes, but only for Windows, sorry. My attempts to build VM for Linux 
failed (I have Gentoo on amd64).


What if I want to build a VM for Linux?

Use the standard build process with VMMaker. Marvin’s VM for Windows 
uses an older version of VMMaker than the Linux versions, so be sure all 
of Marvin’s modifications are compatible with your version. If you 
succeed, please publish the binaries.


Known problems?

I don’t know of any part of the current implementation which isn’t problematic :-)

Your help is needed!

-- Pavel Krivanek


