[ANN] Marvin: Self for Squeak
Pavel Krivanek
squeak1 at continentalbrno.cz
Sun Sep 4 18:39:18 UTC 2005
What’s Marvin?
Marvin is a Self dialect which combines characteristics of the Self
programming language and Smalltalk-80.
What is the status of this project?
It’s bleeding edge. It’s not yet usable for practical work, but there is
a functional implementation which can compile and execute code.
Why a new language?
Smalltalk isn’t suitable for prototype-based systems. It has no literals
for objects and you have to use the pseudo-variable "self" extremely
frequently. The Self programming language can hardly be integrated with
Squeak. In short, Marvin is Self with Squeak literals and conventions,
integrated with the Squeak environment.
Does Marvin have any special interpretation layer?
No, it is compiled directly to the native bytecodes of Squeak and
executed by the virtual machine.
Do I need any special version of the virtual machine?
Yes, you need a virtual machine with an enhanced sending mechanism.
What are the VM modifications?
Currently Marvin adds two new primitives for defining the prototype
class, and it modifies the sending and resending mechanism. If the
receiver of a message is an instance of the prototype class, the VM uses
another lookup algorithm based on delegation.
Is this special virtual machine slower?
Yes, of course. This test currently costs about 2% of speed, but after
optimization it may come down to a single additional comparison of two
integer values, so there will be no relevant slowdown.
What’s the prototype class?
Instances of this class (MarvinPrototype) are prototypes – objects with
slots. Prototypes have no instance variables.
How does delegation work?
When you send a message to a prototype, the lookup searches its slots.
If it finds a matching slot, it performs the appropriate operation: it
reads or writes the slot value, or calls the compiled method stored in
the slot. If no matching slot is found, the process continues with the
objects referenced by the parent slots.
What’s the physical structure of a prototype?
In fact, a prototype has indexable pointer variables (like an Array). It
has four sections of slots, and the sections are separated by nil. Every
slot takes 1-3 elements, depending on the slot type:
method slot:
- takes 2 elements
- the first element is a reference to the method selector (like #method)
- the second element is a reference to the compiled method
- if the selector of the sent message is the same object as the first
element, the compiled method is executed
writable data slot:
- takes 3 elements
- the first element is a reference to the read message selector (like
#variable)
- the second element is a reference to the write message selector (like
#variable:)
- the third element is a reference to the slot value
- if the selector of the sent message is the same as the first element,
the result of the message send is the third element
- if the selector of the sent message is the same as the second element,
the VM takes the argument from the stack and stores it into the third
element
read-only data slot:
- has the same structure as a writable data slot
- the second element refers to the same object as the first element (the
read message selector)
parent slot:
- can be read-only or writable
- has the same structure as data slots (3 elements)
The order of slots is:
- parent slots
- method slots
- data slots
- indexable slots
For example, the Self object
( |
parent* = lobby.
method = ( 3+4).
x <- 5.
y = nil.
| )
is stored as an array of references to these objects:
01: symbol #parent
02: symbol #parent
03: object lobby
04: nil (separator)
05: symbol #method
06: compiled method
07: nil (separator)
08: symbol #x
09: symbol #x:
10: number 5
11: symbol #y
12: symbol #y
13: nil (value)
14: nil (separator)
A prototype with no slots is an array with three nils.
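The layout described above can be modelled outside Squeak. The following
is a minimal Python sketch, not Marvin’s VM code. It assumes that a plain
list stands for the prototype’s indexable variables, that strings stand
for selector symbols and that None stands for nil. The names LOBBY,
METHOD, make_example and send are hypothetical. It searches only the
object’s own slots (no delegation to parents), and what a write returns
is not stated in the text, so the sketch simply returns the receiver.

```python
LOBBY = object()            # stand-in for the lobby object
METHOD = lambda: 3 + 4      # stand-in for the compiled method

def make_example():
    """The 14-element example array (0-based here, 1-based in the listing)."""
    return [
        "parent", "parent", LOBBY,   # read-only parent slot (3 elements)
        None,                        # separator
        "method", METHOD,            # method slot (2 elements)
        None,                        # separator
        "x", "x:", 5,                # writable data slot
        "y", "y", None,              # read-only data slot whose value is nil
        None,                        # separator
    ]

# Slot width per section, in the documented order: parents, methods, data.
SECTION_WIDTHS = (3, 2, 3)

def send(proto, selector, argument=None):
    """Look up selector in the prototype's own slots only."""
    i = 0
    for width in SECTION_WIDTHS:
        while proto[i] is not None:           # nil in selector position ends a section
            if width == 2:
                if proto[i] == selector:
                    return proto[i + 1]()     # run the stored method
            else:
                if proto[i] == selector:      # read selector matches
                    return proto[i + 2]
                if proto[i + 1] == selector:  # write selector matches
                    proto[i + 2] = argument
                    return proto              # assumption: answer the receiver
            i += width                        # stride over one whole slot
        i += 1                                # step over the separator
    raise LookupError(selector)
```

Because the scan advances by whole slots, the nil stored as the value of
y is never mistaken for a separator, which is one practical consequence
of every slot type having a fixed size.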
Why do read-write data and parent slots contain both selectors and not
only the slot name?
It’s a speed optimization. The virtual machine can simply compare
references to selectors, and it doesn’t have to concatenate strings and
compare them.
Why don’t read-only data and parent slots take only 2 elements (slot
name and value)?
This way we don’t have to introduce two more slot types, and every slot
type has a fixed size.
What are the indexable slots?
The delegation lookup stops at the third separator (nil). The rest of
the prototype can contain arbitrary references, so prototypes can be used
as collections. The only disadvantage is that we don’t know the index of
the first element, and we have to scan sequentially to find the position
of the last separator.
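As a rough illustration of that sequential scan, here is a small Python
sketch. It is an assumption-laden model rather than Marvin’s code: a list
stands for the prototype, None stands for nil, strings stand for
selectors, and first_indexable_index and indexable_part are hypothetical
names. The slot widths (3, 2, 3) follow the section order given earlier
(parent, method, data).

```python
def first_indexable_index(proto):
    """Return the 0-based index just past the third separator."""
    i = 0
    for width in (3, 2, 3):            # parent, method and data sections
        while proto[i] is not None:
            i += width                 # stride over one whole slot
        i += 1                         # step over the separator
    return i

def indexable_part(proto):
    """Return the collection elements stored after the slot sections."""
    return proto[first_indexable_index(proto):]

# One writable data slot x, followed by three indexable elements.
proto = [None, None, "x", "x:", 56, None, "a", None, "b"]
```

Here indexable_part(proto) yields the three trailing elements, including
the nil in the middle. Since the indexable part may itself contain nil,
scanning backwards from the end of the array would not find the last
separator reliably.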
Can we create prototypes without the Marvin compiler?
Yes, they can be built directly from an array.
p := MarvinPrototype withAll: #( nil nil #x #x: 56 nil).
p x --> 56
p x: 42.
p x --> 42
Or you can do it this way:
p := MarvinPrototype new.
p AddAssignSlot: #x value: 56.
Can prototypes refer to standard Smalltalk methods?
Yes, with one limitation. You may use something like:
lobby := MarvinPrototype new.
lobby AddMethodSlot: #slotNotFound: value: (MarvinPrototype class >>
#lobbyDNU:).
but if this compiled method contains a super send, you have to put a
reference to the prototype that owns the method as its last literal (in
Smalltalk this literal holds the method’s class).
Why do I have to specify the owner of the compiled method?
Because the virtual machine modifies super sends too. When the receiver
is an instance of the prototype class, the super send bytecodes are
interpreted as resends.
What’s a resend?
It’s a modified delegation send. The lookup doesn’t start from the
receiver’s slots but from the parents of the object in which the
executing method is defined.
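The difference between an ordinary delegation send and a resend can be
sketched in a few lines of Python. This is only a toy model under stated
assumptions: the Proto class and the functions lookup and
lookup_in_parents are hypothetical names, slots are kept in a dict
instead of Marvin’s flat array, and there is no guard against cyclic
parent links.

```python
class Proto:
    """Toy prototype: ordered parents plus named slots."""
    def __init__(self, parents=(), **slots):
        self.parents = list(parents)
        self.slots = dict(slots)

def lookup(obj, name):
    """Delegation send: own slots first, then the parents."""
    if name in obj.slots:
        return obj.slots[name]
    return lookup_in_parents(obj, name)

def lookup_in_parents(holder, name):
    """Resend: skip the holder's own slots and start at its parents."""
    for parent in holder.parents:
        try:
            return lookup(parent, name)
        except LookupError:
            continue
    raise LookupError(name)

parent = Proto(a=6)
child = Proto(parents=[parent], a=3, b=4)

plain = lookup(child, "a")               # 3, found in child's own slots
resent = lookup_in_parents(child, "a")   # 6, child's own a is skipped
total = resent + lookup(child, "b")      # 6 + 4 = 10
```

The last line mirrors an expression of the form "resend a + b" evaluated
in a method held by child: the resend finds a in the parent, while b is
found by a normal send.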
Does Marvin have temporary variables in methods and blocks?
No, it uses slots like Self, so if we have this Smalltalk method
sum
| a b |
a := 3.
b := 4.
^ a + b
Marvin’s equivalent method slot is:
sum = ( | a. b |
a: 3.
b: 4.
^ a + b )
However, local slots of methods and blocks are simulated by temporary
variables.
Self methods return the result of the last expression implicitly. Why is
there an explicit return here?
That’s because Marvin uses Squeak conventions for methods and blocks.
Methods return the receiver implicitly, and blocks return the result of
the last expression or nil (if they are empty).
Can you specify values for the local slots of blocks and methods in
their definition?
Yes, the previous example can be written as:
sum = ( | a = [3]. b = [4] |
^ a + b )
Why these ugly square brackets?
If you specify a slot value in Self, the assigned expression is
evaluated at compile time in the context of the lobby, so if you create
this object:
( | slot = self | )
it contains one slot, named “slot”, which holds the lobby object! Square
brackets separate the expression visually.
Square brackets also help to keep the grammar unambiguous.
In the current version a slot can contain only the result of a single
expression (like in Self), but in future versions it will be a whole
expression sequence (including slot definitions).
Ok, but is it readable if you write object literals?
No. Look at this example (delegation demonstration, the result is 10).
(|
parent* = [ (|
a = [6]
|) ].
a = [3].
b = [4].
sum = ( ^ resend a + b)
|) sum
Maybe we will add a syntactic shortcut for the three-character sequence
[ ( | .
Can I use a block in slot value?
Yes, you can. An example:
(|
sum = [[ |:a. :b | a + b ]].
test = ( ^ sum value: 3 value: 4 )
|) test
Note that this construction fails in Self because the block in the slot
expires after compilation.
How are methods with arguments defined?
Unlike Self, Marvin has only one way to define methods (Smalltalk’s way):
(|
a = [3].
sum: b = ( ^ a + b )
|) sum: 4
Can objects contain code?
No, only blocks and methods can contain code (like in Self).
Can methods contain methods?
No, nested methods are forbidden (like in Self).
Can methods and blocks contain parent slots?
No, unlike Self. Marvin uses standard Squeak blocks and it has no
activation objects (in Self, parent slots are added to activation
objects). But it’s not an important limitation.
Are there any special naming conventions for slot names (methods)?
Marvin is fully integrated with Squeak, so it has to use Squeak naming
conventions. It therefore uses ifTrue:ifFalse: and not ifTrue:False:
etc. The only convention of its own is that names of primitive methods
begin with a capital letter.
Does Marvin have literals for characters, arrays etc.?
Yes, Marvin uses all Squeak literals except scaled decimals and
expression arrays. Both will be added in future versions. Marvin creates
these literals as standard Smalltalk objects.
Does Marvin have its own class for blocks?
No, even blocks are standard Smalltalk objects, so you can use all their
capabilities, such as multitasking.
(|
parent* = [ lobby ].
test = (
| number = [50]. result |
[
result: number factorial.
inform: result asString
] forkAt: Processor userBackgroundPriority )
|) test
Does Marvin have object and slot annotations like Self?
No, it doesn’t. They would be problematic to implement, and they can
simply be replaced by a more general annotation protocol analogous to
class organizations. It’s not important to support them in the grammar.
Does Marvin have primitive methods?
If no matching slot is found during delegation, the virtual machine
tries to find the method in the prototype class (as for any other
Smalltalk object). That’s why prototypes fit naturally into Squeak: you
can print them, explore them etc.
Marvin doesn’t need classical primitive methods like Smalltalk because
it can use the standard Squeak infrastructure.
Can I use Squeak classes in Marvin programs?
Yes, if you wish. If your object has the lobby as its parent, it can
access the global objects in the Smalltalk system dictionary.
Can I use standard Smalltalk objects and classes as parents of prototypes?
It’s a very problematic operation. In theory we may delegate to
Smalltalk classes, but only if they have no instance variables. The
current implementation only tries to resend messages to standard objects
referred to by parent slots, and even this operation doesn’t work well.
It’s maybe the most limiting aspect of Marvin’s design.
Are comments in Marvin the same as comments in Smalltalk and Self?
Marvin uses standard comments. In addition, its lexical analyzer
supports line comments (something like // in C++). They start with two
consecutive double quotation marks.
(|
a = [3]. "" line comment
b <- [4]. "block comment"
|)
Can I use Unicode?
Yes, when you use Squeak 3.8 you can write non-ASCII characters in
string literals, comments etc. You cannot use Unicode in identifiers
(unlike Squeak).
Does Marvin do a full tree search during delegation?
No, it doesn’t. Unlike Self, Marvin doesn’t check for ambiguous calls
and uses only a simple depth-first search, so the result depends on the
order of the parent slots. However, thanks to this property Marvin is
more flexible in redefining namespaces.
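To make the effect of parent order concrete, here is a short Python
sketch of a first-match depth-first search. It is an illustration under
assumptions, not Marvin’s lookup code: objects are modelled as dicts with
"slots" and "parents" keys, and the names dfs_lookup, node, grand, first,
second and child are hypothetical.

```python
def dfs_lookup(obj, name):
    """Return the first matching slot value, searching depth first."""
    if name in obj["slots"]:
        return obj["slots"][name]
    for parent in obj["parents"]:      # the order of parent slots decides
        try:
            return dfs_lookup(parent, name)
        except LookupError:
            pass
    raise LookupError(name)

def node(slots, parents=()):
    """Build a toy object with named slots and ordered parents."""
    return {"slots": dict(slots), "parents": list(parents)}

grand = node({"greeting": "from grandparent"})
first = node({}, [grand])              # inherits greeting from grand
second = node({"greeting": "from second parent"})

child = node({}, [first, second])
swapped = node({}, [second, first])
```

With these objects, dfs_lookup(child, "greeting") finds the
grandparent’s slot, because the whole subtree of the first parent is
searched before the second parent is considered. The same lookup on
swapped finds the slot of second instead. No ambiguity is reported in
either case, which is the behaviour the answer above contrasts with
Self.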
What are the benefits of Marvin?
It brings the power of classless programming to Squeak in a form which
can be very familiar to current Squeakers, and it can combine the main
advantages of Squeak and Self in one compact system.
Squeak gets multiple inheritance, dynamic inheritance, namespaces,
mixins etc. It can be very useful, especially for UI projects like eToys.
Is there any outliner?
No, not yet. For now you can only evaluate code (directly or using the
SmaCC GUI).
Is there any decompiler or debugger?
No, there isn’t.
Are the examples in this document runnable?
Yes, they are.
Can we see a more advanced example?
Yes, this example (http://www.comtalk.net/Squeak/95) shows how modules
can be implemented in Marvin. The demonstration system has two separate
parts: an application and a kernel. The application asks the kernel
whether it can load a module (a set of traits and globals), and the
kernel loads a copy of the module into the application’s lobby. So the
application has its own namespace and can modify its modules without
affecting the kernel or other applications. It’s a kind of sandbox.
Where can I find sources?
Here: http://www.comtalk.net/Squeak/95
Is there any prepared virtual machine?
Yes, but only for Windows, sorry. My attempts to build a VM for Linux
failed (I have Gentoo on amd64).
What if I want to build a VM for Linux?
Use the standard build process with VMMaker. Marvin’s VM for Windows
uses an older version of VMMaker than the Linux versions, so make sure
all of Marvin’s modifications are compatible with your version. If you
succeed, please publish the binaries.
Known problems?
I don’t know of any part of the current implementation which isn’t
problematic :-)
Your help is needed!
-- Pavel Krivanek