Neo (was: Traits approaching mainstream Squeak)

Sun Sep 4 01:40:10 UTC 2005

Jecel Assumpcao Jr wrote on Tue, 30 Aug 2005 22:53:14 -0300
> Daniel Vainsencher wrote on Sun, 28 Aug 2005 14:01:59 -0700
> > Andreas Raab wrote:
> > > I see. It would be interesting for me to see an actual design from first 
> > > principles that ignores the restrictions we have right now (like VM 
> > > dependencies) and just tries to build a comprehensive set of 
> > > abstractions.
> > Jecel, maybe you can describe the relevant parts of NeoSmalltalk?
> 
> I will be happy to, though it would have to be a little long to include
> at least some background for the ideas. This will probably have to wait
> until Friday. Another email I sent to the list several hours ago seems
> to be missing - I hope this one makes it through.

One characteristic of the project that might be obvious from the above
is a problem with deadlines :-(

Neo Smalltalk (previously called Self/R and Merlin OS before that) is
the system level software of a low cost computer for students. Not only
should this machine be useful as a tool for learning about math,
science, history, languages and so on but it should also be an
interesting object to study in itself. A very high priority goal is that
the path from casual user to script writer (eToys level programming) to
application writer (Smalltalk-80 level programming) to system developer
to hardware hacker should be as smooth as possible.

Note that I expect only a small fraction of a group at one level to want
to move on to the next level. It is just fine with me if only 200 out of
a million users become hardware hackers. But the system must be designed
in such a way that it doesn't become an obstacle to anyone who is
interested. So the same ideas should be used in all levels (no switching
from Smalltalk-80 to C in order to explore further) and all learning
styles should be supported. In order not to favor abstract thinking only
(math types) the ideas are presented in as concrete and visual a way as
possible. This makes describing them just with words in this email a
little more complicated. My key design criterion is "can I explain this
to a bright twelve year old?" and if the answer is "no" then I look for
other alternatives.

One source of confusion in my explanations is that there are two
versions of Neo Smalltalk - the one for the student computer (36 bit
version) and another for embedded applications (16 bit version). The
main difference is that the first is an open system where the image is
split into pieces and only a fraction of those are loaded into the
machine at any one time. You might think of it as a "world wide image".
Traditional Smalltalk notions like #allInstances don't work in such an
environment. In fact, there are multiple viewpoints, which means that
even an object that is present in RAM might be only partially loaded:
looking at that EllipseMorph might let you see everything there is in
the current viewpoint but there might be other parts somewhere out there
that you are not aware of. This system organization will make it simple
to implement several kinds of applications that I am interested in and
which are hard in Squeak (for example) and hopefully will get the
students used to the scientific outlook of the world (partial models).

The 16 bit version only has a single viewpoint and its image is entirely
stored in the local Flash memory. But the bytecodes are the same, much
of the GUI and so on. Some design decisions don't make much sense for
this version and only reflect the goal to make the two versions as
similar as possible. I will only mention the common features below.

In Neo an object includes a sequence of "facets" where each facet is
simply a set of methods. Facets don't exist outside of objects and when
an object is copied it shares all its facets with the new object. If the
new object is edited by changing one of the facets this will affect the
original and any other objects that also share that particular facet. So
a very common style of development is to add a new empty facet to the
object and make the changes there (which can't, by definition, affect
anything else in the system). Normally the new facet will be placed at
the head of the sequence so its methods can override methods with the
same name in any older facets but the programmer is free to arrange and
rearrange the order of the facets.
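
The facet mechanics described above can be roughly sketched in Python. This
is only an illustration under my own assumptions, not how Neo is implemented:
the names Facet, NeoObject, add_facet and send are invented for the sketch,
and a dictionary of functions stands in for a facet's set of methods.

```python
# Hypothetical model of Neo facets; every name here is invented for illustration.

class Facet:
    """A set of methods keyed by selector. Facets have no name, only a comment."""

    def __init__(self, comment=""):
        self.comment = comment
        self.methods = {}


class NeoObject:
    """An object is an ordered sequence of facets, searched from the head."""

    def __init__(self, facets=None):
        self.facets = list(facets or [])

    def copy(self):
        # The copy gets its own sequence, but the Facet instances are shared.
        return NeoObject(self.facets)

    def add_facet(self, comment=""):
        # A new empty facet goes at the head so it can override older facets.
        facet = Facet(comment)
        self.facets.insert(0, facet)
        return facet

    def send(self, selector, *args):
        # The first facet in the sequence that defines the selector wins.
        for facet in self.facets:
            method = facet.methods.get(selector)
            if method is not None:
                return method(self, *args)
        raise AttributeError(f"message not understood: {selector}")


original = NeoObject()
base = original.add_facet("basic behavior")
base.methods["describe"] = lambda self: "a plain object"

clone = original.copy()

# Editing the shared facet is visible through both objects.
base.methods["greet"] = lambda self: "hello"

# Changes made in a fresh facet on the clone cannot affect the original.
private = clone.add_facet("clone-only changes")
private.methods["describe"] = lambda self: "a customized object"
```

Here clone.send("describe") finds the override in the new head facet, while
original.send("describe") still reaches the shared facet. Rearranging the
order of the facets would amount to reordering the facets list.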

State is implied by the presence of setter methods. These methods are
part of some facet but the state itself belongs to the object. While I
normally try to show visually a close representation of reality, in this
case I felt convenience was more important and show something like "_ x
<- 3" as both defining a method named "x <-" (blanks are allowed in
selectors) in some facet and as showing the currently associated value
in the particular object we are looking at. Viewing that same method
definition inside some other object might show "_ x <- 7". Given that
facets are only shown inside of objects a single one might appear
several times on the screen and it would look identical except for the
values shown as arguments in the setter methods. This representation is
also used in the Self debugger and I think it works very well. The idea
is to focus on behavior instead of state.
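
To make the split between shared setter methods and per-object state a bit
more concrete, here is a small Python sketch. It is a simplification under
my own assumptions: Facet, NeoObject, set_value and show are hypothetical
names, and a facet is reduced to just its list of setter selectors.

```python
# Hypothetical sketch: setters live in a shared facet, values live in each object.

class Facet:
    def __init__(self, setters):
        # Setter selectors such as "x <-" (blanks are allowed in selectors).
        self.setters = list(setters)


class NeoObject:
    def __init__(self, facets):
        self.facets = list(facets)
        self.state = {}  # belongs to this object, never to a facet

    def set_value(self, selector, value):
        # State exists only where some facet of the object defines a setter.
        if not any(selector in facet.setters for facet in self.facets):
            raise AttributeError(f"message not understood: {selector}")
        self.state[selector] = value

    def show(self, facet):
        # Render each setter together with the value held by *this* object.
        return [f"_ {sel} {self.state.get(sel)}" for sel in facet.setters]


point_facet = Facet(["x <-"])
a = NeoObject([point_facet])
b = NeoObject([point_facet])

a.set_value("x <-", 3)
b.set_value("x <-", 7)

print(a.show(point_facet))
print(b.show(point_facet))
```

The same facet is displayed once per object, and the two displays differ
only in the argument shown after the setter.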

As described so far the system doesn't have classes but the programming
environment should have tools to show collections of related objects.
There aren't any globals either but objects can be inserted directly
into a method's source (there is no literal syntax so this is needed
even for something as simple as "3 + 4") and the programming environment
includes palettes of interesting objects which can be used this way.

Facets don't have names but it is a good idea to add comments to them.
An object's organization as facets will depend mostly on its development
history. With a style where new facets are introduced only when changing
any of the existing facets would break some other object, the tendency
will be for facets to correspond roughly to something between classes
and traits in Squeak. Often exploratory programming will result in an
inconvenient division into facets for one or more objects and they will
need to be refactored. Sadly this system in itself makes this very
awkward but the programming environment can make up for this with some
set of tools.

The main idea is to present each object as a self contained entity. It
does refer to other objects through its state and also all the literals
in its methods but this list is just the absolute minimum it needs to
know to get its job done. This is similar to the "principle of least
authority" in security. Everything there is to know about the object
(from a given viewpoint) should be in one place. Eliminating inheritance
and globals really helps with this.

Facets are manually placed in a simple sequence so they lack the
sophisticated meta-information and composition of Traits. Other than
that they are rather similar and so Neo gives a hint of what a
Traits-only Squeak might be like.

Compared with Self, Neo doesn't have a problem with incomplete trait
objects (which implement methods that only their children can execute,
not the trait objects themselves), lack of inheritance of structure
(thanks to state being implied by setter methods that are "inherited")
or program structure leaking into the base level (parent slots). On the
other hand it is missing dynamic inheritance (which hasn't been used in
practice so far), shared state (data slots in parent objects) and a
universal scoping story (in Self, block context objects inherit from
their lexically scoped context objects, which inherit from the receiver,
which inherits from its trait, which inherits from its trait, which
inherits from the lobby, which inherits from the globals).

By the way, as far as I know the word "traits" was introduced by Adele
Goldberg in the boxes example when teaching Smalltalk-72 to children at
Xerox PARC. Smalltalk did not yet have inheritance back then but she
wanted a name to describe on paper the common set of features of a group
of objects as a separate idea from the individual features of each
object. So traits on paper were an informal introduction to the formal
idea of classes. When Self was created a name had to be selected to
describe parent objects that functioned as classes and "trait" was used
to avoid the confusion that "class" would have caused. Other parent objects
were called "mixins" because that term had been well known since the Flavors
object system for Lisp. Strongtalk built its classes from smaller
behavior-only pieces which they also called "mixins" as they also fit
the traditional definition. For Squeak the behavior-only pieces gained a
lot of new functionality so the term "mixin" would have given the wrong
idea. "Trait" was selected though there was no relation with how that
name was used in Self, but given how few people know that language there
was far less total confusion this way. The term "facet" is used by the
free Electric integrated circuit CAD software and I don't think the
author will mind lending it to me ;-)

If anyone needs better explanations or more information then please
write to me directly for this is very off topic for squeak-dev.

-- Jecel