Design Principles Behind Smalltalk, Revisited

Mon Dec 25 22:10:47 UTC 2006

When I was looking at GST vs. Ruby benchmarks today,
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=gst&lang2=ruby
I came across a link at the bottom to the original "Design Principles 
Behind Smalltalk" paper by Dan Ingalls, see:
http://users.ipa.net/~dwighth/smalltalk/byte_augc81/design_principles_behind_smalltalk.html

This essay attempts to look at Dan's 1981 essay and move beyond it, 
especially by considering supporting creativity by a group instead of 
creativity by an isolated individual, and also by calling into question 
"objects" as a sole major metaphor for a system supporting creativity. 
Some of this thinking about "objects" is informed by the late William 
Kent's work, especially Kent's book "Data & Reality":
   http://www.bkent.net/
   http://www.bkent.net/Doc/darxrp.htm
Presumably the original paper reflects not just Dan's work and thinking, 
but that of Alan Kay and the larger Learning Research Group at Xerox PARC 
at the time, but I will refer to it as Dan's writing, because his is the 
only name on it.

Mainly I will consider the first half of the paper. This essay is perhaps 
a little in the spirit of 'The Rise of "Worse is Better"',
   http://www.ai.mit.edu/docs/articles/good-news/subsection3.2.1.html
and is intended to help in understanding why, say, Python has been so 
successful in capturing the hearts and minds of the last decade of 
development, running core systems from Google to NASA, whereas Squeak 
Smalltalk has remained a niche project during that time. And this is true 
even though we all know that Squeak is better than Python in oh so many 
ways (with a more expandable and more self-documenting syntax using 
key:words: instead of functions, better transparency from top to bottom of 
the system, better core graphics engine, better community in terms of very 
bright people capable of handling a high level of abstraction, better core 
tools, more consistent language model, better streams and number classes, 
more portable VM, better dynamic development where you can code in the 
debugger and restart a method instead of an application, and so on). 
Still, for all those Squeak advantages, I think, the same applies for 
Squeak Smalltalk as Richard Gabriel of 'The Rise of "Worse is Better"' 
says of Lisp, "... one can conclude only that the Lisp community needs to 
seriously rethink its position on Lisp design."

That "Worse is Better" paper probably has had little effect on changing 
Lisp the language, and I doubt this note will have much effect on Squeak 
the system. :-) Ultimately languages (and the mailing lists that support 
them) are somewhat self-selecting -- if you have major problems with the 
language or paradigm, you probably are not using Squeak or on the Squeak 
development list. Still, I found it of value to me to write up these 
issues, in terms of thinking of the next generation of tools and users, 
and I hope some Squeakers out there find it of value to read.

First off, I agree with Dan's stated goal of a quarter century back: "The 
purpose of the Smalltalk project is to provide computer support for the 
creative spirit in everyone." So, overall, there is no difference in goal 
when broadly construed.

This essay will outline some points of disagreement with how to best 
support that goal. Some of this disagreement comes from approaching the 
notion of how to support creativity from a different perspective, especially 
given how the computing landscape has changed due to the very success of 
the object-oriented and networked GUI paradigms which Smalltalk (and the 
Alto it was developed on) pioneered. As Steve Jobs said:
   http://americanhistory.si.edu/collections/comphist/sj1.html
"SJ: ... I saw their early computer called the Alto which was a phenomenal 
computer and they actually showed me three things there that they had 
working in 1976. I saw them in 1979. Things that took really until a few 
years ago for us to fully recreate, for the industry to fully recreate in 
this case with NeXTStep. However, I didn't see all three of those things. 
I only saw the first one which was so incredible to me that it saturated 
me. It blinded me to see the other two. It took me years to recreate them 
and rediscover them and incorporate them back into the model but they were 
very far ahead in their thinking. They didn't have it totally right, but 
they had the germ of the idea of all three things. And the three things 
were graphical user interfaces, object oriented computing and networking."

=== creativity by an individual versus creativity by a group ===

Dan wrote in the paper: "If a system is to serve the creative spirit, it 
must be entirely comprehensible to a single individual."

I strongly disagree with this, as much as I still agree with his later 
statement of: "Any barrier that exists between the user and some part of 
the system will eventually be a barrier to creative expression."

The disagreement comes from considering the idea of creativity of a 
community involving building on the work of others (where such work you 
use may be beyond your ability to either fully comprehend or modify). Or, 
is what is best for the individual in conflict with what is best for the 
group? And if so, which should win? While ideally, any system should have 
no barriers to the individual user, in practice, it may, and yet may still 
support a large amount of creativity by the group, sometimes directly 
because of those barriers. Another way to think of a barrier is an 
"interface" or a "firewall", and those have some positive connotations. 
Even Smalltalk's very success is due in part to creating a strong barrier 
at run-time between the user image of objects and the VM which supports 
it; this is a barrier which people have complained about, with the phrase 
"if you can't crash it, you're not doing the driving." :-)

Consider, for example, the success of Python, which is a mostly 
object-oriented language core written in C, where lots of other libraries 
have been bolted on to it by the community. It is this widespread 
availability of useful libraries which drives Python adoption more than 
any other thing. (Same with Perl adoption.) Python has other ideas which 
prove popular, like its use of significant white space like Occam used, 
having dictionaries built in and easy to use, and looking a lot like C, 
but it is the libraries and the related modularity which are among the 
biggest wins for Python. Well, and that you can write a program in a few 
lines in a text editor that use those great libraries, making it easy to 
build small things with little learning. I think this sentiment of 
focusing on empowering the individual primarily, indirectly at the expense 
of empowering the community, was why Squeak Smalltalk has suffered from 
poor modularity so often and why, for example, it took so long to get 
namespaces and such into the mainstream, and also why it has struggled to 
have as many libraries as Python offers.

So while I think Dan's original goal is a nice ideal, in practice it is 
not needed in the extreme, since the creativity of a group sharing, say, a 
community mailing list, will still move beyond the creativity of any 
individual in the group. So, it is *more* important in the internet age to 
have techniques for supporting group creativity, including modularity, 
than it is for the system to not have any barriers. That is probably one 
reason a system like Python is actually used in practice by more people to 
do creative things than Smalltalk. Python makes it easy for many 
individuals to do small things. While Python as a language and as an 
environment in practice does not scale as well as Smalltalk, the aggregate 
amount of all those individuals doing small things overwhelms what any 
one Smalltalker can do (or even a small group of them stepping on each 
other's toes and watching their contributions suffer from "bit rot").

Another issue here is that Dan was writing 25 years ago in the context of 
a *proprietary* system. So, availability of all the code within that 
system to the user was essential for creativity, even if the code was 
controlled by someone else, since otherwise the individual had no access 
to the code ever. But, when working with open systems based on free 
software, the code and tools to work with it are accessible to anyone, 
even if they are in other languages or supported by other communities 
(say, the GCC community) than the community one is currently working in 
(say, GNU Smalltalk).

Again, to contrast with Python, Squeak wants to run the show, but Python 
plays nice with all the other free tools of the GNU/Linux ecosystem. When 
you use Python, your environment is not just Python; it is really more 
like GNU/Linux. So, free (as in "freedom") software -- with accompanying 
free licenses like the GPL that work as de-facto constitutions for 
collaborative communities -- has shifted the landscape, and the 
development ideals may need to shift with it. This is another reason why 
Python, which has always been free and had a core community with those 
values, has been able to succeed so quickly over the past decade, whereas 
Smalltalk, which was originally proprietary, has struggled, even though a 
free-ish Smalltalk like Squeak is in many ways much more amenable to 
modification than Python.

=== technical versus conceptual barriers ===

Another issue here is that Dan is talking about "technical" barriers, 
which are not the only, or even the biggest, barriers to creativity. There 
often exist "conceptual" barriers. It is in the "failure of the 
imagination" that we face our biggest hurdles. Imagination is indeed "the 
ultimate resource".

For example, the code to generate the VM in Squeak needs a certain mindset 
to understand; one has to think about the domain of bit manipulations. 
Even though the syntax looks like Smalltalk, the domain is very different 
from your run-of-the-mill GUI application or eToy. While it is indeed an 
innovation to use Smalltalk syntax in an uncommon domain, and indeed 
Smalltalk's syntax is a marvel which can make unnecessary many "domain 
specific languages", one can not get around the conceptual barrier of a 
programmer understanding a new domain, as much as a familiar syntax might 
help with the task.

Thus, barriers will always exist in any programming system, since all 
interesting programs probably address new domains (or old domains in new 
ways). So, again, while the goal Dan defined twenty five years ago is 
"ideal", it is an ideal that can never be reached because of conceptual 
barriers encountered when working in multiple domains, even in a pure 
Smalltalk system which minimizes arbitrary technical barriers like 
differing syntax. Forcing everyone to work in Smalltalk using Smalltalk 
tools, as good as they are, means that other innovations developed in 
other languages with other tools, for example, Java, are lost to the 
Squeak community. Yes, in theory, anything is possible, especially with 
Squeak's interface to loadable modules; I am speaking more of tone and 
emphasis and culture here.

And, if it is so often the conceptual barrier that is the ultimate hurdle, 
then are technical barriers of different syntax and different tools really 
so big? Many humans become fluent in multiple human languages and their 
accompanying cultures, which is typically a harder thing than learning new 
computer languages. If one needs to switch mental gears conceptually to 
work on the VM, then is it *really* so bad if the VM is written directly 
in C, as GNU Smalltalk's is? Now, I know a lot of positive arguments can 
be made for the utility and convenience of the Squeak VM written in 
Smalltalk (especially given C's quirks as a not-quite-cross-platform 
language), but in the end, my point is, keeping everything in one syntax 
may not really save that much time for the community, all things 
considered. Even when the syntax is the same, the underlying domain 
semantics may be very different, and those semantics, or meaning of all 
the objects, are what take the time to learn. To build a new VM, one still 
needs to spend a long time understanding what a VM is and how it could 
work, and no choice of familiar tools or use of one single syntax will 
make that dramatically easier (a little easier, yes). A better choice of 
abstraction perhaps might make maintaining a VM easier for those who get 
the abstraction, but not a choice of language by itself, all other things 
being equal. Were the Squeak VM coded in some other portable language (like Java 
or Free Pascal or OCaml) then it might not take very much more trouble to 
maintain -- and such a VM might even be easier to develop, as one could 
use the complete tool set of that other system to debug the VM in the 
language it was maintained in, rather than facing a technical barrier :-) 
of seeing C code for the VM in the debugger instead of the original 
Smalltalk source which it was translated from. Granted, if the Squeak VM 
were coded in, say, OCaml, a VM maintainer would face the barrier of 
learning that language and its paradigms, but I would argue that the 
barrier would remain more conceptual than technical, and the syntax 
problem would be the lesser issue.

Right now, I think Squeak on the JVM, which Talks2 is a step towards, could 
be a really big win for the Squeak community, and translating the VM from 
an abstract representation (in Smalltalk) to a specific language is a big 
win there. Still, the VM could have been in any translatable abstraction 
(XML, Lisp, ANTLR parse tree from a VM-specific language, Parrot, etc.) 
and generating Java would still be easily doable (though of course 
Smalltalk encoding is preferable for Smalltalkers). Also, even if this is 
a big win, it is not clear that it is a big enough win to justify making 
VMs harder to debug and maintain by having an intermediate 
translation step, compared to just working in a language that is more 
cross-platform by design than C, where somebody else does the hard work of 
maintaining that other platform. Again, here is the issue of community 
support versus individual support, and related assumptions.

=== Language versus "group computation" ===

I really like the "Figure 1" diagram in the original paper. And it remains 
a useful illumination of the problem area. Still, the bold statement 
"Purpose of Language: To provide a framework for communication" may not be 
the entire picture. What if one drops the big idea of "Language" entirely 
and focuses instead on "computation"? Consider if two or more people are 
not so much engaging in "language" as they talk, but instead are engaging 
in a "group computation", where utterances between group members play a 
facilitating role. So, from this point of view, the major goal has to be 
allowing the group to compute effectively, whatever that takes. Language 
is one aspect of this. But then, so are licenses. So are communications 
channels. So are formal and informal community processes. And so on to all 
of sociology and politics. In a way, by elevating "language" as a 
paradigm, many of these other aspects could be missed. Squeak the 
community has certainly suffered on some of these issues at various times 
(though recent efforts, especially on relicensing, even if it just had PR 
value, are a hopeful sign).

Also, consider, the latest thinking on cognitive psychology and AI 
includes that the human brain simultaneously thinks about problems in 
multiple representations, and chooses second-by-second the representation 
that pays off the most in making progress towards goals. So for example, 
we may look out the window at a rainy day with a goal of going to the 
store by car, and we may simultaneously imagine becoming wet as a sort of 
3D world simulation of rain falling on our heads as we venture to the car 
in our imagination, engage in formal logic based on linguistic experience 
("if rain, then take umbrella"), use neural net pattern matching to get 
the most common behavior in the gestalt of the situation (it just feels 
right to reach for the umbrella), plus we may be mentally making a 2D map 
of a route and areas with obstacles (rain between self and car) and 
considering ways to make progress through the 2D representation of a 
rain obstacle. So here we might have four 
different representations we might be using simultaneously, which each 
have parsed the world differently, perhaps into "objects" of various sorts 
or perhaps not. One or more of them may prove most useful and drive our 
behavior for that moment.

Language is only playing a direct role in one of those representations in 
this case, the formal symbolic-logical process. Language may play a role 
in the other representations as well, as we internally reflect with 
language (generating internal questions like "why do I feel like reaching 
for the umbrella?" Or, "How can I overcome the rain barrier?" etc.). 
However the other representational schemes may also be applied to the 
formal linguistic-symbolic representation or to each other.

In short, we now know that viewing the mind as solely about "language" is 
an overly simplistic way of thinking about it. And if the paradigm has 
grown, then so too should our computer support systems, in order to honor 
the insight in the original paper of: "The mechanisms of human thought and 
communication have been engineered for millions of years, and we should 
respect them as being of sound design. Moreover, since we must work with 
this design for the next million years, it will save time if we make our 
computer models compatible with the mind, rather than the other way around."

One of those forces shaping the mechanisms of human thought has been how 
it is the *group* which survives in the wilderness; the lone *individual* 
is rapidly picked off by accidents (say, a broken leg) or runs into 
trouble (say, a pack of coyotes) beyond his or her individual ability to 
cope. (That's one reason it's foolish to think you can survive an 
apocalyptic disaster long term by running away to the wilderness on your 
own.) When a village defends itself against a large pack of coyotes, even 
with verbal shouts and grunts to the coyotes or between villagers, what is 
going on is in some ways primarily a coyote defense "computation" 
involving all the villagers and all their thinking (which may be operating 
lots of simultaneous decision making models), not just a "discussion" 
among villagers about coyotes, as useful as language may be in helping 
that larger group computation to come to a successful conclusion. So in a 
tool to enhance group creativity, we must consider all the ways to enhance 
these creative group computations, and those go beyond just supporting a 
common language.

=== objects are illusions, but useful ones ===

In my undergraduate work in psychology I wrote a senior paper in 1985 
entitled: "Why intelligence: Object, Evolution, Stability, and Model" 
where I argued the impression of a world of well-defined objects is an 
illusion, but a useful one. Considered in the context of the section 
above, we can also see that how you parse the world into objects may 
depend on the particular goal you have (reaching your car without being 
wet) or the particular approach you are taking to reaching the goal 
(either the strategy, walking outside, or any helping tool used, like a 
neural net or 2D map). Yet, the world is the same, even as what we 
consider to be an "object" may vary from time to time; in one situation 
"rain" might be an object, in another a "rain drop" might be an object, in 
another the weather might be of little interest. So objects are a 
*convenience* to reaching goals (in terms of internal states), not reality 
(which our best physics says is more continuous than anything else in 
terms of quantum probabilities, or at best, more conventionally a 
particle-wave duality). So objects, as tools of thought, then have no 
meaning apart from the context in which we create them -- and the contexts 
include our viewpoints, our goals, our tools, our history, our relations to 
the community, and so on.

Consider Dan's statement of "A computer language should support the 
concept of "object" and provide a uniform means for referring to the 
objects in its universe." That appears to me to have made a classical 
mistake of thinking the universe has only one parsing into one object 
hierarchy and that the objects exist in some sort of Platonic ideal. See 
Plato's "Allegory of the Cave" for the best example of the mistaken notion 
of only one true parsing, even though as social commentary it may 
still be accurate. :-)
   http://faculty.washington.edu/smcohen/320/cave.htm
   http://www.ship.edu/~cgboeree/platoscave.html
As discussed above, the world does not have just one unique parsing into 
objects. Or, to bend Plato's allegory, that we sometimes find apparently 
discrete "shadows" useful to perceive and think about as "objects" does 
not mean there really are discrete ideal things out there casting those 
shadows, with any sort of one-to-one correspondence. Again, this is not to 
say objects are not useful, just that they are a tool.

To use an example from the paper, when Dan wrote: "Every time you want to 
talk about "that chair over there", you must repeat the entire processes 
of distinguishing that chair. This is where the act of reference comes in: 
we can associate a unique identifier with an object, and, from that time 
on, only the mention of that identifier is necessary to refer to the 
original object." That sounds really nice on the surface. But consider, 
what if the "chair" is glued to the floor? You think it is an "object" but 
there is no clear real boundary between it and the floor. And when you 
attempt to move it, what if the floor boards come up with it? You now have 
an entity you are manipulating which is not quite a "chair" and not quite 
a "floor" -- what is it? There is no neat "class" to put it in. Clearly 
you, the reader, can think about this entity so the human brain supports 
this fluidity in changing our definitions of objects and not requiring a 
one-to-one mapping to ideal classes to think about them, but a computer 
language like Smalltalk would have many problems representing this. 
William Kent, in the book _Data & Reality_ discusses these sorts of 
problems at length.

Sure, you could make a new object for the combined entity, but what if 
then you decided to take apart the chair into cushions, legs (with floor 
boards still glued on), a back, and lots of bolts? Now you have lots of 
new objects? Sure, but then how do you refer to all the objects and 
all their relations in all possible permutations consistently? Your mind 
can do it easily; a Smalltalk class hierarchy and related application 
would struggle to do it, at best.
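
As a rough illustration of the chair-and-floor problem above, here is a 
minimal Python sketch (Python only because it is the comparison language 
used throughout this essay). It keeps one fixed set of parts and several 
alternative groupings of those parts into "objects". All the names here 
(PARTS, VIEWS, objects_containing, is_partition) are hypothetical and made 
up for the example; this is one possible way to model the idea, not a claim 
about how any Smalltalk or Python system actually does it.

```python
# Hypothetical sketch: every name below is invented for illustration.

# The underlying "stuff" is the same in every view.
PARTS = frozenset(
    ["seat", "back", "leg1", "leg2", "board1", "board2", "board3"]
)

# Each view is one way of grouping the same parts into named "objects".
VIEWS = {
    "before_moving": {
        "chair": {"seat", "back", "leg1", "leg2"},
        "floor": {"board1", "board2", "board3"},
    },
    "after_pulling": {
        "chair_with_boards": {"seat", "back", "leg1", "leg2", "board1"},
        "floor": {"board2", "board3"},
    },
    "disassembled": {
        "seat": {"seat"},
        "back": {"back"},
        "leg_with_board": {"leg1", "board1"},
        "leg": {"leg2"},
        "floor": {"board2", "board3"},
    },
}


def objects_containing(part_name):
    """Return, for each view, the name of the 'object' holding this part."""
    result = {}
    for view_name, grouping in VIEWS.items():
        for object_name, members in grouping.items():
            if part_name in members:
                result[view_name] = object_name
    return result


def is_partition(grouping):
    """Check that a view accounts for every part exactly once."""
    names = [name for members in grouping.values() for name in members]
    return sorted(names) == sorted(PARTS)
```

Asking objects_containing("board1") gives a different answer in each view: 
the same floor board is part of the "floor", part of a 
"chair_with_boards", or part of a "leg_with_board", depending on the 
context. The point of the sketch is only that the "objects" live in the 
views, not in the parts.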

Sure there are design patterns for some of these things (like "Facade") 
but they are not completely reflected in a system which has an overly 
static notion of "object". Smalltalk has some ways to deal with these 
things, like "become:", but that is not dealing with this problem in all 
its generality. You can simultaneously think about an original chair, a 
chair with floor boards stuck on it, and a chair taken apart -- so your 
mind is capable of much more imaginative representational power than a 
simple notion of "objects".

Again, discrete objects are a useful tool to think with, but they are not 
the only tool, and they are not as stable a tool as one might think at 
first glance. Objects are useful within contexts. Yet, Smalltalk lacks a 
formal notion of an object having a context (or imaginative world) which 
defines its meaning. When people (including Alan Kay) talk about Smalltalk 
they often say to the effect that objects are self-contained. But clearly 
they are not. Their meaning emerges out of their interactions with a world 
of other objects. Yet modern Smalltalks have not formalized the notion of a 
world of objects beyond a very coarse-grained kitchen-sink "image". It 
would seem one needs finer-grained contexts, be they "worlds", "modules", 
or some other thing.

=== talking to an object vs. manipulating it ===

Consider this statement from the paper: "Computing should be viewed as an 
intrinsic capability of objects that can be uniformly invoked by sending 
messages." It sounds uniform (from an implementation point of view), yet 
it violates the human notion of how most of our time is spent actually 
interacting with "objects". When we use language, we are generally talking 
to ourselves or other people; most items in the world don't respond directly 
to language. We use our hands or feet or whatever to manipulate them -- to 
move them or change their internal configuration. Not every object in the 
real world has a name or knows what the best inspector is for it; in fact, 
very few of them do, except perhaps people. When we pick up a rock, we try 
different tools on it if we want to observe it. We classify it ourselves; 
we don't ask it its class. We may stick a label on it and put it in a 
museum, but that is an active effort of categorization. And we may later 
change our minds about how to label it. Or we may break it into two parts 
(plus some rock fragments), and grind one of the parts up into rock dust 
and put some of the dust through various chemical processes over a period 
of years (say, if it was originally a "moon rock"). A moon rock does not 
know how to perform chemical analysis on itself or even split itself in 
two, yet Smalltalk philosophy encourages making models of reality as if a 
moon rock did.

I think the issue shows why languages like Lisp or Python (a Lisp 
derivative in some ways) or even C++ have hung on so well, both as 
philosophies and communities. In those languages you often have data 
structures which are operated on externally by large subroutines. And 
people who like these languages claim they like to sometimes do OO when 
they want it (OO meaning behavior emerges from a lot of interacting 
objects), or at other times to do these sorts of external manipulations on 
sets of inert objects using complex routines. Manipulating otherwise 
inert-seeming objects according to our fancy of the moment is something 
that people are comfortable with, and likely a big part of our mind is 
structured to do that well. Yes, we do talk to people or certain animals 
(or now certain devices). But we also do a lot of manipulation by hand (or 
foot etc.) and classification by eye (or ear or touch).
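
To make the contrast between the two styles concrete, here is a small 
Python sketch. The class and function names (TalkativeRock, Rock, 
split_rock, classify) and the 1000 gram threshold are assumptions invented 
for this example; nothing here comes from the original paper or from any 
particular library.

```python
# Hypothetical sketch contrasting "talking to" and "manipulating" a rock.
from dataclasses import dataclass


# Style 1: "talking to" the object. The rock knows how to split itself.
class TalkativeRock:
    def __init__(self, mass_g):
        self.mass_g = mass_g

    def split(self, fraction):
        """The rock answers a message by producing two new rocks."""
        return (
            TalkativeRock(self.mass_g * fraction),
            TalkativeRock(self.mass_g * (1 - fraction)),
        )


# Style 2: "manipulating" an inert record from the outside.
@dataclass
class Rock:
    mass_g: float
    label: str = ""  # a label we stick on; the rock itself knows nothing


def split_rock(rock, fraction):
    """An external routine does the splitting; the rock stays passive."""
    return (
        Rock(rock.mass_g * fraction),
        Rock(rock.mass_g * (1 - fraction)),
    )


def classify(rock):
    """The observer picks the category (the threshold is arbitrary)."""
    return "boulder" if rock.mass_g >= 1000 else "pebble"
```

Both styles compute the same thing. The difference is where the knowledge 
lives: in the first it sits inside the rock, in the second it sits in the 
tools and the person using them, and the label can be changed later 
without the rock being involved at all.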

So, this suggests perhaps it is a mistake to have an object hierarchy 
where at the top everything knows its name or how to put data into its own 
slots. What's wrong with, say, asking the VM to put data into an object's 
slots? Or asking the VM for the ID of an object? Why should objects be 
expected to be so smart when we programmers are surrounded in the real 
world with objects which are usually quite dumb? This is a violation of the 
good principles that Dan starts out the paper with -- to make a system 
which maps well onto how humans think. Humans both talk and manipulate, so 
it would seem a system should support both styles of interaction. Granted, 
it is almost trivial in Smalltalk to reach into other objects and 
manipulate them, but my point is that Smalltalk is not presented that way, 
and such interactions are generally discouraged as bad practice. Somehow I 
feel this issue needs to be revisited. Among newer GUIs, like 
Morphic, there is an emphasis on direct manipulation. Yet, somehow, this 
notion is discouraged in programming. There is a paradigm conflict here 
which needs to be addressed.
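
For what it is worth, Python's surface syntax already leans this way: 
identity, slot access, and slot listing are spelled as built-in functions 
applied to an object rather than as messages sent to it. The sketch below 
uses only the standard built-ins id(), setattr(), getattr() and vars(); the 
Thing class is a made-up placeholder. This is only an analogy to "asking 
the VM", not a description of how Smalltalk or any VM works internally, and 
under the hood Python may still dispatch some of these calls to methods on 
the object's type.

```python
# Hypothetical sketch: Thing is an invented, deliberately dumb object.
class Thing:
    """No methods, no accessors, no knowledge of its own name."""


thing = Thing()

# Ask the runtime, not the object, to put data into a slot.
setattr(thing, "color", "grey")

# Ask the runtime for an identity token for the object.
identity = id(thing)

# Ask the runtime which slots the object currently has.
slots = vars(thing)

# Ask the runtime to read a slot back out.
color = getattr(thing, "color")
```

The Thing class defines nothing at all, yet it can be labelled, inspected 
and identified from outside, much like the rock in the museum.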

=== summing up ===

I don't have time or energy right now to go into the rest of this 
excellent paper in detail; much is either on the value of modularity, 
which I agree with (even as Squeak Smalltalk may not have enough of it in 
practice), or implementation strategy or GUI (which is a whole other can 
of worms).

But let me say these criticisms are made with (perhaps) 20/20 hindsight a 
quarter century later. Dan himself may have come to these insights or 
better ones by now; no doubt, like many people contemplating their 
earlier work of decades gone by, he can be both proud of it and 
embarrassed by it at the same time (I know I am). As Dan said insightfully 
in the conclusion: "There are clearly other aspects to human thought that 
have not been addressed in this paper. These must be identified as 
metaphors that can complement the existing models of the language."
This essay is intended along those very lines Dan mapped out so long ago.

For its time, the original paper is a remarkable achievement, as is 
Smalltalk-80. It is only because of such great work that we can think 
about moving forward onto even greater projects.

But what I find most illuminating in stumbling across this paper again (I 
probably read it in Byte way back when) is that now, in retrospect, it 
seems to explain both the ways Smalltalk would *succeed* spectacularly in 
the goal of making (*some*) individuals more creative (though, contrast 
with Howard Gardner's theory of *multiple intelligences* including 
non-language ones, see the list here:)
   http://www.infed.org/thinkers/gardner.htm
and also the ways Smalltalk would *fail* (somewhat) in making groups more 
creative (which it was not designed to do), compared to, say, Python 
(which is not as scalable for individual users' creative projects, but has 
other advantages in a group context).

Subconsciously it may be these sorts of issues that motivate would-be 
Squeakers to have an interest in, say, Python. And these sorts of issues 
may be implicitly behind some of the specific issues Squeak has wrestled 
with both as an implementation and as a community. Obviously, the Squeak 
community, especially with, say, Monticello or Croquet, is trying to 
bridge this gap to support group creativity. OpenAugment, based on Squeak, 
is indirectly another such project.
   http://www.openaugment.org/
So it is not like there are no attempts to recognize some of these issues 
and move forward. But perhaps the original "Design Principles Behind 
Smalltalk" paper (as it unconsciously resonates about in the Smalltalk 
community, and even in the minds of core Squeakers) now holds Smalltalk 
(and Squeak) back, as much as that paper propelled Smalltalk forward for a 
quarter century.

--Paul Fernhout
(I hereby place this essay under the GPL, version 2 or later; and also the 
GFDL with no invariant sections).