Design Principles Behind Smalltalk, Revisited
Paul D. Fernhout
pdfernhout at kurtz-fernhout.com
Tue Dec 26 13:52:26 UTC 2006
J J wrote:
> One doesn't have to look far for this. C became popular for no other
> reason then it was "close enough" and we could use it "right now" to
> build systems. The more advanced languages used resources when there
> were no resources to use.
Paul Graham has an essay on why languages become popular:
In the case of both C and C++, one should not discount the weight of AT&T,
one of the largest and most widespread and visible companies of the time
(as it ran a telephone monopoly). Similarly, without the backing of both
Sun and IBM, Java might well never have taken off. Clearly Smalltalk was
much better than Java in many ways when it was released:
And, before Java, people were actively converting to Smalltalk as a "COBOL
for the 1990s"; C++ and Smalltalk were both so different from COBOL that
there was no huge difference in how easily COBOL programmers could
understand either syntax; if anything, Smalltalk was closer to COBOL's use
of complete words without arbitrary abbreviations.
> "The rise of worse is better" largely misses the point here. The point
> is: getting 80% today is infinitely better then 100% "some day". And
> all the rest is just the incredible weight of "backward compatibility".
> And the concept of backward compatibility isn't just software and
> hardware. It extends to workers as well. The average programmer is
> just not very good (and I don't speak about the worth of the people as
> human beings. There are just so many in the field with no interest in
> it other then money, which is totally ok. It just doesn't make for good
> programmers). The cost of moving people who are barely keeping up from
> C++ to Java isn't so bad. It actually makes things simpler: just the
> same syntax again with much of what they didn't understand taken out.
> But moving these folks away from a C based syntax is out of the
> question. And getting rid of them in favor of more talented programmers
> would be just as out of the question.
Well, it is also true that one big issue is that an Algol-like syntax with
operator precedence (times over plus) is taught in K-12 school. It is a
big advantage for a computer language to build on that, even though that
precedence is arbitrary and Smalltalk is more consistent. And you are
right on how Java seemed an easy move for C++ programmers. Of course, now
Ruby seems an easy move for Java programmers (and much of Ruby is based on
Smalltalk ideas), so in a matter of time, we may see Ruby developers
making the leap to a more self-documenting and flexible syntax. :-)
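To make the precedence point concrete, here is a minimal sketch in Python (chosen only because it is one of the languages discussed in this thread). The helper name eval_left_to_right is hypothetical; it mimics how Smalltalk evaluates a chain of binary messages strictly left to right, with no times-over-plus rule.

```python
import operator

# Binary operators handled by this sketch (an illustrative subset).
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_left_to_right(tokens):
    """Evaluate a flat list like [2, '+', 3, '*', 4] strictly left to right,
    the way Smalltalk treats a chain of binary messages."""
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, tokens[i + 1])
    return result

school_rules = 2 + 3 * 4                                   # times before plus
smalltalk_rules = eval_left_to_right([2, "+", 3, "*", 4])  # (2 + 3) * 4
print(school_rules, smalltalk_rules)  # prints: 14 20
```

The Smalltalk rule is simpler to state (one rule for every binary message), but it gives a different answer from the one taught in school, which is the familiarity cost described above.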
Still, Smalltalk syntax was supposedly designed to be easy for kids to
learn. It is not that hard to learn the syntax. I've helped people in
business learn it. It takes at most a week to become proficient in it (and
often just a day). What is hard is to learn all the libraries. But, with
more and more programmers learning things like Java or Python or Ruby, all
systems with rich libraries, Ruby's being almost exactly Smalltalk's in
many ways, making the leap to a new syntax would be a minor investment
(and one worth taking because Smalltalk syntax is more extensible and
self-documenting than any of those other languages').
People are changing languages all the time. People have moved to Python;
people are moving to Ruby; people have even moved to languages like Perl,
which has a much more tortured syntax, or PHP, which has much more limited
libraries. People learned HTML out of the blue because they wanted to do
web sites, and HTML is a much harder syntax to work in than Smalltalk's in
many ways (though you can edit in vi and then see immediate results in
your local web browser). So, why not people moving to Smalltalk (Squeak
especially)? People in Python or Perl or PHP or Ruby camps are not
bemoaning "backward compatibility" as the reason for limited success and
adoption. While everything you say is true, it is not true enough IMHO to
be the main reason. What are the others and how can they be addressed to
produce a popular free Smalltalk?
>> Again, to contrast with Python, Squeak wants to run the show, but
>> Python plays nice with all the other free tools of the GNU/Linux
> I keep on seeing this, but it appears largely overstated. Java has it's
> own VM, threads etc. as well. And it is easier to connect to the
> outside world in Squeak then Java, because in java you are in "your on
> your own!" land. In Squeak you always were so there is no need to be
> afraid of this step if you need it. In at least Squeak and Dolphin
> smalltalk you can call "extern C" style functions directly from
> smalltalk (thought in squeak you need to load FFI first). That is at
> least as good as any of the other languages.
True. Though there can still be a difference in "culture" of the
communities surrounding a language. Clearly Smalltalk's (or Squeak's)
culture is very different than Python's. I wrote something on that here,
in terms of how the cultures of the communities relate to their histories:
> And if you mean more to address the tools, well yes you *can* edit Java
> code in vi if you really want to. But no one really wants to. And if
> your interface to the language is through some program anyway, then the
> "barrier" of the code not being on the file system disappears.
Well, there is a bigger difference here between Python (which I mentioned)
and Java (which you mentioned). Python plays nicer with UNIX-y systems
than Java in many ways, mostly because Python is smaller, historically had
a faster startup time, and earlier had more comprehensive libraries for
interfacing with UNIX-y libraries. My point was more about Python, which is
being billed as a "glue" language -- something to glue together your C
Java is different, as you point out. However, Java is so different, and
received so much attention, and incorporated so many Smalltalk-pioneered
ideas in the JVM design and class libraries (Swing) that ten years after
it was introduced, it finally mostly works right as a self-contained
environment. Not quite VisualWorks, but darn close in many ways by now,
and it is free as in beer and is becoming free as in freedom (GPL). :-)
But for both Java and Python, being able to be easily edited in vi (or
emacs) or being able to use a conventional text oriented version control
system were indeed big wins, as they reduced the learning curve and
initial commitment to new ideas. Being able to use the familiar file
manager to look at code was also of value. And going beyond vi, the fact
that Java IDEs started to look like C++ IDEs was another big win on
familiarity. And seeing each class in a separate file in the good old
reliable file system was also comforting -- at least you knew where your
source is, and could use grep or other tools to search and manipulate it
and back it up in a familiar fashion. Talks2 shows this is possible --
having a directory of Smalltalk class files. It is possible to generate
text files from an image -- any Smalltalk can typically export such
classes. And it isn't that hard to export instances as text either (I made
something in Python that does it for instances in that language; any
Smalltalk could do much the same) which gives you an image defined by
textual program code to rebuild a world of objects.
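As a rough illustration of exporting instances as text, the following Python sketch writes objects out as program code that rebuilds them when run. It is not the tool mentioned above; the class Point and the function export_as_source are hypothetical, and the sketch assumes each class has a no-argument constructor and that every attribute is a simple literal whose repr() round-trips. References between objects are not handled.

```python
class Point:
    """Hypothetical example class; any class with plain attributes would do."""
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

def export_as_source(named_objects):
    """Return Python source text that rebuilds the given objects.

    Assumes a no-argument constructor and literal attribute values only.
    """
    lines = []
    for name, obj in named_objects.items():
        lines.append(f"{name} = {type(obj).__name__}()")
        for attr, value in sorted(vars(obj).items()):
            lines.append(f"{name}.{attr} = {value!r}")
    return "\n".join(lines) + "\n"

world = {"origin": Point(0, 0), "corner": Point(3, 4)}
source = export_as_source(world)
print(source)  # plain text that grep, diff and version control can handle

rebuilt = {}
exec(source, {"Point": Point}, rebuilt)  # replay the text to rebuild the objects
print(rebuilt["corner"].x, rebuilt["corner"].y)  # prints: 3 4
```

The exported text could live in an ordinary file per object or per class, which is the "image defined by textual program code" idea in miniature.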
>> Forcing everyone to work in Smalltalk using Smalltalk tools, as good
>> as they are, means that other innovations developed in other languages
>> with other tools, for example, Java, are lost to the Squeak community.
> Um... What innovations in Java?
Extensively tested and debugged libraries on a variety of topics.
>> Many humans become fluent in multiple human languages and their
>> accompanying cultures, which is typically a harder thing than learning
>> new computer languages. If one needs to switch mental gears
>> conceptually to work on the VM, then is it *really* so bad if the VM
>> is written directly in C like GNU Smalltalk does?
> Typically? It is harder in every case, no matter how badly designed the
> programming language.
Well, Spanish to Portuguese might be easier than COBOL to OCaml? But COBOL
to OCaml is hard for different reasons than syntax. :-)
> And what do you want to gain here? If the squeak community came out
> today and said "Ok! You can write the squeak VM in anything you want,
> we don't care", they wouldn't suddenly get volunteers knocking the doors
> down to work on squeak. They would only lose people who can work on the
> VM today (not because these people *can't* do it, but because they
> wouldn't want to anymore).
> While I agree that squeak is not required to be written in a subset of
> smalltalk, it *is* and changing it wont gain anything. Getting squeak
> to run on strong talk might, but I haven't seen anyone forbidding that.
My point here wasn't that Squeak should change; it was just an example of
how being different and staying entirely in Smalltalk might not have been
a big win, compared to just having a VM written in, say, C. There remains
the "conceptual" barrier of the VM domain, even as the "technical" one of
syntax is removed.
I am not against translating a VM from an abstract representation in, say,
Smalltalk. I think it is a clever idea, especially since it already has
been done. And with some more work, it might even gain the elegance of, say,
ANTLR's plugin for Eclipse, or ANTLRWorks, where you can step through the
abstraction in an IDE without seeing the underlying code (Java in ANTLR's
case). (Maybe Squeak can already do this by now?)
Still, having said that, a Smalltalk VM is so simple. Consider this 47K
Public Domain one that does most of the work (from the Java version of A
Little Smalltalk, now called SmallWorld):
so how hard would it be to maintain a Smalltalk VM in the original Java?
Translating primitives into C or Java, like for sound manipulation, seems
like a bigger win. But even then, you have to be writing that code (or
rewriting that code) in such a non-Smalltalk way semantically that it is
still not clear to me if there is a lot of value in it. Especially when
the alternative might be to just call an existing sound synthesis library
written in Java or C. We now have Java as a good cross-platform language
with performance equivalent to C++; ten years ago it would have been a
harder choice as to what cross-platform language to use, if not C with all
its quirks (Free Pascal?).
>> Right now, I think Squeak on the JVM, like Talks2 is a step towards,
>> could be a really big win for the Squeak community, and translating
>> the VM from an abstract representation (in Smalltalk) to a specific
>> language is a big win there. Still, the VM could have been in any
>> translatable abstraction (XML, Lisp, ANTLR parse tree from a
>> VM-specific language, Parrot, etc.) and generating Java would still be
>> easily doable (though of course Smalltalk encoding is preferable for
> Java isn't the end-all/be-all here. Microsoft is moving to a more
> dynamic VM already, and because of this Java will be forced to as well.
> Java has always been behind pre-existing technologies and this area is
> no different. If you want to move into the future it is best not to
> follow a group that is always behind.
The value of Squeak on Java is a separate issue. The value is mostly to be
able to reduce deployment overhead, especially for systems that mix
Smalltalk and faster native-y code written in Java or another JVM
language; Talks2 already did a lot of this work.
But here again is an issue of culture. Who cares if Sun is "behind"; or if
Squeak runs 30% slower without some extra dynamic dispatch opcode in the
JVM? Speed is not Squeak's main problem. Being able to leverage Sun's JVM
and the fact that you can call AWT classes in the same way for any
platform Java runs on is a big win for Squeak IMHO, as it would reduce the
maintenance burden of it in terms of complexity of the common code base,
and would also make it easy to install one common package for any platform
Java runs on. Ten years ago, or even five, I myself would have laughed at
the value of this idea (as Java was so buggy and unstable and slow). But
most of the bugs have been fixed, the 1.5 JVM shares memory across JVMs
and does dynamic translation for speed, so Java finally, now that it is
going free under the GPL, has the potential to be a great cross-platform
tool where you get both a common base GUI window system as well as the
ability to deliver fast primitives written in Java, as well as access to a
lot of libraries someone else has already written and debugged for you.
The Squeak community could admit that it would be a big win to leverage
that "pink plane" success, even if it is "behind" and decide to move
forward on top of it, but in other "blue plane" directions. Or it can
continue to spend a lot of time dealing with time-consuming basic issues
relating to packaging and testing C code for lots of platforms (which
essentially just duplicates the work the Java community is doing, but not
as well because of more limited people power).
dot net is a non-starter because it is proprietary (and may be covered by
patents). And I would not make this suggestion without basing it on Sun's
move to the GPL for Java. There are several JVM Smalltalks already, of course.
But none have the power of Squeak. And, building on Squeak's strengths, it
could be an opportune time to also shake off licensing problems, say by
carefully comparing with and using GNU Smalltalk code when possible, or by
using an approach like Bistro to leverage Java libraries temporarily until
replacement versions in Smalltalk could be written in a true "clean room"
But the bigger point, along the lines of this main "revisited" thread, is
that building on others' work in a comprehensive way, like having a Squeak
on top of Java, even though it has been done somewhat with the excellent
Talks2, is something that goes against the grain of the community (and
quite possibly to its disadvantage).
Python, by contrast, runs on the JVM, using Jython, and has great
integration with Java. It has issues, and lags the main release, but
overall it is production quality (at least in earlier releases); and since
Java is such a difficult language to develop in because it is so verbose
with braces and passing through exceptions and types and such, Jython may
well be the big thing that makes Java continue to succeed. :-)
" Jython, lest you do not know of it, is the most compelling weapon the
Java platform has for its survival into the 21st century:-) —Sean
McGrath, CTO, Propylon"
Why not have Squeak in that role too? But the deeper question is, why is
it not there already, and why has, say, Talks2 not gotten more effort
behind it? And I think that has more to do with community issues and
licensing issues than with technology issues. (I myself would build on
Talks2 right now, except it is stuck in the same licensing ambiguity Squeak is;
I'm hoping when Squeak gets that cleared up for itself, that Talks2 might
>> == objects are an illusions, but useful ones ===
> To me this was the most insightful point in the whole essay. Though,
> honestly I thought this was pretty well understood. Object Orientation
> is simply a way of organizing code in a way that makes sense from the
> perspective of the problem domain it is related to. But since
> programming is a task of managing complexity, correct organization is a
> critical piece of the puzzle.
When one thinks deeply about this, perhaps your point about organization
is the big missing piece of the puzzle. Yes, you are right, people build
models of systems with objects, and should admit those models are
imperfect. But there is no formal support for this process in the
environment, or between people, other than using basic Smalltalk tools
(Browser, Debugger, maybe Refactoring tools). Well, I guess you could use
one of the formal OO modelling approaches, like CRC cards, but even that
is oriented to getting one model -- not to managing a variety of possible
representations to be used simultaneously as appropriate. Perhaps a next
generation of OO systems needs to explicitly support this process somehow.
How, I do not know. I just have the question here, not the solution. I do
think having objects point to a context or world is perhaps a start, and I
did that in a couple frameworks I have made in either Python or Smalltalk.
> But this observation is the reason OO databases haven't really taken
> off: An OO database will tend to model things how *your* application
> wants to see them. A traditional relational DBA will model things in
> the most generic way he can so that *all* the applications can build the
> view they need easily. Relational DBA's tend to be of the view point:
> The data will exist for the life of the company, while the applications
> that access it come and go like the tide. And one only needs to look at
> the huge Java rewrites going on to know they are right.
>> Consider Dan's statement of "A computer language should support the
>> concept of "object" and provide a uniform means for referring to the
>> objects in its universe." That appears to me to have made a classical
>> mistake of thinking the universe has only one parsing into one object
>> hierarchy and that the objects exist in some sort of Platonic ideal.
> Actually I think this applies more to C++ derived OO languages (e.g.
> Java). It is those languages that have huge hierarchies of things that
> are not that related due to the brain-dead typing systems. In smalltalk
> the only hierarchy that has to be is inheriting from Object. And you
> don't even have to do that.
> But I think this works just fine: We are choosing to code something, so
> we have to model it in the point of view appropriate to how we are going
> to solve the problem. And this implies some organization technique.
> And among organization techniques, (correct) OO has had the most success
> in my opinion.
All true, but if you look at how people teach OO, and how people talk
about it, especially in Smalltalk circles, I think the community and its
culture are somehow at odds with a greater flexibility. It's hard for me to
detail this precisely; it has more to do with tone. Certainly I like the
Smalltalk approach; it is just not enough.
>> <chair part snipped> There is no neat "class" to put it in.
> I wouldn't expect it to be in a class. I would expect classes to know
> how to stick to each other. :)
>> Clearly you, the reader, can think about this entity so the human
>> brain supports this fluidity in changing our definitions of objects
>> and not requiring a one-to-one mapping to ideal classes to think about
>> them, but a computer language like Smalltalk would have many problems
>> representing this. William Kent, in the book _Data & Reality_
>> discusses these sorts of problems at length.
> I disagree with where the focus is placed here. An entity typically
> does have just one name and would make sense to be called one thing in
> the system. What you are describing sounds more like interface
> protocols. This might be an area that could use more research, but
> honestly I would want to know what is bought by formalizing this
> existing practice more (e.g. making protocols first class objects
> themselves or something).
> For an example of what I mean, in case it isn't that clear, we could
> think about Lists. They have a collection protocol: a series of
> messages that conform to what other collections can do. But they could
> also have a "stack" protocol: a series of messages for treating the list
> as though it were a stack.
> This could be seen as what we do in real life. Due to necessity I may
> find myself driving a nail into a piece of wood with a screw driver.
> But I would never call what is in my hand a hammer. I would simply be
> using it's "blunt object" interface momentarily.
That is one of the reasons I have been attracted to Prototypes, which
attempt to address issues like:
"Experience with early OO languages like Smalltalk showed that this sort
of issue came up again and again. Systems would tend to grow to a point
and then become very rigid, as the basic classes deep below the
programmer's code grew to be simply "wrong". Without some way to easily
change the original class, serious problems could arise"
But prototypes have other problems, the biggest being that they are harder
to make self-documenting the way classes are. As someone put it to me, "if
you want to share something, you probably have to name it".
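For readers who have not used a prototype-based system, here is a minimal sketch of the idea in Python. The class Proto and its clone method are hypothetical, written only for illustration: each object holds its own slots and delegates any lookup that misses to a parent object rather than to a class.

```python
class Proto:
    """A minimal prototype object: slots live on the object itself, and
    lookups that miss are delegated to a parent object, not to a class."""
    def __init__(self, parent=None, **slots):
        self.parent = parent
        vars(self).update(slots)

    def __getattr__(self, name):
        # Called only when normal lookup fails: delegate up the parent chain.
        parent = self.__dict__.get("parent")
        if parent is None:
            raise AttributeError(name)
        return getattr(parent, name)

    def clone(self, **overrides):
        """Make a new object that inherits from this one."""
        return Proto(parent=self, **overrides)

chair = Proto(legs=4, material="wood")
stool = chair.clone(legs=3)
chair.material = "steel"  # changing the original is visible to clones at once
print(stool.legs, stool.material)  # prints: 3 steel
```

The sketch also shows the documentation problem: nothing in the code says what "a chair" is supposed to be, beyond the variable name that happens to hold it.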
Formalizing protocols is one possible idea. Dan mentions it in his paper.
I think the solutions to this issue lie in deeper directions. Classes or
instances or prototypes could be building blocks, perhaps, but we could
use other abstractions and better tools somehow. What these are, I do not
know for sure. Still, like Bill Kent, I think these may lie in the
direction of being able to model "relations" somehow.
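As one illustration of what formalizing protocols can look like, here is a sketch of the list example quoted earlier, using Python's typing.Protocol (available since Python 3.8). The names StackProtocol, CollectionProtocol and ListWithStack are made up for this example. It shows only one way to make a protocol a first-class thing, and it does not address modelling relations.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class StackProtocol(Protocol):
    """The messages needed to treat something as a stack."""
    def push(self, item): ...
    def pop(self): ...

@runtime_checkable
class CollectionProtocol(Protocol):
    """A small slice of what collections can do."""
    def __len__(self): ...
    def __iter__(self): ...

class ListWithStack(list):
    """A list that also answers the stack messages."""
    def push(self, item):
        self.append(item)
    # pop() is inherited from list

items = ListWithStack([1, 2])
items.push(3)
# One object, two protocols; the checks only look for the named methods.
print(isinstance(items, StackProtocol), isinstance(items, CollectionProtocol))
print(isinstance([1, 2], StackProtocol))  # a plain list lacks push()
```

The object keeps one name and one class while answering to several protocols, which matches the screwdriver-as-blunt-object description; whether that is enough is the open question above.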