HELP! formalizing OO

Alan C. Kay alank at wdi.disney.com
Sun Apr 26 15:26:52 UTC 1998


At 3:27 AM -0400 4/26/98, Florin Mateoc wrote:
--snip --

>But I do wish to learn from your experience, if possible, so could you
>please elaborate ? What was the mathematical baggage that you were carrying
>and how is it reflected in the design of Smalltalk ? Have you published
>anything in this direction ?

Florin --

OK, I'll bite... Remember, I'm taking Russell's definition of math here as "p implies q".

(a) A beautiful mathematical object that straddles the classical math (CM) world and the computer math (CptrM) world is McCarthy's orginal "Lisp in itself" interpreter from ca 1959-62. Its only real flaw is that it's purely functional nature doesn't relflect the state space that is one of the most useful and powerful features of computing. However, its beauty led to considerable good work by Strachey, Landin, and many others to develop what are now called functional languages. There is much to be learned here. 
     My favorite of these (and I think of Dan's as well) is LUCID by Ed Ashcroft, which is an excellent attempt to allow the computer's state space while retaining as much of the safety and clarity that the functional approach provides. We made up several designs at PARC which used some of these ideas and combined them with safe viewing procedures ... but we never implemented them.
     If we ever get around to revamping Squeak/Smalltalk, I believe that some of the changes will be along these lines.
     For the dawn of Smalltalk(72), I tried to do an interpreter definition that would be as small and as clear as McCarthy's but that would also have a state space, and would allow a readable syntax. An example of this is in the Appendix of my HOPL II "Early History of Smalltalk" paper (ca. '93,'95). Also in an Appendix is Dan's very pretty one page formulation of Smalltalk '76 showing structures and bytecode interpreter from his POPL '78 paper.

(b) In my opinion, the best work that Klaus Wirth ever did was in his formalization of Euler (CACM, Jan-Feb 1966). Klaus thought (rightly I believe) that the most important thing about formalization is clarity and providing lots of opportunity for thinking about the resulting artifact. What he did was to take a very simple state space language -- a higher level syntax for which each expression was a single simple machine instruction (as in BCPL but even simpler) -- and proceeded to write a B5000-like interpreter for Euler bytecodes -- which were generated by a compiler (that could have been really simple, but was a little more complex than needed). The result was a few pages of symbols that worked very well both as an "inference machine", and as an object to be thought about.
     A year later to make FLEX, my first OOP language, I did to Euler what Nygaard and Dahl did to Algol to make Simula. Klaus' "new math" approach helped me to think things through.

(c) Also in the sixties, Seymour Papert pointed out that the computer equivalent of calculus:
                loop
                   x <- x + deltaX
(where the loop is the integration, and the assignment statement is the differential equation), could have a form that was completely sensible and even deliciously powerful for children. As a former classical mathematician he realized (as few did) that the computer constituted a new, beautiful and highly useful form of mathematics that could reach down to childhood, and that it would a good idea if we tried to invent and define it. 
     I still think this is one of the big wins today, and I wish more people would help us work on it ....

(d) On the other hand, the various attempts over the years to mathematize computers using CM have not really born fruit. The immense amount of work that went into formal language theories (for translation and compiling) and into representation (for proving programs correct) have been mostly moot. One reason is that CM is a weaker way to represent in many respects than CptrM, and the results are often too large to understand!  One of the things I used to point out to the "proving programs correct" crowd is that if the formalism was indeed a more compact way of completely describing the algorithm, then why shouldn't we just program in the formalism, and concentrate on automatic optimization? The answer generally turned out that the formalisms were neither more compact for many things, nor did they tend to be complete. Since most of the formalisms weren't runable, the bugs were harder to catch ....
     I think objects have done much more to help create complex descriptions that mean something, and get big programs working, if only because they confine parts of the computation and various kinds of errors to pieces of state and code short enough to understand. Two hundred pages of set theoretic formalism might look just like CM, but if a human cannot understand whether all the cases have been covered then the formalism is moot. This is a problem in many areas of Classical Math today (c.f. 4 color theorem, etc.).
     I think that a huge problem here is simply academia and its blessings of respectibility, desirability and appearence of CM in papers, whether it actually helps or not.
     As Ned Irons once pointed out about compiling (and he was one of the few that could use mathematical techniques to good ends) that any language that a human could read and understand, will not have more than very limited bounds of ambiguity -- and this means that nothing really fancy has to be done to translate the syntax part, but because of metaphor in natural languages, much more work is likely to be required in the non syntax part of understanding. The result of this is that all of the languages used today can be easily compiled (and most of them are) with the very first top-down recursive recognizers that Ned Irons was the original inventer of.

Now let's look at Smalltalk. 

(a) One of the things we could do to make it more usefully mathematical in the CptrM sense would be to separate out the pragmatics from the semantics in somewhat the same way that some of the primitives are written. The idea here would be to initially try to write methods that are as simple, straightforward, readable and to the point as possible. This is the math part -- the object being to get an understandable and useful prototype going as soon as possible. Many of us do this now. The problem comes when we start to optimize, and the optimizations start to show up in the formerly clean methods.
     Instead, we could change the language so that we can add on separate "pramatic methods" that, if present, will over-ride the simple "semantic" methods. The pragmods will have all the case analysis, etc (even changes of representation) to make things go as fast as they need. As a check, the semods can be run in parallel with the pragmods whilst debugging, etc.

(b) another thing we could do is to try to introduce more meaning into inheritance. Right now there is no "algebra" that will tell us anything useful about what a subclass might be relative to its super classes. Other oops have made a few gestures here -- even Eiffel, has the possibility of constraining a subclass to be like a super class, and to let the programmer know it. 
     Slot objects and slot inheritance would help give meaning to and enforce polymophism. I have written about this long ago ...
     Smalltalk uses inheritance for so many things that we can't reason with it. This is my definition of poor mathematics, and we can and should change this for the better over the next few years.

(c) Also along these lines would be to restructure the system into a "Basic Smalltalk" (as Dan has suggested on numerous occasions) that has a special place in the whole. These -- perhaps 50 -- classes would be the essence and foundation of the system, and could be learned first. The simplicity and clarity of Basic Smalltalk would aid CptrM reasoning everywhere in the system.

There is much more that could be said here. My main point though is simply that the real issues in programming systems are of a (new) mathematical nature, and have only peripheral connection to the issues and reasons for existance of set theory and abstract algebras in CM.

Cheers,

Alan





More information about the Squeak-dev mailing list