I had expected to see more in this direction over the years. A famous (in its time and place) precursor with this approach was ROSIE, a very nice expert systems language done at RAND in the late 70s and early 80s. Have you seen it?
Some now-classical research was done on natural language interfaces in the early 80s. Put simply, the general conclusion was that it is a bad idea to make a computer mimic humans, because users will then attribute human capabilities to it that it doesn't have. (Think ELIZA.) This is a basic social principle that we need in order to interact with others (we can't read their minds). When we then attribute too much to the computer, the breakdown that follows is very harsh, much more costly than what is gained by familiarity when it works.
It is much better to present the computer so that we will make reasonable attributions. Math-like languages make us regard it as a math machine, which is not that unreasonable.
But of course, this didn't affect AI researchers much. The real reason we haven't heard about progress in AI is that there hasn't been any to speak of. The small successes in restricted, well-chosen domains have never generalized, be it natural language processing, computer vision, symbolic induction, or whatever. The term "AI winter" was coined in the Lisp community in the late eighties. Today it seems that Bill Gates is about the only one who still thinks that AI will be "the next big breakthrough". For the last 40 years, AI news stories have begun like this: "Soon they will be here--computers that xxxx". "Don't hold your breath" is one of my favorite American expressions.
The trick in most of these systems is not how hard it is for the system to recognize a restricted English syntax, but how hard it is for a random human to learn to write in the restricted syntax.
Would another way to put this be that the supposed advantage, being easier to learn than more standard programming syntaxes, is not there? "Restricted" here means that it looks like it's ordinary language but it isn't. It doesn't let people use ordinary language.
In fact, a main feature of natural language is not the syntax--but the fact that most of the time we needn't be very precise at all with the syntax to be understandable. _This_ is what a "natural" syntax suggests to a user. And this is not what you want a user/programmer to believe, right?
Syntax is a big deal in programming languages but not real ones; this is a point that CS researchers miss all the time. In fact, e.g. Chomsky's theories apply much better to programming languages than real ones. With all the research on syntax, Markov chains (statistics about what words occur close to each other) are still better predictors of word order in "real" natural language than any theory of syntax.
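To make the Markov-chain point concrete, here is a minimal sketch (my illustration, not part of the original discussion): a bigram model predicts the next word purely from co-occurrence counts, with no grammar or theory of syntax anywhere in it.

```python
from collections import Counter, defaultdict

# Toy corpus; any running text would do.
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count bigram frequencies: how often each word follows another.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Predict the most likely next word from co-occurrence counts alone."""
    return follows[word].most_common(1)[0][0]

print(predict_next("on"))  # -> "the" (follows "on" twice in the corpus)
```

On real corpora the same counting, scaled up, is what makes such statistical models surprisingly good at word-order prediction.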
The hard problem in programming is figuring out what to do--you have to understand the domain of available means (the language capabilities) as well as the problem you are solving. Addressing syntax leaves this alone, it only concerns how you express the solution once you have figured it out. Programming languages are just meant to be means for expressing solutions. What you need to address is the process of reaching solutions. The interactive environment of Smalltalk is the most important advance that has been made so far: a compiler merely processes the solution, whereas an interactive environment supports you while working out a solution.
So to condense my point: we need to shift the focus from the form of the expressed solution, to the process which produces it (and tools to support this). But the nature of such cognitive tools has only begun to be addressed in the last 5-10 years.
Oops, there I did that rant again.
Henrik
Henrik --
Well, sort of...
There is lots more to the story if one wants to break the ice with human beings. First, we have to note that using mathematical notation is only a little better with regard to meaning, since the computer conventions are often very different (e.g. how "=" is used in many languages, etc.). This is pretty fake.
More important to me is that even though "syntax is not important", most humans (even programmers) think that it is. "Even programmers" tend to imprint on the first syntax they learn (like Lorenz's ducks) and resist learning other syntaxes. (A striking example is C's syntax for conditionals, which was a hack, a bad idea, and not as nice as conditional syntaxes that came before -- yet it is recapitulated in many languages that came after just for reasons of familiarity.) One of the most wonderful languages ever invented is LISP and many many programmers have never realized it because they were put off by its syntax, etc.
If I were interested in having the main users for the media stuff in Squeak be programmers, then I would consider taking some syntax they are used to and putting it in as an alternative (and there are a number of people on this list that continue to discuss this as an option). But I am much more interested in children, parents, teachers, artists, etc., as users. I want programs they see to look "reasonable" (even if there are new ideas to be learned underneath -- and this is true of writing about new ideas in a natural language like English or Swedish). I want the programs to be "gistable" -- meaning that they can be skimmed and some familiar elements can be recognized. A hero of mine here is Martin Luther, who considered whether he should try to find out how to teach Latin to all of Germany so they could read the Bible for themselves, or whether he should try to give the German language more structure so that it could more easily have the Bible translated into it. He wisely chose the latter, and I think this was one of the big "user interface" insights in history: that you should always find a way to start where the endusers are, and then help them grow into the big ideas.
One of the tricks here is to get away from thinking that programs have to be composed with only a simple text editor. For the last several years we have been experimenting with children doing "tile programming" (as in the current etoy authoring in Squeak). The idea here is that the system only lets you construct syntactically correct programs. If this works, then the syntax can be as readable as one wishes. This is an old idea, and the problems with this approach over the years have been awkwardness, fatigue and distraction while trying to think through how the program should work. Almost by accident this time around, we took phrases derived from message headings (with sample values inserted in the parameters as defaults) as the basic building block. With some sweet work by SqC the act of picking up tile phrases and dropping them in scripts became a pleasant (even somewhat sensual) act, and we were (I was, anyway) quite surprised in extensive testing with children that a far far larger percentage of them became extremely motivated creators and programmers in this system than in 25 years of previous experience. (There were a few other factors operating here as well that contributed to the success of these experiments.)
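The core mechanism can be sketched in a few lines (names and representation entirely hypothetical -- this is not the etoy implementation): tiles are typed program fragments, and a slot only accepts a tile of the matching kind, so an ill-formed script simply cannot be assembled.

```python
# Hypothetical sketch of tile programming: each tile is a typed fragment,
# and a slot rejects tiles of the wrong kind, so every script that can
# be built at all is syntactically correct by construction.

class Tile:
    kind = None

class NumberTile(Tile):
    kind = "number"
    def __init__(self, value):
        self.value = value
    def run(self):
        return self.value

class MoveTile(Tile):
    kind = "command"
    def __init__(self):
        self.distance = None  # slot that only accepts a "number" tile
    def drop(self, tile):
        if tile.kind != "number":
            raise TypeError("this slot only accepts number tiles")
        self.distance = tile
        return self
    def run(self, turtle):
        turtle["x"] += self.distance.run()

turtle = {"x": 0}
script = MoveTile().drop(NumberTile(5))  # drag a number tile into the slot
script.run(turtle)
print(turtle["x"])  # -> 5
```

Dropping, say, another command tile into the distance slot raises an error at composition time, which is the whole point: the "syntax check" happens while assembling, never while reading.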
We realized that it would be really neat to be able to make something similar work with Squeak in general, and that an interesting way to approach a readable yet highly learnable enduser (we call them "omniusers") syntax would be to exhibit programs in a highly readable syntax, with editing and composing mediated by generalizations of the UI techniques that had worked with the children. Quite a bit has been accomplished over the last six months, and we will shortly release the first version of this authoring system for the critiques of the Squeak List.
Cheers,
Alan
--------
At 1:06 PM +0100 11/23/00, Henrik Gedenryd wrote: