Hi there,
I've been writing an evaluator for a toy Smalltalk-like language, and reading the paper "The Evolution of Smalltalk" by Dan Ingalls, which discusses the history from ST-72 to Squeak, to get a better understanding of how Smalltalks worked under the hood.
My understanding from this paper (mainly, pages 4-6) and others is that in early implementations, objects were responsible for the evaluation of their own messages in a fairly low-level sense.
That is, ST-72 programs were first parsed in a sort of pre-processing step -- in such a way that objects could be initialized and differentiated from the tokens representing their messages. Then the messages as raw token streams were parsed into ASTs by the objects themselves, which then interpreted them. Is this actually how it worked?
It makes sense to me that it would work in this way -- I was thinking when writing a parser for my toy Smalltalk that the parser should probably be fairly ignorant about the nature of the messages it parses, outside of being able to distinguish selectors and arguments, and binary messages vs keyword vs unary messages.
In particular it probably does not know anything about precedence rules, since that would violate encapsulation and modularity -- e.g. in 3 + 2*4, it should be whatever method is invoked by the + selector that determines what happens to the tokens 2*4 rather than the parser -- i.e. whether we add 3 and 2 first and then multiply the sum by 4, or perform the multiplication and then add the product as in standard arithmetic.
But it also seems like this would force developers to mix tedious parsing code with domain code, which feels kind of messy. I would really appreciate any clarification on this -- apologies if this is the wrong list for this sort of question, I'm very new to the community.
Thank you
Hugo Saavedra asked on Tue, 29 Aug 2023 18:01:13 -0600
I've been writing an evaluator for a toy Smalltalk-like language, and reading the paper "The Evolution of Smalltalk" by Dan Ingalls, which discusses the history from ST-72 to Squeak, to get a better understanding of how Smalltalks worked under the hood.
You might want to also look at the one page description at the end (page 49) of Alan Kay's "Early History of Smalltalk":
http://stephane.ducasse.free.fr/FreeBooks/SmalltalkHistoryHOPL.pdf
To understand how to use Smalltalk-72, the manual is probably the best option:
https://bitsavers.org/pdf/xerox/smalltalk/Smalltalk-72_Instruction_Manual_Ma...
My understanding from this paper (mainly, pages 4-6) and others is that in early implementations, objects were responsible for the evaluation of their own messages in a fairly low-level sense.
Exactly. This was true for ST-72 and ST-74 but in ST-76 Dan Ingalls defined the fixed syntax we have today and translated that into bytecodes that are then interpreted.
That is, ST-72 programs were first parsed in a sort of pre-processing step -- in such a way that objects could be initialized and differentiated from the tokens representing their messages.
Indeed, something like ( token1 token2 ( token3 token4 ) token5 ) would get translated into two Array objects (called Vectors in ST-72 if I remember correctly), where the outer one has 4 elements (one of which is the inner one) and the inner one has 2 elements. That is pretty much what the Lisp reader does as well.
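As a rough illustration of that reading step, here is a minimal Python sketch. It is not ST-72 code: the function name read_vector and the use of plain Python lists in place of Vectors are assumptions made only for this example.

```python
def read_vector(tokens):
    """Turn a flat token list containing "(" and ")" into nested lists.

    Hypothetical helper: Python lists stand in for ST-72 Vectors.
    """
    stack = [[]]  # stack[-1] is the vector currently being filled
    for tok in tokens:
        if tok == "(":
            stack.append([])          # start a new nested vector
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            inner = stack.pop()       # finish the nested vector
            stack[-1].append(inner)   # and store it in its parent
        else:
            stack[-1].append(tok)     # ordinary token, kept as-is
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


source = "( token1 token2 ( token3 token4 ) token5 )"
outer = read_vector(source.split())[0]
print(len(outer), len(outer[2]))
```

The reader knows nothing about what the tokens mean; it only groups them, which leaves all interpretation to whoever receives the vector.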
Then the messages as raw token streams were parsed into ASTs by the objects themselves, which then interpreted them. Is this actually how it worked?
Each class has a single method and that is called with the message stream as the argument. There is no formal AST but the method explicitly pulls in the next tokens from the message using special commands (closed colon and open colon).
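To make the idea concrete, here is a small Python sketch of a class with one method that pulls tokens from the message itself. Everything in it (MessageStream, Counter, receive, and the words add and reset) is invented for illustration and is not how ST-72 actually spelled things; in particular, next() only loosely stands in for fetching the next token from the message.

```python
class MessageStream:
    """Hypothetical stand-in for the rest of the message sent to an object."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def next(self):
        """Return the next token and advance, or None when exhausted."""
        if self.pos >= len(self.tokens):
            return None
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok


class Counter:
    """Hypothetical class whose single method interprets the whole message."""

    def __init__(self):
        self.count = 0

    def receive(self, stream):
        # The object itself decides what each token means; no parser
        # outside the class knows about "add" or "reset".
        while (tok := stream.next()) is not None:
            if tok == "add":
                amount = stream.next()  # pull the argument from the message
                if not isinstance(amount, int):
                    raise ValueError("'add' expects a number")
                self.count += amount
            elif tok == "reset":
                self.count = 0
            else:
                raise ValueError(f"unknown token: {tok!r}")
        return self.count


total = Counter().receive(MessageStream(["add", 5, "reset", "add", 2, "add", 3]))
print(total)
```

Note how the token handling and the counting logic end up in the same method, which is the mixing of parsing and domain code that the question worries about.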
It makes sense to me that it would work in this way -- I was thinking when writing a parser for my toy Smalltalk that the parser should probably be fairly ignorant about the nature of the messages it parses, outside of being able to distinguish selectors and arguments, and binary messages vs keyword vs unary messages.
ST-72 didn't have any of these things, just token streams.
In particular it probably does not know anything about precedence rules, since that would violate encapsulation and modularity -- e.g. in 3 + 2*4, it should be whatever method is invoked by the + selector that determines what happens to the tokens 2*4 rather than the parser -- i.e. whether we add 3 and 2 first and then multiply the sum by 4, or perform the multiplication and then add the product as in standard arithmetic.
APL was one of the inspirations for Smalltalk and, despite being a very mathematical language, it doesn't have precedence either.
But it also seems like this would force developers to mix tedious parsing code with domain code, which feels kind of messy. I would really appreciate any clarification on this -- apologies if this is the wrong list for this sort of question, I'm very new to the community
Forcing users to explicitly use parentheses is just a minor inconvenience:
(3 + 2) * 4
or
3 + (2 * 4)
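For a toy evaluator, the two groupings can be checked with a short Python sketch. It assumes strict left-to-right evaluation of binary operators (as today's Smalltalk does for binary messages) and uses nested lists for parenthesized sub-expressions; the function name evaluate and the OPS table are made up for this example.

```python
import operator

# Hypothetical operator table for the sketch; only + and * are supported.
OPS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Evaluate an int or a list [operand, op, operand, op, operand, ...].

    Operators are applied strictly left to right, with no precedence.
    A nested list plays the role of a parenthesized sub-expression.
    """
    if isinstance(expr, int):
        return expr
    if len(expr) % 2 == 0:
        raise ValueError("expected operand (op operand)*")
    result = evaluate(expr[0])
    for i in range(1, len(expr), 2):
        op, operand = expr[i], expr[i + 1]
        result = OPS[op](result, evaluate(operand))
    return result


print(evaluate([3, "+", 2, "*", 4]))      # no parentheses: left to right
print(evaluate([[3, "+", 2], "*", 4]))    # (3 + 2) * 4
print(evaluate([3, "+", [2, "*", 4]]))    # 3 + (2 * 4)
```

With these rules the unparenthesized form gives the same result as (3 + 2) * 4, and only the explicit 3 + (2 * 4) gives the conventional arithmetic answer.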
-- Jecel
beginners@lists.squeakfoundation.org