Hi--
contents:
  introduction
  shared variables and compilation/execution
  classes
  the root class
  the special objects array
  other shared variables
  method references to "Smalltalk"
  another reminder about live behavior transfer
  why do this now
introduction
Previously I mentioned that I wanted to rid Spoon of the system dictionary. Here's how I'm currently doing it, and, more generally, how I'm supporting shared variables in Spoon. Thanks in advance for any feedback! First, a recap of the system dictionary concept and why I don't like it. :)
shared variables and compilation/execution
Shared variables in Smalltalk are stored as associations; the key is a shared variable name, and the value is an object associated with that name. When the compiler compiles source for a method that refers to a shared variable's name, it attempts to find an appropriate shared-variable association for that name. It stores that association in the "literal frame" of the resulting compiled method (currently, a span of the method's bytes after the header and before the instructions).
There are instructions for pushing the value of a particular shared-variable association from a method's literal frame onto the stack (or "temporary frame") of a context running that method (see Interpreter>>pushLiteralVariableBytecode, the implementation of interpreter operations 16r40 to 16r5F).
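To make the mechanism concrete, here is a minimal sketch in Python (not Smalltalk, and not Spoon's actual code). The class names, the tuple-based instruction format, and the `run` helper are all assumptions made for illustration. It shows only that methods compiled against the same name hold the same association object, and that the push instruction reads the association's current value.

```python
class Association:
    """A shared-variable binding: a name (key) and its current value."""

    def __init__(self, key, value):
        self.key = key
        self.value = value


class CompiledMethod:
    """Toy method: a literal frame plus a list of (opcode, operand) pairs."""

    def __init__(self, literals, instructions):
        self.literals = literals          # stands in for the "literal frame"
        self.instructions = instructions


def run(method):
    """Execute the toy instructions and return the resulting stack."""
    stack = []
    for opcode, operand in method.instructions:
        if opcode == "pushLiteralVariable":
            # Push the *value* of the association at this literal index.
            stack.append(method.literals[operand].value)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack


# Hypothetical binding; a compiler would find it by looking up the name.
display = Association("Display", "old display")

# Two methods that refer to the same name share one association object.
method_a = CompiledMethod([display], [("pushLiteralVariable", 0)])
method_b = CompiledMethod([display], [("pushLiteralVariable", 0)])

display.value = "new display"            # rebind the variable once
print(run(method_a), run(method_b))      # both methods read the new value
```

Because the literal frame holds the association rather than the value, rebinding the variable does not require visiting any method.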
Traditionally, all of these shared-variable associations are stored in dictionaries that the compiler knows about. As far as the compiler is concerned, the outermost shared-variable scope is represented by the "system dictionary", a singleton instance of the SystemDictionary class called "Smalltalk". (Just to review, note that the system dictionary has an association whose key is the symbol #Smalltalk and whose value is the system dictionary itself. This association is used in the compiler methods that make use of the system dictionary.)
classes
Most of the associations in the system dictionary refer to classes. The key of each such association indicates the name of the corresponding class as far as the compiler is concerned. Additionally, each class has a "name" instance variable. That is, the name of each class is stored in two distinct places: in the system dictionary, and in the class itself. In effect, the compiler's notion of a class' name and the class' own notion of its name are distinct (and possibly conflicting).
I propose to make the compiler use the classes' notion of names directly, so that there is only one naming scheme, and that the classes themselves are responsible for it. To do this, instead of storing a class' name symbol in its "name" instance variable, we can just store the shared-variable association that the compiled methods use (and which used to be in the system dictionary).
When the compiler wants to find a class with some name in some source a human just wrote, it can search the class hierarchy from the root (class Object). As I discussed earlier here with Ralph Johnson, it's typically not as fast as a dictionary lookup, but it's acceptable (the compiler tends not to be a part of the system that needs every cycle squeezed out of it).
If the compiler finds multiple classes with some name in source submitted for compilation, it can present other information about these classes (e.g., class category or module) to the human, and ask for a choice. When already-compiled methods are transferred between systems, there is no ambiguity, since class names aren't used at all (see "another reminder about live behavior transfer" below).
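The lookup described above can be sketched roughly as follows, in Python rather than Smalltalk. `ClassNode`, `classes_named`, `resolve`, and the category strings are hypothetical names invented for this example; the `choose` callback stands in for asking the human to pick among same-named classes.

```python
class ClassNode:
    """Toy stand-in for a class: it owns its name and knows its subclasses."""

    def __init__(self, name, category, superclass=None):
        self.name = name
        self.category = category
        self.subclasses = []
        if superclass is not None:
            superclass.subclasses.append(self)


def classes_named(root, name):
    """Walk the hierarchy from the root, collecting every class with this name."""
    found = []
    pending = [root]
    while pending:
        cls = pending.pop()
        if cls.name == name:
            found.append(cls)
        pending.extend(cls.subclasses)
    return found


def resolve(root, name, choose):
    """Return the unique match, or let `choose` pick when there are several."""
    matches = classes_named(root, name)
    if not matches:
        raise NameError(name)
    if len(matches) == 1:
        return matches[0]
    return choose(matches)


# A small hierarchy with two classes that happen to share a name.
obj = ClassNode("Object", "Kernel")
coll = ClassNode("Collection", "Collections", obj)
stream_a = ClassNode("Stream", "Streams", obj)
stream_b = ClassNode("Stream", "Networking", coll)

# Here the "human" always picks the class in the Networking category.
picked = resolve(
    obj, "Stream",
    lambda ms: next(m for m in ms if m.category == "Networking"),
)
print(picked.category)
```

The walk visits every class once, so it is linear in the number of classes rather than a constant-time dictionary lookup, which matches the trade-off accepted above.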
the root class
As you might guess, this means that Spoon will not have multiple root classes. So far, all the non-primary root classes in Squeak were motivated by a desire to use method lookup failure for various "proxyish" features. I support such features in Spoon directly with the interpreter (see for example, class "Other"), so it's not necessary to have more than one root class (it's also not necessary to have the "ProtoObject" class).
As for how to access the root class, there are a couple of options. We could store the root class' shared-variable association directly in methods, or we could store the root class in the "special objects array" (it could take the system dictionary's place there, in fact).
the special objects array
This brings me to the special objects array. :) I've always found it odd that it's chock-full of well-known and relatively unchanging things, but it doesn't have its own class and protocol. I've never liked the name "special objects array" either; it seems too vague. Metaphorically, I think the special objects array represents the grip that the interpreter has (and needs to have) on the object memory. So for Spoon I've created a class called "InterpreterGrip" whose sole instance is a collection of the objects that the interpreter knows about. I call each of these objects a "grip point". There is protocol for accessing them (for example, a "rootClass" message). I find this more pleasant than the current scheme.
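As a rough illustration of giving the special objects array its own class and protocol, here is a small Python sketch. Only the names "InterpreterGrip", "grip point", and "rootClass" come from the text; the `install`/`current` class methods, the dictionary storage, and the `grip_point` accessor are assumptions, not Spoon's implementation.

```python
class InterpreterGrip:
    """Toy model: the single collection of objects the interpreter knows about."""

    _instance = None   # the sole instance, once installed

    def __init__(self, grip_points):
        # Maps a grip-point name to the object the interpreter holds on to.
        self._grip_points = dict(grip_points)

    @classmethod
    def install(cls, grip_points):
        """Create and remember the sole instance (hypothetical protocol)."""
        cls._instance = cls(grip_points)
        return cls._instance

    @classmethod
    def current(cls):
        if cls._instance is None:
            raise RuntimeError("no InterpreterGrip installed")
        return cls._instance

    def grip_point(self, name):
        """Generic access to any grip point by name."""
        return self._grip_points[name]

    def root_class(self):
        """Named accessor, mirroring the "rootClass" message in the text."""
        return self.grip_point("rootClass")


# `object` plays the role of the root class in this toy setup.
InterpreterGrip.install({"rootClass": object})
print(InterpreterGrip.current().root_class())
```

Compared with a bare array indexed by magic numbers, callers ask for a grip point by message name and never need to know its position.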
other shared variables
Anyway, back to the system dictionary. I addressed the associations there that refer to classes, but there are others. These are the other so-called "global" variables (like Display, the primary display) as well as all the "shared pools" (like TextConstants and, strictly speaking, Undeclared). I think each global variable should be the responsibility of some class. So the primary display could be something you get by sending "primary" to DisplayScreen.
Shared pools are dictionaries of shared-variable associations, similar to the system dictionary (in fact, I'd call the system dictionary just another shared pool). I know some think we should simply banish all shared pools, but I'll assume for the moment that we're keeping them. I find them useful, I just think some class should take responsibility for each one. I've added a "publishedPools" instance variable to Class, which stores all the shared pool dictionaries for which a class has responsibility (i.e., the class that introduced the pool into the system). I renamed the traditional "sharedPools" instance variable in Class to "receivedPools"; these are the pools that a class merely uses. Finally, I renamed the "classPool" instance variable to "classVariablesPool", just to be clearer.
When you want to use a shared pool, you access the pool by sending a message to the responsible class, rather than relying on its name being a global variable.
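A minimal Python sketch of the published/received split might look like this. The attribute names follow "publishedPools" and "receivedPools" from the text; the `PoolOwner` class, its methods, the choice of a "Text" class as owner of TextConstants, and the "Tab" entry are assumptions made for illustration.

```python
class PoolOwner:
    """Toy class-side state for shared pools."""

    def __init__(self, name):
        self.name = name
        self.published_pools = {}   # pools this class introduced, by pool name
        self.received_pools = []    # pools this class merely uses

    def publish_pool(self, pool_name, bindings):
        """Introduce a pool; this class becomes responsible for it."""
        pool = dict(bindings)
        self.published_pools[pool_name] = pool
        return pool

    def pool(self, pool_name):
        """Reach a pool by asking the responsible class, not via a global."""
        return self.published_pools[pool_name]

    def receive_pool(self, pool):
        """Declare that this class uses a pool published elsewhere."""
        self.received_pools.append(pool)

    def lookup(self, variable_name):
        """Resolve a shared variable through the received pools, in order."""
        for pool in self.received_pools:
            if variable_name in pool:
                return pool[variable_name]
        raise NameError(variable_name)


# Hypothetical owner and user of a TextConstants-like pool.
text = PoolOwner("Text")
text.publish_pool("TextConstants", {"Tab": "\t"})

paragraph = PoolOwner("Paragraph")
paragraph.receive_pool(text.pool("TextConstants"))
print(repr(paragraph.lookup("Tab")))
```

The user holds the same pool object that the owner published, so a change made through the owner is visible to every class that received the pool.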
method references to "Smalltalk"
So now we've got new homes for all the shared-variable associations which used to be reachable through the system dictionary. The other thing to do is refactor the methods which use the shared-variable association for the system dictionary itself (the methods which refer to "Smalltalk"). I'm working on this now. There are about a thousand of them in a "full" object memory, but for most of them it's clear which class should actually take responsibility. For example, there are several methods which (in my opinion) are rightly the responsibility of the Interpreter class (like the garbage collection messages). I've also written some refactoring tools that automate a lot of this (e.g., a tool which replaces the push of one literal variable with another when followed by the sending of a particular message).
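The kind of rewrite such a tool performs can be sketched on a toy instruction list, in Python. The tuple format, the function name, and the second selector are hypothetical; real bytecode would refer to literals by index in the literal frame rather than by name. Moving "garbageCollect" to "Interpreter" follows the opinion stated above and is not a settled design.

```python
def replace_push_before_send(instructions, old_literal, new_literal, selector):
    """Return a copy in which a push of `old_literal` becomes a push of
    `new_literal`, but only when the very next instruction sends `selector`."""
    result = list(instructions)
    for i in range(len(result) - 1):
        is_old_push = result[i] == ("pushLiteralVariable", old_literal)
        is_target_send = result[i + 1] == ("send", selector)
        if is_old_push and is_target_send:
            result[i] = ("pushLiteralVariable", new_literal)
    return result


# Toy method body: two uses of "Smalltalk", only one of which should move.
method = [
    ("pushLiteralVariable", "Smalltalk"),
    ("send", "garbageCollect"),
    ("pushLiteralVariable", "Smalltalk"),
    ("send", "someOtherMessage"),   # hypothetical selector, left untouched
]

rewritten = replace_push_before_send(
    method, "Smalltalk", "Interpreter", "garbageCollect"
)
for instruction in rewritten:
    print(instruction)
```

Matching on the push and send pair, instead of on every reference to the old literal, leaves unrelated uses of "Smalltalk" alone until a responsible class has been chosen for them.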
another reminder about live behavior transfer
Some of these decisions would be problematic if we were limited to using source code ("fileouts") to transfer behavior between systems. Since Spoon can transfer methods directly, without recompilation (or even source code) and without referring to shared-variable names at all, it works (see the MethodLiteralTransmissionMarker hierarchy for details).
why do this now
This work was always lurking in the future, but now the issue is forced by my work on Naiad (Spoon's module system). I'm making a module which reattaches the primary display (the system is initially headless), and that meant deciding how to access it. Since access is traditionally through a global variable (Display), the can of worms was opened. :)
***
Again, thanks in advance for any feedback or questions. I'm usually around on the Squeak IRC channel from 1700 to 0500 GMT, and I read the squeak-dev and Spoon lists.
thanks again,
-C
I like this kind of email. I like to check whether what I know is correct :)
Shared variables in Smalltalk are stored as associations; the key is a shared variable name, and the value is an object associated with that name. When the compiler compiles source for a method that refers to a shared variable's name, it attempts to find an appropriate shared-variable association for that name. It stores that association in the "literal frame" of the resulting compiled method (currently, a span of the method's bytes after the header and before the instructions).
Is my hypothesis correct? That is: is the binding (and not the class) stored so that a class can be recompiled or changed, and the new object representing the class swapped in, without having to go over all the method literal frames?
A positive answer implies that the Smalltalk bindings are shared between the literal frames; otherwise we would have to change them, or change the value (the class) too.
There are instructions for pushing the value of a particular shared-variable association from a method's literal frame onto the stack (or "temporary frame") of a context running that method (see Interpreter>>pushLiteralVariableBytecode, the implementation of interpreter operations 16r40 to 16r5F).
I propose to make the compiler use the classes' notion of names directly, so that there is only one naming scheme, and that the classes themselves are responsible for it. To do this, instead of storing a class' name symbol in its "name" instance variable, we can just store the shared-variable association that the compiled methods use (and which used to be in the system dictionary).
So this means that the name would be a binding (name -> pointer to self)?
When the compiler wants to find a class with some name in some source a human just wrote, it can search the class hierarchy from the root (class Object). As I discussed earlier here with Ralph Johnson, it's typically not as fast as a dictionary lookup, but it's acceptable (the compiler tends not to be a part of the system that needs every cycle squeezed out of it).
other shared variables
Anyway, back to the system dictionary. I addressed the associations there that refer to classes, but there are others. These are the other so-called "global" variables (like Display, the primary display) as well as all the "shared pools" (like TextConstants and, strictly speaking, Undeclared). I think each global variable should be the responsibility of some class. So the primary display could be something you get by sending "primary" to DisplayScreen.
Shared pools are dictionaries of shared-variable associations, similar to the system dictionary (in fact, I'd call the system dictionary just another shared pool). I know some think we should simply banish all shared pools, but I'll assume for the moment that we're keeping them. I find them useful, I just think some class should take responsibility for each one. I've added a "publishedPools" instance variable to Class, which stores all the shared pool dictionaries for which a class has responsibility (i.e., the class that introduced the pool into the system). I renamed the traditional "sharedPools" instance variable in Class to "receivedPools"; these are the pools that a class merely uses. Finally, I renamed the "classPool" instance variable to "classVariablesPool", just to be clearer.
This is somewhat related: I always thought that class variables should be renamed SharedVariables. I like that VW did this, unifying SharedPools and SharedVariables.
I like the idea that we know the class introducing the Pools.
When you want to use a shared pool, you access the pool by sending a message to the responsible class, rather than relying on its name being a global variable.
method references to "Smalltalk"
So now we've got new homes for all the shared-variable associations which used to be reachable through the system dictionary. The other thing to do is refactor the methods which use the shared-variable association for the system dictionary itself (the methods which refer to "Smalltalk"). I'm working on this now.
In the latest version we fixed a lot of them; many of the fixes use "self environment". A lot of the functionality of Smalltalk was also successfully moved to SystemNavigation. What is less satisfactory in 3.8 and 3.9 is that the system is in the middle of a refactoring. Originally we thought that we could keep Smalltalk as only a namespace, so we moved most of the non-namespace behavior to SmalltalkImage. But it failed because people continued to add stuff there.
So I proposed that we create another class, Namespace (Smalltalk would delegate to it for backward compatibility), that "self environment" refer to it directly, that we move the code of SmalltalkImage back to Smalltalk, and that we rename SystemDictionary to reveal its real behavior: SmalltalkImage (or the mess).
There are about a thousand of them in a "full" object memory, but for most of them it's clear which class should actually take responsibility. For example, there are several methods which (in my opinion) are rightly the responsibility of the Interpreter class (like the garbage collection messages).
Yes. You also have source management, system navigation, ... Have you looked at 3.8 or 3.9? We cleaned up a lot of them already.
I've also written some refactoring tools that automate a lot of this (e.g., a tool which replaces the push of one literal variable with another when followed by the sending of a particular message).
another reminder about live behavior transfer
Some of these decisions would be problematic if we were limited to using source code ("fileouts") to transfer behavior between systems. Since Spoon can transfer methods directly, without recompilation (or even source code) and without referring to shared-variable names at all, it works (see the MethodLiteralTransmissionMarker hierarchy for details).
why do this now
This work was always lurking in the future, but now the issue is forced by my work on Naiad (Spoon's module system). I'm making a module which reattaches the primary display (the system is initially headless), and that meant deciding how to access it. Since access is traditionally through a global variable (Display), the can of worms was opened. :)
Craig, do you transfer only methods, or objects too? Roel will start working on a parcel-like system for Squeak.
Again, thanks in advance for any feedback or questions. I'm usually around on the Squeak IRC channel from 1700 to 0500 GMT, and I read the squeak-dev and Spoon lists.
Hi Craig,
I really like this kind of email. It is always cool to get another look at what you believe you know :)
-- Craig Latta http://netjam.org/resume