Spoon progress 27 July 2006: shared variables (including "globals")

Fri Jul 28 23:50:46 UTC 2006

Hi Craig,
I have a couple of points I'd like you to clarify:

On Friday 28 July 2006 09:14, Craig Latta wrote:
> Hi--
>
> contents:
>  introduction
>  shared variables and compilation/execution
>  classes
>  other shared variables
>  the root class
>  the special objects array
>  method references to "Smalltalk"
>  another reminder about live behavior transfer
>  why do this now
>
> introduction
>
>  Previously I mentioned that I wanted to rid Spoon of the system
> dictionary. Here's how I'm currently doing it, and, more generally, how
> I'm supporting shared variables in Spoon. Thanks in advance for any
> feedback! First, a recap of the system dictionary concept and why I
> don't like it. :)
>
> shared variables and compilation/execution
>
>  Shared variables in Smalltalk are stored as associations; the key is a
> shared variable name, and the value is an object associated with that
> name. When the compiler compiles source for a method that refers to a
> shared variable's name, it attempts to find an appropriate
> shared-variable association for that name. It stores that association in
> the "literal frame" of the resulting compiled method (currently, a span
> of the method's bytes after the header and before the instructions).
>
>  There are instructions for pushing the value of a particular
> shared-variable association from a method's literal frame onto the stack
> (or "temporary frame") of a context running that method (see
> Interpreter>>pushLiteralVariableBytecode, the implementation of
> interpreter operations 16r40 to 16r5F).
>
>  Traditionally, all of these shared-variable associations are stored in
> dictionaries that the compiler knows about. As far as the compiler is
> concerned, the outermost shared-variable scope is represented by the
> "system dictionary", a singleton instance of the SystemDictionary class
> called "Smalltalk". (Just to review, note that the system dictionary has
> an association whose key is the symbol #Smalltalk and whose value is the
> system dictionary itself. This association is used in the compiler
> methods that make use of the system dictionary.)
>
> classes
>
>  Most of the associations in the system dictionary refer to classes. The
> key of each such association indicates the name of the corresponding
> class as far as the compiler is concerned. Additionally, each class has
> a "name" instance variable. That is, the name of each class is stored in
> two distinct places: in the system dictionary, and in the class itself.
> In effect, the compiler's notion of a class' name and the class' own
> notion of its name are distinct (and possibly conflicting).
>
>  I propose to make the compiler use the classes' notion of names
> directly, so that there is only one naming scheme, and that the classes
> themselves are responsible for it. To do this, instead of storing a
> class' name symbol in its "name" instance variable, we can just store
> the shared-variable association that the compiled methods use (and which
> used to be in the system dictionary).
>
>  When the compiler wants to find a class with some name in some source a
> human just wrote, it can search the class hierarchy from the root (class
> Object). As I discussed earlier here with Ralph Johnson, it's typically
> not as fast as a dictionary lookup, but it's acceptable (the compiler
> tends not to be a part of the system that needs every cycle squeezed out
> of it).
>
>  If the compiler finds multiple classes with some name in source
> submitted for compilation, it can present other information about these
> classes (e.g., class category or module) to the human, and ask for a
> choice. When already-compiled methods are transferred between systems,
> there is no ambiguity, since class names aren't used at all (see
> "another reminder about live behavior transfer" below).
>

The compiler is no longer heavily used to rebuild packages from source, since
you transfer compiled objects.

You beat Namespaces on simplicity by asking the user to resolve conflicts
through a menu. It's not declarative, it's interactive. But then, how will we
understand code when reading it from source? Hyperlink navigation?

The only concern is if one ever wanted to rebuild a system from source...
Being asked for so many choices would be quite tedious, and it would be
very hard to investigate other people's code.
We had better not rebuild once the system grows.

If you want to merge two equivalent classes because they do mostly the same
job, you'll have to recompile from source... Then you are exposed to name
clashes and their stream of menus...

I imagine you could give the compiler a hint so that it does not ask you the
same question twice within the same compilation unit...
Or should we have tools able to relink?
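
A minimal Python sketch of that hint, combined with the hierarchy search Craig
describes. This is not Spoon code: ClassNode, find_classes, CompilationUnit and
the choose callback (standing in for the menu) are all hypothetical names.

```python
class ClassNode:
    """Hypothetical stand-in for a class: a name, a category, and subclasses."""

    def __init__(self, name, category, subclasses=()):
        self.name = name
        self.category = category
        self.subclasses = list(subclasses)


def find_classes(root, name):
    """Walk the hierarchy from the root and collect every class with this name."""
    matches = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.name == name:
            matches.append(node)
        stack.extend(node.subclasses)
    return matches


class CompilationUnit:
    """Remembers the human's choices so each name clash is asked about once."""

    def __init__(self, root, choose):
        self.root = root
        self.choose = choose   # callback standing in for the interactive menu
        self.resolved = {}     # name -> class chosen earlier in this unit

    def resolve(self, name):
        if name in self.resolved:
            return self.resolved[name]
        matches = find_classes(self.root, name)
        if not matches:
            raise NameError(name)
        # Only ask when the name is genuinely ambiguous.
        chosen = matches[0] if len(matches) == 1 else self.choose(name, matches)
        self.resolved[name] = chosen
        return chosen
```

The cache lives only as long as the compilation unit, so a choice made for one
package would not silently leak into the next one.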

> the root class
>
>  As you might guess, this means that Spoon will not have multiple root
> classes. So far, all the non-primary root classes in Squeak were
> motivated by a desire to use method lookup failure for various
> "proxyish" features. I support such features in Spoon directly with the
> interpreter (see for example, class "Other"), so it's not necessary to
> have more than one root class (it's also not necessary to have the
> "ProtoObject" class).
>
>  As for how to access the root class, there are a couple of options. We
> could store the root class' shared-variable association directly in
> methods, or we could store the root class in the "special objects array"
> (it could take the system dictionary's place there, in fact).
>
> the special objects array
>
>  This brings me to the special objects array. :)  I've always found it
> odd that it's chock-full of well-known and relatively unchanging things,
> but it doesn't have its own class and protocol. I've never liked the
> name "special objects array" either; it seems too vague. Metaphorically,
> I think the special objects array represents the grip that the
> interpreter has (and needs to have) on the object memory. So for Spoon
> I've created a class called "InterpreterGrip" whose sole instance is a
> collection of the objects that the interpreter knows about. I call each
> of these objects a "grip point". There is protocol for accessing them
> (for example, a "rootClass" message). I find this more pleasant than the
> current scheme.
>
> other shared variables
>
>  Anyway, back to the system dictionary. I addressed the associations
> there that refer to classes, but there are others. These are the other
> so-called "global" variables (like Display, the primary display) as well
> as all the "shared pools" (like TextConstants and, strictly speaking,
> Undeclared). I think each global variable should be the responsibility
> of some class. So the primary display could be something you get by
> sending "primary" to DisplayScreen.
>

Yes, but then DisplayScreen should become an abstract class with a convenient
factory that hooks PrimaryDisplay, at image startup, to a concrete
SqueakDisplayScreen (the Squeak window), an OSDisplayScreen (if you want to
use OS windows), or maybe a FakeScreen if you are headless.

And sure, if you drive more than one screen, a single Display global does
not make sense... Better to use messages, you are right.
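
A minimal Python sketch of such a factory, only to show the shape of the idea.
The class names follow the ones above; attach_primary, describe and startup
are hypothetical, and nothing here reflects how Spoon or Squeak actually
attach the display.

```python
from abc import ABC, abstractmethod


class DisplayScreen(ABC):
    """Abstract screen; the class itself is responsible for the primary one."""

    _primary = None

    @classmethod
    def primary(cls):
        """The replacement for the Display global: ask the class."""
        if DisplayScreen._primary is None:
            raise RuntimeError("no primary display attached")
        return DisplayScreen._primary

    @classmethod
    def attach_primary(cls, screen):
        DisplayScreen._primary = screen

    @abstractmethod
    def describe(self):
        """Return a short description of the concrete screen."""


class SqueakDisplayScreen(DisplayScreen):
    def describe(self):
        return "Squeak window"


class FakeScreen(DisplayScreen):
    def describe(self):
        return "headless"


def startup(headless):
    """Hypothetical image-startup hook that picks the concrete screen."""
    screen = FakeScreen() if headless else SqueakDisplayScreen()
    DisplayScreen.attach_primary(screen)
```

Callers would then write DisplayScreen.primary() wherever they used Display,
and a system driving several screens could add further class-side accessors.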

>  Shared pools are dictionaries of shared-variable associations, similar
> to the system dictionary (in fact, I'd call the system dictionary just
> another shared pool). I know some think we should simply banish all
> shared pools, but I'll assume for the moment that we're keeping them. I
> find them useful, I just think some class should take responsibility for
> each one. I've added a "publishedPools" instance variable to Class,
> which stores all the shared pool dictionaries for which a class has
> responsibility (i.e., the class that introduced the pool into the
> system). I renamed the traditional "sharedPools" instance variable in
> Class to "receivedPools"; these are the pools that a class merely uses.
> Finally, I renamed the "classPool" instance variable to
> "classVariablesPool", just to be clearer.
>
>  When you want to use a shared pool, you access the pool by sending a
> message to the responsible class, rather than relying on its name being
> a global variable.
>

Very true: from the compiler's point of view, Smalltalk is just another shared
pool like TextConstants, except that it does not need to be declared as a
poolDictionary...
VW also generalized this with Namespaces. They multiplied the
SystemDictionary; it's fun to see you take the opposite direction.
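
A minimal Python sketch of that view of name resolution. Association, Pool and
resolve are hypothetical names, and the search order (class variables, then
received pools, then the outermost pool) is an assumption made for the
illustration, not a description of any real compiler.

```python
class Association:
    """A shared variable: a name and the object currently bound to it."""

    def __init__(self, key, value):
        self.key = key
        self.value = value


class Pool(dict):
    """A shared pool: a dictionary from names to Associations."""

    def declare(self, key, value):
        self[key] = Association(key, value)
        return self[key]


def resolve(name, class_variables_pool, received_pools, outermost_pool):
    """Return the first Association found for name, or None.

    The outermost pool (the old system dictionary) is searched like any
    other pool; it is simply last and needs no declaration.
    """
    for pool in (class_variables_pool, *received_pools, outermost_pool):
        if name in pool:
            return pool[name]
    return None
```

Because resolve returns the association itself rather than its value, every
method that stored it in its literal frame sees a later change to the value.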

Of course, shared pools are useful, because you can simply write (CR) in your
code instead of (Text constants at: #CR), and that's more efficient in terms
of bytecodes: the second form has the same code for accessing the association
value, plus two message sends and a Symbol literal...

If you want to use shared pool keys in your code, you have to declare the pool
somewhere (in the class definition, for now). Do you write something like
(poolDictionaries: {Text constants}) instead of (poolDictionaries:
'TextConstants')?

If that is the case, I see a small problem. If we store and initialize
TextConstants in a class variable of Text, how would Text itself use the CR
constant?
There is a bootstrap problem in the class definition:

ArrayedCollection subclass: #Text
 instanceVariableNames: 'string runs'
 classVariableNames: 'TextConstants'
 poolDictionaries: {Text constants}
 category: 'Collections-Text'

Should TextConstants be declared and initialized in a neutral place?
Of course, if you never rebuild code but just do the bootstrap once and then
only transfer the resulting compiled objects, then maybe you don't care... is
that it?

Would per-method meta-declarations like
<thisCompiler useSharedPool: Text constants>
make any sense?
Something like Ada/C++/Fortran 90 with package, use,...

> method references to "Smalltalk"
>
>  So now we've got new homes for all the shared-variable associations
> which used to be reachable through the system dictionary. The other
> thing to do is refactor the methods which use the shared-variable
> association for the system dictionary itself (the methods which refer to
> "Smalltalk"). I'm working on this now. There are about a thousand of
> them in a "full" object memory, but for most of them it's clear which
> class should actually take responsibility. For example, there are
> several methods which (in my opinion) are rightly the responsibility of
> the Interpreter class (like the garbage collection messages). I've also
> written some refactoring tools that automate a lot of this (e.g., a tool
> which replaces the push of one literal variable with another when
> followed by the sending of a particular message).
>

True: Smalltalk being the root object, it has also been treated, by a
semantic slip, as the ObjectMemory, the ImageSettingsRepository, or the
Interpreter...
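
The refactoring tool Craig mentions (replace the push of one literal variable
with another when a particular message follows) can be pictured as a small
peephole pass. A minimal Python sketch, assuming a made-up instruction format
of (opcode, operand) tuples and handling only the case where the send
immediately follows the push, as with a unary message; retarget is a
hypothetical name:

```python
def retarget(instructions, old_var, new_var, selector):
    """Return a copy of instructions in which a push of old_var that is
    immediately followed by a send of selector pushes new_var instead."""
    result = list(instructions)
    for i in range(len(result) - 1):
        is_push = result[i] == ("push_literal_variable", old_var)
        is_send = result[i + 1] == ("send", selector)
        if is_push and is_send:
            result[i] = ("push_literal_variable", new_var)
    return result
```

For example, a method doing the equivalent of (Smalltalk garbageCollect) would
end up sending garbageCollect to Interpreter, while other uses of Smalltalk in
the same method stay untouched.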

Sometimes we have to check for the existence of a class.
This can be done with (Smalltalk includesKey: #MyClass).
VW has the #{MyClass} construct...
What is your replacement? A recursion over the class tree from the root?
Of course, you cannot rely on the name alone anymore...

> another reminder about live behavior transfer
>
>  Some of these decisions would be problematic if we were limited to
> using source code ("fileouts") to transfer behavior between systems.
> Since Spoon can transfer methods directly, without recompilation (or
> even source code) and without referring to shared-variable names at all,
> it works (see the MethodLiteralTransmissionMarker hierarchy for details).
>
> why do this now
>
>  This work was always lurking in the future, but now the issue is forced
> by my work on Naiad (Spoon's module system). I'm making a module which
> reattaches the primary display (the system is initially headless), and
> that meant deciding how to access it. Since access is traditionally
> through a global variable (Display), the can of worms was opened. :)
>
> ***
>
>  Again, thanks in advance for any feedback or questions. I'm usually
> around on the Squeak IRC channel from 1700 to 0500 GMT, and I read the
> squeak-dev and Spoon lists.
>
>
>  thanks again,
>
> -C

From what you explained, I do not see any major weak point in your approach.
It seems consistent and quite solid to me.

Nicolas