Feedback on Naiad documentation
craig at netjam.org
Tue Dec 30 01:36:47 UTC 2008
> > By separating class name from identity, Naiad makes Smalltalk more
> > approachable for newcomers, and more productive for developer and
> > user communities.
> Expand on this.
Well, that's what I attempt to do in the rest of the document. :)
The first four paragraphs are an abstract, not a conclusion. I don't
expect anyone to simply believe what I've said by that point; I'm
summarizing what I intend to describe.
But, expanding... By separating class name from identity, we
remove a critical source of ambiguity, and we can transfer methods more
accurately, with less manual labor. We can do this transfer both between
an author's system and other people's systems, and from an author's
system to itself at another point in time. This makes every person's
work more amenable to study by anyone else in the community. It also
makes it easier for each person in the community to express their ideas
clearly through their work. I think this combination of transparency and
expressiveness would make Smalltalk more approachable for newcomers, and
more productive for developer and user communities.
> What has identity?
In this paper I'm referring to the identity of classes.
> On what does the identity depend.
I'm taking advantage of the fact that every class, simply by being
a Smalltalk object, has its own identity by definition.
> How does the Class name get detached from identity.
I detach the name of a class from the identity of that class by
offering an alternative to source code as the medium of code transfer.
> What does the class name represent?
The name of a class represents a class, but exactly which one is
ambiguous in source code. One needs additional context to identify the
class to which a name refers at some point in time. The mechanisms
described in the rest of the paper provide that context, and go so far
as to make source code optional when transferring code.
> > An Edit represents the activation of some edition at a point in
> > time.
> Describe activation.
When an edition is activated, the compiled method to which it
corresponds becomes active in the object memory whose history we are
> > The history memory replaces the current changes and sources
> > files
> I need diagrams.
Okay, I'll make diagrams. (I'm not against them, but my time is
limited and I think the text is a higher priority. Or, put another way,
I think diagrams without text would be harder to understand than text
with no diagrams, and getting one out before both are done is better
than waiting for both.)
> What does subject memory id represent? When in time do subject memory
> id's change?
A subject memory's system ID identifies the subject memory. A
subject memory's ID is set when the memory is considered distinct from
other memories. I set the system ID of the minimal memory before
releasing it, and I imagine that memory will change its ID when next
saved by someone receiving it. Traditionally, the identity of an object
memory remains the same after a snapshot via "save", and it changes
after a snapshot via "save as...". I expect the system ID to reflect
that; generally, the system ID for a memory doesn't change.
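
As a rough Python sketch of that "save" versus "save as..." distinction
(the class and method names are hypothetical, and uuid4 is only a
stand-in for however the system ID is really allocated):

```python
import uuid


class SubjectMemory:
    """Hypothetical model of a memory that carries a system ID."""

    def __init__(self):
        # Assumption: the ID is a UUID assigned when the memory becomes distinct.
        self.system_id = uuid.uuid4()

    def save(self):
        # A plain snapshot keeps the same identity.
        return self.system_id

    def save_as(self):
        # "save as..." makes a distinct memory, so it gets a fresh ID.
        self.system_id = uuid.uuid4()
        return self.system_id


memory = SubjectMemory()
original_id = memory.system_id
id_after_save = memory.save()
id_after_save_as = memory.save_as()
```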
> I.e. The subject memory with two subject ids, how do they differ?
There's no such thing as a subject memory with two system IDs
(please let me know if I slipped up and wrote otherwise somewhere).
> > The subject memory keeps a remote reference to the history
> > memory's instance of EditHistory as a class variable of the local
> > EditHistory class, and interacts with it using utility messages sent
> > to the local EditHistory class. The history memory also keeps that
> > EditHistory instance as a class variable of its local EditHistory
> > class, but as a local reference.
> Explain Remote messaging as it applies here.
I'm using the term in the usual sense: an object in one memory is
sending messages to an object in another memory, by interacting with a
special proxy object that represents the remote object. Since everything
in Smalltalk happens by sending messages to objects, the proxy object is
indistinguishable from the remote object under normal circumstances.
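
A minimal Python sketch of the proxy idea, assuming hypothetical names
throughout (this is not Spoon's code, and a real proxy would serialize
each message and send it to the other memory rather than calling the
target directly):

```python
class EditHistory:
    """Stand-in for the object that lives in the history memory."""

    def __init__(self):
        self.editions = []

    def record(self, edition):
        self.editions.append(edition)
        return len(self.editions)


class RemoteProxy:
    """Forwards any message it does not understand to its target."""

    def __init__(self, target):
        self._target = target
        self.sent = []  # log of forwarded selectors, for illustration only

    def __getattr__(self, selector):
        # Called only for names the proxy itself lacks, so every
        # EditHistory message ends up here.
        def forward(*args):
            self.sent.append(selector)
            return getattr(self._target, selector)(*args)
        return forward


history = EditHistory()
proxy = RemoteProxy(history)
count = proxy.record("Foo>>bar, version 1")
```

The sender uses the proxy exactly as it would use the EditHistory
instance itself, which is the point being made above.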
> Who or what receives the messages?
An instance of EditHistory in the history memory receives the
messages.
> Why are [the messages] Remote?
By design, all the historical information for the subject memory
is kept in a separate history memory. This is both to enable crash
recovery, and to ease separation of a deployed system from its
historical information when the time comes.
> What is a History memory snapshot?
The history memory is a bunch of objects describing all the
editions of a subject memory's classes, methods, authors, modules,
checkpoints, comments, and tags.
> Where is the History memory located?
The history memory is usually located alongside the subject
memory, but it can be on any net-accessible machine.
> Where is the snapshot located?
The subject memory is located wherever the developer wants to put
it, as now.
> > An edition typically elides some of its references when it is
> > transferred out of a history memory. For example, a transferred
> > edition will usually omit the references to its next and previous
> > editions.
> > The requesting subject memory can calculate the ID of those
> > editions and obtain them with a separate request, if necessary.
In general, a next or previous edition's ID differs only in
version number, and version numbers form a simple linear sequence over time.
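
A small Python sketch of that calculation. The field layout of the ID
and the assumption that versions run from 1 to 65,535 are illustrative
guesses, not the actual Naiad format:

```python
from collections import namedtuple

# Hypothetical ID layout: only the version field differs between neighbors.
EditionID = namedtuple("EditionID", ["class_uuid", "author_uuid", "version"])

FIRST_VERSION = 1      # assumption
LAST_VERSION = 65535   # assumption, matching the per-author version limit


def neighbor_ids(edition_id):
    """Return (previous_id, next_id); either may be None at the ends."""
    previous_id = None
    next_id = None
    if edition_id.version > FIRST_VERSION:
        previous_id = edition_id._replace(version=edition_id.version - 1)
    if edition_id.version < LAST_VERSION:
        next_id = edition_id._replace(version=edition_id.version + 1)
    return previous_id, next_id


current = EditionID("class-uuid", "author-uuid", 7)
previous_id, next_id = neighbor_ids(current)
```

A computed next ID may name an edition that does not exist yet; the
requester finds that out only when it asks the history memory for it.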
> Also if it can do that then you haven't really elided the
> references only made them implicit?
I'm referring to how the system leaves the reconstructed edition's
nextEdition and previousEdition fields set to nil, rather than going to
the trouble of serializing further editions. This is because they're
often not relevant to the original edition request. Following the
nextEdition and previousEdition references exhaustively during
serialization would take a significant amount of time and net traffic,
for no good reason in most cases.
> > A subject memory may elect to keep its EditHistory instance as a
> > local object, such as in a situation where one wants some limited
> > immutable history for debugging purposes, and no crash recovery
> > support. Whether in this scenario or in normal development the same
> > EditHistory utility messages suffice, since no special code need be
> > written to support remote objects.
> How do histories know about each other?
Someone using a history memory can decide to connect it to another
history memory, if the person using the latter memory approves.
Typically, someone who has discovered a module (using mechanisms
described in the "checkpoints and modules" section) requests a
connection with a history memory that has the module. (Once connected,
the history memories synchronize so that the receiving system has the
module, with a minimum of traffic.)
> Can histories move about in space and time?
You can put them in any net-accessible place. I'm not sure what
you mean about time here...
> Clone themselves?
Yes (you might want to do this when discarding rarely-used
editions, for example).
> How is the mesh kept track of?
Each system keeps track of the other systems to which it is
connected, and can report this information if permitted by the person
using the system. Typically, one can ask any system in a connected set
for the IDs of the whole set.
> What is [the] minimal subject memory?
The minimal subject memory contains the objects needed to start
and extend the system (and nothing else).
> How (and who) releases it? initially?
I'm preparing it for release, to be available from websites and
other download locations. I expect this work to be easier, doable by
more people, and less frequent as the module distribution mechanism
matures. Over time, I envision nearly all activity taking place in
modules which are added to and removed from the minimal memory, not in
the minimal memory itself. At some point, I expect analysis to prove it
minimal, and that further work on it will be rare.
> How easily (frequently) do versions get created?
One may create them as easily as before, and faster than one is
ever likely to create them from interactive use. Human activity will be
the rate-determining step. I don't expect developers' traditional
artifact-creation rates to change radically with the introduction of
this system (it's the quality of the artifacts that I expect to change,
for the better :).
The traditional maximum UUID allocation rate (as cited by the
Leach/Salz UUID specification, for example) is ten million per second
per machine. Since UUIDs are used to identify classes and authors, I
don't think we'll have a problem distinguishing one class from another,
or one author from another. Each author may create 65,535 versions of
each class in the system, as fast as remote messages can be sent between
the subject and history memories on a single machine. That rate varies
by hardware; for the sake of argument, let's say it's something very
slow, yielding 100 classes created per second. I think that's far more
than was ever likely to occur from interactive use in the past, and even
acceptable as part of the automated installation of an application.
For each version of each class that each author has created, the
author may create 65,535 versions of each method. This means that, for
example, if you add an instance variable to class Foo, you can now
create 65,535 more versions of every single method that existed for Foo
before (because you've made a new version of class Foo). So for each
class/selector pair, you can create 4,294,836,225 distinct method
versions (four billion versions of Array>>new, for example). And each
other developer could create their own four billion total versions of
that one method.
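
A quick check of that arithmetic in Python (the variable names are
mine; the second figure reads "100 classes created per second" as 100
class versions per second, which is an interpretation):

```python
VERSIONS_PER_CLASS = 65535    # class versions per author, from the text
VERSIONS_PER_METHOD = 65535   # method versions per class version, from the text

# Distinct versions of one class/selector pair available to one author.
method_versions_per_author = VERSIONS_PER_CLASS * VERSIONS_PER_METHOD

# Seconds for one author to use up every version of one class at the
# deliberately slow rate of 100 per second.
seconds_to_exhaust_class = VERSIONS_PER_CLASS / 100
```

The product is 4,294,836,225, and using up one class's versions at that
rate takes about eleven minutes of continuous automated creation.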
I think it's very unlikely that anyone has created even ten
thousand versions of any particular method across *all* versions of the
class that holds it, much less for only one version of that class. It
also seems unlikely that any class in Smalltalk has gone through even a
thousand versions from the same author in the entire history of the
system so far.
I understand it's tempting to draw a comparison between these
claims and previous predictions which turned out to be wrong. (I suppose
the most infamous example is Gates' apocryphal exclamation that 640K
of memory should be enough for anyone.) But I think there's an important
difference here: we're estimating the capacity of people to produce, not
to consume. Also, the medium is relatively obscure; we're talking about
development artifact indices here, not the size of the artifacts
themselves (although we have some practical limits imposed on us there,
All that said, one could easily make variable-length versions. I
just think it's space overhead on every ID (and therefore time overhead,
during transmission) that's not worth paying.
> How big is a MethodEdition? a ClassEdition?
The size of a MethodEdition varies depending on whether the
method's instructions or source are included, and, if so, how big they
are. The size of a ClassEdition depends on how many methods are
associated with the class it describes. I assume you're asking for
aggregate sizes, following all references. Otherwise, they're on the
order of 32 bytes each.
> When in time and space do they reside?
They occupy long-term space in the history memory, and briefly
occupy space in the subject memory (in response to queries by
development tools).
> How does one construct the compiled method directly?
You can create a compiled method given a desired header value and
number of instructions. Then you set the literal values appropriately
(in this case, from a method edition's literal markers), and set the
instructions. After that you can install it in a class and run it. It's
the literal markers which make this all work. Just as one can create
compiled methods directly, one can also create each kind of method
literal directly. Literal markers contain the information needed for
that construction, and can carry it out.
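
The steps above might look like this in a Python sketch. Every class
and function name here is hypothetical, and the real object layout of a
compiled method is more involved than this:

```python
class ClassMarker:
    """Marker for a literal that refers to a class, by ID rather than name."""

    def __init__(self, class_id):
        self.class_id = class_id

    def resolve(self, classes_by_id):
        return classes_by_id[self.class_id]


class ObjectMarker:
    """Marker for a plain literal object such as a number or string."""

    def __init__(self, value):
        self.value = value

    def resolve(self, classes_by_id):
        return self.value


class CompiledMethod:
    def __init__(self, header, literal_count, instruction_count):
        self.header = header
        self.literals = [None] * literal_count
        self.instructions = bytes(instruction_count)


def build_method(header, markers, instructions, classes_by_id):
    # 1. Create the method from a header value and an instruction count.
    method = CompiledMethod(header, len(markers), len(instructions))
    # 2. Set each literal by letting its marker construct the value.
    for index, marker in enumerate(markers):
        method.literals[index] = marker.resolve(classes_by_id)
    # 3. Set the instructions.
    method.instructions = bytes(instructions)
    return method


classes_by_id = {"id-of-Array": list}  # stand-in for the receiving memory
method = build_method(
    header=0,
    markers=[ClassMarker("id-of-Array"), ObjectMarker(42)],
    instructions=[0x10, 0x20, 0x7C],
    classes_by_id=classes_by_id,
)
```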
> What is missing is the basic "How do I start from a seed and build a
> sunflower". <How do things start and grow the metaphor refers to a
> current squeak project of mine>
Well, I don't think that's missing. You start with a system that
has enough methods in it to start and accept more methods, then add
methods written by yourself or others. The essence of it is not complicated.
> > Method literal markers are used to transmit a compiled method's
> > literal frame values between object memories. There are method
> > literal marker classes to support references to classes, class
> > variables, other pool variables, and literal objects, and to support
> > methods which perform class-side super-sends. Each method literal
> > marker instance knows how to serialize itself as part of Spoon's
> > remote messaging system. In particular, when a method literal that
> > refers to a class transmits itself, it transmits the ClassID of that
> > class, not the name of the class.
> I still at this point don't understand why this is a good thing. Why
> would I, as a human, not want to know the difference between, say,
> True and False?
I'm not trying to obscure the identity of anything. On the
contrary, I want to introduce identifiers which are less ambiguous than
the textual ones we've used before. When a person says "class Foo", it's
not clear to another person, much less a machine, which version of class
Foo they mean, and by which author. Also, I don't intend for developers
to ever see things like ClassIDs unless they want to. At any given
moment, the browsers we've been using before can display an appropriate
textual equivalent for an ID.
> > This gets at the namesake concept of Naiad, "Name And Identity
> > Are Distinct". When referring to a class, we never need to use its
> > name. Each version of each class is an object with a distinct
> > identity. By using ClassIDs to refer to each of them, we can avoid
> > using class names at all when storing history or distributing
> > code. This means that the name of each class can be anything, as far as
> > the system is concerned.
> Ok then. For human understanding how do I get back to names?
You will never use or see ClassIDs in method sources (the browsers
see to that, aided by the information in class editions). ClassIDs are
what machines use to refer to classes when transferring compiled methods.
> > ...every shared variable pool is the responsibility of
> > some class in the system.
> If I wanted to refer to a pool variable how would I do so?
When composing method source, the same rules as before would
apply. The benefit I propose is always being able to select a shared
variable that you see in method source, then having the browsers tell
you which class is responsible for it (and, in turn, which *authors* are
responsible for it).
> Where in time and space do Checkpoints live?
Checkpoints are the same as other editions in that regard. They
occupy long-term space in the history memory, and briefly occupy space
in the subject memory (in response to queries by development tools).
> What are postrequisite [modules]?
A postrequisite module is a module that must be loaded as a
consequence of loading some other module. Conceptually, it's the
complement of a prerequisite.
> How does a module know them?
Each module has an instance variable which is a collection of
postrequisite modules (see the object model summary in the Naiad design
document). There's another instance variable for prerequisites, too.
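
A Python sketch of how the two collections could drive a load order.
The names are hypothetical, and the ordering (prerequisites before the
module, postrequisites after it) is an assumption for illustration:

```python
class Module:
    def __init__(self, name):
        self.name = name
        self.prerequisites = []
        self.postrequisites = []


def load_order(module, seen=None, order=None):
    """Return module names in the order they would be loaded."""
    if seen is None:
        seen, order = set(), []
    if module.name in seen:
        return order  # already scheduled; also guards against cycles
    seen.add(module.name)
    for prerequisite in module.prerequisites:
        load_order(prerequisite, seen, order)
    order.append(module.name)
    for postrequisite in module.postrequisites:
        load_order(postrequisite, seen, order)
    return order


core = Module("core")
docs = Module("docs")
app = Module("app")
app.prerequisites.append(core)
app.postrequisites.append(docs)
order = load_order(app)
```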
> Who transfers a module out of a history memory? <What dialogue is
> taking place at this point and among which parties?>
A module is transferred from a providing history memory to a
consuming history memory, when the consuming history memory requests it.
The transfer is done via remote message-sending.
> > ...each module edition has a URI by which someone at a remote site
> > may install the module. That URI represents a command to a Spoon
> > system running on a requestor's local machine; it refers to a
> > standard port on localhost. Its path is a text-encoded action,
> > containing an instruction ([for example] "install a module"), the
> > hostname and port of a Spoon system providing the module, and the
> > module's ID.
> > ...
> > The encoded URIs can serve other functions as well, such as
> > listing a system's installed modules, removing an installed module,
> > making a snapshot, and quitting the system.
The instruction part of a command URI can indicate one of several
commands, such as the ones mentioned above. For example, an instruction
value of one could mean "install a module", two could mean "list the
local system's installed modules", and so on. For each instruction,
there are parameters one needs to carry out the instruction (for
installing a module, those parameters are the hostname and port of a
Spoon system providing the module, and the module's ID). The instruction
and all the parameters are concatenated and encoded as text to form the
path of the command URI. Any system that implements the specification
can interpret the command URI and carry out its instruction.
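
To make that concrete, here is a Python sketch of encoding and decoding
such a URI. The local port, the slash-separated path layout, and the
instruction numbering are all assumptions made for illustration; the
actual specification may encode the path quite differently:

```python
from urllib.parse import quote, unquote, urlsplit

INSTALL_MODULE = 1   # the example value used above; real numbering may differ
LOCAL_PORT = 8080    # placeholder for the standard localhost port


def encode_command(instruction, *parameters, local_port=LOCAL_PORT):
    """Concatenate the instruction and its parameters into a URI path."""
    parts = [str(instruction)]
    parts += [quote(str(parameter), safe="") for parameter in parameters]
    return "http://localhost:%d/%s" % (local_port, "/".join(parts))


def decode_command(uri):
    """Recover (instruction, parameters) from a command URI."""
    parts = urlsplit(uri).path.lstrip("/").split("/")
    return int(parts[0]), [unquote(part) for part in parts[1:]]


# Install a module: provider hostname, provider port, module ID.
uri = encode_command(INSTALL_MODULE, "example.org", 9000, "module-42")
instruction, parameters = decode_command(uri)
```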
thanks for the comments!
 Consider, for example, the number of humans who have ever lived. Two
hundred billion seems to be a generous guess currently; see, e.g.,
improvisational musical informaticist
Smalltalkers do: [:it | All with: Class, (And love: it)]