[squeak-dev] Re: second call for feedback on Naiad design

Sun Jan 18 00:09:01 UTC 2009

On Nov 21, 2008, at 5:48 PM, Craig Latta wrote:

>
> Hi--
>
>     Thanks for the comments! I'm responding to the comments so far  
> in this single message. I see no reason to restrict Naiad-related  
> discussion to a single thread; hopefully threads will emerge around
> specific issues, rather than particular people. :)
> Please feel free to break issues out into new threads... for this  
> message, there's such a grab-bag going that I decided to deal with  
> it all in one place.
>
>     Karl writes:
>
> > When this system works, won't image size be an issue, like an
> > ever-growing web browser cache that has no size limit?

I had two concerns along these lines.  I'm not so concerned about  
absolute size, since disk is cheap.   But I wonder about the time/CPU  
it will take to snapshot the whole image each time you edit a method  
or evaluate something in the workspace (DoIts are recorded in the  
history image, right?).  Appending to a file is an O(1) operation, but  
snapshotting an image is O(n), where n is the total number of updates.

Another concern is data integrity.  What happens if your machine  
crashes while you're snapshotting?  If you're simply appending to  
a .changes file, there's no problem.  Of course, this is surmountable,  
but the solution will be more complicated than a changes file.
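
To make the contrast concrete, here is a minimal Python sketch of the two write paths. It is not how Squeak or Naiad does it; the function names and file names are hypothetical. The snapshot path uses the common write-to-temp-then-rename technique, assuming the filesystem renames atomically within one directory, so that a crash mid-write leaves the previous snapshot intact.

```python
import os
import tempfile


def append_change(log_path: str, record: bytes) -> None:
    """Append one change record; the cost depends only on the record size."""
    with open(log_path, "ab") as log:
        log.write(record)
        log.flush()
        os.fsync(log.fileno())


def snapshot_atomically(snapshot_path: str, image_bytes: bytes) -> None:
    """Write the whole image to a temp file, then rename it into place.

    The cost grows with the size of the image. If the process dies before
    os.replace runs, the previous snapshot file is still complete.
    """
    directory = os.path.dirname(os.path.abspath(snapshot_path))
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as temp:
            temp.write(image_bytes)
            temp.flush()
            os.fsync(temp.fileno())
        os.replace(temp_path, snapshot_path)  # swap in the new snapshot
    except BaseException:
        # Clean up the partial temp file; the old snapshot is untouched.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

Even in this small form, the snapshot path needs a temp file, a rename, and cleanup, while the append path is three lines.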

>
>
>     I imagine the history memory will have various utilities, like:
>
> -    dumping all the compiled method info, because the subject memory
>     will always have a compiler
>
> -    dumping all the method source, because the subject memory will
>     never have a compiler :)

This level of flexibility worries me a bit.  It's cool that the model  
supports these types of uses, but if it's going to replace  
the .changes file, it needs to be simple and stable.

I realize that this is a bit unfair, because you're talking about the  
interesting characteristics of the model, not focusing on how to make  
the transition from .changes to Naiad.  At some point, though, we'll  
have to have that conversation.

>
>
> -    storing its less-frequently-accessed editions in one or more
>     separate history memories, which spend most of their time as
>     suspended snapshot files, but which can be activated when
>     necessary. Remote message-sending is a fundamental part of
>     Spoon; there's no inherent reason why the history memory can't be a
>     federation of history memories instead.
>
>     Of course, one might decide to put editions in another object
>     database at any point instead (e.g., Magma or Gemstone). I just
>     want to provide something that provides the bare minimum
>     functionality "out of the box".

Good to see the focus on simplicity.  But what is the use-case driving  
your definition of "bare minimum functionality"?  Is it a prototype  
for people to gain exposure to Naiad?  Is it something that can  
replace .changes for Joe Squeaker's day-to-day use?

Thanks for taking the time to document your design,
Josh

>
>
> -    purging certain editions entirely (rather like when we made new
>     sources files with the traditional setup)
>
> ***
>
>     Wolfgang writes:
>
> > For me the main issue is the protocol that is used between the two
> > images (subject and history). There is little written about it.
>
>     This is true, I haven't finished that documentation yet. One can  
> look at the implementation of remote message-sending from the last  
> Spoon release, but I haven't described it in prose yet, and the  
> Naiad design document is the most prose I've written about how the  
> subject and history memories communicate at a higher level.  
> Eventually all this stuff will be in the Spoon book[1].
>
> > Just for thought, what if the history memory would be a web server.
> > What would the protocol look like?
>
>     Well, there is already a (tiny) webserver in the subject memory,  
> to provide the initial user interface when first run. One could
> load its conveying module into the history memory and do lots of  
> interesting things with it, yes.
>
> > Can the low-level protocol be hacked to support this?
>
>     Yes.
>
> > And one thing I am suspicious of is that there is so much knowledge in
> > the IDs.
>
>     Since they're going to be flying back and forth over sockets,  
> sometimes in large numbers, they need to be as small as possible; so  
> I've thought carefully about minimizing them. At this point I'm  
> simply open to discussion about what anyone would leave out. :)  I
> think I have a good argument for every bit in every ID (likewise for
> every bit in the minimal subject memory).
>
> > And limits to the maximum number of editions etc.
>
>     So far I've decided that it's not worth any extra bits  
> expressing variable-length sizes, but again I'm open to discussion  
> about that.
>
> > I'd rather have proper objects than those IDs, with
> > LargeIntegers :-)
>
>     (The size argument applies here, too.)
>
> ***
>
>     Michael writes:
>
> > I think the main reason people aren't commenting is because that's a
> > lot of reading!
>
>     Sure, I'll just keep asserting that the importance justifies the  
> time. :)
>
> > Perhaps "versions" is a better name than "editions"? That's the name
> > we're more familiar with.
>
>     In this case I think the familiarity is a disadvantage;  
> "version" has multiple strong meanings to people. A "version" is  
> sometimes an artifact which has multiple interesting states over  
> time, and sometimes it's an identifier used to refer to such an  
> artifact. I think it's better to use a less-used term here, and I  
> like the resonance between "edition" and "edit".
>
> > Do we need to run two instances of Squeak to edit code, one for the
> > current version and one for managing the edit history? I assume
> > that's what you mean by needing two object memories.
>
>     That's right. The typical case is one person using one subject  
> memory connected to one history memory that is mostly that person's  
> editions, over a localhost socket connection.
>
> > If so, is it intended for the edit history object memory to be a
> > live central repository shared by developers?
>
>     That's also an option, yes; it's just not the default.
>
> > Does the system work if it can't contact the edit history object
> > memory?
>
>     Yes, but the tools would show decompiled method source, and some  
> history features like regression would be unavailable (similar to
> what happens if you don't have the changes/sources files with the  
> current setup). But the typical case is that you have the history  
> memory snapshot on the local machine, so it seems no more likely  
> this would happen than it would be for one to lose the old changes/ 
> sources files, or indeed the subject memory itself.
>
> > What do your remote references look like?
>
>     Each one is an object which holds a special hash for a remote  
> object, and a stream on a socket connected to the remote system. So...
>
> > How stable are they? Do they rely on, e.g. IP address to find a
> > remote object memory? If somebody changes IP, are the remote
> > references still valid?
>
>     ...currently, they do not survive suspension or termination of  
> the object memory in which they live. They are *not* like URLs, as  
> your comment implies. They are not a description of how to reach a  
> remote object, they are an active connection to a remote object  
> which behaves in all ways like the remote object. In general, they  
> are created by sending messages to other remote objects. The first  
> remote objects in a session are created specially as part of the  
> connection handshake between object memories.
>
>     If the object memory of the reference is suspended (saved and  
> quit), the reference is nilled on resumption of the memory.
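
For readers who want a picture of such a reference, here is a rough Python sketch of a proxy that holds a hash for the remote object plus a connection, and forwards every message over it. It is not Spoon's implementation. The JSON-lines wire format, the integer used as the hash, and the names `Connection`, `RemoteReference`, and `serve` are all assumptions made for illustration. It does not model nilling references on resumption.

```python
import json
import socket
import threading


class Connection:
    """Line-oriented JSON messages over one socket (hypothetical wire format)."""

    def __init__(self, sock: socket.socket) -> None:
        self.reader = sock.makefile("r", encoding="utf-8")
        self.writer = sock.makefile("w", encoding="utf-8")

    def send(self, message: dict) -> None:
        self.writer.write(json.dumps(message) + "\n")
        self.writer.flush()

    def receive(self):
        line = self.reader.readline()
        return json.loads(line) if line else None  # None means peer closed


class RemoteReference:
    """Stands in for a remote object: a hash plus a live connection."""

    def __init__(self, remote_hash: int, connection: Connection) -> None:
        self._remote_hash = remote_hash
        self._connection = connection

    def __getattr__(self, selector: str):
        if selector.startswith("_"):
            raise AttributeError(selector)

        def forward(*args):
            # Send the message to the remote side and wait for its answer.
            self._connection.send(
                {"hash": self._remote_hash, "selector": selector, "args": list(args)}
            )
            return self._connection.receive()["result"]

        return forward


def serve(connection: Connection, exported: dict) -> None:
    """Answer requests for objects in `exported` until the peer disconnects."""
    while True:
        request = connection.receive()
        if request is None:
            break
        target = exported[request["hash"]]
        result = getattr(target, request["selector"])(*request["args"])
        connection.send({"result": result})
```

To try it locally, run `serve` in a background thread on one end of `socket.socketpair()` with `{7: "hello"}` as the exported objects. Then `RemoteReference(7, Connection(other_end)).upper()` is answered by the string living on the serving side. The reference carries no address, only the open connection, which is why it cannot outlive the session.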
>
>     I implemented this part of the system in 2003; it's been in all  
> the Spoon releases so far.
>
> > I assume a class now contains a ClassID and a collection of
> > MethodIDs?
>
>     No, a class has a "base ID", which is a UUID. The subject memory  
> as a whole also has a UUID. The history memory knows the UUID of the  
> subject memory it is tracking, and has "class editions" for each of  
> the classes that have ever existed in the subject memory. Each class  
> edition has "method editions" for all of the methods which have ever  
> existed for that class as defined at a certain point in time.
>
> > Why is ClassID so complex?
>
>     It's complex? It's just a base UUID, an author UUID, and a  
> version number. I think if it were any simpler we'd lose something  
> important.
>
> > Why not just assign each class a new UUID for each new version of
> > that class, with authorship and versioning being metadata of that
> > class?
>
>     It seems to me that it would be useful to have a single unique  
> identifier that can refer to the definition of a class at all points  
> in time, as expressed by all authors. When you want to get more  
> specific as to author and point in time, you can append additional  
> bits to it.
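
As an illustration of an ID built from a base UUID, an author UUID, and a 16-bit version number, here is a small Python sketch. The byte layout (base first, then author, then a big-endian version) and the name `ClassEditionID` are assumptions, not Naiad's actual encoding. The point is that the base UUID is a prefix of the full ID, so the more specific form is the general one with extra bits appended.

```python
import struct
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassEditionID:
    base: uuid.UUID    # identifies the class across all authors and times
    author: uuid.UUID  # identifies who made this edition
    version: int       # 0..65535, so it fits in two bytes

    def pack(self) -> bytes:
        """Serialize to 16 + 16 + 2 = 34 bytes (assumed layout)."""
        if not 0 <= self.version <= 0xFFFF:
            raise ValueError("version must fit in 16 bits")
        return self.base.bytes + self.author.bytes + struct.pack(">H", self.version)

    @classmethod
    def unpack(cls, data: bytes) -> "ClassEditionID":
        """Rebuild an ID from the 34-byte form produced by pack()."""
        if len(data) != 34:
            raise ValueError("expected 34 bytes")
        (version,) = struct.unpack(">H", data[32:34])
        return cls(uuid.UUID(bytes=data[0:16]), uuid.UUID(bytes=data[16:32]), version)
```

Under this layout a full class edition ID costs 34 bytes on the wire, and `packed[:16]` recovers the base ID that names the class at all points in time.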
>
>     Also, I explicitly want to keep history information separate
> from the artifact objects it describes, so that it may be easily
> left behind during production.
>
> > Limiting to 65,536 versions per author is going to create problems
> > in 10 years' time.
>
>     I disagree. Remember, these are editions of a class *definition*  
> (instance variable format, etc.). If you add a method to a class,  
> you're not creating a new edition of that class, you're merely  
> creating a new method edition. From my experience (which encompasses  
> more than ten years ;), authors tend to create entirely new classes  
> much more often than they revise class definitions, and they simply  
> use the classes as they exist a lot more often than that. Frankly,  
> I'd expect 1,024 to suffice here. Sixteen bits is simply the smallest
> whole number of bytes that suffices, so it's convenient as well.
>
> > Isn't having the author and version in the [class] IDs going to
> > cause conflict problems? What happens if the author is careless and
> > ends up with two different versions of a method with the same unique
> > identifier?
>
>     Another good reason for keeping the history information in a  
> separate (and headless) object memory is so it can take care of itself
> without most developers bothering with it. :)  The typical developer  
> uses tools in the subject memory. Those tools only make requests to  
> the history memory for new editions to be added; they have no say in  
> how the corresponding IDs are made. In particular, the history  
> memory decides what the next available version number is for a  
> particular combination of class base ID, author, and selector.
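
The allocation rule described above can be sketched in a few lines of Python. This is only an illustration of the idea that the history side, not the tools, hands out version numbers. The class name `HistoryMemory`, the method name `add_method_edition`, and the choice to start counting at zero are assumptions.

```python
import uuid
from collections import defaultdict
from typing import DefaultDict, Tuple

Key = Tuple[uuid.UUID, uuid.UUID, str]  # (class base ID, author ID, selector)


class HistoryMemory:
    """Hands out version numbers so that callers cannot pick colliding ones."""

    def __init__(self) -> None:
        self._next_version: DefaultDict[Key, int] = defaultdict(int)

    def add_method_edition(
        self, class_base_id: uuid.UUID, author_id: uuid.UUID, selector: str
    ) -> int:
        """Record a new edition and return the version number assigned to it."""
        key = (class_base_id, author_id, selector)
        version = self._next_version[key]
        self._next_version[key] = version + 1
        return version
```

Because the caller only ever asks for an edition to be added and receives the number back, a careless author has no way to submit two different editions under the same identifier.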
>
> > Are author UUIDs going to be able to be looked up to get email
> > addresses and names somehow?
>
>     Each history memory stores author editions; each author edition  
> associates an author UUID with all that info and more (see the class  
> tree at [2]). When you receive a module from another author's  
> system, you get the relevant author editions as well. When you use a  
> new system for the first time, you can create an author edition for  
> yourself.
>
> > Methods shouldn't have an author. The changes between methods
> > versions/editions should have an author.
>
>     I disagree. I think it's less work over time to figure out those  
> changes when necessary.
>
> > I think you're taking the "minimal memory usage" idea too far.
>
>     I think it's necessary to make the system as easy to learn and  
> maintain as I want it to be.
>
> > In my design for distributable packages... packages (cf: classes in
> > Naiad)...
>
>     I would expect them to correspond to Naiad's modules, not classes.
>
> > ...they need to (deep-)copy it...
>
>     Uh-oh... "deep copy" is one of those phrases that immediately  
> makes me suspect something is wrong (almost as bad as someone saying  
> "dude" ;).
>
> > I've separated source from bytecodes.
>
>     Naiad does that, too.
>
> > I'm not sure it's a good idea to propose an unstable system as the
> > next version of Squeak though.
>
>     Well, this is two major versions out, not one. I think we can  
> get plenty of testing in. And I think we're in serious danger of  
> stagnation as it is. For better or worse, I think this history stuff  
> is the sort of thing that has to be done with a relatively  
> provocative step. Sometimes this is good (insert your favorite Alan  
> Kay quote here ;).
>
>
>     thanks again!
>
> -C
>
> [1] http://netjam.org/spoon/book
> [2] http://netjam.org/spoon/naiad
>
>