Partitioning the image (was Re: Shrinking sucks!)

Mon Feb 14 22:04:15 UTC 2005

Hi Ned,

Thanks for your detailed contribution to the discussion! Here are my
thoughts.

Ned Konz <ned at squeakland.org> wrote:
> It seems to me (having used PI instances, PI subclasses, and (for a while, I 
> think) having had problems maintaining an instance of a PI subclass) that we 
> need to lock down at least part of the responsibility of a PI.
Just to avoid any misunderstanding: I propose not merely to allow PI
subclasses but to have only PI subclasses. All PI subclasses should be
abstract. None of them should be instantiated, so there will be no
problems of maintaining instances.
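
A rough sketch of what I mean, written in Python only so that the
snippet is self-contained. All names here (PackageInfo as a Python
class, includes_category, extension_protocol, MonticelloInfo) are made
up for illustration and are not the existing PackageInfo API; the
"Foo" / "Foo-*" category match and the "*foo" method category are my
assumption about the name-based convention.

```python
class PackageInfo:
    """Hypothetical abstract base: all behavior lives on the class side."""

    package_name = None  # each package's subclass overrides this

    def __new__(cls, *args, **kwargs):
        # No instances at all, so there is nothing to maintain or migrate.
        raise TypeError(f"{cls.__name__} is abstract and must not be instantiated")

    @classmethod
    def includes_category(cls, category):
        # Assumed convention: "Foo" itself or any "Foo-..." class category.
        prefix = cls.package_name
        return category == prefix or category.startswith(prefix + "-")

    @classmethod
    def extension_protocol(cls):
        # Assumed convention: extension methods live in "*foo" categories.
        return "*" + cls.package_name.lower()


class MonticelloInfo(PackageInfo):
    """A very simple PI subclass: nothing but the package name."""

    package_name = "Monticello"
```

A tool would then ask the class directly, for example
MonticelloInfo.includes_category("Monticello-Base"), without ever
creating an instance.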

> I agree with this part of Lex's comment:
> 
> > Subclasses are great for experimenting, but if we want to move forward
> > and write a bunch of package-aware tools, we need to commit to a single
> > model that the tools can rely on.  Experimenters can continue to make
> > subclasses if they want, but the tool-writers shouldn't need to deal
> > with arbitrary changes.
> [...] 
> > It looks like people have a pretty good feel what should go into a
> > PackageInfo at this point.  The standard #classes, etc., methods are
> > very good I think. 
I don't really understand what Lex means by this. See my answer to
Lex's mail.

> From what I think I know and have heard, it seems like a PI:
> 
> * Must be able to list the classes that its package introduces (with the 
> assumption that the package includes the entire contents of those classes, 
> minus explicit extension methods from other packages)
> 
> * Must be able to list the extension methods that its package introduces 
> (which of course implies a dependency on some other classes)
> 
> * May be able to report explicit dependencies on other packages (perhaps 
> including version dependencies)
> 
> * May be able to provide other metadata about the package, including (some 
> of?) the data needed for the package-level SM card.
> 
> * May be able to provide or point to package installation or deinstallation 
> methods.
> 
> * May be able to provide or point to other utility methods like old instance 
> migration, etc.
> 
> But whether or not you use subclasses, the problem of actually instantiating a 
> PI to use it remains.
> 
> Right now PIs come into being as soon as you tell a tool like the MC browser 
> the name of a package. Because they can map the package name into a list of 
> class categories (and hence classes) and can also spot extension method 
> categories in other classes, they can at least enumerate their contents.
> 
> This satisfies the first two "must" responsibilities above, but doesn't touch 
> any of the others.
> 
> Going further than those two behaviors would require loading additional 
> information from somewhere. The obvious sources of that information include:
> 
> 1. a package file
> 2. a registry entry somewhere on the net (SM, MC, etc.)
> 3. direct communication with the source of the package (like for instance 
> custom HTTP headers or other response to a DAV-like query to the server on 
> which the package lives)
> 
> *Requiring* #2 or #3 means that we're going to prevent one of the modes of 
> working that we've been comfortable with in the past: easy and casual 
> distribution of package files via email or other means (FTP servers, 
> downloads from a Swiki, etc.).
> 
> I think it would be a mistake to lose that ease.
> 
> The reason that each .mcz file carries around with it its entire ancestry is 
> to make it possible to email a single .mcz file and have it immediately 
> usable. If you happen to have some of its ancestors in hand, that gives you 
> more usable information (you can look at version changes and do merges, for 
> instance). You can browse a MCZ file without loading it, and see not only its 
> code but its ancestry and dependencies on other MC packages.
> 
> CS files, of course, have much less file-level metadata (limited to preamble, 
> postscript, and version stamp). But we have made tools that use both the 
> date/version stamp (Conflict Checker) and allow you to browse the actual 
> contents without loading the code.
> 
> I think that we should allow but not require #2 or #3. That is, I believe that 
> *all* package-level metadata should be *able to be* included in a PI, and 
> that it should be possible to read a PI from a package file without loading 
> that package's (non-PI) code or prerequisites. 
Yes, I think it might be good to have a two-step approach: first load
the PackageInfo so that you have minimal information about a package,
and only then load the code.
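
To make the two steps concrete, here is a minimal sketch in Python
(chosen only to keep it self-contained). It assumes a Zip-based
package file with a textual "name: value" metadata member; the member
names package-info.txt and source.st and the helper functions are
hypothetical, not part of the MCZ or SAR formats.

```python
import io
import zipfile

INFO_MEMBER = "package-info.txt"  # hypothetical member holding the serialized PI
CODE_MEMBER = "source.st"         # hypothetical member holding the code


def read_package_info(archive):
    """Step 1: parse only the textual metadata; no code is read or compiled."""
    with zipfile.ZipFile(archive) as zf:
        text = zf.read(INFO_MEMBER).decode("utf-8")
    info = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition(":")
        info[name.strip()] = value.strip()
    return info


def read_code(archive):
    """Step 2: fetch the source once the caller has decided to load it."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(CODE_MEMBER).decode("utf-8")


def make_demo_archive():
    """Build a tiny in-memory package file for trying out the two steps."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(INFO_MEMBER, "name: Foo\nversion: 1.2\nrequires: Bar\n")
        zf.writestr(CODE_MEMBER, "Object subclass: #Foo\n")
    buf.seek(0)
    return buf
```

A tool could call read_package_info first, check the "requires" entry
against what is already in the image, and call read_code only if the
user decides to go ahead.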

> Lex also said: 
> > And thus, Joe Developer shouldn't be making a subclass, either, if they
> > want the standard tools to work with their package.
> 
> Whether or not we allow PI subclasses doesn't matter much if we make the 
> standard tools able to instantiate PIs from a package file. In the case of a 
> PI subclass, obviously its code would have to be loaded and installed from 
> the file first. But I don't see this as too much of a problem, except that if 
> the PI wants to provide other services it would tend to be dependent on other 
> (as yet unloaded) code in the package file.
> 
> But in the interest of simplicity, I'd be in favor of not supporting that 
> model, and just coming up with some way to instantiate a PI from a package 
> file without having to compile any code.
Why do you think it is necessarily simpler if you avoid compiling code?
A file-in of a simple class (and I am arguing for very simple PI
subclasses) seems very simple to me.

> And Lex also said:
> > Also, all PI's should have an optional link to a SM
> > entry (and, if we want to be serious about making stable releases, a
> > Universes entry).  All packages should have installation and
> > deinstallation code.  (and re-configure code....)
> 
> If the PI is allowed to carry with it the metadata that is necessary to create 
> a new SM entry (as well as being able to link to an existing SM entry) then 
> we can decouple the tasks of package creation and SM registration, and can 
> allow automatic registration based on the contents of sufficiently well 
> identified packages.
> 
> So what are we missing? Easy: we don't at present have a package 
> representation that includes a standard way to hold the information necessary 
> to create a PI instance.
> 
> My suggestion, then, is this:
> 
> * We should define the required interface of a PI.
> 
> * We should keep the existing name-based (class category/*method category 
> based) simple PI definition for cases where there are not yet package files, 
> or where PI instances have not been added to package files. This would give 
> us compatibility with existing file formats.
> 
> * We should require that a PI be able to provide its basic services without 
> having to compile custom code (this doesn't disallow PI subclasses, but might 
> require that PIs created from files get converted into instances of the 
> appropriate subclass upon loading the package).
> 
> * We should come up with a standard way to serialize and deserialize PI 
> instances. And that serialized representation should probably be textual, so 
> that it can be humanly readable, mailable without damage, and easily included 
> in changesets and other less-structured files.
> 
> * We might come up with a way to add this serialized PI data to the preamble 
> or postscript of change sets, so that we can leverage existing tools and 
> still be backwards-compatible.
> 
> * We have two existing Zip file formats that represent or can represent 
> packages: SAR and MCZ (and MCD too, I guess). Both of these should define 
> optional members to hold the serialized PI. This would give us backwards 
> compatibility (those members would be ignored by older versions of MC or 
> SARInstaller).
> 
> * We should modify existing tools (MC browser, SARInstaller, Code browser, CS 
> loader, etc.) to support instantiating PI instances from package files where 
> possible.
>
> * We should allow well-defined package-level metadata to be optionally 
> included in a PI. This would, for instance, let us create new SM cards from 
> package files that include sufficient metadata.
> 
> * We should allow future extension of the metadata carried in a PI (perhaps by 
> allowing arbitrary name/value pairs) in such a way that we don't break 
> backwards compatibility.
> 
> What do you all think?
Well, all in all that does not sound to me like TSTTMPW. I must admit
that I have not thought at all about backward compatibility, though.
What I can't really see is how you would handle the code parts that a
PackageInfo needs.

Thanks for all your contributions to Squeak!

- Bernhard