Partitioning the image (was Re: Shrinking sucks!)

Ned Konz ned at squeakland.org
Fri Feb 18 22:40:45 UTC 2005


On Friday 18 February 2005 7:40 am, Lex Spoon wrote:
> Great analysis, Ned.

Thanks.

> Still, note that using PI subclasses means that you either must compile
> code in order to process the PI, or you must plan to replace a bare PI
> with a PI subclass when a package gets loaded.

Yes.

> Since there really seem to be two parts of a package -- the part that
> makes sense before a package is loaded, and the part that makes sense
> only when it has been loaded -- it would seem to make sense to have two
> objects.  The latter part can then make lots of references to package
> internals, because it will know the code has actually been loaded.

That could work too. I just figured that it would make more sense to do a 
single deserialization followed by the usual code loading (probably in 
pieces).

> But, swizzling PI's in place of each other as packages are loaded and
> unloaded, also appears to work fine.  Are you thinking of basing it on
> become: ?

We could do that, or we could create a new object of the PI subclass (if any) 
that reads its metadata from the PI instance; that new object would then 
replace the PI instance (I'm assuming here that there will be a central place 
to find these PI (sub)instances).
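
To make the second option concrete, here is a minimal sketch, written in Python as a stand-in for Smalltalk. The names PackageRegistry, LoadedFooInfo, from_bare, mark_loaded and mark_unloaded are all hypothetical, as is the set of metadata fields; nothing here is an existing Squeak API.

```python
class PackageInfo:
    """Bare package metadata; usable before any package code is loaded."""

    def __init__(self, name, class_names=(), extension_methods=()):
        self.name = name
        self.class_names = list(class_names)
        self.extension_methods = list(extension_methods)


class LoadedFooInfo(PackageInfo):
    """Hypothetical PI subclass that exists only once package Foo is loaded."""

    @classmethod
    def from_bare(cls, bare):
        # Read the metadata from the bare PI; the subclass adds no new state.
        return cls(bare.name, bare.class_names, bare.extension_methods)


class PackageRegistry:
    """Hypothetical central place to look up PI (sub)instances by name."""

    def __init__(self):
        self._by_name = {}

    def register(self, info):
        self._by_name[info.name] = info

    def lookup(self, name):
        return self._by_name[name]

    def mark_loaded(self, name, subclass):
        # Replace the bare PI with a subclass instance built from its metadata.
        loaded = subclass.from_bare(self._by_name[name])
        self._by_name[name] = loaded
        return loaded

    def mark_unloaded(self, name):
        # Replace the subinstance with a bare PI carrying the same metadata.
        old = self._by_name[name]
        bare = PackageInfo(old.name, old.class_names, old.extension_methods)
        self._by_name[name] = bare
        return bare
```

Unlike become:, this only redirects lookups that go through the registry; anything still holding a direct reference to the old object keeps seeing the old object, which is why the central lookup point matters.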

> I think the following restrictions might be harsher than necessary:
> > That is, we wouldn't allow extension of the
> > instance variables in PI subclasses, and we also would require that the
> > methods used to report metadata wouldn't be overridden. So we serialize
> > and deserialize the PI instances as instances of PackageInfo, *not* as
> > subinstances.

That restriction was merely to allow deserialization of the PI (sub)instance 
without having to worry about shape changes, and before the PI subclass is 
compiled. After all, we don't want to have to compile code just to see what a 
package has in it!
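
A rough illustration of that restriction, again in Python standing in for Smalltalk: if the serialized record holds only the fixed PackageInfo fields, a subinstance can be written out and read back as a plain PackageInfo without its subclass being present. The field list and the helper names to_record and from_record are assumptions made for the example.

```python
# Fixed set of metadata fields; subclasses are assumed not to add any.
FIELDS = ("name", "class_names", "extension_methods")


class PackageInfo:
    def __init__(self, name, class_names, extension_methods):
        self.name = name
        self.class_names = list(class_names)
        self.extension_methods = list(extension_methods)


class FooInfo(PackageInfo):
    """Hypothetical subclass; it may add behavior but no instance variables."""


def to_record(info):
    # Only the base fields are written, and the concrete class is not recorded.
    return {field: getattr(info, field) for field in FIELDS}


def from_record(record):
    # Always rebuilds a plain PackageInfo, so no subclass code is needed.
    return PackageInfo(**record)
```

Reading a record back this way never touches FooInfo, so the contents of a package can be inspected before any of its code is compiled.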

> Still, some restrictions need to be there, it seems.  I am glad people
> are generally agreeing that overriding #classes and the like is not
> worth the extra flexibility.

I think the main argument for having an explicit enumeration of classes would 
be that you might want to define classes that happen to live in different 
categories.

And the main argument for having an explicit enumeration of extension methods 
would be that you didn't want to hijack method categories for package 
identification purposes (granted, this is just a prefix, but still it tends 
to clutter up the names of the categories).

What hasn't been addressed -- and this is independent of how packages are 
represented and stored -- is the issue of conflicting class definitions or 
extension method definitions (or method categorization) between multiple 
packages.
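
Detecting such conflicts is at least straightforward once each package enumerates what it defines. The sketch below (Python standing in for Smalltalk) assumes each package is described by a list of names, either class names or 'Class>>selector' strings; find_conflicts is a hypothetical helper. It only reports overlaps and says nothing about how to resolve them.

```python
from collections import defaultdict


def find_conflicts(packages):
    """Return the names that more than one package claims to define.

    packages maps a package name to an iterable of defined names
    (class names or 'Class>>selector' strings). The result maps each
    contested name to the sorted list of packages claiming it.
    """
    owners = defaultdict(set)
    for package, names in packages.items():
        for name in names:
            owners[name].add(package)
    return {name: sorted(pkgs) for name, pkgs in owners.items() if len(pkgs) > 1}
```

For example, if packages Foo and Bar both list String>>asFoo, that name comes back mapped to both package names, and a loader could refuse or warn at that point.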

> Style questions aside, let's be careful about how the serialization
> happens.  Dumping a PI with SmartRefStream is great for prototyping, but
> may cause maintenance problems down the road if we ever modify what
> variables a PI has.  It seems better to design a package format that can
> last over time. That might mean something as simple as dumping some nested
> arrays of strings using SmartRefStream.  Or it could be done with XML, if we
> don't mind it running slowly.  Either way, it is good to really *design*
> file formats (and network protocols) if they are intended to last over
> time, instead of simply dumping the object model to disk.

I agree. Remember, though, that SmartRefStream has the ability to convert 
between class versions. Still, I'd hoped that the serialized representation 
of the PI would be human-readable. I don't see any need to use XML, 
especially since we aren't including it in the Basic image (last I looked at 
3.8g, anyway).
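
As one possible shape for a designed, human-readable format, here is a small sketch in Python (standing in for Smalltalk). The line-based layout, the "PackageInfo 1" header, and the key names are all invented for illustration; this is not an existing Squeak format.

```python
FORMAT_VERSION = 1


def dump_package(name, class_names, extension_methods):
    # First line names the format and its version; one "key: value" per line after.
    lines = ["PackageInfo %d" % FORMAT_VERSION, "name: " + name]
    lines += ["class: " + c for c in class_names]
    lines += ["extension: " + m for m in extension_methods]
    return "\n".join(lines) + "\n"


def load_package(text):
    lines = text.splitlines()
    header, version = lines[0].split()
    if header != "PackageInfo" or int(version) != FORMAT_VERSION:
        raise ValueError("unsupported package format: " + lines[0])
    result = {"name": None, "classes": [], "extensions": []}
    for line in lines[1:]:
        key, _, value = line.partition(": ")
        if key == "name":
            result["name"] = value
        elif key == "class":
            result["classes"].append(value)
        elif key == "extension":
            result["extensions"].append(value)
        # Unknown keys are skipped, so compatible additions need no version bump.
    return result
```

The file stays readable in any text editor, and the explicit version line gives a reader something to check that does not depend on the current shape of the PI class.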

-- 
Ned Konz
http://bike-nomad.com/squeak/


