Partitioning the image (was Re: Shrinking sucks!)

Mon Feb 7 09:37:48 UTC 2005

Hi all!

(trying to squeeze everything in right now - it is hectic times in
Squeak country)

Doug Way <dway at mailcan.com> wrote:
> On Feb 4, 2005, at 4:12 AM, goran.krampe at bluefish.se wrote:
> > [SNIP]
> >> Another issue concerning image cleanup is (and it was also discussed
> >> here many times already) to find a way how to unload any package
> >> completely. Even linux distributions do NOT clean up correctly in all
> >
> > Frankly - I don't think this part is at all the real problem. The
> > problem is untangling.
> 
> Agreed.
> 
> Although it would be nice to be able to unload packages that are there.  
>   But you can unload Monticello packages (and any other package type  
> based on PackageInfo), which I guess you already know.

Yup, I know. :) And hey, I will add it to the SqueakMap Package Loader.
I just haven't since it felt like a "teaser" that will only awaken the
demons. ;)

> > But IMHO we shouldn't care that much about untangling for starters - the
> > goals of TFNR (a dormant project) was to make sure we *assign
> > maintainers* to *clearly defined parts* of the image in order to come
> > around the harvesting bottleneck and to improve the sense of ownership
> > and responsibility etc. I still very much think this is *THE* way to  
> > go,
> > but in parallel with other routes - see below.
> 
> Yeah, let's get TFNR kick-started again.  I will hop on board this  
> time. :)

Great! TFNR is now my prio uno too, apart from replying to all the SqF
hoopla, spitting out the bi-weekly and making the new SM. :)

But really, we JUST HAVE TO DO IT this time. Damn shame we are in
different timezones. Can you pop over to IRC whenever you can and we can
see if we overlap? I am there during my work hours, hardly anything on
weekends these days - family takes prio.

> > And also, since we now have PackageRegistry I really think we should  
> > get
> > going with this.
> 
> What is PackageRegistry?  I don't see a class by that name, or a  

Sorry, I meant PackageOrganizer. And note it has a class instvar called
#default.
And btw, I now installed version 18 of PI into 3.9a-6550 and that silly
halt bugs me. Can we get rid of that somehow?

> package on SqueakMap.  Is this just the list of packages which shows up  
> when you open the "Package List" window?

No, that list is... well, it is quite slimmed down. :)
Personally I think it is a bit confusing - we have 3 things currently
making users confused:

1. Package browser. The old browser that packagifies class categories.
IMHO it should be called something else.
2. The above mentioned Package list. It feels a bit... well, can't we
integrate this stuff in the SM Loader?
3. SqueakMap Package Loader which perhaps should be renamed to "Package
Manager". That way it implies more than just "loading". I mean, it does
also show what packages are installed etc.

So either we make something more out of the "package list" or we zap it
and improve the SqueakMap Package Loader to become the One And True
Place For Packages. Which is my suggestion. :) It should for example be
easier to see what packages are installed - currently you need to apply
a filter - and newbies might not even find the damn menu. ;) Perhaps a
button row somewhere in this tool would make it more newbie-friendly.

[SNIP]
> >> Undefining messages is not a problem until two
> >> packages will modify the same concurrently.
> >
> > Eh, well IMHO the problem with uninstalling packages is not exactly  
> > what
> > you describe.
> > If all packages were forced to be constrained to be Monticello packages
> > (without dirty code running in class initializers) - then we would more
> > or less already have that. But they aren't.
> 
> The solution to this is obvious, I think. :)

Just note that this part of my previous post was quite theoretical just
to try to explain the intricacies with "uninstall" in Squeak country.
People tend to think that "packages" in Squeak are just source code. But
in fact - they aren't, because of the simple fact that arbitrary code is
allowed to execute on install. Which of course theoretically can have
irreversible side effects.

Sidenote: I am using Lunar Linux which is a source distro and it is
quite interesting to note how they use some kind of very low level
"tracking" during installations to see exactly what new files were in
fact created on the system. In fact - there is as always a lot of
interesting stuff to learn from studying Linux distros - both Debian and
others like Lunar.
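
To make the tracking idea concrete, here is a minimal sketch in Python
(not Squeak code, and not a claim about how Lunar really does it):
snapshot the files under a directory before and after an install step
and keep the difference as the package's manifest. The names `snapshot`,
`track_install` and `demo` are made up for illustration.

```python
import os
import tempfile


def snapshot(root):
    """Return the set of file paths (relative to root) that exist right now."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def track_install(root, install_step):
    """Run install_step(root) and return the files it created, sorted."""
    before = snapshot(root)
    install_step(root)
    return sorted(snapshot(root) - before)


def demo():
    """Run a fake install in a temp dir and report what it created."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "existing.txt"), "w") as f:
            f.write("already here")

        def fake_install(target):
            # Hypothetical install step: creates exactly one new file.
            with open(os.path.join(target, "newtool.txt"), "w") as f:
                f.write("installed")

        return track_install(root, fake_install)
```

Note that such a manifest only catches things that were *created*. It
says nothing about what the install step modified, and in an image that
is exactly the hard part.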

> First, make it easy to unload packages that can be unloaded.  For  
> example, I don't understand why the "Package List" window (available  
> even without MC) doesn't offer an "unload package" menu item.  They  
> should be unloadable.  Of course, you can't guarantee unloadability if  
> your image is dependent on the code being unloaded, but the package  
> should already be "untangled" before it is made available.  (And there  
> can be issues with unloading if you have a bunch of instances of  
> unloaded classes in your image doing important things, but I don't  
> think that's our biggest problem right now.)

I vote for - according to my rambling above - adding a button row in
SqueakMap Package Loader (and renaming it to Package Manager) - which
includes an "Uninstall" button with balloon help explaining why it is
almost always disabled. :)

> Basically, the Package concept needs to be more available in the UI.   

Yes. And again, I am not saying this because it is a tool that *I* am
maintaining - but I really think we should push the SM Package Loader so
that it is more available. This includes adding it in the right hand
flap.

Sidenote: I intend to take a look at SUnit and copy how all that is done
and release a new loader. Should we also rename it then? Whadda ya say?
"Package Manager"? Or we stick with it as today but then we really need
to have yet another tool (better than PackageList) and that feels
suboptimal.

> There should be a package-centric browser.  The MC Snapshot Browser is  
> pretty much the right UI for this, but it might be nice to have it  
> available outside of MC, just operating on code in the image.  (But if  
> Basic already includes MC, well, I guess the Snapshot Browser could be  
> used.  But make it available from the Package List via a "browse  
> package" menu item and in other places.)

Mmmm. Again, I wonder if "Package List" really is worth having around.
:) What are the compelling reasons?

> Second, require that the partitioning of the base image must be via  
> MC/PackageInfo packages.  Then work on detangling them so they are  
> unloadable.

Yes.

> Then, people will get used to being able to unload certain types of  
> packages, and there will be strong incentive for other package  
> maintainers to make their packages unloadable too, by making them  
> PI/MC/etc unloadable packages.

True.

[SNIP]
> Forget about changesets, .st files and .sar files, they have their  
> purpose, but they don't need to be unloadable.  All real "packages"  
> (i.e. applications or subsystems) don't need to be in those formats,  
> they can be MC/PI-based.

.sar files could be unloadable if they only contain .mcz files and no
"dirty" postscripts.

> Changesets are still good for things like bugfixes, or order-dependent  
> changes to a living image, but they don't really need to be unloadable.  
>   .sar files are good for installing things like fonts or arbitrary data  
> into an image, but there's not a pressing need to have those be  
> unloadable either.

Right, on the other hand we might need them for packaging resources.
So perhaps we could "tag" them somehow? As in "this .sar file is
uninstallable, because it only has .mcz files and a proper postscript".
Eh... would it also need a uninstall-postscript then perhaps? Hmm. :)
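
A rough sketch of what such a "tagging" check could look like, written
in Python since a .sar is a plain zip archive. This is only an
illustration, not something SqueakMap or the SAR installer does today.
I am assuming the script members are named `install/preamble` and
`install/postscript`; the function names are made up. A postscript is
always reported, because no tool can decide whether it is "proper" -
a human has to vouch for that.

```python
import io
import zipfile

# Assumed member names for the install scripts inside a .sar archive.
PREAMBLE = "install/preamble"
POSTSCRIPT = "install/postscript"


def sar_uninstall_report(archive):
    """Return (clean, offenders) for a .sar given as a path or file object.

    clean is True when the archive holds only .mcz files plus, optionally,
    a preamble. Everything else, including a postscript, is an offender.
    """
    offenders = []
    with zipfile.ZipFile(archive) as sar:
        for name in sar.namelist():
            if name.endswith("/"):
                continue  # directory entry, nothing to install
            if name == PREAMBLE or name.lower().endswith(".mcz"):
                continue
            offenders.append(name)  # postscript, fonts, data, .st/.cs files
    return (not offenders, offenders)


def make_sar(member_names):
    """Build an in-memory zip with empty members, for trying the check."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as sar:
        for name in member_names:
            sar.writestr(name, b"")
    buffer.seek(0)
    return buffer
```

For example, `sar_uninstall_report(make_sar(["install/preamble",
"Foo-gk.1.mcz"]))` reports a clean archive, while adding a postscript or
a font file puts those members in the offenders list.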

Anyway, let's dump that issue on Ned and Avi. :) I will mail them.

[SNIP]
> > Now - in short, my choice of attack would be something like this (in  
> > not
> > exactly this order, and many things can be done in parallel):
> >
> > 1. TFNR. Yes. Let's finally partition the image and put people in  
> > charge
> > of each part. Yes, the parts will still be totally intertangled - but
> > each line of code will have a maintainer. And each logical "tool" or
> > "mechanism" will have a caring Squeaker. We should do this NOW.
> 
> Sounds good.  I am on board!

Great. Let's do it. We simply just add a cs that makes PIs more or less
based on the current class categories IMHO. Stuff can be moved around
later. Agree?
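
To show how little there is to it, here is a sketch of the idea in
Python rather than Smalltalk (purely illustrative, not the actual cs):
group the class categories by the part before the first dash, and let
each group become one coarse package. The function name and the sample
category list are my own assumptions.

```python
from collections import defaultdict


def coarse_packages(categories):
    """Group class categories by their first dash-separated segment."""
    packages = defaultdict(list)
    for category in categories:
        package_name = category.split("-", 1)[0]
        packages[package_name].append(category)
    return dict(packages)


# A handful of class categories, just as sample input.
SAMPLE_CATEGORIES = [
    "Kernel-Objects",
    "Kernel-Classes",
    "Collections-Abstract",
    "Morphic-Kernel",
    "Morphic-PDA",
    "Graphics-Primitives",
]
```

Running `coarse_packages(SAMPLE_CATEGORIES)` gives four packages:
Kernel, Collections, Morphic and Graphics. Moving stuff around later
then simply means recategorizing classes.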

> > 2. Get all the VM people to help Craig with Spoon. Get the VM changes  
> > he
> > has made into the regular official VM ASAP.

The above is of course "less important" at this stage. But if Spoon is
ever going to become part of "official Squeak" it needs to be done of
course. I leave it up to Craig at this point.

> > 3. Create a category on SM called "unloadable package" (or something)
> > and gradually start to migrate packages over to Monticello with *nice*
> > class initializers. Again, unloadability is not key to all this, but we
> > can still do it. Having stuff in Monticello format is on the other hand
> > pretty important - especially for having correct upgrades (the other
> > formats can't do that).
> 
> Yes!  Well, we don't have to do this for *all* packages on SqueakMap,  
> but I think it makes sense for at least all of the Squeak-official  
> (Full) packages.

Right.

> > 4. Start moving tools over to Tweak, so that we can get rid of Morphic.
> > And move to newer tools where appropriate - like Omnibrowser instead of
> > the old browser etc. In essence we need Omnibrowser, a workspace,
> > explorer/inspectors and a debugger. At least. :)

Btw, the above was based on the assumption that Tweak was definitely
scheduled to be the next "official" UI framework in official Squeak.
AFAIK this is definitely not "definite" at this point. In short - this
is a topic of its own, will not discuss it in this posting.

> > 5. Get Monticello/SqueakMap and other low basics to load on top of
> > Spoon. Then Tweak. Then OmniBrowser and the tools. :) Or whatever. We
> > need to get a *head* on Spoon with a minimal tool env. No one will  
> > choose
> > to "live" there until it has that at a *minimum*. So until we get there
> > the manpower working on Spoon will be low.
> 
> These are important, although I think detangling the big chunks in the  
> base image may be more urgent?  Perhaps you're already assuming the  
> detangling will be mostly done in steps 1 and 3.
> 
> In other words, I think a lot more people will start working on things  
> like this (4 and 5), once the basic chunks are detangled from the base  
> image.

Yeah, the above was much further down the road. At *this* point (I
change my mind quickly you know) I actually say:

- We shouldn't include Spoon in this plan just yet. If we want to go
small, we can try with any small image that at least has a head (Morphic
or MVC).
- We shouldn't automatically think Tweak is the given successor to
Morphic. The issue is more complex. And Morphic will have to be around
for a long time to come.

In fact - I think we should make sure Squeak can work with both
frameworks. I know Andreas has mentioned resurrecting ideas/code in
StableSqueak (using an abstract UI framework layer) but on the other
hand, given recent postings about Tweak (the Islands post for example) I
am not so sure that Tweak is going to be so easy to just "put in
Squeak". Again, a different topic altogether.

> For example, if you have a working detangled headless Squeak 3.9/4.0  
> Kernel image, even if it's still a bit lumpy and non-streamlined, you  
> can compare that with Craig's Spoon image and see what the differences  
> are.  And if you have a detangled Graphics package, you could try  
> porting it over to the Spoon image.  Etc.

Right. So let's ignore that low level part for now.

> > 6. Get the next release of SM out with full dependencies.
> >
> > Well, something like that. Ok, do we have someone willing to be General
> > on this? I can and should grab number 1 and get that done. And damnit,  
> > I
> > will. But I need help with the rest.
> 
> Overall, your plan sounds pretty compatible with what I was rambling  
> about in this post:
> 
> http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-January/087639.html

Yes ( I reread it) that post is DEAD ON what I am after. And yes, that
is exactly what TFNR was/is all about.
And yes, we really should move squeak.org - :) We now also have a server
that can take it on. Can someone please take care of that? Just email me
to get access to the new server. We have apache2 running there.

> I can help with #1 (TFNR), and the update stream.  I could perhaps take  
> charge on #3.
> 
> With #1, it might be good to have a basic plan of attack figured out  
> here on squeak-dev first, before forming a separate mailing list of  
> volunteers.  If there is a basic plan in place, it might actually help  
> with getting more volunteers, because they'll know something will  
> "really happen this time". ;)

Yeah, but too much planning also... well, I am in "do it" mode for the
last few weeks. :)
But sure, I am with ya.

> My 2 cents on the plan:
> 
> We should probably use the update stream to do the partitioning and the  
> detangling of the base image.  Changesets broadcasted to the update  
> stream would contain partitioning doits and detangling changes.

Yes, agree.

>  There  
> may be other ways to do the detangling, such as an MC-based scheme like  
> the one Avi described here:  
> http://people.squeakfoundation.org/article/39.html .  But I'm not sure  
> it would work, for the reasons Andreas stated in his response.  Also, I  
> think it may be easier with the update stream to keep everyone's  
> changes "in sync" until everything is sufficiently detangled.  Everyone  
> doing the partitioning/detangling would have access to the update  
> stream.  It's easy to broadcast a changeset to the update stream, it  
> works right now.

I am all with that.

> Basically, I don't think we can really consider getting rid of the  
> update stream until #6 (full dependencies) is done.  And even then we  
> may keep it.  Or not, who knows? ;)

Sure. Let's stick to it.

> My other thought is that we should make the initial partitioning as  
> coarse as possible.  "Graphics" is one package, "Morphic" another,  
> "Kernel", "Tools", etc.  This will make it easier to detangle, if  
> you're only worrying about large pieces... don't worry about breaking  
> the large pieces into smaller ones until later.  At this point, no one  

Exactly my view. It doesn't matter if they "overlap" in wrong ways in
the beginning, just as long as every line of code belongs to *someone*.
Then the people in charge can work it out between themselves.

> really cares that much about being able to unload small sections of  
> Morphic such as Morphic-PDA, we just want to be able to separate  
> Morphic itself from the rest of the system.  (Well, separating EToys  
> from Morphic would be a nice second step... get Juan Vuletich working  
> on that one.)

Yes, yes, exactemente. Great initiative by Juan btw, replying to that
separately.

> The partitioning could roughly correspond to the first section of each  
> class category.  This is how MC/PackageInfo seems to work by default,  
> anyway.  I just tried creating a "Morphic" package in Monticello, and  
> it uses all of the classes in the 'Morphic-*' categories for its code,  
> plus there are already some *morphic extensions in the base image which  
> show up.  Saving the Morphic package results in a 2.1MB .mcz file, by  
> the way.

Interesting.

> (Then there's the whole issue that PackageInfo is based on the strings  
> of the class categories and method categories to define the code that  
> it contains, which is kind of a hack, but it works and is compatible  
> with existing tools.  It wouldn't be hard to convert PackageInfo-based  
> packages to a more "real" modules system later, after the detangling is  
> done... it's the detangling that's the hard part.  I'm not sure we want  
> to wait around for a more proper modules system before we begin  
> partitioning & detangling.  I guess one problem with this is that you  

No, we should definitely not wait anymore. I am not even sure what we
would be waiting for.

> may end up changing the class categories of a lot of classes to make  
> this work.  For example, "Collections-*" might have to be renamed to  
> "Kernel-Collections-*" if we want the Collections-* classes to be in  
> the Kernel package, which is where most of them would need to be.  Can  
> the package names be unrelated to the class category names?)

Well, I would favour having separate packages. Why can't collections be
a separate package?

I mean, sure - it will never be uninstallable, but it seems you would be
able to upgrade it to newer releases at least. Hey, that is a thought
worth noting...
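
For reference, the naming rule Doug describes can be modelled in a few
lines. This is a Python sketch of the convention as I understand it, not
the PackageInfo code itself, and the two function names are made up: a
class belongs to a package if its category is the package name or starts
with the name plus a dash, and a method is an extension if its category
is a star followed by the lowercased package name.

```python
def class_category_matches(package_name, class_category):
    """True if the category is the package name or 'Name-Something'."""
    return (class_category == package_name
            or class_category.startswith(package_name + "-"))


def is_extension_category(package_name, method_category):
    """True for method categories such as '*morphic' or '*morphic-tools'."""
    prefix = "*" + package_name.lower()
    category = method_category.lower()
    return category == prefix or category.startswith(prefix + "-")
```

Under this rule 'Morphic-PDA' lands in Morphic, while
'Collections-Abstract' does not land in Kernel unless it is renamed,
which is why a separate Collections package is the path of least
resistance.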

> Ok, those are my thoughts for now.  When do we start? ;)

Yesterday. :)

> (And I know the above is only 80% thought through, but announcing a  
> plan and starting on it is a sure way to get feedback!)
> 
> - Doug

Let's just roll with it - the plan is fine. I say we just MOVE FORWARD
and people can yell if there is anything. :)

regards, Göran

PS. A while back I posted a not-yet-finished-but-pretty-slick-ENH that
uses PackageOrganizer to implement cs splitting and "send changeset to
maintainers" in ChangeSorters. Please take a look:
	http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-January/087253.html
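
To give an idea of what "cs splitting" means, here is a small Python
sketch. It is not the ENH itself, and the data shapes, names and
addresses are all hypothetical: each change is a (class category,
selector) pair, the package is taken from the first segment of the
category, and changes are bucketed per maintainer.

```python
from collections import defaultdict


def split_changes(changes, maintainers):
    """Split (class_category, selector) changes into one bucket per maintainer.

    maintainers maps a package name (first category segment) to an address.
    Changes in packages nobody owns end up under the key None.
    """
    buckets = defaultdict(list)
    for class_category, selector in changes:
        package_name = class_category.split("-", 1)[0]
        owner = maintainers.get(package_name)
        buckets[owner].append((class_category, selector))
    return dict(buckets)


# Hypothetical sample data, for illustration only.
SAMPLE_CHANGES = [
    ("Morphic-Kernel", "drawOn:"),
    ("Kernel-Objects", "printOn:"),
    ("Network-Kernel", "connect"),
]
SAMPLE_MAINTAINERS = {
    "Morphic": "morphic-owner@example.org",
    "Kernel": "kernel-owner@example.org",
}
```

Whatever ends up under `None` is code without a maintainer, which is
precisely what the partitioning is meant to get rid of.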