One of the things you mention is the process of splitting up the existing code so that it can be managed and loaded in chunks. This is exactly the process I had in mind when I wrote SpaghettiTracer, since then renamed MudPie.
First some generalities, just to put us all on the same page; then I'll describe MudPie, and then some of my thoughts about what would help us make progress on this. When I say progress, I mean progress on the following specific goal - *over time, the code in the squeak-dev released image becomes divisible into smaller packages*.
I follow Avi in completely ignoring, for now, the code container technology, and I also ignore whether and how code actually gets moved into packages.
Problem: Squeak includes lots of code, and in terms of dependencies, not all code is born equal.
A. There is some Kernel stuff, which must be there for anything to run. This includes, for example, all the classes that the VM knows about. Everything depends on this. The Kernel shouldn't depend on anything else.
B. There are libraries that we are used to, for example Morphic, which depend on the Kernel, and which applications depend on.
C. Applications like Celeste depend on many different libraries.
The most obvious thing about dependencies is that they should point in one direction only, A<-B<-C (applications depend on libraries, libraries depend on the Kernel), and never the other way.
We can't break Squeak into lots of little packages because this is not the case today. In fact, Kernel classes depend on Morphic, and Morphic knows intimately about some things that most people would not call libraries. We have cycles. This means that after an application is removed, Morphic can break.
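To make the A<-B<-C rule concrete, here is a rough sketch in Python (not Smalltalk, only to keep it short and runnable). The layer numbers, the dependency list, and the function name are all made up for illustration; none of this comes from the image or from MudPie.

```python
# Hypothetical layer assignment: 0 = Kernel (A), 1 = libraries (B), 2 = applications (C).
LAYER = {"Kernel": 0, "Morphic": 1, "Celeste": 2}

# Hypothetical dependency edges, read as (user, used): the first depends on the second.
DEPENDS_ON = [
    ("Morphic", "Kernel"),
    ("Celeste", "Morphic"),
    ("Celeste", "Kernel"),
    ("Kernel", "Morphic"),  # the kind of upward dependency described above
]

def upward_dependencies(edges, layer):
    """Return the edges whose user sits in a lower layer than the thing it uses."""
    return [(user, used) for user, used in edges if layer[user] < layer[used]]

violations = upward_dependencies(DEPENDS_ON, LAYER)
print(violations)  # [('Kernel', 'Morphic')]
```

Every edge this reports is a dependency pointing the wrong way; an empty list would mean the layering holds for the edges we know about.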
MudPie: What MudPie does is identify these cycles. It also allows us to write SUnit tests that say "X and Y should not be in the same cycle", so that once something like that is fixed, it stays fixed. MudPie at the moment, IIRC, works, but has zero UI. There are many kinds of loving it could use, in fact, including a rewrite, but it can do what I said as is (modulo changes in PackageInfo since it was written).
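For anyone who wants to see the idea rather than the tool, here is a minimal sketch of cycle detection plus a "should not be in the same cycle" assertion. It is Python, not MudPie's code, and it assumes the dependencies are already available as a plain mapping; the graph contents, the "Network" entry, and all function names are hypothetical.

```python
# Hypothetical dependency graph: each key depends on every name in its set.
# "Network" is an invented example of a library that is not tangled.
GRAPH = {
    "Kernel": {"Morphic"},
    "Morphic": {"Kernel", "Celeste"},
    "Celeste": {"Morphic", "Kernel"},
    "Network": {"Kernel"},
}

def reachable_from(graph, start):
    """Everything reachable from start by following one or more dependencies."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen

def in_same_cycle(graph, x, y):
    """True when x depends (transitively) on y and y on x."""
    return y in reachable_from(graph, x) and x in reachable_from(graph, y)

def cycles(graph):
    """Group nodes that mutually depend on each other; singletons are skipped."""
    nodes = set(graph) | {d for deps in graph.values() for d in deps}
    reach = {n: reachable_from(graph, n) for n in nodes}
    groups, assigned = [], set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        group = {m for m in reach[n] if n in reach[m]}
        if len(group) > 1:
            groups.append(group)
            assigned |= group
    return groups

def assert_not_in_same_cycle(graph, x, y):
    """The moral equivalent of an SUnit test that keeps a fixed cycle fixed."""
    assert not in_same_cycle(graph, x, y), f"{x} and {y} are in the same cycle"

print(cycles(GRAPH))
assert_not_in_same_cycle(GRAPH, "Network", "Morphic")  # passes on this graph
```

On this made-up graph, Kernel, Morphic and Celeste come out as one cycle, while Network stays outside it. The reachability approach is quadratic and chosen only for brevity; a strongly-connected-components algorithm would be the usual choice for a whole image.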
At a higher level, MudPie allows us, at any moment, to get an answer to: 1. "What can I break off and package now?", and 2. "What is preventing me from breaking off and packaging <XYZ>?".
For something that is clearly an application, like Celeste, MudPie isn't really needed. We know it's okay for an application to depend on lots of stuff, and nothing should depend on it, so we can just ask "what depends on Celeste", and PackageInfo does that nowadays. The same goes for the Kernel.
I think MudPie should become useful when you want to: 1. Assert/see something about the global structure of the code. 2. Deal with the middle level, for example untangling libraries that are mixed into the whole system, like Morphic.
Elements in a solution:
1. Making the situation visible. Right now it is hard to measure progress. People "know that Squeak has too much in it", but that is very vague. We can easily get the answer to the question "What code outside the Kernel does the Kernel depend on, transitively?". This one question gives us lots of visibility into what is wrong, cheaply. Count the classes in that code, and we have a metric, and a goal - 0. Later on, if we want more details, MudPie can give lots of them (how much of a suite of "modularity tests" is green, for example). Anyway, whatever metrics we choose to code up should be widely available, and should not require every interested person to learn to write some magic. We need a UI, or better, a web UI to this information, maybe with history.
2. Some big refactorings. I've seen calls lately for making Squeak's UI frameworks pluggable components, so that, for example, inform: does the right thing whether Morphic, Tweak, or SeaSide is the current UI. This kind of refactoring would break the upward dependencies that create most of the cycles. BTW, note that using MudPie doesn't require any further declarations or new constructs, so the actual work of untangling code is nothing more than refactoring (often just renaming method categories and moving classes between class categories). As the packaging team, we should identify these refactorings, cooperate with the people doing them to make sure the chosen solution actually breaks the cycles, and help get them into the image (reviews and so forth).
3. Maximize benefits: tools. Regardless of whether the code is packaged separately, having the code dependencies be a DAG gives us various ways in which the tools can be smarter. For example, I wrote a little something for the star browser that does a topological sort on the Class Categories according to their dependencies, so that applications end up above libraries, which end up above the Kernel, giving the user some meaningful clues about the meaning of the code. Telling the user which SUnit tests to fix first is also an obvious application.
4. Maximize benefits: community. It seems to me that untangling the code will give us more opportunities to swap code among the various Squeak subcommunities. I'm just saying that we should be alert to this, help it, and let it drive us.
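Two of the ideas above, the "what does the Kernel depend on, transitively" metric and the topological ordering for the browser, can be sketched roughly as follows. This is Python, with invented graphs and function names, and it counts whole categories where the real metric would count classes. It is not the star browser code or MudPie; `graphlib` is in the Python standard library from 3.9 on.

```python
from graphlib import TopologicalSorter, CycleError  # standard library, Python 3.9+

# Hypothetical graphs: each key depends on every name in its set.
TANGLED = {
    "Kernel": {"Morphic"},
    "Morphic": {"Kernel", "Celeste"},
    "Celeste": {"Morphic", "Kernel"},
}
UNTANGLED = {
    "Kernel": set(),
    "Morphic": {"Kernel"},
    "Celeste": {"Morphic", "Kernel"},
}

def outside_dependencies(graph, roots):
    """Everything the roots depend on, transitively, that is not itself a root."""
    seen = set()
    stack = [dep for root in roots for dep in graph.get(root, ())]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen - set(roots)

def browse_order(graph):
    """Applications first, Kernel last. Raises CycleError if the graph is not a DAG."""
    dependencies_first = list(TopologicalSorter(graph).static_order())
    return list(reversed(dependencies_first))

print(len(outside_dependencies(TANGLED, {"Kernel"})))    # 2
print(len(outside_dependencies(UNTANGLED, {"Kernel"})))  # 0, the goal
print(browse_order(UNTANGLED))  # ['Celeste', 'Morphic', 'Kernel']

try:
    browse_order(TANGLED)
except CycleError:
    print("no meaningful ordering while the cycles remain")
```

The second half shows why the DAG matters for tools: the ordering only exists once the cycles are gone, so the sort itself doubles as a cheap check.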
Looking forward to hearing your thoughts on this,
packages@lists.squeakfoundation.org