One of the big problems that needs to be addressed is the social one. About two years ago, the Squeak community began an ill-fated attempt to retrofit modularity into Squeak. The resulting design failed to generate the necessary buy-in by the average Squeak hacker. The current SqueakMap work is an alternate attempt at modularity that takes less ambitious steps forward, and has been quite successful so far.
I wholeheartedly agree. One of the major problems that you have to face if you essentially want to "convert a community" is dealing with their expectations. I believe that the ultimate goal we are talking about here can be achieved while essentially staying within "current" Squeak for a very long time.
Here's what I'd do: Make your "own class Object" which is considered the base for any future work. Then slightly modify the compiler (or better: copy the entire enchilada so you have room for modifications). This compiler may, for example, not even know how to compile primitives or how to generate "optimized byte codes" (I know that various people wanted that). Have your "class Object" use it. Make it so that "your class Object" has its own environment (or whatever you choose to do) for symbols. This will COMPLETELY ISOLATE anything that is done in "unsafe Squeak" from whatever you do in "safe Squeak". With its own set of symbols there is NO WAY that "safe Squeak" can ever send a message that is understood by "unsafe Squeak", and there is NO WAY that "safe Squeak" can ever even "name a global" from unsafe Squeak.
I will give you an example of the latter since it is fundamentally important for your work and will show you how efficiently this will isolate the environment you are working in. Let's assume that your environment (and the SafeCompiler) obtains all its symbols from a set named "Foo". Looking up a symbol in Foo will return a different symbol than the one in the default symbol set (represented by Symbol's lookup table), so I will write a symbol named "bar" looked up in "Foo" as Foo::bar. Now let's write a method in our class EObject (I like the name ;-):
EObject>>tryAnythingBad: aSystemDictionary
    Smalltalk at: #Array. "try obtaining class array"
If your compiler looks up all of its symbols in Foo, then the above will compile into: Foo::Smalltalk Foo::at: #Foo::Array
Now on to the implications. First of all, there IS NO "Foo::Smalltalk" in the global Smalltalk dictionary - you would end up with the compiler telling you that this global does not exist. Neither does "Foo::Array", so that even passing Smalltalk as an argument to that method would not reveal the existence of Smalltalk in that Dictionary. And even *if* you were handed both a reference to Smalltalk and a reference to the #Array symbol from it, Smalltalk would still not understand the message Foo::at:. Consider another bit of code:
thisContext sender.
Even *if* the compiler allowed you access to "thisContext" (which it may not have to, but you might implicitly obtain it by creating a block), the message "sender" would (again) translate into Foo::sender, which is not understood by "unsafe Squeak".
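To make the isolation argument concrete, here is a toy model in Python (not Squeak code). It assumes that selectors and global names are interned per symbol space and that method dictionaries and the global table are keyed by symbol identity rather than by string. All names in it (SymbolSpace, Receiver, try_send, global_table) are made up for illustration and do not correspond to anything in the Squeak image.

```python
class MessageNotUnderstood(Exception):
    """Raised when a receiver has no method for the given selector symbol."""


class Symbol:
    """A name that belongs to exactly one symbol space (compared by identity)."""

    def __init__(self, space_name, name):
        self.space_name = space_name
        self.name = name

    def __repr__(self):
        return f"{self.space_name}::{self.name}"


class SymbolSpace:
    """Interns names: one Symbol object per name within this space."""

    def __init__(self, name):
        self.name = name
        self._table = {}

    def intern(self, name):
        if name not in self._table:
            self._table[name] = Symbol(self.name, name)
        return self._table[name]


class Receiver:
    """An object whose method dictionary is keyed by Symbol objects."""

    def __init__(self):
        self.methods = {}

    def send(self, selector, *args):
        method = self.methods.get(selector)
        if method is None:
            raise MessageNotUnderstood(repr(selector))
        return method(self, *args)


squeak = SymbolSpace("Squeak")  # stands in for the default symbol table
foo = SymbolSpace("Foo")        # the hypothetical "safe" symbol space

# The "unsafe" world: globals and methods exist only under Squeak symbols.
smalltalk = Receiver()
global_table = {
    squeak.intern("Smalltalk"): smalltalk,
    squeak.intern("Array"): list,
}
smalltalk.methods[squeak.intern("at:")] = lambda self, key: global_table[key]


def try_send(receiver, selector, *args):
    """Send a message and report a failed lookup instead of raising."""
    try:
        return receiver.send(selector, *args)
    except MessageNotUnderstood as error:
        return f"doesNotUnderstand: {error}"


# Code compiled in the default space reaches class Array as usual.
print(try_send(smalltalk, squeak.intern("at:"), squeak.intern("Array")))

# Code compiled in Foo cannot even name the global ...
print(foo.intern("Smalltalk") in global_table)

# ... and even when handed Smalltalk and the real #Array symbol, it can
# only send Foo::at:, which nothing in the unsafe world implements.
print(try_send(smalltalk, foo.intern("at:"), squeak.intern("Array")))
```

The point of the model is that nothing checks permissions anywhere: the lookups simply miss, because a Foo symbol is never the same object as a Squeak symbol.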
Get it? As long as you stay in your own "symbol set" there is absolutely NO RELATION whatsoever between "unsafe Squeak" and "safe Squeak". Then, you can start to expose certain abilities from "unsafe Squeak" to "safe Squeak" but can do so securely and incrementally. For example, one of the first messages you will want to have in "safe Squeak" is #value (for blocks etc). Since some classes (including BlockContext) are known to the VM you cannot easily make up your own class, but you CAN implement Foo::value in BlockContext along the lines of:
BlockContext>>Foo::value
    "Invoked from Symbol space Foo"
    ^self value
Here's something interesting for you. The above essentially grants every object from symbol space Foo the ability to send the message value and obtain the same effect that it would have in the "default symbol space" (Squeak). In other words, symbol spaces are highly effective (meta) capabilities themselves! If the above is compiled in the default symbol space (having the capability to evaluate a block) it can grant this capability to another symbol space (in this case Foo). And Foo - now having this capability - could grant it to another one (or not).
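The granting step can be sketched the same way. The following is again a toy Python model under my own assumptions, not Squeak code: selectors are modelled as (space, name) pairs, Block stands in for BlockContext, and grant is a hypothetical helper that installs a forwarding method, which is roughly what the Foo::value method above does by hand. A third space "Bar" is invented to show re-granting.

```python
class MessageNotUnderstood(Exception):
    """Raised when no method is installed for a (space, selector) pair."""


class Block:
    """Stand-in for BlockContext: wraps a Python callable."""

    # Class-level method dictionary keyed by (symbol space, selector name).
    methods = {}

    def __init__(self, fn):
        self.fn = fn

    def send(self, space, selector, *args):
        try:
            method = Block.methods[(space, selector)]
        except KeyError:
            raise MessageNotUnderstood(f"{space}::{selector}") from None
        return method(self, *args)


# The default space starts out with the capability to evaluate a block.
Block.methods[("Squeak", "value")] = lambda self: self.fn()


def grant(methods, selector, from_space, to_space):
    """Expose from_space's selector to to_space via a forwarding method.

    A space can only pass on a capability it already holds.
    """
    if (from_space, selector) not in methods:
        raise PermissionError(f"{from_space} does not hold {selector}")

    def forward(receiver, *args):
        return methods[(from_space, selector)](receiver, *args)

    methods[(to_space, selector)] = forward


block = Block(lambda: 3 + 4)

grant(Block.methods, "value", "Squeak", "Foo")  # Squeak grants to Foo
grant(Block.methods, "value", "Foo", "Bar")     # Foo passes it on to Bar

print(block.send("Foo", "value"))
print(block.send("Bar", "value"))
```

A space that was never granted #value gets a failed lookup when it sends it, and it has nothing to hand on to anyone else either.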
So in short, I believe that there is no need whatsoever to "burn the diskpacks" at this point. The above will give you an entirely isolated space in which you can actually build lots of stuff with room for experimentation, with room for introducing the "right notions" for everything you want (including brands or whatever) and see how this works out. It provides ways for migrating existing code at very little cost. Once it is worked out you are likely to have a much better idea about what parts in the VM really need to be changed in order to be more efficient - but to me, this is quite far ahead.
Cheers, - Andreas