Proposal: Squeak-E = Squeak x Kernel-E

Sun Jan 26 18:33:28 UTC 2003

As Andreas said, I didn't make much of a point.  Let me try again.

My attempt was twofold: to focus the issue on what is important, and to
remind everyone of how far my "Islands" got.  I will try to clear those
up, and also reply to the subject line.

To be blunt, I don't see the big advantage of segments of memory in the
VM, even though you might want conceptual segments of memory within the
image.  It is unclear what kind of VM support would be useful.  The
extremes are useless: one extreme is the same as multiple Squeak
processes, and the other extreme is the same as not having segments at
all.  We need to know what semantics we want from the image segments
before we implement them, and no one seems to know.  My Islands project
doesn't use VM memory segments.

Guys, Islands is pretty far along.  It has true capabilities, and it
allows non-graphical computations to run on their own island without
harming the rest of the system.  The VM changes for Islands are pretty
small; the vast bulk of the work is in image-level things like the
following:

	- supporting symbols with ==, even between islands
	- immutable literals (so you can't change the value of pi)
	- immutable blocks (so you can't write your own bytecodes)
	- safer exceptions (though I'm still worried in this area)
	- safely limited access to class methods (so you can't modify class
	  Object arbitrarily)
	- dynamically-scoped variables, so that the "global" variables aren't
	  shared
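
To make the last item concrete, here is a minimal sketch of the idea in
Python rather than Squeak.  It is not the Islands code; the names
`transcript`, `log` and `run_in_island` are made up for illustration,
and Python's contextvars module stands in for whatever lookup mechanism
an image would really use.  The point is only that code written against
a "global" picks up whichever binding its island was started with.

```python
import contextvars

# Hypothetical stand-in for a Squeak "global" such as Transcript.
transcript = contextvars.ContextVar("transcript")

def log(message):
    # Ordinary code: it just uses the "global" and never mentions islands.
    transcript.get().append(message)

def run_in_island(island_transcript, computation):
    # Run the computation in a copied context with its own binding.
    # The caller's binding of `transcript` is left untouched.
    ctx = contextvars.copy_context()

    def body():
        transcript.set(island_transcript)
        return computation()

    return ctx.run(body)

world_log = []
island_log = []
transcript.set(world_log)

run_in_island(island_log, lambda: log("hello from the island"))
log("hello from the world")
```

After this runs, each list holds only the message logged from its own
side, even though both callers used the same `log` function.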

I even had a prototype of Morphic in a box, though it was pretty slow.

Islands has been stalled for a few years, but not because of any
technical difficulties.  I am busy doing PhD work at a college where no
one wanted to sponsor me with the Islands work, plus being a teaching
assistant to pay the way.  I'm happy for anyone else to plug away on
Islands, but people seem to be running into trouble getting it loaded at
all (if anyone has gotten it loaded, PLEASE post your image and VM :) 
Mine got left on a Disney network drive).  Loading it takes several
hours, and many people don't seem to persevere.  Writing this email is
already
taking more hours than I really have.

I'd guess that, working full time, I could get secure and efficient
morphic-in-a-box within a month.  It's a fairly wild guess.  The main
components would be careful consideration of the box-to-world protocol,
and work on the efficiency of the dynamically-bound variables.

Now let's consider E, and what we can take advantage of.

The E VM itself would give us efficiency from the get-go.  However, all
the image-level work remains.  Additionally, the image-level work from
Islands would have to be duplicated.  Switching to E *might* help with
development support for dealing with capabilities, depending on how
closely Squeak matches E and how well we can reuse their tools.  Then
again, we could also port E tools to Smalltalk....

Overall, I don't see a huge advantage of switching to the E VM from the
point of view of security.  We *might* want to do it just to combine
development efforts.  But unless the merge could happen within a month
or so, there doesn't seem to be much advantage from the point of view
of security.

All the difficulty with secure Squeak is in the libraries, at this
point.  We either have to modify the Squeak libraries to be
capability-like, or we have to start from scratch with a new library, or
we have to switch over to the library of some other language like E. 
This is work we can't avoid; there doesn't seem to be any magic bullet
in the VM, including image segments, that will help significantly.  I
ask everyone, especially Mark, to reread the list of things above that I
said Islands does.  How much of that has to do with the VM?  Unless I'm
overlooking something, the main work to do is to tweak various bits of
library code so that they work inside a sandbox.

To me, it seems like we *can* modify our library successfully.  The vast
bulk of Smalltalk code does not matter for security.  This was a design
goal of Islands, and I think it succeeds.  When someone writes
"OrderedCollection new: 10", they don't care that OrderedCollection
gives you an object that doesn't respond to #superclass.  When an app
passes messages around to itself, the rest of the image is completely
unafraid of the security implications.  It's only primitives that you
have to watch for, and user code rarely defines primitives.  The vast
bulk of Morphic, not counting Balloon, should run without modification.
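
As a rough illustration of why ordinary code doesn't notice, here is a
sketch in Python (again, not Squeak and not how Islands actually does
it) of a facade that forwards only a whitelisted set of messages.  The
helper `make_restricted` and the choice of allowed names are
assumptions made up for this example.  Python is not a
capability-secure language, so this shows the shape of the idea rather
than a real sandbox.

```python
def make_restricted(target, allowed):
    """Answer a facade that forwards only the messages named in `allowed`."""
    allowed = frozenset(allowed)

    class Facade:
        __slots__ = ()

        def __getattr__(self, name):
            # Only called for names the facade itself does not define.
            if name in allowed:
                return getattr(target, name)
            raise AttributeError(name)

    return Facade()

# Application code sees a collection that understands the usual protocol...
items = make_restricted([], {"append", "pop", "count"})
items.append(3)
items.append(3)
threes = items.count(3)

# ...but a message outside the whitelist is simply not understood.
understands_sort = hasattr(items, "sort")
```

Code that only appends, pops and counts runs unchanged; only code that
reaches for something off the list sees a difference.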

Where E definitely helps is with image-level *design* ideas.  How
do you deal, in practice, with a language that uses
objects-as-capabilities?  While capabilities are convenient for
designing security policies, objects-as-capabilities aren't as
convenient to use as they at first seemed.  It's hard to write code
where most of the objects you see might be lying to you on fundamental
things like #name or #asInteger, and my solutions are surely very
mundane compared to what Mark Miller would come up with.  After all,
he's the one who clued me in on capabilities, and that was two years
before Islands was started, which in turn was a few years before today.
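
Here is a small Python sketch of the kind of lying I mean, with one
mundane defence.  It is an analogy, not code from Islands or E; the
classes and the helper `checked_integer` are hypothetical.  The
untrusted object answers something that claims to be an integer but
misbehaves, and the receiver refuses anything that is not exactly a
plain integer.

```python
class HonestCounter:
    def as_integer(self):
        return 42

class Shifty(int):
    # Looks like an int, but arithmetic on it gives the wrong answer.
    def __add__(self, other):
        return 0

class LyingCounter:
    def as_integer(self):
        return Shifty(42)

def checked_integer(untrusted):
    # Don't trust the answer just because it came back from as_integer;
    # accept only an exact int, not a subclass with overridden behaviour.
    value = untrusted.as_integer()
    if type(value) is not int:
        raise TypeError("as_integer did not answer a plain int")
    return value

honest_value = checked_integer(HonestCounter())

try:
    checked_integer(LyingCounter())
    lie_detected = False
except TypeError:
    lie_detected = True
```

The check is cheap here, but having to write it at every boundary is
exactly the inconvenience I'm talking about.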

Lex