Mark, your review of security possibilities in Squeak is deeply appreciated.
Your notion of 4 levels of effort is a great basis for discussion. In fact, I believe Islands works at level 4, though it is more awkward for such things than E. I'll attach my solution to the little-money problem at the end of the message. Still, as you guessed, the goal of the project was more for level 1 or maybe level 2 (e.g., retracting sound access from annoying squeaklets), with hopes of making a basis for level 4.
In this message, I'll first elaborate on some big ideas, and then try to respond to questions on the details of Islands. The code example is at the very end. Thus everyone can read as far down as they care to.
The main thing you'll care about is that, during the money exercise, I became completely convinced that dynamic class lookups are bad. They are a convenient means to the end of reusing Squeak code quickly, but they mean that to be safe, you have to ensure your own island is running whenever an untrusted user calls you. This is a real nuisance! Otherwise, even things like integer addition can be a security breach, because in Squeak this will fall back on language-level code. Imagine what could happen if LargeInteger were replaced by something nasty!
There is a big engineering effort to get away from dynamic binding of class names yet still have limited access to classes. Happily, though, the independent modules efforts for Squeak seem to be solving this exact issue along with their other goals.
The more general dynamic global variables are annoying, but the one case of World is hard to do away with in Squeak. World refers to the current morphic rendering unit. Currently it is an entire desktop, though in a Squeaklet it would be more like one page or one subwindow. Seen another way, World is like the root window in X Windows.
Code uses World all over the place: there are 98 direct accesses to the variable, and 119 senders of #currentWorld. Plus, there is a tricky *conceptual* issue with getting rid of the variable. It's just really convenient to have the world floating in the ether, and it would suck if Squeak code became significantly harder to write. But maybe it can be done; someone may want to try seeing how many of these methods can be rewritten without the variable!
Note that you don't have to use dynamic variables if you don't want them. It's entirely reasonable to think of a future partitioning of Squeak where none of Morphic runs in any context that needs security, and thus nothing unsafe is accessing World. All the other global variables seem easy to move around, by attaching the right objects to World. If you write code that doesn't use any global variables, and if we had static binding for classes, then you wouldn't have to fiddle with installing proper islands before running your code.
On another topic, you say that level-1 success, where you have completely isolated applets, is useless. That is not actually true in Squeak. The most important goal of Squeak is to allow a new way of composing electronic documents. Level-1 success means Squeak documents can be exchanged safely, which is a huge step forward for Squeak.
Something that is easy to overlook in Islands is that the restricted class methods are very important. The class methods that a limited proxy will execute are *very* limited. In fact, they are so limited that you can always accomplish the same thing the class method does by inlining the code. The main thing is that, in addition to the regular restrictions on safe methods, these methods may not access instance variables. Any class method that does not follow the restrictions will be invisible to anyone who cannot touch raw classes (which is pretty much everyone, since touching raw classes is an immediate security problem).
Okay, let me respond to specific questions.
It seems to me that ObjectInspector can be refactored in a way that would make it both more convenient, and would naturally subdivide authority, simply by currying it. I assume that the listed methods of ObjectInspector are class methods, as shown by the above quoted use. Let's say that all these were instead instance methods, and that ObjectInspector instead had one instance variable and one "new:" class method. Then, instead of writing
x := ObjectInspector instVarOf: obj at: i
you'd write:
x := (ObjectInspector new: obj) at: i
I believe this is implemented. However, the instance methods just call the class methods. :) I guess you'd actually want to use something other than this exact message sequence to *create* an ObjectInspector instance, though, in order to defeat using "myObjectInspector class" to create arbitrary new ObjectInspectors.
Note that the automatic restriction on class methods is pulling weight in this scenario.
Anyway, I never made a serious pass at secure debugging, and it seems like a low priority. It seems very complicated -- e.g., the debugger itself may have to worry about being corrupted by the code it is debugging!
In the section on Literals, you seem to use immutable and read-only as synonyms. I find this confusing. To me, "read-only" means I can observe the current state but not modify it. It doesn't mean the state won't change. And if the state to which I have read-only access does change, I expect to be able to observe the new state.
Do you indeed mean "immutable" everywhere you say "read-only"?
The distinction is unhelpful for Squeak literals. I suppose immutable is the better word. Currently, literals in Squeak change all too often; specifically, I have a literal change about once a year, and it sucks when it happens. :) Making literals immutable along with read-only is a terrific thing. The only time they are mutated in Islands is when they are instantiated by the compiler as a method is compiled.
The section "Dynamic Variable" seems to me to explain only per-process variables. Is this right?
There's a little more to it. Dynamic variables have a stack of bindings within each process. All accesses modify the most deeply nested binding. When you pop off the deepest binding, the old binding will return along with its old value.
As a final tweak, the bindings come a table at a time. When you install an island, it installs a new set of bindings (and hides anything not explicitly listed in the island). So you can install and uninstall islands to restore a complete set of predictable bindings. This idea is used in the example below.
(Implementation note: I never made an island-creating capability for Islands, and I never implemented a thisIsland facility for grabbing the island you are running in. It seems like these would be very easy, however.)
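To make the binding-table behaviour concrete, here is a rough workspace model of it. This is only a sketch under my own assumptions, not the Islands code: the variable names, the use of plain Dictionaries as islands, and the sample values are all invented for illustration.

```smalltalk
"Hypothetical model: islands as Dictionaries kept on a stack.
 None of these names come from the Islands implementation."
| tables lookup baseIsland pageIsland results |
tables := OrderedCollection new.
"Only the innermost (most recently installed) table is consulted."
lookup := [:varName | tables last at: varName ifAbsent: [nil]].

baseIsland := Dictionary new.
baseIsland at: #World put: 'desktop'; at: #Sound put: 'speaker'.
pageIsland := Dictionary new.
pageIsland at: #World put: 'one page'.

results := OrderedCollection new.
tables addLast: baseIsland.
results add: (lookup value: #World).   "'desktop'"
tables addLast: pageIsland.            "install the inner island"
results add: (lookup value: #World).   "'one page'"
results add: (lookup value: #Sound).   "nil: not listed, so hidden"
tables removeLast.                     "uninstall it again"
results add: (lookup value: #Sound).   "'speaker' is visible once more"
results
```

The point of consulting only the top table is that whoever installs an island knows exactly which bindings are visible while it is installed.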
This section refers to "methods that are marked as privileged". Even after reading the later section that explains this, I didn't feel like I understood how this marking happens, or how the authority to do this marking is controlled.
You mark it by sending messages to the real, low-level class object, the same class object that the VM uses. Thus it is as privileged as it can be.
I really hope that Islands reaches the 4th level of success that you talk about, in which case normal users never need to write privileged methods. Who needs to define primitives, or access variables from the system's most privileged island, or touch thisContext? Logically no one, and hopefully this works out in practice.
Can the same dynamic-variable-using instance be invoked from different processes (causing different bindings of the same use-occurrence of the same variable), or are dynamic-variable-using instances partitioned among the processes that can invoke them? Unfortunately, I suspect it's the first, but I'll wait to find out before arguing against it.
It's possible for it to be the first. For example, if a squeaklet forks a thread, then the thread will surely be given the same island and thus the same bindings.
It is perfectly reasonable, though, for secure code to keep track of a predictable island for its own use. When you install an island you have created, you get no accidental variable bindings.
Whenever code is loaded from untrusted sources, it should be loaded into unprivileged methods.
I'm not sure how to read this. If the code includes methods that were written assuming they would run privileged, what happens?
A compile error.
Processes with direct access to
I don't understand what this means, but it alarms me. In a capability system, we should be speaking only of objects having access to other objects. I know that Processes are objects, which is good, but I don't think this accounts for what you mean.
I'm not sure what sentence this was. Process is probably the wrong word, and island should be used instead.
Are restricted classes (as implemented by the restricting proxy) read-only? (I mean, genuinely read-only, not immutable.) I think they should be.
Yes, that's part of the point. Incidentally, all classes are restricted unless you have a "SystemIsland" installed. It just doesn't seem like the more powerful features of classes are very useful in practice. For example, how often do you really need a class instance variable instead of a regular class variable? Privileged class methods also become a convenient place to stick methods that only privileged code should be using; there is no way for unprivileged code to directly invoke such methods.
In particular, many class methods return "self"
This reminded me that Smalltalk methods by default return self. E had a different but similarly dangerous policy. We found to our terror that this was a pervasive source of accidental security holes in our code, very much along the lines of the specific security hole you found with classes. Rather than making a custom repair for each individual case, I fear that you'll find you'll want returns-to-SafeSqueak to return null, rather than self, by default. This is the first issue I've encountered that makes Java easier to tame than Squeak. In Java, methods are void return by "default". ("default" is a funny word. It's still explicit, but is the path of least resistance.)
I believe I solved this quite sufficiently, however. There aren't any more specific cases. When you run a method on a restricted class proxy, then anything that method can do, you can already do yourself.
Smalltalk doesn't have many statement types. All of them can be reviewed. I haven't done it *real* carefully, but it certainly seems like anything a restricted class method can do, the caller of that method could do already, with the one exception of instantiating classes. The other statements are things like creating temporary variables, sending messages to objects you already have, creating literals, creating blocks, and so on.
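For anyone who has not run into the default-return behaviour mentioned above, it is easy to see in a workspace. This is only an illustration in stock Squeak, not anything from Islands; the class name ReturnDemo, its two selectors, and the category name are made up for the example.

```smalltalk
"Illustration only: ReturnDemo and its selectors are hypothetical."
| demoClass d |
demoClass := Object subclass: #ReturnDemo
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Islands-Examples'.
demoClass compile: 'noExplicitReturn
    3 + 4'.
demoClass compile: 'explicitReturn
    ^3 + 4'.
d := demoClass new.
(d perform: #noExplicitReturn) == d.   "true: the receiver itself comes back"
d perform: #explicitReturn.            "7"
```

When the receiver is a raw class rather than an instance, the first case is exactly the accidental leak being discussed.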
My intuition for this design approach is that class methods fall into three categories:
1. Utility methods, e.g. returning a constant. These need no privilege.
2. Instantiation methods. Except for #new itself, these are the same as #1.
3. Primitives. These should not be directly accessible in restricted contexts.
The Islands solution allows for #1 automatically, and for the two necessary methods in #2 (basicNew and basicNew:) to be punched through explicitly. Eliminating #3 automatically is a nice side benefit, which means that a lot of legacy code doesn't have to be rewritten immediately.
Incidentally, this notion of super-safe methods is useful for instance methods, as well. If you are auditing proxies, then you can immediately ignore super-safe methods.
Also incidentally, my favorite solution to this problem would be to get rid of class methods. Even if we have to write explicit factories like the Java guys, it would simplify Smalltalk so much that it may well be worth it. This analysis and solution to class methods are probably the hardest part of Islands, and yet class methods are supposed to be simplifying things! (Alternatively, the class-ish methods and the utility-ish methods of classes could be separate aspects of some kind, eg using Nathanael's and Stephen's "Traits", but this is starting to make my head spin. I'll let them try, if they want to!)
An incomplete list of such methods is the following:
I would be very interested to see the complete list of methods on Class, ClassDescription, Behavior, and Object that you consider safe. It's much more important to review the list of what's allowed than the list of what's disallowed (though you made the right choice of which to list for the paper).
It's an algorithmic definition. The compiler automatically marks which methods stay within the super-safe subset of Smalltalk described above, and proxies check whether this mark is present.
This is why there is no list of rejected methods. I agree that having a reject list is easy to screw up with.
I don't understand why you allow instVarAt: and instVarAt:put: on non-proxies? Does this include non-proxies written in SafeSqueak?
Umm, you don't have to, except for legacy code. However, it loses nothing for non-proxies. If instVarAt: is unsafe, then the object should probably be a proxy.
The idea is that if you instantiate a normal object, then you already can get access to all the things the object will have in its instance variables. And if that is true, then you can extend the argument used above with super-safe class methods: you cannot do anything with instVarAt:, or by calling any method compiled in restricted mode, that you could not do with your own code.
Thus, it's almost a definition: Proxies are objects which protect their instance variables from outsiders. The name "proxy" thus works very well (though I'm not sure I had all this in mind at the time the word was chosen!)
I completely did not understand the section "Characters and Symbols", probably because of my ignorance of modern Smalltalks. Could you expand?
The issue is that both characters and symbols should be == whenever they are =. For example:
(Character value: 10) == (Character value: 10)
To accomplish this, there must be a system-wide table of characters and symbols. Currently these are in a class variable (incidentally, Islands was still secure before this work was done -- it was just that creating characters and symbols from their constituents did not work, outside of literals.)
Thus, the two creation methods (#value: and #intern:) for these classes are made into instance methods of capability objects.
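For comparison, here is how the identity guarantee looks in a stock Squeak workspace, next to ordinary strings which do not have it. This uses only standard Squeak messages; the sample string 'island' is arbitrary.

```smalltalk
| a b |
(Character value: 10) == (Character value: 10).   "true"
(Symbol intern: 'island') == #island.             "true: symbols are interned"
a := 'island' copy.
b := 'island' copy.
a = b.     "true: same contents"
a == b.    "false: two distinct String objects"
```

It is this shared table behind #value: and #intern: that has to be reachable from every island, which is why the creation methods end up on capability objects.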
This proxy might or might not immediately install the cursor as requested, depending on the precise security policy that is being implemented.
Implemented by whom, how? This is the first I've seen of "security policy" used this way.
Similarly, installing a cursor is accomplished by sending a message to a known global variable. (It could just as well be an instance method of the world....) Depending on what is in that variable, the cursor will become the hardware cursor, or something else will happen.
In order for an island to change the hardware cursor, it will have to be given the capability that influences the hardware cursor from outside.
I would hope to convince you that shared-memory multi-threading, locks, semaphores and such should not be part of SafeSqueak. But, scoping and partitioning issues aside, this is a mostly separate discussion we can leave till another time.
Interesting thought. They often get you in trouble, anyway, just from accidental problems. :)
You don't actually explain what the issue is with Exceptions.
Exceptions, like many things in Squeak, are implemented in Smalltalk-level code. They start with thisContext and then trace around the call stack. Instances of exceptions actually have contexts stored in instance variables.
Access to stack frames has not been secured at all, and it seems difficult to do so. Further, there is little use for it in a restricted context. It's a neat exercise to try and come up with safe access to stack frames, but instead I went after the easier problem of nailing down exceptions.
Only, well, it's still hard. A complication is that users can write subclasses of Exception and have their code run! My idea for Islands was that, to get things going, you could only *define* new subclasses of Exception, and that you could not add any methods to them. This solves most uses of exceptions, but it's a kludge and it should be rethought at some point. Most likely, Exceptions are screwed up in Islands in multiple places.
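As a rough picture of what the "define only" restriction still permits, here is the pattern in plain Squeak. The class name InsufficientFunds and the category are hypothetical, and I am assuming rather than claiming that restricted code in Islands would be allowed to evaluate exactly this.

```smalltalk
"Sketch in stock Squeak; InsufficientFunds is a made-up example class."
| excClass result |
"Define a new exception class, but add no methods to it."
excClass := Error subclass: #InsufficientFunds
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Islands-Examples'.

"Signal and handle it using only inherited behaviour."
result := [excClass new signal: 'balance too low']
    on: excClass
    do: [:ex | ex messageText].
result
```

Since the subclass has no methods of its own, no user-written code runs inside the exception machinery; the class serves only as a tag for the handler to match on.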
In particular, the following primitives should be disabled:
- instVarAt: and instVarAt:put:, because they allow directly breaking
confinement
- at:, basicAt:, at:put:, and basicAt:put:, if the proxy has indexed fields,
because they would allow directly breaking confinement
I think I understand the others, but what's the problem with #at: and #at:put: ?
It's the same issue as with instVarAt: and instVarAt:put:, only for indexed (numbered) variables instead of named variables. Arrays are probably unlikely to be used as proxies, but then again it's extremely easy to block, anyway.
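In case the parallel is not obvious, here is what these primitives do to ordinary objects in stock Squeak. Point and Array are just convenient stand-ins; on a proxy the same sends would reach the protected target.

```smalltalk
| p a |
p := 3 @ 4.
p instVarAt: 1.           "3: reads the first named variable (x) without sending #x"
p instVarAt: 2 put: 99.   "overwrites y, bypassing any accessor"
p y.                      "99"
a := Array with: 10 with: 20.
a basicAt: 2.             "20: the same trick for indexed variables"
a basicAt: 2 put: 0.
a at: 2.                  "0"
```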
Additionally, #shallowCopy and #clone make revocation much more difficult; thus, they should most likely be overridden to return the receiver instead of returning a true copy.
If you can't support their contract (and indeed you can't), then shouldn't you just suppress them?
It's already allowed in Squeak. For example, "3 clone" returns 3. Arguably, #shallowCopy should be general and #clone should insist on a real copy, but that's not the way the contracts are right now.
Note that all non-primitive methods from class Object may be safely left accessible. Since such code must consist of message sends between parameters, self, and globals, user code could emulate the code even if it were disabled, and so disabling such methods gives no gain in security.
Are all the globals accessible from methods on Object necessarily accessible by both callers and subclasses?
Not right now, due to the dynamic binding of globals.
There probably aren't many globals floating around up there except for classes, though, so reviewing all the code may not be as tedious as it sounds.
A primary advantage of the submemories approach is that there is no need to add a special kind of cross-memory oop
Given the stated purpose of submemories, you need to be able to reclaim a submemory without being able to reclaim the parent memory. This means that oops from the parent into the sub need to spontaneously seem to become some innocuous object (like null) when this happens. I think you will find support for this hard at fine-grain. E supports this only between vats, where there's an enforced indirection through intermediate objects anyway, and where the possibility of partition is part of the semantics anyway.
I dunno, it seems like you could do an allObjectsDo: inside the VM and look for instance variables pointing into the submemory. Remember that in Squeak, we do have easy access to the VM.
In any case, I recommend postponing further worry about resource controls and denial of service until these other issues settle down.
Agreed.
Okay, here's my solution to the money problem.
Mint and Purse are normal classes with a rich set of methods. The only strange thing is that Mint instances know of some "money island", which should be easily created via something like "Island new". Money island has all the normal classes installed.
LimitedMint and LimitedPurse are subclasses of LimitedProxy, and these are what restricted code will see. Probably this is a common pattern in Smalltalk capability implementations, since there are no private methods.
Here are the interesting methods.
=====================================================
LimitedPurse>>deposit: rawAmount from: rawOtherPurse
    | amount otherPurse |
    amount := self safeIntegerFor: rawAmount.
    (self hasSameClassAs: rawOtherPurse) ifFalse: [
        self error: 'invalid purse supplied' ].
    otherPurse := rawOtherPurse realPurseOnBehalfOf: self.

    purse mint moneyIsland installDuring: [
        purse mint == otherPurse mint ifFalse: [
            self error: 'incompatible mints' ].

        amount < 0 ifTrue: [
            self error: 'only non-negative amounts may be deposited' ].

        otherPurse decreaseBalance: amount.
        purse increaseBalance: amount ].

LimitedPurse>>realPurseOnBehalfOf: aLimitedPurse
    (self hasSameClassAs: aLimitedPurse) ifTrue: [ ^purse ]

LimitedMint>>newPurse: rawStartingBalance
    | startingBalance realPurse newPurse |
    startingBalance := self safeIntegerFor: rawStartingBalance.

    mint moneyIsland installDuring: [
        startingBalance < 0 ifTrue: [
            self error: 'initial balance must be non-negative' ].
        realPurse := mint newPurseWithBalance: startingBalance.
        newPurse := LimitedPurse new initialize: realPurse ].
    ^newPurse.

LimitedPurse>>initialize: realPurseToUse
    purse == nil ifFalse: [ self error: 'dual initialization' ].
    purse := realPurseToUse.
=====================================================
Four things seem interesting:
1. There is a ton of code doing input validation. This seems unavoidable with objects-as-capabilities, because callers can pass you any object they please. Some automation would probably be helpful. For example, it looks like E's soft typing is making the E solution shorter (and more reliable). As is, there are a bunch of things in RestrictedProxy such as safeIntegerFor: which will halt execution on bad input.
2. There is no direct support for sealers and unsealers, so the code uses a method realPurseOnBehalfOf:, which in turn relies on hasSameClassAs: to detect valid objects. I believe Sealers and Unsealers, however, could be implemented even within SafeSqueak. (I never saw the point of such a facility until doing this exercise, incidentally -- it's a good one!)
3. Every time the code calls across to a real Mint or Purse object, it has to set the island to Money Island. So the limited methods mostly have the pattern: check arguments, install a sane island, and do the real work inside the sane island. With static binding, this wouldn't be necessary.
4. You have to explicitly check for dual initialization, since construction methods are just normal methods in Smalltalk.
Lex Spoon