Cees de Groot wrote:
The only thing that's not neat about E is that it sits on top of Java and has a shitty development environment, so it's time to rob their ideas and transport them to Squeak ;-).
I just wanted to endorse this thought, and to make it clear that the E community, myself included, would be very eager to see progress in this direction, and can probably help. We're primarily interested in seeing the ideas succeed, rather than E per se. On the erights site, we regularly endorse "competitors" that seem to be getting things right. So please "rob" away!
After seeing Lex Spoon's ( http://minnow.cc.gatech.edu/squeak/2074 & http://www.cc.gatech.edu/~lex/7431/term.ps ) and Rob Withers' ( http://minnow.cc.gatech.edu/squeak/2410 ) efforts in this regard, I extended the E history page ( http://www.erights.org/history/overview.html ) with my hope for these efforts. If it doesn't offend anyone, I've given this pair of efforts the unauthorized name "Squeak-E".
On the e-lang list ( http://www.eros-os.org/mailman/listinfo/e-lang ) Rob Withers has been discussing with us where to go from here. Those of you interested in joining the discussion are most welcome.
---------------------------------------- Text by me above is hereby placed in the public domain
Cheers, --MarkM
Mark S. Miller markm@caplet.com said:
On the erights site, we regularly endorse "competitors" that seem to be getting things right. So please "rob" away!
Thanks :-). I'm subscribing to the mailing list, but just to fill me in: is there any work on making sure that vats from multiple languages can communicate? That'd be really neat...
At 08:46 AM 4/30/2002 Tuesday, Cees de Groot wrote:
Thanks :-). I'm subscribing to the mailing list,
Glad to have you aboard! I'm cross-posting this message to both lists, but let's continue the discussion on e-lang.
but just to fill me in: is there any work on making sure that vats from multiple languages can communicate? That'd be really neat...
Well maybe it doesn't yet qualify as "work", but here's a snippet of conversation from e-lang:
I wrote:
Might it be possible for a Squeak "vat" to speak CapTP and interoperate (at vat granularity) with E vats?
Rob Withers wrote:
I don't see why not but Java serialization would be an issue. I was going to go with a generic serialization framework, using smalltalk serialization (one of many versions), then look at building java serialization. As you can imagine, this is a big task.
I wrote:
Let's start by trying to agree on a relatively platform neutral textual serialization based on term trees. We can then later discuss a binary equivalent, and evaluate Java serialization as a candidate format. We are committed to having adequate connect-time negotiation to accommodate these even if we can no longer drop our current serialization.
By platform neutral, I include neutrality from Java vs other implementation platforms for E, but not neutral regarding E vs Squeak. Rather, let's proceed as if your efforts were to result in an E implemented in Squeak (Squeak-E). We want disparate implementations of E to be able to speak to each other anyway (eg, ENative/CapScript (E on C++), CapSharp (E on .NET)), so this may be a good way to get our feet wet.
The E protocol, Pluribus, comes in two levels: VatTP, described well at http://www.erights.org/elib/distrib/vattp/index.html and CapTP, barely described at http://www.erights.org/elib/distrib/captp/index.html . All the serialization issues below occur at the inadequately described CapTP level.
The current status is that I owe Rob a message explaining our Term trees. I'd love to hear any other thoughts on "platform neutral" serialization.
1) By "platform neutral" I don't mean that it can't favor or be based on one platform (such as Squeak), but rather that it not impose a level of pain on other platforms significantly beyond what a truly neutral format would impose. For example, Java's RMI, despite all its faults, is a better "platform neutral" system than Corba. While RMI is built to support Java specifically, it is less painful for other platforms than is Corba. Corba imposes similar pain on everyone, and that level of pain is higher for everyone. I'm more interested in standards which reduce the general level of pain than in ones which allocate pain "fairly". A platform independent system, like Corba, is just covertly yet another platform, and usually one less well designed than the platforms it's trying to be independent of. So don't be shy about proposing Squeak-based or Parcels-based serialization formats.
2) From previous discussion on e-lang, I feel we must have both a textual and binary serialization format, designed together, or one derived from the other, such that conversion in either direction is meaning preserving. A connection will start out textual, and will switch to binary only after a text-based negotiation about how to proceed. That way we enable text-only processors (humans on a telnet) to play.
3) Both formats must provide good support for upgrade (or class evolution). Java serialization, for all its awful complexity, does this, and I'd like one that does at least this well. My understanding is that Parcels does an excellent job at upgrade, but I'm much less familiar with it.
4) The textual format must be human readable and human editable. This combined with the "meaning preserving in both directions" requirement means that one can edit a binary serialization graph by translating to text, editing, and translating back. Likewise, it enables a logger to log binary network traffic as binary, and then later present it as textual if there's interest. An edited textual log can then be used as a basis of test cases.
5) The binary format must be able to perform reasonably. My understanding is that Parcels is unmatched here on reading speed, but I have no idea what price it pays on writing. For CapTP's use, each graph will be written once in order to be read once. For persistence use, many more graphs will be written (one per checkpoint) than will be read (one per revival). Both of these are different than the tradeoffs Parcels was optimized for.
6) Both formats must be simple, and easy to document. Java serialization fails this test.
7) The format must be compatible with our security requirements. In particular, since E is only prepared to trust mobile E code, any need for executable code in the serialization format must be E code. Parcels as is fails this test of course.
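As a rough illustration of requirements 2 and 4 in the list above, here is a minimal Python sketch of one tree model with a textual and a binary encoding that convert into each other without loss. It is not E's term-tree format or CapTP's wire format: the `Term` class, the `tag(arg, ...)` text syntax, and the `T`/`S`/`I` byte layout are all assumptions invented for this example.

```python
import json
import struct
from typing import NamedTuple


class Term(NamedTuple):
    """Hypothetical tree node: a tag plus child nodes (Term, int, or str)."""
    tag: str
    args: tuple = ()


def to_text(node):
    """Human-readable form: tag(arg, ...), JSON-quoted strings, decimal ints."""
    if isinstance(node, Term):
        if not node.args:
            return node.tag
        return node.tag + "(" + ", ".join(to_text(a) for a in node.args) + ")"
    return json.dumps(node) if isinstance(node, str) else str(node)


def _skip(text, pos):
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def _parse(text, pos):
    pos = _skip(text, pos)
    if pos >= len(text):
        raise ValueError("unexpected end of input")
    if text[pos] == '"':  # quoted string
        return json.JSONDecoder().raw_decode(text, pos)
    if text[pos].isdigit() or text[pos] == "-":  # integer
        end = pos + 1
        while end < len(text) and text[end].isdigit():
            end += 1
        return int(text[pos:end]), end
    end = pos  # otherwise a tag, assumed to start with a letter
    while end < len(text) and (text[end].isalnum() or text[end] == "_"):
        end += 1
    if end == pos:
        raise ValueError("expected a tag at offset %d" % pos)
    tag, args, pos = text[pos:end], [], _skip(text, end)
    if pos < len(text) and text[pos] == "(":
        pos += 1
        while True:
            pos = _skip(text, pos)
            if pos >= len(text):
                raise ValueError("unclosed argument list")
            if text[pos] == ")":
                pos += 1
                break
            arg, pos = _parse(text, pos)
            args.append(arg)
            pos = _skip(text, pos)
            if pos < len(text) and text[pos] == ",":
                pos += 1
    return Term(tag, tuple(args)), pos


def from_text(text):
    node, pos = _parse(text, 0)
    if text[pos:].strip():
        raise ValueError("trailing input")
    return node


def to_binary(node):
    """Binary form: a one-byte kind marker followed by length-prefixed fields."""
    if isinstance(node, Term):
        tag = node.tag.encode("utf-8")
        head = b"T" + struct.pack(">H", len(tag)) + tag
        head += struct.pack(">H", len(node.args))
        return head + b"".join(to_binary(a) for a in node.args)
    if isinstance(node, str):
        data = node.encode("utf-8")
        return b"S" + struct.pack(">I", len(data)) + data
    return b"I" + struct.pack(">q", node)  # 64-bit signed integer


def _read(data, pos):
    kind, pos = data[pos:pos + 1], pos + 1
    if kind == b"I":
        return struct.unpack_from(">q", data, pos)[0], pos + 8
    if kind == b"S":
        n = struct.unpack_from(">I", data, pos)[0]
        return data[pos + 4:pos + 4 + n].decode("utf-8"), pos + 4 + n
    if kind != b"T":
        raise ValueError("unknown kind marker %r" % kind)
    n = struct.unpack_from(">H", data, pos)[0]
    tag = data[pos + 2:pos + 2 + n].decode("utf-8")
    pos += 2 + n
    count = struct.unpack_from(">H", data, pos)[0]
    pos += 2
    args = []
    for _ in range(count):
        arg, pos = _read(data, pos)
        args.append(arg)
    return Term(tag, tuple(args)), pos


def from_binary(data):
    node, pos = _read(data, 0)
    if pos != len(data):
        raise ValueError("trailing bytes")
    return node
```

With something of this shape, `to_text(from_binary(data))` would let a logger keep traffic as binary and present it as text later, and `to_binary(from_text(edited))` would turn an edited log entry back into bytes for a test case.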
---------------------------------------- Text by me above is hereby placed in the public domain
Cheers, --MarkM
Mark,
As the CORBA apologist, I just have to chime in... :-/
- By "platform neutral" I don't mean that it can't favor or be based on one platform (such as Squeak), but rather that it not impose a level of pain on other platforms significantly beyond what a truly neutral format would impose. For example, Java's RMI, despite all its faults, is a better "platform neutral" system than Corba.
I think our assumptions about "better" weight different factors...
While RMI is built to support Java specifically, it is less painful for other platforms than is Corba.
I'd appreciate references to experiments that demonstrate this. It could well be true. It is also worth noting that RMI is lighter-weight and therefore less general than CORBA. Depending on system constraints, this will obviously favor one over the other.
Corba imposes similar pain on everyone, and that level of pain is higher for everyone. I'm more interested in standards which reduce the general level of pain than in ones which allocate pain "fairly".
I object to the use of the word "standard" in this context. I think you mean "platform" here.
A platform independent system, like Corba, is just covertly yet another platform,
Nothing covert - OMG has recognized CORBA as an independent platform at least since I started participating in 1996!
and usually one less well designed than the platforms it's trying to be independent of.
Of Java I would agree, too :-D At least CORBA is designed - Java is cobbled together under unreasonably tight deadlines, which leads to it being timely, but not necessarily well built. All of these complaints (except timeliness) are often aimed at CORBA and MS as well.
The statement as spoken is true; however, as a long-time observer of the CORBA development process, I would claim that the implicit assertion that CORBA is (intentionally) less well-designed is untrue. I would agree that it may not be as effective as more recent equivalent protocols (RMI is NOT an equivalent protocol - merely similar). But this is due more to the fact that GIOP is older (~1993) and doesn't benefit from more recent research.
In general, I have the impression that you're not making a sufficient distinction between the run-time CORBA infrastructure (the ORB) - which isn't particularly relevant to this discussion - and the Common Data Representation (CDR) used as the normalized format in CORBA messages - which is VERY relevant to this discussion. Whether or not you choose it, I think that you should consider the CDR as a possible serialization format. More specifically, after looking at the requirements below, I think that you should consider CDR-encoded OMG Objects-by-Value (IDL valuetypes) as an option. The URL for this aspect of the CORBA spec is http://www.omg.org/cgi-bin/doc?formal/01-12-43
So don't be shy about proposing Squeak-based or Parcels-based serialization formats.
- From previous discussion on e-lang, I feel we must have both a textual and binary serialization format, designed together, or one derived from the other, such that conversion in either direction is meaning preserving. A connection will start out textual, and will switch to binary only after a text-based negotiation about how to proceed. That way we enable text-only processors (humans on a telnet) to play.
With the possible exception of the telnet requirement, I think that valuetypes would support this, since there is the standard CDR encoding, as well as a valuetype-to-XML mapping specified :-)
- Both formats must provide good support for upgrade (or class evolution). Java serialization, for all its awful complexity, does this, and I'd like one that does at least this well. My understanding is that Parcels does an excellent job at upgrade, but I'm much less familiar with it.
If this is code evolution, then the valuetype spec has the concept of being able to identify the "codebase" for which the valuetype is designed. If the issue is state evolution, the valuetypes include inheritance semantics; therefore, they can evolve in a backwards-compatible way.
- The textual format must be human readable and human editable. This combined with the "meaning preserving in both directions" requirement means that one can edit a binary serialization graph by translating to text, editing, and translating back. Likewise, it enables a logger to log binary network traffic as binary, and then later present it as textual if there's interest. An edited textual log can then be used as a basis of test cases.
This is probably hard in the CDR case, but certainly would be possible in an XML marshalled message.
- The binary format must be able to perform reasonably. My understanding is that Parcels is unmatched here on reading speed, but I have no idea what price it pays on writing. For CapTP's use, each graph will be written once in order to be read once. For persistence use, many more graphs will be written (one per checkpoint) than will be read (one per revival). Both of these are different than the tradeoffs Parcels was optimized for.
I'll admit that I can't speak to this.
- Both formats must be simple, and easy to document. Java serialization fails this test.
The CDR format is well specified and straightforward. XML should be as easy as it ever is (opinions seem to vary in this forum).
- The format must be compatible with our security requirements. In particular, since E is only prepared to trust mobile E code, any need for executable code in the serialization format must be E code. Parcels as is fails this test of course.
The valuetype serialization carries no code, but does provide a slot to identify downloadable code that implements the valuetype.
-DMC
Mark S. Miller markm@caplet.com said:
So don't be shy about proposing Squeak-based or Parcels-based serialization formats.
Braindump:
- SRP (State Replication Protocol) is a Smalltalk-platform-neutral protocol with some nice properties, and is expressly meant to be usable on platforms besides Smalltalk. Especially the data encoding is cool, but apart from that it also has good support for class evolution, etcetera. http://wiki.cs.uiuc.edu/CampSmalltalk/About+State+Replication+Protocol+(SRP)
- Python has a neat serialization format that is based on some sort of mini-language which has textual and binary representations, support for custom formats and things like oid-replacements. Jim Fulton (Zope and formerly Smalltalk) thinks it is usable from other languages as well. The pickle.py module is less than 1000 lines of code.
- I wouldn't suggest XML - all the code doing the XML parsing is probably too large for anything security-related.
- RMI would be doable in Smalltalk, I think; it's well specified and the people behind RMI knew what they were talking about so IMHO it's an OK wire protocol. The multiplexing makes it quite efficient from a resource-usage point of view. As you remark, Java serialization is a tad complex, so this might be a big job. http://java.sun.com/products/jdk/1.1/docs/guide/rmi/spec/rmi-protocol.doc.ht... http://java.sun.com/products/jdk/1.2/docs/guide/serialization/spec/serialTOC...
- The Parcel format is VisualWorks specific, although it probably could be ported (the problem is that the format has no formal specification AFAIK, so a 'port' would almost certainly mean having to read code from VisualWorks, which brings you in muddy legal waters).
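To make the Python item in the list above a little more concrete, here is a small sketch using the `pickle` module from the current Python standard library (which may differ in detail from the version under discussion). It writes the same object graph with protocol 0, which the Python documentation calls the original "human-readable" protocol, and with binary protocol 2, and uses the `persistent_id` / `persistent_load` hooks to replace an object by an oid-style reference. The `Account` class, the `REGISTRY` table, and the `account:<oid>` reference syntax are assumptions made up for illustration.

```python
import io
import pickle


class Account:
    """Hypothetical object that should travel by reference, not by value."""
    def __init__(self, oid, owner):
        self.oid = oid
        self.owner = owner


REGISTRY = {}  # hypothetical table of live objects, keyed by integer oid


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Returning a string makes pickle write a reference instead of the
        # object's state. An ASCII string works in protocol 0 and in the
        # binary protocols alike.
        if isinstance(obj, Account):
            return "account:%d" % obj.oid
        return None  # everything else is pickled by value


class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        kind, _, oid = pid.partition(":")
        if kind != "account":
            raise pickle.UnpicklingError("unknown reference: %r" % pid)
        return REGISTRY[int(oid)]


def dump(obj, protocol):
    buf = io.BytesIO()
    RefPickler(buf, protocol=protocol).dump(obj)
    return buf.getvalue()


def load(data):
    return RefUnpickler(io.BytesIO(data)).load()


alice = Account(1, "alice")
REGISTRY[alice.oid] = alice
message = {"from": alice, "amount": 10}

as_text = dump(message, 0)    # protocol 0
as_binary = dump(message, 2)  # binary protocol 2
assert load(as_text)["from"] is load(as_binary)["from"] is alice
```

One caveat that bears on the security requirement quoted further down: the Python documentation warns that unpickling data from an untrusted source can execute arbitrary code, so a plain `pickle.loads` on network input would not be acceptable as is.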
Even though I haven't followed the conversation on the E list on the necessity of a text format, I dissent. I think that if you want a human to talk to a running E instance through telnet, you would do better to open a separate connection with a minimal command language (maybe one that would allow you to specify an object, step by step, and then you enter the "send" command).
- The format must be compatible with our security requirements. In particular, since E is only prepared to trust mobile E code, any need for executable code in the serialization format must be E code. Parcels as is fails this test of course.
In the context of heterogeneous E implementations, the point of mobile code becomes tricky, of course :-)
squeak-dev@lists.squeakfoundation.org