[Q][magma] MaObjectSerializer documentation and usage

Robert Withers rwithers12 at attbi.com
Wed Jun 11 09:36:06 UTC 2003


On Tuesday, June 10, 2003, at 04:42 PM, Chris Muller wrote:

>> Chris, it looks like it took at least several weeks for you to write
>> Magma.
>
> Ha!  I only wish!  The initial concept for Magma actually started in February
> 2000.  I spent about ten hours a week since then to build and assemble the
> independent frameworks that make it work the way it does today.  Thankfully,
> the codebase ended up relatively small and dense, so I prefer to think of it as
> a fine instrument than a giant bulldozer.  :-)

I thought you would enjoy my understatement.  :-)  Three years of 
effort is no small dedication - approximately 1500 hours!?  I wonder 
how much value is published on SqueakMap.   My interest in getting some 
of the features of E in Squeak started about 2 years ago, but I have 
been much more inconsistent with my efforts to date.   Hopefully this, 
serialization, is the home stretch.  I am also hoping for _some_ of 
those fine instrument qualities in SqueakElib.


I am doing distributed objects.  Specifically, I am trying to replicate 
the Elib framework from E, and I am finishing up the CapTP layer ( 
http://www.erights.org/elib/distrib/index.html ).   It has some very 
interesting features, but it is fundamentally a distributed object 
protocol.

Elib uses Java serialization.  If only we had a Java serialization 
framework, then E and Squeak could interoperate, but that is a broader 
task than I am willing to pursue at this time. :)    While I won't be 
able to provide interoperability with E yet, CapTP does use Descriptors 
to exchange these references and I would like to adhere to the wire 
protocol as much as possible.  Finally, there is information being 
passed by these descriptors that is used to establish 'session' in 
CapTP.

You may be interested in this paper:  
http://www.erights.org/data/serial/jhu-paper/intro.html

The E terminology for a separate environment of objects is a Vat.  So 
let's say we have VatA and VatB.

The CapTP protocol uses several different types of proxies (supporting 
the aforementioned interesting features), and they are exchanged by way 
of different types of Descriptors - 4 different types.  All descriptors 
include a connection specific ID (wirePosition), but some of the 
descriptor types may also include other information like objectIdentity 
and a security hash.

The objectIdentity is a 160-bit random number, and each one is registered 
in an object table (called a swissTable).  Registered objects can be 
accessed from multiple connections.  It is truly the object's identity.  If I add 
persistence (and I was looking toward Magma), these registered objects 
will be persistent and keep this ID.

The wirePosition is a connection specific id for traffic over that 
connection for that object.  Different descriptor types use different 
tables to store these wirePositions.   They are effectively session IDs 
for each object.
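
A minimal sketch of the two kinds of tables described above, written in Python rather than Squeak. The class and method names (SwissTable, Connection, register, export) are hypothetical and not taken from Elib or SqueakElib; only the 160-bit identity and the per-connection scope of a wirePosition come from the description.

```python
import secrets


class SwissTable:
    """Vat-wide table: objectIdentity (160-bit random number) -> object."""

    def __init__(self):
        self._by_identity = {}
        self._identity_of = {}  # id(obj) -> identity, so each object registers once

    def register(self, obj):
        key = id(obj)  # stable, because the table keeps obj alive
        if key not in self._identity_of:
            identity = secrets.randbits(160)
            self._identity_of[key] = identity
            self._by_identity[identity] = obj
        return self._identity_of[key]

    def lookup(self, identity):
        return self._by_identity[identity]


class Connection:
    """Per-connection table: wirePosition -> object, meaningful on this connection only."""

    def __init__(self):
        self._exports = []
        self._position_of = {}

    def export(self, obj):
        key = id(obj)
        if key not in self._position_of:
            self._position_of[key] = len(self._exports)
            self._exports.append(obj)
        return self._position_of[key]


swiss = SwissTable()
conn_a, conn_b = Connection(), Connection()
account = object()

identity = swiss.register(account)   # same identity on every connection
conn_a.export(object())              # an unrelated object takes position 0 on conn_a
pos_a = conn_a.export(account)       # position on conn_a
pos_b = conn_b.export(account)       # a different position on conn_b
```

The point of the sketch is the difference in scope: the identity is the same no matter which connection asks, while the wirePosition depends on what else has already been exported over that connection.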

Your OIDs I think are scoped to each serialized graph.  Isn't it the 
mechanism you use to implement a forwarding table, so you only have to 
write each object once?  Your assigning of Class Ids is different from 
these OIDs, since classes look to be pre-registered.


When a graph is to be serialized in VatA, I'll use substitution to do 
two things for proxiable objects.  First, make the necessary 
registration in VatA's export table and get a wirePosition, as well as 
any other info needed for the descriptor, such as registering an 
exported object with an identity.  Second, substitute the 
resulting descriptor into the graph of objects.

When I materialize (depickle?) the graph in VatB, I need to do 3 
things.  First, build the appropriate object representing the 
remote proxiable object (a handler).  Second, make the correct 
registration for that object in VatB's import table.  Third, 
substitute a proxy on the handler into the graph of objects.
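
The two-step export and three-step import can be sketched as a pair of single-pass walks. This is Python, not Squeak, and every name in it (Proxiable, Descriptor, Handler, Proxy, serialize, materialize) is a hypothetical stand-in. The "graph" is just nested lists, cycles are not handled, and the descriptor carries only a wirePosition, so it is an illustration of the substitution idea under those assumptions and not the CapTP wire format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    wire_position: int


class Proxiable:
    """Marker for objects passed by reference instead of by copy (hypothetical)."""

    def __init__(self, name):
        self.name = name


class Handler:
    """Represents the remote proxiable object inside the importing vat."""

    def __init__(self, wire_position):
        self.wire_position = wire_position


class Proxy:
    def __init__(self, handler):
        self.handler = handler


def serialize(node, exports):
    """VatA: register proxiable objects, then substitute descriptors in-line."""
    if isinstance(node, Proxiable):
        if node not in exports:          # step 1: register in the export table
            exports.append(node)
        return Descriptor(exports.index(node))  # step 2: substitute the descriptor
    if isinstance(node, list):
        return [serialize(child, exports) for child in node]
    return node


def materialize(node, imports):
    """VatB: build handler, register it, substitute a proxy, all in one pass."""
    if isinstance(node, Descriptor):
        if node.wire_position not in imports:
            handler = Handler(node.wire_position)         # step 1: build the handler
            imports[node.wire_position] = Proxy(handler)  # step 2: register it
        return imports[node.wire_position]                # step 3: substitute the proxy
    if isinstance(node, list):
        return [materialize(child, imports) for child in node]
    return node


purse = Proxiable("purse")
vat_a_exports = []
wire_graph = serialize(["deposit", 10, purse], vat_a_exports)

vat_b_imports = {}
received = materialize(wire_graph, vat_b_imports)
```

Because the substitution happens while the graph is being walked, no interim object is ever handed to the receiver, so nothing needs to be swapped afterwards with become:.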


I want to do this work as part of a single pass of 
serialization/materialization, for speed and to avoid that tricky 
become: issue.


>> When materializing a byteArray, I also need to substitute objects for
>> descriptors.
>
> If your descriptor instances only purpose in life is to be friendly to
> serialization/materialization, and not intended to be used by the external
> program which works with the serialized/materialized objects, then I think
> you'll be happier with not using substitutions at all.  That's because, under
> that scenario, there is a cost incurred to materializing the descriptor and yet
> is an object that nobody wants.  You only want it so that it can be converted
> to the object it describes.

The descriptors are used by the external program, but so very close to 
the serialization boundary that they could perhaps be part of the 
serialization protocol.  I think that may be mixing responsibilities too 
much, though.


> To illustrate, pretend the serializers traversalStrategy has a method called
> "realObjectForSubstitute:" which takes the substitute and gives you back the
> original (the opposite of what it already has, which is "substituteFor:").

Great illustration!  I would rather not do a become: if possible.   I 
like the semantics of #substituteFor: for use in both directions.  Perhaps 
having a different strategy for serialization and materialization would 
make sense.  Another option would be for the Serializer to ask the 
traversalStrategy #isSubstitutingForSerialization and 
#isSubstitutingForMaterialization, but not in the inner loop of course.

I actually plan on having two Serializers, one for serialization and 
one for materialization.  These functions happen on different threads.


>> You seem to be
>> recommending that I use special ObjectBuffers that materialize to the
>> correct object.
>
> Not trying to "recommend" anything without a full understanding of your
> context, sorry.  It wouldn't be a special ObjectBuffer, it would be an
> existing one, ByteObjectBuffer.

I didn't understand your use of ByteObjectBuffer in the previous mail.  
Plus I was _very_ tired and couldn't think straight any longer.  Funny 
enough I am approaching that same dilemma this morning.  :-)

You wrote:
> > > On the materialization side, I remember now why plain ol'e
> > > ByteObjectBuffers to name the global worked much better for me with
> > > Magma than substitutions.  In the materialization process, the
> > > framework ends up creating a basicNew instance of the object and then
> > > setting its instVars one by one from the buffer (taking its oids and
> > > looking up the objects for them).  If, at that point, it creates an
> > > "interim" instance of a "Descriptor" you're going to eventually want
> > > to translate it back to the original object (e.g., Smalltalk or
> > > Processor), which requires a become: since it could be referenced
> > > from many objects in a single materialization.

Ok, I get it now.  You were discussing preserving object identity.  
Here's what's happening in Elib.  The tables are maintaining identity 
for the objects.  If a given graph, in VatA, has 2 references to the 
same object, serialization will create 2 different descriptors with the 
same wirePosition etc.  (will it!? dunno).   During materialization, 
each descriptor will be bound into the object tables and return the 
same object.  Since this substitution happens "in-line" all the 
references will be hooked up appropriately.
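
A small Python sketch of that identity argument, assuming the import table is keyed by wirePosition. ImportTable, bind, Descriptor and Proxy are hypothetical names used only for illustration. Two distinct descriptor instances that carry the same wirePosition resolve to one and the same proxy, so both references in the materialized graph end up pointing at a single object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    wire_position: int


class Proxy:
    def __init__(self, wire_position):
        self.wire_position = wire_position


class ImportTable:
    """Maps wirePosition -> the one proxy that stands for that remote object."""

    def __init__(self):
        self._proxies = {}

    def bind(self, descriptor):
        position = descriptor.wire_position
        if position not in self._proxies:
            self._proxies[position] = Proxy(position)
        return self._proxies[position]


table = ImportTable()
first, second = Descriptor(7), Descriptor(7)   # two descriptors, same wirePosition

# Substitute in-line while filling in the slots of the materialized object.
transfer = {"owner": table.bind(first), "payee": table.bind(second)}
same_object = transfer["owner"] is transfer["payee"]
```

Under this assumption the table, not the serialized graph, is what carries identity, which is why duplicate descriptors on the wire would be harmless.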

Note that the Java Serialization framework does allow substitution for 
both directions, but I don't know what it does about object identity.   
Actually, I just realized, writing the previous sentence, that Java is 
probably maintaining identity such that only one Descriptor will be in 
a given serialized graph and so only one substitution will occur on 
materialization.  I wonder if you could really screw up serialization 
in Java by substituting the same object multiple times.  I'll bet they 
handle that through an internal object identity table of some kind.

It seems that my object tables are in fact part of the serialization 
process, which supports my original guess that I need a special 
OIDManager for CapTP, if only I knew where to stuff the extra info and 
how to discriminate between the different types of descriptors.  At this 
point, substitution does seem attractive, with the caveat that I may 
break identity invariants if I am not careful about my substitutions.  
I would then skip the become: and rely on the user to do the right thing.


> Perhaps you could give me some more concrete details, or perhaps a simple code
> sample that demonstrates what you're wanting to do.

I wish I had a simple code example to give you!  Hopefully I explained 
briefly (not!) and clearly enough to communicate my situation well.

cheers,
Rob


