[OT] Insight on distributed computing wanted
Michael van der Gulik
squeakml at gulik.co.nz
Wed Jun 16 10:08:41 UTC 2004
This is what I worked on for my master's project - it's at dpon.sourceforge.org
(big mess, will clean up one day) and is written in Squeak. I never got it
fully working, but I'm still trying to get some basic things working.
On 14 May 2004 17:55:37 +0200, Martin Drautzburg
<martin.drautzburg at web.de> wrote:
> I've been pondering over the question:
> "information is so much easier to transport than material
> objects, so why is distributed computing so difficult".
Because the context of information is very specific, whereas the context
of material objects (i.e. a physical world, 3 dimensions plus time, plus
all the rules of physics) is quite big. Material objects can be moved
anywhere in 3-D space because it's still the same context.
> Objects ultimately reference their worlds. A Canvas object
> eventually references my graphics card, my monitor and the
> eyes and the brain of me, the user. None of these objects can
> be transported easily, so the boundary between a Canvas object
> and its world is definitely above the graphics card. The
> graphics card and all other material objects are
> "immobile". Maybe even the Canvas object itself is immobile.
My solution was to couple each object with a ReplicationAlgorithm object.
All messages sent to the object are captured and processed first by its
ReplicationAlgorithm. This meant that the distributed "aspect" of that
object (i.e. behaviour and state of the replication) is separated from the
normal.. er.. behaviour of the object.
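To make the coupling concrete, here is a minimal sketch in Python rather than Squeak. It is not the DPON code: the names `Replicated`, `RemoteInvocation`, `handle` and `Counter` are made up for illustration, and the "network" is just a local call. It only shows the shape of the idea, i.e. every message is captured by the replication algorithm before it reaches the object.

```python
class RemoteInvocation:
    """Hypothetical algorithm: stands in for forwarding to a central replica."""

    def __init__(self):
        self.log = []  # every intercepted message is recorded here

    def handle(self, target, selector, args):
        self.log.append((selector, args))
        # A real algorithm would pack the message up and send it over the
        # network; this sketch just performs the call locally.
        return getattr(target, selector)(*args)


class Replicated:
    """Couples an object with a replication algorithm (illustrative only)."""

    def __init__(self, target, algorithm):
        self._target = target
        self._algorithm = algorithm

    def __getattr__(self, selector):
        # Called only for names not found on the wrapper itself, so every
        # message meant for the target passes through the algorithm first.
        def send(*args):
            return self._algorithm.handle(self._target, selector, args)
        return send


class Counter:
    """Ordinary object with no knowledge of distribution."""

    def __init__(self):
        self.value = 0

    def increment(self, by):
        self.value += by
        return self.value


algorithm = RemoteInvocation()
counter = Replicated(Counter(), algorithm)
result = counter.increment(5)
```

Swapping in a different algorithm object would change how messages are distributed without touching `Counter` at all, which is the separation described above.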
I'm not very familiar with the Canvas class yet... I haven't done any
graphics programming in Squeak. If the Canvas object is the wrapper around
the physical device, then yes, you'll need to use remote invocation. If
it's just another layer of abstraction and it doesn't need any access to
plug-ins or hardware, you could use another replication algorithm, such as
having replicas placed on any computer that wants to use that object, and
using some consistency protocol to keep the replicas up to date with each
other. It does get hairy quickly though.
> In any case there are mobile and immobile Objects.
Well, there can be replicated objects, where there's usually one replica
per machine. Examples of replication algorithms are:
- remote invocation / call-by-reference (one central replica, pack the
message up, send it off, wait for reply).
- migration / call-by-value (fetch the object, send it a message locally)
- master/slaves (read from a local replica, write to a central master -
for state only).
- broadcast (send all messages to all replicas).
...and then, of course, your imagination is the limit.
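The four algorithms listed above can be sketched side by side. This is a toy, single-process illustration in Python; the `Account` class and all function names are assumptions for the example, copying stands in for network transfer, and no failure handling is shown.

```python
import copy


class Account:
    """Hypothetical object to be replicated."""

    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance

    def read(self):
        return self.balance


def remote_invocation(central, selector, *args):
    """One central replica: ship the message to it and return the reply."""
    return getattr(central, selector)(*args)


def migration(central, selector, *args):
    """Fetch a copy of the object, then send it the message locally."""
    local = copy.deepcopy(central)
    return local, getattr(local, selector)(*args)


class MasterSlaves:
    """Read from a local replica, write to the master (state only)."""

    def __init__(self, master, slave_count):
        self.master = master
        self.slaves = [copy.deepcopy(master) for _ in range(slave_count)]

    def read(self, slave_index):
        return self.slaves[slave_index].read()

    def write(self, selector, *args):
        result = getattr(self.master, selector)(*args)
        for slave in self.slaves:
            # Push the master's state out to every slave.
            slave.__dict__.update(copy.deepcopy(self.master.__dict__))
        return result


def broadcast(replicas, selector, *args):
    """Send the same message to every replica."""
    return [getattr(replica, selector)(*args) for replica in replicas]


central = Account(10)
reply = remote_invocation(central, "deposit", 5)
local_copy, local_reply = migration(central, "deposit", 1)

group = MasterSlaves(Account(100), slave_count=2)
group.write("deposit", 50)

replicas = [Account(0), Account(0)]
replies = broadcast(replicas, "deposit", 7)
```

Note how migration leaves the original untouched in this sketch: after the local deposit, `central` and `local_copy` have diverged, which is exactly where a consistency protocol would have to step in.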
> When the world of an object is replaced the object needs to
> attach itself to a new world. If the object should expose a
> "similar" behaviour in the new world, then the two worlds must
> be reasonably alike. The same is true for material objects.
Yea, this is the hairy bit. Essentially, if you migrate an object, then
everything it has a relationship with must also be replicated. At the
object's destination, every reference the object owns must be converted to
some form of remote reference.
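As a rough sketch of that reference conversion (Python, with made-up names `Node`, `RemoteRef` and `migrate`, and assuming the referenced objects stay behind on the source host):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteRef:
    """Hypothetical stand-in for a reference to an object on another host."""
    host: str
    object_id: int


class Node:
    """Hypothetical object whose relationships are held in a dict."""

    def __init__(self, name):
        self.name = name
        self.refs = {}  # field name -> Node or RemoteRef


def migrate(obj, source_host, registry):
    """Build the copy that arrives at the destination.

    Every local reference the object owns is replaced by a RemoteRef.
    registry maps object ids to objects on the source host, so that a
    RemoteRef can be resolved there later.
    """
    moved = Node(obj.name)
    for field, target in obj.refs.items():
        if isinstance(target, RemoteRef):
            moved.refs[field] = target  # already remote, keep as is
        else:
            registry[id(target)] = target
            moved.refs[field] = RemoteRef(source_host, id(target))
    return moved


canvas = Node("canvas")
display = Node("display")
canvas.refs["display"] = display

registry = {}
moved = migrate(canvas, "hostA", registry)
ref = moved.refs["display"]
```

Using `id()` as the object identifier is only good enough for a single-process demo; a real system would need identifiers that stay valid across machines.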
Be careful with your 'physical objects' analogy. The word "Object" was
perhaps a poor choice of words in Smalltalk. A better name for the
entities in Smalltalk would be "Concept". The physical "Object" could be
considered a sub-class of a "Concept" which can only exist in a 3-D
physical world. Concepts themselves consist only of relationships with
other concepts, with exceptions for things like numbers and characters. It
is those relationships which form the context of that concept. If you take
the concept out of that context, it becomes meaningless because its
essence - the relationships - no longer has meaning.
The solution is to ensure that a Concept's (or Object's) relationships remain
valid after moving or migrating that Concept/Object.
> * HIDING INFORMATION
> You typically don't want to expose too much information. An
> object on the sender side may "know more" than is relevant for
> the receiver. Likewise the object on the receiver side may
> know more about the receiver than is relevant for the sender.
Good point. This is one of the hard parts of distributed computing - it
can be difficult to encapsulate the distributed nature of an object. Lots
of bad things happen in distributed systems - networks aren't as reliable
as local processes and memory. Simple things like message sends aren't
guaranteed to happen, meaning that every invocation could cause a
distributed exception. Also, like you say, an object sometimes needs to be
aware of its migration, and adjust itself accordingly. It needs to be able
to serialize itself and deserialize itself, storing only the relevant
information it needs to remake itself, and be able to reconstruct its
context at the destination. This gets quite involved. Serialization is
hard - what do you do with a 32-bit local reference to another object? My
solution was to replicate each object that is referenced, and then ask the
coupled replication algorithm for a serialized reference to it.
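A small Python sketch of that last step, under heavy assumptions: the `ReplicationAlgorithm` class here is a toy with an invented `serialized_reference` method, the `{"$ref": ...}` wire format is made up, and plain values are simply copied while everything else is replaced by whatever its coupled algorithm hands back.

```python
import json


class ReplicationAlgorithm:
    """Hypothetical: hands out a portable reference for its coupled object."""

    def __init__(self, host, key):
        self.host = host
        self.key = key

    def serialized_reference(self):
        return {"$ref": f"{self.host}/{self.key}"}


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Shape:
    def __init__(self, name, origin):
        self.name = name
        self.origin = origin  # a local reference to another object


def serialize(obj, algorithms):
    """Serialize obj, replacing local references with serialized ones.

    algorithms maps id(referenced object) -> its coupled algorithm.
    """
    fields = {}
    for key, value in vars(obj).items():
        if isinstance(value, (int, float, str, bool, type(None))):
            fields[key] = value  # plain values travel as they are
        else:
            # Ask the referenced object's algorithm for a portable reference.
            fields[key] = algorithms[id(value)].serialized_reference()
    return json.dumps({"class": type(obj).__name__, "fields": fields},
                      sort_keys=True)


origin = Point(1, 2)
shape = Shape("box", origin)
algorithms = {id(origin): ReplicationAlgorithm("hostA", "point-1")}
wire = serialize(shape, algorithms)
decoded = json.loads(wire)
```

The memory address of `origin` never appears on the wire; only the reference supplied by its algorithm does, so the receiver can decide how to reach or replicate it.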
> This means that objects have to "mutate". They can be in one
> of three states:
> - objects attached to the sender's world
> - object detached
> - object attached to the receiver's world
This is essentially serializing and deserializing an object.
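The three states map onto a serialize/deserialize round trip roughly like this (a Python sketch; `Widget`, `detach` and `attach` are names invented for the example, and the "world" is reduced to a string):

```python
class Widget:
    """Hypothetical object that is always attached to some world."""

    def __init__(self, label, world):
        self.label = label
        self.world = world  # e.g. the machine the object currently lives on

    def detach(self):
        # Keep only what is needed to rebuild the object; drop the world.
        return {"label": self.label}

    @classmethod
    def attach(cls, detached, world):
        # Rebuild the object and bind it to the new world.
        return cls(detached["label"], world)


sender_side = Widget("ok-button", world="sender")       # attached to sender
in_transit = sender_side.detach()                       # detached
receiver_side = Widget.attach(in_transit, "receiver")   # attached to receiver
```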