MC, P2P, distributed collections (Was: Re: Morphic Splitters Team progress report)

Cees de Groot cg at cdegroot.com
Wed Mar 16 15:10:19 UTC 2005


On Wed, 16 Mar 2005 13:01:03 +0100, Avi Bryant <avi.bryant at gmail.com>  
wrote:
>  This won't work very well with a naive HTTP or FTP implementation of
> one file per method; even just using that approach in a local
> directory proved to be unacceptably slow.  So if anyone has an
> experiment they want to try here (p2p, cees?), that would be great.
>
The 'easy' solution would of course be to write an appserver with Rmt or
similar. But if you say 'experiment' and 'p2p' in one sentence, yeah, I'm
all ears :)

The p2p project has one major issue to solve - distributed collections.  
Which is starting to look an awful lot like MC anyway. The idea is that if  
you have, say, communities of users then two users at two points in the  
network should be able to simultaneously add themselves to a given  
community (or one adds himself, the other removes herself). So, the  
information about which users are members of which community isn't so much a
centralized collection, but rather a distributed set of assertions about  
membership in a collection.
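
A minimal sketch of that idea in Python, purely for illustration. The
`Assertion` record, the `merge` and `members` helpers, and the rule that a
'not a member' assertion wins are all assumptions made up for this example,
not the p2p project's actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One statement about membership (hypothetical structure)."""
    user: str
    community: str
    is_member: bool


def merge(local, remote):
    """Merging two nodes' knowledge is plain set union, so order does not matter."""
    return local | remote


def members(assertions, community):
    """Derive the member list; assumed rule: any 'not a member' assertion wins."""
    joined = {a.user for a in assertions
              if a.community == community and a.is_member}
    left = {a.user for a in assertions
            if a.community == community and not a.is_member}
    return joined - left


# Both nodes start from the same knowledge: carol is a member.
shared = {Assertion("carol", "squeakers", True)}
# Concurrently, alice adds herself on one node and carol removes herself on another.
node1 = shared | {Assertion("alice", "squeakers", True)}
node2 = shared | {Assertion("carol", "squeakers", False)}

merged = merge(node1, node2)
print(sorted(members(merged, "squeakers")))
```

Because nothing here is ordered, the two concurrent changes merge without
any coordination. The price of the assumed 'removal wins' rule is that a
user who leaves can never rejoin.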

Of course, the biggest issue is how to propagate and merge changes, how to  
resolve conflicts (probably by throwing a coin - we'll just write the app  
so conflicts are unlikely: user A will end membership at the same time as  
user B joins, but will one machine remove user A and another one add user  
A at around the same time? Hardly likely), etcetera. Propagation of deltas  
is already handled by the 'presence' mechanism; all I have to do is think  
about how to propagate, merge, and keep around assertions.

The case I'm currently thinking about: user A is a member, ends  
membership, but on second thought joins a moment later. So we have an  
existing assertion 'A is member of collection X', then an assertion 'A is  
not a member of collection X', and some time later a repeated assertion 'A  
is member of collection X'.

Of course, Joe Random P2P Node can get the two deltas in any order...
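
To make the ordering problem concrete, here is a small Python sketch. It
assumes a node naively applies each delta as it arrives; `replay`, `leave`
and `rejoin` are names invented for the example.

```python
from itertools import permutations


def replay(initial, deltas):
    """Naively apply (user, is_member) deltas in the order they arrive."""
    members = set(initial)
    for user, is_member in deltas:
        if is_member:
            members.add(user)
        else:
            members.discard(user)
    return frozenset(members)


leave = ("A", False)   # 'A is not a member of collection X'
rejoin = ("A", True)   # the later, repeated 'A is member of collection X'

# Every node already holds the original assertion, so A starts as a member.
for order in permutations([leave, rejoin]):
    print(order, "->", sorted(replay({"A"}, order)))
```

A node that sees the deltas as leave, rejoin ends up with A as a member; a
node that sees rejoin, leave ends up without A. The deltas need to carry
something that lets a node recover the intended order.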

A global notion of time could be possible, but I wouldn't like to mandate  
that nodes be synchronized by NTP or similar (nodes are exchanging  
information anyway, so I could add some sort of virtual clock  
synchronization to the protocol, but that's a *lot* of work - a globally  
monotonically increasing counter would make a lot of things simpler,  
though). The alternative is to have these assertions carry along some
sort of MC-like ancestry: every version gets a new UUID, and versions  
state what their parent is. But that would mean that these assertions  
would get large. As they are the glue in the whole distributed information
repository, and therefore quite common and probably frequently updated, this
could mean quite a bandwidth burden...
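
A rough Python sketch of the ancestry variant, under stated assumptions: the
`Version` record and the `heads` helper are hypothetical, and only a single
parent per version is modelled (MC versions can have several ancestors).

```python
import uuid
from dataclasses import dataclass, field
from itertools import permutations
from typing import Optional


@dataclass(frozen=True)
class Version:
    """One membership assertion plus MC-like ancestry (hypothetical layout)."""
    user: str
    community: str
    is_member: bool
    parent: Optional[uuid.UUID] = None                       # UUID of the version this replaces
    id: uuid.UUID = field(default_factory=uuid.uuid4)        # fresh UUID per version


def heads(versions):
    """Return the versions that no other received version names as its parent."""
    parents = {v.parent for v in versions if v.parent is not None}
    return [v for v in versions if v.id not in parents]


joined = Version("A", "X", True)
left = Version("A", "X", False, parent=joined.id)
rejoined = Version("A", "X", True, parent=left.id)

# Whatever order the three versions arrive in, the single head is the rejoin.
for order in permutations([joined, left, rejoined]):
    assert heads(list(order)) == [rejoined]

# Cost: a UUID is 128 bits, so id plus parent add 32 bytes to every assertion.
print(len(rejoined.id.bytes) + len(left.id.bytes))
```

With this scheme, more than one head means either a genuine conflict (two
nodes replaced the same parent) or a version whose ancestor has not arrived
yet; the sketch does not try to tell those apart or to resolve them.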

Anyway, our persistent collection (Set-like, actually) class is at the
moment about the only thing that doesn't play well with p2p, so that's #1
on my agenda. After that I'm more than happy to experiment with an MC2P2P  
repository (starts to sound like some rolling piece of scrap in an SF  
movie...)


