distributed collections (Was: Re: Morphic Splitters Team progress
Cees de Groot
cg at cdegroot.com
Wed Mar 16 15:10:19 UTC 2005
On Wed, 16 Mar 2005 13:01:03 +0100, Avi Bryant <avi.bryant at gmail.com> wrote:
> This won't work very well with a naive HTTP or FTP implementation of
> one file per method; even just using that approach in a local
> directory proved to be unacceptably slow. So if anyone has an
> experiment they want to try here (p2p, cees?), that would be great.
The 'easy' solution would of course be to write an appserver with Rmt or
similar. But if you say 'experiment' and 'p2p' in one sentence, yeah, I'm
all ears :)
The p2p project has one major issue to solve - distributed collections.
Which is starting to look an awful lot like MC anyway. The idea is that if
you have, say, communities of users then two users at two points in the
network should be able to simultaneously add themselves to a given
community (or one adds himself, the other removes herself). So, the
information about which users are members of which community isn't so much a
centralized collection, but rather a distributed set of assertions about
membership in a collection.
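
A minimal sketch of the "set of assertions" idea, in Python purely for illustration (the p2p project itself is Squeak code). The names Assertion, AssertionStore, origin and claimed_members are hypothetical and not taken from the project. It only shows that two nodes can record joins independently and merge by set union; deciding between a "member" and a "not a member" assertion for the same user is deliberately left open here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    user: str
    community: str
    is_member: bool  # True: "user is a member"; False: "user is not a member"
    origin: str      # hypothetical: id of the node that made the assertion


class AssertionStore:
    """Holds whatever assertions this node has heard about so far."""

    def __init__(self):
        self.assertions = set()

    def add(self, assertion):
        self.assertions.add(assertion)

    def merge(self, other):
        # Merging is a plain set union, so it does not matter
        # which node syncs with which first.
        self.assertions |= other.assertions

    def claimed_members(self, community):
        # Users with at least one positive assertion. Negative assertions
        # are stored but not resolved against positive ones in this sketch.
        return {a.user for a in self.assertions
                if a.community == community and a.is_member}


node1, node2 = AssertionStore(), AssertionStore()
node1.add(Assertion("alice", "squeakers", True, "node1"))
node2.add(Assertion("bob", "squeakers", True, "node2"))
node1.merge(node2)
node2.merge(node1)
print(sorted(node1.claimed_members("squeakers")))
```

Because the store is a set, receiving the same assertion twice is harmless, which suits a network where deltas may arrive over several paths.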
Of course, the biggest issue is how to propagate and merge changes, how to
resolve conflicts (probably by throwing a coin - we'll just write the app
so conflicts are unlikely: user A will end membership at the same time as
user B joins, but will one machine remove user A and another one add user
A at around the same time? Hardly likely), etcetera. Propagation of deltas
is already handled by the 'presence' mechanism; all I have to do is think
about how to propagate, merge, and keep around assertions.
The case I'm currently thinking about: user A is a member, ends
membership, but on second thought joins a moment later. So we have an
existing assertion 'A is member of collection X', then an assertion 'A is
not a member of collection X', and some time later a repeated assertion 'A
is member of collection X'.
Of course, Joe Random P2P Node can get the two deltas in any order...
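
To make the ordering problem concrete, here is a small Python sketch (illustrative only; replay and the tuple layout are made-up names) of a node that naively lets the most recently received delta win. The two arrival orders of the "leave" and "rejoin" deltas leave user A in different states.

```python
from itertools import permutations


def replay(deltas):
    """Naively apply (user, is_member) deltas in arrival order; the last one wins."""
    state = {}
    for user, is_member in deltas:
        state[user] = is_member
    return state


leave = ("A", False)   # 'A is not a member of collection X'
rejoin = ("A", True)   # the repeated 'A is member of collection X'

for order in permutations([leave, rejoin]):
    print(order, "->", replay(order))
```

A node that happens to see the rejoin before the leave ends up believing A has left, which is why the assertions need some ordering information of their own.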
A global notion of time could be possible, but I wouldn't like to mandate
that nodes be synchronized by NTP or similar (nodes are exchanging
information anyway, so I could add some sort of virtual clock
synchronization to the protocol, but that's a *lot* of work - a globally
monotonically increasing counter would make a lot of things simpler,
though). The alternative is to have these assertions carry along some
sort of MC-like ancestry: every version gets a new UUID, and versions
state what their parent is. But that would mean that these assertions
would get large. As they are the glue in the whole distributed information
repository, and therefore quite common with probably frequent updates, this
could mean quite a bandwidth burden...
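
A rough Python sketch of the ancestry alternative, under the assumption that each version carries only its own UUID and its parent's UUID rather than a full history. Version and heads are hypothetical names for illustration. The current state is whichever version no other known version names as its parent, so the answer does not depend on arrival order.

```python
import uuid
from dataclasses import dataclass, field
from itertools import permutations
from typing import Optional


@dataclass(frozen=True)
class Version:
    user: str
    community: str
    is_member: bool
    parent: Optional[uuid.UUID] = None                     # None for the first assertion
    id: uuid.UUID = field(default_factory=uuid.uuid4)      # fresh UUID per version


def heads(versions):
    """Return the versions that no other known version names as its parent."""
    superseded = {v.parent for v in versions if v.parent is not None}
    return [v for v in versions if v.id not in superseded]


joined = Version("A", "X", True)
left = Version("A", "X", False, parent=joined.id)
rejoined = Version("A", "X", True, parent=left.id)

# Whatever order the three versions arrive in, the single head is the rejoin.
for arrival in permutations([joined, left, rejoined]):
    assert heads(list(arrival)) == [rejoined]
```

In this sketch every assertion carries two 128-bit UUIDs on top of its payload, which gives a feel for the size concern. More than one head means either a genuine conflict or an intermediate version that has not arrived yet, and the sketch does not try to tell the two apart.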
Anyway, our persistent collection (Set-like, actually) class is at the
moment about the only thing that doesn't play well with p2p, so that's #1
on my agenda. After that I'm more than happy to experiment with an MC2P2P
repository (starts to sound like some rolling piece of scrap in an SF