MC, P2P, distributed collections

Colin Putney cputney at wiresong.ca
Wed Mar 16 18:01:11 UTC 2005


Cees de Groot wrote:
> On Wed, 16 Mar 2005 11:52:22 -0500, Colin Putney <cputney at wiresong.ca>  
> wrote:
> 
>> With MC2, each presence assertion would be versioned separately. Thus
>> you'd only get new versions when some node left or joined the network,
>> and you'd be able to compare it with your current notion of the status
>> of the node: either it's the same version (status hasn't changed), it's
>> an older version (you can discard it), or it's a newer version (you
>> need to update your status).
>>
> You are talking about 'older', 'newer'. There are three ways my
> restricted bit of grey mass can come up with:
> - Transfer the notion with the information, like MC1 does;
> - Globally synchronise time;
> - Nodes send their notion of time around and you keep a tab on the last
>   time seen per node. Could get expensive.
> But then, I've been busy with non-computer stuff the whole day so please
> tell me which obvious method I'm overlooking (and you'd do me a big favor
> if you would apply this to the example *I've* been thinking about, with
> the distributed collections of assertions you want to reconstruct locally
> to a regular collection with some degree of confidence).

Ok, maybe I was reading too much into your reference to MC1. The basic 
point I wanted to make is that if we were to take a "versioning" 
approach to the problem of synchronizing information between nodes, the 
actual code in MC1 wouldn't be very useful, but the code in MC2 might 
be. So I was talking about your first point above: transfer the notion 
with the information.

Here's what I imagine:

MC2 versions things called Elements. There are ClassElements, 
MethodElements, and so on. So, for the p2p project, we could have a 
PresenceElement. It would have instance variables like #community and 
#username, which would identify it within the network.

Now, the state of the presence that a given PresenceElement identifies 
changes over time, as the user joins and leaves a particular community. 
Every time it changes, we create a new version of that element (in MC2, 
an instance of ElementVersion).

Instead of a UUID, we use a combination of SHA1 hash and timestamp to 
identify each version. Each version also has an ancestry, which is a Set 
of all the hashstamps of the versions that precede it.
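
To make the data structure concrete, here is a minimal sketch in Python rather than Squeak. It is not MC2's actual code: the names Hashstamp, ElementVersion and new_version, and the choice to hash a "community/username/state" string, are assumptions made purely for illustration.

```python
import hashlib
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Hashstamp:
    """Identifies one version: a SHA-1 digest plus a timestamp."""
    digest: str       # hex SHA-1 of the version's content
    timestamp: float  # taken from the originating node's own clock


@dataclass(frozen=True)
class ElementVersion:
    """One version of a presence, with the stamps of all its predecessors."""
    stamp: Hashstamp
    state: str                         # hypothetical: "joined" or "left"
    ancestry: frozenset = frozenset()  # hashstamps of every preceding version


def new_version(community, username, state, parent=None, now=None):
    """Create the next version of a presence, extending the parent's ancestry."""
    now = time.time() if now is None else now
    content = f"{community}/{username}/{state}".encode("utf-8")
    stamp = Hashstamp(hashlib.sha1(content).hexdigest(), now)
    if parent is None:
        ancestry = frozenset()
    else:
        # Everything the parent knew about, plus the parent itself.
        ancestry = parent.ancestry | {parent.stamp}
    return ElementVersion(stamp, state, ancestry)
```

In this sketch a user who joins, leaves and joins again produces two versions with identical content and therefore identical digests. The timestamp is what keeps their hashstamps distinct.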

So, every time a user joins or leaves a community, that creates a new 
version of her presence. These versions would be transmitted over the 
network, and other nodes would use them to update their notions of the 
community. The update would be simple: a straightforward merge of 
the new version with the existing version.
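
As a rough illustration of how a node could decide what to do with an incoming version using only ancestry, here is a small Python sketch. Version and classify are hypothetical names, stamps are shown as opaque strings, and the "diverged" case (neither version is an ancestor of the other) is my own addition; it is where a real merge would have to happen.

```python
from collections import namedtuple

# Hypothetical stand-in for an ElementVersion: its own stamp plus the set of
# stamps of all the versions that precede it.
Version = namedtuple("Version", ["stamp", "ancestry"])


def classify(current, incoming):
    """Relate an incoming version to the one this node currently holds."""
    if incoming.stamp == current.stamp:
        return "same"      # status hasn't changed
    if incoming.stamp in current.ancestry:
        return "older"     # we already moved past it; discard
    if current.stamp in incoming.ancestry:
        return "newer"     # incoming supersedes ours; update
    return "diverged"      # neither precedes the other; needs a real merge


# Usage: a node holding v2 receives a stale copy of v1.
v1 = Version("stamp-1", frozenset())
v2 = Version("stamp-2", frozenset({"stamp-1"}))
print(classify(v2, v1))
```

No clocks are compared here. "Older" and "newer" fall out of set membership alone, which is the sense in which the notion travels with the information.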

As I said, I think the versioning overhead would be fairly low, but 
you're right that it would grow over time, especially if a particular 
user was always coming and going. We might want to have some strategy 
for ancestry trimming; I think it could be done reasonably safely 
depending on what assumptions we're willing to make. For example, if we 
assume that each node has a consistent clock (though not necessarily 
one synchronized with other nodes), we can trim ancestry that is older 
than, say, one hour.
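
One possible reading of that trimming rule, sketched in Python. The assumption here (mine, not something settled above) is that all versions of a given presence are stamped by the same node, so "older than one hour" can be measured against the newest version's own timestamp instead of the local clock. Hashstamp, trim_ancestry and ONE_HOUR are hypothetical names.

```python
from collections import namedtuple

Hashstamp = namedtuple("Hashstamp", ["digest", "timestamp"])

ONE_HOUR = 3600.0  # seconds


def trim_ancestry(stamp, ancestry, max_age=ONE_HOUR):
    """Keep only ancestors stamped within max_age of the version itself.

    Both timestamps come from the originating node's clock, so that clock
    only has to be consistent with itself, not synchronized with anyone.
    """
    cutoff = stamp.timestamp - max_age
    return frozenset(s for s in ancestry if s.timestamp >= cutoff)
```

The cost is that a very late copy of a trimmed version would no longer be found in the ancestry, so a receiver would need some other rule for it, for instance discarding anything stamped before the cutoff.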

Gotta run for now.

Colin



