MC, P2P, distributed collections

Colin Putney cputney at wiresong.ca
Fri Mar 18 03:57:06 UTC 2005


On Mar 17, 2005, at 3:10 AM, Cees de Groot wrote:

> Ok, looks like my list was fairly complete, then. Does anyone mind my 
> thinking aloud? No? Good ;)
>
> Situation:
> - S1: user U asserts on node N that A is true ("U is a member of 
> community X")
> - S2: user U asserts on node N that A is not true ("U is not a member 
> of community X")
> - (user drinks coffee, reconsiders, logs in on the coffeeroom computer)
> - S3: user U asserts on node M that A is true ("U is a member of 
> community X")
> These assertions are broadcast on the network.

Interesting. I hadn't considered that assertions about the same U might 
be coming from different nodes. Do we also have to deal with the 
possibility that two different people will be making different 
assertions about U at the same time?

> Problem:
> - On node K, the assertions can arrive in any random order.
>
> Solution 1: the 'mc solution'
> - S1 comes with H(S1,timestamp)=VS1
> - S2 comes with VS1,H(S2,timestamp)=VS2
> - S3 comes with VS1,VS2,H(S3,timestamp)=VS3
> This is enough information to let node K find out which assertion is 
> the latest one. However, the 'freak' situation:
> - S1 and S2 didn't arrive to the coffeeroom computer in time. So user 
> wasn't a member on the coffeeroom computer and thus the coffeeroom 
> computer accepted the assertion. This is less an exception if you 
> consider the coffeeroom computer being off-line - one of the 
> requirements, of course, is that you can work while you're off-line.
> So:
> - S1 - VS1
> - S2 - VS1,VS2
> - S3 - VS3
> Now, how does node K find out what the latest assertion was? S2 or S3? 
> This can only be broken either by human intervention of having some 
> global notion of time (regarding both S1 and S3 as having a 'null' 
> parent, this becomes a regular merge conflict). Some mechanism of 
> bringing the various node's clocks together to something "close 
> enough" might be "good enough" to help out in solving these conflicts 
> (exchange your local time with your peers; note average deviation; 
> make "Vx=H(x,timestamp),timestamp" and let timestamp be corrected for 
> this deviation. As it is only for coarse timing and exceptional cases 
> plus inter-node time comparisions only (intro-node goes with the 
> ancestry mechanism), we could let clocks run backwards when 
> correcting, etcetera, which makes synchronisation a whole lot easier).

Another method would be to have the nodes do some chatting to resolve 
the conflict. When K receives S3 and detects the conflict, it 
rebroadcasts S2, which is, as far as K knows, the latest status of U. M 
receives S2 and does the same merge, but since it *knows* that U is a 
member of X, it can resolve the conflict. It then broadcasts S4, which 
succeeds S1, S2 and S3. K receives S4, merges it without conflicts, and 
concludes that U is a member of X.

It's a different bandwidth and latency tradeoff, but it should work.


> The solution that you keep a tab on the time per node doesn't seem to 
> make a difference, either.

Right. Also, if the "clock comparison" strategy turns out to be better 
than "chatty ancestry" we can probably manage to do clock comparison 
only between peers, and not require every node to track the clock of 
every other node.

> And people aren't hopping in and out of communities regularly, so 
> carrying the whole version history for just a single assertion around 
> is probably not too much trouble - the usual case is that you have at 
> most 2 assertions (an initial postive assertion and a later 
> retraction; after some time, both will probably be cleaned up). Ditto 
> for MC, probably: you assert "method M has body B", then "method M has 
> body C", ... but for the vast majority of methods this'll only go 
> throuh a four or five revisions.

Agreed.

> Lots of verbosity and thinking aloud - but did I understand you 
> correctly here, Colin?

Yes.

Colin




More information about the Squeak-dev mailing list