Going for the Full Monti (Re: How to improve Squeak)

Wed Jul 14 21:12:24 UTC 2004

changed the order a bit... 

Avi Bryant <avi at beta4.com> wrote:
>  It sounds 
> like the question you're asking is "how do we replace the BFAV", and I 
> was trying to answer "how do we replace the update stream".  I think 
> the answer to the latter will inform the former - except we won't 
> really replace it, just migrate it to a slightly different workflow.
I guess you're right, and it's a good distinction. Replacing the update
stream is almost easy - make a mechanism that polls for new updates from
Doug, eats up the updates automatically one by one, committing after each
one, and saves to a public repository, and you're almost done. We'd still
need to deal with non-code changes to the image, scalability (so it's
practical to merge images-worth of code) and so forth. But it doesn't
seem to require big conceptual changes to me, so it might be a good
first step. 
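
To make that concrete, here is a rough sketch of the loop I have in mind,
in Python rather than Squeak just to keep it short. Everything in it is
made up for illustration: the Repository class, the dict-shaped updates
and the drain_update_stream name are assumptions, not Monticello's API or
the real update stream format.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for a public versioned repository."""
    versions: list = field(default_factory=list)

    def commit(self, snapshot, message):
        # Store a copy so later changes to the image don't alter history.
        self.versions.append((message, dict(snapshot)))


def apply_update(image, update):
    """Apply one update; here an update is just a dict of name -> source."""
    image.update(update["changes"])


def drain_update_stream(image, pending, repo, last_applied):
    """Apply every update numbered above last_applied, in order,
    committing one version per update. Returns the new high-water mark."""
    for update in sorted(pending, key=lambda u: u["number"]):
        if update["number"] <= last_applied:
            continue  # already eaten on an earlier poll
        apply_update(image, update)
        repo.commit(image, f"update {update['number']}: {update['title']}")
        last_applied = update["number"]
    return last_applied
```

Calling drain_update_stream again with the returned number is a no-op, so
it can be run from a simple polling timer. Non-code changes to the image
are exactly the part this sketch ignores.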

The problem of replacing the BFAV is really the problem of integrating
lots of little changes from uncoordinated sources. Seems to me like this
is the harder problem, but also the one that will give us more network
effect gains, which (IMO) is really what we're trying to leverage.

Let me give you an example - I recently thought of an enhancement for
the BFAV - that it should show first all the items you ever replied to,
except those in which you're the last who replied. This way you'd see
right away any progress with an item you're invested in. Of course this
led to the next thing, which is that the rest of the stream could be
sorted using some predictor of "how interesting is this to me", as used
in Bayesian spam filters and such. For a small, coherent project
currently using MC, with 10 distinct contributors, this is irrelevant. For
Squeak, this could help harvest more of the network effect benefits of
many people doing things from their own POV.
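
A minimal sketch of that ordering, in Python for brevity. The item layout
(a dict with a list of repliers in posting order) and the interest
function are my assumptions for illustration, not how the BFAV stores
things; the predictor itself is left as a pluggable function.

```python
def order_items(items, me, interest=lambda item: 0.0):
    """Return items with 'threads waiting on me' first, the rest by interest.

    items:    list of dicts with a 'repliers' list, in posting order
    me:       the current user's name
    interest: hypothetical scoring function, higher means more interesting
    """
    def awaiting_me(item):
        repliers = item["repliers"]
        # I replied at some point, and somebody else replied after me.
        return me in repliers and repliers[-1] != me

    mine = [item for item in items if awaiting_me(item)]
    rest = [item for item in items if not awaiting_me(item)]
    rest.sort(key=interest, reverse=True)
    return mine + rest
```

A Bayesian score trained on what I read or skip could be dropped in as
the interest function without touching the rest.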

> But if you're testing multiple patches at once, what you're really 
> testing is whether or not two patches can coexist.  Which is a good 
> thing to test (and probably deserves a version of its own, with a 
> commit message like "successfully merged changes A and B, had to make 
> minor tweak C"), but you should have tested them in isolation first, 
> surely?
In a big enough system, given finite resources, I think the assumption
"most changes made by different people are independent" is reasonably
valid and really quite important. Most of the testing done on software
isn't intentional and dedicated to a patch; it is simply use of a system
that includes it. We do the distinction work lazily, when bugs actually
happen. Seems to me that requiring a stack discipline in the tools simply 
means that most of the actual, de facto testing will be harder to report 
properly. 

BTW, given a good versioning system, we have more options for tools to 
help do the lazy distinction between bug-causes (found a bug, made a 
test - fork off a process that runs the failing unit test repeatedly, peeling
off versions until it works again, remove the culprit, start adding stuff
back while doing regression testing, then report the functional 
dependencies :-).
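
Roughly, and only as a sketch under my own assumptions (Python for
brevity, and a hypothetical passes function that builds a system from
exactly the given patches and runs the failing test), the peeling could
look like this:

```python
def find_culprit(applied, passes):
    """Peel patches off the end until the test passes, then re-add the rest.

    applied: patches in the order they were loaded
    passes:  hypothetical callable; takes a list of patches and returns
             True if the unit test passes on a system built from them
    Returns (culprit, kept, dependents).
    """
    kept = list(applied)
    peeled = []
    while kept and not passes(kept):
        peeled.append(kept.pop())

    # Nothing failed, or the test fails even with no patches loaded:
    # no single patch can be blamed.
    if not peeled or not passes(kept):
        return None, list(applied), []

    culprit = peeled.pop()  # removing this one made the test pass

    dependents = []
    for patch in reversed(peeled):  # re-add in original load order
        if passes(kept + [patch]):
            kept.append(patch)
        else:
            # Breaks without the culprit: a candidate functional dependency.
            dependents.append(patch)
    return culprit, kept, dependents
```

This assumes a single culprit and a linear load order, which is the
cheap case. It says nothing about bugs that need two patches together.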

> If you were really careful, you would use a "stack discipline" in your 
> day to day development, too. 
If I were really careful, my room would be in order, and I'd never lose
my keys, either ;-)

> That is, the ideal thing from 
> Monticello's point of view would be that, every time you started a new 
> set of changes, you reverted back to the earliest version that could 
> support them and continued from there.  If the next changes you make 
> are unrelated to those, you revert back again and branch.  Then if both 
> of those changes are required to do the next changes, you merge the two 
> resulting versions and use that as the baseline going forward.  But you 
> shouldn't do them linearly, because that implies to the versioning 
> system that the second changes are dependent on the first, and you'll 
> have trouble if you later want to use them on their own.
> Now, I don't do this much in practice, because it's a hassle, and not 
> the way I'm used to working.  But it would be interesting to see if 
> this awareness of the dependence and independence of sets of changes 
> could be integrated into the development environment well enough that 
> it was a natural and pleasant way to work.
Sure, but if you pick up two random patches from the BFAV,
they're much likelier to be independent than otherwise. Assuming that
any two patches have a specific ordering dependency just because I filed 
them in in that order seems unwarranted. BTW, I see how assumptions 
about order are more useful when a single person is developing - then 
there's usually a thread of dependencies between changes made, at least 
if one commits often enough, but I think they apply less when this person 
is acting as integrator doing "random access" on the great pool of floating
patches about anything and everything in Squeak.

> > Also, I might later on have
> > additional comments to add that are related to that patch like "also
> > appears to solve the X bug", even without changing the code.
> 
> Oh, I'm not really suggesting that the commit log be the only source of 
> comments.  We definitely want some kind of bug-tracking system that's 
> connected to all this that is organized by thread, not by version.  But 
> I think the traditional "issue" is more appropriate here than "patch": 
> as in the BFAV, you want to track comments and associated 
> changesets/versions right from the time the bug is reported to the time 
> it's closed, not just the history of a particular changeset.
Could be that separating the code from the problem would do the trick.
But I'm not sure. How do you think this external system would allow us
to know that this "issue" is dealt with in versions x,y,z? It's pretty
easy to detect whether a version's ancestors include a specific patch, so
the concept of a patch seems useful. An issue can also often generate
distinct patches... so they seem like distinct connected concepts to me,
rather than mutually exclusive.
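
The ancestry check I mean is just a walk over the version graph. A small
sketch, in Python for brevity, assuming a hypothetical parents table that
maps each version to the versions it was built from (not Monticello's
actual data structures):

```python
def includes(version, target, parents):
    """True if target is version itself or one of its ancestors.

    parents: hypothetical dict mapping a version id to its parent ids
    """
    seen = set()
    stack = [version]
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current in seen:
            continue  # merges can reach the same ancestor twice
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def versions_addressing(issue_patches, versions, parents):
    """Versions whose ancestry contains at least one patch filed for an issue."""
    return [v for v in versions
            if any(includes(v, p, parents) for p in issue_patches)]
```

Whether "at least one patch" is the right test for "issue dealt with" is
exactly the open question. The point is only that the issue-to-patch
link plus ancestry gets you from an issue to versions x,y,z.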

> Avi