[squeak-dev] Subversion (was: Re: Perl is to CPAN as Squeak is to
cputney at wiresong.ca
Mon Jun 30 14:57:24 UTC 2008
On 28-Jun-08, at 12:45 PM, Andreas Raab wrote:
> Colin Putney wrote:
>> On 28-Jun-08, at 5:27 AM, Claus Kick wrote:
>>> If push comes to shove, I would even say, let's ditch them all and
>>> just use SVN like the rest of the planet (if that is possible). It
>>> is hard enough to sell an image-based language with a real IDE to
>>> the C-style crowd, the package management systems should not add
>>> their grain of salt to the soup.
>> Been there, done that... <shudder/>
>> Monticello was created because this turned out not to be feasible
>> in practice.
> Can you say something more about that? A couple of weeks ago I saw a
> demo at HPI in Potsdam where students used SVN down to the method
> level, and it seemed to me that this approach might very well work
> because the SVN granularity is the same as the in-image granularity.
> It may also be interesting that this wasn't even trying to deal with
> source files of any sort - it retained the nature of the image and
> simply hooked it up directly with SVN. From my perspective this
> looked like an extraordinarily interesting approach that I am
> certain to try out as soon as it is available.
DVS, the precursor to Monticello, stored all the source code to each
package in a single text file. Those files were then versioned using
CVS. The file format was a modified chunk format, with the chunks
sorted to prevent unnecessary textual churn. The usage pattern was to
file out, commit, update and file in.
A large part of the problem came from this two-step process for
dealing with CVS. It was a hassle to keep track of the state of the
image relative to the state of the CVS working copy. It was easy to
make mistakes - commit when the working copy wasn't up to date, develop when the
image wasn't up to date, etc. That would lead to weirdness in the code
that had to be manually sorted out.
Merge conflicts were another problem. The textual merging done by CVS
wasn't smart enough to deal with a lot of the changes that would
happen in development. For example, if two developers each added a
method that sorted similarly, they'd get a textual conflict even
though there was no conflict at the Smalltalk level.
As DVS developed we added functionality to minimize or work around
these issues, until it became clear that it would be less effort to
just keep our own version history and do our own merges. At that point
we ditched CVS and renamed DVS to Monticello.
Now, this idea of using one file per method has come up before, and I
believe it would eliminate many of the difficulties we had with DVS.
Merging methods would get better, for sure. Merging class definitions
would still be a hassle, unless each instance variable, class variable,
and pool import were defined in separate files. If the sources and
changes files were eliminated, that would fix many of the
synchronization problems that we had with DVS, since there would be no
need to manually decide when to synchronize.
Still, I see two big problems with this approach. One is that the
synchronization problems don't entirely go away. What if some other
process modifies the files on disk? How does the image find out about
the change, and what should it do in response? What if the
modification happens while the image isn't running? There are probably
answers to these questions, but I doubt they'll be *good* answers.
The other big problem is that tens of thousands of tiny files is a
horribly inefficient way to store source code. Yes, disk is cheap. But
disk IO is not. I discovered this early in the development of MC2,
when I implemented a type of repository that stored each method in a
separate file. Loading OmniBrowser from that repository involved
opening, reading, and closing over 600 files, and was very slow. I
don't remember the exact timing, but I think it was like 5 to 10
minutes, and in any case it was far too slow. Avi wrote a repository
that stored everything in a single indexed file, and now load time is
dominated by compilation.
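
As an illustration of the single-indexed-file idea, here is a minimal Python sketch. It is not the format of the MC2 repository mentioned above. The layout (an 8-byte header length, a JSON index of name -> offset and length, then the concatenated sources) and the names pack, load_all, load_one and demo are assumptions made for this example. The point is only that loading a whole package costs one open and one sequential read instead of one open/read/close per method.

```python
import json
import os
import struct
import tempfile

def pack(path, methods):
    """Write every method source into one file: [index length][JSON index][data]."""
    index, blobs, offset = {}, [], 0
    for name in sorted(methods):
        data = methods[name].encode("utf-8")
        index[name] = (offset, len(data))   # byte offset and length within the data area
        blobs.append(data)
        offset += len(data)
    header = json.dumps(index).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack(">Q", len(header)))
        f.write(header)
        f.writelines(blobs)

def load_all(path):
    """Load the whole package with a single open and sequential read."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack(">Q", f.read(8))
        index = json.loads(f.read(header_len))
        data = f.read()
    return {name: data[off:off + n].decode("utf-8")
            for name, (off, n) in index.items()}

def load_one(path, name):
    """Fetch a single method by seeking to its recorded offset."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack(">Q", f.read(8))
        off, n = json.loads(f.read(header_len))[name]
        f.seek(8 + header_len + off)
        return f.read(n).decode("utf-8")

def demo():
    """Round-trip two hypothetical methods through a temporary packed file."""
    methods = {"Foo>>bar": "bar\n    ^1", "Foo>>baz": "baz\n    ^2"}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "package.idx")
        pack(path, methods)
        return load_all(path) == methods, load_one(path, "Foo>>baz")
```

The index still allows random access to one method, so the single file does not give up what the per-method layout offered for lookups.
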
A quick doIt in my working image shows 44682 methods. Now imagine that
on startup, the image scans all those files to make sure that all its
compiled methods are up to date. That will take a very, very long time.
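
A rough Python sketch of what such a startup scan might look like, under stated assumptions: one file per method under a root directory, and a manifest of content digests that the image saved when it last ran. The names digest, snapshot, stale_files and demo are hypothetical and do not correspond to anything in Squeak or Monticello. Note that the scan has to open and read every file to detect a change made while the image was not running, so its cost grows with the number of methods.

```python
import hashlib
import os
import tempfile

def digest(path):
    """Content hash of one method file."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()

def snapshot(root):
    """Record a digest for every file under root (one open/read/close per method)."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = digest(path)
    return manifest

def stale_files(root, manifest):
    """Files changed, added, or removed since the manifest was taken."""
    current = snapshot(root)
    changed = {p for p in current.keys() & manifest.keys()
               if current[p] != manifest[p]}
    added = current.keys() - manifest.keys()
    removed = manifest.keys() - current.keys()
    return sorted(changed | added | removed)

def demo():
    """Simulate another process editing one method file between two image sessions."""
    with tempfile.TemporaryDirectory() as root:
        for name, source in [("bar.st", "bar ^1"), ("baz.st", "baz ^2")]:
            with open(os.path.join(root, name), "w", encoding="utf-8") as f:
                f.write(source)
        manifest = snapshot(root)          # what the image knew at shutdown
        with open(os.path.join(root, "baz.st"), "w", encoding="utf-8") as f:
            f.write("baz ^3")              # out-of-band modification
        return stale_files(root, manifest)
```

Comparing modification times instead of digests would avoid reading file contents, but it would still touch every file once per startup.
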