[squeak-dev] Urgent, Spur users please read, was: The Inbox: Kernel-kfr.858.mcz

Frank Shearar frank.shearar at gmail.com
Wed Jun 18 19:53:32 UTC 2014


On 18 June 2014 17:09, Chris Muller <asqueaker at gmail.com> wrote:
>>>> The basic unit we deal with in MC is the MCZ file (or MC version if it's not in a file). With proper branching, this basic unit *carries the information* of which branch it's from.
>>>
>>> Agreed.  As, it does, its full ancestry.
>>
>> Nope. The ancestry does not have a slot to tell which branch each version is from. That info is *intended* to be taken from the version name.
>>
>> Or are you seriously suggesting to parse comments to tell if a version is from a branch? It's *impossible* to tell from just looking at the ancestry - these packages do share a common ancestry with Trunk packages, that's the whole point of branching instead of creating new packages.
>
> Maybe it would help if you identified what use-case you're talking
> about where branches can save someone so much effort?  I was using the
> use-case of setting up a new image as my example, and how a separate
> repository makes branch-tagging unnecessary.  What use-case are you
> referring to and how often does that use-case occur?
>
>> The only other way I could think of (I am trying hard, you see) would be if you maintained a list of version infos that are "known" to be the "roots" of that branch.
>
> Yes!  By starting them off in a separate repository dedicated to that
> branch, the contents of that repository _is_ your list.  You wouldn't
> necessarily have to include all the ancestors but, even if you did,
> every use-case I can think of, from then on, is only concerned with
> the _latest_ versions, which will be for that branch..  No problem!
>
>>  Then by parsing the ancestry you could see if any of these roots are in the ancestry. But besides being expensive to examine, that information necessarily is external to the version itself. It won't be transmitted when you send versions elsewhere. In particular, a trunk image won't know that these versions are special. With a branch name attached, it would.
>
> Hey, at least you're entertaining some creative ideas, thank you!
> This is important because I think diving into branches as a community
> could have long-term repercussions..
>
>>>> By only putting a version in a different repo, the branch info is not attached to the version itself, it cannot be acted upon properly without utmost care by the user. Which *did* lead to a version getting submitted to the wrong repo.
>>>
>>> What do you mean by "wrong repo"?  Someone submitted something to the
>>> Inbox for Spur, _because_ there was no SpurInbox repository for him to
>>> put it.  If there were, then that's where it would have been saved --
>>> because anyone developing Spur will have set up their Monticello UI
>>> with the Spur repositories, making it very hard to accidentally commit
>>> to the wrong repository.
>
> Why did you ignore the above, it's a main point..?
>
>>> If we want to clone all the abilities of Trunk _including_ the Inbox
>>> submission, then we gotta give Eliot all the same resources, e.g., a
>>> SpurInbox too.
>>
>> Note that I did not talk about having different repos.
>
> Okay, so let's talk about it!  I had thought you said they were
> problematic and insufficient.  Why?
>
>> Even if Eliot decides to open a separate inbox for Spur, it would *still* be a good idea to use proper branches.
>
> Why?
>
>>>> And there is *no way to tell*, unless stuff randomly breaking when you load that version is considered fine.
>>>
>>> C'mon Bert, even in a single-repository ecosystem, I'm not buying
>>> that.  This whole event is pretty low-impact, wouldn't you say?
>>
>> No. Not at all low-impact if you want to automate things.
>
> Exactly the opposite.  Use of "branches" is what causes packages to
> be excluded from the new-and-improved club.  The MC method history
> function, for example, which has been available for going on a year
> now -- unfortunately is unavailable to projects that chose to use
> branches because that implicit function was hacked back into FileBased
> only.
>
> getimothy is seeing performance issues and timeouts, hmm, I wonder
> why?  I suspect it's because there are so many files in one
> repository, it's just taking longer and longer to read that directory.
> Combining multiple branches into one repository exacerbates that
> problem.
>
>>> How
>>> often do you go loading an Inbox package without saving your image
>>> first?  Plus, there _is_ a way to tell -- if something is worth
>>> documenting in a version-name, then worth documenting in the
>>> version-comments.  We're talking about 2-4 clicks to check the
>>> ancestry.
>>
>> No, we're talking about parsing that info out of hundreds of files that need to be downloaded from a remote server, vs. checking just their file names. I get the impression you're intentionally avoiding this fact.
>
> Please, what use-case are you talking about here?
>
>> Actually, I don't get this at all. You have been the one wanting to make the MC version name encoding explicit. Which turned out to be a good idea, code became simpler and more readable. And *now* you're suggesting to bury vital information in the comment, instead of the existing, documented field in MCVersionName? Sorry, you lost me there.
>
> I'm suggesting that the number of times one _needs_ to know exactly
> all which versions are for one branch vs. the other is so rare, it's
> not worth incurring the forever-cost of "branches".  I'm really dying to
> know what use-case has you wanting to do that for "hundreds of files",
> on a regular basis?  And why a separate repository wouldn't alleviate
> the pain even IF it's needed on a regular basis?
>
>>> Copying select versions between the two could be set up to be automatic or
>>> deliberate, as needed, but the key is that it's using **existing** MC
>>> code and infrastructure to do that.
>>
>> In contrast to the *existing* branch support?
>
> Yes because branches are currently NOT supported for anything but
> FileBased, and whose function interferes with the simplicity of the
> tools and the abilities of the other MCRepository types.
>
>>> What do you think of Eliot's recently introduced preference,
>>> "Secondary Update URL"?
>>
>> I actually could have used that, two years ago, when I set up an update mechanism for VPRI's Frank project. I instead had to write considerable code to allow using two update streams at once.
>
> My whole question about that was, "Was it a good trade-off?"  Sure,
> it's a useful function for that moment in history, but you ignored the
> *forever-cost* (actually even trimmed it!).
>
>>>> With the evidence we have available now we can conclude that relying on separate repos is not enough. Being explicit about the branch is a Good Idea.
>>>
>>> We should collect evidence when it's set up properly -- the same as trunk.
>>>
>>> PS -- instead of just responding with "why we can't", I really hope
>>> you'll give this proposal a fair shake.
>>
>> See above - I am trying to imagine how this could work reliably without real branching. I can't.
>
> Thanks for your perseverance -- we've put in this much effort, I hope
> we'll find a consensus.
>
>> Leaving aside how many repos Eliot wants to use: being able to tell from the MCZ itself that it's from a different branch is a Good Idea. Think your local package cache. Think sending versions by email. Etc. Let's make this concrete: You're looking at your package cache - how exactly do you tell a version is from the spur branch or trunk? How would a script tell?
>
> I guess I feel that the responsibility for that should not be put on
> MCVersion.  MCVersion should be concerned with the stuff it manages --
> its Snapshot and below (Definitions, etc.).  We shouldn't inject a
> hierarchy into its name just so it will happen to sort conveniently in
> some tool way up there..
>
> Anything concerned with what Versions belong with what branch should
> be managed by the higher-level mechanisms that already manage Versions --
> Repositories and Configurations.
>
> I'm having trouble understanding the contexts of those examples.  What
> am I going to do with my package-cache that has branches saving the
> day for me?  Sending a version in an e-mail?  I would probably send a
> link to a config or a repository.  Even if I did send a MCZ Version in
> an email, the email would state what it's for; any branch tag would be
> redundant..
>
> PS -- And, heck, even just the _contextual_ info (e.g., time and
> author) kinda gives it away anyway!  Frank already said if there's
> more than 2 branches going on simultaneously, that's a problem.

No, what I said was that if you have a hierarchy of branches more
than two levels deep, you have a problem.

So in the degenerate/base case you have one branch, right? "master" in
git, "default" in mercurial, and so on.

You can fork off master, giving you a branch master-a. The name's
arbitrary, and I'm using a naming scheme to highlight parent-child
relationships - what branches fork off what branches?

You can then fork off master-a, giving master-a-a. You can also create
master-a-a-a, master-a-a-a-a and so on. And this is where the wheels
come off. To get master-a-a-a into master, you should really merge
back to master-a-a, then to master-a, and finally to master. It takes
lots of time to merge all the way back to master. Every merge risks
conflicting edits. Deeply nested branches live much longer than ones
off master. That makes them diverge further from master, increasing
the risk of conflicting edits dramatically. And a conflict in merging
master-a will affect master-a-a, master-a-b, and all the other
branches off master-a.
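
To make that cost concrete, here is a rough sketch in Python. It assumes
the illustrative naming scheme above, where each "-x" suffix marks a fork
off the parent; merge_path and affected_by_conflict are hypothetical
helper names, not part of git, mercurial, or Monticello.

```python
def merge_path(branch, root="master"):
    """List the branches a change passes through on its way back to root.

    Assumes the illustrative naming scheme where every "-x" suffix
    denotes a fork off the parent (master-a-a forks off master-a).
    """
    if branch != root and not branch.startswith(root + "-"):
        raise ValueError(f"{branch!r} does not descend from {root!r}")
    path = [branch]
    while branch != root:
        # Drop the last suffix to find the parent branch.
        branch = branch.rsplit("-", 1)[0]
        path.append(branch)
    return path


def affected_by_conflict(parent, branches):
    """Branches that fork, directly or indirectly, off `parent`."""
    return [b for b in branches if b.startswith(parent + "-")]


if __name__ == "__main__":
    path = merge_path("master-a-a-a")
    print(" -> ".join(path))
    print("merges needed:", len(path) - 1)
    print(affected_by_conflict(
        "master-a", ["master-a-a", "master-a-b", "master-b", "master-a-a-a"]))
```

The number of merges grows with nesting depth, and every descendant of a
branch is exposed to a conflict in that branch, which is why depth rather
than breadth is the problem.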

That's not the same as having lots of branches. I _want_ lots of
branches: master-a, master-b, master-c, for as many features as are
currently in focus. I want these hordes of little branches
short-lived, because branching implies divergence, and nothing
prevents divergence better than not living long enough to diverge very
far.

frank

