[squeak-dev] The Inbox: Monticello-cmm.1550112371873461.mcz

Chris Muller asqueaker at gmail.com
Thu Feb 14 22:07:14 UTC 2019


Wow, please relax guys, this was just an invitation to open our minds
and run a "thought experiment" about a crazy idea.  I haven't had time
myself to digest all the implications of the idea well enough to have
developed any "advocacy" for it, so this hurricane of abhorrence and
vehement resistance is unnecessary.  :)  Some of these responses
indicate to me that you may not have thought it all the way through
either, and I'd like to respond to some of them, since I seem to be
the only one actually open to the idea.

Nicolas wrote:
> I don't like this change because it loses a very important thing: READABILITY

I *feel* the same way you guys do about the readability, even though I
can't put my finger on what use-case needs to care about readability.

The tools put everything where it needs to be.  I would only ever
cut-and-paste a VersionName, never type it, so it's not about typing.
What then?

I sense I may just be feeling a "resistance to change," more than
identifying any practical disadvantage due to readability of the
version number.

AND, the number of digits can be further mitigated in any case...

> This is only going to work with 3 to 5 figures in version number.
> With 10 figures, this is ruining our brain and just does not work.

We're dealing with up to four digits today.  If we used
minute-resolution, we would only need to add three more digits for a
total of 7.

7 is only 2 more digits than 5, which you said would work, and if we
would be open-minded to a base-36 number, then we're back down to only
5 alphanumeric digits.
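
A quick way to check these digit counts is to compute them.  The sketch
below is only an illustration, written in Python rather than Smalltalk;
to_base36 is a made-up helper, and it assumes the minutes are counted
from the Unix epoch, which nobody in this thread has specified.  Under
that assumption the decimal minute count comes to eight digits, so
getting down to seven would need a later epoch, while the base-36 form
is five characters.

```python
# Sketch: how many digits a timestamp-based version number needs.
# Assumption: the count starts at the Unix epoch (1970-01-01 UTC).
import string

DIGITS = string.digits + string.ascii_lowercase  # 0-9 then a-z: 36 symbols


def to_base36(n: int) -> str:
    """Render a non-negative integer in base 36 (hypothetical helper)."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, 36)
        out.append(DIGITS[r])
    return "".join(reversed(out))


# Microsecond value taken from the subject line of this thread.
micros = 1550112371873461
seconds = micros // 1_000_000
minutes = seconds // 60

print(len(str(micros)), "decimal digits at microsecond resolution")
print(len(str(seconds)), "decimal digits at second resolution")
print(len(str(minutes)), "decimal digits at minute resolution")
print(len(to_base36(minutes)), "base-36 digits at minute resolution")
```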

I know, it's not _quite_ as nice, but keep in mind, besides total
elimination of duplicates, we also get a *useful encoding* out of
this, the timestamp of the version!  Chris Cunningham had something to
say about that:

> First off we have to decide on a mandatory specific timezone, otherwise we will have commits created before other commits being flagged as 'more current' if they are in the right timezones

Nope, #utcMicroseconds saves the day, since it's the same for
everybody, globally.
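
To illustrate why a UTC-based count does not depend on the committer's
timezone, here is a small Python sketch (not the Squeak implementation;
utc_microseconds is a stand-in name I chose for the example).  The same
instant written down on two differently configured clocks yields the
same number.  It says nothing about clocks that are simply wrong, which
is a separate concern.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_microseconds(moment: datetime) -> int:
    """Microseconds since the Unix epoch for a timezone-aware datetime."""
    delta = moment - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


# The same instant, expressed with two different UTC offsets.
in_utc = datetime(2019, 2, 14, 22, 7, 14, tzinfo=timezone.utc)
in_minus_six = datetime(2019, 2, 14, 16, 7, 14,
                        tzinfo=timezone(timedelta(hours=-6)))

# Both clocks agree on the UTC microsecond count.
print(utc_microseconds(in_utc) == utc_microseconds(in_minus_six))
```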

Marcel:

> Since this directly affects Squeak's build number...

Not adversely.  It's just a number and it's still monotonic.  Okay,
maybe readability again, but again...  in what use-case does that
matter?

> People could never estimate the count of changes between builds/updates again.

I assume you mean count of "versions", since we're pretty bad at
making 1:1 changes:versions.  Neither metric is important to
Monticello, but if you need it for something else (release notes?),
then this helps enable an accurate count by having the DateAndTime
encoded into the name, so just count the number of versions that are
after the date of the last release (which should be a field in the
image).

I don't think there's even any easy way to do the same today
since, as you said, the version number can only provide an estimate.
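
As a rough sketch of that counting, in Python and under assumptions of
my own: the name format follows the subject line of this thread
(package-author.microseconds), and the helper names, the extra version
names and the release date are all made up for the example.

```python
from datetime import datetime, timezone


def saved_at(version_name: str) -> datetime:
    """Decode the save time from a name like 'Monticello-cmm.1550112371873461'."""
    micros = int(version_name.rsplit(".", 1)[1])
    seconds, rest = divmod(micros, 1_000_000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=rest)


def versions_since(version_names, last_release: datetime) -> int:
    """Count the versions whose encoded save time is after the release."""
    return sum(1 for name in version_names if saved_at(name) > last_release)


names = [
    "Monticello-cmm.1550112371873461",  # from the subject line
    "Monticello-abc.1540000000000000",  # made-up version
    "Monticello-xyz.1560000000000000",  # made-up version
]
last_release = datetime(2019, 1, 1, tzinfo=timezone.utc)  # made-up release date

print(versions_since(names, last_release))
```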

> Monotonicity doesn't count as long as you have a tool that can sort out the ancestry.

Monotonicity is needed to identify the proper orphan(s) in the tree
which are the most-recently developed.  There tend to be a lot of
other extra orphans in repositories...
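
A toy sketch of what I mean, in Python and not taken from Monticello's
code: I read "orphan" here as a version that no other version lists as
an ancestor, and the ancestry table, the names and the helpers are all
hypothetical.  The ancestry alone gives you the orphans; the monotonic
number is what picks the most recent one among them.

```python
# Hypothetical ancestry: version name -> list of parent version names.
ancestry = {
    "Pkg-cmm.100": [],
    "Pkg-nice.101": ["Pkg-cmm.100"],
    "Pkg-cbc.102": ["Pkg-nice.101"],
    "Pkg-xyz.101": ["Pkg-cmm.100"],  # abandoned side branch
}


def orphans(ancestry):
    """Versions that no other version lists as an ancestor."""
    parents = {p for ps in ancestry.values() for p in ps}
    return [name for name in ancestry if name not in parents]


def version_number(name):
    """The integer after the last dot of the version name."""
    return int(name.rsplit(".", 1)[1])


def most_recent_orphan(ancestry):
    """Among the orphans, pick the one with the highest version number."""
    return max(orphans(ancestry), key=version_number)


print(sorted(orphans(ancestry)))
print(most_recent_orphan(ancestry))
```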

> As long as we have unique ID, then we should use that for storing and retrieving a package.
> The ID is stored in the ancestry, so it's just a matter of using the ID as filename in the backend rather than the ambiguous package name.
> It's more complex because we have to change our servers and protocols, but it would be the right thing to do.
> I think that you are doing the easy thing and not the right one with your quick and clever hack.

Guys, sometimes you have to accept the reality of a "legacy."
Monticello is a distributed legacy code database with thousands upon
thousands of MCZ files distributed on machines all over the world, and
accessed by Squeak images old and new.  Breaking that legacy with a
"more complex" change, instead of using the leverage afforded by the
domain and staying compatible via a very simple solution, is not only
the wrong thing to do; no one will ever even propose it, since anyone
with that kind of time to invest would build a new system instead.

 - Chris



On Thu, Feb 14, 2019 at 1:17 PM Bert Freudenberg <bert at freudenbergs.de> wrote:
>
> -1
>
> - Bert -
>
>
> On Thu, Feb 14, 2019 at 10:12 AM Chris Cunningham <cunningham.cb at gmail.com> wrote:
>>
>> Hi.
>>
>> I have issues with the cmm proposal - besides the size of the number (which is too big), basing it on timestamp is going to be a problem.  First off we have to decide on a mandatory specific timezone, otherwise we will have commits created before other commits being flagged as 'more current' if they are in the right timezones - and we have enough widely spaced developers/committers that this WILL happen.
>>
>> Further, with the timezones, getting all of our dev machines (and/or all of the repository machines) synced up with the same time and right timezone offset rules is HARD.  From personal experience, even with a set of Unix boxes all set up to sync off the same timeserver, they still drift out of sync with each other.  It is a mess.
>>
>> However, the idea of "using the ID as filename in the backend" I don't really like, either.  I use file directories locally for development, and having them all stored as an unintelligible name there will cause me angst in the short term.  Further, I would assume that passing files via email, say, we'd still keep the 'backend' name as well?  Or the other artificial 'name'?
>>
>> Also, this would mean that we'd need to patch all of our publicly facing repositories, right?  The Squeak ones are definitely doable; the ones located in other companies get weirder; personal 'public' repositories are definitely weirder, especially if their owners have moved on to other pursuits.  And then there is SmalltalkHub as well.  Of course, maybe I'm just overworrying about this part.
>>
>> -cbc
>>
>> On Thu, Feb 14, 2019 at 1:08 AM Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com> wrote:
>>>
>>> Hi Chris,
>>> I don't like this change because it loses a very important thing: READABILITY
>>>
>>> Monotonicity doesn't count as long as you have a tool that can sort out the ancestry.
>>> Since we always browse the versions thru some MC tools it's superfluous.
>>> Monotonicity is mainly for helping us poor humans to quickly identify the relationship between two packages in the graph.
>>>
>>> This is only going to work with 3 to 5 figures in version number.
>>> With 10 figures, this is ruining our brain and just does not work.
>>> Please revert or put in inbox purgatory while we have a chance to discuss it.
>>>
>>> As for uniqueness, this is a small problem.
>>> As long as we have unique ID, then we should use that for storing and retrieving a package.
>>> The ID is stored in the ancestry, so it's just a matter of using the ID as filename in the backend rather than the ambiguous package name.
>>> It's more complex because we have to change our servers and protocols, but it would be the right thing to do.
>>> I think that you are doing the easy thing and not the right one with your quick and clever hack.
>>>
>>> On Thu, Feb 14, 2019 at 05:23, Chris Muller <ma.chris.m at gmail.com> wrote:
>>>>
>>>> Hi Eliot,
>>>>
>>>> > > On Feb 13, 2019, at 7:13 PM, Chris Muller <asqueaker at gmail.com> wrote:
>>>> > >
>>>> > > What are the two most-important properties we want from our
>>>> > > versionNumber?  Monotonicity and uniqueness.  The current scheme only
>>>> > > provides the former, this uses DateAndTime now utcMicroseconds to
>>>> > > provide the latter, too.  As a bonus it also happens to encode the
>>>> > > save timestamp into the VersionName, so available without having to
>>>> > > open the file.
>>>> > >
>>>> > > I admit it looks intimidating given what we're used to seeing, but
>>>> > > what of the added safety and utility?
>>>> >
>>>> > It is trumped by the illegibility.
>>>>
>>>> Not as bad as it appears, since the high-order digits will be the same
>>>> between version #'s, plus, second-resolution should be sufficient, so
>>>> versions in a list would actually look like this:
>>>>
>>>>     Monticello-cmm-1550203798
>>>>     Monticello-cmm-1550117398
>>>>     Monticello-cmm-1550030998
>>>>
>>>> Whilst still retaining all of the utility.  Maybe even a setting in
>>>> the tools could hide the high-order digits in the UI if we wanted...
>>>> We're already into 4 digits in our version #'s anyway so....
>>>>
>>>> > When was the discussion around this change?
>>>>
>>>> You're participating in it now.   :)
>>>>
>>>> There was another change earlier today that you may be interested
>>>> in asking that question about too, since it replaced the 19-year-old
>>>> SequenceableCollection>>#= with a one-day-old version and actually
>>>> went into trunk.  This one is in the Inbox.
>>>>
>>>> > I’ve been out of things (apologies) but I find this change quite horrible.
>>>>
>>>> I understand this initial gut reaction, but I hope you'll think and
>>>> sleep on it, and help think about the problem and some alternative
>>>> solutions you like better.  VersionName uniqueness is important for
>>>> the Monticello model.
>>>>
>>>> Best,
>>>>   Chris
>>>>
>>>
>>
>

