[Vm-dev] the purpose of CI

Ben Coman btc at openinworld.com
Sat Jun 3 16:18:38 UTC 2017

On Fri, Jun 2, 2017 at 7:36 AM, Eliot Miranda <eliot.miranda at gmail.com>

> Hi Ben,
>     reading forward it seems I've miscommunicated what I mean by an
> experiment.  By experiment I mean a VM configuration not used for
> production, such as the Sista and LowCode VMs.  I visit this below.

ah, gotcha.  I was thinking more along the lines of short-lived experiments
where things might break before the impact is fully understood.

> On Thu, Jun 1, 2017 at 10:40 AM, Ben Coman <btc at openinworld.com> wrote:
>> On Thu, Jun 1, 2017 at 10:38 PM, Eliot Miranda <eliot.miranda at gmail.com>
>> wrote:
>>> Hi Tim,
>>> On May 31, 2017, at 11:53 PM, Tim Felgentreff <timfelgentreff at gmail.com>
>>> wrote:
>>> Hi,
>>> Just re the discussion of dev and stable branch, the original idea was
>>> that Cog is dev and master is stable. We never expected that people would
>>> use or recommend the Cog bintray builds for anything other than development.
>>> But "master" ended up so far behind, recent changes would not get much
>> testing leading up to Pharo release...
>> $ git log master
>> 17 Aug 2016
>> You want to *encourage* people to use more recent VMs to get more
>> feedback earlier.
> Yes, but I disagree that the issue is the master/Cog distinction.  The
> problem is that there is no mechanism to help remind us when to advance
> things to master.  If that's present then we have no problems using the
> current structure.

Okay. If master is kept reasonably up to date, then this concern goes away.

> I feel the only problem is that we need someone who merges to master when
>>> it is green. I think we have already protected the master branch in the way
>>> Ben suggested, i.e., you can only open a PR and merge it if the Travis
>>> build is all green.
>>> I can do this.  Ideally it would be either automatic or prompted.  What
>>> I mean is that there should be a set of tests that are relevant, run on images
>>> using the production subset of the VMs built from the Cog branch. Whenever
>>> the tests are all green then either I get sent an email prompting me to
>>> push to master, or a push to master occurs.
>> I think it would be useful to differentiate between different levels of
>> stability depending on application and personal perspective.
>>    A. personal-stable -- stable "enough" for developer/student to use
>> personally on their desktop -- might be optimistic if it passes all tests
> Sure.  So the mechanism needed is running at least the test suite on a new
> VM right?

Yes. This already happens on the Cog branch doesn't it?  It just hasn't
seemed a priority to keep it green.

>>    B. production-stable -- "bulletproof" for operating machinery and
>> business systems - maybe when A has been in wide use for a while
> Not sure we can do much more here than run the tests.  What we can insist
> on is that a master commit should pass the tests on all supported platforms
> x all production VM configurations.

Just an aside: A dream scenario would be something like PharoLauncher also
managing VMs as well as Images, gathering statistics on {OS. VM. Image}
tuples started and crashed, reporting aggregate statistics to users to
facilitate users (ranged across the technology adoption bell curve) to
organically identify particular builds as real-world-tested-stable.  But
I'm not sure how workable that would be.
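
As a rough illustration of the aggregation described above, here is a minimal
Python sketch. Nothing like this exists in PharoLauncher as far as this thread
says; the function name, the event layout and the sample reports are all
invented for the example.

```python
from collections import defaultdict


def crash_rates(events):
    """Aggregate launch reports into a crash rate per (os, vm, image) tuple.

    Each event is (os_name, vm_build, image, crashed), where crashed is a bool.
    """
    starts = defaultdict(int)
    crashes = defaultdict(int)
    for os_name, vm_build, image, crashed in events:
        key = (os_name, vm_build, image)
        starts[key] += 1
        if crashed:
            crashes[key] += 1
    # Every key in starts has at least one launch, so the division is safe.
    return {key: crashes[key] / starts[key] for key in starts}


# Invented sample reports; the build ids and image names are placeholders.
sample_events = [
    ("Linux x64", "vm-build-1", "image-a", False),
    ("Linux x64", "vm-build-1", "image-a", False),
    ("Linux x64", "vm-build-2", "image-a", True),
    ("Linux x64", "vm-build-2", "image-a", False),
]
rates = crash_rates(sample_events)
```

A build whose rate stays low across many launches would be a candidate for
"real-world-tested-stable"; how many launches count as enough is left open.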

>> btw, it occurs to me that "master" is a bit ambiguous.  Perhaps for (B.)
>> "master" could be renamed "production"
>> https://stevebennett.me/2014/02/26/git-what-they-didnt-tell-you/
> Well, what's in a name?  "master" is fine for production.  Tagging a
> particular master commit with a release id tag would be the necessary extra
> no?

>> and for (A.) maybe introduce "stable" or alternatively "validated" to
>> mean all tests passed without the strong implication it is "stable".
> Well, if the build and test steps are separated then there would hopefully
> be a page with green "tests passed" entries containing links to the VMs that
> passed the tests, no?

okay. just floating ideas.

> Maybe I can get into the habit of checking the status of the build a few
>>> hours after a commit.  But a generated email would compensate for my, um,
>>> it's on the tip of my tongue, um, my, my memory!  And an automated push
>>> would allow me to resume walking in front of buses.
>>> The bintray deployment should not be taken as a source of stable builds.
>>> It is meant to be used by what Eliot calls brave souls who want to help to
>>> test the latest and possibly unstable changes.
>>> Good.  This makes perfect sense to me.  Are there places in the
>>> configuration to add brief overview texts explaining this to the bintray
>>> download pages?  It would be great to have a short paragraph that says
>>> these are development versions and directs to the master builds.
>>> P.S. for master builds Gilad has noticed that there is no .msi for the
>>> newspeak builds (and I suspect there may be no .dmg).  In e.g.
>>> build.win32x86/newspeak.cog.spur/installer is code to make the .msi for
>>> a newspeak vm.  And the corresponding thing exists for making the Mac OS
>>> .dmg.  Any brave souls feel up to trying to get them to be built?
>>> Just my 2c
>>> Tim
>>> On Thu, 1 Jun 2017, 06:05 Ben Coman, <btc at openinworld.com> wrote:
>>>> On Thu, Jun 1, 2017 at 2:27 AM, Nicolas Cellier <
>>>> nicolas.cellier.aka.nice at gmail.com> wrote:
>>>>> 2017-05-31 17:31 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:
>>>>>> Hi All,
>>>>>> > On May 31, 2017, at 1:54 AM, K K Subbu <kksubbu.ml at gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > On Wednesday 31 May 2017 12:35 PM, Esteban Lorenzano wrote:
>>>>>> >>>> On 31 May 2017, at 09:01, K K Subbu <kksubbu.ml at gmail.com>
>>>>>> wrote:
>>>>>> >>>>
>>>>>> >>>> On Wednesday 31 May 2017 12:18 PM, Esteban Lorenzano wrote:
>>>>>> >>>> 1) We need a stable branch, let’s say is Cog 2) We also need a
>>>>>> >>>> development branch, let’s call it CogDev
>>>>>> >>> IMHO, three branches are required as a minimum - stable,
>>>>>> >>> integration and development because there are multiple primary
>>>>>> >>> developers in core Pharo working on different OS platforms.
>>>>>> >> but nobody will do the integration step so let’s keep it simple:
>>>>>> >> integration is made responsibly for anyone who contributes, as it
>>>>>> is
>>>>>> >> done now.
>>>>>> >
>>>>>> > I proposed only three *branches*, not people. Splitting development
>>>>>> into two branches and builds will help in isolating faster (separation of
>>>>>> concerns). If all issues get cleared in dev branch itself, then integration
>>>>>> branch will still be useful in catching regressions.
>>>>>> I don't believe this.  Since the chain is VMMaker.oscog =>
>>>>>> opensmalltalk/vm => CI, clumping commits together when pushing from, say,
>>>>>> CogDev to Cog doesn't help in identifying where things broke in VMMaker.
>>>>>> This is why Esteban has implemented a complete autobuild path run on each
>>>>>> VMMaker.oscog commit.
>>>>>> But, while this is a good thing, it isn't adequate because
>>>>>> a) important changes are made to opensmalltalk/vm code independent of
>>>>>> VMMaker.oscog
>>>>>> b) sometimes one /has/ to break things to have them properly tested
>>>>>> (e.g. the new compactor).  i.e. there has to be a way of getting some
>>>>>> experimental half-baked thing through the build pipeline so brave souls can
>>>>>> test it
>>>>>> > I will defer to your experience. I do understand the difference
>>>>>> between logical and practical in these matters.
>>>>>> Let's take a step back and instead of discussing implementation,
>>>>>> discuss design.
>>>>>> For me, a VM is good not when someone says it is, not when it builds
>>>>>> on all platforms, but when extensive testing finds no faults in it.  For me
>>>>>> this implies tagging versions in opensmalltalk/vm (which by design index
>>>>>> the corresponding VMMaker.oscog because generated source is stamped with
>>>>>> VMMaker.oscog version info) rather than using branches.
>>>>>> Further, novel bugs are found in VMs that are considered good, and
>>>>>> these bugs should, if possible, be added to a test suite.  This points to a
>>>>>> major deficiency in our ability to test VMs.  We have no way to test the
>>>>>> UI automatically.  We have to use humans to produce mouse clicks and
>>>>>> keystrokes.  For me this implies tagging releases, and the ability to state
>>>>>> that a given VM supersedes a previous known good VM.
>>>>> I just want to interject to check everyone's understanding of git
>> branches.  Although I haven't used SVN, what I've read indicates SVN
>> concepts can be hard to shake and git branching is conceptually very
>> different from SVN.
>> The key thing is "a branch in Git is simply a lightweight movable pointer
>> to a commit."
>> The following article seems particularly useful to help frame our
>> discussion...
>> https://git-scm.com/book/en/v1/Git-Branching-What-a-Branch-Is
> The commit is related to a graph of previous commits, right?  So a branch
> implies a distinct set of commits, right?

As a second order effect yes, a branch pointing at a single commit implies
the set of all its ancestors.  I just found that page particularly
enlightening... "Because a branch in Git is in actuality a simple file that
contains the 40 character SHA-1 checksum of the commit it points to,
branches are cheap to create and destroy. Creating a new branch is as quick
and simple as writing 41 bytes to a file (40 characters and a newline)."
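
The 41-byte claim in that quote can be checked with a small Python sketch. It
writes a file laid out like a loose branch ref into a throwaway directory; it
does not touch a real repository, and the commit id is just the SHA-1 of
arbitrary bytes, used only because it has the right shape.

```python
import hashlib
import os
import tempfile


def write_branch_ref(repo_dir, branch, commit_sha):
    """Write a file shaped like a loose branch ref: 40 hex chars plus a newline."""
    ref_path = os.path.join(repo_dir, "refs", "heads", branch)
    os.makedirs(os.path.dirname(ref_path), exist_ok=True)
    # newline="\n" keeps the line ending a single byte on every platform.
    with open(ref_path, "w", newline="\n") as f:
        f.write(commit_sha + "\n")
    return ref_path


# Hypothetical commit id, not a commit from opensmalltalk/vm.
fake_sha = hashlib.sha1(b"example commit").hexdigest()
toy_repo = tempfile.mkdtemp()
ref = write_branch_ref(toy_repo, "Cog", fake_sha)
ref_size = os.path.getsize(ref)
print(ref_size)  # 41
```

Moving the branch means overwriting those 41 bytes with a different commit id,
which is why branches are so cheap compared with SVN-style copies.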

> When one merges from a branch, the commits on the branch don't get added to
> the target [branch] into which one merges, do they?

I've had trouble parsing that.  On one hand, it might be considered
that commits on the source branch *are* added to the target branch.
On the other hand, it sounds like you concur that branches aren't a set of
commits (since git under the hood has no "sets") - just the code of two
specific commits being merged to create a new commit, to which the target
branch reference is then moved.
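
Both readings can be seen in a toy model, sketched here in Python. This is a
simplified picture of a true (non-fast-forward) merge, not git's
implementation; the Commit class and the commit names are invented for the
example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    name: str
    parents: tuple = ()


def ancestors(commit):
    """Names of the commit itself plus everything reachable through its parents."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current.name not in seen:
            seen.add(current.name)
            stack.extend(current.parents)
    return seen


def merge(branches, target, source):
    """Create a two-parent merge commit and move only the target branch to it."""
    merged = Commit(f"merge-{source}-into-{target}",
                    (branches[target], branches[source]))
    branches[target] = merged
    return merged


root = Commit("A")
b = Commit("B", (root,))
c = Commit("C", (root,))
branches = {"master": b, "Cog": c}  # a branch is just a name pointing at one commit

before = ancestors(branches["master"])
merge(branches, "master", "Cog")
after = ancestors(branches["master"])
```

Here `before` lacks "C" and `after` contains it, so the source commits do
become reachable from the target, yet nothing was copied: one new commit was
created and one pointer moved, while "Cog" still points at C.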

> Isn't that the difference between pull and pull -a, that the former just
> pulls commits on a single branch while pull -a pulls commits on all
> branches?

Sorry I don't know.  I never use `git pull`, rather always `git fetch` +
`git merge` after some advice I read to get more flexibility and a chance
to review changes first.
I searched extensively but really could only find regurgitations of this,
copied from the man page...
   "-a --append Append ref names and object names of fetched refs to the
existing contents of .git/FETCH_HEAD. Without this option old data in
.git/FETCH_HEAD will be overwritten."
which I don't understand.

>> A branch is much the same as a tag, they are both references to a particular
>> commit, except
>> * branches are mutable references
>> * tags are immutable references
>> http://alblue.bandlem.com/2011/04/git-tip-of-week-tags.html
>> So if you want a moveable "good-vm" tag, maybe what you need is a branch
>> reference.
> But isn't that what master is supposed to be?

It seems a better process for master is underway, so okay.

> And the previous paragraph applies equally to performance improvements,
>>>>>> and functionality enhancements, not just bugs.
>>>>>> Test suites and build chains catch regressions.  Regressions in
>>>>>> functionality and in performance are _useful information_ for developers
>>>>>> trying to improve things, not necessarily an evil to be avoided at all
>>>>>> costs.
>>>>> Agreed. But you want to deal with your own regressions not other
>> people's.  It seems harder to apply an "if you break it, you fix it"
>> philosophy if it's always broken.
> To reiterate, distinguishing between the production set and the
> experimental set is what's necessary here.

After being a bit slow, now got it.

> The system must allow pushing an experiment through the build and test
>>>>>> pipeline to learn of a piece of development's impact.
>>>>> IIUC, experimental branches (and PRs!!) can run through the CI
>> pipeline identical to the Cog branch (except maybe deployment step).  There
>> seems no benefit here for needing to commit directly to the Cog branch when
>> a PR-commit would work the same.
> No, not experimental branches.  Experimental configurations.  Sista and
> LowCode are currently experiments.  We don't care if these are broken;
> they're not part of the production VM set yet.  So Clément, Ronie and
> myself should be able to modify, including break, these configurations
> without hindrance, and without generating noise for the community.

okay. got it.
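
The production/experimental split could feed the "green means promote to
master" check roughly as in the Python sketch below. It is only an
illustration: which configurations count as production is an assumption made
here, "sista-experiment" is a placeholder name, and how results would be
fetched from the CI service is left out entirely.

```python
from itertools import product

# Platform names are the ones listed elsewhere in this thread.
PLATFORMS = ["Mac OS X", "win32", "Linux x64", "Linux ARM"]
# Assumed production set; the real set is for the VM maintainers to define.
PRODUCTION = ["squeak.cog.spur", "pharo.cog.spur", "newspeak.cog.spur"]


def can_promote(results, platforms=PLATFORMS, production=PRODUCTION):
    """True only if every (platform, production config) pair passed.

    results maps (platform, config) -> bool. Missing results count as
    failures; results for non-production configs are never consulted.
    """
    return all(results.get(pair, False) for pair in product(platforms, production))


all_green = {pair: True for pair in product(PLATFORMS, PRODUCTION)}

# A red experimental configuration does not block promotion...
with_broken_experiment = dict(all_green)
with_broken_experiment[("Linux x64", "sista-experiment")] = False

# ...but a single red production configuration does.
with_broken_production = dict(all_green)
with_broken_production[("win32", "pharo.cog.spur")] = False
```

With this shape, `can_promote(with_broken_experiment)` is true and
`can_promote(with_broken_production)` is false, which is the "break without
hindrance, without noise" behaviour asked for above.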

>>> An experiment may have to last for several months (for several reasons;
>>>>>> the new compactor is a good example: some bugs show up in unusual
>>>>>> circumstances; some bugs are hard to fix).
>>>>>> Another requirement is to provide a stable point for someone to begin
>>>>>> new work.  They need to know that their starting point is not an experiment
>>>>>> in progress. They need to understand that the cost of working on what is
>>>>>> effectively a branch from the trunk is an integration step(s) into trunk
>>>>>> later on, and this can't be just at the opensmalltalk/vm level using git to
>>>>>> assist the merge, but also at the VMMaker.oscog level using Monticello to
>>>>>> merge.  Both are good at supporting merges because both support identifying
>>>>>> the set of changes.  Both are poor at supporting merges because they don't
>>>>>> understand refactoring and currently only humans can massage a set of
>>>>>> changes forwards applying refactorings to a set of changes.  This is what
>>>>>> real merges are, and the reason why git only eases the trivial cases and
>>>>>> why real programmers use a lot more tools to merge than just a vcs.
>>>>>> Can others add additional requirements, or critique the above
>>>>>> requirements?  (Try not to mention git or ci implementations when you do).
>>>>>> ======
>>>>>> With the above said what seems lacking to me is the testing framework
>>>>>> for completed VMs.  A build bot can identify commits that fail a build and
>>>>>> also produce a VM for subsequent packaging and/or testing.  Separating the
>>>>>> steps is very useful here.  A long pipeline with a single red or green
>>>>>> light at the end is much less useful than a series of short pipelines, each
>>>>>> with a separate red or green light.  Reading through a bot log to identify
>>>>>> precisely where things broke is both tedious and, more importantly, not
>>>>>> useful in an automated system because that identification is manual.
>>>>>> Separate short pipelines can be used to inform an automatic system (right
>>>>>> Bob? Bob Westergaard built the testing system at Cadence and that is
>>>>>> constructed from lots of small steps and it isolates faults nicely;
>>>>>> something that an end-to-end system like Esteban's doesn't do as well).
>>>>>> Now, if we have a long sequence of nicely separated generate, build,
>>>>>> package, test steps how many separate pipelines do we need to be able to
>>>>>> collaborate?  Is it enough to be able to tag an upstream artifact as having
>>>>>> passed some or all of its downstream tests or do we need to be able to
>>>>>> duplicate the pipeline so people can run independent experiments?
>>>>>> For me, I see two modes of development; new development and
>>>>>> maintenance.  New development is fine in a fork in some subset of the full
>>>>>> build chain.  e.g. when working on Spur I forked within VMMaker.oscog (and,
>>>>>> unfortunately, in part because we didn't have opensmalltalk/vm or many of
>>>>>> the above requirements discussed, let alone met, I would break V3 for much
>>>>>> of the time). e.g. the new compactor was forked in VMMaker.oscog without
>>>>>> breaking Esteban's chain by my using a special generation step controlled
>>>>>> by a switch I set in my branch.  I tested in my own sandbox until the new
>>>>>> compactor needed testing by a wider audience.
>>>>>> Maintenance is some relatively quick fix one (thinks one) can safely
>>>>>> apply to either VMMaker.oscog or opensmalltalk/vm trunk to address some
>>>>>> issue.
>>>>>> Forking is fine for new development if
>>>>>> a) people understand and are prepared to pay the cost of merging, or,
>>>>>> better,
>>>>>> b) they can use switches to include their work as optional in trunk
>>>>>> There are lots of switches:
>>>>>> A switch between versions in VMMaker.oscog, e.g. Spur memory manager
>>>>>> vs V3, or the new Spur compactor vs the old, or the Sista JIT vs the
>>>>>> standard, etc
>>>>>> A switch between a vm configuration, e.g. pharo.cog.spur vs
>>>>>> squeak.cog.spur in a build directory, which can do any of
>>>>>> - select a generated source tree (e.g. spursrc vs spur64src)
>>>>>> - use #ifdef's to select code in the C source
>>>>>> - use plugins.int & plugins.ext to select a set of plugins
>>>>>> A switch between dialects (Pharo vs Squeak vs Newspeak)
>>>>>> A switch between platforms (Mac OS X vs win32, Linux x64 vs Linux ARM)
>>>>>> I get the above distinctions and know how to navigate amongst them
>>>>>> upstream, but don't understand very well the downstream (how to clone the
>>>>>> build/test CI pipeline so I can cheaply fork, work on the branch and then
>>>>>> merge). So I'm happier using switches to try and hide new work in trunk to
>>>>>> avoid derailing people.  And so I prefer the notion of a single pipeline
>>>>>> that tags specific versions as good.
>>>>>> Is one of the requirements that people want to clearly separate
>>>>>> maintenance from new development?
>> This diagram may be a good reference for discussion, of how maintenance
>> hotfixes can relate to development branches.
>> http://1.bp.blogspot.com/-ct9MmWf5gJk/U2Pe9V8A5GI/AAAAAAAAAT
>> 0/0Y-XvAb9RB8/s1600/gitflow-orig-diagram.png
> Makes perfect sense.  The thing this doesn't mention is the ability to
> have experimental configurations live alongside the code being worked on
> for a stable release.

I guess experimental configurations are orthogonal - more of a build
issue.  Ideally adding features to an experimental configuration would
still be done using a PR from a feature branch so the CI tests ensure no
problems leaked out of their #ifdefs, but without that, experimental
configurations are still a good step.

cheers -ben

>>>>>> Is one of the requirements that people want to clearly identify which
>>>>>> commit caused a specific bug? (Big discussion here about major, e.g. V3 =>
>>>>>> Spur transitions vs small grain changes; you can identify the latter, but
>>>>>> not necessarily the former).
>>>>> I suppose what I'm asking is what's the benefit of an all green
>>>>>> build?  For me a tested, versioned and named artefact is more useful than an
>>>>>> all green build.  An all red build is a red flag.  A mostly green build is
>>>>>> simply a failure to segregate production from in development artefacts.
>>>>> Hi Eliot,
>>>>> the main advantage of github is the social thing:
>>>>> - lower barrier of contributing via a better integration of tools
>>>>>  (not only vcs, but issue tracker, wiki, continuous integration, code
>>>>> review/comments and pull request - even if we under use most of these
>>>>> tools),
>>>>> - and ease integration of many small contributions back.
>>>>> For this to work well, such work MUST happen in separate branches.
>>>>> in this context, there is an obvious benefit of green build: quickly
>>>>> estimate if we can merge a pull request or not.
>>>>> when red, we have no information about possible regressions, and have
>>>>> to go through the tedious part: go down into the console log of both
>>>>> builds, try to understand and compare... There is already enough work
>>>>> involved in reviewing source code.
>>>> I see that my opening argument was simplistic.  However Nicolas' point
>>>> above is probably more significant.
>>>> If we want to encourage new contributors, we need:
>>>> * to show that the CI builds are cared for
>>>> * allow newcomers to be confident that the tip they are working from is
>>>> green before they start.  When they submit their PR and the CI tests fail,
>>>> they should be able to zero in on the failures *they* caused and *as*a*newbie*
>>>> not have to sort through the confounding factors from others' failures.
>>>> * act timely to integrate, to encourage further contributions.  If
>>>> someone contributes a good fix, a green CI test may make you inclined to
>>>> quickly review and integrate. But when the CI shows failure, how will you
>>>> feel about looking into it? Further, when the mainline returns to green,
>>>> the existing PRs don't automatically retest, and no-one seems to be
>>>> manually managing them, so such PRs seem to end up in limbo which is
>>>> *really* discouraging for potential contributors.
>>>> cheers -ben
>>>>> I tend to agree on your view for mid/long term changes:
>>>>> Say developer A works on a new garbage collector, developer B on
>>>>> 64-bit compatibility, developer C on the lowcode extension and developer D on
>>>>> sista (though maybe there is a single developer touching 3 of these)
>>>>> Since each of these devs are going to take months, and touch many core
>>>>> methods scattered in interpreter/jit/object memory or CCodeGenerator, then
>>>>> it's going to be very difficult to merge (way too many conflicts).
>>>>> If on different branches, there is the option to rebase or merge with
>>>>> other branches. But it doesn't scale with N branches touching same core
>>>>> methods: N developers would have to rebase on N-1 concurrent branches,
>>>>> resolve the exact same conflicts etc... Obviously, concurrent work would
>>>>> have to be integrated back ASAP in a master branch.
>>>>> So, a good branch is a short branch, if possible covering a minimal
>>>>> feature set.
>>>>> And long devs you describe must not be handled by branches, but by
>>>>> switches.
>>>>> This gives you a chance to inspect the impact of your own refactoring
>>>>> on your coworkers.
>>>>> In this model, yes, you have a license to break your own artifact (say
>>>>> generationalScavenger, win64, lowcode, sista).
>>>>> But you must be informed if ever you broke the production VM, and/or
>>>>> concurrent artifacts. You have to maintain a minimal set of features
>>>>> working, otherwise you prevent others from working. In the scavenger case, you
>>>>> used a branch for a short period, and that worked quite well.
>>>>> In this context, I agree, a single green light is not enough.
>>>>> We need a sort of status board tracing the regressions individually.
>>>>>> > Regards .. Subbu
> --
> _,,,^..^,,,_
> best, Eliot