[Vm-dev] VM stability / unit tests

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Fri Mar 30 21:46:53 UTC 2018


Hi Phil,

that's probably right: there is a lack of smoke tests.

There are several hurdles before we reach today's state of the art wrt
continuous delivery and regression testing:

- 1) the artefacts must be built, or we won't even have a chance to run
tests
  we can observe that the builds have been broken too many times by all
  sorts of problems, and green status is an exception in an ocean of red
  problems encountered so far include
  * work in progress in core VM or plugins
  * wrong configuration of pharo target directories or credentials
    (this was the case for most of 2017 but is fortunately fixed now)
  * stale or intermittent links (url)
    for example, the build loads stuff from the network (like cygwin
    updates) and that sometimes fails
  * failure to build a library due to some tool changes at appveyor/travis
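
For the intermittent links, one possible mitigation is to retry a failed
download a few times before declaring the build broken. Below is a minimal
sketch in Python, not what the current build scripts do: the names `retry`
and `fetch` are made up for illustration, and the attempt count and delays
are arbitrary.

```python
import time


def retry(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts are exhausted.

    fetch is any zero-argument callable that performs the download
    (hypothetical; the real build step would go here).
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as error:
            # Network failures (timeouts, refused connections,
            # urllib.error.URLError) are all subclasses of OSError.
            last_error = error
            if attempt < attempts - 1:
                # Exponential backoff: base_delay, 2*base_delay, 4*base_delay...
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

This only hides transient failures; a link that is really stale will still
fail after the last attempt, which is what we want to be told about.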

Introduction of new bugs could be prevented if the feedback was reliable
(no false alarms).
But that has not really been the case so far (a lot of noise).

- 2) we are chasing too many hares at once, that is a combination of
  v3 stack spur, i386 x86_64 ARM, linux Windows MacOS, sista lowcode,
  Squeak Pharo Newspeak
  I certainly forgot threaded FFI in the above list, plus the register
  efficient JIT variants...

Again, breaking any single one of these configurations leads to RED status.
Not all these configurations are at the same level
- of importance (fewer users, not used in production, ...)
- of maturity (in progress, experimental, or in production)
So we must find a way to prioritize and focus on production artifacts...
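
To make the prioritization idea concrete, here is a rough sketch in Python
of a status that only turns RED when a production configuration breaks,
while failures of experimental ones are reported separately. The tier
names, the function `overall_status` and the configuration names in the
usage lines are hypothetical; the real ranking would have to be agreed on.

```python
PRODUCTION = "production"      # hypothetical tier names
EXPERIMENTAL = "experimental"


def overall_status(results, tiers):
    """Summarize a build matrix.

    results: configuration name -> True if that build passed
    tiers:   configuration name -> PRODUCTION or EXPERIMENTAL
    A configuration missing from tiers is treated as production,
    so nobody escapes the check by accident.
    """
    broken_production = sorted(
        name for name, ok in results.items()
        if not ok and tiers.get(name, PRODUCTION) == PRODUCTION
    )
    broken_other = sorted(
        name for name, ok in results.items()
        if not ok and tiers.get(name, PRODUCTION) != PRODUCTION
    )
    status = "RED" if broken_production else "GREEN"
    return status, broken_production, broken_other


# Illustrative usage with made-up configuration names.
results = {"squeak-spur-linux-x86_64": True, "pharo-sista-linux-i386": False}
tiers = {"squeak-spur-linux-x86_64": PRODUCTION,
         "pharo-sista-linux-i386": EXPERIMENTAL}
status, broken_production, broken_other = overall_status(results, tiers)
```

The `allow_failures` setting of the Travis build matrix expresses roughly
the same idea on the CI side.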

- 3) we need a stable image side for running smoke tests
But we need some image side changes for some new features, which prevents
us from running older versions.
Squeak and Pharo still have randomly failing tests (network dependent ones,
etc...).
Someone has to do the work (or pay for it)...

- 4) build status feedback is very sloowww
  * as said above, we build too many configurations
  * Pharo has introduced a lot of dependencies on external libraries
    this leads to either long build times, or the use of caches that delay
detection of new failures

We all know that dev branches (feature branches) help a lot for some of the
above problems.
But we have these additional hurdles:
- feature branches work well when cycles are short
  but core VM cycles are not short (3 to 6 months or more for introducing
  a new GC, 64 bits, minimal SISTA, ...)
  a lot of the changes required for SISTA, 64 bits and JIT variants
  compete with each other, and parallel branches would create conflicts
  and would not work without regular sync.
  that explains why all the branches are gathered into a giant and complex
  one today...
- versioning generated code is a recipe for creating unsolvable conflicts
  (unmergeable)
  it's still possible to generate code for a plugin (if non concurrent)
  but this prevents working in parallel branches as soon as the core
  generation is changed in VMMaker
In recent posts, I saw brilliant young people under-estimating a bit the
work involved and the complexity of the task.
Fabio has done tremendous work to restore the green status, and the help
of Esteban has been decisive in this respect.
We will never thank them enough for that.

But maybe the current state is at the limit of sustainability.
And maybe it's time to drop some drag.



2018-03-30 22:35 GMT+02:00 Phil B <pbpublist at gmail.com>:

>
> While I've been enjoying the fantastic performance improvements we've seen
> from Cog onward, one thing I've been less excited about are some of the
> stability/functionality issues I've been running into.  They are not
> numerous (maybe 1/2 dozen or so major ones in the last 5 years) but they
> are getting quite tedious to isolate and replicate.  Recent examples that
> come to mind include the 64-bit primHighResClock truncation and 'could not
> grow remembered set' issues  (My current joy is a case where I have an
> #ifTrue: block that doesn't get executed unless I convert it to an
> #ifTrue:ifFalse: with a noop for the ifFalse:.. I'll provide a reproducible
> test case as soon as I'm able.  The specific issue isn't the issue, but
> rather that I keep hitting things like this that seem rather fundamental
> yet edge-casey at the same time)
>
> I don't expect perfection as a phenomenal amount of progress is being made
> by a small group of people but I am beginning to wonder if the existing
> unit tests are sufficient to adequately exercise the VM? I.e. so that the
> VM developers are aware that a recent change may have broken something or
> are the existing tests mainly oriented towards image and bytecode VM
> development?  Just some food for thought and wanted to see if it's just me
> having these sorts of issues...
>
> Thanks,
> Phil
>
>