Tests and software process
Daniel Vainsencher
daniel.vainsencher at gmail.com
Wed Nov 1 14:11:57 UTC 2006
Hi Ralph,
Of course you're right, this has been an issue for quite a while. I
think the problem is that tests have diverse domains of validity, and
there are neither abstractions nor infrastructure in place to support them.
In theory (and often in practice) you run "the" test suite every few
minutes, and a test fails iff some code is broken. Wonderful!
Unfortunately, in a large-scale, distributed, diverse effort like
Squeak, things are more complicated.
Examples:
- Platform-specific tests.
- Very long running tests, which for most people don't give enough value
for their machine time.
- Non-self-contained tests, for example ones that require external files
to be present.
- Performance tests (only valid on reasonably fast machines, and what
counts as fast might change over time...)
All of these do have some value in some context, but some cannot be
expected to be always green, and some aren't even worth running most of
the time. The problem is that our choice about "where/when should this
test run" is currently binary: everywhere or nowhere. You say we should
be more aggressive in making this binary decision, but the reason this
isn't happening is that sometimes neither option is quite right.
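To make "domain of validity" concrete, here is a minimal sketch of what a non-binary choice could look like. It uses Python's unittest rather than SUnit, purely as an illustration and not as a description of anything Squeak has today. The environment variable names RUN_SLOW_TESTS and TEST_FIXTURE_FILE are assumptions made up for this example.

```python
import os
import sys
import unittest

# Hypothetical context switches; these variable names are assumptions.
RUN_SLOW = os.environ.get("RUN_SLOW_TESTS") == "1"
FIXTURE = os.environ.get("TEST_FIXTURE_FILE", "")


class ExampleTests(unittest.TestCase):
    def test_always_valid(self):
        # Expected to be green everywhere.
        self.assertEqual(1 + 1, 2)

    @unittest.skipUnless(sys.platform.startswith("win"), "Windows only")
    def test_platform_specific(self):
        # Only meaningful on one platform.
        self.assertEqual(os.sep, "\\")

    @unittest.skipUnless(RUN_SLOW, "long-running; set RUN_SLOW_TESTS=1")
    def test_long_running(self):
        # Worth the machine time only when explicitly requested.
        self.assertEqual(sum(range(10_000_000)), 49_999_995_000_000)

    @unittest.skipUnless(os.path.isfile(FIXTURE), "needs an external file")
    def test_needs_external_file(self):
        # Not self-contained: runs only when the file is present.
        with open(FIXTURE, "rb") as f:
            self.assertIsNotNone(f.read())


def run_suite():
    """Run the tests; out-of-context ones are reported as skipped."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The point of the sketch is that each test declares the context in which it is valid. A test outside its context is reported as skipped rather than failed, so "all green" can still mean "nothing is broken" without deleting the test or moving it out of the image.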
The community has gone back and forth on extracting some or all tests
into an optional package, but in practice that just means they never get
run.
Do you know of some set of abstractions/practices/framework to deal with
this problem?
Daniel Vainsencher
Ralph Johnson wrote:
> Squeak comes with a large set of SUnit tests. Unfortunately, some of
> them don't work. As far as I can tell, there is NO recent version of
> Squeak in which all the tests work.
>
> This is a sign that something is wrong. The main purpose of shipping
> tests with code is so that people making changes can tell when they
> break things. If the tests don't work then people will not run them.
> If they don't run the tests then the tests are useless. The current
> set of tests is useless because of the bad tests. Nobody complains
> about them, which tells me that nobody runs them. So, it is all a
> waste of time.
>
> If the tests worked then it would be easy to make a new version.
> Every bug fix would have to come with a test that illustrates the bug
> and shows that it has been fixed. The group that makes a new version
> would check that all tests continue to work after the bug fix.
>
> An easy way to make all the tests run is to delete the ones that don't
> work. There are thousands of working tests and, depending on the
> version, dozens of non-working tests. Perhaps the non-working tests
> indicate bugs, perhaps they indicate bad tests. It seems a shame to
> delete tests that are illustrating bugs. But if these tests don't
> work, they keep the other tests from being useful. Programmers need
> to know that all the tests worked in the virgin image, and that if the
> tests quit working, it is their own fault.
>
> No development image should ever be shipped with any failing tests.
>
> -Ralph Johnson
>