Thread overview
Thoughts about unittest run order
May 06, 2019
H. S. Teoh
May 06, 2019
Jacob Carlborg
May 07, 2019
Walter Bright
May 07, 2019
H. S. Teoh
May 07, 2019
John Colvin
May 07, 2019
Atila Neves
May 07, 2019
H. S. Teoh
May 07, 2019
Atila Neves
May 07, 2019
H. S. Teoh
May 06, 2019
In theory, the order in which unittests are run ought to be irrelevant. In practice, however, the order can either make debugging code changes quite easy, or very frustrating.

I came from a C/C++ background, and so out of pure habit I write things "backwards", i.e., main() is at the bottom of the file, the stuff that main() calls comes just before it, the stuff *they* call comes before that, and so on, with the type declarations and low-level functions that the rest of the module depends on at the top.  After reading one of Walter's articles recently about improving the way you write code, I decided on a whim to write a helper utility in one of my projects "right side up", since D doesn't actually require declarations before usage like C/C++ do.  That is, main() goes at the very top, then the stuff that main() calls, and so on, with the low-level stuff all the way at the bottom of the file.

It was all going well, until I began to rewrite some of the low-level code in the process of adding new features. D's unittests have been immensely helpful when I refactor code, since they catch any obvious bugs and regressions early on so I don't have to worry too much about making large changes.  So I set about rewriting some low-level stuff that required extensive changes, relying on the unittests to catch mistakes.

But then I ran into a problem: because D's unittests are currently defined to run in lexical order, that means the unittests for higher-level functions will run first, followed by the lower-level unittests, because of the order I put the code in.  So when I accidentally introduced a bug in lower-level code, it was a high-level unittest that failed first -- which is too high up to figure out where exactly the real bug was. I had to gradually narrow it down from the high-level call through the middle-level calls and work my way to the low-level function where the bug was introduced.
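To make that concrete, here's a tiny sketch (function names made up) of the "right side up" layout and the resulting failure order:

    // dmd -unittest -main example.d
    int sumOfSquares(int a, int b) { return square(a) + square(b); }

    unittest
    {
        // lexically first, so the default runner executes it first,
        // even though it exercises the most code
        assert(sumOfSquares(2, 3) == 13);
    }

    int square(int x) { return x * x; }   // a bug introduced here...

    unittest
    {
        // ...would be pinpointed by this test, but it sits lower in the
        // file, so the high-level test above is the one that fails first
        assert(square(3) == 9);
    }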

This is quite the opposite from my usual experience with "upside-down order" code: since the low-level code and unittests would appear first in the module, any bugs in the low-level code would trigger failure in the low-level unittests first, right where the problem was. Once I fix the code to pass those tests, then the higher-level unittests would run to ensure the low-level changes didn't break any behaviour the higher-level functions were depending on.  This made development faster as less time was spent narrowing down why a high-level unittest was failing.

So now I'm tempted to switch back to "upside-down" coding order.

What do you guys think about this?


T

-- 
You have to expect the unexpected. -- RL
May 06, 2019
On 2019-05-06 20:13, H. S. Teoh wrote:
> In theory, the order in which unittests are run ought to be irrelevant.
> In practice, however, the order can either make debugging code changes
> quite easy, or very frustrating.
> 
> [...]

There are different schools of thought on how to order code in a file. Some will say put all the public functions first and then the private ones. Some will say code should read like a newspaper article: first an overview, and the more you read the deeper you get into the details. Others will say you should put related code next to each other, regardless of whether the symbols are public or private. I usually put the public symbols first and then the private ones.

When it comes to the order of unit tests, I think they should run in random order. If a test fails, it should print a seed value. Running the tests again with that seed value should reproduce the same order as before. This helps with debugging when some tests accidentally depend on each other.

The problem you're facing, I'm guessing, is that you're running with the default unit test runner? If a single test fails, it will stop and run no other tests in that module (tests in other modules will still run). If you pick an existing unit test framework or write your own unit test runner, you can have the unit tests continue after a failure. Then you would see that the lower-level tests are failing as well.

Writing a custom unit test runner with the help of the "getUnitTests" trait [1] would allow you to do additional things, like looking for UDAs that set the order of the unit tests or grouping them in various ways. You could group them into high-level and low-level groups and have the runner run the low-level tests first.
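As a rough sketch of such a runner (the module name `mymod` and everything else here is made up, and untested). Build with -unittest so the unittest blocks are compiled in; note that the default pre-main test run would also fire unless it is disabled via Runtime.extendedModuleUnitTester:

    import std.random : Random, randomShuffle, unpredictableSeed;
    import std.stdio : writefln;

    import mymod;   // the module whose unittest blocks we want to drive ourselves

    void main()
    {
        // __traits(getUnitTests, ...) gives one alias per unittest block,
        // so each test can be invoked (and ordered) individually.
        void function()[] tests;
        static foreach (t; __traits(getUnitTests, mymod))
            tests ~= &t;

        auto seed = unpredictableSeed;          // print it so a failing order can be replayed
        writefln("unittest seed: %s", seed);
        auto rng = Random(seed);
        randomShuffle(tests, rng);

        size_t failed;
        foreach (run; tests)
        {
            try
                run();                          // keep going after a failed test
            catch (Throwable e)
            {
                ++failed;
                writefln("FAILED: %s", e.msg);
            }
        }
        writefln("%s of %s tests failed", failed, tests.length);
    }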

For your particular problem, it seems it would be solved by continuing to run the other tests in the same module after one has failed. I think "silly" [2] looks really interesting. I haven't had the time to try it out yet, so I don't know whether it continues after a failed test.

[1] https://dlang.org/spec/traits.html#getUnitTests
[2] https://code.dlang.org/packages/silly

-- 
/Jacob Carlborg
May 06, 2019
On 5/6/2019 11:13 AM, H. S. Teoh wrote:
> What do you guys think about this?

That thought never occurred to me, thanks for bringing it up.

It suggests perhaps the order of unittests should be determined by a dependency graph, and should start with the leaves.

May 07, 2019
On Monday, 6 May 2019 at 18:13:37 UTC, H. S. Teoh wrote:
> In theory, the order in which unittests are run ought to be irrelevant. In practice, however, the order can either make debugging code changes quite easy, or very frustrating.
>
> [...]

Use a test runner that runs all the tests regardless of previous errors? (and does them in multiple threads, hooray!)

https://github.com/atilaneves/unit-threaded

Then you'll at least get to know everything that failed instead of just whatever happened to be lexically first.

I agree that some ordering system might improve the time-to-narrow-down-bug-location a bit, but the above might be acceptable nonetheless.
May 07, 2019
On Tuesday, 7 May 2019 at 08:49:15 UTC, John Colvin wrote:
> On Monday, 6 May 2019 at 18:13:37 UTC, H. S. Teoh wrote:
>> In theory, the order in which unittests are run ought to be irrelevant. In practice, however, the order can either make debugging code changes quite easy, or very frustrating.
>>
>> [...]
>
> Use a test runner that runs all the tests regardless of previous errors? (and does them in multiple threads, hooray!)
>
> https://github.com/atilaneves/unit-threaded


unit-threaded can also run the tests in random order and reuse a seed, like Jacob mentioned above.

If the order tests run in is important, the tests are coupled... friends don't let friends couple their tests.

May 07, 2019
On Mon, May 06, 2019 at 10:30:01PM -0700, Walter Bright via Digitalmars-d wrote:
> On 5/6/2019 11:13 AM, H. S. Teoh wrote:
> > What do you guys think about this?
> 
> That thought never occurred to me, thanks for bringing it up.
> 
> It suggests perhaps the order of unittests should be determined by a dependency graph, and should start with the leaves.

That was also my first thought, but how would you construct such a graph? In my case, almost all of the unittests are at module level, and call various module-level functions.  It's not obvious how the compiler would divine which ones should come first just by looking at the unittest body. You'd have to construct the full function call dependency graph of the entire module to get that information.


T

-- 
Answer: Because it breaks the logical sequence of discussion. / Question: Why is top posting bad?
May 07, 2019
On Tue, May 07, 2019 at 09:40:27AM +0000, Atila Neves via Digitalmars-d wrote: [...]
> If the order tests run in is important, the tests are coupled... friends don't let friends couple their tests.

How do you decouple the tests of two functions F and G, where F calls G? If a code change broke the behaviour of G, then both tests should fail, and then we run into the same problem with the default test runner. To make F's tests independent of G would require that they pass *regardless* of the behaviour of G, which seems like an unattainable goal unless you also decouple F from G, which implies that every tested function must be a leaf function. Which seems unrealistic.
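To make the coupling concrete (a trivial sketch, bodies made up):

    int G(int x) { return x * 2; }     // suppose a bug creeps in here

    unittest { assert(G(3) == 6); }    // G's own test would catch it...

    int F(int x) { return G(x) + 1; }

    unittest { assert(F(3) == 7); }    // ...but F's test fails too, since F calls G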


T

-- 
The trouble with TCP jokes is that it's like hearing the same joke over and over.
May 07, 2019
On Tuesday, 7 May 2019 at 11:29:43 UTC, H. S. Teoh wrote:
> On Tue, May 07, 2019 at 09:40:27AM +0000, Atila Neves via Digitalmars-d wrote: [...]
>> If the order tests run in is important, the tests are coupled... friends don't let friends couple their tests.
>
> How do you decouple the tests of two functions F and G in which F calls G?

It depends. If F and G are both public functions that are part of the API, then one can't. Otherwise I'd just test F since G is an implementation detail.

I consider keeping tests around for implementation details an anti-pattern. Sometimes it's useful to write the tests if doing TDD or debugging, but afterwards I delete them.

May 07, 2019
On Tue, May 07, 2019 at 04:50:23PM +0000, Atila Neves via Digitalmars-d wrote:
> On Tuesday, 7 May 2019 at 11:29:43 UTC, H. S. Teoh wrote:
> > On Tue, May 07, 2019 at 09:40:27AM +0000, Atila Neves via Digitalmars-d wrote: [...]
> > > If the order tests run in is important, the tests are coupled... friends don't let friends couple their tests.
> > 
> > How do you decouple the tests of two functions F and G in which F calls G?
> 
> It depends. If F and G are both public functions that are part of the API, then one can't. Otherwise I'd just test F since G is an implementation detail.
> 
> I consider keeping tests around for implementation details an anti-pattern.  Sometimes it's useful to write the tests if doing TDD or debugging, but afterwards I delete them.

I almost never delete unittests. IME, they usually wind up catching a regression that would've been missed otherwise.


T

-- 
If you want to solve a problem, you need to address its root cause, not just its symptoms. Otherwise it's like treating cancer with Tylenol...