July 14, 2010

Jonathan M Davis wrote:
>
> On the other hand, never halting on assertion failures makes for a lot of cruft in your results.

I don't really understand why that's a problem, or causes extra work.

But I can see that having to divide up the tests into multiple unittest blocks adds extra work.
July 14, 2010
On Wednesday, July 14, 2010 16:30:58 Walter Bright wrote:
> Jonathan M Davis wrote:
> > On the other hand, never halting on assertion failures makes for a lot of cruft in your results.
> 
> I don't really understand why that's a problem, or causes extra work.
> 
> But I can see that having to divide up the tests into multiple unittest
> blocks adds extra work.
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos

Well, I find it perfectly normal having to divide up the unit tests since every unit testing framework that I've used divides up unit tests into functions where each function is a test that succeeds or fails. There may be many assertions in a particular function, but once one fails, that function ceases to run (generally because an exception was thrown which is then caught by the framework). So, it actually makes more sense to me to divide up tests into unit test blocks. True, if you're testing a set of small but unrelated stuff, it could make sense to put them all together in one test, but if you're testing blocks as a whole, then you have the choice of whether to break them up or not just like in another framework, you'd have the choice of whether you have a single function or multiple functions.

The cruft is a problem because it makes it harder to find the errors you actually care about. Unless you're checking a bunch of small, unrelated things, any assertion failure after the initial error is likely to be totally erroneous (since it was testing invalid state to begin with), so all of those errors just clutter the output, making it harder to find where one unittest block's tests end and where another begins (since it'll be the beginning of the next block where you'll start caring about errors again).

It's definitely better to run all tests in a block regardless of assertion failures than it is to stop running all of the unittests in a program or module because one assertion failed. However, I do think that it's better to not continue to run a particular unittest block after an assertion fails because then it's likely to be testing invalid state and report a bunch of errors which are irrelevant.

If you look at each unittest block as being a unit test testing a set of functionality rather than a particular assertion being the test, then it's just plain weird to continue to run the tests after the first assertion failed. The test already failed.

However, if you're viewing each assertion as a test with the idea that there may be many unrelated assertions together, then I can see why it might be desirable to have the block continue to be executed (since splitting them up would mean a lot of unittest blocks).

In any case, I do think that continuing to execute a unittest block after a failure is odd, and that it will result in a lot of unnecessary cruft, so I'd prefer that that didn't happen. However, it's definitely better to have the test continue than to stop them all after a single assertion failure. I suppose that the ideal solution would be to allow for both by having some way of indicating whether you wanted the test to continue after a particular assertion failure, but that would likely require multiple assertion types, which we don't have right now (at least not that the programmer can choose). Maybe if asserts threw normally (making it so that the first assertion failure in the unittest block resulted in that block ceasing execution, followed by the next unittest block being executed) but we had a library function of some kind which simply printed an error rather than throwing that would allow for the best of both worlds (like assert_nothrow() or something similar).

- Jonathan M Davis
July 14, 2010
On Jul 14, 2010, at 5:00 PM, Jonathan M Davis wrote:

> On Wednesday, July 14, 2010 16:30:58 Walter Bright wrote:
>> Jonathan M Davis wrote:
>>> On the other hand, never halting on assertion failures makes for a lot of cruft in your results.
>> 
>> I don't really understand why that's a problem, or causes extra work.
>> 
>> But I can see that having to divide up the tests into multiple unittest
>> blocks adds extra work.
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
> 
> Well, I find it perfectly normal having to divide up the unit tests since every unit testing framework that I've used divides up unit tests into functions where each function is a test that succeeds or fails. There may be many assertions in a particular function, but once one fails, that function ceases to run (generally because an exception was thrown which is then caught by the framework). So, it actually makes more sense to me to divide up tests into unit test blocks. True, if you're testing a set of small but unrelated stuff, it could make sense to put them all together in one test, but if you're testing blocks as a whole, then you have the choice of whether to break them up or not just like in another framework, you'd have the choice of whether you have a single function or multiple functions.
> 
> The cruft is a problem because it makes it harder to find the errors you actually care about.

This seems a lot like compiler error handling.  A compiler can flag an error and continue processing, but every successive error is less meaningful than the last.  If the first error was from a missing paren or curly brace, things get really weird really fast.  How about this... since unit tests are run after static ctors are run, the behavior of whether to throw an AssertError or report and continue can be a run-time configurable option.  I'd still like to get more granular tests though, since throwing on error is far less useful if all unit tests for a module are grouped together rather than run separately.
July 14, 2010



----- Original Message ----
> From: Sean Kelly <sean at invisibleduck.org>

> This seems  a lot like compiler error handling.  A compiler can flag an error
>and  continue processing, but every successive error is less meaningful than the last.

That's true because the semantics of the later part of the program depend heavily on the former part--how many braces, semi-colons, etc.

But a unit test block for the minus operation on a class Rational may have little or nothing to do with the unit test for division in Rational, say.  I think what Jonathan is saying is that it's fairly common practice to break up unit tests even for a single class into small sections--one for ctors, one for operation1, one for operations2 and 3 (because they're related), etc.  In this case, you'd like all 3 unit tests for the class to run, even though the new ctor you just added broke the first unit test block.

Having used several of the UT frameworks out there, my own expectation is that if I have 5 UT blocks in a module, when I run with -unittest, all 5 blocks will run.  If one or more should fail, then main will not run.  The failing test block will stop executing at an assertion failure and the next UT will be started.  But all 5 would run everytime.

>How about this... since unit tests are run  after static ctors are run, the behavior of whether to throw an AssertError or  report and continue can be a run-time configurable option.  I'd still like  to get more granular tests though, since throwing on error is far less useful if  all unit tests for a module are grouped together rather than run  separately.

I think that's probably the only way to make everyone happy.  That's why I suggested making a global UT status var available or allowing a function to be defined that could decide to continue or something.  Just have to justify the cost.
July 14, 2010

Sean Kelly wrote:
>
>
> How about this... since unit tests are run after static ctors are run, the behavior of whether to throw an AssertError or report and continue can be a run-time configurable option.

Frankly, I don't think a configurable option is a good idea. Druntime has a lot of user configurability, and as far as I can tell, absolutely none of it has ever been used. It just adds complexity, both to the code and the documentation.
July 14, 2010
On 07/14/2010 08:58 PM, Walter Bright wrote:
>
>
> Sean Kelly wrote:
>>
>>
>> How about this... since unit tests are run after static ctors are run, the behavior of whether to throw an AssertError or report and continue can be a run-time configurable option.
>
> Frankly, I don't think a configurable option is a good idea. Druntime has a lot of user configurability, and as far as I can tell, absolutely none of it has ever been used. It just adds complexity, both to the code and the documentation.

I agree. But also simple does not mean crappy. I think there's too much talk around a simple matter:

======
Assertion failure should abort the current unittest block.
======

Do we all agree about the above? We keep on discussing how the usefulness and informativeness of assertion failures falls off a cliff after the first failure, and I can't seem to hear loud and clear that we all want this same thing.

So we've been through these steps:

1. Crappy: failure to assert causes the program to abort.

2. Awful: assert is hacked inside unittests to essentially be writeln, all unittests run regardless. Programs claims success upon exit.

3. Mediocre: assert continues to be hacked inside unittests to essentially be writeln + set a flag. Programs exit with errors if the flag was set.

NOT YET:

4. DESIRED: assert is NOT hacked, any failing assert ends the current unittest, the failure message is printed, execution continues with the next unittest, program ends with error code if at least one assert failed, everybody's happy.

I'm glad we made progress from 2 to 3. Could you all please share your opinion about the delta between 3 and 4?


Andrei
July 14, 2010
On Jul 14, 2010, at 7:42 PM, Andrei Alexandrescu wrote:

> On 07/14/2010 08:58 PM, Walter Bright wrote:
>> 
>> 
>> Sean Kelly wrote:
>>> 
>>> 
>>> How about this... since unit tests are run after static ctors are run, the behavior of whether to throw an AssertError or report and continue can be a run-time configurable option.
>> 
>> Frankly, I don't think a configurable option is a good idea. Druntime has a lot of user configurability, and as far as I can tell, absolutely none of it has ever been used. It just adds complexity, both to the code and the documentation.
> 
> I agree. But also simple does not mean crappy. I think there's too much talk around a simple matter:
> 
> ======
> Assertion failure should abort the current unittest block.
> ======
> 
> Do we all agree about the above?

Yes.

> We keep on discussing how the usefulness and informativeness of assertion failures falls off a cliff after the first failure, and I can't seem to hear loud and clear that we all want this same thing.
> 
> So we've been through these steps:
> 
> 1. Crappy: failure to assert causes the program to abort.
> 
> 2. Awful: assert is hacked inside unittests to essentially be writeln, all unittests run regardless. Programs claims success upon exit.
> 
> 3. Mediocre: assert continues to be hacked inside unittests to essentially be writeln + set a flag. Programs exit with errors if the flag was set.
> 
> NOT YET:
> 
> 4. DESIRED: assert is NOT hacked, any failing assert ends the current unittest, the failure message is printed, execution continues with the next unittest, program ends with error code if at least one assert failed, everybody's happy.
> 
> I'm glad we made progress from 2 to 3. Could you all please share your opinion about the delta between 3 and 4?

That's what I want.  Most is a trivial library change which I've already made and have been waiting to commit, the missing piece is the unittest block-level granularity for test recovery.  That requires a compiler change to actually expose unit tests at that level of granularity.  My vacation ends tomorrow and I'll see if I can figure out how to patch this into the compiler.  I really haven't spent the time to understand how it works yet.
July 14, 2010
Le 2010-07-14 ? 22:42, Andrei Alexandrescu a ?crit :

> On 07/14/2010 08:58 PM, Walter Bright wrote:
>> 
>> 
>> Sean Kelly wrote:
>>> 
>>> 
>>> How about this... since unit tests are run after static ctors are run, the behavior of whether to throw an AssertError or report and continue can be a run-time configurable option.
>> 
>> Frankly, I don't think a configurable option is a good idea. Druntime has a lot of user configurability, and as far as I can tell, absolutely none of it has ever been used. It just adds complexity, both to the code and the documentation.
> 
> I agree. But also simple does not mean crappy. I think there's too much talk around a simple matter:
> 
> ======
> Assertion failure should abort the current unittest block.
> ======
> 
> Do we all agree about the above? We keep on discussing how the usefulness and informativeness of assertion failures falls off a cliff after the first failure, and I can't seem to hear loud and clear that we all want this same thing.
> 
> So we've been through these steps:
> 
> 1. Crappy: failure to assert causes the program to abort.
> 
> 2. Awful: assert is hacked inside unittests to essentially be writeln, all unittests run regardless. Programs claims success upon exit.
> 
> 3. Mediocre: assert continues to be hacked inside unittests to essentially be writeln + set a flag. Programs exit with errors if the flag was set.
> 
> NOT YET:
> 
> 4. DESIRED: assert is NOT hacked, any failing assert ends the current unittest, the failure message is printed, execution continues with the next unittest, program ends with error code if at least one assert failed, everybody's happy.
> 
> I'm glad we made progress from 2 to 3. Could you all please share your opinion about the delta between 3 and 4?

In my order of preference: 4, 1, 3, 2. I think 4 is what most people want in most situations, and it also fit much better with the expected behaviour of assert (which should throw). It's also what most unit test frameworks do.

I'm currently working with a unit test framework that happens to do 3 (SenTestingKit, Objective-C) and it's frustrating that when I have only two tests that fails I get 14 errors, only two of them meaningful (got that just a few hours ago). Fortunately for me, I use Xcode to visualize the results which groups those errors by unit test, making it easier to see how many and which tests are failing and navigate around: in this situation it might make some sense to allow tests to continue after a failed assertion even though the utility is limited. But for this kind of visualization, you'd need a unit tests delimiter in the output (and ideally unit test names), and you also need to subvert 'assert' not to throw, both of which do not fit well with D. (And you need an IDE to parse and display the results better.)

So I think 4 is the best option. 1 is second best. 3 is mostly the same as one, except you have a lot of garbage (meaningless errors) following the first error. 4 is downright silly (for a default behaviour). I can see some situation where each of these options can be useful, but 4 is the one that makes most sense most of the time and fits .

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



July 14, 2010
Le 2010-07-14 ? 22:42, Andrei Alexandrescu a ?crit :
>>
>> I agree. But also simple does not mean crappy. I think there's too much talk around a simple matter:
>>
>> ======
>> Assertion failure should abort the current unittest block.
>> ======
>> 

Yes.

>> 4. DESIRED: assert is NOT hacked, any failing assert ends the current unittest, the failure message is printed, execution continues with the next unittest, program ends with error code if at least one assert failed, everybody's happy.
>>
>> 

I want that.
July 14, 2010
On Wednesday 14 July 2010 19:42:06 Andrei Alexandrescu wrote:
> On 07/14/2010 08:58 PM, Walter Bright wrote:
> > Sean Kelly wrote:
> >> How about this... since unit tests are run after static ctors are run, the behavior of whether to throw an AssertError or report and continue can be a run-time configurable option.
> > 
> > Frankly, I don't think a configurable option is a good idea. Druntime has a lot of user configurability, and as far as I can tell, absolutely none of it has ever been used. It just adds complexity, both to the code and the documentation.
> 
> I agree. But also simple does not mean crappy. I think there's too much talk around a simple matter:
> 
> ======
> Assertion failure should abort the current unittest block.
> ======
> 
> Do we all agree about the above?

Yes.

> We keep on discussing how the
> usefulness and informativeness of assertion failures falls off a cliff
> after the first failure, and I can't seem to hear loud and clear that we
> all want this same thing.
> 
> So we've been through these steps:
> 
> 1. Crappy: failure to assert causes the program to abort.
> 
> 2. Awful: assert is hacked inside unittests to essentially be writeln, all unittests run regardless. Programs claims success upon exit.
> 
> 3. Mediocre: assert continues to be hacked inside unittests to essentially be writeln + set a flag. Programs exit with errors if the flag was set.
> 
> NOT YET:
> 
> 4. DESIRED: assert is NOT hacked, any failing assert ends the current unittest, the failure message is printed, execution continues with the next unittest, program ends with error code if at least one assert failed, everybody's happy.
> 
> I'm glad we made progress from 2 to 3. Could you all please share your opinion about the delta between 3 and 4?

4 is far more desirable. 3 makes assert function abnormally in comparison to normal code and will likely confuse people. It's also likely to result in lots of extraneous failures. 4 is what all the unit test frameworks I have used do, so I'm inclined to think that it's more standard. The only downside that I see to 4 is that if you want to do a lot of little tests and have them all run regardless of previous failures, you need a lot of unittest blocks. But in every other unit test framework, you'd need a full-on function for every such test (possibly including prototypes in a header file and the like), so it's no worse than what I'm used to.

I much prefer 3 over 2, and it's probably better than 1 (depending on how many tests you have failing), but 4 is, in my opinion, hands down better than 3. 3's behaviour is just plain odd - better than 2's behavior - but odd, and likely confusing to anyone new to D.

- Jonathan M Davis