Thread overview
Documented unittests & code coverage
  Johannes Pfau (Jul 28, 2016)
  Jonathan M Davis (Jul 28, 2016)
  Walter Bright (Jul 28, 2016)
  Seb (Jul 28, 2016)
  Jonathan M Davis (Jul 29, 2016)
  Walter Bright (Jul 29, 2016)
  Jonathan M Davis (Jul 29, 2016)
  Walter Bright (Jul 29, 2016)
  Seb (Jul 30, 2016)
  Jack Stouffer (Jul 29, 2016)
  Walter Bright (Jul 29, 2016)
  Chris (Jul 29, 2016)
  Jack Stouffer (Jul 29, 2016)
  Atila Neves (Aug 04, 2016)
  Walter Bright (Aug 04, 2016)
  Atila Neves (Aug 04, 2016)
  Walter Bright (Aug 04, 2016)
  Walter Bright (Aug 05, 2016)
  Atila Neves (Aug 05, 2016)
  Jonathan M Davis (Aug 05, 2016)
  Walter Bright (Aug 05, 2016)
  Jonathan M Davis (Aug 05, 2016)
July 28, 2016
Some time ago we moved some example code from documentation comments
into documented unittests. Some of these more complicated examples are
incomplete and therefore not expected to actually run. They are also
not very useful as real unittests, as they don't contain any asserts
or verification of results. Turning them into documented unittests was
done mainly to make sure these examples keep compiling when the
library changes.

I just had a quick look at https://github.com/dlang/phobos/pull/4587
and some example output:
https://codecov.io/gh/wilzbach/phobos/src/5fc9eb90076101c0266fb3491ac68527d3520fba/std/digest/digest.d#L106

And it seems that this 'idiom' messes up code coverage results: the code in the unittest is never executed (we just want syntactically valid code) and therefore shows up as untested code. The coverage report shows 46 missed lines for std.digest.digest, but only 8 of these actually need testing.
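
To illustrate with a simplified sketch (not the actual Phobos code): a documented unittest like the one below compiles and even runs under -unittest, but the example type it defines is never called, so every statement inside it shows up as a miss in the coverage report.

/// Custom digest types described in the documentation look roughly like this:
unittest
{
    // This struct exists only so readers can see what an implementation
    // of the API looks like; nothing in the test ever calls its methods.
    struct ExampleDigest
    {
        private ubyte[16] state;

        void start() @safe
        {
            state[] = 0;            // never executed -> counted as a miss
        }

        void put(scope const(ubyte)[] data...) @safe
        {
            foreach (b; data)
                state[0] ^= b;      // never executed -> counted as a miss
        }

        ubyte[16] finish() @safe
        {
            return state;           // never executed -> counted as a miss
        }
    }
}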

So how do we solve this?
* Ignore lines in documented unittests for code coverage?
* Make these examples completely executable, at the expense of the
  documentation, which will then contain useless boilerplate code (sketched below)?
* Move these examples back to the documentation block?
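
To make the second option concrete: an executable version of the sketch above would have to pad the documented example with setup and checks that a reader of the documentation doesn't need, roughly like this (again hypothetical):

/// Custom digest types described in the documentation look roughly like this:
unittest
{
    struct ExampleDigest
    {
        private ubyte[16] state;

        void start() @safe { state[] = 0; }

        void put(scope const(ubyte)[] data...) @safe
        {
            foreach (b; data)
                state[0] ^= b;
        }

        ubyte[16] finish() @safe { return state; }
    }

    // Boilerplate whose only purpose is to execute the lines above;
    // it adds nothing for someone reading the documentation.
    ubyte[] input = [1, 2, 3];
    ExampleDigest d;
    d.start();
    d.put(input);
    assert(d.finish()[0] == (1 ^ 2 ^ 3));
}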


And as a philosophical question: Is code coverage in unittests even a meaningful measurement? We write unittests to test the library code. But if there's a line in a unittest which is never executed, this does not directly mean there's a problem in the library code, as long as all library code is still tested. It may be an oversight in the unittest in the worst case, but how important are oversights / bugs in the unittests if they don't affect the library code in any way?
July 28, 2016
On Thursday, July 28, 2016 12:15:27 Johannes Pfau via Digitalmars-d wrote:
> And as a philosophical question: Is code coverage in unittests even a meaningful measurement? We write unittests to test the library code. But if there's a line in a unittest which is never executed, this does not directly mean there's a problem in the library code, as long as all library code is still tested. It may be an oversight in the unittest in the worst case, but how important are oversights / bugs in the unittests if they don't affect the library code in any way?

https://issues.dlang.org/show_bug.cgi?id=14856

- Jonathan M Davis

July 28, 2016
On 7/28/2016 3:15 AM, Johannes Pfau wrote:
> And as a philosophical question: Is code coverage in unittests even a
> meaningful measurement?

Yes. I've read all the arguments against code coverage testing. But in my usage of it for 30 years, it has been a dramatic and unqualified success in improving the reliability of shipping code.

July 28, 2016
On Thursday, 28 July 2016 at 23:14:42 UTC, Walter Bright wrote:
> On 7/28/2016 3:15 AM, Johannes Pfau wrote:
>> And as a philosophical question: Is code coverage in unittests even a
>> meaningful measurement?
>
> Yes. I've read all the arguments against code coverage testing. But in my usage of it for 30 years, it has been a dramatic and unqualified success in improving the reliability of shipping code.

@Walter: the discussion is not about code coverage in general, but about whether code coverage within unittests should be reported. We are only interested in the coverage of the library itself, and, as Johannes and Jonathan pointed out, there are valid patterns (e.g. scope(failure)) that are used within unittests and are never executed.

However, as Jonathan mentioned in the Bugzilla issue, the downside of not counting lines within unittest blocks is that potential bugs in the unittests can't be caught as easily anymore.
July 28, 2016
On Thursday, July 28, 2016 16:14:42 Walter Bright via Digitalmars-d wrote:
> On 7/28/2016 3:15 AM, Johannes Pfau wrote:
> > And as a philosophical question: Is code coverage in unittests even a meaningful measurement?
>
> Yes. I've read all the arguments against code coverage testing. But in my usage of it for 30 years, it has been a dramatic and unqualified success in improving the reliability of shipping code.

The issue isn't whether we should have code coverage testing. We agree that that's a great thing. The issue is whether the lines in the unit tests themselves should count towards the coverage results.

https://issues.dlang.org/show_bug.cgi?id=14856

gives some good examples of why having the unittest blocks themselves counted in the total percentage is problematic and can lead to dmd's code coverage tool reporting less than 100% coverage for a module that is fully tested. What's critical is that the code itself is covered, not that the lines in the tests doing that testing be counted as part of the code that is or isn't covered.

I know that it will frequently be the case that I will not get 100% code coverage per -cov for the code that I write simply because I frequently do stuff like use scope(failure) writefln(...) to print useful information on failure in unittest blocks so that I can debug what happened when things go wrong (including when someone reports failures on their machine that don't happen on mine).
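
As a concrete sketch of that pattern (names and values made up for illustration):

int increment(int x) { return x + 1; }  // stand-in for the code under test

unittest
{
    import std.stdio : writefln;

    auto result = increment(42);

    // Runs only if an assert below fails, so -cov reports this line as
    // never executed even when the module itself is fully tested.
    scope(failure) writefln("increment(42) returned %s", result);

    assert(result == 43);
}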

D's code coverage tools are fantastic to have, but they do need a few tweaks if we want to actually be reporting 100% code coverage for fully tested modules. A couple of other reports that I opened a while back are

https://issues.dlang.org/show_bug.cgi?id=14855
https://issues.dlang.org/show_bug.cgi?id=14857

- Jonathan M Davis

July 28, 2016
On 7/28/2016 9:48 PM, Jonathan M Davis via Digitalmars-d wrote:
> gives some good examples of why having the unittest blocks themselves
> counted in the total percentage is problematic and can lead to dmd's code
> coverage tool reporting less than 100% coverage for a module that is fully
> tested. What's critical is that the code itself is covered, not that the
> lines in the tests doing that testing be counted as part of the code that
> is or isn't covered.
>
> I know that it will frequently be the case that I will not get 100% code
> coverage per -cov for the code that I write simply because I frequently do
> stuff like use scope(failure) writefln(...) to print useful information on
> failure in unittest blocks so that I can debug what happened when things go
> wrong (including when someone reports failures on their machine that don't
> happen on mine).
>
> D's code coverage tools are fantastic to have, but they do need a few tweaks
> if we want to actually be reporting 100% code coverage for fully tested
> modules. A couple of other reports that I opened a while back are

As soon as we start taking the % coverage too seriously, we are in trouble. It's never going to be cut and dried what should be tested and what is unreasonable to test, and I see no point in arguing about it.

The % is a useful indicator, that is all. It is not a substitute for thought.

As always, use good judgement.

July 28, 2016
On Thursday, July 28, 2016 22:12:58 Walter Bright via Digitalmars-d wrote:
> As soon as we start taking the % coverage too seriously, we are in trouble. It's never going to be cut and dried what should be tested and what is unreasonable to test, and I see no point in arguing about it.
>
> The % is a useful indicator, that is all. It is not a substitute for thought.
>
> As always, use good judgement.

True, but particularly when you start doing stuff like requiring that modules have 100% coverage - or that a change not reduce the coverage - it starts mattering, especially if it's enforced by build tools. The current situation is far from the end of the world, but I definitely think we'd be better off if we fixed some of these issues so that the percentage reflected the amount of the actual code that's covered, rather than having unit tests, assert(0) statements, invariants, etc. affect code coverage when they aren't what you're trying to cover at all.
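
For instance (a made-up example), a defensive assert(0) on a branch that should be unreachable counts against the percentage even though it is never supposed to run:

enum Mode { read, write }

string describe(Mode m)
{
    switch (m)
    {
        case Mode.read:  return "read";
        case Mode.write: return "write";
        default:
            // By design never executed, yet -cov counts the line as uncovered.
            assert(0, "unreachable mode");
    }
}

unittest
{
    assert(describe(Mode.read) == "read");
    assert(describe(Mode.write) == "write");
}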

- Jonathan M Davis

July 29, 2016
On Friday, 29 July 2016 at 05:12:58 UTC, Walter Bright wrote:
> As soon as we start taking the % coverage too seriously, we are in trouble. It's never going to be cut and dried what should be tested and what is unreasonable to test, and I see no point in arguing about it.
>
> The % is a useful indicator, that is all. It is not a substitute for thought.
>
> As always, use good judgement.

In the context of the bug, we are not the ones interpreting the statistic, we're the ones measuring and reporting it to users, and it's being measured incorrectly. By deciding not to fix a bug that causes an inaccurate statistic to be reported, you're making a decision on the user's behalf that coverage % is unimportant without knowing their circumstances.

If you're going to include coverage % in the report, then a job worth doing is worth doing well.
July 28, 2016
On 7/28/2016 10:49 PM, Jonathan M Davis via Digitalmars-d wrote:
> True, but particularly when you start doing stuff like trying to require
> that modules have 100% coverage - or that the coverage not be reduced by a
> change - it starts mattering - especially if it's done with build tools. The
> current situation is far from the end of the world, but I definitely think
> that we'd be better off if we fixed some of these issues so that the
> percentage reflected the amount of the actual code that's covered rather
> than having unit tests, assert(0) statements, invariants, etc. start
> affecting code coverage when they aren't what you're trying to cover at all.

Worrying about this just serves no purpose. Code coverage percentages are a guide, an indicator, not a requirement in and of itself.

Changing the code in order to manipulate the number to meet some metric means the reviewer or the programmer or both have failed.
July 29, 2016
On 7/28/2016 11:07 PM, Jack Stouffer wrote:
> you're making a decision on the user's behalf that coverage % is
> unimportant without knowing their circumstances.

Think of it like the airspeed indicator on an airplane. There is no right or wrong airspeed. The pilot reads the indicated value, interprets it in the context of what the other instruments say, APPLIES GOOD JUDGMENT, and flies the airplane.

You won't find many pilots willing to fly without one.
