[dmd-internals] dmd commit, revision 657
September 02, 2010
dmd commit, revision 657


user: braddr

msg:
reenable fail274.d since the halt it was hitting has been removed

http://www.dsource.org/projects/dmd/changeset/657

September 02, 2010
The practice of commenting out unit tests is generally a bad idea. Historically, I've seen it done a lot in Phobos. It's bad because such tests are easily forgotten about.

NUnit has the [Ignore] attribute. Other test suites have expected failures. The basic idea is to leave tests that fail due to a legitimate bug such that the overall result is success but the details still show the problems.

As an example, cruise control with NUnit would colorize status bars based on what fraction of tests were success, failure, or ignored. (green, red, and yellow respectively)

I'll get off my soap box now...

Sent from my iPhone

On Sep 2, 2010, at 1:59 AM, "dsource.org" <noreply at dsource.org> wrote:

> dmd commit, revision 657
> 
> 
> user: braddr
> 
> msg:
> reenable fail274.d since the halt it was hitting has been removed
> 
> http://www.dsource.org/projects/dmd/changeset/657
> 
> _______________________________________________
> dmd-internals mailing list
> dmd-internals at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/dmd-internals
September 02, 2010
When I run phobos unit tests, I get a few printouts like:

--- std.numeric(547) CustomFloat broken test ---

It seems to me like a good way to not forget about the test while also noting that it's broken due to some compiler bug. Is this a viable option for you?
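A minimal sketch of the convention Steve describes (the bugzilla reference and `CustomFloat` arguments are illustrative): the failing assertion is commented out and replaced with a printout, so the suite still passes while the breakage stays visible on every run.

```d
import std.stdio;

unittest
{
    // Disabled pending a compiler fix (hypothetical bug reference):
    // assert(CustomFloat!(5, 10)(1.0) == 1.0);
    writeln("--- std.numeric(547) CustomFloat broken test ---");
}
```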

-Steve



----- Original Message ----
> From: Jason House <jason.james.house at gmail.com>
> To: Discuss the internals of DMD <dmd-internals at puremagic.com>
> Sent: Thu, September 2, 2010 8:16:13 AM
> Subject: Re: [dmd-internals] dmd commit, revision 657
> 
> The practice of commenting out unit tests is generally a bad idea. Historically, I've seen it done a lot in Phobos. It's bad because such tests are easily forgotten about.
> 
> NUnit has the [Ignore] attribute. Other test suites have expected failures. The basic idea is to leave tests that fail due to a legitimate bug such that the overall result is success but the details still show the problems.
> 
> As an example, cruise control with NUnit would colorize status bars based on what fraction of tests were success, failure, or ignored. (green, red, and yellow respectively)
> 
> I'll get off my soap box now...
> 
> Sent from my iPhone
> 
> On Sep 2, 2010, at 1:59 AM, "dsource.org" <noreply at dsource.org> wrote:
> 
> >  dmd commit, revision 657
> > 
> > 
> > user: braddr
> > 
> >  msg:
> > reenable fail274.d since the halt it was hitting has been removed
> > 
> >  http://www.dsource.org/projects/dmd/changeset/657
> > 



September 02, 2010
On 9/2/2010 5:16 AM, Jason House wrote:
> The practice of commenting out unit tests is generally a bad idea. Historically, I've seen it done a lot in Phobos. It's bad because such tests are easily forgotten about.
> 
> NUnit has the [Ignore] attribute. Other test suites have expected failures. The basic idea is to leave tests that fail due to a legitimate bug such that the overall result is success but the details still show the problems.
> 
> As an example, cruise control with NUnit would colorize status bars based on what fraction of tests were success, failure, or ignored. (green, red, and yellow respectively)
> 
> I'll get off my soap box now...

I agree.  However, in this case, disabling the tests and turning on an extra level of tightness in error checking of the test suite was a net improvement.

At the same time I disabled those three tests, I made it so that any 'fail_compilation' test that causes the compiler to crash (segv, abort, or any other signal) fails the test. The three tests that caused the compiler to crash all have bugs filed for them and all have patches pending.
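A rough sketch of the distinction Brad describes, using std.process (the function name is hypothetical): a fail_compilation test should exit with a nonzero status from a diagnosed error, but dying on a signal means the compiler itself crashed, so the test fails.

```d
import std.process;

bool failCompilationTestPassed(string[] compileCmd)
{
    auto pid = spawnProcess(compileCmd);
    int status = wait(pid);   // on POSIX, negative means killed by a signal

    if (status < 0)
        return false;         // compiler crashed: segv, abort, etc.
    return status != 0;       // expected: compilation was rejected cleanly
}
```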

Anyway.. I considered adding something to do more formal disabling, but I hate disabled tests more than that would indicate.  I don't want to even have the mechanism.

Later,
Brad
September 02, 2010
I haven't run the Phobos unit tests since Walter's assert changes. How useful this would be to me depends on a couple of things:

1. Do unit tests print debug information mixed in with these failure printouts? If yes, that makes finding the failures kind of laborious / non-obvious.

2. Is there a way to differentiate failures that are known and not expected to be fixed soon from those that are unexpected failures?

Sent from my iPhone

On Sep 2, 2010, at 9:27 AM, Steve Schveighoffer <schveiguy at yahoo.com> wrote:

> When I run phobos unit tests, I get a few printouts like:
> 
> --- std.numeric(547) CustomFloat broken test ---
> 
> It seems to me like a good way to not forget about the test, but to also note that it's broken due to some compiler bug, is this a viable option for you?
> 
> -Steve
> 
> 
> 
> ----- Original Message ----
>> From: Jason House <jason.james.house at gmail.com>
>> To: Discuss the internals of DMD <dmd-internals at puremagic.com>
>> Sent: Thu, September 2, 2010 8:16:13 AM
>> Subject: Re: [dmd-internals] dmd commit, revision 657
>> 
>> The practice of commenting out unit tests is generally a bad idea. Historically, I've seen it done a lot in Phobos. It's bad because such tests are easily forgotten about.
>> 
>> NUnit has the [Ignore] attribute. Other test suites have expected failures. The basic idea is to leave tests that fail due to a legitimate bug such that the overall result is success but the details still show the problems.
>> 
>> As an example, cruise control with NUnit would colorize status bars based on what fraction of tests were success, failure, or ignored. (green, red, and yellow respectively)
>> 
>> I'll get off my soap box now...
>> 
>> Sent from my iPhone
>> 
>> On Sep 2, 2010, at 1:59 AM, "dsource.org" <noreply at dsource.org> wrote:
>> 
>>> dmd commit, revision 657
>>> 
>>> 
>>> user: braddr
>>> 
>>> msg:
>>> reenable fail274.d since the halt it was hitting has been removed
>>> 
>>> http://www.dsource.org/projects/dmd/changeset/657
>>> 

September 02, 2010
On Sep 2, 2010, at 2:07 PM, Brad Roberts <braddr at puremagic.com> wrote:

> I agree.  However, in this case, disabling the tests and turning on an extra level of tightness in error checking of the test suite was a net improvement.



> At the same time I disabled those three tests, I made it so that any 'fail_compilation' test that causes the compiler to crash (segv, abort, etc.. any signal) to fail the test.

Detecting seg faults as failures is definitely a good change.


> The three tests that caused the compiler to crash
> all have bugs filed for them and all have patches pending.

I'd argue that is a good reason to leave the failures in the test suite results. It's a reminder to go apply the patch!


> Anyway.. I considered adding something to do more formal disabling, but I hate disable tests more than that would indicate.  I don't want to even have the mechanism.

Here's how I'd rank things by level of dislike:
 - A crippled test suite where failing tests are deleted
 - A crippled test suite where failing tests are commented out
 - A complete test suite that always fails due to bugs way down on the priority list
 - A complete test suite that ignores tests but reports that it's ignoring tests
 - A complete test suite that runs all tests but is tolerant of hand-picked tests failing (xFAIL, etc...)
 - A complete test suite with all tests passing

The last one is unrealistic while under development. In fact, a "complete test suite" is also unrealistic, but a test suite should slowly become more and more thorough. Certainly, failing tests should not be removed.
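A hedged sketch of the xFAIL-style runner near the bottom of the list (all names here are hypothetical): known failures are tolerated and reported, and an unexpected pass is flagged so the test can be re-enabled.

```d
string[] failures, knownFailures, unexpectedPasses;

void runTest(string name, void delegate() test, bool expectFailure = false)
{
    bool passed = true;
    try
        test();
    catch (Throwable)
        passed = false;

    if (passed && expectFailure)
        unexpectedPasses ~= name;   // the underlying bug may be fixed
    else if (!passed && expectFailure)
        knownFailures ~= name;      // tolerated, but still reported
    else if (!passed)
        failures ~= name;           // a genuine, unexpected failure
}
```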
September 02, 2010
On Thu, 2 Sep 2010, Jason House wrote:

> Here's how I'd rank things by level of dislike:
>  - A crippled test suite where failing tests are deleted
>  - A crippled test suite where failing tests are commented out
>  - A complete test suite that always fails due to bugs way down on the priority list
>  - A complete test suite that ignores tests but reports that it's ignoring tests
>  - A complete test suite that runs all tests but is tolerant of hand-picked tests failing (xFAIL, etc...)
>  - A complete test suite with all tests passing
> 
> The last one is unrealistic while under development. In fact, a "complete test suite" is also unrealistic, but a test suite should slowly become more and more thorough. Certainly, failing tests should not be removed.

The last one is what we have, as long as you drop the word 'complete'. DMD's test suite is monotonically increasing in completeness, and releases don't comment out tests except in some extreme exceptions.. such as rolling back a change due to finding a regression that wasn't covered by the test suite.

ie:
  fix a bug + add a test that showed the bug
  discover a bug that fix caused that didn't have a test in the suite
  roll back the fix and the new test
  re-fix with both the new test and the newly discovered test

The major wish I have is that walter would start using the public test suite rather than the private one.  I'm working on getting the public one to work on windows -- making good progress, it runs most of the tests now.

Once walter actually uses that one, it'll be easier to monitor to make sure fixes come with tests. :)

Later,
Brad
September 02, 2010
On Sep 2, 2010, at 3:46 PM, Brad Roberts <braddr at puremagic.com> wrote:

> On Thu, 2 Sep 2010, Jason House wrote:
> 
>> Here's how I'd rank things by level of dislike:
>> - A crippled test suite where failing tests are deleted
>> - A crippled test suite where failing tests are commented out
>> - A complete test suite that always fails due to bugs way down on the priority list
>> - A complete test suite that ignores tests but reports that it's ignoring tests
>> - A complete test suite that runs all tests but is tolerant of hand-picked tests failing (xFAIL, etc...)
>> - A complete test suite with all tests passing
>> 
>> The last one is unrealistic while under development. In fact, a "complete test suite" is also unrealistic, but a test suite should slowly become more and more thorough. Certainly, failing tests should not be removed.
> 
> The last one is what we have, as long as you drop the word 'complete'.


At the moment, that makes sense for the public test suite. As I understand it, you're playing catch-up; you're actively converting tests from the private test suite into the public one.


>  DMD's test suite is monotonically increasing in completeness and releases
> don't comment out tests except in some extreme exceptions.. such as
> rolling back a change due to finding a regression that wasn't covered by the
> test suite.
> 
> ie:
>  fix a bug + add a test that showed the bug
>  discover a bug that fix caused that didn't have a test in the suite
>  roll back the fix and the new test
>  re-fix with both the new test and the newly discovered test

It should be simpler than that... When a bug is discovered, add a test. In your example above, two tests should be committed regardless of whether the attempted patch is committed.

Let's take a hypothetical case where Don attempts a patch and the new bug is best handled by Walter. The new tests represent a very clear communication of the problem space. Done right, that should make it easier for Walter to repeat Don's tests.

Similarly, many bugzilla entries have a "reduced test case" posted to them when Don confirms the bug. Such tests should also make their way into the test suite.

I have no idea how Don currently concludes that patch for bugzilla 1234333 also fixes bugzilla 333322 and 3462663. A test suite with failing tests is one way to determine such things.

PS: Don, I'm not trying to pick on you. I'm using you as an example since you're such a prolific contributor.

PPS: Brad, I'm not picking on you either. I'm just using recent emails as inspiration to bring up a long overdue topic. All your work on the public test suite is commendable. It's much easier for me to be an armchair philosopher.

> The major wish I have is that walter would start using the public test suite rather than the private one.  I'm working on getting the public one to work on windows -- making good progress, it runs most of the tests now.
> 
> Once walter actually uses that one, it'll be easier to monitor to make sure fixes come with tests. :)


That's a great goal!
September 02, 2010

Brad Roberts wrote:
> I don't want to even have the mechanism.
>
> 

Not everything needs a specific language mechanism.

    pragma(msg, "this unittest is disabled because of bugzilla 1234");
    //assert(foo == bar);


This way, you get dinged about it until it gets fixed.
September 02, 2010

Brad Roberts wrote:
> The major wish I have is that walter would start using the public test suite rather than the private one.  I'm working on getting the public one to work on windows -- making good progress, it runs most of the tests now.
>
> 

It already works fine on Windows, and I disagree that it needed reengineering so that it wouldn't.