Thread overview
Serious Problems with the Test Suite
Jun 17, 2020
Walter Bright
Jun 18, 2020
Avrina
Jun 18, 2020
H. S. Teoh
Jun 18, 2020
Avrina
Jun 18, 2020
Walter Bright
Jun 18, 2020
H. S. Teoh
Jun 19, 2020
Walter Bright
Issues with optlink
Jun 19, 2020
Mathias LANG
Jun 18, 2020
Stefan Koch
Jun 19, 2020
Walter Bright
June 17, 2020
A good test suite should:

1. verify that things that are supposed to work do work

2. when things don't verify, point to where the problem is

The D test suite fails miserably at point 2. The only bright spot is the autotester, where, when one of the tests fails, it's quick to find the source of the problem.

But I cringe every time something else fails, because then I know I'm in for hours or even DAYS trying to figure out what and where things went wrong.

For example,

https://github.com/dlang/dmd/pull/11287

has several failures. All of which come with USELESS log files. I have no idea what went wrong. Some principles for log files:

1. If the log file says ERROR, it should be an ERROR, i.e. the test should fail. I'm often confronted with log files that list multiple ERRORs, but never mind, those errors don't need to pass. All benign ERROR messages, all deprecation messages, all warning messages need to be fixed, so that when the log file says ERROR that's why the test failed.

2. The ERROR that causes the test to fail should be the LAST line in the log file, not 300 lines back.

3. Log files need to contain comment text at each step to SAY WHAT THEY ARE DOING.

4. Makefiles should NEVER, EVER be run in "quiet" mode, for the simple reason that one has no idea what it was trying to do when it failed.

5. Test files must either include a URL to the bugzilla issue they fix or have some clue in the comments what they are doing.

6. Running tests multi-process makes them go faster, but since the log files randomly interleave the output from them, it makes it impossible to figure out where the failure is.

7. Any test that fails because of a network error, or other environmental error unrelated to what is being tested, should automatically sleep for a minute or ten, then try again.

8. Any timeout terminations MUST say which test timed out.

9. Tests should not be Rube Goldberg Machines with layers and layers of complexity before the actual test is even run. Tests should be a THIN layer over the test.

10. Many tests are UTTERLY UNDOCUMENTED. For example,

https://github.com/dlang/dmd/tree/master/test/unit

What is that? What does it do? Is it one test or many tests? Let's look at:

https://github.com/dlang/dmd/blob/master/test/unit/frontend.d

Not a SINGLE COMMENT in it. What it is, what it does, etc., is all left to the imagination. This is completely unacceptable for production code, it is also unacceptable for any code accepted into the D repository.

11. Every time we run into "oh, that's just a heisenbug, try re-running the test" that is a BUG in the test suite and needs to be fixed. Those are gigantic time and resource wasting problems.
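
To make points 2 and 6 concrete, here is a minimal sketch in Python. It is not how the D test runner works: `run_one` and `run_all` are hypothetical names and the test commands are stand-ins. The idea is to capture each test's output in its own buffer while running in parallel, then write the buffers out one test at a time with failures last, so the log never interleaves and ends at the failure.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(name, cmd):
    """Run one test command and capture its output instead of streaming it."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return name, proc.returncode, proc.stdout


def run_all(tests, jobs=4):
    """Run tests in parallel; return (log lines, number of failures).

    Passing tests are logged first and failing tests last, so the end of
    the log is always the output of a failing test when there is one.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(lambda t: run_one(*t), tests))
    passed = [r for r in results if r[1] == 0]
    failed = [r for r in results if r[1] != 0]
    lines = []
    for name, code, out in passed + failed:
        status = "PASS" if code == 0 else "FAIL"
        # Say what is being done before showing the output of that step.
        lines.append(f"=== {name}: running, output follows ===")
        lines.extend(out.splitlines())
        lines.append(f"=== {name}: {status} (exit code {code}) ===")
    return lines, len(failed)


if __name__ == "__main__":
    # Hypothetical stand-ins for real test commands.
    demo = [
        ("ok_test", [sys.executable, "-c", "print('all good')"]),
        ("bad_test", [sys.executable, "-c", "import sys; print('boom'); sys.exit(1)"]),
    ]
    log, failures = run_all(demo)
    print("\n".join(log))
```

The trade-off is that nothing is printed until the tests finish, which is usually acceptable for a CI log that is only read after the fact.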
June 18, 2020
On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
> 11. Every time we run into "oh, that's just a heisenbug, try re-running the test" that is a BUG in the test suite and needs to be fixed. Those are gigantic time and resource wasting problems.

I've run into these problems with, for example, optlink. When trying to get optlink removed, you prevent it. These heisenbugs exist because, a lot of the time, you aren't willing to chop off dead weight.
June 18, 2020
On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
> A good test suite should:
>
> 1. verify that things that are supposed to work do work
>
> [...]

Most of those could be fixed with an improved test runner, for instance if we did a timeout per test.

Another obvious improvement would be printing only the tests which failed.
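
A rough sketch of both ideas, in Python rather than in the actual dmd test runner; `run_with_timeout` and `report_failures` are made-up names, and the 60 second default is an arbitrary assumption. Each test gets its own timeout, the timeout message names the test, and only failing tests are reported.

```python
import subprocess
import sys


def run_with_timeout(name, cmd, timeout_s):
    """Run one test; return (ok, message). The message always names the test."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return False, f"TIMEOUT: {name} exceeded {timeout_s}s"
    if proc.returncode != 0:
        return False, f"FAIL: {name} (exit code {proc.returncode})\n{proc.stdout}"
    return True, f"PASS: {name}"


def report_failures(tests, timeout_s=60):
    """Run every test and return messages for the failing ones only."""
    failures = []
    for name, cmd in tests:
        ok, message = run_with_timeout(name, cmd, timeout_s)
        if not ok:
            failures.append(message)
    return failures


if __name__ == "__main__":
    # Hypothetical stand-ins for real test commands.
    demo = [
        ("fast_test", [sys.executable, "-c", "print('ok')"]),
        ("slow_test", [sys.executable, "-c", "import time; time.sleep(30)"]),
    ]
    for message in report_failures(demo, timeout_s=2):
        print(message)
```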

As for the missing comments, I think that's a plus.
When introducing a change in how dmd interprets D's semantics, one should be forced to scratch their head.
June 17, 2020
On Thu, Jun 18, 2020 at 01:59:39AM +0000, Avrina via Digitalmars-d wrote:
> On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
> > 11. Every time we run into "oh, that's just a heisenbug, try re-running the test" that is a BUG in the test suite and needs to be fixed. Those are gigantic time and resource wasting problems.
> 
> I've run into these problems with, for example, optlink. When trying to get optlink removed, you prevent it. These heisenbugs exist because, a lot of the time, you aren't willing to chop off dead weight.

Whoa, holey miss the point batman!  Optlink may have its own share of issues, but the problem here isn't with this or that piece of software, it's with the structure of the testsuite.

Tests that are non-deterministic or depend on external state, strictly speaking, shouldn't be in the test suite. This includes tests that involve downloading some remote resource over the network, tests that assume things about the host OS and filesystem, etc.  There are a couple of these in the test suite, and they put you at the mercy of external state which is beyond your control. (I remember one time there was a heisenbug that had to do with random number generators, meaning, its probability of arbitrary, totally coincidental failure was non-zero. Sigh.)

These tests ought to be removed, or at least disabled in CI.  Any time you depend on external state, it really does not belong in the test suite, or at least, it does not belong in the autotester, because it just leads to tons of wasted time trying to track down exactly what it is that failed, which most of the time isn't even relevant to the PR you're trying to push through.
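
As a rough illustration only (Python's `unittest`, not the D test suite; the class names, the `load_suite` helper and the `TESTS_ALLOW_EXTERNAL` variable are all hypothetical), the split could look like this: tests that use randomness take a fixed seed so every run sees the same input, and tests that touch external state live in a separate group that is loaded only on request.

```python
import os
import random
import unittest


class DeterministicTests(unittest.TestCase):
    def test_shuffle_with_fixed_seed(self):
        # A fixed seed makes the "random" input identical on every run,
        # so a failure here can always be reproduced.
        rng = random.Random(12345)
        data = list(range(10))
        rng.shuffle(data)
        self.assertEqual(sorted(data), list(range(10)))


class ExternalStateTests(unittest.TestCase):
    def test_download(self):
        # Placeholder for a test that would fetch a remote resource.
        pass


def load_suite(allow_external):
    """Build the suite; external-state tests are included only on request."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(DeterministicTests))
    if allow_external:
        suite.addTests(loader.loadTestsFromTestCase(ExternalStateTests))
    return suite


if __name__ == "__main__":
    # Hypothetical switch: left unset in CI, set to "1" for a manual run.
    allow = os.environ.get("TESTS_ALLOW_EXTERNAL") == "1"
    unittest.TextTestRunner(verbosity=2).run(load_suite(allow))
```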


T

-- 
MASM = Mana Ada Sistem, Man!
June 18, 2020
On Thursday, 18 June 2020 at 02:34:42 UTC, H. S. Teoh wrote:
> On Thu, Jun 18, 2020 at 01:59:39AM +0000, Avrina via Digitalmars-d wrote:
>> On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
>> > 11. Every time we run into "oh, that's just a heisenbug, try re-running the test" that is a BUG in the test suite and needs to be fixed. Those are gigantic time and resource wasting problems.
>> 
>> I've run into these problems with, for example, optlink. When trying to get optlink removed, you prevent it. These heisenbugs exist because, a lot of the time, you aren't willing to chop off dead weight.
>
> Whoa, holey miss the point batman!  Optlink may have its own share of issues, but the problem here isn't with this or that piece of software, it's with the structure of the testsuite.

There are issues with optlink; I've seen them manifest in the testsuite, where just running the test again "fixes" it. It's not the only problem where this has occurred.

I'm sure there are more problems with the test suite, and it is rather messy and has grown slow. I was replying specifically to the point about "heisenbugs", some of which are of Walter's own creation due to his refusal to accept change.
June 18, 2020
On 6/18/2020 7:38 AM, Avrina wrote:
> There are issues with optlink; I've seen them manifest in the testsuite, where just running the test again "fixes" it. It's not the only problem where this has occurred.

I've run those tests more than anyone, and have not seen an optlink heisenbug.
June 18, 2020
On Thu, Jun 18, 2020 at 02:40:33PM -0700, Walter Bright via Digitalmars-d wrote:
> On 6/18/2020 7:38 AM, Avrina wrote:
> > There are issues with optlink; I've seen them manifest in the testsuite, where just running the test again "fixes" it. It's not the only problem where this has occurred.
> 
> I've run those tests more than anyone, and have not seen an optlink heisenbug.

I think it's because Walter uses advanced quantum technology that can directly handle quantum-superimposed computation states [1], so none of these heisenbugs affect him. ;-)

[1] https://forum.dlang.org/post/mailman.3657.1591403118.31109.digitalmars-d@puremagic.com


T

-- 
English is useful because it is a mess. Since English is a mess, it maps well onto the problem space, which is also a mess, which we call reality. Similarly, Perl was designed to be a mess, though in the nicest of all possible ways. -- Larry Wall
June 18, 2020
I've added a new keyword TestSuite and here are the current test suite bugs that I found:

https://issues.dlang.org/buglist.cgi?keywords=TestSuite&list_id=231900
June 18, 2020
On 6/18/2020 3:20 PM, H. S. Teoh wrote:
> On Thu, Jun 18, 2020 at 02:40:33PM -0700, Walter Bright via Digitalmars-d wrote:
>> I've run those tests more than anyone, and have not seen an optlink
>> heisenbug.
> 
> I think it's because Walter uses advanced quantum technology that can
> directly handle quantum-superimposed computation states [1], so none of
> these heisenbugs affect him. ;-)
> 
> [1] https://forum.dlang.org/post/mailman.3657.1591403118.31109.digitalmars-d@puremagic.com

That's not an optlink issue.
June 19, 2020
On Friday, 19 June 2020 at 00:54:15 UTC, Walter Bright wrote:
> On 6/18/2020 3:20 PM, H. S. Teoh wrote:
>> On Thu, Jun 18, 2020 at 02:40:33PM -0700, Walter Bright via Digitalmars-d wrote:
>>> I've run those tests more than anyone, and have not seen an optlink
>>> heisenbug.
>> 
>> I think it's because Walter uses advanced quantum technology that can
>> directly handle quantum-superimposed computation states [1], so none of
>> these heisenbugs affect him. ;-)
>> 
>> [1] https://forum.dlang.org/post/mailman.3657.1591403118.31109.digitalmars-d@puremagic.com
>
> That's not an optlink issue.

Starting a new thread so as not to derail the original topic, which contained valid points.

Optlink has been a pain for everyone on x86 Windows for a while. I personally use Linux and Mac OSX, but tried doing some work on Windows recently and the first thing I got was a linker crash.

There have been active steps taken to limit its use / reduce the exposure of new users to it, among them:
- Dub defaults to mscoff since v1.15.0, and that has drastically improved the UX for new users. See https://github.com/dlang/dub/pull/1661 for the many reasons this was done.
- Vibe.d recently dropped support for it because it was causing crashes / timeouts: https://github.com/vibe-d/vibe.d/pull/2445
- This was tried in DMD, and you obviously shut it down: https://github.com/dlang/dmd/pull/8347 . I will just quote the last post by Manu here: "I don't have the energy to pursue this. I do think it's important though."

And yes, they are documented, advertised, and have been advocated for years, yet you refused to listen to the feedback countless users have given.