January 17, 2017
Since my original post got no interest in the general section [1], I'm going to rehash it here:

Sporadic auto-tester failures often go unreported; the general attitude is "just restart the test."

I fixed the failures I found while simply looking at open PRs [2-4].
Then I put together a proof of concept (sketched below) which:
 - scrapes the auto-tester website and finds "random" failures
 - searches the log files for the test that failed
 - groups similar failures and outputs a report
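
For illustration, the gist of it looks something like this. This is not the actual code from [5]: the URL and the HTML/log patterns are made up, and the two-stage Python + D setup is collapsed into a single script here.

# Rough sketch only; adjust the URL and regexes to the real auto-tester pages.
import re
import urllib.request
from collections import defaultdict
from urllib.parse import urljoin

AUTO_TESTER_URL = "https://auto-tester.example.org/"  # placeholder, not the real address

def fetch(url):
    # Download a page or log as text.
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")

def failed_log_links(index_html):
    # Guess: failed runs link to their build logs; adapt the pattern to the real markup.
    return re.findall(r'href="([^"]*log[^"]*)"', index_html)

def failing_test(log_text):
    # Guess: the log names the failing unittest on a line starting with "FAIL".
    m = re.search(r'^FAIL\s+(\S+)', log_text, re.MULTILINE)
    return m.group(1) if m else "<unrecognised failure>"

def report(base_url):
    # Group identical failures and print the most frequent first.
    groups = defaultdict(int)
    for link in failed_log_links(fetch(base_url)):
        groups[failing_test(fetch(urljoin(base_url, link)))] += 1
    for test, count in sorted(groups.items(), key=lambda kv: -kv[1]):
        print("%4d  %s" % (count, test))

if __name__ == "__main__":
    report(AUTO_TESTER_URL)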

I've posted the code in [5]. I pipe the Python output to a file and pass the filename as an argument to the D program.
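
Concretely, the run looks something like this (the file and program names are just placeholders, not the real ones from the gist):

    python find_failures.py > failures.txt
    ./group_report failures.txt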

Shouldn't something similar be integrated into the auto-tester?

It's important to report "random" failures so that they can be addressed.
While many are caused by bad tests, they can have critical underlying causes (just look at [2]).
They can also make contributing a sour experience.

Thoughts?


P.S. PR [4] is sitting in review limbo due to Codecov's red X.


[1] https://forum.dlang.org/thread/hjkkpiavwqjlmnaskqfv@forum.dlang.org

[2] https://github.com/dlang/phobos/pull/4988
    https://github.com/dlang/phobos/pull/4993

[3] https://github.com/dlang/phobos/pull/4997

[4] https://github.com/dlang/phobos/pull/5004

[5] https://gist.github.com/WalterWaldron/59a52b610890911d3622e93e8bdf75ec