May 01, 2014
On Wed, 30 Apr 2014 14:35:45 -0700
Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com>
wrote:
> Agreed. I think we should look into parallelizing all unittests.

I'm all for parallelizing all unittest blocks that are pure, as doing so would be safe, but I think that we're making a big mistake if we try and insist that all unittest blocks be able to be run in parallel. Any that aren't pure are not guaranteed to be parallelizable, and any which access system resources or other global, mutable state stand a good chance of breaking.

If we make it so that the functions generated from unittest blocks have their purity inferred, then any unittest block which can safely be parallelized could then be parallelized by the test runner based on its purity, and any impure unittest functions could then be safely run in serial. And if you want to make sure that a unittest block is parallelizable, then you can just explicitly mark it as pure.

With that approach, we don't risk breaking existing unit tests, and it allows tests that need to not be run in parallel to work properly by guaranteeing that they're still run serially. And it even makes it so that many tests are automatically parallelizable without the programmer having to do anything special for it.
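
To make the idea concrete, here is a minimal sketch of how a runner could split tests by purity. It is not how druntime works today: the test functions are ordinary functions standing in for the ones generated from unittest blocks, and the names runsInParallel, allTests, parallelQueue and serialQueue are made up for illustration. Only explicitly declared purity is checked, since the inference described above is a proposal.

```d
import std.meta : AliasSeq;
import std.parallelism : parallel;
import std.traits : functionAttributes, FunctionAttribute;

// Hypothetical stand-ins for functions generated from unittest blocks.
void testAddition() pure { assert(2 + 2 == 4); }
void testConcat() pure { assert("ab" ~ "c" == "abc"); }

int callCount; // module-level mutable state, so the next test cannot be pure
void testCounter() { ++callCount; assert(callCount > 0); }

// True when the function is declared pure (assumed scheduling rule).
enum runsInParallel(alias test) =
    (functionAttributes!test & FunctionAttribute.pure_) != 0;

alias allTests = AliasSeq!(testAddition, testConcat, testCounter);

void main()
{
    void function()[] parallelQueue, serialQueue;

    // Sort the tests into two queues at compile time.
    static foreach (test; allTests)
    {
        static if (runsInParallel!test)
            parallelQueue ~= &test;
        else
            serialQueue ~= &test;
    }

    foreach (fn; parallel(parallelQueue)) fn(); // pure tests: any order, any thread
    foreach (fn; serialQueue) fn();             // impure tests: one at a time

    assert(parallelQueue.length == 2 && serialQueue.length == 1);
}
```

The impure tests run only after the parallel batch has finished, so they never overlap with anything else.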

- Jonathan M Davis
May 01, 2014
On Wednesday, 30 April 2014 at 15:43:35 UTC, Andrei Alexandrescu wrote:
> This brings up the issue of naming unittests. It's becoming increasingly obvious that anonymous unittests don't quite scale

A message structured like this would be awesome.

    Unittest Failed foo.d:345 Providing null input throws exception

> Last but not least, virtually nobody I know runs unittests and then main. This is quickly becoming an idiom:
>
> version(unittest) void main() {}
> else void main()
> {
>    ...
> }
>
> I think it's time to change that. We could do it the non-backward-compatible way by redefining -unittest to instruct the compiler to not run main. Or we could define another flag such as -unittest-only and then deprecate the existing one.

I would like to see -unittest redefined.
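
A rough sketch of how such a message could be produced, assuming tests carry a name as a user-defined attribute. Name, report and the two test functions are hypothetical, and the tests are plain functions rather than real unittest blocks. Failures are signalled with enforce so the file and line of the failing check are available on the exception.

```d
import std.exception : enforce;
import std.format : format;
import std.traits : getUDAs;

// Hypothetical attribute carrying a human-readable test name.
struct Name { string text; }

@Name("Addition is commutative")
void testAddition() { enforce(1 + 2 == 2 + 1); }

@Name("Providing null input throws exception")
void testNullInput()
{
    // Deliberately failing check, only here to show the report format.
    string input = null;
    enforce(input !is null, "input was null");
}

// Runs one test; returns null on success or a one-line failure report.
string report(alias test)()
{
    enum name = getUDAs!(test, Name)[0].text;
    try
    {
        test();
        return null;
    }
    catch (Exception e)
    {
        return format("Unittest Failed %s:%s %s", e.file, e.line, name);
    }
}

void main()
{
    import std.stdio : writeln;

    assert(report!testAddition() is null);
    writeln(report!testNullInput()); // file and line of the failing enforce, then the name
}
```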

May 01, 2014
On 4/30/14, 10:01 PM, Jonathan M Davis via Digitalmars-d wrote:
> I'm all for parallelizing all unittest blocks that are pure, as doing
> so would be safe, but I think that we're making a big mistake if we try
> and insist that all unittest blocks be able to be run in parallel. Any
> that aren't pure are not guaranteed to be parallelizable, and any which
> access system resources or other global, mutable state stand a good
> chance of breaking.

There are a number of assumptions here: (a) most unittests that can be effectively parallelized can be actually inferred (or declared) as pure; (b) most unittests that cannot be inferred as pure are likely to break; (c) it's a big deal if unittests break. I question all of these assumptions. In particular I consider unittests that depend on one another an effective antipattern that needs to be eradicated.

Andrei

May 01, 2014
On Wed, 30 Apr 2014 22:32:33 -0700
Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com>
wrote:

> On 4/30/14, 10:01 PM, Jonathan M Davis via Digitalmars-d wrote:
> > I'm all for parallelizing all unittest blocks that are pure, as doing so would be safe, but I think that we're making a big mistake if we try and insist that all unittest blocks be able to be run in parallel. Any that aren't pure are not guaranteed to be parallelizable, and any which access system resources or other global, mutable state stand a good chance of breaking.
> 
> There are a number of assumptions here: (a) most unittests that can be effectively parallelized can be actually inferred (or declared) as pure; (b) most unittests that cannot be inferred as pure are likely to break; (c) it's a big deal if unittests break. I question all of these assumptions. In particular I consider unittests that depend on one another an effective antipattern that needs to be eradicated.

Even if they don't depend on each other, they can depend on the system. std.file's unit tests will break if we parallelize them, because it operates on files and directories, and many of those tests operate on the same temp directories. That can be fixed by changing the tests, but it will break the tests. Other tests _can't_ be fixed if we force them to run in parallel. For instance, some of std.datetime's unit tests set the local time zone of the system in order to test that LocalTime works correctly. That sets it for the whole program, so all threads will be affected even if they're running other tests. Right now, this isn't a problem, because those tests set the timezone at their start and reset it at their end. But if they were made to run in parallel with any other tests involving LocalTime, there's a good chance that those tests would have random test failures. They simply can't be run in parallel due to a system resource that we can't make thread-local. So, regardless of how we want to mark up unittest blocks as parallelizable or not parallelizable (be it explicit, implicit, using pure, or using something else), we do need a way to make it so that a unittest block is not run in parallel with any other unittest block.

We can guarantee that pure functions can safely be run in parallel. We _cannot_ guarantee that impure functions can safely be run in parallel. I'm sure that many impure unittest functions could be safely run in parallel, but it would require that the programmer verify that if we don't want undefined behavior - just like programmers have to verify that @system code is actually @safe. Simply running all unittest blocks in parallel is akin to considering @system code @safe in a particular piece of code simply because by convention that code should be @safe.

pure allows us to detect guaranteed, safe parallelizability. If we want
to define some other way to make it so a unittest block can be marked as
parallelizable regardless of purity, then fine. But
automatically parallelizing impure functions means that we're going to
have undefined behavior for those unittest functions, and I really
think that that is a bad idea - in addition to the fact that some
unittest blocks legitimately cannot be run in parallel due to the use
of system resources, so parallelizing them _will_ not only break them
but make them impossible to write in a way that's not broken without
adding mutexes to the unittest blocks to stop the test runner from
running them in parallel. And IMHO, if we end up having to do that
anywhere, we've done something very wrong with how unit tests work.

- Jonathan M Davis
May 01, 2014
On 4/30/14, 11:31 PM, Jonathan M Davis via Digitalmars-d wrote:
> On Wed, 30 Apr 2014 22:32:33 -0700
> Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com>
> wrote:
>> There are a number of assumptions here: (a) most unittests that can
>> be effectively parallelized can be actually inferred (or declared) as
>> pure; (b) most unittests that cannot be inferred as pure are likely
>> to break; (c) it's a big deal if unittests break. I question all of
>> these assumptions. In particular I consider unittests that depend on
>> one another an effective antipattern that needs to be eradicated.
>
> Even if they don't depend on each other, they can depend on the system.

Understood, no need to repeat, thanks.

> std.file's unit tests will break if we parallelize them, because it
> operates on files and directories, and many of those tests operate on
> the same temp directories.

Yah, I remember even times when make unittest -j broke unittests because I've used the file name "deleteme" in multiple places. We need to fix those.

> That can be fixed by changing the tests, but
> it will break the tests.

I'm not too worried about breaking tests. I have in mind that we'll display a banner at the beginning of unittesting explaining that tests are run in parallel and to force serial execution they'd need to set this thing or that. In a way I don't see it as "breakage" in the traditional sense. Unittests are in a way supposed to break :o).

> Other tests _can't_ be fixed if we force them
> to run in parallel. For instance, some of std.datetime's unit tests set
> the local time zone of the system in order to test that LocalTime works
> correctly.

Sure. We could specify that tests are to be run serially within one specific module, or to use classic interlocking in the unittest code. I see it as a problem relatively easy to address.
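
For illustration, classic interlocking could look roughly like the sketch below. processTimeZone is a made-up __gshared variable standing in for process-wide state such as the system time zone, and timeZoneLock and testWithZone are hypothetical names. Every test that touches the shared state takes the same mutex, so the tests stay correct even when a runner executes them on several threads.

```d
import core.sync.mutex : Mutex;
import std.parallelism : parallel;

__gshared int processTimeZone; // stand-in for state shared by all threads
__gshared Mutex timeZoneLock;  // guards every access to processTimeZone

shared static this() { timeZoneLock = new Mutex; }

// Hypothetical test body: set the shared state, check it, restore it.
void testWithZone(int zone)
{
    timeZoneLock.lock();
    scope (exit) timeZoneLock.unlock(); // runs last

    immutable saved = processTimeZone;
    scope (exit) processTimeZone = saved; // runs first, while still locked

    processTimeZone = zone;
    assert(processTimeZone == zone); // no other test can change it in between
}

void main()
{
    // Simulate a runner that executes the tests on several threads.
    foreach (zone; parallel([1, 2, 3, 4]))
        testWithZone(zone);

    assert(processTimeZone == 0); // every test restored the original value
}
```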

> We can guarantee that pure functions can safely be run in parallel. We
> _cannot_ guarantee that impure functions can safely be run in parallel.
> I'm sure that many impure unittest functions could be safely run in
> parallel, but it would require that the programmer verify that if we
> don't want undefined behavior - just like programmers have to verify
> that @system code is actually @safe. Simply running all unittest blocks
> in parallel is akin to considering @system code @safe in a particular
> piece of code simply because by convention that code should be @safe.

I don't think undefined behavior is at stake here, and I find the simile invalid. Thread isolation is a done deal in D and we may as well take advantage of it. The worst that could happen is that a unittest sets a global and surprisingly the next one doesn't "see" it.
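
A small sketch of that isolation, with flag and flagSeenByNewThread as made-up names: a module-level variable is thread-local by default, so a value written on one thread is not visible to a freshly started one.

```d
import core.thread : Thread;

int flag; // module-level variables are thread-local by default in D

// Starts a new thread and returns the value of `flag` that thread observes.
int flagSeenByNewThread()
{
    int seen = -1;
    auto worker = new Thread({ seen = flag; });
    worker.start();
    worker.join();
    return seen;
}

void main()
{
    flag = 1;                           // set on the main thread only
    assert(flagSeenByNewThread() == 0); // the new thread has its own copy, still int.init
    assert(flag == 1);                  // the main thread's copy is untouched
}
```

State declared shared or __gshared, and anything owned by the operating system, is outside this guarantee.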

At any rate I think it's pointless to insist on limiting parallel running to pure - let me just say I understood the point (thanks) so there is no need to restate it, and that I think it doesn't take us to a good place.


Andrei
May 01, 2014
On Wednesday, 30 April 2014 at 20:20:26 UTC, Jacob Carlborg wrote:
> On 2014-04-30 19:30, Dicebot wrote:
>> I believe only missing step right now is propagation of UDA's to RTInfo
>> when demanded. Everything else can be done as Phobos solution.
>
> I don't see why this is necessary for this case.

It is not strictly necessary, but you can't reliably get all unit test blocks at compile time (they must be transitively imported), and the current run-time reflection for tests is missing any data other than the actual function pointers. I am personally perfectly satisfied with the "single root module imports all" approach, but it is likely to draw complaints if proposed as the "standard" way.
May 01, 2014
On Wednesday, 30 April 2014 at 21:49:06 UTC, Jonathan M Davis via Digitalmars-d wrote:
> On Wed, 30 Apr 2014 21:09:14 +0100
> Russel Winder via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
>> On Wed, 2014-04-30 at 11:19 -0700, Jonathan M Davis via Digitalmars-d
>> wrote:
>> > unittest blocks just like any other unit test. I would very much
>> > consider std.file's tests to be unit tests. But even if you don't
>> > want to call them unit tests, because they access the file system,
>> > the reality of the matter is that tests like them are going to be
>> > run in unittest blocks, and we have to take that into account when
>> > we decide how we want unittest blocks to be run (e.g. whether
>> > they're parallelizable or not).
>> 
>> In which case D is wrong to allow them in the unittest blocks and
>> should introduce a new way of handling these tests. And even then all
>> tests can and should be parallelized. If they cannot be then there is
>> an inappropriate dependency.
>
> Why? Because Andrei suddenly proposed that we parallelize unittest
> blocks? If I want to test a function, I'm going to put a unittest block
> after it to test it. If that means accessing I/O, then it means
> accessing I/O. If that means messing with mutable, global variables,
> then that means messing with mutable, global variables. Why should I
> have to put the tests elsewhere or make it so that they don't run when the
> -unittest flag is used just because they don't fall under your definition
> of "unit" test?

You do this because unit tests must be fast. You do this because unit tests must be naively parallel. You do this because unit tests verify basic application / library sanity and are expected to be run quickly after every build in a deterministic way (contrary to a full test suite, which can take hours).

Also you do that because doing _reliably_ correct tests with I/O is relatively complicated and one does not want to pollute actual source modules with all environment checks.

In the end it is all about supporting quick edit-compile-test development cycle.
May 01, 2014
On Wednesday, 30 April 2014 at 21:09:51 UTC, Átila Neves wrote:
>>> I don't know about anyone else, but I make my tests fail a lot.
>>
>> I think this is key difference. For me failing unit test is always exceptional situation.
>
> I TDD a lot. Tests failing are normal. Not only that, I refactor a lot as well. Which causes tests to fail. Fortunately I have tests failing to tell me I screwed up.

I dream of a day when TDD crap will be finally discarded and fade into oblivion.

> Even if failing tests were exceptional, I still want everything I just mentioned.

Probably. But will you still _need_ it? ;)

>>And if test group is complex
>> enough to require categorization then either my code is not procedural enough or module is just too big and needs to be split.
>>
>
> And when I split them I put them into a subcategory.

This is somewhat redundant as they are already categorized by module / package.
May 01, 2014
On Thursday, 1 May 2014 at 01:45:21 UTC, Xavier Bigand wrote:
> Splitting all features at an absolute atomic level can be achieved for open-source libraries, but it's pretty much impossible for industrial software. Why be so restrictive when it's possible to support both visions by extending the language a little with something already logical?

You are pretty much saying here "writing good code is possible for open-source libraries but not for industrial software".
May 01, 2014
On Wed, 30 Apr 2014 23:56:53 -0700
Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com>
wrote:
> I don't think undefined behavior is at stake here, and I find the simile invalid. Thread isolation is a done deal in D and we may as well take advantage of it. The worst that could happen is that a unittest sets a global and surprisingly the next one doesn't "see" it.
> 
> At any rate I think it's pointless to insist on limiting parallel running to pure - let me just say I understood the point (thanks) so there is no need to restate it, and that I think it doesn't take us to a good place.

I'm only arguing for using pure on the grounds that it _guarantees_ that the unittest block is safely parallelizable. If we decide that that guarantee isn't necessary, then we decide that it isn't necessary, though I definitely worry that not having that guarantee will be problematic. I do agree though that D's thread-local by default helps quite a bit in ensuring that most tests will be runnable in parallel. However, if we went with purity to indicate parallelizability, I could easily see doing it implicitly based on purity and allowing for a UDA or somesuch which marked a unittest block as "trusted pure" so that it could be run in parallel. So, I don't think that going with pure would necessarily be too restrictive. It just would require that the programmer do some extra work to be able to treat a unittest block as safely parallelizable when the compiler couldn't guarantee that it was.
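
As a sketch of what that opt-in might look like (trustedParallel and canRunInParallel are invented names, not an existing or proposed API), the check could accept a test that is either declared pure or explicitly marked by the programmer:

```d
import std.traits : functionAttributes, FunctionAttribute, hasUDA;

enum trustedParallel; // hypothetical marker: "I have verified this is safe in parallel"

void testMath() pure { assert(3 * 3 == 9); }

int scratch; // thread-local, so the programmer vouches for the test below
@trustedParallel void testScratch() { scratch = 1; assert(scratch == 1); }

__gshared int processWide; // shared by all threads, so the test stays serial
void testProcessWide() { processWide = 42; assert(processWide == 42); }

// Parallel when the compiler can prove purity or the programmer opted in.
enum canRunInParallel(alias test) =
    (functionAttributes!test & FunctionAttribute.pure_) != 0
    || hasUDA!(test, trustedParallel);

static assert(canRunInParallel!testMath);
static assert(canRunInParallel!testScratch);
static assert(!canRunInParallel!testProcessWide);

void main() {}
```

As with @trusted, the marker moves the burden of proof from the compiler to whoever wrote it.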

Ultimately, my biggest concern here is that it be possible to guarantee
that a unittest block is not run in parallel with any other unittest
block if that particular unittest requires it for any reason, and some
folks seem to be arguing that such tests are always invalid, and I
want to make sure that we don't ever consider that to be the case for
unittest blocks in D. If we do parallel by default and allow for some
kind of markup to make a unittest block serial, then that can work. I
fully expect that switching to parallel by default would break a number
of tests, which I do think is a problem (particularly since a number
of those tests will be completely valid), but it could also be an
acceptable one - especially if for the most part, the code that it
breaks is badly written code. Regardless, we will need to make sure that
we message the change clearly in order to ensure that a minimal number
of people end up with random test failures due to the change.

On a side note, regardless of whether we want to use purity to infer
parallelizability, I think that it's very cool that we have the
capability to do so if we so choose, whereas most other languages
have no way of even coming close to being able to tell whether a
function can be safely parallelized or not. The combination of
attributes such as pure and compile-time inference is very cool indeed.

- Jonathan M Davis