Parallel execution of unittests (page 5) - D Programming Language Discussion Forum

April 30, 2014

Re: Parallel execution of unittests

Posted by Jonathan M Davis
in reply to monarch_dodra

On Wed, 30 Apr 2014 18:53:22 +0000
monarch_dodra via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On Wednesday, 30 April 2014 at 15:54:42 UTC, bearophile wrote:
> >> We've resisted named unittests but I think there's enough evidence to make the change.
> >
> > Yes, the optional name for unittests is an improvement:
> >
> > unittest {}
> > unittest foo {}
> >
> > I am very glad your coworker finds such usability problems :-)
> 
> If we do "name" the unittests, then can we name them with strings? No need to pollute namespace with ugly symbols. Also:
> 
> //----
> unittest "Sort: Non-Lvalue RA range" { ... }
> //----
> vs
> //----
> unittest SortNonLvalueRARange { ... }
> //----

It would be simple enough to avoid polluting the namespace. IIRC, right now, the unittest blocks get named after the line number that they're on. All we'd have to do is change it so that their name included the name given by the programmer rather than being the name given by the programmer. e.g.

unittest(testFoo)
{
}

results in a function called something like unittest_testFoo.
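
An editorial sketch of one way to get a name without adding a symbol, using features that exist in current D rather than the unittest(testFoo) syntax proposed above: attach a string user-defined attribute to the block and read it back with __traits(getUnitTests). The helper testNames and the attribute text are illustrative assumptions, and the list is only populated when the module is compiled with -unittest.

```d
// A string UDA carries the human-readable name; no new identifier
// is introduced into the module's namespace.
@("Sort: non-lvalue RA range")
unittest
{
    assert(1 + 1 == 2);
}

/// Collects the string UDAs attached to this module's unittest blocks.
/// (Hypothetical helper; returns an empty array without -unittest.)
string[] testNames()
{
    string[] names;
    foreach (test; __traits(getUnitTests, mixin(__MODULE__)))
    {
        foreach (attr; __traits(getAttributes, test))
        {
            // Keep only string attributes; ignore any other UDAs.
            static if (is(typeof(attr) == string))
                names ~= attr;
        }
    }
    return names;
}
```

A test runner could print these names next to failures instead of a line number.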

- Jonathan M Davis

April 30, 2014

Re: Parallel execution of unittests

Posted by Andrei Alexandrescu
in reply to Jonathan M Davis

On 4/30/14, 2:25 PM, Jonathan M Davis via Digitalmars-d wrote:
> Sure, that helps, but it's trivial to write a unittest block which
> depends on a previous unittest block, and as soon as a unittest block
> uses an external resource such as a socket or file, then even if a
> unittest block doesn't directly depend on the end state of a
> previous unittest block, it still depends on external state which could
> be affected by other unittest blocks. So, ultimately, the language
> really doesn't ensure that running a unittest block can be
> parallelized. If it's pure as bearophile suggested, then it can be
> done, but as long as a unittest block is impure, then it can rely on
> global state - even inadvertently - (be it state directly in the program
> or state outside the program) and therefore not work when parallelized.
> So, I suppose that you could parallelize unittest blocks if they were
> marked as pure (though I'm not sure if that's currently a legal thing
> to do), but impure unittest blocks aren't guaranteed to be
> parallelizable.

Agreed. I think we should look into parallelizing all unittests. -- Andrei

April 30, 2014

Re: Parallel execution of unittests

Posted by Andrej Mitrovic
in reply to Andrei Alexandrescu

On 4/30/14, Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> This brings up the issue of naming unittests.

See also this ER where I discuss why I wanted this recently: https://issues.dlang.org/show_bug.cgi?id=12473

April 30, 2014

Re: Parallel execution of unittests

Posted by Jonathan M Davis

On Wed, 30 Apr 2014 21:09:14 +0100
Russel Winder via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On Wed, 2014-04-30 at 11:19 -0700, Jonathan M Davis via Digitalmars-d wrote:
> > unittest blocks just like any other unit test. I would very much consider std.file's tests to be unit tests. But even if you don't want to call them unit tests, because they access the file system, the reality of the matter is that tests like them are going to be run in unittest blocks, and we have to take that into account when we decide how we want unittest blocks to be run (e.g. whether they're parallelizable or not).
> 
> In which case D is wrong to allow them in the unittest blocks and should introduce a new way of handling these tests. And even then all tests can and should be parallelized. If they cannot be then there is an inappropriate dependency.

Why? Because Andrei suddenly proposed that we parallelize unittest blocks? If I want to test a function, I'm going to put a unittest block after it to test it. If that means accessing I/O, then it means accessing I/O. If that means messing with mutable, global variables, then that means messing with mutable, global variables. Why should I have to put the tests elsewhere or make it so that they don't run when the -unittest flag is used just because they don't fall under your definition of "unit" test?

There is nothing in the language which has ever mandated that unittest blocks be parallelizable or that they be pure (which is essentially what you're saying all unittest blocks should be). And restricting unittest blocks so that they have to be pure (be it conceptually pure or actually pure) would be a _loss_ of functionality.

Sure, let's make it possible to parallelize unittest blocks where appropriate, but I contest that we should start requiring that unittest blocks be pure (which is what a function has to be in order to be parallelized whether it's actually marked as pure or not). That would force us to come up with some other testing mechanism to run those tests when there is no need to do so (and I would argue that there is no compelling reason to do so other than ideology with regards to what is truly a "unit" test).

On the whole, I think that unittest blocks work very well as they are. If we want to expand on their features, then great, but let's do so without adding new restrictions to them.

- Jonathan M Davis

April 30, 2014

Re: Parallel execution of unittests

Posted by H. S. Teoh

On Wed, Apr 30, 2014 at 02:48:38PM -0700, Jonathan M Davis via Digitalmars-d wrote:
> On Wed, 30 Apr 2014 21:09:14 +0100
> Russel Winder via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
[...]
> > In which case D is wrong to allow them in the unittest blocks and should introduce a new way of handling these tests. And even then all tests can and should be parallelized. If they cannot be then there is an inappropriate dependency.
> 
> Why? Because Andrei suddenly proposed that we parallelize unittest blocks? If I want to test a function, I'm going to put a unittest block after it to test it. If that means accessing I/O, then it means accessing I/O. If that means messing with mutable, global variables, then that means messing with mutable, global variables. Why should I have to put the tests elsewhere or make it so that they don't run when the -unittest flag is used just because they don't fall under your definition of "unit" test?
[...]

What about allowing pure marking on unittests, and those unittests that are marked pure will be parallelized, and those that aren't marked will be run serially?
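
An editorial sketch of that split, modelled with ordinary functions rather than real unittest blocks: anything typed as pure goes through std.parallelism, everything else runs in order on the calling thread. The names runTests, arithmeticTest, counterTest and sharedCounter are made up for illustration, and this is not how druntime schedules tests.

```d
import std.parallelism : parallel;

alias PureTest = void function() pure;
alias ImpureTest = void function();

int sharedCounter; // mutable module-level state (thread-local by default in D)

// Touches no mutable global state, so the compiler accepts it as pure.
void arithmeticTest() pure
{
    assert(2 * 21 == 42);
}

// Mutates module-level state, so it cannot be marked pure.
void counterTest()
{
    ++sharedCounter;
}

/// Pure tests are handed to the default task pool; impure tests run serially.
void runTests(PureTest[] pureTests, ImpureTest[] impureTests)
{
    foreach (test; parallel(pureTests, 1)) // work unit size: one test per task
        test();
    foreach (test; impureTests)
        test();
}
```

Calling runTests([&arithmeticTest], [&counterTest]) runs the first list concurrently and the second in sequence. The type system does the sorting here: &counterTest does not convert to PureTest, so an impure test cannot end up in the parallel list by accident.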


T

-- 
Amateurs built the Ark; professionals built the Titanic.

April 30, 2014

Re: Parallel execution of unittests

Posted by H. S. Teoh

On Wed, Apr 30, 2014 at 02:25:22PM -0700, Jonathan M Davis via Digitalmars-d wrote: [...]
> Sure, that helps, but it's trivial to write a unittest block which depends on a previous unittest block, and as soon as a unittest block uses an external resource such as a socket or file, then even if a unittest block doesn't directly depend on the end state of a previous unittest block, it still depends on external state which could be affected by other unittest blocks.

In this case I'd argue that the test was poorly-written. I can see multiple unittests using, say, the same temp filename for testing file I/O, in which case they shouldn't be parallelized; but if a unittest depends on a file created by a previous unittest, then something is very, very wrong with the unittest.

[...]
> I'm inclined to think that marking unittest blocks as pure to parallelize them is a good idea, because then the unittest blocks that are guaranteed to be parallelizable are run in parallel, whereas those that aren't wouldn't be.

Agreed.

> The primary downside would be the cases where the programmer knew that the unittest blocks could be parallelized but they weren't pure, since those unittest blocks wouldn't be parallelized.
[...]

Is it a big loss to have *some* unittests non-parallelizable? (I don't know, do we have hard data on this front?)

T

-- 
The two rules of success: 1. Don't tell everything you know. -- YHL

April 30, 2014

Re: Parallel execution of unittests

Posted by Nordlöw
in reply to H. S. Teoh

> What about allowing pure marking on unittests, and those unittests that
> are marked pure will be parallelized, and those that aren't marked will
> be run serially?

I guess that goes for inferred purity as well...

May 01, 2014

Re: Parallel execution of unittests

Posted by Steven Schveighoffer
in reply to Jonathan M Davis

On Wed, 30 Apr 2014 13:50:10 -0400, Jonathan M Davis via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On Wed, 30 Apr 2014 08:59:42 -0700
> Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com>
> wrote:
>
>> On 4/30/14, 8:54 AM, bearophile wrote:
>> > Andrei Alexandrescu:
>> >
>> >> A coworker mentioned the idea that unittests could be run in
>> >> parallel
>> >
>> > In D we have strong purity to make more safe to run code in
>> > parallel:
>> >
>> > pure unittest {}
>>
>> This doesn't follow. All unittests should be executable concurrently.
>> -- Andrei
>>
>
> In general, I agree. In reality, there are times when having state
> across unit tests makes sense - especially when there's expensive setup
> required for the tests.

int a;
unittest
{
   // set up a;
}

unittest
{
   // use a;
}

==>

unittest
{
   int a;
   {
      // set up a;
   }
   {
      // use a;
   }
}

It makes no sense to do it the first way, you are not gaining anything.

> Honestly, the idea of running unit tests in parallel makes me very
> nervous. In general, across modules, I'd expect it to work, but there
> will be occasional cases where it will break.

Then you didn't write your unit tests correctly. True unit tests, anyway.

In fact, the very quality that makes unit tests so valuable (that they are independent of other code) is ruined by sharing state across tests. If you are going to share state, it really is one unit test.

> Across the unittest
> blocks in a single module, I'd be _very_ worried about breakage. There
> is nothing whatsoever in the language which guarantees that running
> them in parallel will work or even makes sense. All that protects us is
> the convention that unit tests are usually independent of each other,
> and in my experience, it's common enough that they're not independent
> that I think that blindly enabling parallelization of unit tests across
> a single module is definitely a bad idea.

I think that if we add the assumption, the resulting fallout would be easy to fix.

Note that we can't require unit tests to be pure -- non-pure functions need testing too :)

I can imagine that even if you could only parallelize 90% of unit tests, that would be an effective optimization for a large project. In such a case, the rare (and I mean rare to the point of I can't think of a single use-case) need to deny parallelization could be marked.
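
An editorial sketch of what such a marking could look like with a user-defined attribute. The @serial attribute and the mayRunInParallel check are hypothetical names, not part of druntime or Phobos; the same attribute could equally be placed on a unittest block.

```d
import std.traits : hasUDA;

enum serial; // hypothetical opt-out marker

void independentTest()
{
    assert(1 + 1 == 2);
}

// Imagine this test writes to a fixed temp file name shared with other tests.
@serial void tempFileTest()
{
}

/// True when a runner would be allowed to schedule `test` on a worker thread.
enum bool mayRunInParallel(alias test) = !hasUDA!(test, serial);

static assert(mayRunInParallel!independentTest);
static assert(!mayRunInParallel!tempFileTest);
```

Parallel execution stays the default under this scheme, and only the rare test that needs ordering has to say so.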

-Steve

May 01, 2014

Re: Parallel execution of unittests

Posted by Steven Schveighoffer
in reply to Andrei Alexandrescu

On Wed, 30 Apr 2014 11:43:31 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Hello,
>
>
> A coworker mentioned the idea that unittests could be run in parallel (using e.g. a thread pool). I've rigged things to run unittests in parallel across modules, and that works well. However, this is too coarse-grained - it would be great if each unittest could be pooled across the thread pool. That's more difficult to implement.

I am not sure, but are unit-test blocks one function each, or one function per module? If the latter, that would have to be changed.

> This brings up the issue of naming unittests. It's becoming increasingly obvious that anonymous unittests don't quite scale - coworkers are increasingly talking about "the unittest at line 2035 is failing" and such. With unittests executing in multiple threads and issuing e.g. logging output, this is only likely to become more exacerbated. We've resisted named unittests but I think there's enough evidence to make the change.

I would note this enhancement, which Walter agreed should be done at DConf '13 ;)

https://issues.dlang.org/show_bug.cgi?id=10023

Jacob Carlborg has tried to make this work, but the PR has not been pulled yet (I think it needs some updating at least, and there were some unresolved questions IIRC).

> Last but not least, virtually nobody I know runs unittests and then main. This is quickly becoming an idiom:
>
> version(unittest) void main() {}
> else void main()
> {
>     ...
> }
>
> I think it's time to change that. We could do it the non-backward-compatible way by redefining -unittest to instruct the compiler to not run main. Or we could define another flag such as -unittest-only and then deprecate the existing one.

The runtime can intercept this parameter. I would like a mechanism to decide at runtime whether main is run.

We need no compiler modifications to effect this.
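
An editorial sketch of a runtime-side decision, written against the Runtime.extendedModuleUnitTester hook in core.runtime. The --run-main switch and the names wantsMain, testsThenMaybeMain and installTester are assumptions made up for illustration; druntime does not define them.

```d
import core.runtime : Runtime, UnitTestResult;

/// Hypothetical policy: main runs after the tests only if this switch is given.
bool wantsMain(const string[] args)
{
    foreach (arg; args)
        if (arg == "--run-main") // assumed flag name, not a druntime option
            return true;
    return false;
}

/// Runs every module's unittests and tells the runtime whether to call main.
UnitTestResult testsThenMaybeMain()
{
    UnitTestResult result;
    foreach (m; ModuleInfo)
    {
        if (m is null)
            continue;
        auto test = m.unitTest; // null when the module has no unittests
        if (test is null)
            continue;
        ++result.executed;
        try
        {
            test();
            ++result.passed;
        }
        catch (Throwable t)
        {
            // Counted as a failure: executed was incremented, passed was not.
        }
    }
    result.summarize = true;
    result.runMain = result.passed == result.executed && wantsMain(Runtime.args);
    return result;
}

/// Call this from a `shared static this()` so the hook is set before tests run.
void installTester()
{
    Runtime.extendedModuleUnitTester = &testsThenMaybeMain;
}
```

With the hook installed, the same binary skips main by default and runs it when started with the assumed --run-main switch, with no compiler change involved.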

> Thoughts? Would anyone want to work on such stuff?

I can probably take a look at changing the unittests to avoid main without a runtime parameter. I have a good grasp on how the pre-main runtime works, having rewritten the module constructor algorithm a while back.

I am hesitant to run all unit tests in parallel without an opt-out mechanism. The above enhancement being implemented would give us some ways to play around, though.

-Steve

May 01, 2014

Re: Parallel execution of unittests

Posted by Xavier Bigand
in reply to Andrei Alexandrescu

Le 30/04/2014 17:43, Andrei Alexandrescu a écrit :
> Hello,
>
>
> A coworker mentioned the idea that unittests could be run in parallel
> (using e.g. a thread pool). I've rigged things to run unittests in
> parallel across modules, and that works well. However, this is too
> coarse-grained - it would be great if each unittest could be pooled
> across the thread pool. That's more difficult to implement.
>
I think it's a great idea, mainly for TDD. I have experimented with it in Java, and when execution time grows, TDD rapidly loses its efficiency.
Some Eclipse plug-ins are able to run tests in parallel, if I remember correctly.

> This brings up the issue of naming unittests. It's becoming increasingly
> obvious that anonymous unittests don't quite scale - coworkers are
> increasingly talking about "the unittest at line 2035 is failing" and
> such. With unittests executing in multiple threads and issuing e.g.
> logging output, this is only likely to become more exacerbated. We've
> resisted named unittests but I think there's enough evidence to make the
> change.
>
IMO naming is important for reporting tools (test status, benchmarks, ...). Unittests evolve with the rest of the code.

> Last but not least, virtually nobody I know runs unittests and then
> main. This is quickly becoming an idiom:
>
> version(unittest) void main() {}
> else void main()
> {
>     ...
> }
>
> I think it's time to change that. We could do it the
> non-backward-compatible way by redefining -unittest to instruct the
> compiler to not run main. Or we could define another flag such as
> -unittest-only and then deprecate the existing one.
>
> Thoughts? Would anyone want to work on such stuff?
>
>
> Andrei
