July 17, 2010

Michel Fortin, el 16 de julio a las 22:37 me escribiste:
> 
> Le 2010-07-16 ? 21:03, Jonathan M Davis a ?crit :
> 
> >> My thought is, that we might be running into issues because the idea of a unit test is being conflated with a function. What we really want is a collection of blocks of code that that are all run and if any of them trigger an assert, that block stops and the tests as a whole are considered to have failed.
> >> 
> >> As a point to start from, how about allow unittest as a statement type?
> >> 
> >> foreach(item; collection) unittest { asssert(foobar(item)); }
> > 
> > Regardless of whether you consider the assertion or the unittest block as the unit test (and I would certainly choose the latter), I don't see any point to this.
> 
> Walter had a point... what if you want to test multiple outputs of the same function on one go? Here's an example:
> 
> 	unittest {
> 		int[3] result = makeSomeTable(399, 383, 927);
> 		assert(result[0] == 281);
> 		assert(result[1] == 281);
> 		assert(result[2] == 281);
> 	}
> 
> Here, each assert is independent one from another and knowing that the first and the third fails but not the second might help diagnose the problem.
> 
> I think Jonathan's idea is quite on the spot. It'd allow you to write this:
> 
> 	unittest {
> 		int[3] result = makeSomeTable(399, 383, 927);
> 		unittest { assert(result[0] == 281); }
> 		unittest { assert(result[1] == 281); }
> 		unittest { assert(result[2] == 281); }
> 	}
> 
> and each "unittest assert" becomes an independent test that may fail but without interrupting the flow of the outside scope. The compiler could expand it like this:
> 
> 	unittest {
> 		int[3] result = makeSomeTable(399, 383, 927);
> 
> 		try
> 			assert(result[0] == 281);
> 		catch (AssertionError e)
> 			__unitTestLogFailure(e);
> 
> 		try
> 			assert(result[1] == 281);
> 		catch (AssertionError e)
> 			__unitTestLogFailure(e);
> 
> 		try
> 			assert(result[2] == 281);
> 		catch (AssertionError e)
> 			__unitTestLogFailure(e);
> 	}
> 
> And now you have a log entry for each failed assertion, because they're independent unit tests.
> 
> It'd also help reduce verbosity to not require the braces for one-statement unit tests.

Use a library function!

	unittest {
		int[3] result = makeSomeTable(399, 383, 927);
		check(result[0] == 281);
		check(result[1] == 281);
		check(result[2] == 281);
	}

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
<Damian_Des> Me anDa MaL eL CaPSLoCK

July 16, 2010
On 7/16/2010 10:08 PM, Walter Bright wrote:
> 
> Yes, for that sequence, the rest are irrelevant. But what I don't understand is why is it so bad that they also generate messages, which you just ignore looking over the log file? Me, I'd rather see a few extra cascaded messages, fix the ones that matter and ignore the extra, than be short messages and have to rebuild/rerun the tests just to get dinged again.

As you're fond of pointing out, continuing to run past the point of broken program state is a bad idea no matter how you decorate it.

Running beyond an assert could result in all sorts of really nasty behavior. Consider a test that creates files and wants to clean them up, but part of the test that decides what file to work with fails to do the right thing.  The user properly asserts on that, but then it goes and removes the 'wrong' file?

Sorry, no.  I'm firmly in the 'assert stop the test' camp.

Consider another case.  The user writes a helper function that lives outside a
unittest block and works exactly how the user expects.  It has an assert in it.
 A later refactoring moves the code directly into the unittest block rather than
the called function.. the code changes behavior?  wtf?

I can't believe this debate has lasted this long.

Sigh,
Brad
July 17, 2010
On 07/17/2010 12:07 AM, Walter Bright wrote:
>> But of course, that's probably more work to do in the compiler.
>>
>
> Yes, but it's not hard. For me, the issue is making things for the user more complicated. D is already a very large language, and adding more and more stuff like this makes it larger, buggier, and less approachable. For example, nothing stands out to say what "unittest assert" does vs "assert". I'd have to continuously recheck the manual. One thing I like about the current unittest setup is how utterly trivial it is to use effectively.

The way I understand what you say means that assert should have uniform semantics, whether used inside a unittest or not. Selective behavior would be one of those RTFM things.

> Is it really a good idea to have so much customizable behavior wedded into the grammar, whereas it can be achieved using existing D constructs (albeit with a little more typing)? I think the default behavior should be the simplest, "git 'er done" that covers 90% of the usage needs. The other 10% can be done with conventional D constructs.

I agree, and to me the natural consequent is - assert throws, period.


Andrei
July 17, 2010
On 07/17/2010 12:08 AM, Walter Bright wrote:
> Yes, for that sequence, the rest are irrelevant. But what I don't understand is why is it so bad that they also generate messages, which you just ignore looking over the log file? Me, I'd rather see a few extra cascaded messages, fix the ones that matter and ignore the extra, than be short messages and have to rebuild/rerun the tests just to get dinged again.

If you were 100% committed to that position, your compiler would continue parsing after an error. It doesn't.

Andrei
July 17, 2010
Le 2010-07-17 ? 1:07, Walter Bright a ?crit :

> Yes, but it's not hard. For me, the issue is making things for the user more complicated. D is already a very large language, and adding more and more stuff like this makes it larger, buggier, and less approachable. For example, nothing stands out to say what "unittest assert" does vs "assert".

I just want to point out that "unittest assert(x);" would just be a shortcut for "unittest { assert(x); }", which is quite similar to how you can omit the braces for conditional and loop statements in regular code. This is only a small shift from the earlier concept of unittest in D: instead of having unittest blocks scattered in your module, you have unittest statements (which can be blocks) scattered around. It's not much of a change, it's just being more permissive.

	module abc;

	unittest assert(sqrt(4) == 2); // single-statement unit test
	unittest assert(sqrt(16) == 4);

As for nesting unittests, I agree that it might get harder to conceptualize how it works. But still, there isn't anything really 'special' about it, they have the same semantics as the module-level ones. It's better explained with a concrete example. Say a module includes those unit tests:

	module abc;

	unittest assert(sqrt(4) == 2);
	unittest assert(sqrt(16) == 4);

	unittest {
		int[] result = makeSomeTable(399, 383, 927);
		assert(result.length == 3); // guard against invalid state

		unittest assert(result[0] == 281);
		unittest assert(result[1] == 284);
		unittest assert(result[2] == 283);

		unittest {
			int[] derivedResult = derive(result);
			assert(derivedResult.length == 2); // guard against invalid state

			unittest assert(derivedResult[0] ==  3);
			unittest assert(derivedResult[1] == -1);
		}
		// ... other tests reusing 'result'
	}

How do you translate this to a module-level unittest function? First define this function in the runtime:

	void __doUnitTest(void delegate() test) {
		try { test(); }
		catch (AssertionError e) { __unitTestLogFailure(e); }
	}

Then replace each unittest statement or block with a call to the above function, passing the unittest code as the argument. The three module-level unittests I wrote above would translate to this module unittest function:

	void __module_abc_unittests() {
		__doUnitTest({  assert(sqrt(4) == 2); });
		__doUnitTest({  assert(sqrt(16) == 4); });

		__doUnitTest({
			int[] result = makeSomeTable(399, 383, 927);
			assert(result.length == 3); // guard against invalid state

			__doUnitTest({  assert(result[0] == 281);  });
			__doUnitTest({  assert(result[1] == 284);  });
			__doUnitTest({  assert(result[2] == 283);  });

			__doUnitTest({
				int[] derivedResult = derive(result);
				assert(derivedResult.length == 2); // guard against invalid state

				__doUnitTest({  assert(derivedResult[0] ==  3);  });
				__doUnitTest({  assert(derivedResult[1] == -1);  });
			});
			// ... other tests reusing 'result'
		});
	}

As other mentioned, what I do here with nested unit tests could easily be implemented by offering the user a new library function too. I just happen to think what I'm proposing here is more elegant.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



July 17, 2010
Le 2010-07-17 ? 1:08, Walter Bright a ?crit :

> Yes, for that sequence, the rest are irrelevant. But what I don't understand is why is it so bad that they also generate messages, which you just ignore looking over the log file?

I don't know about you, but in my regular code I often write my assertions like this:

	int* ptr = getPointer();
	assert(ptr !is null);
	doSomething(ptr);

The assertion detects that something is wrong and prevents the code from doing dangerous things when the state is already incoherent. In fact, there isn't much other use to an assert in regular code than to detect invalid states and abort.

But then you make it do something different in a unittest:

	unittest {
		int* ptr = getPointer();
		assert(ptr !is null);
		doSomething(ptr);
	}

Here, moving forward passed the assert creates a seg fault and thus the following unit tests won't be run. The results could be more or less severe depending on the exact nature of what the assert guards against. Now you've already suggested this solution:

	unittest {
		int* ptr = getPointer();
		if (ptr !is null)
			doSomething(ptr);
		else
			assert(0);
	}

... which works (both in asserts an in regular code). But this means that inside unit tests you have to use a different (and more verbose) solution to guard against invalid state than in your regular code. It's true that you can train someone to do something in some situation and something else in another, but it obviously is more complicated and error prone than if you always do the same thing to get the same result.

Another thing. You might argue that most of the time it poses no problem to continue after an assert in a unit test; and I'll agree with you. However, it's not always the case, and it's easy to do a mistake for the dangerous cases because people are trained to use assert in regular code as a protection. Now, if you had the reverse -- throwing by default -- you will be overprotective in many cases (those that won't crash when not throwing), but it could be made easy to opt-in for the non-throwing behaviour whenever its useful.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



July 17, 2010
On Saturday 17 July 2010 05:17:09 Michel Fortin wrote:
> Le 2010-07-17 ? 1:07, Walter Bright a ?crit :
> > Yes, but it's not hard. For me, the issue is making things for the user more complicated. D is already a very large language, and adding more and more stuff like this makes it larger, buggier, and less approachable. For example, nothing stands out to say what "unittest assert" does vs "assert".
> 
> I just want to point out that "unittest assert(x);" would just be a shortcut for "unittest { assert(x); }", which is quite similar to how you can omit the braces for conditional and loop statements in regular code. This is only a small shift from the earlier concept of unittest in D: instead of having unittest blocks scattered in your module, you have unittest statements (which can be blocks) scattered around. It's not much of a change, it's just being more permissive.
> 
> 	module abc;
> 
> 	unittest assert(sqrt(4) == 2); // single-statement unit test
> 	unittest assert(sqrt(16) == 4);
> 
> As for nesting unittests, I agree that it might get harder to conceptualize how it works. But still, there isn't anything really 'special' about it, they have the same semantics as the module-level ones. It's better explained with a concrete example. Say a module includes those unit tests:
> 
> 	module abc;
> 
> 	unittest assert(sqrt(4) == 2);
> 	unittest assert(sqrt(16) == 4);
> 
> 	unittest {
> 		int[] result = makeSomeTable(399, 383, 927);
> 		assert(result.length == 3); // guard against invalid state
> 
> 		unittest assert(result[0] == 281);
> 		unittest assert(result[1] == 284);
> 		unittest assert(result[2] == 283);
> 
> 		unittest {
> 			int[] derivedResult = derive(result);
> 			assert(derivedResult.length == 2); // guard against invalid state
> 
> 			unittest assert(derivedResult[0] ==  3);
> 			unittest assert(derivedResult[1] == -1);
> 		}
> 		// ... other tests reusing 'result'
> 	}
> 
> How do you translate this to a module-level unittest function? First define this function in the runtime:
> 
> 	void __doUnitTest(void delegate() test) {
> 		try { test(); }
> 		catch (AssertionError e) { __unitTestLogFailure(e); }
> 	}
> 
> Then replace each unittest statement or block with a call to the above function, passing the unittest code as the argument. The three module-level unittests I wrote above would translate to this module unittest function:
> 
> 	void __module_abc_unittests() {
> 		__doUnitTest({  assert(sqrt(4) == 2); });
> 		__doUnitTest({  assert(sqrt(16) == 4); });
> 
> 		__doUnitTest({
> 			int[] result = makeSomeTable(399, 383, 927);
> 			assert(result.length == 3); // guard against invalid state
> 
> 			__doUnitTest({  assert(result[0] == 281);  });
> 			__doUnitTest({  assert(result[1] == 284);  });
> 			__doUnitTest({  assert(result[2] == 283);  });
> 
> 			__doUnitTest({
> 				int[] derivedResult = derive(result);
> 				assert(derivedResult.length == 2); // guard against invalid
state
> 
> 				__doUnitTest({  assert(derivedResult[0] ==  3);  });
> 				__doUnitTest({  assert(derivedResult[1] == -1);  });
> 			});
> 			// ... other tests reusing 'result'
> 		});
> 	}
> 
> As other mentioned, what I do here with nested unit tests could easily be implemented by offering the user a new library function too. I just happen to think what I'm proposing here is more elegant.

I'd have to argue with you on that one. Personally, I find it to be far uglier than simply replacing assert with expect in cases where you don't want the test to stop on failure. A unittest block without braces is a bit odd, but I could see that working and not being a big deal. But a unit test block inside another? Sure, you could do it, but I think that it would be confusing to new users. It's also more verbose than simply changing assert to expect, and I think that it's a lot uglier. Obviously, this is at least somewhat subjective, however, since you clearly think that nesting unittest is more elegant.

Also, if you went down the road of using expect (verify or check or whatever) instead of assert for tests where you didn't want the test to stop, then that opens the door to the possibility of adding other such functions later if it's deemed appropriate. You could probably even have a library module dedicated to additional features for unittests. Many unit test frameworks have functions like assertEquals() and assertNull() to allow for better debugging. Now, I'm not suggesting that we do that right now, but if we go the route of adding a new function to be used instead of assert for testing without stopping the test, adding such a library would then be reasonable extension of that which could be done at a later date. Using nested unittests does not make things more flexible in that manner. It just contorts the current construct so that it does more.

Not to mention, I am a proponent of the idea of named unit tests (e.g. unittest(test_name) {}). And they don't really make sense when nested. Sure, you could have named external ones and non-named nested ones, but then you'd have to worry about whether a test was nested or not when allowing a name on it.

Really, it seems to me that the most straightforward and extensible approach is to make assertions function normally (throwing from the unittest block on failure and ceasing its execution), make it so that each unittest block is run individually regardless of whether any others have succeeded or failed, and add a second function which tests its assertion but does not throw on failure (though it would still set the appropriate flag so that the overall unittest block would count as failed, and main() would not be run). It's straightfoward, extensible, easily understood, and I really do think that it's the way to go at this point.

- Jonathan M Davis
July 17, 2010
Walter Bright wrote:
>
>
> Michel Fortin wrote:
>>
>> And here's an illustration of a second test modelling a sequence of events. If one of these test fails, all others will fail too because they depend on each other:
>>
>>     unittest {
>>         S s = new S;
>>         s.giveMeAName();
>>         assert(s.fileName);
>>         s.value = 1;
>>         assert(s.value == 1);
>>         assert(s.needsToBeSaved);
>>         s.save();
>>         assert(!s.needsToBeSaved);
>>         s.value = 2;
>>         assert(s.value == 2);
>>         assert(s.needsToBeSaved);
>>         s.revert();
>>         assert(s.value == 1);
>>         assert(!s.needsToBeSaved);
>>         ...
>>     }
>>
>> So here you only need to know about the first error, and the rest is
>> garbage.
>> 
>
> Yes, for that sequence, the rest are irrelevant. But what I don't understand is why is it so bad that they also generate messages, which you just ignore looking over the log file? Me, I'd rather see a few extra cascaded messages, fix the ones that matter and ignore the extra,

The issue is that some people won't be getting "a few" extra messages to be ignored.They will get one message they want, followed by dozens that they need to ignore, followed by one they want, ...

If there is more than one message in a row that is to be ignored, it makes finding the real messages harder and makes it worse than the original design.

July 17, 2010
Le 2010-07-17 ? 10:10, Jonathan M Davis a ?crit :

> On Saturday 17 July 2010 05:17:09 Michel Fortin wrote:
>> As other mentioned, what I do here with nested unit tests could easily be implemented by offering the user a new library function too. I just happen to think what I'm proposing here is more elegant.
> 
> I'd have to argue with you on that one. Personally, I find it to be far uglier than simply replacing assert with expect in cases where you don't want the test to stop on failure. A unittest block without braces is a bit odd, but I could see that working and not being a big deal. But a unit test block inside another? Sure, you could do it, but I think that it would be confusing to new users. It's also more verbose than simply changing assert to expect, and I think that it's a lot uglier. Obviously, this is at least somewhat subjective, however, since you clearly think that nesting unittest is more elegant.

Well, I did believe it was more elegant a few hours ago while looking into it. Now I'm more ambivalent: both are fine with me if the library function has a good name which does not clash easily while being short enough. I do think 'check' or 'verify' are too clash-prone. Perhaps 'expect' would be a better choice.


> Not to mention, I am a proponent of the idea of named unit tests (e.g. unittest(test_name) {}). And they don't really make sense when nested. Sure, you could have named external ones and non-named nested ones, but then you'd have to worry about whether a test was nested or not when allowing a name on it.

Assuming we add one day the capability to give a name to a module-level unittest, why couldn't we do the same for a nested unittest?

Note that the implementation I'm proposing (the __doUnitTest function) would make it very easy to add extra features to unittest because the compiler could simply insert a call to __doUnitTest with more arguments (such as a name) and the runtime would do what it wants with it. This works both with nested and non-nested unit tests.


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



July 17, 2010
On 07/17/2010 11:24 AM, Benjamin Shropshire wrote:
> The issue is that some people won't be getting "a few" extra messages to be ignored.They will get one message they want, followed by dozens that they need to ignore, followed by one they want, ...

Ha, never thought of it that way. That's because all unittests are run now. So essentially we have useful failure messages interspersed with useless contingent messages, instead of one useful failure message followed by useless contingent messages.

Andrei