March 03, 2009
Andrei Alexandrescu wrote:
> I agree. I'm having the same problem: I put a contract in there, I know it's as good as assert. So I can't do e.g. input validation because in most functions input must always be validated. I also know that contracts are doing the wrong thing with inheritance and can't apply to interfaces, which is exactly the (only?) place they'd be interesting. So I send the contracts home and use assert, enforce, and unittest.

Contracts are not for input validation! They are checking if the logic of your program is correct or not. Think of it this way - your program should behave exactly the same with or without the contracts turned on.

Contracts should NOT be used for scrubbing user input, checking for errors from other components, or validating any input from external to the dll.

If you feel the need to leave them on in a release build, then:
1) your testing is inadequate
2) you are using them incorrectly

For example, Windows API functions check all their input. This is not contract programming - it's validating user input over which Microsoft has no control.
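
To make the distinction concrete, here is a minimal sketch in D. The names `elementAt` and `parseAge` are hypothetical, and the contract uses the `do` keyword of recent compilers (older ones spell it `body`). The first function treats a bad argument as a program bug and checks it in an `in` contract; the second handles text from outside the program and validates it unconditionally.

```d
import std.conv : to, ConvException;
import std.exception : enforce;

// Internal helper: an out-of-range index can only come from a bug in the
// calling code, so the check is an `in` contract (removed by -release).
int elementAt(const int[] data, size_t index)
in
{
    assert(index < data.length, "caller passed an out-of-range index");
}
do
{
    return data[index];
}

// Boundary function: the text comes from outside the program, so it is
// validated in every build and failure is reported as an exception.
int parseAge(string text)
{
    int age;
    try
    {
        age = text.to!int;
    }
    catch (ConvException)
    {
        throw new Exception("age must be an integer, got: " ~ text);
    }
    enforce(age >= 0 && age <= 150, "age out of range");
    return age;
}

void main()
{
    assert(elementAt([10, 20, 30], 1) == 20);
    assert(parseAge("42") == 42);
}
```

With the contract compiled out, `elementAt` behaves the same for every correct caller, which is the property described above; `parseAge` keeps its checks regardless of build flags.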
March 03, 2009
On Tue, 03 Mar 2009 11:00:36 -0800, Walter Bright <newshound1@digitalmars.com> wrote:

>Andrei Alexandrescu wrote:
>> I agree. I'm having the same problem: I put a contract in there, I know it's as good as assert. So I can't do e.g. input validation because in most functions input must always be validated. I also know that contracts are doing the wrong thing with inheritance and can't apply to interfaces, which is exactly the (only?) place they'd be interesting. So I send the contracts home and use assert, enforce, and unittest.
>
>Contracts are not for input validation! They are checking if the logic of your program is correct or not. Think of it this way - your program should behave exactly the same with or without the contracts turned on.
>
>Contracts should NOT be used for scrubbing user input, checking for errors from other components, or validating any input from external to the dll.
>
>If you feel the need to leave them on in a release build, then:
>1) your testing is inadequate
>2) you are using them incorrectly
>
>For example, Windows API functions check all their input. This is not contract programming - it's validating user input over which Microsoft has no control.

This is exactly how I look at them. However I've never tried to use pre/post conditions. I guess it's because of the syntax.

By the way, about that image on the contracts page. Is the bullet flying away from the D-man because it's disgusted by his extreme ugliness? :)
March 03, 2009
== Quote from Walter Bright (newshound1@digitalmars.com)'s article
> Andrei Alexandrescu wrote:
> > I agree. I'm having the same problem: I put a contract in there, I know it's as good as assert. So I can't do e.g. input validation because in most functions input must always be validated. I also know that contracts are doing the wrong thing with inheritance and can't apply to interfaces, which is exactly the (only?) place they'd be interesting. So I send the contracts home and use assert, enforce, and unittest.
> Contracts are not for input validation! They are checking if the logic of your program is correct or not. Think of it this way - your program should behave exactly the same with or without the contracts turned on. Contracts should NOT be used for scrubbing user input, checking for errors from other components, or validating any input from external to the dll.

Why should contracts be limited to parameter checking of internally used functions only?  If I write a function and document parameter constraints then I certainly expect those constraints to be followed regardless of whether I'm calling the function or someone else is calling the function. Checking these via a contract simply provides an optional means of ensuring that a logic error didn't occur within the program as a whole.

If you're talking about application input however, then I agree completely.
i.e., stuff typed in by the user, read from a file, etc., should never be validated
within a contract because an input failure at that level doesn't represent
a program logic error but rather user error.  An assertion failure isn't
a terribly good way of notifying the user that they shouldn't have put an
alphabetic character in a box intended to receive an integer :-)


Sean
March 03, 2009
On Tue, 03 Mar 2009 11:00:36 -0800, Walter Bright wrote:

> Contracts are not for input validation!

Hear! Hear! This is exactly correct.


-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
March 03, 2009
Walter Bright wrote:
> Georg Wrede wrote:
>> We've had Walter make nice features to D that were laborious to create, only to see nobody use them. It's happened, ask him.
> 
> Sure. Often the only way to see if a feature is useful is to actually implement it and see what happens. Some features have succeeded and found uses far beyond my expectations (CTFE, string mixins) while others have pretty much languished (design by contract, complex numbers).

I fucking love contracts. I need to use them more, but I do use them.
March 04, 2009
Sean Kelly wrote:
> == Quote from Walter Bright (newshound1@digitalmars.com)'s article
>> Andrei Alexandrescu wrote:
>>> I agree. I'm having the same problem: I put a contract in there, I know
>>> it's as good as assert. So I can't do e.g. input validation because in
>>> most functions input must always be validated. I also know that
>>> contracts are doing the wrong thing with inheritance and can't apply to
>>> interfaces, which is exactly the (only?) place they'd be interesting. So
>>> I send the contracts home and use assert, enforce, and unittest.
>> Contracts are not for input validation! They are checking if the logic
>> of your program is correct or not. Think of it this way - your program
>> should behave exactly the same with or without the contracts turned on.
>> Contracts should NOT be used for scrubbing user input, checking for
>> errors from other components, or validating any input from external to
>> the dll.
> 
> Why should contracts be limited to parameter checking of internally used
> functions only?  If I write a function and document parameter constraints
> then I certainly expect those constraints to be followed regardless of
> whether I'm calling the function or someone else is calling the function.
> Checking these via a contract simply provides an optional means of
> ensuring that a logic error didn't occur within the program as a whole.

The distinction is not whether you or others write stuff. It's about whether it is for debugging *only*, as opposed to general input validation.

It's similar to asserts: it's not prudent to put an assert anywhere except where only the source code (that is, a bug or goof by the programmer) can cause it to fire.

> If you're talking about application input however, then I agree completely.
> ie. stuff typed in by the user, read from a file, etc, should never be validated
> within a contract because an input failure at that level doesn't represent
> a program logic error but rather user error.  An assertion failure isn't
> a terribly good way of notifying the user that they shouldn't have put an
> alphabetic character in a box intended to receive an integer :-)
> 
> 
> Sean
March 04, 2009
Georg Wrede wrote:
> Sean Kelly wrote:
>> == Quote from Walter Bright (newshound1@digitalmars.com)'s article
>>> Andrei Alexandrescu wrote:
>>>> I agree. I'm having the same problem: I put a contract in there, I know
>>>> it's as good as assert. So I can't do e.g. input validation because in
>>>> most functions input must always be validated. I also know that
>>>> contracts are doing the wrong thing with inheritance and can't apply to
>>>> interfaces, which is exactly the (only?) place they'd be interesting. So
>>>> I send the contracts home and use assert, enforce, and unittest.
>>> Contracts are not for input validation! They are checking if the logic
>>> of your program is correct or not. Think of it this way - your program
>>> should behave exactly the same with or without the contracts turned on.
>>> Contracts should NOT be used for scrubbing user input, checking for
>>> errors from other components, or validating any input from external to
>>> the dll.
>>
>> Why should contracts be limited to parameter checking of internally used
>> functions only?  If I write a function and document parameter constraints
>> then I certainly expect those constraints to be followed regardless of
>> whether I'm calling the function or someone else is calling the function.
>> Checking these via a contract simply provides an optional means of
>> ensuring that a logic error didn't occur within the program as a whole.
> 
> The distinction is not whether you or others write stuff. It's about whether it is for debugging *only*, as opposed to general input validation.

So I guess the real question is whether a function is expected to validate its parameters.  I'd argue that it isn't, but then I'm from a C/C++ background.  For me, validation is a debugging tool, or at least an optional feature for applications that want the added insurance.


Sean
March 04, 2009
Sean Kelly wrote:
> Georg Wrede wrote:
>> The distinction is not whether you or others write stuff. It's about whether it is for debugging *only*, as opposed to general input validation.
> 
> So I guess the real question is whether a function is expected to validate its parameters.  I'd argue that it isn't, but then I'm from a C/C++ background.  For me, validation is a debugging tool, or at least an optional feature for applications that want the added insurance.

Interesting. My policy is to favor validation whenever it doesn't impact performance. Imagine for example that strlen() validated its input for non-null. Would that show on the profiling chart of any C application? No, unless the application's core loop only called strlen() on a 1-character string or so.
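
As a rough illustration of such a cheap check, here is a sketch of a wrapper around C's `strlen`. The name `checkedStrlen` is made up for this example; the only added cost is a single pointer comparison per call.

```d
import core.stdc.string : strlen;
import std.exception : enforce;

// Hypothetical wrapper: one O(1) null test in front of an O(n) scan.
size_t checkedStrlen(const(char)* s)
{
    enforce(s !is null, "null pointer passed to checkedStrlen");
    return strlen(s);
}

void main()
{
    // D string literals are zero-terminated and convert to const(char)*.
    assert(checkedStrlen("hello") == 5);
}
```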

One simple case that clarifies the necessary tradeoff is binary search. That assumes the range to be searched is sorted. If you actually checked for that, it would render binary search useless as a linear search would be in fact faster. So you need to assume. One way to do so is in the documentation. You write in the docs that findSorted expects a sorted range. Another way is to encode this information in the type of the sorted range. But that's onerous as most of the time you have an array you just sorted, not a SortedArray value.

The approach I took with the new phobos is:

int[] haystack;
int[] needle;
...
auto pos1 = find(haystack, needle); // linear
sort(haystack);
auto pos2 = find(assumeSorted(haystack), needle);

The assumeSorted function wraps the haystack in an AssumeSorted!(int[]) type without adding members or running extra code. It's there to clarify to everyone what's going on. And it's usable with other arguments or functions too, e.g.

auto pos3 = find(haystack, assumeSorted(needle));
setIntersection(assumeSorted(haystack), assumeSorted(needle));
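
For readers who want to see the idea in isolation, here is a minimal sketch of such a marker type. It is not the Phobos implementation: `Sorted`, `assumeSortedSketch`, and `locate` are hypothetical names chosen for this example. The wrapper holds nothing but the slice, and overload resolution picks binary search when the argument carries the marker.

```d
// Hypothetical names throughout; not the Phobos implementation.
struct Sorted(T)
{
    T[] data; // the wrapped slice and nothing else
}

// The wrapper has the same size as a plain slice.
static assert(Sorted!int.sizeof == (int[]).sizeof);

// Wraps without copying or checking; the caller vouches for the order.
Sorted!T assumeSortedSketch(T)(T[] data)
{
    return Sorted!T(data);
}

// Linear search for an unmarked slice. Returns the index, or -1.
ptrdiff_t locate(T)(T[] haystack, T needle)
{
    foreach (i, x; haystack)
        if (x == needle)
            return cast(ptrdiff_t) i;
    return -1;
}

// Binary search, chosen by overload when the slice is marked as sorted.
ptrdiff_t locate(T)(Sorted!T haystack, T needle)
{
    size_t lo = 0, hi = haystack.data.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (haystack.data[mid] < needle)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < haystack.data.length && haystack.data[lo] == needle)
        ? cast(ptrdiff_t) lo : -1;
}

void main()
{
    int[] haystack = [1, 3, 5, 7, 9];
    assert(locate(haystack, 7) == 3);                     // linear overload
    assert(locate(assumeSortedSketch(haystack), 7) == 3); // binary overload
    assert(locate(assumeSortedSketch(haystack), 4) == -1);
}
```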

Interestingly, assumeSorted can actually do checking without impacting the complexity of the search. In debug mode, it can arrange to run a random isSorted test on roughly one call in N, where N is the average length of the incoming arrays, so that its complexity impact is amortized constant.
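
A sketch of how such a sampled check might look, under some simplifying assumptions: the helper `sampledAssumeSorted` is hypothetical, it returns the slice unchanged rather than a wrapper type, and it uses the length of the current slice in place of the average length N. A full isSorted scan costs O(n) and runs with probability 1/n, so the expected extra cost per call is constant.

```d
import std.algorithm.sorting : isSorted;
import std.random : uniform;

// Hypothetical helper, not a Phobos API.
T[] sampledAssumeSorted(T)(T[] data)
{
    debug
    {
        // Compiled in only with -debug. Runs the O(n) scan with
        // probability 1/n for a slice of length n.
        if (data.length > 1 && uniform(size_t(0), data.length) == 0)
            assert(isSorted(data), "range passed as sorted is not sorted");
    }
    return data;
}

void main()
{
    auto data = [1, 2, 3, 5, 8];
    // A sorted slice passes whether or not the sampled check fires.
    assert(sampledAssumeSorted(data) == data);
}
```

An unsorted slice is caught only on the calls where the scan happens to run, so this trades certainty for speed.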


Andrei
March 04, 2009
On Wed, 04 Mar 2009 08:47:50 -0800, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

>Sean Kelly wrote:
>> Georg Wrede wrote:
>>> The distinction is not whether you or others write stuff. It's about whether it is for debugging *only*, as opposed to general input validation.
>> 
>> So I guess the real question is whether a function is expected to validate its parameters.  I'd argue that it isn't, but then I'm from a C/C++ background.  For me, validation is a debugging tool, or at least an optional feature for applications that want the added insurance.
>
>Interesting. My policy is to favor validation whenever it doesn't impact performance. Imagine for example that strlen() validated its input for non-null. Would that show on the profiling chart of any C application? No, unless the application's core loop only called strlen() on a 1-character string or so.
>
>One simple case that clarifies the necessary tradeoff is binary search. That assumes the range to be searched is sorted. If you actually checked for that, it would render binary search useless as a linear search would be in fact faster. So you need to assume. One way to do so is in the documentation. You write in the docs that findSorted expects a sorted range. Another way is to encode this information in the type of the sorted range. But that's onerous as most of the time you have an array you just sorted, not a SortedArray value.
>
>The approach I took with the new phobos is:
>
>int[] haystack;
>int[] needle;
>...
>auto pos1 = find(haystack, needle); // linear
>sort(haystack);
>auto pos2 = find(assumeSorted(haystack), needle);
>
>The assumeSorted function wraps the haystack in an AssumeSorted!(int[]) type without adding members or running extra code. It's there to clarify to everyone what's going on. And it's usable with other arguments or functions too, e.g.
>
>auto pos3 = find(haystack, assumeSorted(needle));
>setIntersection(assumeSorted(haystack), assumeSorted(needle));
>
>Interestingly, assumeSorted can actually do checking without impacting the complexity of the search. In debug mode, it can arrange to run random isSorted tests every 1/N calls, where N is the average length of the incoming arrays, then its complexity impact is amortized constant.
>
>
>Andrei

If you introduce a dummy type, why not make it perform validation in a debug build when something like debug=slowButSafe is set?
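
Something along these lines, as a rough sketch. `CheckedSorted` is a made-up type for illustration, and the check is compiled in only when the program is built with `-debug=slowButSafe`.

```d
import std.algorithm.sorting : isSorted;

// Hypothetical marker type; not part of Phobos.
struct CheckedSorted(T)
{
    T[] data;

    this(T[] data)
    {
        // Present only when compiled with -debug=slowButSafe.
        debug (slowButSafe)
            assert(isSorted(data), "input is not sorted");
        this.data = data;
    }
}

void main()
{
    auto s = CheckedSorted!int([1, 2, 3]);
    assert(s.data == [1, 2, 3]);
}
```

Without the flag, the constructor reduces to a plain assignment.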
March 04, 2009
Andrei Alexandrescu wrote:
> Sean Kelly wrote:
>> Georg Wrede wrote:
>>> The distinction is not whether you or others write stuff. It's about whether it is for debugging *only*, as opposed to general input validation.
>>
>> So I guess the real question is whether a function is expected to validate its parameters.  I'd argue that it isn't, but then I'm from a C/C++ background.  For me, validation is a debugging tool, or at least an optional feature for applications that want the added insurance.
> 
> Interesting. My policy is to favor validation whenever it doesn't impact performance. Imagine for example that strlen() validated its input for non-null. Would that show on the profiling chart of any C application? No, unless the application's core loop only called strlen() on a 1-character string or so.

Interesting.  So the inexpensive checks would go in the function body itself, with the exhaustive extra stuff in contracts.  That does seem reasonable, though I still like the visual separation that the 'in' clause provides, and I'd love to be able to use the proposed inheritance feature of contracts, which seems like it might necessitate duplicating these inexpensive checks in the contract and in the function body itself.
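
A small sketch of that split, using a hypothetical `medianOfSorted` function and the `do` keyword of recent compilers (older ones use `body`): the O(1) check lives in the body and survives every build, while the O(n) check sits in the `in` clause and goes away with -release.

```d
import std.algorithm.sorting : isSorted;
import std.exception : enforce;

// Returns the median of a slice that the caller promises is sorted.
double medianOfSorted(const double[] values)
in
{
    // Exhaustive O(n) check: contract only, dropped with -release.
    assert(isSorted(values), "values must be sorted");
}
do
{
    // Cheap O(1) check: stays in every build.
    enforce(values.length > 0, "values must not be empty");
    immutable mid = values.length / 2;
    return values.length % 2 ? values[mid]
                             : (values[mid - 1] + values[mid]) / 2;
}

void main()
{
    assert(medianOfSorted([1.0, 2.0, 4.0]) == 2.0);
    assert(medianOfSorted([1.0, 3.0]) == 2.0);
}
```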

> One simple case that clarifies the necessary tradeoff is binary search. That assumes the range to be searched is sorted. If you actually checked for that, it would render binary search useless as a linear search would be in fact faster. So you need to assume. One way to do so is in the documentation. You write in the docs that findSorted expects a sorted range. Another way is to encode this information in the type of the sorted range. But that's onerous as most of the time you have an array you just sorted, not a SortedArray value.
> 
> The approach I took with the new phobos is:
> 
> int[] haystack;
> int[] needle;
> ...
> auto pos1 = find(haystack, needle); // linear
> sort(haystack);
> auto pos2 = find(assumeSorted(haystack), needle);
> 
> The assumeSorted function wraps the haystack in an AssumeSorted!(int[]) type without adding members or running extra code. It's there to clarify to everyone what's going on. And it's usable with other arguments or functions too, e.g.
> 
> auto pos3 = find(haystack, assumeSorted(needle));
> setIntersection(assumeSorted(haystack), assumeSorted(needle));
> 
> Interestingly, assumeSorted can actually do checking without impacting the complexity of the search. In debug mode, it can arrange to run random isSorted tests every 1/N calls, where N is the average length of the incoming arrays, then its complexity impact is amortized constant.

One thing I've always really liked about pointer arguments is that they tend to document what's happening at the call site as well (because of the address-of operator typically needed to obtain the address of a variable).  I tend to avoid boolean parameters for similar reasons, unless the meaning can be communicated clearly at the call site.  It seems like this serves a similar purpose, and I like it despite the potential for a user accidentally calling the slow overload when he could actually use the fast one--better it be correct than fast, after all.

I'm not terribly fond of the added verbosity however, or that it seems I couldn't use the property form:

    assumeSorted("abcd").find('c')

Truth be told, my initial inclination would be to repackage the binary search as a one-liner with a different name, which kind of sabotages the whole idea.  But I'll try to resist this urge.
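
For what it's worth, such a one-liner might look like the sketch below. It assumes a current Phobos in which `std.range.assumeSorted` returns a `SortedRange` with a `contains` method; the wrapper name `containsSorted` is hypothetical.

```d
import std.range : assumeSorted;

// Hypothetical name; a thin wrapper over SortedRange.contains.
bool containsSorted(T)(T[] haystack, T needle)
{
    return haystack.assumeSorted.contains(needle);
}

void main()
{
    assert(containsSorted([1, 3, 5, 7], 5));
    assert(!containsSorted([1, 3, 5, 7], 4));
}
```

The body is written in property form, so the assumption stays visible where the wrapper is defined even though it disappears from the call site.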