September 19, 2013
On Thursday, 19 September 2013 at 18:27:52 UTC, H. S. Teoh wrote:
> In my understanding, an out-contract is some condition C that the
> function F promises to the caller will hold after it has finished
> executing. Since F's internal state is uninteresting to the caller, C is
> expressed in terms of what is visible to the caller -- that is, the
> original function parameters and its return value(s) (which may include
> the final state of the parameters if changes to those parameters are
> visible to the caller). An out-contract shouldn't depend on the state of
> local variables in F, since those are not visible to the caller and are
> therefore uninteresting.  If you wish to check that F's internal state
> satisfies some condition X at the end of F, then it should be done as assert(X) before return, not as an out-contract.

OK, you could define an out-contract that way -- but what's wrong in principle with the contract also being able to promise, "My internal state was sane on exit"?

Nitpicking over what should and shouldn't be allowed to be in contracts seems like a good way to complicate implementing them for no obvious technical benefit.  Is there really any technical cost to allowing the out-contract to address internal function variables, if the programmer wants it?
September 19, 2013
On Thu, Sep 19, 2013 at 09:46:35PM +0200, Joseph Rushton Wakeling wrote:
> On Thursday, 19 September 2013 at 18:27:52 UTC, H. S. Teoh wrote:
> >In my understanding, an out-contract is some condition C that the function F promises to the caller will hold after it has finished executing. Since F's internal state is uninteresting to the caller, C is expressed in terms of what is visible to the caller -- that is, the original function parameters and its return value(s) (which may include the final state of the parameters if changes to those parameters are visible to the caller). An out-contract shouldn't depend on the state of local variables in F, since those are not visible to the caller and are therefore uninteresting.  If you wish to check that F's internal state satisfies some condition X at the end of F, then it should be done as assert(X) before return, not as an out-contract.
> 
> OK, you could define an out-contract that way -- but what's wrong in principle with the contract also being able to promise, "My internal state was sane on exit"?

Use an assert. That's what asserts are meant for.

And if you have multiple return statements and don't want to duplicate code, just write:

	scope(exit) assert(myStateIsSane);

at the top of the function.


> Nitpicking over what should and shouldn't be allowed to be in contracts seems like a good way to complicate implementing them for no obvious technical benefit.

I'm not saying that the compiler should purposely check and reject such uses. I'm saying that such uses are wrong usages of out-contracts.  It's not the compiler's job to reject a badly-designed class API, for example (it wouldn't know how to determine if something was badly-designed, in the first place), but that doesn't mean poor API design is a good idea.

What the OP was concerned about was how to write an out-contract that references the *original* values of the function arguments, which is definitely a legitimate case that should be supported. This legitimate use case should take precedence over the wrong usage of referencing local variables in an out-contract (if indeed it's not worth the trouble to make the compiler reject the latter case).

There is an obvious benefit of using out-contracts correctly: the function becomes self-documenting:

	real sqrt(real x)
	in { assert(x >= 0.0); }
	out(result) { assert(abs(result*result - x) < EPSILON); }
	body { ... }

Just by reading the contracts of sqrt, you know that (1) it expects its
argument to be non-negative, and (2) its return value is the square root
of x (i.e., the return value multiplied by itself is equal to x up to
the specified precision).

But if 'x' in the out-contract refers to a possibly modified value of x rather than the original argument, then the out-contract is basically meaningless to the caller of this function, since, not knowing how sqrt is implemented, the final value of x could be anything, and any condition satisfied by x is of no relevance outside of sqrt itself.  And since it has no relevance outside of sqrt, why bother with out-contracts at all? Just put the assert inside the function body, where it belongs.

The only case where the modified value of an argument is relevant is when the argument is passed by reference, and the caller can observe the effects of the function on it. For example:

	void sqrtInPlace(ref real x)
	in { assert(x >= 0.0); }
	out {
		// Note: .out and .in is just hypothetical syntax.
		assert(abs(x.out * x.out - x.in) < EPSILON);
	}
	body { ... }

The out-contract is very useful, because it tells the user exactly how the function will modify its argument, and what relation the final value has to the original value. Currently, however, we don't have .out syntax, so the above contract is inexpressible. Worse yet, in the current implementation, we may write:

	void sqrtInPlace(ref real x)
	in { assert(x >= 0.0); }
	out {
		// Does this x refer to the initial value of x, or the
		// final value?
		assert(x >= 0.0);

		// It sure looks like the initial value to me. Which
		// makes it totally pointless. But it actually refers to
		// the final value, which is non-obvious, and also of
		// limited benefit since we can't express a more
		// definite guarantee about its final value w.r.t. its
		// original value (i.e., that the final value squared
		// equals the original value up to a given precision).
	}
	body { ... }


> Is there really any technical cost to allowing the out-contract to address internal function variables, if the programmer wants it?

There is no cost -- it currently already lets you do that. :)

But it's also useless, because then there is no benefit to using an out-contract as opposed to a plain old assert before the return statement. In terms of the final executable code, there is basically no advantage that an out-contract offers above an in-body assert statement. The added value of an out-contract is the documented (and runtime-checked) guarantee it provides to the users of the API. Referencing implementation details that the users of the API don't care about, such as local variables in the out-contract, invalidates this benefit, which makes it useless.


T

-- 
Just because you can, doesn't mean you should.
September 19, 2013
On Thursday, 19 September 2013 at 17:09:49 UTC, Maxim Fomin wrote:
> Interfaces are irrelevant here.
> By the way, taking into account current implementation I think you will have to wait a lot of time to see the requested satisfied.

Yes, most probably. I'd much rather have the "in/out contracts should be conditionally included depending on caller compilation options, the lib's". EG: the reason druntime doesn't do array overlap checking, even when compiling in non-release.

> And even if it will be, nothing stops parameter to be modified by function body even it is not known at CT.

I didn't suggest banning modifying function arguments?

> This is deliberately created problem - you modify passed by value parameter which has little sense and then you make so that out contract should depend on original value of parameter. Now you create a problem and complain about it.

Just to be clear, I'm not complaining. I wanted to have a clearer answer about what the behavior *should* be. To be fair, I expected the answer would be "It should probably work that way, but it'll never happen, sorry".

> By the way, it seems that you view in/out/body as separate functions, probably like invariant, which is wrong

Why is it wrong? Is there a specification that says it *can't* be that way? It'd be (almost) the only way to be able to define contracts in interface files, with implementation inside other files.

> just take a look into assembly as I asked before.

My original question was "is the behavior according to spec". Looking at assembly defeats the question.

> If you want to modify parameter, just create local copy of it.

Yes.

>> An "out" contract should only verify that the value returned is correct "in regards" to the "input arguments".
>>
>> This means the implementation details of the function's body are leaking into the out block. That just isn't right.
> This is pure theory. Nothing promises you that discussed minor implementation detail of out contracts would be implemented in a way you expect.

Pure theory was the point of my question.

> There are several problem arising from your demand:
> 1) How much sense is in (providing support for) modifying passed by value parameter?
> 2) If parameter is passed by reference and supposed to be modified, this would engage into a conflict that original value should be preserved in.
> 3) Implementing your proposal would require - either creating three separate functions for in/out/body or putting pressure on to function frame stack + available registers for saving original values.

I'm perfectly fine if we close this as "won't fix", "for practical reasons, arguments passed by value *may* be modified by the implementation". As long as we document it right.

> Even this does not solve ref problem.

I'm not sure there is any problems with ref. Reference argument would reference the same variable regardless. I'm not asking for a time machine to give me the ref's old value.

> And on the advantage part of the proposal there is no reasoning except suspicious practice of modifying passed by value parameters and your words "I'd like the out contract to operate on its own copies of the function's arguments."
>
> Of course, you can issue enhancement request writing that you 1) do not want to wake a local copy of parameter and 2) do want to modify passed by value parameter and 3) do want to see in the out contract original value (thus requiring 4a) either to create three separate functions instead of one or 4b trashing stack/registers), but I expect it would be closed with boldly WONTFIX.

I think I'm seeing a trend that, regardless of merit or technical correctness, changing the behavior excessively costly anyways.

:/
September 20, 2013
On 19/09/13 22:43, H. S. Teoh wrote:
> I'm not saying that the compiler should purposely check and reject such
> uses. I'm saying that such uses are wrong usages of out-contracts.  It's
> not the compiler's job to reject a badly-designed class API, for example
> (it wouldn't know how to determine if something was badly-designed, in
> the first place), but that doesn't mean poor API design is a good idea.
>
> What the OP was concerned about was how to write an out-contract that
> references the *original* values of the function arguments, which is
> definitely a legitimate case that should be supported. This legitimate
> use case should take precedence over the wrong usage of referencing
> local variables in an out-contract (if indeed it's not worth the trouble
> to make the compiler reject the latter case).

I completely got monarchdodra's concerns, but I'm just not convinced that the right way to go about that is to redefine the contract implementation so that x refers to its value on entry.

An x.old or x.in syntax (however you want to write it) would address that need.  Enforcing "x in the out-contract means x at function entry" isn't possible without breaking the opportunity to use the out-contract to verify final internal state.

> There is an obvious benefit of using out-contracts correctly: the
> function becomes self-documenting:
>
> 	real sqrt(real x)
> 	in { assert(x >= 0.0); }
> 	out(result) { assert(abs(result*result - x) < EPSILON); }
> 	body { ... }
>
> Just by reading the contracts of sqrt, you know that (1) it expects its
> argument to be non-negative, and (2) its return value is the square root
> of x (i.e., the return value multiplied by itself is equal to x up to
> the specified precision).

Understood, but do you lose that much by insisting that the out contract uses x.in or (Eiffel-style) x.old to refer to the original value?

> But if 'x' in the out-contract refers to a possibly modified value of x
> rather than the original argument, then the out-contract is basically
> meaningless to the caller of this function, since, not knowing how sqrt
> is implemented, the final value of x could be anything, and any
> condition satisfied by x is of no relevance outside of sqrt itself.  And
> since it has no relevance outside of sqrt, why bother with out-contracts
> at all? Just put the assert inside the function body, where it belongs.
>
> The only case where the modified value of an argument is relevant is
> when the argument is passed by reference, and the caller can observe the
> effects of the function on it. For example:
>
> 	void sqrtInPlace(ref real x)
> 	in { assert(x >= 0.0); }
> 	out {
> 		// Note: .out and .in is just hypothetical syntax.
> 		assert(abs(x.out * x.out - x.in) < EPSILON);
> 	}
> 	body { ... }

I think this is a good reason for assuming that the out contract by default refers to final values, _unless_ you explicitly ask for x.in.

Why?  Because if you have an input passed by reference, and you mutate it, then the "expected" value of x when you call the out contract is definitely going to be its mutated value.  Same for an input passed by pointer, where the value it points to is changed.

So, if you accept that in _some cases_ the "expected" or natural behaviour is that x will refer to a mutated value, it's better for that expectation to be consistent across all types than to have to make special rules.

> The out-contract is very useful, because it tells the user exactly how
> the function will modify its argument, and what relation the final value
> has to the original value. Currently, however, we don't have .out
> syntax, so the above contract is inexpressible. Worse yet, in the current
> implementation, we may write:
>
> 	void sqrtInPlace(ref real x)
> 	in { assert(x >= 0.0); }
> 	out {
> 		// Does this x refer to the initial value of x, or the
> 		// final value?
> 		assert(x >= 0.0);
>
> 		// It sure looks like the initial value to me. Which
> 		// makes it totally pointless. But it actually refers to
> 		// the final value, which is non-obvious, and also of
> 		// limited benefit since we can't express a more
> 		// definite guarantee about its final value w.r.t. its
> 		// original value (i.e., that the final value squared
> 		// equals the original value up to a given precision).
> 	}
> 	body { ... }

I get the visual convenience of it, but the ambiguities around reference types, pointers, etc. suggest to me that it just can't be that simple.

>> Is there really any technical cost to allowing the out-contract to
>> address internal function variables, if the programmer wants it?
>
> There is no cost -- it currently already lets you do that. :)

... but you can't have "x in the out-contract means x on function entry" without breaking that possibility.

> But it's also useless, because then there is no benefit to using an
> out-contract as opposed to a plain old assert before the return
> statement. In terms of the final executable code, there is basically no
> advantage that an out-contract offers above an in-body assert statement.
> The added value of an out-contract is the documented (and
> runtime-checked) guarantee it provides to the users of the API.
> Referencing implementation details that the users of the API don't care
> about, such as local variables in the out-contract, invalidates this
> benefit, which makes it useless.

I get the conceptual and self-documenting elegance of it, I just think that it's a false elegance when you consider the special cases you'd have to work round.

    void foo(int *x)
      in { assert *x >= 0.0; }
     out { assert *x >= 0.0; }
    body { ... }

What's the "instinctive" understanding of what this means, in this case?  What's the sensible default behaviour?
September 20, 2013
On Fri, Sep 20, 2013 at 04:01:56PM +0200, Joseph Rushton Wakeling wrote:
> On 19/09/13 22:43, H. S. Teoh wrote:
> >I'm not saying that the compiler should purposely check and reject such uses. I'm saying that such uses are wrong usages of out-contracts.  It's not the compiler's job to reject a badly-designed class API, for example (it wouldn't know how to determine if something was badly-designed, in the first place), but that doesn't mean poor API design is a good idea.
> >
> >What the OP was concerned about was how to write an out-contract that references the *original* values of the function arguments, which is definitely a legitimate case that should be supported. This legitimate use case should take precedence over the wrong usage of referencing local variables in an out-contract (if indeed it's not worth the trouble to make the compiler reject the latter case).
> 
> I completely got monarchdodra's concerns, but I'm just not convinced that the right way to go about that is to redefine the contract implementation so that x refers to its value on entry.
> 
> An x.old or x.in syntax (however you want to write it) would address that need.  Enforcing "x in the out-contract means x at function entry" isn't possible without breaking the opportunity to use the out-contract to verify final internal state.

True. I agree that redefining plain "x" to mean "initial value of x" would be difficult to implement for ref arguments, and will also break existing code to boot. Having access to x.old is the best solution where possible (I'm thinking, for the sake of implementational simplicity, that we can restrict x.old to only work for parameters passed by value, since D doesn't have a generic way of cloning a reference argument).


> >There is an obvious benefit of using out-contracts correctly: the function becomes self-documenting:
> >
> >	real sqrt(real x)
> >	in { assert(x >= 0.0); }
> >	out(result) { assert(abs(result*result - x) < EPSILON); }
> >	body { ... }
> >
> >Just by reading the contracts of sqrt, you know that (1) it expects
> >its argument to be non-negative, and (2) its return value is the
> >square root of x (i.e., the return value multiplied by itself is
> >equal to x up to the specified precision).
> 
> Understood, but do you lose that much by insisting that the out contract uses x.in or (Eiffel-style) x.old to refer to the original value?

No, but the current notation is ambiguous and misleading. I agree that x.in and x.out (or some such notation) is better.


[...]
> >The only case where the modified value of an argument is relevant is when the argument is passed by reference, and the caller can observe the effects of the function on it. For example:
> >
> >	void sqrtInPlace(ref real x)
> >	in { assert(x >= 0.0); }
> >	out {
> >		// Note: .out and .in is just hypothetical syntax.
> >		assert(abs(x.out * x.out - x.in) < EPSILON);
> >	}
> >	body { ... }
> 
> I think this is a good reason for assuming that the out contract by default refers to final values, _unless_ you explicitly ask for x.in.

I would rather say that it's illegal to refer to plain 'x' in an out-contract because it's ambiguous. It should always refer to either x.in or x.out. (Of course, we can silently still support the current syntax in order to preserve backward compatibility, but I'd really prefer it to be deprecated.)


> Why?  Because if you have an input passed by reference, and you mutate it, then the "expected" value of x when you call the out contract is definitely going to be its mutated value.  Same for an input passed by pointer, where the value it points to is changed.
> 
> So, if you accept that in _some cases_ the "expected" or natural behaviour is that x will refer to a mutated value, it's better for that expectation to be consistent across all types than to have to make special rules.

It's better to be explicit about which version of the argument you're referring to. Make it mandatory to write x.in and x.out. Then the compiler can catch the cases where you wrote x.in but there is no obvious way to implement it for a ref argument. Better have a compile-time error than a runtime bug where you think x refers to x.in but it actually refers to x.out.


> >The out-contract is very useful, because it tells the user exactly how the function will modify its argument, and what relation the final value has to the original value. Currently, however, we don't have .out syntax, so the above contract is inexpressible. Worse yet, in the current implementation, we may write:
> >
> >	void sqrtInPlace(ref real x)
> >	in { assert(x >= 0.0); }
> >	out {
> >		// Does this x refer to the initial value of x, or the
> >		// final value?
> >		assert(x >= 0.0);
> >
> >		// It sure looks like the initial value to me. Which
> >		// makes it totally pointless. But it actually refers to
> >		// the final value, which is non-obvious, and also of
> >		// limited benefit since we can't express a more
> >		// definite guarantee about its final value w.r.t. its
> >		// original value (i.e., that the final value squared
> >		// equals the original value up to a given precision).
> >	}
> >	body { ... }
> 
> I get the visual convenience of it, but the ambiguities around reference types, pointers, etc. suggest to me that it just can't be that simple.

OK, so we should make x.in and x.out mandatory syntax, and deprecate straight references to unqualified x. That way, the above ambiguous code will be rejected, and you will be forced to write in unambiguously.


> >>Is there really any technical cost to allowing the out-contract to address internal function variables, if the programmer wants it?
> >
> >There is no cost -- it currently already lets you do that. :)
> 
> ... but you can't have "x in the out-contract means x on function entry" without breaking that possibility.

This issue is fixed if we make x.in and x.out mandatory.


> >But it's also useless, because then there is no benefit to using an out-contract as opposed to a plain old assert before the return statement. In terms of the final executable code, there is basically no advantage that an out-contract offers above an in-body assert statement.  The added value of an out-contract is the documented (and runtime-checked) guarantee it provides to the users of the API. Referencing implementation details that the users of the API don't care about, such as local variables in the out-contract, invalidates this benefit, which makes it useless.
> 
> I get the conceptual and self-documenting elegance of it, I just think that it's a false elegance when you consider the special cases you'd have to work round.
> 
>     void foo(int *x)
>       in { assert *x >= 0.0; }
>      out { assert *x >= 0.0; }
>     body { ... }
> 
> What's the "instinctive" understanding of what this means, in this case?  What's the sensible default behaviour?

I'd say this is a good argument for making x.in and x.out mandatory syntax. Then it becomes clear and readable:

	void foo(int *x)
	 in { assert(*x.in >= 0.0; }
	out { assert(*x.out >= 0.0; }
	body { ... }

And for moving non-contract asserts (i.e. those that check implementation sanity rather than make any guarantees of interest to the caller) into the function body rather than putting them in the out-contract.


T

-- 
Живёшь только однажды.
September 20, 2013
On Friday, 20 September 2013 at 16:38:36 UTC, H. S. Teoh wrote:
> True. I agree that redefining plain "x" to mean "initial value of x"
> would be difficult to implement for ref arguments

I think it would be wrong to boot. "ref" is an indirect type, like a pointer. Changes made to "it" (eg the reference'd value) *need* to be seen in the out contract. I don't think it makes sense to talk about "the old value of the reference". That'd be like asking for the old value of what a pointer was pointing to.
September 20, 2013
On 20/09/13 18:37, H. S. Teoh wrote:
> True. I agree that redefining plain "x" to mean "initial value of x"
> would be difficult to implement for ref arguments, and will also break
> existing code to boot. Having access to x.old is the best solution where
> possible (I'm thinking, for the sake of implementational simplicity,
> that we can restrict x.old to only work for parameters passed by value,
> since D doesn't have a generic way of cloning a reference argument).

Agree.

> No, but the current notation is ambiguous and misleading. I agree that
> x.in and x.out (or some such notation) is better.

Yes, I can go with that.  There could be a deprecation path for x => x.out if that's desirable, although there's precedent (as Paolo has pointed out) for using just x.

> I would rather say that it's illegal to refer to plain 'x' in an
> out-contract because it's ambiguous. It should always refer to either
> x.in or x.out. (Of course, we can silently still support the current
> syntax in order to preserve backward compatibility, but I'd really
> prefer it to be deprecated.)

Sure.

> This issue is fixed if we make x.in and x.out mandatory.

Agree, and I was never arguing against that.

> I'd say this is a good argument for making x.in and x.out mandatory
> syntax. Then it becomes clear and readable:
>
> 	void foo(int *x)
> 	 in { assert(*x.in >= 0.0; }
> 	out { assert(*x.out >= 0.0; }
> 	body { ... }
>
> And for moving non-contract asserts (i.e. those that check
> implementation sanity rather than make any guarantees of interest to the
> caller) into the function body rather than putting them in the
> out-contract.

In this case, yes, you can reliably handle it with in/out syntax.  It becomes problematic if x is a more complex data type, e.g. a class with many reference-type members (arrays, other classes, ...).  As there's no way to reliably do a deep copy of such an entity (and of course you might have entities such as input ranges where you really can't preserve the state), there are always going to be function inputs where you can't readily access x.in in the out-contract.
September 20, 2013
On 20/09/13 18:50, monarch_dodra wrote:
> I think it would be wrong to boot. "ref" is an indirect type, like a pointer.
> Changes made to "it" (eg the reference'd value) *need* to be seen in the out
> contract. I don't think it makes sense to talk about "the old value of the
> reference". That'd be like asking for the old value of what a pointer was
> pointing to.

Yup.  But this is why I think it makes sense for a variable x in the out-contract to always mean "the final value of x at the end of the function". If _for reference types_ the out-contract sees the changed value, that should be true for all types, otherwise it's an inconsistency that is going to bite people.

Then, in addition, you should be able to have an explicit request (x.in, x.old ...) for the initial value (as you needed in your contract).
September 20, 2013
On Friday, 20 September 2013 at 18:14:44 UTC, Joseph Rushton Wakeling wrote:
> On 20/09/13 18:50, monarch_dodra wrote:
>> I think it would be wrong to boot. "ref" is an indirect type, like a pointer.
>> Changes made to "it" (eg the reference'd value) *need* to be seen in the out
>> contract. I don't think it makes sense to talk about "the old value of the
>> reference". That'd be like asking for the old value of what a pointer was
>> pointing to.
>
> Yup.  But this is why I think it makes sense for a variable x in the out-contract to always mean "the final value of x at the end of the function". If _for reference types_ the out-contract sees the changed value, that should be true for all types, otherwise it's an inconsistency that is going to bite people.
>
> Then, in addition, you should be able to have an explicit request (x.in, x.old ...) for the initial value (as you needed in your contract).

I disagree. Isn't the entire point of "pass by value" to mean "I operate on a copy, so you won't see any difference to the argument afterwards" and ref to mean "I operate on the original object, so all differences I make you will see"?

I think it is wrong to think in terms of "orginal/final" to begin with. The "out" contract should simply see what the caller sees. If the function body mutates something the caller doesn't see, than neither should the out contract.

The notion of "old"/"new" or "in"/"out" feels strange to me. It's making a lot of special cases for something that should really be very simple to begin with.
September 20, 2013
On 20/09/13 20:21, monarch_dodra wrote:
> I disagree. Isn't the entire point of "pass by value" to mean "I operate on a
> copy, so you won't see any difference to the argument afterwards" and ref to
> mean "I operate on the original object, so all differences I make you will see"?
>
> I think it is wrong to think in terms of "orginal/final" to begin with. The
> "out" contract should simply see what the caller sees. If the function body
> mutates something the caller doesn't see, than neither should the out contract.

I get the concept, but I find it unattractive in practice just because it means you have different variables being treated differently in the out-contract depending on whether they are references or values.

I think that consistency of how variables are treated in the out-contract is probably going to turn out less problematic than a theoretically ideal situation where the contract addresses what the outside user can/should be able to "see".

But I accept that I may be influenced by the fact that I'm used to working in open-source situations where the body of a function is always available if you really want to look at it.  If you're looking at it purely on the basis of an interface or a function declaration, I can understand why your concept might seem preferable.

> The notion of "old"/"new" or "in"/"out" feels strange to me. It's making a lot
> of special cases for something that should really be very simple to begin with.

Well, I think it's inescapable that in different circumstances you are going to want the out-contract to be able to access either the initial or the final value of a particular variable.  Preventing the programmer from doing one or the other seems an unnecessary restriction.

As Paolo has pointed out, it's in keeping with how contracts were originally defined.