August 06, 2014
>
>
>> The main argument seems to revolve around whether this is actually a
>> change
>> or not.  In my (and others') view, the treatment of assert as 'assume' is
>> not a change at all.  It was in there all along, we just needed the wizard
>> to tell us.
>>
>>
> How can there be any question? This is a change in the compiler, a change in the docs, change in what your program does, change of the very bytes in the executable. If my program worked before and doesn't now, how is that not a change? This must be considered a change by any reasonable definition of the word change.
>
> I don't think I can take seriously this idea that someone's unstated, unmanifested intentions define change more so than things that are .. you know.. actual real changes.
>
>
Yes, sorry, there will be actual consequences if the optimizations are implemented.  What I meant by the somewhat facetious statement was that there is no semantic change - broken code will still be broken, it will just be broken in a different way.

If you subscribe to the idea that a failed assertion indicates all subsequent code is invalid, then subsequent code is broken (and undefined, if the spec says it is undefined).  The change would be clarifying this in the spec, and dealing with the fallout of previously broken-but-still-working code behaving differently under optimization.

At least, that's how I understand it... I hope I am not mischaracterizing others' positions here (let me know, and I'll shut up).




> Well I think I outlined the issues in the OP. As for solutions, there
> have been some suggestions in this thread, the simplest is to leave things as is and introduce the optimizer hint as a separate function, assume().
>
> I don't think there was any argument presented against a separate function besides that Walter couldn't see any difference between the two behaviors, or the intention thing which doesn't really help us here.
>
>
An argument against the separate function: we already have a function, called assert.  Those that want the nonoptimizing version (a disable-able 'if false throw' with no wider meaning) should get their own method darnit.


> I guess the only real argument against it is that pre-existing asserts
> contain significant optimization information that we can't afford to not reuse.


Yessss.  This gets to the argument - asserts contain information about the program.  Specifically, a statement about the valid program state at a given point.  So we should treat them as such.


> But this is a claim I'm pretty skeptical of.


Ah man, thought I had you.


> Andrei admitted it's just a hunch at this point. Try looking through your code base to see how many asserts would be useful for optimizing.
>

Ironically, I don't tend to use asserts at all in my code.  I do not want code that will throw or not throw based on a compiler flag.  Why am I even arguing about this stuff?

If asserts were used for optimizing, I would look at actually using them more.


>> (Can we programmatically (sp?) identify and flag/resolve
>> issues that occur from a mismatch of expectations?)
>>
>
> I'm not an expert on this, but my guess is it's possible in theory but would never happen in practice. Such things are very complex to implement, if Walter won't agree to a simple and easy solution, I'm pretty sure there's no way in hell he would agree to a complex one that takes a massive amount of effort.
>

If gcc et al. do similar optimizations, how do they handle messaging?


August 06, 2014
On 8/5/14, 11:28 PM, Tofu Ninja wrote:
> On Wednesday, 6 August 2014 at 00:52:32 UTC, Walter Bright wrote:
>> On 8/3/2014 4:51 PM, Mike Farnsworth wrote:
>>> This all seems to have a very simple solution, to use something like:
>>> expect()
>>
>> I see code coming that looks like:
>>
>>    expect(x > 2);  // must be true
>>    assert(x > 2);  // check that it is true
>>
>> All I can think of is, shoot me now :-)
>
> How about something like
> @expected assert(x > 2); or @assumed assert(x > 2);
>
> It wouldn't introduce a new keyword, but still introduces the
> expected/assumed semantics. You should keep in mind that you
> might have to make a compromise, regardless of your feelings on
> the subject.

I think "assert" is good to use for optimization, and "debug assert" would be a good choice for soft assertions. Care must be exercised with tying new optimizations to build flags.

> Also, I am going to try to say this in as respectful a way as I
> can...
>
> Please stop responding in such a dismissive way, I think it is
> already pretty obvious that some are getting frustrated by these
> threads. Responding in a dismissive way makes it seem like you
> don't take the arguments seriously.

I have difficulty figuring how such answers can be considered dismissive. The quoted code is considered an antipattern at least e.g. at my workplace. (Wouldn't pass review, and disproportionate insistence on such might curb one's career.) Even though some might not agree with Walter's opinion, it's entirely reasonable to express dislike of that code; I don't quite get why that would be considered dismissive. I think we're at the point where everybody understands one another, and there must be a way to express polite but definite disagreement. What would that be?


Thanks,

Andrei

August 06, 2014
>
> I feel that, at this stage, this is only about how a compiler flag, specifically "-release", works. For other configurations, there is no problem: even if the optimizer optimizes based on asserts, the asserts themselves are part of the code: the code is there and the assertion will fail before execution enters the optimized path. This is just like any other optimization, nothing special about it.
>
> The problem with "-release" might be formulated thus: it optimizes based on code that is no longer present (the asserts are wiped out). It keeps the optimization, but it dismisses the safeguards.
>


Yes!

[lengthy and imprecise rambling about assert definition omitted]


August 06, 2014
On Wednesday, 6 August 2014 at 06:56:40 UTC, eles wrote:
> I feel that, at this stage, this is only about how a compiler flag, specifically "-release", works. For other configurations, there is no problem: even if the optimizer optimizes based on asserts, the asserts themselves are part of the code: the code is there and the assertion will fail before execution enters the optimized path. This is just like any other optimization, nothing special about it.

Not right:

b = a+1
assume(b>C)

implies

assume(a+1>C)
b = a+1
August 06, 2014
On Wednesday, 6 August 2014 at 07:19:21 UTC, Andrei Alexandrescu wrote:
> The quoted code is considered an antipattern at least e.g. at my workplace.

What about:

«
if(x==0){ …free of x…}
…free of x…
assume(x!=0)
»

being equivalent to

«
assume(x!=0)
if(x==0){ …free of x…}
…free of x…
»
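Spelled out as plain D (a minimal sketch; `g` is a made-up name, and assert stands in for assume, which is exactly the proposed semantics), the equivalence lets the earlier branch be deleted:

    int g(int x, int y)
    {
        if (x == 0) { y += 1; }  // body doesn't modify x and doesn't return, so...
        y *= 2;
        assert(x != 0);          // ...control always reaches this point. Treated as
                                 // assume(x != 0), it says x == 0 was impossible all
                                 // along, so the branch above becomes dead code.
        return y;
    }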

> I think we're at the point where everybody understands one another

Really? I am at the point where I realize that a significant portion of programmers have gullible expectations of their own ability to produce provably correct code and a very sloppy understanding of what computing is.

So now we don't have Design by Contract, but Design by Gullible Assumptions.

Great…
August 06, 2014
On Wednesday, 6 August 2014 at 07:19:21 UTC, Andrei Alexandrescu wrote:
> On 8/5/14, 11:28 PM, Tofu Ninja wrote:
>> On Wednesday, 6 August 2014 at 00:52:32 UTC, Walter Bright wrote:
>>> On 8/3/2014 4:51 PM, Mike Farnsworth wrote:

> I think "assert" is good to use for optimization, and "debug assert" would be a good choice for soft assertions. Care must

Conceptually, this means a "release assert" (both in debug and release builds) and a "debug assert" (only in debug builds).

Thus, the question: is it acceptable to optimize a (release) build based on code that is present only in another (debug) build?
August 06, 2014
On 8/5/2014 11:28 PM, Tofu Ninja wrote:
> Please stop responding in such a dismissive way, I think it is
> already pretty obvious that some are getting frustrated by these
> threads. Responding in a dismissive way makes it seem like you
> don't take the arguments seriously.

I responded to the equivalent design proposal several times already, with detailed answers. This one is shorter, but the essential aspects are there. I know those negative aspects came across because they are addressed with your counter:

> How about something like
> @expected assert(x > 2); or @assumed assert(x > 2);
> It wouldn't introduce a new keyword, but still introduces the
expected/assumed semantics.

The riposte:

1. it's long with an unappealing hackish look
2. it follows in the C++ tradition of the best practice being the long ugly way, and the deprecated practice is the straightforward way (see arrays in C++)
3. users will be faced with two kinds of asserts, with a subtle difference that is hard to explain, hard to remember which is which, and will most likely use them inappropriately
4. everyone who wants faster assert optimizations will have to rewrite their (possibly extensive) use of asserts that we'd told them was best practice. I know I'd be unhappy about having to do such to my D code.


> You should keep in mind that you might have to make a compromise, regardless of your feelings on the subject.

This is not about my feelings, other than my desire to find the best design based on a number of tradeoffs.


I'll sum up with the old saw that any programming problem can be solved with another level of indirection. I submit a corollary that any language issue can be solved by adding another keyword or compiler flag. The (much) harder thing is to solve a problem with an elegant solution that does not involve new keywords/flags, and fits in naturally.
August 06, 2014
On Wednesday, 6 August 2014 at 07:29:02 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 6 August 2014 at 06:56:40 UTC, eles wrote:

> Not right:
>
> b = a+1
> assume(b>C)
>
> implies
>
> assume(a+1>C)
> b = a+1

b = a+1
if(C<=b) exit(1);

implies

if(C<=a+1) exit(1);
b = a+1

Is it not the same? Still, one would allow the optimizer to exit before executing the b=a+1 line (in the first version) based on that condition (assuming no volatile variables).

I stick to my point: as long as the code is there, optimization based on it is acceptable (as long as the functionality of the program is not changed, of course). The sole sticking point is what to do when the code that was used for optimization is discarded.

Would you accept optimization of a C program based on code that is guarded by:

#ifndef NDEBUG
//code that could be used for optimization
#endif

in the -DNDEBUG version?

(I am not 100% convinced that it's the same for D, but it should help with the concepts)
August 06, 2014
On Wednesday, 6 August 2014 at 01:11:55 UTC, Jeremy Powers via Digitalmars-d wrote:
>> That's in the past. This is all about the pros and cons of changing it now
>> and for the future.
>>
>
> The main argument seems to revolve around whether this is actually a change
> or not.  In my (and others') view, the treatment of assert as 'assume' is
> not a change at all.  It was in there all along, we just needed the wizard
> to tell us.

This is already the first misunderstanding: The argument is about whether it's a good idea, not whether it's newly introduced or has been the intended meaning since assert's conception.

>
>
>
> The below can safely be ignored, as I just continue the pedantic
> discussions....
>
>
> OK, but my point was you were using a different definition of undefined
>> behavior. We can't communicate if we aren't using the same meanings of
>> words.
>>
>>
> Yes, very true.  My definition of undefined in this case hinges on my
> definition of what assert means.  If a failed assert means all code after
> it is invalid, then by definition (as I interpret the definition) that code
> is invalid and can be said to have undefined behaviour.  That is, it makes
> sense to me that it is specified as undefined, by the spec that is
> incredibly unclear.  I may be reading too much into it here, but this
> follows the strict definition of undefined - it is undefined because it is
> defined to be undefined.  This is the 'because I said so' defense.

Of course you can define your own concept and call it "undefined", but I don't see how it matters. The concept described by the usual definition of "undefined" still exists, and it still has very different implications than your concept has. To give a more practical example:

You're writing an authentication function. It takes a username and a password, and returns true or false, depending on whether the password is correct for this username. Unfortunately, the verification algorithm is wrong: due to an integer overflow in the hash calculation, it rejects some valid passwords, but never accepts invalid ones. The program is clearly incorrect, but its behaviour is still well-defined and predictable (overflow is not undefined in D).
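A minimal sketch of that scenario (`checkPassword` and `storedHash` are invented names, and the hash is deliberately naive):

    // Illustration only: the stored hash was computed without wraparound, but
    // the check below accumulates in a uint, which wraps on overflow. Wrapping
    // is defined in D, so the function is incorrect but entirely predictable.
    ulong storedHash(string user) { return 123_456_789_012; } // stand-in lookup

    bool checkPassword(string user, string pass)
    {
        uint h;
        foreach (c; pass)
            h = h * 131 + c;          // can overflow; wraps modulo 2^32 (defined)
        return h == storedHash(user); // wrong algorithm, but well-defined: it may
                                      // reject a valid password, nothing worse
    }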

Now, if the flaw in the algorithm were due to an actual undefined operation, everything could happen, including the function accepting invalid passwords. I hope it's clear that this is a very different class of brokenness.

> My stance is that this new/old definition is a good thing, as it matches
> how I thought things were already, and any code that surfaces as broken
> because of it was already broken in my definition.  Therefore this 'change'
> is good, does not introduce breaking changes, and arguments about such
> should be redirected towards mitigation and fixing of expectations.
>
> In an attempt to return this discussion to something useful, question:
>
> If assert means what I think it means (and assuming we agree on what the
> actual two interpretations are), how do we allay concerns about it?  Is
> there anything fundamentally/irretrievably bad if we use this new/old
> definition?  (Can we programmatically (sp?) identify and flag/resolve
> issues that occur from a mismatch of expectations?)

My understanding (which is probably the same as that of most people participating in the discussion, because as I said above, I _don't_ think the argument is about a misunderstanding on this level):

Walter's assert:
* Expresses a statement that the programmer intends to be true. It is only checked in non-release mode.
* The compiler can assume that it is true - even in release mode - because the programmer explicitly said so, and the compiler may not have figured it out by itself (similar to casts, which also express assumptions by the programmer that the compiler cannot know otherwise).
* Asserting a condition that is false is undefined behaviour.

The other assert:
* Expresses a statement that the programmer intends to be true. It is only checked in non-release mode.
* Because it is unlikely that the programmer has proved the correctness in the general case, the compiler must not assume it is true unless it can prove it to be, either at compile time, or with a runtime check. Release mode disables the runtime checks.
* Asserting a condition that is false either raises an error at runtime, aborts compilation, or doesn't do anything. It never causes undefined behaviour by itself.
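The practical difference between the two shows up in code like this (purely illustrative, `validate` is a made-up name):

    bool validate(int x)
    {
        assert(x > 0);
        if (x <= 0)           // Walter's assert: the compiler may assume x > 0
            return false;     // and delete this guard, even in -release.
        return true;          // The other assert: the guard stays unless the
                              // compiler can actually prove it redundant.
    }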

As I already wrote elsewhere, an assert with the suggested/intended behaviour is a very dangerous tool that should not be used as widely as it is today. If the asserted condition is wrong (for whatever reason), it would create not just wrong behaviour, but undefined behaviour (as described above, not your concept).

H.S. Teoh, however, suggested extending compile-time checking for assertions. I believe this is the way to go forward, and it has great potential. What I don't agree with, of course, is to just believe anything in the assertions to be true without verifying it.
August 06, 2014
On Tuesday, 5 August 2014 at 21:17:14 UTC, H. S. Teoh via Digitalmars-d wrote:
> On Tue, Aug 05, 2014 at 08:11:16PM +0000, via Digitalmars-d wrote:
>> On Tuesday, 5 August 2014 at 18:57:40 UTC, H. S. Teoh via Digitalmars-d
>> wrote:
>> >Exactly. I think part of the problem is that people have been using
>> >assert with the wrong meaning. In my mind, 'assert(x)' doesn't mean
>> >"abort if x is false in debug mode but silently ignore in release
>> >mode", as some people apparently think it means. To me, it means "at
>> >this point in the program, x is true".  It's that simple.
>> 
>> A language construct with such a meaning is useless as a safety
>> feature.
>
> I don't see it as a safety feature at all.

Sorry, I should have written "correctness feature". I agree that it's not very useful for safety per se. (But of course, safety and correctness are not unrelated.)

>
>
>> If I first have to prove that the condition is true before I can
>> safely use an assert, I don't need the assert anymore, because I've
>> already proved it.
>
> I see it as future proofing: I may have proven the condition for *this*
> version of the program, but all software will change (unless it's dead),
> and change means the original proof may no longer be valid, but this
> part of the code is still written under the assumption that the
> condition holds. In most cases, it *does* still hold, so in general
> you're OK, but sometimes a change invalidates an axiom that, in
> consequence, invalidates the assertion.  Then the assertion will trip
> (in non-release mode, of course), telling me that my program logic has
> become invalid due to the change I made.  So I'll have to fix the
> problem so that the condition holds again.

Well, I think it's unlikely that you actually did prove the assert condition, except in trivial situations. This is related to the discussion about the ranges example, so I'll respond there.

>
>
>> If it is intended to be an optimization hint, it should be implemented
>> as a pragma, not as a prominent feature meant to be widely used. (But
>> I see that you have a different use case, see my comment below.)
>
> And here is the beauty of the idea: rather than polluting my code with
> optimization hints, which are out-of-band (and which are generally
> unverified and may be outright wrong after the code undergoes several
> revisions), I am stating *facts* about my program logic that must hold
> -- which therefore fits in very logically with the code itself. It even
> self-documents the code, to some extent. Then as an added benefit, the
> compiler is able to use these facts to emit more efficient code. So to
> me, it *should* be a prominent, widely-used feature. I would use it, and
> use it a *lot*.

I think this is where we disagree mainly: What you call facts is something I see as intentions that *should* be true, but are not *proven* to be so. Again, see below.

>
> 
>> >The optimizer only guarantees (in theory) consistent program
>> >behaviour if the program is valid to begin with. If the program is
>> >invalid, all bets are off as to what its "optimized" version does.
>> 
>> There is a difference between invalid and undefined: A program is
>> invalid ("buggy"), if it doesn't do what it's programmer intended,
>> while "undefined" is a matter of the language specification. The
>> (wrong) behaviour of an invalid program need not be undefined, and
>> often isn't in practice.
>
> To me, this distinction doesn't matter in practice, because in practice,
> an invalid program produces a wrong result, and a program with undefined
> behaviour also produces a wrong result. I don't care what kind of wrong
> result it is; what I care is to fix the program to *not* produce a wrong
> result.

Please see my response to Jeremy; the distinction is important:
http://forum.dlang.org/thread/hqxoldeyugkazolllsna@forum.dlang.org?page=11#post-eqlyruvwmzbpemvnrebw:40forum.dlang.org

>
>
>> An optimizer may only transform code in a way that keeps the resulting
>> code semantically equivalent. This means that if the original
>> "unoptimized" program is well-defined, the optimized one will be too.
>
> That's a nice property to have, but again, if my program produces a
> wrong result, then my program produces a wrong result. As a language
> user, I don't care that the optimizer may change one wrong result to a
> different wrong result.  What I care about is to fix the code so that
> the program produces the *correct* result. To me, it only matters that
> the optimizer does the Right Thing when the program is correct to begin
> with. If the program was wrong, then it doesn't matter if the optimizer
> makes it a different kind of wrong; the program should be fixed so that
> it stops being wrong.

We're not living in an ideal world, unfortunately. It is bad enough that programs are wrong as they are written, we don't need the compiler to transform these programs to do something that is still wrong, but also completely different. This would make your goal of fixing the program very hard to achieve. In an extreme case, a small error in several million lines of code could manifest at a completely different place, because you cannot rely on any determinism once undefined behaviour is involved.

>
>
>> >Yes, the people using assert as a kind of "check in debug mode but
>> >ignore in release mode" should really be using something else
>> >instead, since that's not what assert means. I'm honestly astounded
>> >that people would actually use assert as some kind of
>> >non-release-mode-check instead of the statement of truth that it was
>> >meant to be.
>> 
>> Well, when this "something else" is introduced, it will need to
>> replace almost every existing instance of "assert", as the latter must
>> only be used if it is proven that the condition is always true. To
>> name just one example, it cannot be used in range `front` and
>> `popFront` methods to assert that the range is not empty, unless there
>> is an additional non-assert check directly before it.
>
> I don't follow this reasoning. For .front and .popFront to assert that
> the range is non-empty, simply means that user code that attempts to do
> otherwise is wrong by definition, and must be fixed. I don't care if
> it's wrong as in invalid, or wrong as in undefined, the bottom line is
> that code that calls .front or .popFront on an empty range is
> incorrectly written, and therefore must be fixed.

Just above you wrote that you "may have proven the condition". But in code like the following, there cannot be a proof:

    @property T front() {
        assert(!empty);
        return _other_range.front;
    }

This is in the standard library. The authors of this piece of code cannot have proven that the user of the library only calls `front` on a non-empty range. Now consider the following example (mostly made up, but not unrealistic) that parses a text file (this could be a simple text-based data format):

    // ...
    // some processing
    // ...
    input.popFront();
    // end of line? => swallow and process next line
    if(input.front == '\n') { // <- this is wrong
        input.popFront();
        continue;
    }
    // ...
    // more code that doesn't call `input.popFront`
    // ...
    // more processing of input
    if(!input.empty) {    // <- HERE
        // use input.front
    }

With the above definition of `front`, the second check marked "HERE" can be removed by the compiler. Even worse, if you insert `writeln(input.empty)` before the check for debugging, it might also output "false" (depending on how far the compiler goes).

Yes, this code is wrong. But it's an easy mistake to make, it might not be detected during testing because you only use correctly formatted input files, and it might also not lead to crashes (the buffer is unlikely to end at a boundary to unmapped memory).

Now the assert - which is supposed to be helping the programmer write correct code - has made it _harder_ to detect the cause of an error.

What's worse is that it also removed a check that was necessary. This check could have been inserted by the programmer because the section of the code is security relevant, and they didn't want to rely on the input file to be correct. The compiler has thereby turned a rather harmless mistake that would under normal circumstances only lead to an incorrect output into a potentially exploitable security bug.


> -- snip --
> But if I've convinced myself that it is
> correct, then I might as well disable the emptiness checks so that my
> product will deliver top performance -- since that wouldn't be a problem
> in a correct program.

The problem is, as I explained above, that it doesn't just disable the emptiness checks where the asserts are. A simple mistake can have subtle and hard to debug effects all over your program.

> In theory, the optimizer could use CTFE to reduce the function call, and
> thereby discover that the code is invalid. We don't have that today, but
> conceivably, we can achieve that one day.
>
> But taking a step back, there's only so much the compiler can do at
> compile-time. You can't stop *every* unsafe usage of something without
> also making it useless. While the manufacturer of a sharp cutting tool
> will presumably do their best to ensure the tool is safe to use, it's
> impossible to prevent *every* possible unsafe usage of said tool. If the
> user points the chainsaw to his foot, he will lose his foot, and there's
> nothing the manufacturer can do to prevent this except shipping a
> non-functional chainsaw. If the user insists on asserting things that
> are untrue, there will always be a way to bypass the compiler's static
> checks and end up with undefined behaviour at runtime.

I wouldn't be so pessimistic ;-)

I guess most assert conditions are simple, mostly just comparisons or equality checks of one value against a constant. This should be relatively easy to verify with some control/data flow analysis (which Walter has avoided until now, understandably).

But CTFE is on the wrong level. It could only detect some of the failed conditions. It needs to be checked on a higher level, as real correctness proofs. If an assert condition cannot be proven - because it's always wrong, or just sometimes, or because the knowledge available to the compiler is not enough - it must be rejected. Think of it like an extension of type and const checking.
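As a rough illustration (my own guess at the kind of conditions such an analysis could and could not accept; `decrement` and `pick` are made-up names):

    int decrement(int x)
    {
        if (x <= 0)
            return 0;
        assert(x > 0);        // provable: dominated by the preceding check
        return x - 1;
    }

    int pick(int[] a, size_t i)
    {
        assert(i < a.length); // not provable from local information alone: it
        return a[i];          // would have to be rejected, or be backed by a
                              // runtime check or a caller-side proof
    }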

>
>
>> It would be great if this were possible. In the example of `front` and
>> `popFront`, programs that call these methods on a range that could
>> theoretically be empty wouldn't compile. This might be useful for
>> optimization, but above that it's useful for verifying correctness.
>
> A sufficiently-aggressive optimizer might be able to verify this at
> compile-time by static analysis. But even that has its limits... for
> example:
>
> 	MyRange range;
> 	assert(range.empty);
> 	if (solveRiemannHypothesis()) // <-- don't know if this is true
> 		range.addElements(...);
>
> 	range.popFront(); // <-- should this compile or not?

It shouldn't, because it's not provable. However, most asserts are far less involved. There could be a specification of what is guaranteed to work, and what all compilers must therefore support.

>
>
>> Unfortunately this is not what has been suggested (and was evidently
>> intended from the beginning)...
>
> I don't think static analysis is *excluded* by the current proposal. I
> can see it as a possible future enhancement. But the fact that we don't
> have it today doesn't mean we shouldn't step in that direction.

I just don't see how we're stepping in that direction at all. It seems like the opposite: instead of trying to prove the assertions statically, they're going to be believed without verification.