Thread overview
Loop invariants
Sep 27
monkyyy
Oct 03
Dukc
September 27

In spirit of this issue comment, but here with better syntax.

We could use the invariant keyword for loop invariants. Of course, loop invariants can be specified already, so they’d fall into a similar category as scope guards: They make it easy to convey intent and get things right without thinking too much. And in fact, doing loop invariants by hand is a lot harder than doing scope guards by hand, one reason is that doing invariants by hand uses a scope guard.

As with any form of assertion, loop invariants can help programmers find errors in the implementation or their reasoning. Loop invariants are very aggressive and must hold on more spots than most people expect (including myself before writing this).

If you open a textbook or the Wikipedia page on Loop invariant, it’ll tell you: A loop invariant must hold before the loop runs, at the beginning of the loop, at the end of the loop, and after the loop.

I interpret “at the end of the loop” as exiting the loop by any means other than throwing an exception. (Detailed reasoning at the end of the post.)

The basic idea is that after the loop header, allow for invariant(AssertArguments) or invariant{ Statements }, where (as with aggregate invariant) the syntax invariant(AssertArguments) is a shorthand for invariant{ assert(AssertArguments); }.

D has 4 kinds of loops:

  • dowhile loops
  • while loops
  • for loops
  • foreach loops (includes foreach_reverse)

Each of these has to be considered individually. I ordered them by my personal perceived complexity of adding invariants to them.

Any assert statements in loop invariants are emitted if and only if the compiler option -check=invariant is active, independent of -check=assert.

In the following, if it says that an expression is “sufficiently pure” it means that the function derived from the expression as follows compiles:
Replace the Expression by an immediately invoked lambda ((Parameters) pure => Expression)(Arguments) where Arguments is the list of all referenced mutable variables (including globals) and Parameters is the same list as Arguments, just with every parameter prefixed const ref. For example: i < n is transformed to ((const ref i, const ref n) pure => i < n)(i, n).
Note that this is not equivalent to just casting every mutable variable to const inside a pure immediately invoked lambda, i.e. (() pure => ExpressionwithConst)() because accessing globals is expressly allowed. It is purely about the lack of side-effects and mutation. Merely reading a mutable global is fine.

The dowhile loop

This loop is the easiest case, and it is already hard.

Example with proposed syntax:

int x = 10;
do invariant{ assert(x >= 0, "optional message") }
{
    // loop body
}
while (x > 0);

Example without proposed syntax:

int x = 10;
do
{
    { assert(x >= 0, "optional message"); }
    scope(success) { assert(x >= 0, "optional message"); }
    // loop body
}
while (x > 0);
{ assert(x >= 0, "optional message"); } // not needed if the loop condition is sufficiently pure

The invariant check after do can be moved before do if the loop condition is sufficiently pure.

The while loop

Example with proposed syntax:

int x = 10;
while (x > 0) invariant(x >= 0)
{
    // loop body
}

Example without proposed syntax:

int x = 10;
assert(x >= 0);
while (x > 0)
{
    assert(x >= 0);
    scope(success) assert(x >= 0);
    // loop body
}
assert(x >= 0); // not needed if the loop condition is sufficiently pure

However, when a variable is declared in the loop condition and that variable is referred to by the invariant, the two assert expressions before and after the loop are not well-formed.

In those, consider the assert condition a tree of && and || and replace nodes that reference the loop variable by true if the parent node is && or the assert, and false if the parent node is ||. If the message references the loop variable, remove the message.

If the invariant uses block syntax and the loop variable is referenced outside assert, the whole invariant is ignored (i.e. vacuously true).

Example with proposed syntax:

int m = 0;
while (int x = init()) invariant(x >= m && m > 0)
{
    // loop body
}

int n = 0;
while (int x = init()) invariant(x >= n)
{
    // loop body
}

Example without proposed syntax:

int m = 0;
assert(true && m > 0);
while (int x = init())
{
    assert(x >= m && m > 0);
    scope(success) assert(x >= m && m > 0);
    // loop body
}
assert(true && m > 0); // not needed if the loop condition is sufficiently pure

int n = 0;
// assert(true); // need not be emitted
while (int x = init())
{
    assert(x >= n);
    scope(success) assert(x >= n);
    // loop body
}
// assert(true); // need not be emitted

The for loop

Example with proposed syntax:

for (int i = 0, n = 10; i < n; ++i)
invariant(i <= n && n > 0)
{
    // loop body
}

Example without proposed syntax:

{
int i = 0, n = 10;
assert(i <= n && n > 0);
for (; i < n; ++i)
{
    assert(i <= n && n > 0);
    scope(success) assert(i <= n && n > 0);
    // loop body
}
assert(i <= n && n > 0); // MUST BE EMITTED EVEN IF condition is sufficiently pure
}

In the for loop, no replacement is needed. The variables declared in the initialization part can be moved before the check, so they are accessible after the loop ends. A replacement like in the while loop would be needed if the condition of a for loop could declare a variable as well, but as of now, it can’t.

Another difference is that the assert after the loop must be emitted even if the loop condition is sufficiently pure. Only if the if the loop condition is sufficiently pure and the increment is sufficiently pure (which is rare in practice), it can be omitted. That is because the scope(success) runs before the increment.

The foreach loop

Normally, a foreach loop is lowered to a for loop.
This lowering cannot be used for the handling of invariants as the invariant would be tested before the foreach variable is set.

The invariant lowering must happen before the lowering of foreach to a for loop.

Example with proposed syntax:

foreach (x; xs)
invariant(x >= 0 && xs.length > 0)
{
    // loop body
}

Example without proposed syntax:

assert(true && xs.length > 0);
foreach (x; xs)
{
    assert(x >= 0 && xs.length > 0);
    scope(success) assert(x >= 0 && xs.length > 0);
    // loop body
}
assert(true && xs.length > 0); // May be omitted (see below)

As with a while loop, before and after the loop, uses of the loop variable in the invariant are replaced by true or false, and if the invariant uses a block that references the loop variable outside an assert, the whole invariant is ignored.

The omission of the assert after the loop depends on what the foreach runs over.

  • It can be omitted if the foreach runs over a built-in array or slice type, as the generated increment (of the index) is pure and the condition has no access to the index.
  • It can be omitted if the foreach runs over a built-in associative array as the compiler can assume the code between the last iteration and the end of _aaApply is pure and the condition has no access to the iteration state.
  • It can be omitted if the aggregate is a range and the popFront call is sufficiently pure. In practice, it basically never is, as it modifies the range, but some trivial infinite ranges may actually have a sufficiently pure popFront. It can also be omitted if popFront is pure and the condition has no access to the range (which is in general hard to determine).
  • It can be omitted if the aggregate uses opApply and that is sufficiently pure (it rarely is, as it requires passing a strongly pure delegate that is additionally const or immutable – actually that, not with the current buggy implementation) or the assert condition is sufficiently pure and has no access to the range (which is in general hard to determine).

(I might be wrong on these conditions.)


In general, it is rather clear what it means that the invariant must hold before the loop, at the start of the loop and after the loop. The only piece of contention is what exactly “at the end of the loop” means. Of course, it includes that control reaches the end of the loop, but a loop can be exited in other ways: break is one, goto with a label outside the loop is one, and via exception is another one.

In my personal opinion, break is a common way to exit a loop and the invariant must hold on break. While some might argue that goto with a label outside the loop is special, I think it’s not. On the other hand, exiting via exception is a different beast. You know where you break and goto, but it’s in general hard to foresee where an exception might end the loop prematurely and if a failed invariant changes the thrown Exception to an Error, that might be really unexpected. This is why I opted for scope(success) and not scope(exit).

September 27

On Friday, 27 September 2024 at 18:10:13 UTC, Quirin Schroll wrote:

>
int x = 10;
assert(x >= 0);
while (x > 0)
{
    assert(x >= 0);
    scope(success) assert(x >= 0);
    // loop body
}
assert(x >= 0); // not needed if the loop condition is ```

While not just simply make this to an (even worse) do-while loop

int x=10;
do_allot{
  assert(x>=0);
} while(x>0){
  ...
}

where do_allot will run all these extra times?

>

If you open a textbook or the Wikipedia page on Loop invariant, it’ll tell you: A loop invariant must hold before the loop runs, at the beginning of the loop, at the end of the loop, and after the loop.

Why would wikipedias definition matter? If the current naive syntax of foreach(...){assert() runs N times, how much value is there in a special case that runs N+2 times?

October 03

On Friday, 27 September 2024 at 18:10:13 UTC, Quirin Schroll wrote:

>

In spirit of this issue comment, but here with better syntax.

Would these invariants be used often? Personally I find that my assertion usage is almost exclusively plain assertions, in contracts and unit test assertions. I rarely end up using the type invariants. I think it'd be even rarer for me to use the loop invariant. For me personally, this would be needless complexity.

Maybe it's just me though. What do others think? Do you use (or would you use, given some kind of project) assertions so aggressively that you feel this would improve the language?

invariant is already a keyword and this isn't much different from how the existing contracts work so it wouldn't a complex feature. I therefore think that a significant minority who would benefit from this would be enough justification for the feature. It doesn't have to be over 50%.

October 04

On Friday, 27 September 2024 at 18:10:13 UTC, Quirin Schroll wrote:

>

In spirit of this issue comment, but here with better syntax.

According to the linked issue, this feature was proposed in the past, but was "discarded as being of negligible value." Unless you have some specific reason to think that loop invariants will be more useful to D programmers today than they were back then, this is probably not worth pursuing.