On Wednesday, 24 May 2023 at 23:04:53 UTC, Cecil Ward wrote:
> Due to some stupidity of mine involving misuse of templates I ended up with a routine that simply called itself infinitely. Directly, that is, and in this case the routine was optimised down into a simple infinite loop with no loop body content.
I had this almost happen in my C++ code. It never ran because I got a compile error. The compiler is MSVC.
What I did is quite simple: Have an overload that forwards to another overload; essentially:
R f(actual args) { /*actual work...*/ }
R f(alternative args) => f(actual args);
I misspelled the second and it called itself.
> Is it possible for our compilers to detect direct infinite recursion? I’m ignoring the indirect call case, and I’m also ignoring anything involving function pointers, so given those restrictions, we will be able to see that the generated code of a routine is calling its own address, because that address is a literal and we know what the current function is that we’re currently in.
As MSVC shows, the answer is yes. Obviously, simple cases are detectable. Of course, solving this 100% is equivalent to the halting problem. This is not a problem whatsoever; D has multiple examples of best-effort approaches: @safe
is a good example. Not every code that actually is without undefined behavior can be detected as such, but a lot of simple cases can. Likewise, not every infinite recursion (or infinite loop without side-effects) can be detected, but simple cases can.
I know a little about C++ compilers: GCC and Clang produce warnings in the font-end, but MSVC produces warnings in the back-end. For that reason, MSVC can report problems that became apparent after optimization.
In C++, an infinite loop without observable effects is undefined behavior. The optimizer can use this in the form that a loop that is not obviously finite can be assumed to be finite if it does not have observable effects.
To my knowledge, D does not make infinite loop without observable effects undefined behavior, but D has another policy: It’s reasonable to make an error what the programmer cannot possibly want. An infinite loop without observable effects is an example of that.
> And while we’re at it, is it possible to detect all kinds of straightforward infinite loops (not coming from recursion)?
I’d say that “all kinds of straightforward” is a contradiction in itself. The term “straightforward” is subjective. The detection provably cannot be perfect. We have to select cases.
> That is where the generated code, perhaps as a result of optimisation, is a straight loop with no body. Can we implement that?
Doing that after optimization has one problem: Optimization is optional and subject to invisible improvement. Whether you get compile-errors should not depend on which compiler you use and what optimization level you specified.
Practical example: Say you have code that is practically dead code and has an infinite loop. It’s practice never reached (or provably unreachable, but the compiler doesn’t detect it). Everything compiles and is fine. Now we make the infinite loop an error, when it’s detected after optimization. The optimizer doesn’t reduce it enough, so still everything compiles and is fine. In a later compiler version, the optimizer is improved, and now the loop is detected and the build fails. This is not a great user experience.
It’s totally another thing if it’s a warning.
It’s totally another thing if the detection is in the front end, the changelog mentions an improvement to infinite loop detection and the code falls under it.
If my boss asks my why the build failed, I could point to the compiler update and the changelog.
> It ought to be possible because the backend can spot a pattern like "L1: jmp L1".
- If it comes from the front-end and is consistent among compilers and optimization levels, it should be an error. It could be a warning, but the no-warnings policy applies.
- If it comes from the back-end, it should be a warning.
> I promise I’m not trying to solve the Halting Problem.
The Halting Problem can be solved in special cases. It is perfectly reasonable to ask for best-effort solutions. @safe
is one of them.
A crucial component is pure
, which D fortunately already has. A quick sketch would be:
- A loop with a constant-
true
condition is infinite if it contains no break
. It has definitely no observable effects if all operations in it are pure
.
- A function definitely recurses infinitely if the only reachable
return
statements lead to a recursion. It has no observable effects if all operations between the function entry and those return
statements are pure
.
In contrast to @safe
, which is an under-approximation, the infinite loop detection is an over-approximation. By that I mean that – accepts-invalid bugs aside – code that is @safe
today will remain @safe
in the future; code that today isn’t @safe
but would be valid as @trusted
today might be @safe
in v2.142.1. The @safe
checks reject code that might be unsafe. The infinite loop detection wouldn’t reject code that might be an infinite loop (there are languages that do that; e.g. Coq guarantees termination); it would reject some code that definitely is an infinite loop. And with improvements, there might be more and more (not less and less) cases that will be detected as definitely an infinite loop.
DMD neither detects while(1){}
nor int f(int x) => f(x);
. Those would be bare-minimum test cases. You get an error for the latter if you make the return type auto
, so there is something to work with. auto
return type detection accounts for infinite loops, even via mutual recursion (e.g. f
calls g
, g
calls h
, h
calls f
).
Actually, you can count the fact that, in D, an empty statement is invalid for loop bodies, is a measure against unintentional infinite loops: while (cond());
in usual C++ compilers triggers an infinite-loop warning; in D, it’s syntactically invalid.