Jump to page: 1 2
Thread overview
Direct recursion detection possible?
May 24, 2023
Cecil Ward
May 25, 2023
Cecil Ward
May 25, 2023
Timon Gehr
May 25, 2023
Cecil Ward
May 25, 2023
Cecil Ward
May 25, 2023
Cecil Ward
May 25, 2023
Quirin Schroll
May 25, 2023
Cecil Ward
May 25, 2023
Cecil Ward
May 25, 2023
Cecil Ward
May 25, 2023
Cecil Ward
May 25, 2023
Ali Çehreli
May 25, 2023
Quirin Schroll
May 25, 2023
Dennis
May 25, 2023
Cecil Ward
May 25, 2023
Cecil Ward
May 24, 2023
Due to some stupidity of mine involving misuse of templates I ended up with a routine that simply called itself infinitely. Directly, that is, and in this case the routine was optimised down into a simple infinite loop with no loop body content.

Is it possible for our compilers to detect direct infinite recursion ? I’m ignoring the indirect call case, and I’m also ignoring anything involving function pointers, so given those restrictions, we will be able to see that the generated code of a routine is calling its own address, because that address is a literal and we know what the current function is that we’re currently in.

And while we’re at it, is it possible to detect all kinds of straightforward infinite loops (not coming from recursion)? That is where the generated code, perhaps as a result of optimisation, is a straight loop with no body. Can we implement that?

It ought to be possible because the backend can spot a pattern like "L1:  jmp  L1".

I promise I’m not trying to solve the Halting Problem :)

Cecil Ward.
May 24, 2023

On 5/24/23 7:04 PM, Cecil Ward wrote:

>

Is it possible for our compilers to detect direct infinite recursion ? I’m ignoring the indirect call case, and I’m also ignoring anything involving function pointers, so given those restrictions, we will be able to see that the generated code of a routine is calling its own address, because that address is a literal and we know what the current function is that we’re currently in.

Not in the general case. Direct recursion is a common pattern (direct inifite recursion obviously is not desired, but detecting that is the halting problem).

-Steve

May 25, 2023
On Thursday, 25 May 2023 at 02:30:35 UTC, Steven Schveighoffer wrote:
> On 5/24/23 7:04 PM, Cecil Ward wrote:
>
>> Is it possible for our compilers to detect direct infinite recursion ? I’m ignoring the indirect call case, and I’m also ignoring anything involving function pointers, so given those restrictions, we will be able to see that the generated code of a routine is calling its own address, because that address is a literal and we know what the current function is that we’re currently in.
>
> Not in the general case. Direct recursion is a common pattern (direct inifite recursion obviously is not desired, but detecting that is the halting problem).
>
> -Steve

Could that one limited case of direct straight loops be detected in the backend, just as one limited common pattern, then sending a message upwards to throw out an error message for the user in the error listing and stop object code generation? It seems worthwhile to me having been bitten by it for the first time. Spotting "l1: jmp l1" could be almost like a regex but I wouldn’t think that’s how it works.
May 25, 2023
On 5/25/23 09:30, Cecil Ward wrote:
> On Thursday, 25 May 2023 at 02:30:35 UTC, Steven Schveighoffer wrote:
>> On 5/24/23 7:04 PM, Cecil Ward wrote:
>>
>>> Is it possible for our compilers to detect direct infinite recursion ? I’m ignoring the indirect call case, and I’m also ignoring anything involving function pointers, so given those restrictions, we will be able to see that the generated code of a routine is calling its own address, because that address is a literal and we know what the current function is that we’re currently in.
>>
>> Not in the general case. Direct recursion is a common pattern (direct inifite recursion obviously is not desired, but detecting that is the halting problem).
>>
>> -Steve
> 
> Could that one limited case of direct straight loops be detected in the backend, just as one limited common pattern, then sending a message upwards to throw out an error message for the user in the error listing and stop object code generation? It seems worthwhile to me having been bitten by it for the first time. Spotting "l1: jmp l1" could be almost like a regex but I wouldn’t think that’s how it works.

I guess one challenge is you don't really know if the code is reachable. In principle it might even happen that the optimizer introduces new unreachable code that happens to have an infinite loop in it, then fails to detect it's not reachable.
May 25, 2023
On Thursday, 25 May 2023 at 02:30:35 UTC, Steven Schveighoffer wrote:
> On 5/24/23 7:04 PM, Cecil Ward wrote:
>
>> Is it possible for our compilers to detect direct infinite recursion ? I’m ignoring the indirect call case, and I’m also ignoring anything involving function pointers, so given those restrictions, we will be able to see that the generated code of a routine is calling its own address, because that address is a literal and we know what the current function is that we’re currently in.
>
> Not in the general case. Direct recursion is a common pattern (direct inifite recursion obviously is not desired, but detecting that is the halting problem).
>
> -Steve

Understood, quite so. Indeed that’s why I kept it to ‘direct’. Is anyone up for helping me deal with that one limited case of direct straight loops ?

This one case could even be detected in a backend, for want of a better solution. Just one limited common pattern, when spotted in the machine code generation, on seeing the "l1: jmp l1" pattern, which is without content from any basic block in the body, when an unconditional jump target is put to the output, that match then spots the pattern and triggers the output of an error level message for the user in the error listing and prevents object file output. Of course a check at a much higher level for this one specific pattern would be far better.

It’s only my own opinion, but it does seem a worthwhile specific error checking case to me, having been bitten by it for the first time, because I get any message at all. Spotting "l1: jmp l1" could be almost like a regex but I wouldn’t think that’s how things work. Nor I wouldn’t think that that’s necessarily the right place for such detection. There’s perhaps a ‘right’ place higher up. But I’m out of my depth with compiler innards here.

Would other users also like the safety of direct loop detection? Whether or not it comes from recursion, that is. A bad loop condition test too would be another case, where the compiler has worked out that the condition has a known boolean value.
May 25, 2023
On Thursday, 25 May 2023 at 08:03:24 UTC, Cecil Ward wrote:
> On Thursday, 25 May 2023 at 02:30:35 UTC, Steven Schveighoffer wrote:
>> [...]
>
> Understood, quite so. Indeed that’s why I kept it to ‘direct’. Is anyone up for helping me deal with that one limited case of direct straight loops ?
>
> [...]

Our posts crossed, as I was updating mine, Timon.
May 25, 2023
On Thursday, 25 May 2023 at 08:08:45 UTC, Cecil Ward wrote:
> On Thursday, 25 May 2023 at 08:03:24 UTC, Cecil Ward wrote:
>> On Thursday, 25 May 2023 at 02:30:35 UTC, Steven Schveighoffer wrote:
>>> [...]
>>
>> Understood, quite so. Indeed that’s why I kept it to ‘direct’. Is anyone up for helping me deal with that one limited case of direct straight loops ?
>>
>> [...]
>
> Our posts crossed, as I was updating mine, Timon.

This is how I got into my infinite recursion, roughly. See below. I had to kludge it by renaming the first overload given below, ‘specific 16-bit arts’ one. The template below generated an asm instruction quite happily.


/* 16-bit alternative for the 2-operand form, the raw opcode, with the correct widenings */
uint
bextr16( in uint16_t src_rm, in uint16_t bitfieldstartwidth_rb )  pure nothrow @nogc @trusted
	{
	return bextr( cast(uint32_t) src_rm, cast(uint32_t) bitfieldstartwidth_rb );
	}
	
T
bextr(T)( in T src_rm, in T bitfieldstartwidth_rb ) pure nothrow @nogc @trusted
                 if (is( T == uint64_t ) || is( T == uint32_t ) )
	{
	asm { … }
	}
May 25, 2023

On Wednesday, 24 May 2023 at 23:04:53 UTC, Cecil Ward wrote:

>

Due to some stupidity of mine involving misuse of templates I ended up with a routine that simply called itself infinitely. Directly, that is, and in this case the routine was optimised down into a simple infinite loop with no loop body content.

I had this almost happen in my C++ code. It never ran because I got a compile error. The compiler is MSVC.

What I did is quite simple: Have an overload that forwards to another overload; essentially:

R f(actual args) { /*actual work...*/ }
R f(alternative args) => f(actual args);

I misspelled the second and it called itself.

>

Is it possible for our compilers to detect direct infinite recursion? I’m ignoring the indirect call case, and I’m also ignoring anything involving function pointers, so given those restrictions, we will be able to see that the generated code of a routine is calling its own address, because that address is a literal and we know what the current function is that we’re currently in.

As MSVC shows, the answer is yes. Obviously, simple cases are detectable. Of course, solving this 100% is equivalent to the halting problem. This is not a problem whatsoever; D has multiple examples of best-effort approaches: @safe is a good example. Not every code that actually is without undefined behavior can be detected as such, but a lot of simple cases can. Likewise, not every infinite recursion (or infinite loop without side-effects) can be detected, but simple cases can.

I know a little about C++ compilers: GCC and Clang produce warnings in the font-end, but MSVC produces warnings in the back-end. For that reason, MSVC can report problems that became apparent after optimization.

In C++, an infinite loop without observable effects is undefined behavior. The optimizer can use this in the form that a loop that is not obviously finite can be assumed to be finite if it does not have observable effects.

To my knowledge, D does not make infinite loop without observable effects undefined behavior, but D has another policy: It’s reasonable to make an error what the programmer cannot possibly want. An infinite loop without observable effects is an example of that.

>

And while we’re at it, is it possible to detect all kinds of straightforward infinite loops (not coming from recursion)?

I’d say that “all kinds of straightforward” is a contradiction in itself. The term “straightforward” is subjective. The detection provably cannot be perfect. We have to select cases.

>

That is where the generated code, perhaps as a result of optimisation, is a straight loop with no body. Can we implement that?

Doing that after optimization has one problem: Optimization is optional and subject to invisible improvement. Whether you get compile-errors should not depend on which compiler you use and what optimization level you specified.

Practical example: Say you have code that is practically dead code and has an infinite loop. It’s practice never reached (or provably unreachable, but the compiler doesn’t detect it). Everything compiles and is fine. Now we make the infinite loop an error, when it’s detected after optimization. The optimizer doesn’t reduce it enough, so still everything compiles and is fine. In a later compiler version, the optimizer is improved, and now the loop is detected and the build fails. This is not a great user experience.

It’s totally another thing if it’s a warning.

It’s totally another thing if the detection is in the front end, the changelog mentions an improvement to infinite loop detection and the code falls under it.
If my boss asks my why the build failed, I could point to the compiler update and the changelog.

>

It ought to be possible because the backend can spot a pattern like "L1: jmp L1".

  • If it comes from the front-end and is consistent among compilers and optimization levels, it should be an error. It could be a warning, but the no-warnings policy applies.
  • If it comes from the back-end, it should be a warning.
>

I promise I’m not trying to solve the Halting Problem.

The Halting Problem can be solved in special cases. It is perfectly reasonable to ask for best-effort solutions. @safe is one of them.

A crucial component is pure, which D fortunately already has. A quick sketch would be:

  • A loop with a constant-true condition is infinite if it contains no break. It has definitely no observable effects if all operations in it are pure.
  • A function definitely recurses infinitely if the only reachable return statements lead to a recursion. It has no observable effects if all operations between the function entry and those return statements are pure.

In contrast to @safe, which is an under-approximation, the infinite loop detection is an over-approximation. By that I mean that – accepts-invalid bugs aside – code that is @safe today will remain @safe in the future; code that today isn’t @safe but would be valid as @trusted today might be @safe in v2.142.1. The @safe checks reject code that might be unsafe. The infinite loop detection wouldn’t reject code that might be an infinite loop (there are languages that do that; e.g. Coq guarantees termination); it would reject some code that definitely is an infinite loop. And with improvements, there might be more and more (not less and less) cases that will be detected as definitely an infinite loop.

DMD neither detects while(1){} nor int f(int x) => f(x);. Those would be bare-minimum test cases. You get an error for the latter if you make the return type auto, so there is something to work with. auto return type detection accounts for infinite loops, even via mutual recursion (e.g. f calls g, g calls h, h calls f).

Actually, you can count the fact that, in D, an empty statement is invalid for loop bodies, is a measure against unintentional infinite loops: while (cond()); in usual C++ compilers triggers an infinite-loop warning; in D, it’s syntactically invalid.

May 25, 2023
On Thursday, 25 May 2023 at 10:35:45 UTC, Quirin Schroll wrote:
> On Wednesday, 24 May 2023 at 23:04:53 UTC, Cecil Ward wrote:
>> [...]
>
> I had this almost happen in my C++ code. It never ran because I got a compile error. The compiler is MSVC.
>
> [...]

My apologies, when I wrote ‘straightforward’ I should meant to say, ‘without body, with no content’. You mentioned without side-effects, which is more general and very useful.
May 25, 2023
On Thursday, 25 May 2023 at 13:46:32 UTC, Cecil Ward wrote:
> On Thursday, 25 May 2023 at 10:35:45 UTC, Quirin Schroll wrote:
>> On Wednesday, 24 May 2023 at 23:04:53 UTC, Cecil Ward wrote:
>>> [...]
>>
>> I had this almost happen in my C++ code. It never ran because I got a compile error. The compiler is MSVC.
>>
>> [...]
>
> My apologies, when I wrote ‘straightforward’ I should meant to say, ‘without body, with no content’. You mentioned without side-effects, which is more general and very useful.

I would prefer it to come from the front end, but I thought that would be too hard to get any volunteers, so that is why I suggested the back end. Following your idea I suggest there should also be a test in the back end but I would make that an error too seeing as some people ignore warnings and if ever there were a case more serious than a warning this is it.

I don’t agree with your point about optimisation because optimisation often discovers andditional information. CTFE and constant propagation are surely examples - is that correct? And additional information can detect more bugs in principle.

« First   ‹ Prev
1 2