How to track down a bad llvm optimization pass (page 4)

On 30 Jun 2016, at 16:40, Joakim via digitalmars-d-ldc wrote: > I assumed that undef was some kind of poison value undef is indeed "some kind of poison value", in that each use of it evaluates to a (potentially different) arbitrary bit string. By itself, using an undef isn't undefined behaviour, but of course for many operations it ultimately is, because there are bit string inputs for which these operations are undefined (e.g. loads, stores). LLVM knows a concept called "poison values" too, which are undefs with slightly stronger semantics produced by C-style signed integer arithmetic overflow and similar operations – in loose terms, any operation that depends on them in an externally visible way has undefined behaviour. I usually find the LLVM language reference (http://llvm.org/docs/LangRef.html) to be quite a clear resource for these sorts of questions. > [should] the inlining pass […] just be returning undef […]? Since this is at compile-time, I don't think it should. […] > Are we supposed to be running sanitizers or something else to avoid these bugs? First off, as it currently stands, this is certainly not an issue in LLVM. The lshr instruction is documented as resulting in undefined behaviour when used with an out-of-range shift. Replacing the whole call with `undef` is thus a valid IR transformation. So far for LLVM working as designed. The question of course becomes whether, being a compiler writer's tool, it would be nice for it to emit a warning on such transformations. And here things suddenly become muddy. Yes, in this case, getting a warning would be useful. However, if the code was not actually reachable dynamically, a warning would be wrong. Of course, this can be solved by offering a way to declare basic blocks/functions to be considered reachable for that purpose, but that introduces extra complexity – I wouldn't be surprised if the fact that you'd need to design something along these lines was the main reason why LLVM does not try to report such conditions. Of course, language frontends can always emit dynamical checks to avoid executing llvm::Instructions with UB-inducing arguments, whether in the form of sanitisers, or by default as part of faithfully lowering their semantics. — David

July 02, 2016

Re: How to track down a bad llvm optimization pass

Posted by Joakim
in reply to David Nadlinger

Permalink

Joakim

Posted in reply to David Nadlinger

Permalink

On Thursday, 30 June 2016 at 18:10:04 UTC, David Nadlinger wrote:
> On 30 Jun 2016, at 16:40, Joakim via digitalmars-d-ldc wrote:
>> I assumed that undef was some kind of poison value
>
> undef is indeed "some kind of poison value", in that each use of it evaluates to a (potentially different) arbitrary bit string. By itself, using an undef isn't undefined behaviour, but of course for many operations it ultimately is, because there are bit string inputs for which these operations are undefined (e.g. loads, stores).
>
> LLVM knows a concept called "poison values" too, which are undefs with slightly stronger semantics produced by C-style signed integer arithmetic overflow and similar operations – in loose terms, any operation that depends on them in an externally visible way has undefined behaviour.
>
> I usually find the LLVM language reference (http://llvm.org/docs/LangRef.html) to be quite a clear resource for these sorts of questions.

OK, I read some IR reference on undef, but hadn't seen that llvm has a separate formal notion of a poison value, because it isn't part of the IR.

>> [should] the inlining pass […] just be returning undef […]? Since this is at compile-time, I don't think it should. […]
>> Are we supposed to be running sanitizers or something else to avoid these bugs?
>
> First off, as it currently stands, this is certainly not an issue in LLVM. The lshr instruction is documented as resulting in undefined behaviour when used with an out-of-range shift. Replacing the whole call with `undef` is thus a valid IR transformation.
>
> So far for LLVM working as designed. The question of course becomes whether, being a compiler writer's tool, it would be nice for it to emit a warning on such transformations. And here things suddenly become muddy. Yes, in this case, getting a warning would be useful.

I tried this with clang, it does exactly the same as ldc.  I've appended the output to the bug report:

https://llvm.org/bugs/show_bug.cgi?id=28224#c7

> However, if the code was not actually reachable dynamically, a warning would be wrong. Of course, this can be solved by offering a way to declare basic blocks/functions to be considered reachable for that purpose, but that introduces extra complexity – I wouldn't be surprised if the fact that you'd need to design something along these lines was the main reason why LLVM does not try to report such conditions.

I don't know that it matters if the code is "reachable dynamically," ie you mean it isn't actually used at runtime?  Most of the time, you'd want to know that llvm is just removing what it considers to be bad code, with no notice whatsoever.

> Of course, language frontends can always emit dynamical checks to avoid executing llvm::Instructions with UB-inducing arguments, whether in the form of sanitisers, or by default as part of faithfully lowering their semantics.

I tried the clang sanitizer, as noted in the bug report: it works, though at runtime.  As for the frontend doing it at compile-time, either we build a bunch of static analysis in to catch such bugs or we need some way to work with llvm IR and its optimization passes so it can tell us when such things are happening.  I was surprised that llvm is just silently doing this, with no warning.

On Saturday, 2 July 2016 at 07:29:47 UTC, Joakim wrote: > > I was surprised that llvm is just silently doing this, with no warning. This blog article gives some insight: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html

On Saturday, 2 July 2016 at 07:42:31 UTC, Johan Engelen wrote: > On Saturday, 2 July 2016 at 07:29:47 UTC, Joakim wrote: >> >> I was surprised that llvm is just silently doing this, with no warning. > > This blog article gives some insight: > http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html Thanks for the link, it's very relevant. I've mentioned these issues and closed the llvm bug as invalid, as that GVN pass was only surfacing this undefined behavior. Leaving aside the optimizer, the unoptimized phobos debug tests should have caught this, shouldn't they? I'll look into why those passed next.

On Monday, 4 July 2016 at 09:21:19 UTC, Joakim wrote: > On Saturday, 2 July 2016 at 07:42:31 UTC, Johan Engelen wrote: >> On Saturday, 2 July 2016 at 07:29:47 UTC, Joakim wrote: >>> >>> I was surprised that llvm is just silently doing this, with no warning. >> >> This blog article gives some insight: >> http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html > > Thanks for the link, it's very relevant. I've mentioned these issues and closed the llvm bug as invalid, as that GVN pass was only surfacing this undefined behavior. > > Leaving aside the optimizer, the unoptimized phobos debug tests should have caught this, shouldn't they? I'll look into why those passed next. Alright, tracked this down, it was a test that had a bad failure mode. I'll write it up in the General forum. Johan, you mentioned this other shift bug that you ran into: https://github.com/dlang/phobos/pull/4509 Was that just one that you had hit recently or did you go checking all 20+ uses of logical right shift in Phobos to look for bugs?

On Wednesday, 6 July 2016 at 18:00:40 UTC, Joakim wrote: > > Johan, you mentioned this other shift bug that you ran into: > > https://github.com/dlang/phobos/pull/4509 > > Was that just one that you had hit recently or did you go checking all 20+ uses of logical right shift in Phobos to look for bugs? Yeah that's the one I wrote about before. I discovered it after a long time debugging :( I did check the other >>>= in Phobos and they looked OK, but I did not spend much time on checking them.

Forums