How to track down a bad llvm optimization pass - D Programming Language Discussion Forum

June 16, 2016

How to track down a bad llvm optimization pass

Posted by Joakim

Since the update to ddmdfe 2.070, a single assert trips up on Android/ARM when running the druntime/phobos tests:

https://github.com/dlang/phobos/blob/v2.070.2/std/conv.d#L5716

I've copy-pasted the relevant lines of the test into another file, testconv.d:

version(unittest) import std.conv, std.array, std.range;
unittest
{
    auto r = toChars!(16)(16u);
    assert(r.length == 2);
    assert(r[1..2].array == "0");
}

The test runs fine with -O1, but with -O2 or -O3, i.e. the levels at which inlining is enabled, the second assert fails.  If I compile with -O2/-O3 plus -disable-inlining, _or_ comment out the first assert, it passes.  Here's the IR generated for the unittest block, first with the first assert commented out and then with it included.

./bin/ldc2 -unittest -O2 --output-ll -c testconv.d -of=without.ll

define void @_D8testconv14__unittestL2_1FZv() comdat {
  br label %forcond.i.i

forcond.i.i:                                      ; preds = %forcond.i.i, %0
  %indvars.iv.i.i = phi i32 [ %indvars.iv.next.i.i, %forcond.i.i ], [ 1, %0 ] ; [#uses = 2, type = i32]
  %value.0.i.i = phi i32 [ %1, %forcond.i.i ], [ 16, %0 ] ; [#uses = 1, type = i32]
  %1 = lshr i32 %value.0.i.i, 4                   ; [#uses = 2]
  %2 = icmp eq i32 %1, 0                          ; [#uses = 1]
  %indvars.iv.next.i.i = add nuw nsw i32 %indvars.iv.i.i, 1 ; [#uses = 1]
  br i1 %2, label %bounds.ok.i, label %forcond.i.i

bounds.ok.i:                                      ; preds = %forcond.i.i
  %indvars.iv.i.i.lcssa = phi i32 [ %indvars.iv.i.i, %forcond.i.i ] ; [#uses = 1, type = i32]
  %3 = tail call i8* @_D4core6memory2GC6mallocFNaNbkkxC8TypeInfoZPv(%object.TypeInfo* null, i32 2, i32 1) ; [#uses = 2]
  %4 = insertvalue { i32, i8* } { i32 1, i8* undef }, i8* %3, 1 ; [#uses = 1]
  %5 = shl i32 %indvars.iv.i.i.lcssa, 2           ; [#uses = 1]
  %6 = and i32 %5, 1020                           ; [#uses = 1]
  %7 = add nsw i32 %6, -12                        ; [#uses = 1]
  %8 = lshr i32 16, %7                            ; [#uses = 1]
  %9 = or i32 %8, 48                              ; [#uses = 1]
  %10 = trunc i32 %9 to i8                        ; [#uses = 1]
  store i8 %10, i8* %3, align 1
  %11 = tail call i32 @_adEq2({ i32, i8* } %4, { i32, i8* } { i32 1, i8* getelementptr inbounds ([2 x i8], [2 x i8]* @.str, i32 0, i32 0) }, %object.TypeInfo* nonnull @_D11TypeInfo_Aa6__initZ) #1 ; [#uses = 1]
  %12 = icmp eq i32 %11, 0                        ; [#uses = 1]
  br i1 %12, label %assertFailed, label %assertPassed

assertPassed:                                     ; preds = %bounds.ok.i
  ret void

assertFailed:                                     ; preds = %bounds.ok.i
  tail call void @_d_assert({ i32, i8* } { i32 10, i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str.1, i32 0, i32 0) }, i32 4) #2
  unreachable
}

./bin/ldc2 -unittest -O2 --output-ll -c testconv.d -of=with.ll

define void @_D8testconv14__unittestL2_1FZv() comdat {
  br label %forcond.i.i

forcond.i.i:                                      ; preds = %forcond.i.i, %0
  %indvars.iv.i.i = phi i32 [ %indvars.iv.next.i.i, %forcond.i.i ], [ 1, %0 ] ; [#uses = 2, type = i32]
  %value.0.i.i = phi i32 [ %1, %forcond.i.i ], [ 16, %0 ] ; [#uses = 1, type = i32]
  %1 = lshr i32 %value.0.i.i, 4                   ; [#uses = 2]
  %2 = icmp eq i32 %1, 0                          ; [#uses = 1]
  %indvars.iv.next.i.i = add nuw nsw i32 %indvars.iv.i.i, 1 ; [#uses = 1]
  br i1 %2, label %_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZ6Result.exit, label %forcond.i.i

_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZ6Result.exit: ; preds = %forcond.i.i
  %indvars.iv.i.i.lcssa = phi i32 [ %indvars.iv.i.i, %forcond.i.i ] ; [#uses = 1, type = i32]
  %3 = and i32 %indvars.iv.i.i.lcssa, 255         ; [#uses = 1]
  %4 = icmp eq i32 %3, 2                          ; [#uses = 1]
  br i1 %4, label %bounds.ok.i, label %assertFailed

bounds.ok.i:                                      ; preds = %_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZ6Result.exit
  %5 = tail call i8* @_D4core6memory2GC6mallocFNaNbkkxC8TypeInfoZPv(%object.TypeInfo* null, i32 2, i32 1) ; [#uses = 2]
  %6 = insertvalue { i32, i8* } { i32 1, i8* undef }, i8* %5, 1 ; [#uses = 1]
  store i8 -1, i8* %5, align 1
  %7 = tail call i32 @_adEq2({ i32, i8* } %6, { i32, i8* } { i32 1, i8* getelementptr inbounds ([2 x i8], [2 x i8]* @.str.1, i32 0, i32 0) }, %object.TypeInfo* nonnull @_D11TypeInfo_Aa6__initZ) #3 ; [#uses = 1]
  %8 = icmp eq i32 %7, 0                          ; [#uses = 1]
  br i1 %8, label %assertFailed2, label %assertPassed1

assertFailed:                                     ; preds = %_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZ6Result.exit
  tail call void @_d_assert({ i32, i8* } { i32 10, i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str, i32 0, i32 0) }, i32 3) #2
  unreachable

assertPassed1:                                    ; preds = %bounds.ok.i
  ret void

assertFailed2:                                    ; preds = %bounds.ok.i
  tail call void @_d_assert({ i32, i8* } { i32 10, i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str, i32 0, i32 0) }, i32 4) #2
  unreachable
}

Clearly the problem is that those seven instructions before the call to _adEq2 in the bounds.ok.i section of the first IR get turned into this nonsense instruction in the second IR:

store i8 -1, i8* %5, align 1

Why including the first assert combines with some optimization pass and inlining to produce this junk instruction instead, I don't know.  I don't think this is something that needs to be fixed on the ldc end, but who knows.  Anyone have any tips on tracking this down?

June 17, 2016

Re: How to track down a bad llvm optimization pass

Posted by David Nadlinger
in reply to Joakim

On 17 Jun 2016, at 0:23, Joakim via digitalmars-d-ldc wrote:
> Why including the first assert combines with some optimization pass and inlining to produce this junk instruction instead, I don't know.  I don't think this is something that needs to be fixed on the ldc end, but who knows.  Anyone have any tips on tracking this down?

Without having looked at the details of what's going on here in particular, my first step would be to use the "-print-after-all" option and look where the part that you think is clearly wrong first appears (e.g. just by textual search). Having a look at what the input to that pass was (i.e. the previous pass result) can often yield extra insight.
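The textual search described above can be automated. The following is a minimal Python sketch, not taken from the thread, that splits a saved -print-after-all dump into per-pass sections and reports the first pass whose output contains a given string, together with the pass that ran just before it. It assumes the banner format `*** IR Dump After <pass name> ***` used by LLVM's legacy pass manager; the function names are hypothetical.

```python
import re

# Assumed banner format, e.g. "*** IR Dump After Global Value Numbering ***".
BANNER = re.compile(r"^\*\*\* IR Dump After (.+?) \*\*\*", re.MULTILINE)

def split_dump(dump):
    """Return (pass_name, ir_text) pairs in the order they appear in the dump."""
    matches = list(BANNER.finditer(dump))
    sections = []
    for i, m in enumerate(matches):
        # Each section runs from the end of its banner to the next banner.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(dump)
        sections.append((m.group(1), dump[m.end():end]))
    return sections

def first_pass_with(dump, needle):
    """Return (pass, previous_pass) for the first section containing needle."""
    previous = None
    for name, ir in split_dump(dump):
        if needle in ir:
            return name, previous
        previous = name
    return None, None
```

Calling `first_pass_with(dump_text, "store i8 -1")` on the captured output would name the first pass after which that instruction shows up. Function passes are dumped once per function, so a needle that also occurs in unrelated functions can give a misleading answer; a more specific string helps.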

bugpoint, the LLVM tool for reducing test cases, also has some functionality to reduce the set of passes needed to trigger a miscompilation. It has worked for me in the past, but I always found configuring the linking/executing step to be a bit finicky.

As a general comment, it is of course possible that we are hitting a genuine LLVM bug here, but I'd also be on the lookout for cases where we might be generating invalid IR (in the sense that we trigger some documented undefined behaviour, violate some assumptions, etc.).

 — David
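
One cheap way to act on that caution is to evaluate the suspicious instructions by hand. The following is a minimal Python sketch, not part of the original thread, that mirrors the loop and the first three of the seven instructions from the without.ll listing in the first post, using 32-bit wraparound. The function names are made up for illustration. For the test's input of 16 it suggests that the shift count fed to `lshr i32 16, %7` is far larger than 31, and the LLVM language reference leaves the result of such a shift undefined. That would be consistent with an optimizer folding the stored byte to an arbitrary constant once it knows the length, though it does not by itself show where the fault lies.

```python
MASK32 = 0xFFFFFFFF  # emulate i32 wraparound

def loop_exit_count(value):
    """Mirror forcond.i.i: return %indvars.iv at loop exit."""
    indvars = 1
    while True:
        value = (value >> 4) & MASK32   # %1 = lshr i32 %value.0.i.i, 4
        if value == 0:                  # %2 = icmp eq i32 %1, 0
            return indvars              # %indvars.iv.i.i.lcssa
        indvars += 1                    # %indvars.iv.next.i.i

def shift_amount(lcssa):
    """Mirror %5..%7 and return the lshr shift count as an unsigned i32."""
    t5 = (lcssa << 2) & MASK32          # %5 = shl i32 %lcssa, 2
    t6 = t5 & 1020                      # %6 = and i32 %5, 1020
    t7 = (t6 - 12) & MASK32             # %7 = add nsw i32 %6, -12
    return t7

count = loop_exit_count(16)
amount = shift_amount(count)
print(count, amount, amount >= 32)      # prints: 2 4294967292 True
```

The same arithmetic applies to the with.ll listing, where the extra `icmp eq i32 %3, 2` tells the optimizer the length is 2 inside bounds.ok.i.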

June 17, 2016

Re: How to track down a bad llvm optimization pass

Posted by Kagamin
in reply to David Nadlinger

On Friday, 17 June 2016 at 08:48:01 UTC, David Nadlinger wrote:
> As a general comment, it is of course possible that we are hitting a genuine LLVM bug here, but I'd also be on the lookout for cases where we might be generating invalid IR (in the sense that we trigger some documented undefined behaviour, violate some assumptions, etc.).

Can't it be ruled out by the `opt` tool? Though I thought it's strange how it works on ir-to-ir level.

June 17, 2016

Re: How to track down a bad llvm optimization pass

Posted by David Nadlinger
in reply to Kagamin

On 17 Jun 2016, at 13:39, Kagamin via digitalmars-d-ldc wrote:
> Can't it be ruled out by the `opt` tool? Though I thought it's strange how it works on ir-to-ir level.

I'm not sure I understand. What would `opt` rule out exactly? Yes, it allows you to run a custom subset of passes – `ldc2 -O2 -debug-pass=Arguments …` is very useful for that, by the way – but you still need to manually remove subsets of passes and re-build the executable until the issue disappears. (As an aside: Now that LDC accepts bitcode files on the command line, this doesn't require manually messing with object file emission/linking anymore.)
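
The manual removal of pass subsets described above can be organised as a binary search over prefixes of the pass list. This is a minimal Python sketch under stated assumptions: `is_bad` is a hypothetical callback that would run the given passes, rebuild the executable and report whether the test fails, and the search assumes that the empty pipeline is good and that once a prefix is bad every longer prefix is bad too. The pass names in the usage note are only illustrative.

```python
def first_bad_pass(passes, is_bad):
    """Return the index of the pass that first makes a prefix fail, or None."""
    if not is_bad(passes):
        return None                     # the full pipeline does not reproduce the bug
    lo, hi = 0, len(passes)             # invariant: passes[:lo] good, passes[:hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(passes[:mid]):
            hi = mid
        else:
            lo = mid
    return hi - 1                       # last pass of the shortest failing prefix
```

For example, with `passes = ["-simplifycfg", "-inline", "-gvn", "-instcombine"]` and a stand-in predicate that fails whenever both `-inline` and `-gvn` are present, the function returns 2, the index of `-gvn`. The pass it names is where the symptom first appears, which is not necessarily where the underlying fault lies.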

By the way, the fact that it both consumes and produces IR is actually not strange at all; that's simply how the main LLVM optimiser is architected (with the exception of a further, mostly target-specific, optimisation stage during code generation).

 — David

June 17, 2016

Re: How to track down a bad llvm optimization pass

Posted by Kagamin
in reply to David Nadlinger

If `opt` converts valid ir to invalid ir, then it's not ldc's failure.

June 17, 2016

Re: How to track down a bad llvm optimization pass

Posted by Dan Olson
in reply to Joakim

Joakim <dlang@joakim.fea.st> writes:

> Since the update to ddmdfe 2.070, a single assert trips up on Android/ARM when running the druntime/phobos tests:
>
> https://github.com/dlang/phobos/blob/v2.070.2/std/conv.d#L5716

Hi Joakim,

Target arm-unknown-linux-gnueabihf is doing ok even with v2.071.0 front end.  Isn't Android essentially the same, but softfp?  Should be able to change triple and compare IR and see what is different between android triple (what is it by the way, arm-linux-android?) and gnueabihf triple. Or maybe it is llvm version?  I've only been using 3.5 and 3.6.2 so far.

I just tried both triples with ldc master of a week ago and llvm-3.6.2. IR for each is identical except gnueabihf has stuff for shared lib setup (dso_ctor, etc) which is not in android version.
-- 
Dan
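
The triple comparison suggested above can be done with a plain text diff once the lines that are expected to differ are dropped. The following is a minimal Python sketch using the standard `difflib` module; the function names and default file names are hypothetical, and it assumes that only the `target triple` and `target datalayout` lines should be ignored.

```python
import difflib

def normalise(ir_text):
    """Drop lines that legitimately differ between targets (an assumption)."""
    kept = []
    for line in ir_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("target triple", "target datalayout")):
            continue
        kept.append(line.rstrip())
    return kept

def ir_diff(a_text, b_text, a_name="a.ll", b_name="b.ll"):
    """Return unified-diff lines between two IR dumps; empty if they match."""
    return list(difflib.unified_diff(
        normalise(a_text), normalise(b_text), a_name, b_name, lineterm=""))
```

An empty result for the two `--output-ll` files would match the observation that the IR is identical for both triples apart from target-specific setup.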

June 18, 2016

Re: How to track down a bad llvm optimization pass

Posted by Joakim
in reply to David Nadlinger

On Friday, 17 June 2016 at 08:48:01 UTC, David Nadlinger wrote:
> On 17 Jun 2016, at 0:23, Joakim via digitalmars-d-ldc wrote:
>> Why including the first assert combines with some optimization pass and inlining to produce this junk instruction instead, I don't know.  I don't think this is something that needs to be fixed on the ldc end, but who knows.  Anyone have any tips on tracking this down?
>
> Without having looked at the details of what's going on here in particular, my first step would be to use the "-print-after-all" option and look where the part that you think is clearly wrong first appears (e.g. just by textual search). Having a look at what the input to that pass was (i.e. the previous pass result) can often yield extra insight.

Hmm, I had tried -print-after-all with --debug-pass=Executions before and those flags dumped so much info that I didn't go through it much.  Using -print-after-all alone is better: I see that the pass that muffs it up is "Global Value Numbering."  If I disable that pass in llvm's PassManagerBuilder, the problem goes away.

This looks like a regression in that llvm pass. I'll try building earlier versions of llvm to see if they had the same problem.

> As a general comment, it is of course possible that we are hitting a genuine LLVM bug here, but I'd also be on the lookout for cases where we might be generating invalid IR (in the sense that we trigger some documented undefined behaviour, violate some assumptions, etc.).

Given it works fine for other combinations of optimizations, I doubt it's a problem on our end, but I don't know enough about llvm's assumptions to say for certain.

On Friday, 17 June 2016 at 15:51:40 UTC, Dan Olson wrote:
> Joakim <dlang@joakim.fea.st> writes:
>
>> Since the update to ddmdfe 2.070, a single assert trips up on Android/ARM when running the druntime/phobos tests:
>>
>> https://github.com/dlang/phobos/blob/v2.070.2/std/conv.d#L5716
>
> Hi Joakim,
>
> Target arm-unknown-linux-gnueabihf is doing ok even with v2.071.0 front end.  Isn't Android essentially the same, but softfp?  Should be able to change triple and compare IR and see what is different between android triple (what is it by the way, arm-linux-android?) and gnueabihf triple. Or maybe it is llvm version?  I've only been using 3.5 and 3.6.2 so far.

I get the same problem with that armhf triple and llvm 3.8.0, so it looks like it may be an llvm regression.

> I just tried both triples with ldc master of a week ago and llvm-3.6.2. IR for each is identical except gnueabihf has stuff for shared lib setup (dso_ctor, etc) which is not in android version.

I'll try building and linking against llvm 3.6 and 3.7 and see if it makes a difference.

June 18, 2016

Re: How to track down a bad llvm optimization pass

Posted by Joakim
in reply to Joakim

On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
> On Friday, 17 June 2016 at 15:51:40 UTC, Dan Olson wrote:
>> I just tried both triples with ldc master of a week ago and llvm-3.6.2. IR for each is identical except gnueabihf has stuff for shared lib setup (dso_ctor, etc) which is not in android version.
>
> I'll try building and linking against llvm 3.6 and 3.7 and see if it makes a difference.

I just built and linked against llvm 3.7.1 and the problem goes away, so it appears that it is a regression in the GVN optimization pass with llvm 3.8.0.  I'll check if it goes away with llvm master, and file an llvm bug if it doesn't.

June 18, 2016

Re: How to track down a bad llvm optimization pass

Posted by Dan Olson
in reply to Joakim

Joakim <dlang@joakim.fea.st> writes:

> On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
> I just built and linked against llvm 3.7.1 and the problem goes away,
> so it appears that it is a regression in the GVN optimization pass
> with llvm 3.8.0.  I'll check if it goes away with llvm master, and
> file an llvm bug if it doesn't.

And I am trying the opposite: building ldc master against llvm 3.8.0, to see if I can duplicate the problem.

June 18, 2016

Re: How to track down a bad llvm optimization pass

Posted by Dan Olson
in reply to Dan Olson

Dan Olson <gorox@comcast.net> writes:

> Joakim <dlang@joakim.fea.st> writes:
>
>> On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
>> I just built and linked against llvm 3.7.1 and the problem goes away,
>> so it appears that it is a regression in the GVN optimization pass
>> with llvm 3.8.0.  I'll check if it goes away with llvm master, and
>> file an llvm bug if it doesn't.
>
> And I am trying the opposite: building ldc master against llvm 3.8.0, to see if I can duplicate the problem.

Yup, fails same way on a Raspberry Pi with 3.8.0.
