June 16, 2016 How to track down a bad llvm optimization pass | ||||
---|---|---|---|---|
| ||||
Since the update to ddmdfe 2.070, a single assert trips up on Android/ARM when running the druntime/phobos tests:

https://github.com/dlang/phobos/blob/v2.070.2/std/conv.d#L5716

I've copy-pasted the relevant lines of the test into another file, testconv.d:

    version(unittest) import std.conv, std.array, std.range;

    unittest
    {
        auto r = toChars!(16)(16u);
        assert(r.length == 2);
        assert(r[1..2].array == "0");
    }

The test runs fine with -O1, but with -O2 or -O3, i.e. the levels at which inlining is enabled, the second assert fails. If I compile with -O2/3 -disable-inlining _or_ comment out the first assert, it passes.

Here's the IR generated for the unittest block with the first assert commented out, and with it included.

    ./bin/ldc2 -unittest -O2 --output-ll -c testconv.d -of=without.ll

    define void @_D8testconv14__unittestL2_1FZv() comdat {
      br label %forcond.i.i

    forcond.i.i: ; preds = %forcond.i.i, %0
      %indvars.iv.i.i = phi i32 [ %indvars.iv.next.i.i, %forcond.i.i ], [ 1, %0 ] ; [#uses = 2, type = i32]
      %value.0.i.i = phi i32 [ %1, %forcond.i.i ], [ 16, %0 ] ; [#uses = 1, type = i32]
      %1 = lshr i32 %value.0.i.i, 4 ; [#uses = 2]
      %2 = icmp eq i32 %1, 0 ; [#uses = 1]
      %indvars.iv.next.i.i = add nuw nsw i32 %indvars.iv.i.i, 1 ; [#uses = 1]
      br i1 %2, label %bounds.ok.i, label %forcond.i.i

    bounds.ok.i: ; preds = %forcond.i.i
      %indvars.iv.i.i.lcssa = phi i32 [ %indvars.iv.i.i, %forcond.i.i ] ; [#uses = 1, type = i32]
      %3 = tail call i8* @_D4core6memory2GC6mallocFNaNbkkxC8TypeInfoZPv(%object.TypeInfo* null, i32 2, i32 1) ; [#uses = 2]
      %4 = insertvalue { i32, i8* } { i32 1, i8* undef }, i8* %3, 1 ; [#uses = 1]
      %5 = shl i32 %indvars.iv.i.i.lcssa, 2 ; [#uses = 1]
      %6 = and i32 %5, 1020 ; [#uses = 1]
      %7 = add nsw i32 %6, -12 ; [#uses = 1]
      %8 = lshr i32 16, %7 ; [#uses = 1]
      %9 = or i32 %8, 48 ; [#uses = 1]
      %10 = trunc i32 %9 to i8 ; [#uses = 1]
      store i8 %10, i8* %3, align 1
      %11 = tail call i32 @_adEq2({ i32, i8* } %4, { i32, i8* } { i32 1, i8* getelementptr inbounds ([2 x i8], [2 x i8]* @.str, i32 0, i32 0) }, %object.TypeInfo* nonnull @_D11TypeInfo_Aa6__initZ) #1 ; [#uses = 1]
      %12 = icmp eq i32 %11, 0 ; [#uses = 1]
      br i1 %12, label %assertFailed, label %assertPassed

    assertPassed: ; preds = %bounds.ok.i
      ret void

    assertFailed: ; preds = %bounds.ok.i
      tail call void @_d_assert({ i32, i8* } { i32 10, i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str.1, i32 0, i32 0) }, i32 4) #2
      unreachable
    }

    ./bin/ldc2 -unittest -O2 --output-ll -c testconv.d -of=with.ll

    define void @_D8testconv14__unittestL2_1FZv() comdat {
      br label %forcond.i.i

    forcond.i.i: ; preds = %forcond.i.i, %0
      %indvars.iv.i.i = phi i32 [ %indvars.iv.next.i.i, %forcond.i.i ], [ 1, %0 ] ; [#uses = 2, type = i32]
      %value.0.i.i = phi i32 [ %1, %forcond.i.i ], [ 16, %0 ] ; [#uses = 1, type = i32]
      %1 = lshr i32 %value.0.i.i, 4 ; [#uses = 2]
      %2 = icmp eq i32 %1, 0 ; [#uses = 1]
      %indvars.iv.next.i.i = add nuw nsw i32 %indvars.iv.i.i, 1 ; [#uses = 1]
      br i1 %2, label %_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZ6Result.exit, label %forcond.i.i

    _D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZ6Result.exit: ; preds = %forcond.i.i
      %indvars.iv.i.i.lcssa = phi i32 [ %indvars.iv.i.i, %forcond.i.i ] ; [#uses = 1, type = i32]
      %3 = and i32 %indvars.iv.i.i.lcssa, 255 ; [#uses = 1]
      %4 = icmp eq i32 %3, 2 ; [#uses = 1]
      br i1 %4, label %bounds.ok.i, label %assertFailed

    bounds.ok.i: ; preds = %_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZ6Result.exit
      %5 = tail call i8* @_D4core6memory2GC6mallocFNaNbkkxC8TypeInfoZPv(%object.TypeInfo* null, i32 2, i32 1) ; [#uses = 2]
      %6 = insertvalue { i32, i8* } { i32 1, i8* undef }, i8* %5, 1 ; [#uses = 1]
      store i8 -1, i8* %5, align 1
      %7 = tail call i32 @_adEq2({ i32, i8* } %6, { i32, i8* } { i32 1, i8* getelementptr inbounds ([2 x i8], [2 x i8]* @.str.1, i32 0, i32 0) }, %object.TypeInfo* nonnull @_D11TypeInfo_Aa6__initZ) #3 ; [#uses = 1]
      %8 = icmp eq i32 %7, 0 ; [#uses = 1]
      br i1 %8, label %assertFailed2, label %assertPassed1

    assertFailed: ; preds = %_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZ6Result.exit
      tail call void @_d_assert({ i32, i8* } { i32 10, i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str, i32 0, i32 0) }, i32 3) #2
      unreachable

    assertPassed1: ; preds = %bounds.ok.i
      ret void

    assertFailed2: ; preds = %bounds.ok.i
      tail call void @_d_assert({ i32, i8* } { i32 10, i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str, i32 0, i32 0) }, i32 4) #2
      unreachable
    }

Clearly the problem is that the seven instructions before the call to _adEq2 in the bounds.ok.i section of the first IR get turned into this nonsense instruction in the second IR:

    store i8 -1, i8* %5, align 1

I don't know why including the first assert combines with some optimization pass and inlining to produce this junk instruction. I don't think this is something that needs to be fixed on the ldc end, but who knows. Anyone have any tips on tracking this down?
|
June 17, 2016 Re: How to track down a bad llvm optimization pass | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joakim | On 17 Jun 2016, at 0:23, Joakim via digitalmars-d-ldc wrote:
> Why including the first assert combines with some optimization pass and inlining to produce this junk instruction instead, I don't know. I don't think this is something that needs to be fixed on the ldc end, but who knows. Anyone have any tips on tracking this down?
Without having looked at the details of what's going on here in particular, my first step would be to use the "-print-after-all" option and look where the part that you think is clearly wrong first appears (e.g. just by textual search). Having a look at what the input to that pass was (i.e. the previous pass result) can often yield extra insight.
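The textual search described above can be sketched in Python. This is only an illustration under stated assumptions: the dump has been saved to a string, each section starts with a banner of the form `*** IR Dump After <pass name> ***`, and the helper names `split_dumps` and `first_pass_with` are made up for this example. It also ignores the fact that function passes are dumped once per function.

```python
import re
from typing import Iterator, Optional, Tuple

# Assumed banner format of -print-after-all output.
BANNER = re.compile(r"\*\*\* IR Dump After (.+?) \*\*\*")


def split_dumps(log: str) -> Iterator[Tuple[str, str]]:
    """Yield (pass_name, ir_text) for every dump section in the log."""
    name, body = None, []
    for line in log.splitlines():
        match = BANNER.search(line)
        if match:
            if name is not None:
                yield name, "\n".join(body)
            name, body = match.group(1), []
        elif name is not None:
            body.append(line)
    if name is not None:
        yield name, "\n".join(body)


def first_pass_with(log: str, needle: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (first pass whose output contains needle, pass before it)."""
    previous = None
    for name, ir in split_dumps(log):
        if needle in ir:
            return name, previous
        previous = name
    return None
```

Calling `first_pass_with(dump_text, "store i8 -1")` would name the first pass whose output contains the suspicious store, together with the pass that ran just before it, whose dump is the input worth reading.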
bugpoint, the LLVM tool for reducing test cases, also has some functionality to reduce the set of passes needed to trigger a miscompilation. It has worked for me in the past, but I always found configuring the linking/executing step to be a bit finicky.
As a general comment, it is of course possible that we are hitting a genuine LLVM bug here, but I'd also be on the lookout for cases where we might be generating invalid IR (in the sense that we trigger some documented undefined behaviour, violate some assumptions, etc.).
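One way to follow up on the invalid-IR possibility is to evaluate by hand the instructions leading up to the `lshr` in the `bounds.ok.i` block of the first IR listing in the opening post. The sketch below is a hypothetical Python trace of just those instructions with 32-bit wraparound; the function name and dictionary keys are invented for illustration. It only does arithmetic on the IR as printed and does not by itself say whether the front end, an earlier pass, or a later pass is responsible for what it finds.

```python
MASK32 = 0xFFFFFFFF  # emulate i32 wraparound


def trace_shift_amount(initial_value: int = 16) -> dict:
    """Trace the loop and the shift-amount computation from the first IR listing."""
    # forcond.i.i: %indvars.iv starts at 1, %value.0 starts at initial_value.
    indvars, value = 1, initial_value
    while True:
        value = (value >> 4) & MASK32         # %1 = lshr i32 %value.0.i.i, 4
        if value == 0:                        # %2 = icmp eq i32 %1, 0
            break                             # leave the loop with the current %indvars.iv
        indvars = (indvars + 1) & MASK32      # %indvars.iv.next.i.i = add ... 1
    lcssa = indvars                           # %indvars.iv.i.i.lcssa
    v5 = (lcssa << 2) & MASK32                # %5 = shl i32 %lcssa, 2
    v6 = v5 & 1020                            # %6 = and i32 %5, 1020
    v7 = (v6 - 12) & MASK32                   # %7 = add nsw i32 %6, -12
    # %8 = lshr i32 16, %7 is only well defined when %7 < 32.
    return {"lcssa": lcssa, "shift_amount": v7, "in_range": v7 < 32}
```

For the constant 16 used in the test, the trace leaves the loop with a count of 2 and yields a shift amount of -4, which as an unsigned 32-bit value is far above 31. LLVM's language reference does not define the result of `lshr` when the shift amount is at least the bit width, so this looks like the kind of thing worth checking before concluding that a pass is miscompiling.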
— David
|
June 17, 2016 Re: How to track down a bad llvm optimization pass | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On Friday, 17 June 2016 at 08:48:01 UTC, David Nadlinger wrote:
> As a general comment, it is of course possible that we are hitting a genuine LLVM bug here, but I'd also be on the lookout for cases where we might be generating invalid IR (in the sense that we trigger some documented undefined behaviour, violate some assumptions, etc.).
Can't it be ruled out by the `opt` tool? Though I thought it strange that it works at the IR-to-IR level.
|
June 17, 2016 Re: How to track down a bad llvm optimization pass | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kagamin | On 17 Jun 2016, at 13:39, Kagamin via digitalmars-d-ldc wrote:
> Can't it be ruled out by the `opt` tool? Though I thought it strange that it works at the IR-to-IR level.
I'm not sure I understand. What would `opt` rule out exactly? Yes, it allows you to run a custom subset of passes – `ldc2 -O2 -debug-pass=Arguments …` is very useful for that, by the way – but you still need to manually remove subsets of passes and re-build the executable until the issue disappears. (As an aside: Now that LDC accepts bitcode files on the command line, this doesn't require manually messing with object file emission/linking anymore.)
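The manual narrowing described here can be organised as a binary search over the pass list. The following Python sketch is an assumption-laden illustration, not a tool that exists in LDC or LLVM: `first_bad_prefix` is a made-up helper, and `is_bad` stands in for "run `opt` with these passes, rebuild the executable, and check whether the test fails". It also assumes the failure is monotone, meaning that once a prefix of the pass list triggers it, every longer prefix does too.

```python
from typing import Callable, Sequence


def first_bad_prefix(passes: Sequence[str],
                     is_bad: Callable[[Sequence[str]], bool]) -> int:
    """Return the smallest n for which is_bad(passes[:n]) is True.

    Assumes is_bad([]) is False, is_bad(passes) is True, and that the
    failure is monotone in the prefix length.
    """
    lo, hi = 0, len(passes)   # invariant: prefix of length lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(passes[:mid]):
            hi = mid
        else:
            lo = mid
    return hi
```

With the pass list taken from the `-debug-pass=Arguments` output, `passes[first_bad_prefix(passes, is_bad) - 1]` is the pass whose addition first makes the test fail. That pass is a suspect rather than a proven culprit, since it may only expose a problem introduced earlier.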
By the way, the fact that it both consumes and produces IR is actually not strange at all; that's simply how the main LLVM optimiser is architected (with the exception of a further, mostly target-specific, optimisation stage during code generation).
— David
|
June 17, 2016 Re: How to track down a bad llvm optimization pass | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | If `opt` converts valid IR to invalid IR, then it's not ldc's fault. |
June 17, 2016 Re: How to track down a bad llvm optimization pass | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joakim | Joakim <dlang@joakim.fea.st> writes:
> Since the update to ddmdfe 2.070, a single assert trips up on Android/ARM when running the druntime/phobos tests:
>
> https://github.com/dlang/phobos/blob/v2.070.2/std/conv.d#L5716
Hi Joakim,
Target arm-unknown-linux-gnueabihf is doing ok even with v2.071.0 front end. Isn't Android essentially the same, but softfp? Should be able to change triple and compare IR and see what is different between android triple (what is it by the way, arm-linux-android?) and gnueabihf triple. Or maybe it is llvm version? I've only been using 3.5 and 3.6.2 so far.
I just tried both triples with ldc master of a week ago and llvm-3.6.2. IR for each is identical except gnueabihf has stuff for shared lib setup (dso_ctor, etc) which is not in android version.
-- Dan
|
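Comparing the IR emitted for two triples, as suggested in this post, can be done with any diff tool. As a minimal sketch, Python's standard `difflib` module is enough. The function name `diff_ir` and the default file labels are hypothetical; the two arguments are assumed to hold the contents of the `.ll` files produced for each triple.

```python
import difflib


def diff_ir(a_text: str, b_text: str,
            a_name: str = "android.ll", b_name: str = "gnueabihf.ll") -> str:
    """Return a unified diff of two IR dumps, or '' if they are identical."""
    lines = difflib.unified_diff(
        a_text.splitlines(), b_text.splitlines(),
        fromfile=a_name, tofile=b_name, lineterm="",
    )
    return "\n".join(lines)
```

An empty result means the two dumps match line for line. Anything else shows up as `+` and `-` lines, which makes target-specific additions such as shared-library setup code easy to separate from differences in the function under test.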
June 18, 2016 Re: How to track down a bad llvm optimization pass | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On Friday, 17 June 2016 at 08:48:01 UTC, David Nadlinger wrote:
> On 17 Jun 2016, at 0:23, Joakim via digitalmars-d-ldc wrote:
>> Why including the first assert combines with some optimization pass and inlining to produce this junk instruction instead, I don't know. I don't think this is something that needs to be fixed on the ldc end, but who knows. Anyone have any tips on tracking this down?
>
> Without having looked at the details of what's going on here in particular, my first step would be to use the "-print-after-all" option and look where the part that you think is clearly wrong first appears (e.g. just by textual search). Having a look at what the input to that pass was (i.e. the previous pass result) can often yield extra insight.
Hmm, I had tried -print-after-all with --debug-pass=Executions before and those flags dumped so much info that I didn't go through it much. Using -print-after-all alone is better: I see that the pass that muffs it up is "Global Value Numbering." If I disable that pass in llvm's PassManagerBuilder, the problem goes away. Looks like a regression in that llvm pass, I'll try building earlier versions of llvm to see if they had the same problem.
> As a general comment, it is of course possible that we are hitting a genuine LLVM bug here, but I'd also be on the lookout for cases where we might be generating invalid IR (in the sense that we trigger some documented undefined behaviour, violate some assumptions, etc.).
Given it works fine for other combinations of optimizations, I doubt it's a problem on our end, but I don't know enough about llvm's assumptions to say for certain.
On Friday, 17 June 2016 at 15:51:40 UTC, Dan Olson wrote:
> Joakim <dlang@joakim.fea.st> writes:
>
>> Since the update to ddmdfe 2.070, a single assert trips up on Android/ARM when running the druntime/phobos tests:
>>
>> https://github.com/dlang/phobos/blob/v2.070.2/std/conv.d#L5716
>
> Hi Joakim,
>
> Target arm-unknown-linux-gnueabihf is doing ok even with v2.071.0 front end. Isn't Android essentially the same, but softfp? Should be able to change triple and compare IR and see what is different between android triple (what is it by the way, arm-linux-android?) and gnueabihf triple. Or maybe it is llvm version? I've only been using 3.5 and 3.6.2 so far.
I get the same problem with that armhf triple and llvm 3.8.0, looks like it may be an llvm regression.
> I just tried both triples with ldc master of a week ago and llvm-3.6.2. IR for each is identical except gnueabihf has stuff for shared lib setup (dso_ctor, etc) which is not in android version.
I'll try building and linking against llvm 3.6 and 3.7 and see if it makes a difference.
|
June 18, 2016 Re: How to track down a bad llvm optimization pass | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joakim | On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
> On Friday, 17 June 2016 at 15:51:40 UTC, Dan Olson wrote:
>> I just tried both triples with ldc master of a week ago and llvm-3.6.2. IR for each is identical except gnueabihf has stuff for shared lib setup (dso_ctor, etc) which is not in android version.
>
> I'll try building and linking against llvm 3.6 and 3.7 and see if it makes a difference.
I just built and linked against llvm 3.7.1 and the problem goes away, so it appears that it is a regression in the GVN optimization pass with llvm 3.8.0. I'll check if it goes away with llvm master, and file an llvm bug if it doesn't.
|
June 18, 2016 Re: How to track down a bad llvm optimization pass | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joakim | Joakim <dlang@joakim.fea.st> writes:
> On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
> I just built and linked against llvm 3.7.1 and the problem goes away,
> so it appears that it is a regression in the GVN optimization pass
> with llvm 3.8.0. I'll check if it goes away with llvm master, and
> file an llvm bug if it doesn't.
And I am trying the opposite: building ldc master against llvm 3.8.0 to see if I can duplicate the problem.
|
June 18, 2016 Re: How to track down a bad llvm optimization pass | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dan Olson | Dan Olson <gorox@comcast.net> writes:
> Joakim <dlang@joakim.fea.st> writes:
>
>> On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
>> I just built and linked against llvm 3.7.1 and the problem goes away,
>> so it appears that it is a regression in the GVN optimization pass
>> with llvm 3.8.0. I'll check if it goes away with llvm master, and
>> file an llvm bug if it doesn't.
>
> And I am trying the opposite: building ldc master against llvm 3.8.0 to see if I can duplicate the problem.
Yup, it fails the same way on a Raspberry Pi with llvm 3.8.0.
|
Copyright © 1999-2021 by the D Language Foundation