February 07, 2019
On Thursday, 7 February 2019 at 03:50:32 UTC, Vladimir Panteleev wrote:
> On Monday, 17 December 2018 at 21:59:59 UTC, JN wrote:
>> while working on my game engine project, I encountered a DMD codegen bug. It occurs only when compiling in release mode, debug works.
>
> Old thread, but FWIW, such bugs can be easily and precisely reduced with DustMite. In your test script, just compile with and without the compiler option which causes the bug to manifest, and check that one works and the other doesn't.
>
> I put together a short article on the DustMite wiki describing how to do this:
> https://github.com/CyberShadow/DustMite/wiki/Reducing-a-bug-with-a-specific-compiler-option

Does it also work for dub projects?

Anyway, I managed to reduce the source code greatly manually:

https://github.com/helikopterodaktyl/repro_d_release/

Unfortunately, I can't get rid of the dlib dependency. When built in debug mode, the test outputs [0: Object]; in release mode, it outputs [0: null].

Commenting this line out:
f.rotation = Quaternionf.fromEulerAngles(Vector3f(0.0f, 0.0f, 0.0f));
or changing it to:
f.rotation = Quaternionf.identity();

is enough to make release output [0: Object] as well. I guess dlib is doing something dodgy with memory layout, but I can't see anything suspicious :(
February 07, 2019
On Thu, Feb 07, 2019 at 10:16:19PM +0000, JN via Digitalmars-d-learn wrote: [...]
> Anyway, I managed to reduce the source code greatly manually:
> 
> https://github.com/helikopterodaktyl/repro_d_release/
> 
> Unfortunately, I can't get rid of the dlib dependency. When built in debug mode, the test outputs [0: Object]; in release mode, it outputs [0: null].
> 
> Commenting this line out:
> f.rotation = Quaternionf.fromEulerAngles(Vector3f(0.0f, 0.0f, 0.0f));
> or changing it to:
> f.rotation = Quaternionf.identity();
> 
> is enough to make release output [0: Object] as well. I guess dlib is doing something dodgy with memory layout, but I can't see anything suspicious :(

Hmm. I can't seem to reproduce this in my environment (Linux/x86_64). I tried various combinations of `dub -b release|debug|etc.`, as well as compiling manually with `dmd -I~/.dub/packages/dlib-0.15.0/dlib` and various combinations of -release, -debug, etc.

I wonder if you somehow have an ABI mismatch caused by stale cached objects in dub?  Perhaps try `dub --force` to force a rebuild of everything?  Or, if you're daring, delete the entire dub cache and rebuild, just to be sure there are no stray stale files lying around somewhere.


Barring that, one way to narrow this down further is to copy the relevant dlib sources into your own source tree, remove the dub dependency, and then reduce the dlib sources as well.  I did a quick and crude test, and discovered that you only need the following files:

	dlib/math/matrix.d
	dlib/math/linsolve.d
	dlib/math/quaternion.d
	dlib/math/decomposition.d
	dlib/math/package.d
	dlib/math/vector.d
	dlib/math/utils.d
	dlib/core/package.d
	dlib/core/tuple.d

Replace dlib/core/package.d with an empty file, and edit dlib/math/package.d to import only dlib.math.quaternion and dlib.math.vector.
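
The copying step above could be scripted roughly as follows. This is only a sketch, not something from the original post: the function name `vendor_dlib` and its two arguments are made up for illustration, and it assumes the first argument is a directory that contains the `dlib/` package folder (for example the `~/.dub/packages/dlib-0.15.0/dlib` path passed to `-I` earlier).

```shell
#!/bin/sh
# vendor_dlib SRC DST  (hypothetical helper, not part of dub or dlib)
# Copies only the dlib modules listed above from SRC into DST, keeping the
# dlib/... directory layout so that `import dlib.math...` still resolves.
vendor_dlib() {
    src=$1   # directory that contains the dlib/ package folder
    dst=$2   # source directory of your own project
    for f in \
        dlib/math/matrix.d \
        dlib/math/linsolve.d \
        dlib/math/quaternion.d \
        dlib/math/decomposition.d \
        dlib/math/package.d \
        dlib/math/vector.d \
        dlib/math/utils.d \
        dlib/core/package.d \
        dlib/core/tuple.d
    do
        mkdir -p "$dst/$(dirname "$f")" || return 1
        cp "$src/$f" "$dst/$f" || return 1
    done
    # Replace dlib/core/package.d with an empty file, as suggested above.
    : > "$dst/dlib/core/package.d"
    # dlib/math/package.d still has to be edited by hand so that it imports
    # only dlib.math.quaternion and dlib.math.vector.
}
```

It would be called as `vendor_dlib ~/.dub/packages/dlib-0.15.0/dlib source`, where `source` is assumed to be the project's own source directory. After that, the dub dependency can be removed and the copied files reduced like any other file in the project.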

Since you're only using a very small number of functions, you can probably quickly eliminate most of the above files too. Just edit the files directly (since they're your own copy) and delete everything that isn't directly needed by your code.  Of course, after each deletion, also check that the bug still behaves the same way. If the behaviour changes, then whatever you just deleted may be (part of) the cause of the problem.

Sorry I can't help you with reproducing the problem, as the bug doesn't seem to show up in my environment.  (I suspect it's still there, just that subtle differences in my environment may be masking it somehow.)


T

February 08, 2019
On Thursday, 7 February 2019 at 22:16:19 UTC, JN wrote:
> Does it also work for dub projects?

It will work if you can put all the relevant D code in one directory, which is harder with Dub, as it likes to pull dependencies from all over the place. When "dub dustmite" is insufficient (as in this case), the safest way to proceed is to build with dub in verbose mode and take note of the compiler command lines it uses. Then put those command lines in a shell script, copy all the D files they mention into one directory, and pass that to DustMite.

February 08, 2019
On Friday, 8 February 2019 at 07:30:41 UTC, Vladimir Panteleev wrote:
> On Thursday, 7 February 2019 at 22:16:19 UTC, JN wrote:
>> Does it also work for dub projects?
>
> It will work if you can put all the relevant D code in one directory, which is harder with Dub, as it likes to pull dependencies from all over the place. When "dub dustmite" is insufficient (as in this case), the safest way to proceed is to build with dub in verbose mode and take note of the compiler command lines it uses. Then put those command lines in a shell script, copy all the D files they mention into one directory, and pass that to DustMite.

I will try. However, one last thing: the example test scripts run the compiler first with one setting (or D version) and then a second time with the other setting (or D version). But it looks like the exit code of the first run is ignored anyway, so why run it?
February 08, 2019
On Friday, 8 February 2019 at 09:28:48 UTC, JN wrote:
> I will try. However, one last thing: the example test scripts run the compiler first with one setting (or D version) and then a second time with the other setting (or D version). But it looks like the exit code of the first run is ignored anyway, so why run it?

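With "set -e", the shell interpreter exits the script as soon as any command fails (returns a non-zero status), unless the command is part of an "if" condition or similar. So the exit code of the first run is not ignored: if that run fails, the whole script fails. I'll update the article to clarify it.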
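
A small experiment shows the effect. In this sketch, `false` stands in for a failing compiler run, and the helper name `demo` is made up for illustration; it is not taken from the DustMite article.

```shell
#!/bin/sh
# demo SETUP
# Runs a two-step script in a child shell. SETUP is either "set -e"
# (exit on the first failing command) or ":" (a no-op, so errors are ignored).
demo() {
    if sh -c "$1; false; echo 'second step ran'"; then
        echo "exit status 0"
    else
        echo "exit status $?"
    fi
}

demo "set -e"   # the child shell stops at `false`; the echo never runs
demo ":"        # `false` is ignored; the echo runs and decides the status
```

With "set -e" the child shell stops at the failing step and reports a non-zero status; without it, the failure is silently dropped and only the last command determines the result.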

February 08, 2019
On Friday, 8 February 2019 at 09:30:12 UTC, Vladimir Panteleev wrote:
> On Friday, 8 February 2019 at 09:28:48 UTC, JN wrote:
>> I will try. However, one last thing: the example test scripts run the compiler first with one setting (or D version) and then a second time with the other setting (or D version). But it looks like the exit code of the first run is ignored anyway, so why run it?
>
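> With "set -e", the shell interpreter exits the script as soon as any command fails (returns a non-zero status), unless the command is part of an "if" condition or similar. So the exit code of the first run is not ignored: if that run fails, the whole script fails. I'll update the article to clarify it.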

I see. DustMite helped. I had to convert it to a Windows batch file, so my test script ended up being:

@REM Build with bounds checking on; this build must compile and print "Object".
dmd -O -inline -release -boundscheck=on -i app.d -m64
@IF %ERRORLEVEL% EQU 0 (ECHO No error found) ELSE (EXIT /B 1)
@app | FINDSTR /C:"Object"
@IF %ERRORLEVEL% EQU 0 (ECHO No error found) ELSE (EXIT /B 1)
@REM Build with bounds checking off; the bug is present only if this build prints "null".
dmd -O -inline -release -boundscheck=off -i app.d -m64
@IF %ERRORLEVEL% EQU 0 (ECHO No error found) ELSE (EXIT /B 1)
@app | FINDSTR /C:"null"
@IF %ERRORLEVEL% EQU 0 (EXIT /B 0) ELSE (EXIT /B 1)

I managed to greatly reduce the source code. I have filed a bug with the reduced testcase https://issues.dlang.org/show_bug.cgi?id=19662 .
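
For readers on Linux or macOS, a POSIX shell counterpart of the batch script might look roughly like this. It is a sketch, not the script from the DustMite wiki: the function name `still_reproduces` is hypothetical, the compiler flags in the usage comment are copied from the batch script above, and `./app` assumes dmd names the executable after app.d.

```shell
#!/bin/sh
# still_reproduces GOOD_CMD BAD_CMD  (hypothetical helper)
# GOOD_CMD and BAD_CMD are shell command strings that build and run the test
# program in the working and the failing configuration respectively.
# Returns 0, which DustMite reads as "the reduced input still shows the bug",
# only if GOOD_CMD prints "Object" and BAD_CMD prints "null".
still_reproduces() {
    # A failed build prints nothing on stdout, so grep fails and we return 1.
    sh -c "$1" | grep -qF "Object" || return 1
    sh -c "$2" | grep -qF "null" || return 1
}

# Intended use as the last command of a DustMite test script
# (commented out here so the sketch runs without dmd installed):
# still_reproduces \
#     "dmd -O -inline -release -boundscheck=on  -i app.d -m64 && ./app" \
#     "dmd -O -inline -release -boundscheck=off -i app.d -m64 && ./app"
```

Checking both configurations keeps DustMite from "reducing" the program into something that is simply broken everywhere: a candidate is kept only while the good build still works and the bad build still misbehaves.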
February 08, 2019
On Fri, Feb 08, 2019 at 09:23:40PM +0000, JN via Digitalmars-d-learn wrote: [...]
> I managed to greatly reduce the source code. I have filed a bug with the reduced testcase https://issues.dlang.org/show_bug.cgi?id=19662 .

Haha, you were right!  It's a compiler bug, another one of those nasty -O -inline bugs.  Probably a backend codegen bug.  Ran into one of those before; was pretty nasty.  Fortunately it got fixed soon(ish) after I made noise about it in the forum. :-P


T

February 08, 2019
On Friday, 8 February 2019 at 21:35:34 UTC, H. S. Teoh wrote:
> On Fri, Feb 08, 2019 at 09:23:40PM +0000, JN via Digitalmars-d-learn wrote: [...]
>> I managed to greatly reduce the source code. I have filed a bug with the reduced testcase https://issues.dlang.org/show_bug.cgi?id=19662 .
>
> Haha, you were right!  It's a compiler bug, another one of those nasty -O -inline bugs.  Probably a backend codegen bug.  Ran into one of those before; was pretty nasty.  Fortunately it got fixed soon(ish) after I made noise about it in the forum. :-P
>
>
> T

Luckily it's not a blocker for me, because it doesn't trigger on debug builds, and for release builds I can always use LDC, but still it's bugging me (pun intended).
February 08, 2019
On Fri, Feb 08, 2019 at 09:42:11PM +0000, JN via Digitalmars-d-learn wrote:
> On Friday, 8 February 2019 at 21:35:34 UTC, H. S. Teoh wrote:
> > On Fri, Feb 08, 2019 at 09:23:40PM +0000, JN via Digitalmars-d-learn wrote: [...]
> > > I managed to greatly reduce the source code. I have filed a bug with the reduced testcase https://issues.dlang.org/show_bug.cgi?id=19662 .
> > 
> > Haha, you were right!  It's a compiler bug, another one of those nasty -O -inline bugs.  Probably a backend codegen bug.  Ran into one of those before; was pretty nasty.  Fortunately it got fixed soon(ish) after I made noise about it in the forum. :-P
[...]
> Luckily it's not a blocker for me, because it doesn't trigger on debug builds, and for release builds I can always use LDC, but still it's bugging me (pun intended).

Pity I still can't reproduce the problem locally. Otherwise I would reduce it even more -- e.g., eliminate std.stdio dependency and have the program fail on assert(obj != null), and a bunch of other things to make it easier for compiler devs to analyze -- and perhaps look at the generated assembly to see what went wrong.  If you have the time (and patience) to do that, it would greatly increase the chances of this being fixed in a timely way, since it would narrow down the bug even more so that it's easier to find in the dmd source code.


T

February 08, 2019
On Friday, 8 February 2019 at 22:11:31 UTC, H. S. Teoh wrote:
> Pity I still can't reproduce the problem locally. Otherwise I would reduce it even more -- e.g., eliminate std.stdio dependency and have the program fail on assert(obj != null), and a bunch of other things to make it easier for compiler devs to analyze -- and perhaps look at the generated assembly to see what went wrong.  If you have the time (and patience) to do that, it would greatly increase the chances of this being fixed in a timely way, since it would narrow down the bug even more so that it's easier to find in the dmd source code.
>
>
> T

It seems to be a Windows 64-bit only thing. Anyway, I reduced the code further manually. It's very hard to reduce it any further. For example, removing the assignments in the fromEulerAngles static method hides the bug. Likewise, replacing writeln with assert makes it work properly too.