fabs not being inlined?
February 07
module ohreally;

import std.math;

float foo(float y, float x)
{
    float ax = fabs(x);
    float ay = fabs(y);
    return ax*ay/3.142f;
}

====>

float ohreally.foo(float, float):
        push    rax
        movss   dword ptr [rsp + 4], xmm1
        call    pure nothrow @nogc @safe float std.math.fabs(float)@PLT
        movss   dword ptr [rsp], xmm0
        movss   xmm0, dword ptr [rsp + 4]
        call    pure nothrow @nogc @safe float std.math.fabs(float)@PLT
        mulss   xmm0, dword ptr [rsp]
        divss   xmm0, dword ptr [rip + .LCPI0_0]
        pop     rax
        ret

Compiled with -O3

Is there something I need to do to get fabs() inlined?
February 07
On 7 Feb 2019, at 23:26, NaN via digitalmars-d-ldc wrote:
> Is there something I need to do to get fabs() inlined?

Try -enable-cross-module-inlining, or use link-time optimization. If there isn't already, we should probably create a tracker bug for making sure we fix inlining for these, whether by switching cross-module inlining on by default again, or implementing/adding `pragma(inline, true)` to all the math shims.
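As a rough sketch of the `pragma(inline, true)` idea mentioned above (not the actual Phobos or druntime code), a shim could look like the following. The name `fabsShim` is hypothetical, and whether the call really disappears across modules depends on the compiler version and flags.

```d
// Hypothetical shim, for illustration only; not the real std.math source.
version (LDC)
{
    import ldc.intrinsics : llvm_fabs;
}
else
{
    import std.math : fabs;
}

// pragma(inline, true) asks the compiler to always inline this function.
// It is still an ordinary function, so its address can be taken.
pragma(inline, true)
float fabsShim(float x) pure nothrow @nogc @safe
{
    version (LDC)
        return llvm_fabs(x); // forwards to the LLVM intrinsic
    else
        return fabs(x);      // fallback for other compilers
}
```

Callers would use `fabsShim(x)` exactly like `fabs(x)`; the generated code still has to be checked, for example on godbolt.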

 — David

February 08
On Thursday, 7 February 2019 at 23:50:10 UTC, David Nadlinger wrote:
> or use link-time optimization

Be sure to use `-flto=<thin|full>` *and* `-defaultlib=druntime-ldc-lto,phobos2-ldc-lto`.

> If there isn't already, we should probably create a tracker bug for making sure we fix inlining for these, whether by switching cross-module inlining on by default again, or implementing/adding `pragma(inline, true)` to all the math shims.

https://github.com/ldc-developers/ldc/issues/2552

February 08
On Thursday, 7 February 2019 at 23:50:10 UTC, David Nadlinger wrote:
> On 7 Feb 2019, at 23:26, NaN via digitalmars-d-ldc wrote:
>> Is there something I need to do to get fabs() inlined?
>
> Try -enable-cross-module-inlining,

That worked!

I did look at

https://wiki.dlang.org/Using_LDC

but didn't see any reference to cross-module inlining.


February 08
On Thursday, 7 February 2019 at 23:50:10 UTC, David Nadlinger wrote:
> On 7 Feb 2019, at 23:26, NaN via digitalmars-d-ldc wrote:
>> Is there something I need to do to get fabs() inlined?
>
> Try -enable-cross-module-inlining, or use link-time optimization. If there isn't already, we should probably create a tracker bug for making sure we fix inlining for these, whether by switching cross-module inlining on by default again, or implementing/adding `pragma(inline, true)` to all the math shims.
>
>  — David

OK, I spoke too soon. I went to bed after testing it only on godbolt, but when I add that flag to my project the exe just hangs: it opens a command window and nothing else happens. Is there any point in trying to figure out why, or is it a known problem?
February 08
On Friday, 8 February 2019 at 00:01:58 UTC, kinke wrote:
> On Thursday, 7 February 2019 at 23:50:10 UTC, David Nadlinger wrote:
>> or use link-time optimization
>
> Be sure to use `-flto=<thin|full>` *and* `-defaultlib=druntime-ldc-lto,phobos2-ldc-lto`.
>

OK, I tried that too; it results in:

> Executing task: dub run --compiler=ldc2 --build=release --arch=x86_64 <

Performing "release" build using ldc2 for x86_64.
sonijit ~master: building configuration "application"...
lld-link.exe: error: undefined symbol: __chkstk
>>> referenced by lto.tmp:(_d_run_main)
>>> referenced by lto.tmp:(_d_run_main)
>>> referenced by lto.tmp:(_d_run_main)

Error: C:\LDC\bin\lld-link.exe failed with status: 1
ldc2 failed with exit code 1.
The terminal process terminated with exit code: 2

Terminal will be reused by tasks, press any key to close it.

February 08
On Thursday, 7 February 2019 at 23:26:20 UTC, NaN wrote:
> module ohreally;
>
> import std.math;
>
> float foo(float y, float x)
> {
>     float ax = fabs(x);
>     float ay = fabs(y);
>     return ax*ay/3.142f;
> }
>
> ====>
>
> float ohreally.foo(float, float):
>         push    rax
>         movss   dword ptr [rsp + 4], xmm1
>         call    pure nothrow @nogc @safe float std.math.fabs(float)@PLT
>         movss   dword ptr [rsp], xmm0
>         movss   xmm0, dword ptr [rsp + 4]
>         call    pure nothrow @nogc @safe float std.math.fabs(float)@PLT
>         mulss   xmm0, dword ptr [rsp]
>         divss   xmm0, dword ptr [rip + .LCPI0_0]
>         pop     rax
>         ret
>
> Compiled with -O3
>
> Is there something I need to do to get fabs() inlined?

Also try the mir-core DUB package. It has a mir.math package whose fabs is inlined in -O builds without any additional flags.
http://code.dlang.org/packages/mir-core
February 08
On Thursday, 7 February 2019 at 23:26:20 UTC, NaN wrote:
>
> Is there something I need to do to get fabs() inlined?

Alternative: inline by hand

https://godbolt.org/z/DS0XIb

Works since LDC 1.0.0 with -O1
February 09
On Friday, 8 February 2019 at 14:47:20 UTC, Guillaume Piolat wrote:
> On Thursday, 7 February 2019 at 23:26:20 UTC, NaN wrote:
>>
>> Is there something I need to do to get fabs() inlined?
>
> Alternative: inline by hand
>
> https://godbolt.org/z/DS0XIb
>
> Works since LDC 1.0.0 -O1

This might not be pretty but it coaxes LDC to do abs with a single instruction...

https://godbolt.org/z/0aVvSR
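
The linked snippet is not reproduced in this thread, so the following is only an assumption about what a hand-written, single-instruction abs can look like: clearing the IEEE 754 sign bit directly. The name `absMask` is hypothetical.

```d
// Illustrative sketch only; not necessarily the code behind the link above.
// Clears bit 31 (the sign bit) of a 32-bit IEEE 754 float.
float absMask(float x) pure nothrow @nogc @trusted
{
    uint bits = *cast(uint*) &x;  // reinterpret the float's bits as an integer
    bits &= 0x7FFF_FFFF;          // mask off the sign bit
    return *cast(float*) &bits;   // reinterpret the result as a float again
}
```

Optimizing compilers typically reduce this to a single bitwise AND on the SSE register, but that has to be confirmed in the generated assembly.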


February 09
On Saturday, 9 February 2019 at 15:08:22 UTC, NaN wrote:
> On Friday, 8 February 2019 at 14:47:20 UTC, Guillaume Piolat wrote:
>> Alternative: inline by hand
>>
>> https://godbolt.org/z/DS0XIb
>
> This might not be pretty but it coaxes LDC to do abs with a single instruction...
>
> https://godbolt.org/z/0aVvSR

Both manual versions are ugly and IMO should be avoided at all costs. ;) If LTO/cross-module-inlining is not an option but fabs performance is critical, then use the intrinsic directly:

import ldc.intrinsics;
alias fabs = llvm_fabs;

The reason std.math doesn't just alias (I had a go at this once) is that there are some tests checking that the std.math functions are real functions (and that their address can be taken).
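
Applied to the function from the first post, the alias might be used as sketched below. The `version (LDC)` guard and the std.math fallback are additions for illustration, not part of the suggestion above.

```d
version (LDC)
{
    import ldc.intrinsics : llvm_fabs;
    alias fabs = llvm_fabs;  // fabs now names the LLVM intrinsic directly
}
else
{
    import std.math : fabs;  // fallback so the sketch builds with other compilers
}

float foo(float y, float x)
{
    float ax = fabs(x);
    float ay = fabs(y);
    return ax * ay / 3.142f;
}
```

With LDC, `fabs` here is an intrinsic rather than a regular function, so code that takes `&fabs` would not work with this alias.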

> lld-link.exe: error: undefined symbol: __chkstk

Looks like some linker tricks required for the MinGW-based libs don't work with LTO; I guess it works with the MS toolchain, e.g., when run inside a Visual Studio command prompt. I'll spare you the dirty details.