fabs not being inlined?
February 07
module ohreally;

import std.math;

float foo(float y, float x)
{
    float ax = fabs(x);
    float ay = fabs(y);
    return ax*ay/3.142f;
}

====>

float ohreally.foo(float, float):
        push    rax
        movss   dword ptr [rsp + 4], xmm1
        call    pure nothrow @nogc @safe float std.math.fabs(float)@PLT
        movss   dword ptr [rsp], xmm0
        movss   xmm0, dword ptr [rsp + 4]
        call    pure nothrow @nogc @safe float std.math.fabs(float)@PLT
        mulss   xmm0, dword ptr [rsp]
        divss   xmm0, dword ptr [rip + .LCPI0_0]
        pop     rax
        ret

Compiled with -O3

Is there something I need to do to get fabs() inlined?
February 07
On 7 Feb 2019, at 23:26, NaN via digitalmars-d-ldc wrote:
> Is there something I need to do to get fabs() inlined?

Try -enable-cross-module-inlining, or use link-time optimization. If there isn't already, we should probably create a tracker bug for making sure we fix inlining for these, whether by switching cross-module inlining on by default again, or implementing/adding `pragma(inline, true)` to all the math shims.

 — David

February 08
On Thursday, 7 February 2019 at 23:50:10 UTC, David Nadlinger wrote:
> or use link-time optimization

Be sure to use `-flto=<thin|full>` *and* `-defaultlib=druntime-ldc-lto,phobos2-ldc-lto`.

> If there isn't already, we should probably create a tracker bug for making sure we fix inlining for these, whether by switching cross-module inlining on by default again, or implementing/adding `pragma(inline, true)` to all the math shims.

https://github.com/ldc-developers/ldc/issues/2552

February 08
On Thursday, 7 February 2019 at 23:50:10 UTC, David Nadlinger wrote:
> On 7 Feb 2019, at 23:26, NaN via digitalmars-d-ldc wrote:
>> Is there something I need to do to get fabs() inlined?
>
> Try -enable-cross-module-inlining,

That worked!

I did look at

https://wiki.dlang.org/Using_LDC

but didn't see any reference to cross-module inlining.


February 08
On Thursday, 7 February 2019 at 23:50:10 UTC, David Nadlinger wrote:
> On 7 Feb 2019, at 23:26, NaN via digitalmars-d-ldc wrote:
>> Is there something I need to do to get fabs() inlined?
>
> Try -enable-cross-module-inlining, or use link-time optimization. If there isn't already, we should probably create a tracker bug for making sure we fix inlining for these, whether by switching cross-module inlining on by default again, or implementing/adding `pragma(inline, true)` to all the math shims.
>
>  — David

OK, spoke too soon. I went to bed after just testing it on godbolt, but if I add that flag to my project the exe just hangs: it opens a command window but nothing else happens. Is there any point in trying to figure out why, or is it a known problem?
February 08
On Friday, 8 February 2019 at 00:01:58 UTC, kinke wrote:
> On Thursday, 7 February 2019 at 23:50:10 UTC, David Nadlinger wrote:
>> or use link-time optimization
>
> Be sure to use `-flto=<thin|full>` *and* `-defaultlib=druntime-ldc-lto,phobos2-ldc-lto`.
>

OK, tried that too; it results in ==>

> Executing task: dub run --compiler=ldc2 --build=release --arch=x86_64 <

Performing "release" build using ldc2 for x86_64.
sonijit ~master: building configuration "application"...
lld-link.exe: error: undefined symbol: __chkstk
>>> referenced by lto.tmp:(_d_run_main)
>>> referenced by lto.tmp:(_d_run_main)
>>> referenced by lto.tmp:(_d_run_main)

Error: C:\LDC\bin\lld-link.exe failed with status: 1
ldc2 failed with exit code 1.
The terminal process terminated with exit code: 2

Terminal will be reused by tasks, press any key to close it.

February 08
On Thursday, 7 February 2019 at 23:26:20 UTC, NaN wrote:
> module ohreally;
>
> import std.math;
>
> float foo(float y, float x)
> {
>     float ax = fabs(x);
>     float ay = fabs(y);
>     return ax*ay/3.142f;
> }
>
> ====>
>
> float ohreally.foo(float, float):
>         push    rax
>         movss   dword ptr [rsp + 4], xmm1
>         call    pure nothrow @nogc @safe float std.math.fabs(float)@PLT
>         movss   dword ptr [rsp], xmm0
>         movss   xmm0, dword ptr [rsp + 4]
>         call    pure nothrow @nogc @safe float std.math.fabs(float)@PLT
>         mulss   xmm0, dword ptr [rsp]
>         divss   xmm0, dword ptr [rip + .LCPI0_0]
>         pop     rax
>         ret
>
> Compiled with -O3
>
> Is there something I need to do to get fabs() inlined?

Also try the mir-core DUB package. It has a mir.math package whose fabs is inlined in -O builds without any additional flags.
http://code.dlang.org/packages/mir-core
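
For illustration, here is a minimal sketch of the original function switched over to mir. It assumes mir-core is already listed as a dependency in the project's DUB recipe, and that fabs is exposed by the `mir.math.common` module (the post above only names the `mir.math` package, so the exact module is an assumption).

```d
module ohreally;

// Assumption: mir-core is a DUB dependency and fabs lives in mir.math.common.
import mir.math.common : fabs;

float foo(float y, float x)
{
    float ax = fabs(x);
    float ay = fabs(y);
    return ax * ay / 3.142f;
}
```

Only the import changes; the body of foo stays exactly as in the first post.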
February 08
On Thursday, 7 February 2019 at 23:26:20 UTC, NaN wrote:
>
> Is there something I need to do to get fabs() inlined?

Alternative: inline by hand

https://godbolt.org/z/DS0XIb

Works since LDC 1.0.0 -O1
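
The godbolt link is not reproduced here, so the following is only a guess at the general idea of inlining by hand, not the linked code. The helper name `absByHand` is made up for this sketch.

```d
module ohreally;

// Hypothetical hand-written replacement for fabs. Being defined in the
// same module as its caller, it does not depend on cross-module inlining.
float absByHand(float x) pure nothrow @nogc @safe
{
    // Caveat: -0.0f < 0 is false, so -0.0f is returned unchanged,
    // which differs from a true fabs.
    return x < 0 ? -x : x;
}

float foo(float y, float x)
{
    float ax = absByHand(x);
    float ay = absByHand(y);
    return ax * ay / 3.142f;
}
```

Whether the sign-of-zero difference matters depends on the calling code.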
February 09
On Friday, 8 February 2019 at 14:47:20 UTC, Guillaume Piolat wrote:
> On Thursday, 7 February 2019 at 23:26:20 UTC, NaN wrote:
>>
>> Is there something I need to do to get fabs() inlined?
>
> Alternative: inline by hand
>
> https://godbolt.org/z/DS0XIb
>
> Works since LDC 1.0.0 -O1

This might not be pretty but it coaxes LDC to do abs with a single instruction...

https://godbolt.org/z/0aVvSR
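
The linked code is not reproduced here. As a rough sketch of the kind of trick that can reduce abs to a single masking operation, one can clear the IEEE 754 sign bit directly. The name `absByMask` is made up, and the generated assembly has not been checked for this snippet.

```d
module ohreally;

// Hypothetical sketch: clear the sign bit of a 32-bit IEEE 754 float.
float absByMask(float x) pure nothrow @nogc @trusted
{
    uint bits = *cast(uint*) &x;   // reinterpret the float as raw bits
    bits &= 0x7FFF_FFFF;           // bit 31 is the sign bit; mask it off
    return *cast(float*) &bits;    // reinterpret back as a float
}

float foo(float y, float x)
{
    return absByMask(x) * absByMask(y) / 3.142f;
}
```

Unlike a compare-and-negate version, this also turns -0.0f into +0.0f.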


February 09
On Saturday, 9 February 2019 at 15:08:22 UTC, NaN wrote:
> On Friday, 8 February 2019 at 14:47:20 UTC, Guillaume Piolat wrote:
>> Alternative: inline by hand
>>
>> https://godbolt.org/z/DS0XIb
>
> This might not be pretty but it coaxes LDC to do abs with a single instruction...
>
> https://godbolt.org/z/0aVvSR

Both manual versions are ugly and IMO should be avoided at all costs. ;) If LTO/cross-module-inlining is not an option but fabs performance is critical, then use the intrinsic directly:

import ldc.intrinsics;
alias fabs = llvm_fabs;

The reason std.math doesn't just alias (I had a go at this once) is that there are some tests checking that the std.math functions are real functions (and that their address can be taken).
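
Applied to the function from the first post, the alias could be used roughly like this. The `version (LDC)` fallback to std.math for other compilers is an addition for this sketch, not something suggested above.

```d
module ohreally;

version (LDC)
{
    // LDC only: use the LLVM intrinsic directly, as suggested above.
    import ldc.intrinsics : llvm_fabs;
    alias fabs = llvm_fabs;
}
else
{
    // Assumed fallback for other compilers (not part of the suggestion).
    import std.math : fabs;
}

float foo(float y, float x)
{
    float ax = fabs(x);
    float ay = fabs(y);
    return ax * ay / 3.142f;
}
```

The call sites stay unchanged; only the source of fabs differs per compiler.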

> lld-link.exe: error: undefined symbol: __chkstk

Looks like some linker tricks required for the MinGW-based libs don't work with LTO; I guess it works with the MS toolchain, e.g., when run inside a Visual Studio command prompt. I'll spare you the dirty details.