Thread overview
popcnt intrinsic unused
Jan 19, 2016
Stefan Koch
Jan 19, 2016
David Nadlinger
Jan 20, 2016
Kagamin
Jan 20, 2016
Stefan Koch
Jan 21, 2016
Kagamin
Jan 21, 2016
Stefan Koch
Jan 20, 2016
Stefan Koch
January 19, 2016
I just compiled a simple program calling popcnt in a loop.
It does not generate the intrinsic even when compiled with
-O3 -c -mcpu=amdfam10
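
The program itself was not posted, so the following is only a minimal sketch of the kind of loop being described. The function name `totalBits` and the input data are assumptions made for illustration.

```d
import core.bitop : popcnt;

// Hypothetical reproduction: sum the set bits of every element.
// If the popcnt call is not inlined, each iteration performs a
// call into the runtime library instead of a single instruction.
uint totalBits(const(uint)[] data)
{
    uint total = 0;
    foreach (x; data)
        total += popcnt(x);
    return total;
}
```

Compiling a file like this with `ldc2 -O3 -c -mcpu=amdfam10` and adding `-output-s` to get the assembly should show whether the loop body contains a `popcnt` instruction or a call.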
January 19, 2016
Hi Stefan,

On 19 Jan 2016, at 14:18, Stefan Koch via digitalmars-d-ldc wrote:
> I just compiled a simple program calling popcnt in a loop.
> It does not generate the intrinsic even when compiled with
> -O3 -c -mcpu=amdfam10

You mean it emits a function call to libdruntime-ldc instead of just the intrinsic? In that case, it's probably the inlining problem that has been haunting us for ages (can't use the LLVM inliner because core.bitop doesn't actually get compiled, and we are not using DMD's front-end inliner either).

If that's the case, a workaround would be to either copy/paste the function into your source code, or add the druntime module to the build (making sure to use -singleobj for the ldc2 driver).
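
As a rough sketch of the first workaround, a population count can live in the user's own module so that the optimizer sees its body. The name `popcntLocal` is hypothetical, and the body below is a generic bit-twiddling implementation used as a stand-in, not a copy of druntime's actual source.

```d
// Hypothetical local population count (assumed name, generic algorithm).
// Because the body is in the same module as the caller, the LLVM
// inliner can see it even though core.bitop is not compiled in.
uint popcntLocal(uint x) pure nothrow @nogc @safe
{
    x = x - ((x >> 1) & 0x5555_5555);                 // 2-bit partial sums
    x = (x & 0x3333_3333) + ((x >> 2) & 0x3333_3333); // 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F_0F0F;                 // 8-bit partial sums
    return (x * 0x0101_0101) >> 24;                   // add the four bytes
}
```

Whether the backend then reduces this to a single `popcnt` instruction depends on the LLVM version and the target CPU, so the generated assembly is worth checking.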

Either way, one of the next important goals for LDC should be finally implementing proper force-inline support (that, unlike DMD's pragma, also works when the inliner is not otherwise active, and across all module boundaries).

 — David
January 20, 2016
On Tuesday, 19 January 2016 at 20:01:31 UTC, David Nadlinger wrote:
> Either way, one of the next important goals for LDC should be finally implementing proper force-inline support (that, unlike DMD's pragma, also works when the inliner is not otherwise active, and across all module boundaries).

Why? Stefan is compiling with -O3.
January 20, 2016
On Tuesday, 19 January 2016 at 20:01:31 UTC, David Nadlinger wrote:
> Hi Stefan,
>
> On 19 Jan 2016, at 14:18, Stefan Koch via digitalmars-d-ldc wrote:
>> [...]
>
> [...]

Hi David,

Thanks for the explanation.

January 20, 2016
On Wednesday, 20 January 2016 at 09:03:16 UTC, Kagamin wrote:
> On Tuesday, 19 January 2016 at 20:01:31 UTC, David Nadlinger wrote:
>> Either way, one of the next important goals for LDC should be finally implementing proper force-inline support (that, unlike DMD's pragma, also works when the inliner is not otherwise active, and across all module boundaries).
>
> Why? Stefan is compiling with -O3.

Because the runtime is not visible as source code.
If it were, LLVM could turn this into the popcnt instruction.
But the inliner is blind when the call goes into a library...
January 21, 2016
On Wednesday, 20 January 2016 at 17:35:02 UTC, Stefan Koch wrote:
> Because the runtime is not visible as source-code.
> If it were, llvm could make this into the popcnt instruction.
> But the inliner is blind when it calls a library...

If a function has pragma(inline) and the inliner doesn't inline it, then the pragma is not implemented. And how does dmd do it if the function source is not visible?
January 21, 2016
On Thursday, 21 January 2016 at 09:03:54 UTC, Kagamin wrote:
> On Wednesday, 20 January 2016 at 17:35:02 UTC, Stefan Koch wrote:
>> Because the runtime is not visible as source-code.
>> If it were, llvm could make this into the popcnt instruction.
>> But the inliner is blind when it calls a library...
>
> If a function has pragma(inline) and inliner doesn't inline it, then the pragma is not implemented. And how dmd does it if the function source is not visible?

dmd does not, if you use the runtime function popcnt.
It does use the intrinsic if you call the intrinsic _popcnt directly.
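
To illustrate the distinction between the two functions, here is a small sketch. The wrapper name `countBits` is an assumption, and so is the `hasPopcnt` check, added here because the POPCNT instruction is not available on every x86 CPU. `_popcnt` is only declared for DMD on x86, so other compilers take the `popcnt` path.

```d
import core.bitop : popcnt;

// Hypothetical wrapper: prefer the _popcnt intrinsic under DMD,
// otherwise fall back to the ordinary runtime function.
uint countBits(uint x)
{
    version (DigitalMars)
    {
        import core.bitop : _popcnt;
        import core.cpuid : hasPopcnt;

        // Only use the instruction when the CPU reports support for it.
        if (hasPopcnt)
            return _popcnt(x);
    }
    return popcnt(x);
}
```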