November 22, 2014 naked popcnt function | ||||
---|---|---|---|---|
| ||||
Hello, I would like to write a "popcnt" function. This works fine

ulong popcnt(ulong x)
{
    asm { mov RAX, x ; popcnt RAX, RAX ; }
}

However, if I add the "naked" keyword (which should improve performance?) it doesn't work anymore and I can't figure out what change I am supposed to make (aside from x[RBP] instead of x).
This function is going to be *heavily* used.

Thanks for any help.
|
November 22, 2014 Re: naked popcnt function | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ad | On Saturday, 22 November 2014 at 18:30:06 UTC, Ad wrote:
> Hello, I would like to write a "popcnt" function. This works fine
>
> ulong popcnt(ulong x)
> {
> asm { mov RAX, x ; popcnt RAX, RAX ; }
> }
>
> However, if I add the "naked" keyword ( which should improve performance? ) it doesn't work anymore and I can't figure out what change I am supposed to make ( aside from x[RBP] instead of x )
> This function is going to be *heavily* used.
>
> Thanks for any help.
Last time I used naked asm, I simply used the calling convention to figure out the location of the parameter (e.g. RCX on Win64, RDI on 64-bit Linux, IIRC).
N.B. on LDC & GDC there is an intrinsic for popcnt.
|
November 23, 2014 Re: naked popcnt function | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ad | On Sat, 22 Nov 2014 18:30:05 +0000, "Ad" <ad@fakmail.fg> wrote:
> Hello, I would like to write a "popcnt" function. This works fine
>
> ulong popcnt(ulong x)
> {
> asm { mov RAX, x ; popcnt RAX, RAX ; }
> }
>
> However, if I add the "naked" keyword ( which should improve
> performance? ) it doesn't work anymore and I can't figure out
> what change I am supposed to make ( aside from x[RBP] instead of
> x )
> This function is going to be *heavily* used.
>
> Thanks for any help.
It has been a long time since I tried "naked", but IIRC it strips all compiler-generated code from the function, and I see no 'ret' in your function. So execution probably runs into whatever code lies behind that function in the executable.
I would use a tool like obj2asm or objdump to check what the generated code looks like, or use a debugger that can disassemble on the fly.
--
Marco
|
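Putting the two suggestions in this thread together, a naked version might look roughly like the sketch below. It is untested here and rests on several assumptions: DMD-style inline asm on x86-64, a CPU that supports the POPCNT instruction, and `extern (C)` linkage so that the C calling convention decides where the argument lives (RCX on Win64, RDI elsewhere). The name `popcntNaked` is made up for illustration.

```d
// Sketch only: assumes DMD-style inline asm on x86-64 and a CPU with POPCNT.
// extern (C) is an assumption made here so the C ABI fixes the argument register.
extern (C) ulong popcntNaked(ulong x)
{
    version (D_InlineAsm_X86_64)
    {
        version (Win64)
        {
            asm
            {
                naked;
                popcnt RAX, RCX; // Win64: first integer argument arrives in RCX
                ret;             // naked strips the epilogue, so return explicitly
            }
        }
        else
        {
            asm
            {
                naked;
                popcnt RAX, RDI; // System V AMD64: first integer argument arrives in RDI
                ret;             // without this, execution falls into the next function
            }
        }
    }
    else
    {
        static assert(0, "this sketch needs DMD-style x86-64 inline asm");
    }
}
```

With `naked` there is no stack frame, so `x` cannot be referenced by name; the value is read straight from the argument register and the result is left in RAX. Disassembling the object file with obj2asm or objdump, as suggested above, is a reasonable way to confirm that only the `popcnt` and `ret` instructions were emitted.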
Copyright © 1999-2021 by the D Language Foundation