April 08, 2015 Re: How to deal with inline asm functions in Phobos/druntime?
Posted in reply to David Nadlinger | On Wednesday, 8 April 2015 at 13:28:16 UTC, David Nadlinger wrote:
> On 8 Apr 2015, at 15:15, Daniel Murphy via digitalmars-d-ldc wrote:
>> I don't think it's so much about vectorizing as it is about avoiding the x87 FPU, which you can do when 80-bit precision is not needed.
>
> Indeed. On x86_64, the SSE registers (%xmm0 and so on) are used by default for single- and double-precision floating point operations. The x87 FPU is not particularly well-optimized on newer CPUs to begin with, and transferring data from the SSE registers to the FPU on function entry and then back again is quite costly too.
>
> For example, this is what made us (all D compilers) look bad on that Perlin noise microbenchmark (the thread from a couple of months ago).
Ah, ok. Didn't realize. For future reference:
http://gruntthepeon.free.fr/ssemath
April 12, 2015 Re: How to deal with inline asm functions in Phobos/druntime?
Posted in reply to Johan Engelen | I wrote a non-asm ilogb that actually runs quite a bit faster than what DMD or LDC do by default, and should also be much more portable. See https://github.com/D-Programming-Language/phobos/pull/3186
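For readers unfamiliar with what an asm-free ilogb involves, here is a minimal sketch of the general idea in C. It is not the code from the pull request above, and it assumes `double` is an IEEE 754 binary64 value. The names `ilogb_sketch` and `highest_set_bit` are hypothetical, chosen only for this illustration. The exponent is read straight out of the bit pattern, and only subnormals need a bit scan.

```c
#include <limits.h>
#include <math.h>    /* FP_ILOGB0, FP_ILOGBNAN */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: index of the highest set bit of a nonzero value,
   written as a portable loop (this is the job `bsr` does on x86). */
static int highest_set_bit(uint64_t v)
{
    int index = 0;
    while (v >>= 1)
        ++index;
    return index;
}

/* Sketch of an asm-free ilogb. Assumption: double is IEEE 754 binary64. */
int ilogb_sketch(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);            /* well-defined type pun */

    int biased = (int)((bits >> 52) & 0x7FF);  /* 11-bit exponent field */
    uint64_t mantissa = bits & 0x000FFFFFFFFFFFFFULL;

    if (biased == 0x7FF)                       /* infinity or NaN */
        return mantissa ? FP_ILOGBNAN : INT_MAX;
    if (biased == 0) {
        if (mantissa == 0)
            return FP_ILOGB0;                  /* positive or negative zero */
        /* Subnormal: the value is mantissa * 2^-1074. */
        return highest_set_bit(mantissa) - 1074;
    }
    return biased - 1023;                      /* normal number: remove the bias */
}
```

The sign bit is simply ignored, so `ilogb_sketch(-10.0)` and `ilogb_sketch(10.0)` give the same result. The loop in `highest_set_bit` is the part a real implementation would likely replace with a bit-scan primitive.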
April 13, 2015 Re: How to deal with inline asm functions in Phobos/druntime?
Posted in reply to Johan Engelen | On Sunday, 12 April 2015 at 22:35:20 UTC, Johan Engelen wrote:
> I wrote a non-asm ilogb that actually runs quite a bit faster than what DMD or LDC do by default, and should also be much more portable.
> See
> https://github.com/D-Programming-Language/phobos/pull/3186
Nice! For LDC you can replace bsr with the intrinsic llvm.ctlz.i#. It is a template and also CTFE-enabled.
Regards,
Kai
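To illustrate the bsr-to-ctlz substitution mentioned above: bsr yields the index of the highest set bit, while a count-leading-zeros operation yields the number of zero bits above it, so for a nonzero 32-bit value the two are related by bsr(v) = 31 - ctlz(v). The sketch below is in C rather than D and uses the GCC/Clang builtin `__builtin_clz` as a stand-in for the LLVM intrinsic. The function names are hypothetical, and it assumes `unsigned int` is 32 bits wide.

```c
#include <stdint.h>

/* Portable reference: index of the highest set bit, i.e. the result of
   x86 `bsr`. Only meaningful for v != 0, like the instruction itself. */
int bsr_reference(uint32_t v)
{
    int index = 0;
    while (v >>= 1)
        ++index;
    return index;
}

/* Same result via count-leading-zeros. __builtin_clz is a GCC/Clang
   builtin; its result is undefined for v == 0.
   Assumption: unsigned int is 32 bits wide. */
int bsr_via_clz(uint32_t v)
{
    return 31 - __builtin_clz(v);
}
```

Both functions leave the zero input to the caller, which matches how an ilogb implementation would use them: zero is handled as a special case before any bit scan.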