Speed kills
February 15, 2016
Every time there is a D thread on reddit it feels like the new user is expecting mind-blowing speed from D.

https://www.reddit.com/r/programming/comments/45v03g/porterstemmerd_an_implementation_of_the_porter/

This is the most recent one where John Colvin provided some pointers to speed it up significantly. Walter has done some good work taking the low-hanging fruit to speed up DMD code and there is a lot of effort going on with reference counting machinery but I wondered if some of the common errors people make that slow down D code can be addressed?

Literals used to be a hidden speed bump, but I think that was improved. Now the append operator is one of the most common culprits; can this not be enhanced behind the scenes to work more like appender? Do others notice common pitfalls between the article code and what the D community then suggests, where we could bridge the gap so naive users get faster code?
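
To make the append-operator point concrete, here is a minimal sketch comparing the built-in `~=` with `std.array.appender`. The function names are made up for illustration and the sketch makes no claim about how large the difference is in practice; it only shows the two idioms side by side.

```d
import std.array : appender;

/// Builds 0, 1, 4, 9, ... with the built-in append operator.
int[] squaresWithConcat(size_t n)
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= cast(int)(i * i); // every ~= asks the runtime whether the array can grow in place
    return result;
}

/// Same result through std.array.appender, which keeps track of capacity itself.
int[] squaresWithAppender(size_t n)
{
    auto app = appender!(int[])();
    app.reserve(n);                 // optional: request room for n elements up front
    foreach (i; 0 .. n)
        app.put(cast(int)(i * i));
    return app.data;
}
```

Both functions return the same array; the appender version is the one usually suggested for loops that append many elements.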
February 15, 2016
On Monday, 15 February 2016 at 13:51:38 UTC, ixid wrote:
> This is the most recent one where John Colvin provided some pointers to speed it up significantly. Walter has done some good work taking the low-hanging fruit to speed up DMD code and there is a lot of effort going on with reference counting machinery but I wondered if some of the common errors people make that slow down D code can be addressed?

Something that annoyed me a bit is floating-point comparisons, DMD does not seem to be able to handle them from SSE registers, it will convert to FPU and do the comparison there IIRC.
February 15, 2016
On Monday, 15 February 2016 at 14:16:02 UTC, Guillaume Piolat wrote:
>
> Something that annoyed me a bit is floating-point comparisons, DMD does not seem to be able to handle them from SSE registers, it will convert to FPU and do the comparison there IIRC.

I feel like this point comes up often, and that a lot of people have argued x87 FP should just not happen anymore.

-Wyatt
February 15, 2016
On Monday, 15 February 2016 at 13:51:38 UTC, ixid wrote:
> Every time there is a D thread on reddit it feels like the new user is expecting mind-blowing speed from D.
>
> [...]

if you want better codegen, don't use dmd.
use ldc, it's usually only a version-ish behind dmd.
February 15, 2016
On Monday, 15 February 2016 at 14:16:02 UTC, Guillaume Piolat wrote:
> On Monday, 15 February 2016 at 13:51:38 UTC, ixid wrote:
>> This is the most recent one where John Colvin provided some pointers to speed it up significantly. Walter has done some good work taking the low-hanging fruit to speed up DMD code and there is a lot of effort going on with reference counting machinery but I wondered if some of the common errors people make that slow down D code can be addressed?
>
> Something that annoyed me a bit is floating-point comparisons, DMD does not seem to be able to handle them from SSE registers, it will convert to FPU and do the comparison there IIRC.

Same for std.math.lround

they use the FPU way, while for float and double it's only one SSE instruction. Typically with 6 functions similar to this one:


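// AMD64 only: `value` arrives in XMM0 and the result is returned in EAX.
int round(float value)
{
    asm
    {
        naked;
        // Converts using the current MXCSR rounding mode (round-to-nearest by default).
        cvtss2si EAX, XMM0;
        ret;
    }
}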

we could get ceil/trunc/round/floor, and almost as easily fmod and hypot.
Classic, but I don't get why they're not in std.math.

Goddamnit, we're in 2016.
February 15, 2016
On Monday, 15 February 2016 at 22:29:00 UTC, Basile B. wrote:
> we could get ceil/trunc/round/floor, and almost as easily fmod and hypot.
> Classic, but I don't get why they're not in std.math.

Seems like you know a lot about the subject, and I know you contributed to phobos before, so how about making a PR for this :)
February 15, 2016
On Monday, 15 February 2016 at 22:29:00 UTC, Basile B. wrote:
> Same for std.math.lround
>
> they use the FPU way, while for float and double it's only one SSE instruction. Typically with 6 functions similar to this one:
>
>
> int round(float value)
> {
>     asm
>     {
>         naked;
>         cvtss2si EAX, XMM0;
>         ret;
>     }
> }
>
> we could get ceil/trunc/round/floor, and almost as easily fmod and hypot.
> Classic, but I don't get why they're not in std.math.
>
> Goddamnit, we're in 2016.

lround and friends have been a big performance problem at times.
Every time you can use cast(int) instead, it's way faster.
February 15, 2016
On Monday, 15 February 2016 at 23:19:44 UTC, Guillaume Piolat wrote:
>
> lround and friends have been a big performance problem at times.
> Every time you can use cast(int) instead, it's way faster.

I didn't know this trick. It generates almost the same SSE instruction (it truncates) and has the advantage of being inlinable.

Is it documented somewhere? If not, it should be.
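
Since the cast truncates rather than rounds, it is only a drop-in replacement when truncation is acceptable. A small sketch of the difference, using `std.math.lrint` and `std.math.round` as stand-ins for the lround family (the wrapper names are invented for this illustration):

```d
import std.math : lrint, round;

/// cast(int) drops the fractional part, i.e. rounds toward zero.
int truncated(double x) { return cast(int) x; }

/// lrint rounds using the current rounding mode (ties go to even by default).
long nearestEven(double x) { return lrint(x); }

/// round sends halfway cases away from zero.
long nearestAway(double x) { return cast(long) round(x); }

unittest
{
    assert(truncated(2.7) == 2 && truncated(-2.7) == -2);
    assert(nearestEven(2.7) == 3 && nearestEven(2.5) == 2);
    assert(nearestAway(2.5) == 3 && nearestAway(-2.5) == -3);
}
```

So `cast(int)` matches the rounding functions only for values with no fractional part; anywhere the halfway or sign behaviour matters, the results differ.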


February 16, 2016
On Monday, 15 February 2016 at 23:35:54 UTC, Basile B. wrote:
> On Monday, 15 February 2016 at 23:19:44 UTC, Guillaume Piolat wrote:
>>
>> lround and friends have been a big performance problem at times.
>> Every time you can use cast(int) instead, it's way faster.
>
> I didn't know this trick. It generates almost the same SSE instruction (it truncates) and has the advantage of being inlinable.
>
> Is it documented somewhere? If not, it should be.

In SSE3 you also get an instruction that does this without messing with the x87 control word: FISTTP.
February 17, 2016
On Monday, 15 February 2016 at 23:13:13 UTC, Jack Stouffer wrote:
> On Monday, 15 February 2016 at 22:29:00 UTC, Basile B. wrote:
>> we could get ceil/trunc/round/floor, and almost as easily fmod and hypot.
>> Classic, but I don't get why they're not in std.math.
>
> Seems like you know a lot about the subject, and I know you contributed to phobos before, so how about making a PR for this :)

In the meantime:

https://github.com/BBasile/iz/blob/master/import/iz/math.d

Actually, when I joined this conversation I didn't remember that it's not good on x86. Using SSE rounding is really only good on AMD64; otherwise loading the input parameter "sucks" a lot (even for a 32-bit float, since it's not directly in EAX or XMM0).

Anyway, it's not good for Phobos. Why? When looking for documentation last night I landed on a post by Walter, who explained that the library for a systems programming language shouldn't be specific to an architecture.
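
One way to keep an architecture-specific fast path out of the public interface is to guard it with a `version` block and fall back to portable code elsewhere. The following is only a sketch under that assumption: `roundToInt` is a hypothetical name, not something from Phobos or from the linked module, and the fallback uses `std.math.lrint`.

```d
import std.math : lrint;

version (D_InlineAsm_X86_64)
{
    /// AMD64 path: the float argument is already in XMM0, the result goes to EAX.
    int roundToInt(float value)
    {
        asm pure nothrow @nogc
        {
            naked;
            cvtss2si EAX, XMM0; // uses the current MXCSR rounding mode
            ret;
        }
    }
}
else
{
    /// Portable path for every other target (hypothetical fallback).
    int roundToInt(float value)
    {
        return cast(int) lrint(value);
    }
}
```

Callers see a single `roundToInt` either way; only the body depends on the target. With the default rounding mode both paths round halfway cases to even.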

