February 17, 2016
On Wednesday, 17 February 2016 at 18:26:47 UTC, Basile B. wrote:
> Anyway, not good for Phobos, why? While looking for documentation last night, I landed on a post by Walter explaining that the library for a systems programming language shouldn't be specific to an architecture.

While I don't know which post you're talking about, I don't think what Walter says applies to internal version blocks in a function. You could make lround and friends much faster on AMD by using those ASM routines. Also, std.math is already chock-full of architecture-specific code.
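To illustrate what an internal version block looks like, here is a minimal sketch. The names `toInt` and `conversionPath` are hypothetical and not part of Phobos, and the X86_64 branch only marks where an architecture-specific routine could go; it does not contain one.

```d
// Hypothetical helper: reports which branch the version block selected.
string conversionPath()
{
    version (X86_64)
        return "x86_64 branch";
    else version (X86)
        return "x86 branch";
    else
        return "generic branch";
}

// Hypothetical wrapper: the public signature is identical on every target,
// only the body differs per architecture.
int toInt(double x)
{
    version (X86_64)
    {
        // An SSE-specific routine (for example inline asm) could be placed
        // here. This sketch simply uses the portable cast.
        return cast(int) x;
    }
    else
    {
        // Portable fallback for every other architecture.
        return cast(int) x;
    }
}

void main()
{
    assert(toInt(2.9) == 2);   // cast(int) truncates toward zero
    assert(toInt(-2.9) == -2);
    assert(conversionPath().length > 0);
}
```

Callers never see the version block, which is why it does not make the library's API architecture-specific.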
February 17, 2016
On Wednesday, 17 February 2016 at 18:50:45 UTC, Jack Stouffer wrote:
> On Wednesday, 17 February 2016 at 18:26:47 UTC, Basile B. wrote:
>> Anyway, not good for Phobos, why? While looking for documentation last night, I landed on a post by Walter explaining that the library for a systems programming language shouldn't be specific to an architecture.
>
> While I don't know which post you're talking about, I don't think what Walter says applies to internal version blocks in a function. You could make lround and friends much faster on AMD by using those ASM routines. Also, std.math is already chock-full of architecture-specific code.

It's more subtle than that.

The oldest 64-bit processor (AMD64) always supports SSE. So when we do "cast(int) 0.1;" on X86_64, the backend always generates SSE instructions.

The oldest 32-bit processor (X86) doesn't support SSE, maybe MMX (not sure). So when we do "cast(int) 0.1;" on X86, the backend always generates FPU instructions.

This is how I understand the post I landed on.
My current work always uses SSE, so it doesn't conform to the "at least available" feature requirement.
February 17, 2016
On Wednesday, 17 February 2016 at 19:01:38 UTC, Basile B. wrote:
> It's more subtle than that.
>
> The oldest 64-bit processor (AMD64) always supports SSE. So when we do "cast(int) 0.1;" on X86_64, the backend always generates SSE instructions.
>
> The oldest 32-bit processor (X86) doesn't support SSE, maybe MMX (not sure). So when we do "cast(int) 0.1;" on X86, the backend always generates FPU instructions.
>
> This is how I understand the post I landed on.
> My current work always uses SSE, so it doesn't conform to the "at least available" feature requirement.

Also, I forgot to say, but a uniform API is needed to set the rounding mode, whether SSE or the FPU is used...
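For reference, Phobos already has a scoped rounding-mode setter, std.math.FloatingPointControl. The sketch below shows how it is used. Whether it updates both the x87 control word and the SSE MXCSR on a given compiler and target is an assumption here, not something this thread establishes. The helper names `lrintDown` and `lrintUp` are hypothetical.

```d
import std.math : FloatingPointControl, lrint;

// Hypothetical helper: rounds x toward negative infinity.
long lrintDown(double x)
{
    FloatingPointControl fpctrl;   // saves the current control state
    fpctrl.rounding = FloatingPointControl.roundDown;
    return lrint(x);               // lrint honours the current rounding mode
}   // the previous rounding mode is restored when fpctrl goes out of scope

// Hypothetical helper: rounds x toward positive infinity.
long lrintUp(double x)
{
    FloatingPointControl fpctrl;
    fpctrl.rounding = FloatingPointControl.roundUp;
    return lrint(x);
}

void main()
{
    assert(lrintDown(1.5) == 1);
    assert(lrintUp(2.5) == 3);
    // Back in the default mode (round to nearest, ties to even).
    assert(lrint(0.5) == 0);
}
```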
February 21, 2016
I think it's important that DMD gets more of the easier optimisations. Most new users won't bother trying GDC or LDC, and if DMD doesn't generate fast enough code, they might leave before they try the compilers with better optimisations.
February 21, 2016
On Wednesday, 17 February 2016 at 19:01:38 UTC, Basile B. wrote:
> The oldest 32-bit processor (X86) doesn't support SSE, maybe MMX (not sure). So when we do "cast(int) 0.1;" on X86, the backend always generates FPU instructions.

SSE goes back to the Pentium III, doesn't it? And the Pentium 4 supported SSE3, didn't it? Is it an actual requirement for D to run on, e.g., a Pentium II?
March 08, 2016
On Wed, 17 Feb 2016 19:55:08 +0000, Basile B. <b2.temp@gmx.com> wrote:

> Also, I forgot to say, but a uniform API is needed to set the rounding mode, whether SSE or the FPU is used...

At least GCC has a codegen switch for that (-mfpmath). A solution would have to either set both rounding modes at once, or the compilers would need to expose version identifiers such as MathFPU/MathSSE.

-- 
Marco

March 08, 2016
On Monday, 15 February 2016 at 13:51:38 UTC, ixid wrote:
> Every time there is a D thread on reddit it feels like the new user is expecting mind-blowing speed from D.
>
> https://www.reddit.com/r/programming/comments/45v03g/porterstemmerd_an_implementation_of_the_porter/
>
> This is the most recent one where John Colvin provided some pointers to speed it up significantly. Walter has done some good work taking the low-hanging fruit to speed up DMD code and there is a lot of effort going on with reference counting machinery but I wondered if some of the common errors people make that slow down D code can be addressed?
>
> Literals used to be a hidden speed bump, but I think that was improved. Now the append operator is one of the most common culprits; can this not be enhanced behind the scenes to work more like append? Do others notice common pitfalls between the article code and what the D community then suggests, where we can bridge the gap so naive users get faster code?

Since I posted this thread I've learned that std.algorithm.sum is 4 times slower than a naive loop sum. Even if this is for reasons of accuracy, this is exactly what I am talking about: a hidden iceberg of terrible performance that will reflect poorly on D. That's so slow the function needs a health warning.
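For concreteness, the two approaches being compared look roughly like this. It is a sketch only: `naiveSum` is a hypothetical name, and the "4 times slower" figure comes from the measurement mentioned above, not from this code. According to the Phobos documentation, `sum` uses pairwise summation for floating-point ranges with random access, which is the accuracy trade-off in question.

```d
import std.algorithm.iteration : sum;

// Hypothetical helper: the plain accumulation loop most newcomers would write.
double naiveSum(const(double)[] values)
{
    double total = 0.0;
    foreach (v; values)
        total += v;
    return total;
}

void main()
{
    auto data = [1.0, 2.0, 3.0, 4.0];

    // Both give the same answer on small, exactly representable inputs.
    assert(naiveSum(data) == 10.0);
    assert(data.sum == 10.0);
}
```

On long inputs with values of mixed magnitude, the two can differ in the last bits, because the loop accumulates rounding error in a different order.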
March 08, 2016
On Tuesday, 8 March 2016 at 14:14:25 UTC, ixid wrote:
>
> Since I posted this thread I've learned that std.algorithm.sum is 4 times slower than a naive loop sum. Even if this is for reasons of accuracy, this is exactly what I am talking about: a hidden iceberg of terrible performance that will reflect poorly on D. That's so slow the function needs a health warning.

There was a longer discussion here:
https://forum.dlang.org/post/vkiwojmfjrwhigbkenaa@forum.dlang.org

March 09, 2016
On 03/08/2016 09:14 AM, ixid wrote:
> Since I posted this thread I've learned that std.algorithm.sum is 4 times
> slower than a naive loop sum. Even if this is for reasons of accuracy,
> this is exactly what I am talking about: a hidden iceberg of
> terrible performance that will reflect poorly on D. That's so slow the
> function needs a health warning.

Whoa. What's happening there? Do we have anyone on it? -- Andrei

March 09, 2016

On 9.3.2016 at 14:26, Andrei Alexandrescu via Digitalmars-d wrote:
> On 03/08/2016 09:14 AM, ixid wrote:
>> Since I posted this thread I've learned that std.algorithm.sum is 4 times
>> slower than a naive loop sum. Even if this is for reasons of accuracy,
>> this is exactly what I am talking about: a hidden iceberg of
>> terrible performance that will reflect poorly on D. That's so slow the
>> function needs a health warning.
>
> Whoa. What's happening there? Do we have anyone on it? -- Andrei
>
I guess he is talking about this one:

http://forum.dlang.org/post/mailman.4748.1456070484.22025.digitalmars-d-learn@puremagic.com