Good project: stride() with constant stride value
March 04, 2016
Currently we have a very useful stride() function that allows spanning a random access range with a specified step, e.g. 0, 3, 6, 9, ... for step 3.

I've run some measurements recently, and it turns out a compile-time-known stride is a lot faster than a variable one. So I was thinking of improving Stride(R) to take an additional parameter: Stride(R, size_t step = 0). If step is 0, Stride should use a runtime-valued stride, as it does now. If nonzero, Stride should use that compile-time step.

Takers?


Andrei
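
For readers following along, here is a minimal sketch of what the proposal could look like. `StrideSketch` and `ctStep` are hypothetical names used only for illustration; this is not the Phobos implementation, and it makes no claim about which variant is faster.

```d
import std.algorithm.comparison : equal;
import std.range : iota, stride;
import std.range.primitives : empty, front, popFrontN;

// Hypothetical sketch: ctStep == 0 means "the step is a runtime field",
// any other value bakes the step into the type as a constant.
struct StrideSketch(R, size_t ctStep = 0)
{
    private R source;

    static if (ctStep == 0)
    {
        private size_t step; // runtime-valued step

        this(R source, size_t step)
        {
            assert(step != 0, "stride step must be nonzero");
            this.source = source;
            this.step = step;
        }
    }
    else
    {
        private enum size_t step = ctStep; // compile-time constant

        this(R source)
        {
            this.source = source;
        }
    }

    @property bool empty() { return source.empty; }
    @property auto front() { return source.front; }

    // Skip ahead by `step` elements, or to the end if fewer remain.
    void popFront() { source.popFrontN(step); }
}

unittest
{
    // The existing runtime-step function from std.range.
    assert(iota(0, 10).stride(3).equal([0, 3, 6, 9]));

    auto data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
    auto rt = StrideSketch!(int[])(data, 3); // step known only at run time
    auto ct = StrideSketch!(int[], 3)(data); // step is part of the type
    assert(rt.equal([0, 3, 6, 9]));
    assert(ct.equal([0, 3, 6, 9]));
}
```

Both instantiations yield the same elements; the only difference is whether `step` is stored in the struct or fixed in the type.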
March 04, 2016
On Friday, 4 March 2016 at 16:45:42 UTC, Andrei Alexandrescu wrote:
> Currently we have a very useful stride() function that allows spanning a random access range with a specified step, e.g. 0, 3, 6, 9, ... for step 3.
>
> I've run some measurements recently, and it turns out a compile-time-known stride is a lot faster than a variable one. So I was thinking of improving Stride(R) to take an additional parameter: Stride(R, size_t step = 0). If step is 0, Stride should use a runtime-valued stride, as it does now. If nonzero, Stride should use that compile-time step.
>
> Takers?
>
>
> Andrei

Sounds like fun :) - anything special that I should worry/care about?
March 04, 2016
On Friday, 4 March 2016 at 16:45:42 UTC, Andrei Alexandrescu wrote:
> Currently we have a very useful stride() function that allows spanning a random access range with a specified step, e.g. 0, 3, 6, 9, ... for step 3.
>
> I've run some measurements recently, and it turns out a compile-time-known stride is a lot faster than a variable one. So I was thinking of improving Stride(R) to take an additional parameter: Stride(R, size_t step = 0). If step is 0, Stride should use a runtime-valued stride, as it does now. If nonzero, Stride should use that compile-time step.
>
> Takers?
>
>
> Andrei

Surely after inlining (I mean real inlining, not dmd) it makes no difference, a constant is a constant?

I remember doing tests of things like that and finding that not only did it not make a difference to performance, ldc produced near-identical asm either way.
March 04, 2016
On Friday, 4 March 2016 at 17:49:09 UTC, John Colvin wrote:
> Surely after inlining (I mean real inlining, not dmd) it makes no difference, a constant is a constant?
>
> I remember doing tests of things like that and finding that not only did it not make a difference to performance, ldc produced near-identical asm either way.

Then let's not complicate Phobos please. I'm really no friend of special semantics for `step == 0` and stuff like that. Let's keep code as readable and simple as possible, especially in the standard libraries, and let the compilers do their job at optimizing low-level stuff for release builds.
More templates surely impact compilation speed, and that's where DMD shines.
March 04, 2016
On Friday, 4 March 2016 at 18:40:58 UTC, kinke wrote:
>
> Then let's not complicate Phobos please. I'm really no friend of special semantics for `step == 0` and stuff like that. Let's keep code as readable and simple as possible, especially in the standard libraries, and let the compilers do their job at optimizing low-level stuff for release builds.
> More templates surely impact compilation speed, and that's where DMD shines.

Stride is already a template. The compiler would just pick the right template to instantiate. Can't imagine that would be a significant impact on compilation speed.
March 04, 2016
kinke <noone@nowhere.com> wrote:
> On Friday, 4 March 2016 at 17:49:09 UTC, John Colvin wrote:
>> Surely after inlining (I mean real inlining, not dmd) it makes no difference, a constant is a constant?
>> 
>> I remember doing tests of things like that and finding that not only did it not make a difference to performance, ldc produced near-identical asm either way.
> 
> Then let's not complicate Phobos please. I'm really no friend of
> special semantics for `step == 0` and stuff like that. Let's keep
> code as readable and simple as possible, especially in the
> standard libraries, and let the compilers do their job at
> optimizing low-level stuff for release builds.
> More templates surely impact compilation speed, and that's where
> DMD shines.
> 

This is just speculation. When the stride is passed to larger functions the value of the stride is long lost.

I understand the desire for nice and simple code but sadly the stdlib is not a good place for it - everything must be tightly optimized. The value of the project stands. -- Andrei

March 04, 2016
On Friday, 4 March 2016 at 20:14:41 UTC, Andrei Alexandrescu wrote:
> This is just speculation. When the stride is passed to larger functions the value of the stride is long lost.
>
> I understand the desire for nice and simple code but sadly the stdlib is not a good place for it - everything must be tightly optimized. The value of the project stands. -- Andrei

It's easy to implement but isn't this an optimization that LDC/GDC would already do if the stride is known at compile time? Should we be optimizing the standard library for DMD when (speculation but probably true) it's the only one that can't perform such an optimization?
March 04, 2016
On Fri, Mar 04, 2016 at 08:14:41PM +0000, Andrei Alexandrescu via Digitalmars-d wrote:
> kinke <noone@nowhere.com> wrote:
> > On Friday, 4 March 2016 at 17:49:09 UTC, John Colvin wrote:
> >> Surely after inlining (I mean real inlining, not dmd) it makes no difference, a constant is a constant?
> >> 
> >> I remember doing tests of things like that and finding that not only did it not make a difference to performance, ldc produced near-identical asm either way.
> > 
> > Then let's not complicate Phobos please. I'm really no friend of special semantics for `step == 0` and stuff like that. Let's keep code as readable and simple as possible, especially in the standard libraries, and let the compilers do their job at optimizing low-level stuff for release builds.  More templates surely impact compilation speed, and that's where DMD shines.
> > 
> 
> This is just speculation. When the stride is passed to larger functions the value of the stride is long lost.
> 
> I understand the desire for nice and simple code but sadly the stdlib is not a good place for it - everything must be tightly optimized. The value of the project stands. -- Andrei

Why not rather improve dmd optimization, so that such manual optimizations are no longer necessary?


T

March 04, 2016
On Friday, 4 March 2016 at 16:45:42 UTC, Andrei Alexandrescu wrote:
> Currently we have a very useful stride() function that allows spanning a random access range with a specified step, e.g. 0, 3, 6, 9, ... for step 3.
>
> I've run some measurements recently, and it turns out a compile-time-known stride is a lot faster than a variable one. So I was thinking of improving Stride(R) to take an additional parameter: Stride(R, size_t step = 0). If step is 0, Stride should use a runtime-valued stride, as it does now. If nonzero, Stride should use that compile-time step.
>
> Takers?

IMHO, it would be cleaner to make them separate templates so that we don't have to give some special meaning to step == 0. And if it made sense for them to share their implementation, we could still have a helper template that handled the step == 0 case so that it was hidden from the user.

- Jonathan M Davis
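
One way the separate-template idea might look, as a rough sketch: the runtime case stays with the existing `std.range.stride`, and the compile-time case gets its own template that rejects a zero step outright. `StaticStride` and `staticStride` are hypothetical names, not Phobos API, and the shared hidden helper mentioned above is not shown.

```d
import std.algorithm.comparison : equal;
import std.range : stride;
import std.range.primitives : empty, front, isInputRange, popFrontN;

// Hypothetical: a template for the compile-time case only.
// The constraint rejects step == 0, so zero carries no special meaning.
struct StaticStride(R, size_t step)
if (isInputRange!R && step != 0)
{
    private R source;

    @property bool empty() { return source.empty; }
    @property auto front() { return source.front; }

    // Skip ahead by `step` elements, or to the end if fewer remain.
    void popFront() { source.popFrontN(step); }
}

// Hypothetical convenience function so that R is deduced from the argument.
auto staticStride(size_t step, R)(R r)
if (isInputRange!R && step != 0)
{
    return StaticStride!(R, step)(r);
}

unittest
{
    auto data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];

    // Same elements as the existing runtime-step stride.
    assert(staticStride!3(data).equal(data.stride(3)));

    // A zero step does not compile instead of selecting another mode.
    static assert(!__traits(compiles, staticStride!0(data)));
}
```

With this shape the user-facing API never exposes a sentinel value: `stride(r, n)` is the runtime form and `staticStride!n(r)` is the compile-time form.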
March 04, 2016
On 03/04/2016 04:32 PM, Jonathan M Davis wrote:
> IMHO, it would be cleaner to make them separate templates so that we
> don't have to give some special meaning to step == 0. And if it made
> sense for them to share their implementation, we could still have a
> helper template that handled the step == 0 case so that it was hidden from the user.

It's definitely simpler, easier to understand, and less code to specialize for step = 0. I do a lot of that in std.allocator. -- Andrei