January 17, 2012 Re: SIMD benchmark | ||||
---|---|---|---|---|
| ||||
On 17 January 2012 08:17, Manu <turkeyman@gmail.com> wrote: > On 17 January 2012 03:56, Iain Buclaw <ibuclaw@ubuntu.com> wrote: >> >> On 16 January 2012 22:36, Iain Buclaw <ibuclaw@ubuntu.com> wrote: >> > On 16 January 2012 21:57, Peter Alexander <peter.alexander.au@gmail.com> wrote: >> >> On 16/01/12 8:56 PM, Iain Buclaw wrote: >> >>> >> >>> On 16 January 2012 19:25, Walter Bright<newshound2@digitalmars.com> wrote: >> >>>> >> >>>> On 1/16/2012 11:16 AM, Iain Buclaw wrote: >> >>>> >> >>>>>> >> >>>>>> But don't worry, I'm not planning on working on that at the moment :-) >> >>>>> >> >>>>> >> >>>>> Leave that sort of optimisation for the backend to handle please. ;-) >> >>>> >> >>>> >> >>>> >> >>>> Of course. >> >>>> >> >>>> I suspect Intel's compiler does that one, does gcc? >> >>>> >> >>> >> >>> There's auto-vectorisation for for(), foreach(), and foreach_reverse() >> >>> loops that I have written support for. I am not aware of GCC >> >>> vectorising anything else. >> >>> >> >>> example: >> >>> >> >>> int a[256], b[256], c[256]; >> >>> void foo () { >> >>> for (int i=0; i<256; i++) >> >>> a[i] = b[i] + c[i]; >> >>> } >> >>> >> >> >> >> Unfortunately, if the function was this: >> >> >> >> void foo(int[] a, int[] b, int[] c) { >> >> >> >> for (int i=0; i<256; i++) >> >> a[i] = b[i] + c[i]; >> >> } >> >> >> >> Then it can't vectorize due to aliasing. >> > >> > Compile with -fstrict-aliasing then? >> > >> > I could certainly play about with having this enabled by default, but I forsee there may be issues (maybe have it on for @safe code?) >> >> OK, have turned on strict aliasing by default for D2. You should now be able to vectorise loops that use locals and parameters. :-) > > > What protects these ranges from being overlapping? What if they were sourced > from pointers? > Are just we to say in D that aliasing is not allowed, and 'you shouldn't do > it'? People almost never alias intentionally, it's usually the > most insidious of bugs. :/ D arrays have a .length property that keeps track of the length of the array. When array bounds checking is turned on (default when not compiling with -release) an assert is produced when you step outside the bounds of the array. Is this what you mean? -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0'; |
January 17, 2012 Re: SIMD benchmark | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright Attachments:
| On 17 January 2012 12:33, Walter Bright <newshound2@digitalmars.com> wrote:
> On 1/17/2012 12:17 AM, Manu wrote:
>
>> What protects these ranges from being overlapping?
>>
>
> A runtime check, like array bounds checking.
>
Awesome.
How does this map to pointer dereferencing?
|
January 17, 2012 Re: SIMD benchmark | ||||
---|---|---|---|---|
| ||||
Attachments:
| On 17 January 2012 11:48, Martin Nowak <dawg@dawgfoto.de> wrote:
> On Tue, 17 Jan 2012 09:20:43 +0100, Manu <turkeyman@gmail.com> wrote:
>
> On 17 January 2012 05:55, bearophile <bearophileHUGS@lycos.com> wrote:
>>
>> Walter:
>>>
>>> > But don't worry, I'm not planning on working on that at the moment :-)
>>>
>>> Until better optimizations are implemented, I see a "simple" optimization
>>> for vector ops. When the compiler knows an arrays are very small it
>>> unrolls
>>> the operation in-place:
>>>
>>> int n = 5;
>>> auto a = new int[n];
>>> auto b = new int[n];
>>> a[] += b[];
>>>
>>> ==>
>>>
>>> int n = 5;
>>> auto a = new int[n]; // a and b are dynamic arrays,
>>> auto b = new int[n]; // but their length is easy to find at compile-time
>>> a[0] += b[0];
>>> a[1] += b[1];
>>> a[2] += b[2];
>>> a[3] += b[4];
>>> a[5] += b[5];
>>>
>>>
>> If this doesn't already exist, I think it's quite important. I was
>> thinking
>> about needing to repeatedly specialise a template last night for a bunch
>> of
>> short lengths of arrays, for this exact reason.
>> Unrolling short loops must be one of the most trivial and worthwhile
>> optimisations...
>>
>
> If the compiler knows it's a compile time constant
> thus you could use a static foreach.
>
Great idea! :)
|
January 17, 2012 Re: SIMD benchmark | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | On 1/17/2012 2:04 AM, Martin Nowak wrote:
> I was about to rewrite it at some point.
> https://gist.github.com/1235470
I think you've got an innovative and clever solution. I'd like to see you finish it!
|
January 17, 2012 Re: SIMD benchmark | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Tue, 17 Jan 2012 11:53:35 +0100, Walter Bright <newshound2@digitalmars.com> wrote: > On 1/17/2012 2:04 AM, Martin Nowak wrote: >> I was about to rewrite it at some point. >> https://gist.github.com/1235470 > > I think you've got an innovative and clever solution. I'd like to see you finish it! > Mmh, there was something keeping me from specializing templates, https://github.com/D-Programming-Language/dmd/pull/396 :). But right now I'd rather like to finish the shared library merging. |
January 17, 2012 Re: SIMD benchmark | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | On 1/17/2012 5:20 AM, Martin Nowak wrote:
> Mmh, there was something keeping me from specializing templates,
> https://github.com/D-Programming-Language/dmd/pull/396 :).
>
> But right now I'd rather like to finish the shared library merging.
I agree.
|
January 17, 2012 Re: SIMD benchmark | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On 1/17/2012 2:43 AM, Manu wrote:
> On 17 January 2012 12:33, Walter Bright <newshound2@digitalmars.com
> <mailto:newshound2@digitalmars.com>> wrote:
>
> On 1/17/2012 12:17 AM, Manu wrote:
>
> What protects these ranges from being overlapping?
>
>
> A runtime check, like array bounds checking.
>
>
> Awesome.
> How does this map to pointer dereferencing?
It can't. Use dynamic arrays - that's what they're for.
|
January 17, 2012 Re: SIMD benchmark | ||||
---|---|---|---|---|
| ||||
Posted in reply to Iain Buclaw | On 16/01/12 10:36 PM, Iain Buclaw wrote: > On 16 January 2012 21:57, Peter Alexander<peter.alexander.au@gmail.com> wrote: >> On 16/01/12 8:56 PM, Iain Buclaw wrote: >>> >>> On 16 January 2012 19:25, Walter Bright<newshound2@digitalmars.com> >>> wrote: >>>> >>>> On 1/16/2012 11:16 AM, Iain Buclaw wrote: >>>> >>>>>> >>>>>> But don't worry, I'm not planning on working on that at the moment :-) >>>>> >>>>> >>>>> Leave that sort of optimisation for the backend to handle please. ;-) >>>> >>>> >>>> >>>> Of course. >>>> >>>> I suspect Intel's compiler does that one, does gcc? >>>> >>> >>> There's auto-vectorisation for for(), foreach(), and foreach_reverse() >>> loops that I have written support for. I am not aware of GCC >>> vectorising anything else. >>> >>> example: >>> >>> int a[256], b[256], c[256]; >>> void foo () { >>> for (int i=0; i<256; i++) >>> a[i] = b[i] + c[i]; >>> } >>> >> >> Unfortunately, if the function was this: >> >> void foo(int[] a, int[] b, int[] c) { >> >> for (int i=0; i<256; i++) >> a[i] = b[i] + c[i]; >> } >> >> Then it can't vectorize due to aliasing. > > Compile with -fstrict-aliasing then? This has nothing to do with strict aliasing. a[257]; foo(a[1..257], a[0..256], a[0..256]); This doesn't break any strict aliasing rule, but the loop still cannot be (trivially) vectorized. As Manu said, you need something like __restrict (or a linear type system) to solve this problem. http://en.wikipedia.org/wiki/Linear_type_system http://en.wikipedia.org/wiki/Uniqueness_typing |
January 17, 2012 Re: SIMD benchmark | ||||
---|---|---|---|---|
| ||||
Posted in reply to Peter Alexander | On 1/17/2012 1:20 PM, Peter Alexander wrote:
> As Manu said, you need something like __restrict (or a linear type system) to
> solve this problem.
No, you don't. It can be done with a runtime check, like array bounds checking is done.
|
January 17, 2012 Re: SIMD benchmark | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 17/01/12 9:24 PM, Walter Bright wrote:
> On 1/17/2012 1:20 PM, Peter Alexander wrote:
>> As Manu said, you need something like __restrict (or a linear type
>> system) to
>> solve this problem.
>
> No, you don't. It can be done with a runtime check, like array bounds
> checking is done.
So you'd change it to this, even in release builds?
void foo(int[] a, int[] b, int[] c)
{
if ( /* arrays overlap */ )
{
foreach(i; 0..256)
a[i] = b[i] + c[i];
}
else
{
/* vectorized code */
}
}
i.e. duplicate all loops that can be potentially vectorized depending on aliasing? Please bear in mind that this is a simple example.
Seems a bit inefficient (code size).
|
Copyright © 1999-2021 by the D Language Foundation