January 17, 2012
On 17 January 2012 08:17, Manu <turkeyman@gmail.com> wrote:
> On 17 January 2012 03:56, Iain Buclaw <ibuclaw@ubuntu.com> wrote:
>>
>> On 16 January 2012 22:36, Iain Buclaw <ibuclaw@ubuntu.com> wrote:
>> > On 16 January 2012 21:57, Peter Alexander <peter.alexander.au@gmail.com> wrote:
>> >> On 16/01/12 8:56 PM, Iain Buclaw wrote:
>> >>>
>> >>> On 16 January 2012 19:25, Walter Bright<newshound2@digitalmars.com>  wrote:
>> >>>>
>> >>>> On 1/16/2012 11:16 AM, Iain Buclaw wrote:
>> >>>>
>> >>>>>>
>> >>>>>> But don't worry, I'm not planning on working on that at the moment :-)
>> >>>>>
>> >>>>>
>> >>>>> Leave that sort of optimisation for the backend to handle please. ;-)
>> >>>>
>> >>>>
>> >>>>
>> >>>> Of course.
>> >>>>
>> >>>> I suspect Intel's compiler does that one, does gcc?
>> >>>>
>> >>>
>> >>> There's auto-vectorisation for for(), foreach(), and foreach_reverse()
>> >>> loops that I have written support for.  I am not aware of GCC
>> >>> vectorising anything else.
>> >>>
>> >>> example:
>> >>>
>> >>> int a[256], b[256], c[256];
>> >>> void foo () {
>> >>>   for (int i=0; i<256; i++)
>> >>>     a[i] = b[i] + c[i];
>> >>> }
>> >>>
>> >>
>> >> Unfortunately, if the function was this:
>> >>
>> >> void foo(int[] a, int[] b, int[] c) {
>> >>
>> >>  for (int i=0; i<256; i++)
>> >>    a[i] = b[i] + c[i];
>> >> }
>> >>
>> >> Then it can't vectorize due to aliasing.
>> >
>> > Compile with -fstrict-aliasing then?
>> >
>> > I could certainly play about with having this enabled by default, but I forsee there may be issues (maybe have it on for @safe code?)
>>
>> OK, have turned on strict aliasing by default for D2.  You should now be able to vectorise loops that use locals and parameters. :-)
>
>
> What protects these ranges from being overlapping? What if they were sourced
> from pointers?
> Are just we to say in D that aliasing is not allowed, and 'you shouldn't do
> it'? People almost never alias intentionally, it's usually the
> most insidious of bugs. :/

D arrays have a .length property that keeps track of the length of the array. When array bounds checking is turned on (default when not compiling with -release) an assert is produced when you step outside the bounds of the array.


Is this what you mean?

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
January 17, 2012
On 17 January 2012 12:33, Walter Bright <newshound2@digitalmars.com> wrote:

> On 1/17/2012 12:17 AM, Manu wrote:
>
>> What protects these ranges from being overlapping?
>>
>
> A runtime check, like array bounds checking.
>

Awesome.
How does this map to pointer dereferencing?


January 17, 2012
On 17 January 2012 11:48, Martin Nowak <dawg@dawgfoto.de> wrote:

> On Tue, 17 Jan 2012 09:20:43 +0100, Manu <turkeyman@gmail.com> wrote:
>
>  On 17 January 2012 05:55, bearophile <bearophileHUGS@lycos.com> wrote:
>>
>>  Walter:
>>>
>>> > But don't worry, I'm not planning on working on that at the moment :-)
>>>
>>> Until better optimizations are implemented, I see a "simple" optimization
>>> for vector ops. When the compiler knows an arrays are very small it
>>> unrolls
>>> the operation in-place:
>>>
>>> int n = 5;
>>> auto a = new int[n];
>>> auto b = new int[n];
>>> a[] += b[];
>>>
>>> ==>
>>>
>>> int n = 5;
>>> auto a = new int[n]; // a and b are dynamic arrays,
>>> auto b = new int[n]; // but their length is easy to find at compile-time
>>> a[0] += b[0];
>>> a[1] += b[1];
>>> a[2] += b[2];
>>> a[3] += b[4];
>>> a[5] += b[5];
>>>
>>>
>> If this doesn't already exist, I think it's quite important. I was
>> thinking
>> about needing to repeatedly specialise a template last night for a bunch
>> of
>> short lengths of arrays, for this exact reason.
>> Unrolling short loops must be one of the most trivial and worthwhile
>> optimisations...
>>
>
> If the compiler knows it's a compile time constant
> thus you could use a static foreach.
>

Great idea! :)


January 17, 2012
On 1/17/2012 2:04 AM, Martin Nowak wrote:
> I was about to rewrite it at some point.
> https://gist.github.com/1235470

I think you've got an innovative and clever solution. I'd like to see you finish it!

January 17, 2012
On Tue, 17 Jan 2012 11:53:35 +0100, Walter Bright <newshound2@digitalmars.com> wrote:

> On 1/17/2012 2:04 AM, Martin Nowak wrote:
>> I was about to rewrite it at some point.
>> https://gist.github.com/1235470
>
> I think you've got an innovative and clever solution. I'd like to see you finish it!
>

Mmh, there was something keeping me from specializing templates,
https://github.com/D-Programming-Language/dmd/pull/396 :).

But right now I'd rather like to finish the shared library merging.
January 17, 2012
On 1/17/2012 5:20 AM, Martin Nowak wrote:
> Mmh, there was something keeping me from specializing templates,
> https://github.com/D-Programming-Language/dmd/pull/396 :).
>
> But right now I'd rather like to finish the shared library merging.

I agree.
January 17, 2012
On 1/17/2012 2:43 AM, Manu wrote:
> On 17 January 2012 12:33, Walter Bright <newshound2@digitalmars.com
> <mailto:newshound2@digitalmars.com>> wrote:
>
>     On 1/17/2012 12:17 AM, Manu wrote:
>
>         What protects these ranges from being overlapping?
>
>
>     A runtime check, like array bounds checking.
>
>
> Awesome.
> How does this map to pointer dereferencing?

It can't. Use dynamic arrays - that's what they're for.
January 17, 2012
On 16/01/12 10:36 PM, Iain Buclaw wrote:
> On 16 January 2012 21:57, Peter Alexander<peter.alexander.au@gmail.com>  wrote:
>> On 16/01/12 8:56 PM, Iain Buclaw wrote:
>>>
>>> On 16 January 2012 19:25, Walter Bright<newshound2@digitalmars.com>
>>>   wrote:
>>>>
>>>> On 1/16/2012 11:16 AM, Iain Buclaw wrote:
>>>>
>>>>>>
>>>>>> But don't worry, I'm not planning on working on that at the moment :-)
>>>>>
>>>>>
>>>>> Leave that sort of optimisation for the backend to handle please. ;-)
>>>>
>>>>
>>>>
>>>> Of course.
>>>>
>>>> I suspect Intel's compiler does that one, does gcc?
>>>>
>>>
>>> There's auto-vectorisation for for(), foreach(), and foreach_reverse()
>>> loops that I have written support for.  I am not aware of GCC
>>> vectorising anything else.
>>>
>>> example:
>>>
>>> int a[256], b[256], c[256];
>>> void foo () {
>>>    for (int i=0; i<256; i++)
>>>      a[i] = b[i] + c[i];
>>> }
>>>
>>
>> Unfortunately, if the function was this:
>>
>> void foo(int[] a, int[] b, int[] c) {
>>
>>   for (int i=0; i<256; i++)
>>     a[i] = b[i] + c[i];
>> }
>>
>> Then it can't vectorize due to aliasing.
>
> Compile with -fstrict-aliasing then?

This has nothing to do with strict aliasing.

a[257];
foo(a[1..257], a[0..256], a[0..256]);

This doesn't break any strict aliasing rule, but the loop still cannot be (trivially) vectorized.

As Manu said, you need something like __restrict (or a linear type system) to solve this problem.

http://en.wikipedia.org/wiki/Linear_type_system
http://en.wikipedia.org/wiki/Uniqueness_typing
January 17, 2012
On 1/17/2012 1:20 PM, Peter Alexander wrote:
> As Manu said, you need something like __restrict (or a linear type system) to
> solve this problem.

No, you don't. It can be done with a runtime check, like array bounds checking is done.
January 17, 2012
On 17/01/12 9:24 PM, Walter Bright wrote:
> On 1/17/2012 1:20 PM, Peter Alexander wrote:
>> As Manu said, you need something like __restrict (or a linear type
>> system) to
>> solve this problem.
>
> No, you don't. It can be done with a runtime check, like array bounds
> checking is done.

So you'd change it to this, even in release builds?

void foo(int[] a, int[] b, int[] c)
{
  if ( /* arrays overlap */ )
  {
    foreach(i; 0..256)
      a[i] = b[i] + c[i];
  }
  else
  {
    /* vectorized code */
  }
}


i.e. duplicate all loops that can be potentially vectorized depending on aliasing? Please bear in mind that this is a simple example.

Seems a bit inefficient (code size).