More on vectorized comparisons - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » More on vectorized comparisons

Thread overview

More on vectorized comparisons
Aug 23, 2012 bearophile
Aug 23, 2012 Sean Cavanaugh
Aug 23, 2012 bearophile
Aug 24, 2012 Don Clugston
Aug 24, 2012 bearophile
Aug 27, 2012 Don Clugston
Aug 27, 2012 Andrei Alexandrescu
Aug 27, 2012 Peter Alexander
Aug 27, 2012 bearophile
Aug 27, 2012 Peter Alexander
Aug 24, 2012 tn

August 23, 2012

More on vectorized comparisons

Posted by bearophile

bearophile

Some time ago I have suggested to add support to vector comparisons in D, because this is sometimes useful and in the modern SIMD units there is hardware support for such operations:


double[] a = [1.0, 1.0, -1.0, 1.0, 0.0, -1.0];
bool[] t = a[] > 0;
assert(t == [true, true, false, true, false, false]);

Usable instructions like (here shows the intrinsic):
http://msdn.microsoft.com/en-us/library/11dy102s%28v=vs.80%29.aspx


Now on Reddit I have found a small discussion about a slides pack by Intel:
http://www.reddit.com/r/programming/comments/ym8m6/parallel_programming_for_c_and_c_done_right/

The slides:
https://speakerdeck.com/u/multicoreworld/p/james-reinders-intel-united-states

Link to the PDF:
https://speakerd.s3.amazonaws.com/presentations/5006069136af010002005325/Reinders_KEYNOTE_C_done_right.pdf

At page 69 of those slides there is some code that looks interesting, I think this is a reduced version of part of it, that shows another way to use vectorized comparisons:


void main() {
    double[] a = [1.0, 1.0, -1.0, 1.0, 0.0, -1.0];
    double[] b = [10,   20,   30,  40,  50,   60];
    double[] c = [1,     2,    3,   4,   5,    6];
    if (a[] > 0)
        b[] += c[];
}


I think that code is semantically equivalent to:

void main() {
    double[] a = [1.0, 1.0, -1.0, 1.0, 0.0, -1.0];
    double[] b = [10,   20,   30,  40,  50,   60];
    double[] c = [1,     2,    3,   4,   5,    6];
    foreach (i; 0 .. a.length)
        if (a[i] > 0)
            b[i] += c[i];
}


After that code b is:
[11, 22, 30, 44, 50, 60]


This means the contents of the 'then' branch of the vectorized comparison is done only on items of b and c where the comparison has given true.

This looks useful. Is it possible to implement this in D, and do you like it?

Bye,
bearophile

August 23, 2012

Re: More on vectorized comparisons

Posted by Sean Cavanaugh
in reply to bearophile

Sean Cavanaugh

Posted in reply to bearophile

On 8/22/2012 7:19 PM, bearophile wrote:
> Some time ago I have suggested to add support to vector comparisons in
> D, because this is sometimes useful and in the modern SIMD units there
> is hardware support for such operations:
>
>
> I think that code is semantically equivalent to:
>
> void main() {
>      double[] a = [1.0, 1.0, -1.0, 1.0, 0.0, -1.0];
>      double[] b = [10,   20,   30,  40,  50,   60];
>      double[] c = [1,     2,    3,   4,   5,    6];
>      foreach (i; 0 .. a.length)
>          if (a[i] > 0)
>              b[i] += c[i];
> }
>
>
> After that code b is:
> [11, 22, 30, 44, 50, 60]
>
>
> This means the contents of the 'then' branch of the vectorized
> comparison is done only on items of b and c where the comparison has
> given true.
>
> This looks useful. Is it possible to implement this in D, and do you
> like it?

Well, right now the binary operators == != >= <= > and < are required to return bool instead of allowing a user defined type, which prevents a lot of the sugar you would want to make the code nice to write.  Without the sugar the code would ends up this:

foreach(i; 0 .. a.length)
{
    float4 mask = greaterThan(a[i], float4(0,0,0,0));
    b[i] = select(mask, b[i] + c[i], b[i]);
}

in GPU shader land this expression is at least simpler to write:

foreach(i; 0 .. a.length)
{
    b[i] = (b[i] > 0) ? (b[i] + c[i]) : b[i];
}

All of these implementations are equivalent and remove the branch from the code flow, which is pretty nice for the CPU pipeline.   In SIMD the comparisons generate masks into a register which you can immediately use.  On modern (SSE4) CPUs the select is a single instruction, on older ones it takes three: (mask & A) | (~mask & B), but its all better than a real branch.

If you have a large amount of code needing a branch, you can take the mask generated by the compare, and extract it into a CPU register, and compare it for 0, nonzero, specific or any bits set.  a float4 comparison ends up generating 4 bits, so the code with a real branch is like:

if (any(a[i] > 0))
{
    // do stuff if any of a[i] are greater than zero
}	
if (all(a[i] > 0))
{
    // do stuff if all of a[i] are greater than zero
}
if ((getMask(a[i] > 0) & 0x7) == 0x7)
{
    // do stuff if the first three elements are greater than zero
}

August 23, 2012

Re: More on vectorized comparisons

Posted by bearophile
in reply to Sean Cavanaugh

bearophile

Posted in reply to Sean Cavanaugh

Sean Cavanaugh:

> Well, right now the binary operators == != >= <= > and < are required to return bool instead of allowing a user defined type, which prevents a lot of the sugar you would want to make the code nice to write.

The hypothetical D sugar I was looking for is this, where 'a', 'b' and 'c' are normal dynamic arrays of doubles (not of float[4] of double[2]) (currently this code is a syntax error):

if (a[] > 0)
    b[] += c[];


The front-end is able to implement those two lines of code as it likes, like seeing those normal arrays as arrays of double[2] (or double[4] on more modern CPUs) and put there all the needed intrinsics or assembly needed to implement that semantics.

So what's the problem the > operator causes in this code?

Bye,
bearophile

August 24, 2012

Re: More on vectorized comparisons

Posted by Don Clugston
in reply to bearophile

Don Clugston

Posted in reply to bearophile

On 24/08/12 00:13, bearophile wrote:
> Sean Cavanaugh:
>
>> Well, right now the binary operators == != >= <= > and < are required
>> to return bool instead of allowing a user defined type, which prevents
>> a lot of the sugar you would want to make the code nice to write.
>
> The hypothetical D sugar I was looking for is this, where 'a', 'b' and
> 'c' are normal dynamic arrays of doubles (not of float[4] of double[2])
> (currently this code is a syntax error):
>
> if (a[] > 0)
>      b[] += c[];
>
>
> The front-end is able to implement those two lines of code as it likes,
> like seeing those normal arrays as arrays of double[2] (or double[4] on
> more modern CPUs) and put there all the needed intrinsics or assembly
> needed to implement that semantics.
>
> So what's the problem the > operator causes in this code?
>
> Bye,
> bearophile

It's just syntax sugar for a very obscure operation, and it's somewhat ambiguous -- is it allowed to use short-circuit evaluation?
Mathematically, it doesn't make sense. You can compare scalars, but ordered comparison of vectors is a bit nonsensical, unless it is element-wise.
Usually, a[] > 0, a[] < 0, and a[] == 0 will all be false.

Most likely, you really meant  dot(a[]) > 0.

Something like

if ( all( a[] > 0 ) )
    b[] += c[];

is more reasonable. But an implicit 'reduce' in a vector operation has little to commend it, I think.

August 24, 2012

Re: More on vectorized comparisons

Posted by bearophile
in reply to Don Clugston

bearophile

Posted in reply to Don Clugston

>It's just syntax sugar for a very obscure operation,<

It's an operation included in Cilk Plus, I think Intel devs know
enough what they are doing.
And I think code like this is a common need:

if (a[] > 0) {
     // ...
}


>and it's somewhat ambiguous -- is it allowed to use short-circuit evaluation?<

That code means:

foreach (i; 0 .. a.length) {
     if (a[i] > 0) {
         // ...
     }
}

Here I don't see problems caused by short circuit evaluation.


>You can compare scalars, but ordered comparison of vectors is a bit nonsensical, unless it is element-wise.<

It's a comparison element-wise between each item and a constant,
it's similar to:

a[] = a[] + 5;

Probably you have misunderstood the semantics of what I am
discussing.

Bye,
bearophile

August 24, 2012

Re: More on vectorized comparisons

Posted by tn
in reply to bearophile

tn

Posted in reply to bearophile

On Thursday, 23 August 2012 at 00:19:39 UTC, bearophile wrote:
> At page 69 of those slides there is some code that looks interesting, I think this is a reduced version of part of it, that shows another way to use vectorized comparisons:
>
>
> void main() {
>     double[] a = [1.0, 1.0, -1.0, 1.0, 0.0, -1.0];
>     double[] b = [10,   20,   30,  40,  50,   60];
>     double[] c = [1,     2,    3,   4,   5,    6];
>     if (a[] > 0)
>         b[] += c[];
> }
>
>
> I think that code is semantically equivalent to:
>
> void main() {
>     double[] a = [1.0, 1.0, -1.0, 1.0, 0.0, -1.0];
>     double[] b = [10,   20,   30,  40,  50,   60];
>     double[] c = [1,     2,    3,   4,   5,    6];
>     foreach (i; 0 .. a.length)
>         if (a[i] > 0)
>             b[i] += c[i];
> }

The proposed syntax looks weird. Wouldn't the followind be more intuitive:

foreach (i; a[i] > 0)
    b[i] += c[i];

Or alternatively it would be nice to be able to do it like in Matlab:

i = (a[] > 0);
b[i] += c[i];

August 27, 2012

Re: More on vectorized comparisons

Posted by Don Clugston
in reply to bearophile

Don Clugston

Posted in reply to bearophile

On 24/08/12 15:57, bearophile wrote:
>> It's just syntax sugar for a very obscure operation,<
>
> It's an operation included in Cilk Plus, I think Intel devs know
> enough what they are doing.
> And I think code like this is a common need:
>
> if (a[] > 0) {
>       // ...
> }
>> and it's somewhat ambiguous -- is it allowed to use short-circuit
>> evaluation?<
>
> That code means:
>
> foreach (i; 0 .. a.length) {
>       if (a[i] > 0) {
>           // ...
>       }
> }
>
> Here I don't see problems caused by short circuit evaluation.

Wow. Then I misunderstood, and it's even less appropriate for D than I thought.

vote -= int.max;

Worst syntax I've seen in a very long time.

August 27, 2012

Re: More on vectorized comparisons

Posted by Andrei Alexandrescu
in reply to Don Clugston

Andrei Alexandrescu

Posted in reply to Don Clugston

On 8/27/12 8:06 AM, Don Clugston wrote:
> vote -= int.max;

Beware of wraparound :o).

Andrei

August 27, 2012

Re: More on vectorized comparisons

Posted by Peter Alexander
in reply to bearophile

Peter Alexander

Posted in reply to bearophile

On Friday, 24 August 2012 at 13:57:42 UTC, bearophile wrote:
> That code means:
>
> foreach (i; 0 .. a.length) {
>      if (a[i] > 0) {
>          // ...
>      }
> }

How could this possibly be useful? It's like the loop, but you lose the index variable. I can't see how you could possibly do anything with this.

Can you show an example of some code that uses this?

August 27, 2012

Re: More on vectorized comparisons

Posted by bearophile
in reply to Peter Alexander

bearophile

Posted in reply to Peter Alexander

Peter Alexander:

> How could this possibly be useful? It's like the loop, but you lose the index variable. I can't see how you could possibly do anything with this.

I think this feature is a part of Cilk+. Maybe I have not fully understood this feature, or maybe some Intel developers are mad :-)

I think in code like this:

if (a[] >= 0)
    b[] += c[];

The 'b' and 'c' arrays receive the implicit index of the items of 'a' that aren't negative.

I think its semantics is a bit like this (assuming zip() supports ref iteration), you see no index variable here:

parallel_foreach (ai, ref bi, ci; zip(a, b, c))
    if (ai > 0)
        bi += ci;


I think it's something commonly useful.


> Can you show an example of some code that uses this?

On the net I have found two examples of Cilk+ code that uses that feature. One of them is referenced in the first email of this thread. With Google you can find the full code from Intel that piece of code comes from.

Bye,
bearophile

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation