Any usable SIMD implementation?
March 31, 2016
I'm currently working on a templated arrayop implementation (using RPN
to encode ASTs).
So far things worked out great, but now I got stuck b/c apparently none
of the D compilers has a working SIMD implementation (maybe GDC has but
it's very difficult to work w/ the 2.066 frontend).

https://github.com/MartinNowak/druntime/blob/arrayOps/src/core/internal/arrayop.d https://github.com/MartinNowak/dmd/blob/arrayOps/src/arrayop.d

I don't want to do anything fancy, just unaligned loads, stores, and integral mul/div. Is this really the current state of SIMD or am I missing sth.?
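For readers who have not seen the idea before, here is a rough sketch of what an RPN-encoded array op could look like. It is not the code from the linked branches: the names `evalAt` and `arrayOp` are made up, operand tokens are assumed to be single digits indexing into the argument list, and it relies on `static foreach`, which is newer than this post.

```d
// Hypothetical sketch: an array expression encoded in RPN as a template argument.
// Operand tokens are single digits ("0", "1", ...) naming an argument;
// operator tokens are "+", "-", "*", "/".

// Evaluates the RPN expression for one element index i.
T evalAt(T, string[] rpn, Args...)(size_t i, Args args)
{
    T[rpn.length] stack; // cannot overflow: at most one push per token
    size_t sp;
    static foreach (tok; rpn)
    {
        static if (tok == "+" || tok == "-" || tok == "*" || tok == "/")
        {
            // binary operator: combine the two topmost entries, pop one
            stack[sp - 2] = mixin("cast(T)(stack[sp - 2] " ~ tok ~ " stack[sp - 1])");
            --sp;
        }
        else
        {
            // operand: push element i of the referenced argument
            stack[sp++] = args[tok[0] - '0'][i];
        }
    }
    return stack[0];
}

// Applies the expression element-wise and writes the result to dest.
void arrayOp(string[] rpn, T, Args...)(T[] dest, Args args)
{
    foreach (i; 0 .. dest.length)
        dest[i] = evalAt!(T, rpn)(i, args);
}
```

With that, `arrayOp!(["0", "1", "*", "2", "+"])(dest, a, b, c)` would compute `dest[] = a[] * b[] + c[]` one element at a time. A real implementation would process several elements per iteration with vector loads and stores, which is exactly the part that depends on working SIMD support.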

-Martin
March 31, 2016
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote:
> I'm currently working on a templated arrayop implementation (using RPN
> to encode ASTs).
> So far things worked out great, but now I got stuck b/c apparently none
> of the D compilers has a working SIMD implementation (maybe GDC has but
> it's very difficult to work w/ the 2.066 frontend).
>
> https://github.com/MartinNowak/druntime/blob/arrayOps/src/core/internal/arrayop.d https://github.com/MartinNowak/dmd/blob/arrayOps/src/arrayop.d
>
> I don't want to do anything fancy, just unaligned loads, stores, and integral mul/div. Is this really the current state of SIMD or am I missing sth.?
>
> -Martin

I don't know how far Ilya's work [1] has advanced, but you may want to join forces with him. There are also two std.simd packages [2] [3].

BTW, I looked at your code a couple of days ago and thought it was a really interesting approach to encoding operations. I'm just wondering whether pursuing this approach is a good idea in the long run, i.e. is it expressive enough to cover the use cases of HPC, which would need something similar but for custom linear algebra types?

Here's an interesting video about approaches to solving this problem in C++: https://www.youtube.com/watch?v=hfn0BVOegac

[1]: http://forum.dlang.org/post/nilhvnqbsgqhxdshpqfl@forum.dlang.org

[2]: https://github.com/D-Programming-Language/phobos/pull/2862

[3]: https://github.com/Iakh/simd
March 31, 2016
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote:
> I'm currently working on a templated arrayop implementation (using RPN
> to encode ASTs).
> So far things worked out great, but now I got stuck b/c apparently none
> of the D compilers has a working SIMD implementation (maybe GDC has but
> it's very difficult to work w/ the 2.066 frontend).
>
> https://github.com/MartinNowak/druntime/blob/arrayOps/src/core/internal/arrayop.d https://github.com/MartinNowak/dmd/blob/arrayOps/src/arrayop.d
>
> I don't want to do anything fancy, just unaligned loads, stores, and integral mul/div. Is this really the current state of SIMD or am I missing sth.?
>
> -Martin

Am I being stupid or is core.simd what you want?
March 31, 2016
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote:
>
> I don't want to do anything fancy, just unaligned loads, stores, and integral mul/div. Is this really the current state of SIMD or am I missing sth.?

I think you want to write your code using SIMD primitives.
But in case you want the compiler to generate SIMD instructions, perhaps @ldc.attributes.target may help you:
http://wiki.dlang.org/LDC-specific_language_changes#.40.28ldc.attributes.target.28.22feature.22.29.29

I have not checked what LDC does with SIMD under the default command-line parameters.

Cheers,
  Johan

March 31, 2016
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote:
> I'm currently working on a templated arrayop implementation (using RPN
> to encode ASTs).
> So far things worked out great, but now I got stuck b/c apparently none
> of the D compilers has a working SIMD implementation (maybe GDC has but
> it's very difficult to work w/ the 2.066 frontend).
>
> https://github.com/MartinNowak/druntime/blob/arrayOps/src/core/internal/arrayop.d https://github.com/MartinNowak/dmd/blob/arrayOps/src/arrayop.d
>
> I don't want to do anything fancy, just unaligned loads, stores, and integral mul/div. Is this really the current state of SIMD or am I missing sth.?
>
> -Martin

Unfortunately mine (https://github.com/Iakh/simd) is far from
production code. For now I'm trying to figure out an interface common
to all archs/compilers. And it's more about SIMD comparison operations.

You could do loads, stores and mul with the default D SIMD support,
but not int div.
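As a hedged illustration of the built-in support mentioned above: the snippet below assumes an x86_64 target where `core.simd` provides `float4`, and the helper name `mulAdd` is made up. It only shows element-wise float arithmetic on vector types; it does not cover unaligned access or integer division.

```d
import core.simd;

// Hypothetical helper: element-wise a * b + c on four floats at once.
// Assumes the target supports the float4 vector type (e.g. x86_64 with SSE).
float4 mulAdd(float4 a, float4 b, float4 c)
{
    return a * b + c;
}
```

The individual lanes of a result stored in a variable `r` can be read back through `r.array`, which exposes the vector as a `float[4]`.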
April 02, 2016
On 03/31/2016 10:55 AM, ZombineDev wrote:
> [2]: https://github.com/D-Programming-Language/phobos/pull/2862

Well apparently stores w/ dmd's weird core.simd interface don't work, or I can't figure out (from the non-existent documentation) how to use it.

---
import core.simd;

void test(float4* ptr, float4 val)
{
    __simd_sto(XMM.STOUPS, *ptr, val);
    __simd(XMM.STOUPS, *ptr, val);
    auto val1 = __simd_sto(XMM.STOUPS, *ptr, val);
    auto val2 = __simd(XMM.STOUPS, *ptr, val);
}
---

LDC at least has some intrinsics once you find ldc.gccbuiltins_x86, but for some reason comes with its own broken ldc.simd.loadUnaligned instead of providing intrinsics.

---
import core.simd, ldc.simd;

float4 test(float* ptr)
{
    return loadUnaligned!float4(ptr);
}
---

/home/dawg/dlang/ldc-0.17.1/bin/../import/ldc/simd.di(212): Error: can't
parse inline LLVM IR:
        %r = load <4 x float>* %p, align 1
                               ^
expected comma after load's type

So are 3 different untested and unused APIs really the current state of SIMD?

-Martin

April 02, 2016
On 2 Apr 2016 12:40 am, "Martin Nowak via Digitalmars-d" < digitalmars-d@puremagic.com> wrote:
>
> On 03/31/2016 10:55 AM, ZombineDev wrote:
> > [2]: https://github.com/D-Programming-Language/phobos/pull/2862
>
> Well apparently stores w/ dmd's weird core.simd interface don't work, or I can't figure out (from the non-existent documentation) how to use it.
>
> ---
> import core.simd;
>
> void test(float4* ptr, float4 val)
> {
>     __simd_sto(XMM.STOUPS, *ptr, val);
>     __simd(XMM.STOUPS, *ptr, val);
>     auto val1 = __simd_sto(XMM.STOUPS, *ptr, val);
>     auto val2 = __simd(XMM.STOUPS, *ptr, val);
> }
> ---
>
> LDC at least has some intrinsics once you find ldc.gccbuiltins_x86, but for some reason comes with its own broken ldc.simd.loadUnaligned instead of providing intrinsics.
>
> ---
> import core.simd, ldc.simd;
>
> float4 test(float* ptr)
> {
>     return loadUnaligned!float4(ptr);
> }
> ---
>
> /home/dawg/dlang/ldc-0.17.1/bin/../import/ldc/simd.di(212): Error: can't
> parse inline LLVM IR:
>         %r = load <4 x float>* %p, align 1
>                                ^
> expected comma after load's type
>
> So are 3 different untested and unused APIs really the current state of SIMD?
>
> -Martin
>

I would just let the compiler optimize / vectorize the operation, but then again, it is probably just me who thinks this way.

http://goo.gl/XdiKZX

I'm not aware of any intrinsic to load unaligned data. Only to assume alignment.

Iain.


April 02, 2016
On Saturday, 2 April 2016 at 06:13:24 UTC, Iain Buclaw wrote:
> I would just let the compiler optimize / vectorize the operation, but then again that it is probably just me who thinks these things.

It's intended to replace the array ops in druntime, so relying on vectorizers won't suffice, e.g. your example already stops working when I pass dynamic instead of static arrays.

> I'm not aware of any intrinsic to load unaligned data. Only to assume alignment.

__builtin_ia32_loadups
__builtin_ia32_storeups
April 02, 2016
On 2 Apr 2016 9:45 am, "Martin Nowak via Digitalmars-d" < digitalmars-d@puremagic.com> wrote:
>
> On Saturday, 2 April 2016 at 06:13:24 UTC, Iain Buclaw wrote:
>> I would just let the compiler optimize / vectorize the operation, but then again that it is probably just me who thinks these things.
>
> It's intended to replace the array ops in druntime, relying on vectorizers won't suffice, e.g. your example already stops working when I pass dynamic instead of static arrays.
>
>> I'm not aware of any intrinsic to load unaligned data. Only to assume alignment.
>
> __builtin_ia32_loadups
> __builtin_ia32_storeups

Any agnostic way to... :-)


April 02, 2016
On 04/02/2016 10:19 AM, Iain Buclaw via Digitalmars-d wrote:
>> > __builtin_ia32_loadups
>> > __builtin_ia32_storeups
> Any agnostic way to... :-)

I'm already using vector types for most operations, so it's somewhat
portable.
But for whatever reason D doesn't allow multiplication/division w/
integral vectors (departing from GCC/clang) and I can't perform
unaligned loads, so I have to resort to intrinsics for that.