March 07, 2015
On Friday, 6 March 2015 at 23:25:40 UTC, Joakim wrote:
> The ground-up redesign of OpenGL, now called Vulkan, has been announced at GDC:
>
> http://www.phoronix.com/scan.php?page=article&item=khronos-vulcan-spirv
>
> Both graphics shaders and the latest version of OpenCL, which enables computation on the GPU, will target a new IR called SPIR-V:
>
> http://www.anandtech.com/show/9039/khronos-announces-opencl-21-c-comes-to-opencl
>
> Rather than being forced to use C-like languages such as GLSL or OpenCL C, as in the past, this new IR will allow writing graphics shaders and OpenCL code using any language, including a subset of C++14 stripped of exceptions, function pointers, and virtual functions.
>
> This would be a good opportunity for D, if ldc or gdc could be made to target SPIR-V.  Ldc would seem to have a leg up, since SPIR was originally based on LLVM IR before SPIR-V diverged from it.

Sure, you might target SPIR-V with a C-like language, but how will you generate the IR corresponding to:

- texture accesses
- local memory vs global memory vs mapped pinned host memory. It looks like you need annotations on your pointers.
- sub-block operations made core in OpenCL 2.x

All things that OpenCL C or GLSL are aware of.
Having a GPU backend doesn't make general code fit for high levels of parallelism. GPUs are not designed to work around the poor efficiency of the programs they run.
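
For illustration, here is roughly what a D front-end would have to invent just to carry the address-space information (a sketch only: the GlobalPtr/LocalPtr wrappers below are placeholders I made up, not an existing API, and a real SPIR-V backend would still have to map them onto storage classes):

// Hypothetical wrapper types: they only exist to show where OpenCL C's
// __global / __local qualifiers would have to live in D's type system so a
// SPIR-V backend could pick the right storage class for each pointer.
struct GlobalPtr(T) { T* raw; alias raw this; }
struct LocalPtr(T)  { T* raw; alias raw this; }

// A kernel-style function: plain D only says "pointer to float"; the wrappers
// carry the extra address-space meaning.
void saxpy(GlobalPtr!float x, GlobalPtr!float y, float a, size_t i)
{
    y[i] = a * x[i] + y[i];
}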
March 07, 2015
"Russel Winder via Digitalmars-d"  wrote in message news:mailman.7408.1425716535.9932.digitalmars-d@puremagic.com...

> Which would mean that anyone interested in CPU/GPU computing will have
> to eschew DMD in favour of LDC and GDC.

Yes.  Or for any of the other dozens of platforms that dmd will never support. 

March 07, 2015
On Saturday, 7 March 2015 at 11:35:59 UTC, ponce wrote:
> On Friday, 6 March 2015 at 23:25:40 UTC, Joakim wrote:
>> The ground-up redesign of OpenGL, now called Vulkan, has been announced at GDC:
>>
>> http://www.phoronix.com/scan.php?page=article&item=khronos-vulcan-spirv
>>
>> Both graphics shaders and the latest version of OpenCL, which enables computation on the GPU, will target a new IR called SPIR-V:
>>
>> http://www.anandtech.com/show/9039/khronos-announces-opencl-21-c-comes-to-opencl
>>
>> Rather than being forced to use C-like languages such as GLSL or OpenCL C, as in the past, this new IR will allow writing graphics shaders and OpenCL code using any language, including a subset of C++14 stripped of exceptions, function pointers, and virtual functions.
>>
>> This would be a good opportunity for D, if ldc or gdc could be made to target SPIR-V.  Ldc would seem to have a leg up, since SPIR was originally based on LLVM IR before SPIR-V diverged from it.
>
> Sure, you might target SPIR-V with a C-like language, but how will you generate the IR corresponding to:
>
> - texture accesses
> - local memory vs global memory vs mapped pinned host memory. It looks like you need annotations on your pointers.
> - sub-block operations made core in OpenCL 2.x
>
> All things that OpenCL C or GLSL are aware of.
> Having a GPU backend doesn't make general code fit for high levels of parallelism. GPUs are not designed to work around the poor efficiency of the programs they run.

The same way the Haskell, Java, Python and .NET implementations targeting CUDA PTX and HSAIL do.

--
Paulo
March 07, 2015
On Saturday, 7 March 2015 at 09:05:03 UTC, Paulo Pinto wrote:
> Of course, this doesn't matter when using engines, which every sane developer should do anyway.
>
> Any application coded straight to the graphics APIs ends up being a use-case-specific mini engine.

We'll see, but the downside of having a slim driver is that you risk ending up writing the application engine N times, once for each GPU, rather than once. With a buffering high-level driver you get some optimization for free, done by the manufacturer using inside knowledge.
March 12, 2015
On Saturday, 7 March 2015 at 02:18:22 UTC, Iain Buclaw wrote:
> On 6 Mar 2015 23:30, "Joakim via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:
>>
>> The ground-up redesign of OpenGL, now called Vulkan, has been announced at GDC:
>>
>> http://www.phoronix.com/scan.php?page=article&item=khronos-vulcan-spirv
>>
>> Both graphics shaders and the latest version of OpenCL, which enables computation on the GPU, will target a new IR called SPIR-V:
>>
>> http://www.anandtech.com/show/9039/khronos-announces-opencl-21-c-comes-to-opencl
>>
>> Rather than being forced to use C-like languages such as GLSL or OpenCL C, as in the past, this new IR will allow writing graphics shaders and OpenCL code using any language, including a subset of C++14 stripped of exceptions, function pointers, and virtual functions.
>>
>> This would be a good opportunity for D, if ldc or gdc could be made to target SPIR-V.  Ldc would seem to have a leg up, since SPIR was originally based on LLVM IR before SPIR-V diverged from it.
>
> Unlike LDC, GDC doesn't need to be *made* to target anything.  Its IR is high level enough that you don't need to think (nor care) about your backend target.
>
> GCC itself will need a backend to support it, though.  ;)
>
> Iain

Relevant: https://gcc.gnu.org/ml/gcc/2015-03/msg00020.html
March 12, 2015
On 12 March 2015 at 15:57, John Colvin via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On Saturday, 7 March 2015 at 02:18:22 UTC, Iain Buclaw wrote:
>> On 6 Mar 2015 23:30, "Joakim via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:
>>>
>>> The ground-up redesign of OpenGL, now called Vulkan, has been announced at GDC:
>>>
>>> http://www.phoronix.com/scan.php?page=article&item=khronos-vulcan-spirv
>>>
>>> Both graphics shaders and the latest version of OpenCL, which enables computation on the GPU, will target a new IR called SPIR-V:
>>>
>>> http://www.anandtech.com/show/9039/khronos-announces-opencl-21-c-comes-to-opencl
>>>
>>> Rather than being forced to use C-like languages such as GLSL or OpenCL C, as in the past, this new IR will allow writing graphics shaders and OpenCL code using any language, including a subset of C++14 stripped of exceptions, function pointers, and virtual functions.
>>>
>>> This would be a good opportunity for D, if ldc or gdc could be made to target SPIR-V.  Ldc would seem to have a leg up, since SPIR was originally based on LLVM IR before SPIR-V diverged from it.
>>
>> Unlike LDC, GDC doesn't need to be *made* to target anything.  Its IR is high level enough that you don't need to think (nor care) about your backend target.
>>
>> GCC itself will need a backend to support it, though.  ;)
>>
>> Iain
>
> Relevant: https://gcc.gnu.org/ml/gcc/2015-03/msg00020.html


David is an awesome guy.  Would be great if he picks up the baton on this.

I reckon most things would be hashed out via GCC builtins, which someone would then write a library around.
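
To make that concrete (a sketch of the pattern, not an actual plan): gdc already exposes GCC builtins to D code through its gcc.builtins module, and a library just wraps them in ordinary D functions. Any future SPIR-V intrinsics would presumably be handled the same way; __builtin_clz below is only a stand-in, and I'm assuming its gcc.builtins signature is int(uint).

// Sketch of the builtin-plus-library pattern with an existing GCC builtin.
// A SPIR-V builtin would get the same treatment once a backend exists.
version (GNU)
{
    import gcc.builtins : __builtin_clz;

    // Library-level wrapper giving the raw builtin a D-friendly face.
    int countLeadingZeros(uint x)
    {
        // __builtin_clz(0) is undefined, so handle it explicitly.
        return x == 0 ? 32 : __builtin_clz(x);
    }
}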
March 13, 2015
SPIR-V may be producible from HLL tools, but that doesn't mean it's perfectly ok to use any HLL. The HLL-to-SPIR capability is exposed mainly for syntax sugar and shallow precompile optimisations, but mostly to avoid the vendor-specific HLL bugs that have plagued GLSL and HLSL (those billion d3dx_1503.dll files on your system are bugfixes). Plus, it gives the community access to one or several open-source HLL compilers that they can find issues with and submit fixes for, so everyone benefits. So, it's mostly about getting a flawless open-source GLSL compiler. Dlang's strengths are simply not applicable directly, though with a bit of work they can actually be applied completely. (I've done these things in/with our GLSL/backend compilers.)

- malloc. SPIR-V and such don't have malloc. Fix: preallocate a big chunk of memory and implement a massively-parallel allocator yourself (it should handle ~2000 allocation requests per cycle, that's the gist of it). An "atomic_add" on a memory location will help; a rough sketch of such a bump allocator is at the end of this post. If you don't want to preallocate too much, have a CPU thread poll while a GPU thread stalls (it should stall itself and 60000 other threads) until the CPU allocates a new chunk for the heap and provides a base address. (Hope the CPU thread responds quickly enough, or your GPU tasks will be mercilessly killed.)

- function pointers: largely a no-no. Extensions might give you that capability, but otherwise implement them as big switch-case tables (see the sketch right after this list). Even with the extensions, you will need to guarantee that an arbitrary number of threads (e.g. 64) all happened to call the same actual function.

- stack. I don't know how to break it to you: there's no stack. There are only around 256 dwords that 8-200 threads get to allocate from. Your notion of a stack gets statically flattened by the compilers. So your whole program has e.g. 4 dwords to play with and 64 threads in flight to hide latency; or 64 dwords but only 4 threads to hide latency, which is 2-4x slower for rudimentary things (and utterly fails at latency hiding, becoming 50 times slower once memory accesses are involved); or 1 thread with 256 dwords, which is 8-16 times slower at rudimentary stuff and 50+ times slower if you access memory, even cached. Add a manually-managed programmable memory stack, and your performance goes poof.

- exceptions. A combined issue of the things above.
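
Here's a sketch of the switch-case workaround for the function-pointer point above (my illustration, in D for concreteness): instead of an indirect call, the "callee" is an integer tag and the dispatch is a switch the GPU compiler can see through and inline - and, as noted, you still want all threads in a wavefront to pick the same tag or you pay for the divergence.

// Dispatch by tag instead of by function pointer.
enum Op : uint { add, mul, sub }

float apply(Op op, float a, float b)
{
    // final switch forces every enum member to be handled.
    final switch (op)
    {
        case Op.add: return a + b;
        case Op.mul: return a * b;
        case Op.sub: return a - b;
    }
}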

Combine the limitations of function pointers and stack, and I hope you get the point. Or rather, you see how pointless the exercise of getting Dlang as we know and love it onto a GPU is. A single-threaded JavaScript app on a CPU will beat it on performance for everything that's not trivial.
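
To illustrate the malloc point above, here's a minimal sketch of the "preallocate + atomic_add" scheme (ordinary host-side D; core.atomic stands in for the device atomic, and on a real GPU the pool and the counter would live in global memory):

import core.atomic : atomicOp;

shared size_t bumpOffset;   // advanced atomically by every thread
__gshared ubyte[] pool;     // the preallocated chunk of "device" memory

// Returns the byte offset of a freshly reserved block, or size_t.max when the
// chunk is exhausted - the point where the scheme has to stall and ask the
// CPU for another chunk.
size_t gpuMalloc(size_t bytes)
{
    immutable end = atomicOp!"+="(bumpOffset, bytes);
    return end <= pool.length ? end - bytes : size_t.max;
}

unittest
{
    pool = new ubyte[](256);
    assert(gpuMalloc(100) == 0);
    assert(gpuMalloc(100) == 100);
    assert(gpuMalloc(100) == size_t.max);   // out of preallocated memory
}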
March 14, 2015
On Friday, 13 March 2015 at 18:44:18 UTC, karl wrote:
> SPIR-V may be producible from HLL tools, but that doesn't mean it's perfectly ok to use any HLL. The HLL-to-SPIR capability is exposed mainly for syntax sugar and shallow precompile optimisations, but mostly to avoid the vendor-specific HLL bugs that have plagued GLSL and HLSL (those billion d3dx_1503.dll files on your system are bugfixes). Plus, it gives the community access to one or several open-source HLL compilers that they can find issues with and submit fixes for, so everyone benefits. So, it's mostly about getting a flawless open-source GLSL compiler. Dlang's strengths are simply not applicable directly, though with a bit of work they can actually be applied completely. (I've done these things in/with our GLSL/backend compilers.)
>
> - malloc. SPIR-V and such don't have malloc. Fix: preallocate a big chunk of memory and implement a massively-parallel allocator yourself (it should handle ~2000 allocation requests per cycle, that's the gist of it). An "atomic_add" on a memory location will help. If you don't want to preallocate too much, have a CPU thread poll while a GPU thread stalls (it should stall itself and 60000 other threads) until the CPU allocates a new chunk for the heap and provides a base address. (Hope the CPU thread responds quickly enough, or your GPU tasks will be mercilessly killed.)
>
> - function pointers: largely a no-no. Extensions might give you that capability, but otherwise implement them as big switch-case tables. Even with the extensions, you will need to guarantee that an arbitrary number of threads (e.g. 64) all happened to call the same actual function.
>
> - stack. I don't know how to break it to you: there's no stack. There are only around 256 dwords that 8-200 threads get to allocate from. Your notion of a stack gets statically flattened by the compilers. So your whole program has e.g. 4 dwords to play with and 64 threads in flight to hide latency; or 64 dwords but only 4 threads to hide latency, which is 2-4x slower for rudimentary things (and utterly fails at latency hiding, becoming 50 times slower once memory accesses are involved); or 1 thread with 256 dwords, which is 8-16 times slower at rudimentary stuff and 50+ times slower if you access memory, even cached. Add a manually-managed programmable memory stack, and your performance goes poof.
>
> - exceptions. A combined issue of the things above.
>
> Combine the limitations of function pointers and stack, and I hope you get the point. Or rather, you see how pointless the exercise of getting Dlang as we know and love it onto a GPU is. A single-threaded JavaScript app on a CPU will beat it on performance for everything that's not trivial.

The reason to use D for kernels / shaders would be for its metaprogramming, code-generation abilities and type system (slices and structs in particular). Of course you wouldn't be allocating heap memory, using function pointers or exceptions. There's still a lot that D has to offer without those. I regularly write thousands of lines of D in that subset.
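
As an illustration of that subset (my example, not tied to any existing GPU backend): compile-time introspection plus slices and structs, with no GC allocation, exceptions or function pointers in sight.

import std.traits : FieldNameTuple;

struct Particle { float x, y, z; }

// Element-wise y[i] += a * x[i] over any plain struct of floats; the inner
// foreach over the field names is unrolled at compile time, so the generated
// code is straight-line arithmetic a GPU backend could digest.
void axpy(S)(scope S[] ys, scope const(S)[] xs, float a) @nogc nothrow
{
    foreach (i, ref y; ys)
        foreach (field; FieldNameTuple!S)
            __traits(getMember, y, field) += a * __traits(getMember, xs[i], field);
}

unittest
{
    auto ys = [Particle(1, 1, 1)];
    axpy(ys, [Particle(1, 2, 3)], 2.0f);
    assert(ys[0] == Particle(3, 5, 7));
}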

P.S. D is in pretty much the same boat as any other C-based language w.r.t. stack space. You have to be careful with the stack in OpenCL C, and you would have to be careful with the stack in SPIR-D.