On Sunday, 25 April 2021 at 22:26:06 UTC, Johan Engelen wrote:
> On Friday, 5 March 2021 at 00:03:26 UTC, Bruce Carneal wrote:
> After updating the first line to '@compute(CompileFor.hostAndDevice) module ...' and adding an 'import ldc.dcompute;' line, the runtime/import/ldc/gccbuiltins_nvvm.di file from a current LDC build apparently gives access to all manner of GPU intrinsics.
Hi Bruce,
Why not submit a PR that modifies gen_gccbuiltins.cpp such that it adds the @compute attribute for the relevant intrinsics files?
I think it's OK if gen_gccbuiltins contains some hacks like that. Please add a small compile test case, so we verify that it won't bitrot in the future.
Wouldn't @compute(CompileFor.deviceOnly) make more sense, because the intrinsics will not be available on normal CPUs anyway?
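As a rough illustration of the generator change suggested here, the following is a minimal C++ sketch, not code from LDC. The helper names (isDeviceOnlyArch, emitModuleHeader, moduleHeaderFor) are hypothetical, and the assumption that only the nvvm file needs the attribute is mine; the real gen_gccbuiltins.cpp is structured differently and emits more than a module line. It only shows the idea of prefixing the module declaration with the attribute and adding the import for GPU-only intrinsic sets.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper, not taken from LDC's gen_gccbuiltins.cpp.
// Returns true for intrinsic sets that only make sense on a GPU target.
bool isDeviceOnlyArch(const std::string &arch) {
    return arch == "nvvm"; // assumption: only the NVVM file needs the attribute
}

// Writes the opening lines of a generated ldc/gccbuiltins_<arch>.di file.
void emitModuleHeader(std::ostream &os, const std::string &arch) {
    assert(!arch.empty());
    if (isDeviceOnlyArch(arch)) {
        // The attribute goes in front of the D module declaration.
        os << "@compute(CompileFor.deviceOnly) ";
    }
    os << "module ldc.gccbuiltins_" << arch << ";\n\n";
    if (isDeviceOnlyArch(arch)) {
        // @compute and CompileFor are declared in ldc.dcompute.
        os << "import ldc.dcompute;\n";
    }
}

// Convenience wrapper that returns the header as a string.
std::string moduleHeaderFor(const std::string &arch) {
    std::ostringstream os;
    emitModuleHeader(os, arch);
    return os.str();
}
```

With this sketch, moduleHeaderFor("nvvm") yields the attributed module line followed by the import, while any other architecture keeps a plain module declaration. A compile test case along the lines requested above would then import the generated module from a @compute module and call one intrinsic.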
I hope all your work will land in either LDC or dcompute's repositories, such that others can easily benefit from it.
cheers,
Johan
Yes, I'll help once the current push here is over, but I don't think I understand enough quite yet. I'm still bumping into limitations/awkwardness in dcompute that should admit simple solutions. At least it feels that way.
One idea from my experience to date is that we can, and probably should, create a simpler (from a programmer's perspective) and finer-grained way to handle multiple targets. Intrinsic selection is part of that, as is library selection.
Also on my mind is how we should handle deployment. For the ultimate in speed we can do AOT per-target specialized compiles and "fat" binaries, but using SPIR-V + Vulkan compute could significantly improve penetration and reduce bloat.
I read a relatively recent thread in an LLVM forum indicating that the Intel guys are pushing a "real" SPIR-V IR effort now so maybe we can help out there.
Also, I don't know how MLIR should fit into our plans.
I'll be in touch when I get my head above water here. Thanks to you and the rest of the LDC crew for the help so far. Looking forward to advancing dlang on GPUs in the future. It really can be much, much better than C++ in that arena.
Bruce