Thread overview
dcompute issues?
Feb 10, 2021
Bruce Carneal
Feb 10, 2021
kinke
Feb 10, 2021
Bruce Carneal
Feb 10, 2021
Bruce Carneal
Feb 11, 2021
Imperatorn
Feb 12, 2021
Bruce Carneal
Feb 12, 2021
Imperatorn
February 10, 2021
I'm trying to get the dcompute saxpy test program to run and am having some issues.

The first issue was an ldc2 1.24.0 seg fault over this line:
@compute(CompileFor.deviceOnly) module ...

I commented out the @compute... annotation, leaving the top-line module declaration, after which the test program compiled and linked, but it errored out when attempting to queue the saxpy kernel itself. Everything up to the queue attempt went well (Platform initialization, getDevices, Context, Buffer! allocations, data copies, ...), so it appears that the libdcompute.a build is fine.

I examined the newly created kernels_cuda210_64.ptx file and did not find saxpy code. I only found an 8-liner that, judging from the mangled name, looks like a GlobalIndex calculation.
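
For readers unfamiliar with the test program, here is a minimal sketch of what a dcompute saxpy kernel module looks like, modeled on the example in the dcompute README. The module name `saxpy_kernels` is hypothetical, and the exact source in the test suite may differ. The `@compute(CompileFor.deviceOnly)` attribute on the module declaration is what marks the module for device code generation, which would be consistent with the kernel going missing from the .ptx file once the annotation is commented out.

```d
// Sketch only: requires LDC with dcompute support and the dcompute library.
// The module name is hypothetical.
@compute(CompileFor.deviceOnly) module saxpy_kernels;
pragma(LDC_no_moduleinfo);

import ldc.dcompute;        // @compute, @kernel, GlobalPointer
import dcompute.std.index;  // GlobalIndex

// res[i] = alpha * x[i] + y[i] for each of the N elements.
@kernel void saxpy(GlobalPointer!(float) res,
                   float alpha,
                   GlobalPointer!(float) x,
                   GlobalPointer!(float) y,
                   size_t N)
{
    auto i = GlobalIndex.x;
    if (i >= N) return;     // guard against surplus work items
    res[i] = alpha * x[i] + y[i];
}
```

With the annotation in place and a CUDA target selected through LDC's `-mdcompute-targets` option, one would expect the generated .ptx file to contain an entry for `saxpy` in addition to the index helper.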

If this problem rings a bell with anyone in the LDC community, please chime in. I'd *really* prefer to work with dcompute rather than fall back to CUDA shimming or SYCL exploration.

Finally, many thanks for maintaining/advancing LDC.  It has been great for all of my CPU/SIMD work.

February 10, 2021
On Wednesday, 10 February 2021 at 02:44:37 UTC, Bruce Carneal wrote:
> The first issue was an ldc2 1.24.0 seg fault over this line:
> @compute(CompileFor.deviceOnly) module ...

Please try to reduce it and file an issue. You can also use a CI build with enabled assertions from https://github.com/ldc-developers/ldc/releases/tag/CI, which might give a usable hint.

> Finally, many thanks for maintaining/advancing LDC.  It has been great for all of my CPU/SIMD work.

Thanks, you're welcome.
February 10, 2021
On Wednesday, 10 February 2021 at 17:54:44 UTC, kinke wrote:
> On Wednesday, 10 February 2021 at 02:44:37 UTC, Bruce Carneal wrote:
>> The first issue was an ldc2 1.24.0 seg fault over this line:
>> @compute(CompileFor.deviceOnly) module ...
>
> Please try to reduce it and file an issue. You can also use a CI build with enabled assertions from https://github.com/ldc-developers/ldc/releases/tag/CI, which might give a usable hint.
>
>> Finally, many thanks for maintaining/advancing LDC.  It has been great for all of my CPU/SIMD work.
>
> Thanks, you're welcome.

I've narrowed it to a noop routine with a single GlobalPointer!(float) argument (the no-argument noop routine compiled and ran).
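
A reduction along these lines might look like the sketch below. The module and kernel names are hypothetical; only the two signatures (no arguments versus a single GlobalPointer!(float) argument) come from the description above.

```d
// Sketch only: requires LDC with dcompute support.
// Module and kernel names are hypothetical.
@compute(CompileFor.deviceOnly) module noop_kernels;
pragma(LDC_no_moduleinfo);

import ldc.dcompute;

// No-argument kernel: reported to compile and run.
@kernel void noop() {}

// Single GlobalPointer!(float) argument: the reduced failing case.
@kernel void noopPtr(GlobalPointer!(float) p) {}
```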

I'll pull down the assert-enabled compiler and keep digging.  Thanks for the pointer.

February 10, 2021
On Wednesday, 10 February 2021 at 17:54:44 UTC, kinke wrote:
> On Wednesday, 10 February 2021 at 02:44:37 UTC, Bruce Carneal wrote:
>> The first issue was an ldc2 1.24.0 seg fault over this line:
>> @compute(CompileFor.deviceOnly) module ...
>
> Please try to reduce it and file an issue. You can also use a CI build with enabled assertions from https://github.com/ldc-developers/ldc/releases/tag/CI, which might give a usable hint.
>
>> Finally, many thanks for maintaining/advancing LDC.  It has been great for all of my CPU/SIMD work.
>
> Thanks, you're welcome.

The ldc2-4bee4e6f-linux-x86_64 CI compiler that you pointed me to works: no segv during compilation, and the saxpy results coming back from the GPU check out. For reference, I'm using a cuda-620 target and running on a cuda-750 card.
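
One way to check results like these is to compare the buffer copied back from the device against a CPU reference. The following is a minimal sketch in plain D, not the check the test program actually performs; the function names `saxpyRef` and `matches` and the tolerance are assumptions chosen for illustration.

```d
import std.math : fabs;

/// CPU reference: res[i] = alpha * x[i] + y[i].
float[] saxpyRef(float alpha, const float[] x, const float[] y)
{
    assert(x.length == y.length, "input lengths must match");
    auto res = new float[x.length];
    foreach (i; 0 .. x.length)
        res[i] = alpha * x[i] + y[i];
    return res;
}

/// True when every device value is within `tol` (relative to the
/// expected magnitude, with a floor of 1) of the CPU reference.
bool matches(const float[] device, const float[] expected, float tol = 1e-5f)
{
    if (device.length != expected.length)
        return false;
    foreach (i; 0 .. device.length)
    {
        immutable scale = fabs(expected[i]) > 1.0f ? fabs(expected[i]) : 1.0f;
        if (fabs(device[i] - expected[i]) > tol * scale)
            return false;
    }
    return true;
}
```

After copying the result buffer back to a host array, a call such as `matches(hostResult, saxpyRef(alpha, x, y))` gives a pass/fail answer. A tolerance is used rather than exact equality because GPU and CPU floating-point results can differ slightly, for example when the device fuses the multiply and add.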

You've got me up and running on dcompute.  Thanks!  I'm *really* happy that I might not have to go back to C++/CUDA.



February 11, 2021
On Wednesday, 10 February 2021 at 19:59:45 UTC, Bruce Carneal wrote:
> On Wednesday, 10 February 2021 at 17:54:44 UTC, kinke wrote:
>> On Wednesday, 10 February 2021 at 02:44:37 UTC, Bruce Carneal wrote:
>>> [...]
>>
>> Please try to reduce it and file an issue. You can also use a CI build with enabled assertions from https://github.com/ldc-developers/ldc/releases/tag/CI, which might give a usable hint.
>>
>>> [...]
>>
>> Thanks, you're welcome.
>
> The ldc2-4bee4e6f-linux-x86_64 CI compiler that you pointed me to works: no segv during compilation, and the saxpy results coming back from the GPU check out. For reference, I'm using a cuda-620 target and running on a cuda-750 card.
>
> You've got me up and running on dcompute.  Thanks!  I'm *really* happy that I might not have to go back to C++/CUDA.

Nice! dcompute looks interesting. Are/were there any patches for the problem(s) you encountered?
February 12, 2021
On Thursday, 11 February 2021 at 19:11:28 UTC, Imperatorn wrote:
> On Wednesday, 10 February 2021 at 19:59:45 UTC, Bruce Carneal wrote:
>> On Wednesday, 10 February 2021 at 17:54:44 UTC, kinke wrote:
>>> On Wednesday, 10 February 2021 at 02:44:37 UTC, Bruce Carneal wrote:
>>>> [...]
>>>
>>> Please try to reduce it and file an issue. You can also use a CI build with enabled assertions from https://github.com/ldc-developers/ldc/releases/tag/CI, which might give a usable hint.
>>>
>>>> [...]
>>>
>>> Thanks, you're welcome.
>>
>> The ldc2-4bee4e6f-linux-x86_64 CI compiler that you pointed me to works: no segv during compilation, and the saxpy results coming back from the GPU check out. For reference, I'm using a cuda-620 target and running on a cuda-750 card.
>>
>> You've got me up and running on dcompute.  Thanks!  I'm *really* happy that I might not have to go back to C++/CUDA.
>
> Nice! dcompute looks interesting. Are/were there any patches for the problem(s) you encountered?

No patches.  Once I installed the CI compiler the saxpy demo/test "just worked".

There are some documented "TODOs" in the very readable dcompute source code, but no showstoppers so far. The Derelict CUDA/CL bindings are also available if you need them.



February 12, 2021
On Friday, 12 February 2021 at 06:39:05 UTC, Bruce Carneal wrote:
> On Thursday, 11 February 2021 at 19:11:28 UTC, Imperatorn wrote:
>> On Wednesday, 10 February 2021 at 19:59:45 UTC, Bruce Carneal wrote:
>>> [...]
>>
>> Nice! dcompute looks interesting. Are/were there any patches for the problem(s) you encountered?
>
> No patches.  Once I installed the CI compiler the saxpy demo/test "just worked".
>
> There are some documented "TODOs" in the very readable dcompute source code, but no showstoppers so far. The Derelict CUDA/CL bindings are also available if you need them.

Ok, good to hear 👍