On Saturday, 15 January 2022 at 00:29:20 UTC, Nicholas Wilson wrote:
As for manpower, the reason is that I don't have any particular personal need for dcompute these days. I am happy to do features for people that need something in particular, e.g. Vulkan compute shaders or textures, and PRs are welcome. Though if Bruce makes millions and gives me a job then that will obviously change ;)
He can put me on the application list as well… This sounds like lots of fun!!!
How important is latency vs. throughput? How "powerful" is the GPU compared to the CPU? How well suited to the task is the GPU? The list goes on. It's hard enough to do CPU benchmarks in an unbiased way.
I don't think people would expect benchmarks to be unbiased. It could be 3-4 short benchmarks, some showcasing where the GPU is beneficial, some showcasing where data dependencies (or other challenges) make it less suitable:
- compute autocorrelation over many different lags
- element-wise multiply two long arrays and take the square root
- compute a simple IIR filter (I assume a recursive filter would be a worst case?)
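To make the contrast concrete, here is a plain-D sketch (CPU reference only, no dcompute) of two of these benchmarks. The function names are made up for illustration. The element-wise multiply-and-sqrt has no dependency between outputs, so it maps trivially onto a GPU; the first-order IIR filter carries each output into the next step, which is exactly the kind of serial dependency that makes it a worst case:

```d
import std.math : sqrt;

// Element-wise multiply-and-sqrt: every r[i] is independent of the
// others, so all of them could be computed in parallel on a GPU.
float[] mulSqrt(const float[] a, const float[] b)
{
    assert(a.length == b.length);
    auto r = new float[a.length];
    foreach (i; 0 .. r.length)
        r[i] = sqrt(a[i] * b[i]);
    return r;
}

// First-order IIR filter, y[n] = b0*x[n] + a1*y[n-1]: each output
// depends on the previous one, so this loop cannot be parallelised
// naively -- the loop-carried dependency serialises the work.
float[] iir1(const float[] x, float b0, float a1)
{
    auto y = new float[x.length];
    float prev = 0;
    foreach (n; 0 .. x.length)
        prev = y[n] = b0 * x[n] + a1 * prev;
    return y;
}
```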
If the intention is to say, "look at the speedup you can get for $TASK using $COMMON_HARDWARE", then yeah, that would be possible. It would certainly be possible to do a benchmark of, say, "ease of implementation with comparable performance" of dcompute vs CUDA, e.g. LoC, verbosity, brittleness etc., since the main advantage of D/dcompute (vs CUDA) is enumeration of kernel designs for performance. That would give a nice measurable goal to improve usability.
Yes, but I think of it as inspiration, with a tutorial on how to get the benchmarks running. For instance, like you, I have no need for this at the moment, and my current computer isn't really a good showcase of GPU computation either, but I have one long-term hobby project where I might use GPU computations eventually.
I suspect many think of GPU computations as something requiring a significant amount of time to get into. Even though they may be interested, that threshold alone is enough to put it in the "interesting, but I'll look at it later" box.
If you can tease people into playing with it for fun, then I think there is a larger chance of them using it at a later stage (or even thinking about the possibility of using it) when they see a need in some heavy computational problem they are working on.
There is a lower threshold to get started with something new if you already have a tiny toy-project you can cut and paste from that you have written yourself.
Also, updated benchmarks could generate new interest in the announce forum thread. Lurking forum readers probably only read it on occasion, so you have to make several posts to make people aware of it.
Definitely. Homogeneous memory is interesting for the ability to make GPUs do the things GPUs are good at and leave the rest to the CPU, without worrying about memory transfer across the PCI-e bus. That is something CUDA can't take advantage of, on account of NVIDIA GPUs being discrete only. I've no idea how caching works in a system like that, though.
I don't know, but the Steam Deck, which appears to come out next month, seems to run Linux and has an "AMD APU" with a modern GPU and CPU integrated on the same chip, at least that is what I've read. Maybe more technical info on how that works at the hardware level will become available later, or maybe it is already on AMD's website?
If someone reading this thread has more info on this, it would be nice if they would share what they have found out! :-)