On Saturday, 19 August 2023 at 19:23:38 UTC, Cecil Ward wrote:
>I’m trying to write a cross-platform function that gives access to the CPU’s prefetch instructions such as x86 prefetcht0/1/2/prefetchnta and AArch64 too. I’ve found that the GDC and LDC compilers provide builtin magic functions for this, and they are what I need. I am trying to put together a plain-English detailed spec for the respective builtin magic functions.
>My questions:
>Q1) I need to compare the spec for the GCC and LDC builtin magic functions’ "locality" parameter. Can anyone tell me if GDC and LDC have kept mutual compatibility here?
I'd have thought GCC and LLVM have mutual compatibility thanks to a common target API in Intel's _mm_prefetch() function (and in fact, the magic locality numbers match the _MM_HINT_* constants).
#define _MM_HINT_T0 3
#define _MM_HINT_T1 2
#define _MM_HINT_T2 1
#define _MM_HINT_NTA 0
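To make the correspondence concrete, here is a minimal C sketch, assuming GCC or Clang. It issues the same "keep in all cache levels" hint either through _mm_prefetch() with _MM_HINT_T0 on x86, or through __builtin_prefetch() with locality 3 elsewhere. The names prefetch_keep and sum_with_prefetch, and the look-ahead distance of 16 elements, are made up for illustration and are not tuned for any CPU.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* _mm_prefetch and the _MM_HINT_* constants */
#endif

/* Hypothetical helper: hint that *p will be read soon and should be
   kept in every cache level. It is only a hint and never changes
   program results. */
static inline void prefetch_keep(const void *p)
{
#if defined(__SSE__)
    _mm_prefetch((const char *)p, _MM_HINT_T0);  /* x86 spelling */
#else
    __builtin_prefetch(p, 0, 3);  /* rw = 0 (read), locality = 3 */
#endif
}

/* Sum an array while prefetching a fixed distance ahead.
   The distance of 16 elements is an arbitrary illustrative value. */
long sum_with_prefetch(const long *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 16 < n)
            prefetch_keep(&a[i + 16]);
        total += a[i];
    }
    return total;
}
```

Whether the prefetch helps at all depends on the access pattern and the hardware prefetcher, so it is worth measuring before keeping it.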
> Q2) Could someone help me turn the GCC and LDC specs into English regarding the locality parameter? See (2) and (4) below.
https://gcc.gnu.org/projects/prefetch.html
>Q3) Does the locality parameter determine which level of the data cache hierarchy data is fetched into? Or is it always fetched into L1 data cache and the outer ones, and this parameter affects caches’ future behaviour?
It really depends on the CPU, and what features it has.
The x86 SSE prefetch instructions are described in the x86 instruction manual, along with the meaning of T0, T1, T2 and NTA.
https://www.felixcloutier.com/x86/prefetchh
>Q3) Will these magic builtins work on AArch64?
It'll work on all targets that define a prefetch instruction; on any other target it'll be a no-op. Similarly, a target might ignore the read-write argument, the locality argument, or both.
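As a rough illustration of that portability, here is a small C sketch of a wrapper, assuming GCC or Clang where __builtin_prefetch() is available. The PREFETCH macro and the scale_in_place function are hypothetical names, and the look-ahead distance of 32 elements is arbitrary. A macro is used because GCC requires the rw and locality arguments to be compile-time constants.

```c
#include <stddef.h>

/* __has_builtin is provided by Clang and by newer GCC releases;
   older GCC versions are detected through __GNUC__ instead. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_prefetch) || defined(__GNUC__)
/* rw: 0 = read, 1 = write. locality: 0 (none) to 3 (high). */
#define PREFETCH(addr, rw, locality) \
    __builtin_prefetch((addr), (rw), (locality))
#else
/* Unknown compiler: fall back to doing nothing. */
#define PREFETCH(addr, rw, locality) ((void)0)
#endif

/* Multiply every element by k, hinting upcoming writes ahead of time.
   The same source compiles whether or not the target has a prefetch
   instruction; only the generated code differs. */
void scale_in_place(float *v, size_t n, float k)
{
    for (size_t i = 0; i < n; i++) {
        if (i + 32 < n)
            PREFETCH(&v[i + 32], 1, 1);  /* write hint, low locality */
        v[i] *= k;
    }
}
```

Since the hint may be dropped or partly ignored by the target, the code must never rely on it for correctness, only for speed.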