On Sunday, 26 January 2025 at 16:43:19 UTC, Johan wrote:
> The align(32) applies to the slice a, not the contents of a (where a points to).
Thank you, that seems to be the reason for those errors. I remember that a dynamic array can be represented as a structure with a size_t length and a pointer to a memory location, so is it that memory location we need to align, not the dynamic array itself? Even if we align the dynamic array structure, so that its address ends in five zero bits, the memory it points to is still unaligned, so do I have to align it manually? I have written this code, and it works without any errors with LDC and DMD on run.dlang.io:
import std.stdio : writeln, writefln;
import std.random : uniform01;
import core.memory : GC;
import core.simd;

// Allocate `length` elements of T whose data pointer is 32-byte aligned.
T[] initAlignedArr(T)(size_t length) {
    // Over-allocate by 32 bytes so an aligned start always fits in the block.
    auto arr = GC.malloc(T.sizeof * length + 32);
    // Step 32 bytes forward, then clear the low five bits to round down.
    return (cast(T*)(cast(size_t)(arr + 32) & ~ 0x01F))[0..length];
}

void main() {
    float[] a = initAlignedArr!float(1024);
    float[] b = initAlignedArr!float(1024);
    float[] c = initAlignedArr!float(1024);
    writeln(&a, " ", &b, " ", &c);             // addresses of the slices themselves
    writeln(a.ptr, " ", b.ptr, " ", c.ptr);    // addresses of the element data
    writeln("Filling array...");
    for (size_t i = 0; i < c.length; ++i) {
        a[i] = uniform01();
        b[i] = uniform01();
    }
    writeln("Performing arithmetics...");
    for (size_t i = 0; i < c.length; i += 8) {
        auto va = *cast(float8 *)(&a[i]);
        auto vb = *cast(float8 *)(&b[i]);
        auto vc = va * vb;
        *cast(float8 *)(&c[i]) = vc;
    }
    writeln("Checking array...");
    for (size_t i = 0; i < c.length; i += 8) {
        if (c[i] != a[i] * b[i]) {
            writefln("Value in array c is not product (i = %s): %s != %s * %s", i, c[i], a[i], b[i]);
            break;
        }
    }
}
Output:
7FFE53D6FDC0 7FFE53D6FDB0 7FFE53D6FDA0
7F90BAC35020 7F90BAC37020 7F90BAC39020
Filling array...
Performing arithmetics...
Checking array...
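The rounding in initAlignedArr can also be written so that it over-allocates by only alignment - 1 bytes and rounds up. The following is a minimal sketch under that assumption; the name alignedArray and its alignment parameter are hypothetical, not part of druntime, and it relies on the default GC keeping a block alive through an interior pointer.

```d
import core.memory : GC;

// Hypothetical generalisation of initAlignedArr: returns a slice of `length`
// elements whose data pointer is a multiple of `alignment` (a power of two).
T[] alignedArray(T)(size_t length, size_t alignment = 32)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0,
           "alignment must be a power of two");
    // Over-allocate by alignment - 1 bytes: the worst-case distance
    // from the start of the block to the next aligned address.
    void* raw = GC.malloc(T.sizeof * length + alignment - 1);
    // Round the address up to the next multiple of `alignment`.
    size_t addr = (cast(size_t) raw + alignment - 1) & ~(alignment - 1);
    return (cast(T*) addr)[0 .. length];
}

void main()
{
    auto a = alignedArray!float(1024);
    assert((cast(size_t) a.ptr & 31) == 0); // data pointer is 32-byte aligned
    a[] = 0; // GC.malloc returns uninitialised memory, so fill it before reading
    assert(a.length == 1024);
}
```

The slice returned here does not start at the beginning of the GC block, so it is best treated as fixed-size storage rather than something to append to.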
> What exactly is the error reported? An out-of-bounds read/write would not result in a segfault. (but perhaps with optimization and UB for unaligned float8 access...)
It seems the optimization level does not change the error message (run.dlang.io, LDC, with only the "-mattr=+avx" flag):
Error: /tmp/onlineapp-223f65 failed with status: -2
message: Segmentation fault (core dumped)
Without this LDC flag, there are no errors.
> Print out the pointer to a[0] to verify what the actual alignment is.
Looking at the output above, the addresses on the first line are aligned to 32 bytes, but that does not matter, because each of them is the address of the dynamic array structure (a size_t length followed by a pointer) and not of the array data itself, if I understand correctly. The addresses on the second line are aligned too, and that is the alignment that matters.
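To make the difference between the two printed lines concrete, here is a small sketch that checks both addresses. The helper isAligned is a hypothetical name introduced for illustration, not a library function.

```d
import std.stdio : writeln;

// Hypothetical helper: true if p is a multiple of `alignment` (a power of two).
bool isAligned(const(void)* p, size_t alignment)
{
    return (cast(size_t) p & (alignment - 1)) == 0;
}

void main()
{
    float[] a = new float[1024];

    // A slice is a (length, pointer) pair, so it is two machine words in size.
    static assert((float[]).sizeof == size_t.sizeof + (float*).sizeof);

    // &a is where the slice descriptor lives (here, on the stack).
    writeln("slice at ", &a, ", 32-byte aligned: ", isAligned(&a, 32));
    // a.ptr is where the float elements live (here, on the GC heap).
    writeln("data at  ", a.ptr, ", 32-byte aligned: ", isAligned(a.ptr, 32));
}
```

Only the second result is relevant to a cast such as *cast(float8*)(&a[i]), because that cast reads from the element data.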
> Does it work when you create an array of float8? (float8[] a = new float8[128/8];)
No. I modified the original version of the code and the errors are the same, except for DMD with the "-mcpu=avx" flag set (the error changed to "program killed by signal 11" on run.dlang.io).
import std.stdio : writeln, writefln;
import std.random : uniform01;
import core.memory : GC;
import core.simd;

void main() {
    float8[] a = new float8[128];
    float8[] b = new float8[128];
    float8[] c = new float8[128];
    writeln(&a, " ", &b, " ", &c);             // addresses of the slices themselves
    writeln(a.ptr, " ", b.ptr, " ", c.ptr);    // addresses of the element data
    writeln("Filling array...");
    for (size_t i = 0; i < c.length; ++i) {
        // If I understand correctly, the lines below assign the same float value to all 8 lanes of a float8 (which should not matter in this test?)
        a[i] = uniform01();
        b[i] = uniform01();
    }
    writeln("Performing arithmetics...");
    for (size_t i = 0; i < c.length; ++i) {
        c[i] = a[i] * b[i];
    }
    writeln("Checking array...");
    // Each element is already a whole float8, so step by one vector at a time.
    for (size_t i = 0; i < c.length; ++i) {
        if (c[i].array != (a[i] * b[i]).array) {
            writefln("Value in array c is not product (i = %s): %s != %s * %s", i, c[i], a[i], b[i]);
            break;
        }
    }
}
Output:
7FFF602EF5A0 7FFF602EF590 7FFF602EF580
7F15CB784010 7F15CB786010 7F15CB788010
Filling array...
Error: /tmp/onlineapp-835ef2 failed with status: -2
message: Segmentation fault (core dumped)
Error: program received signal 2 (Interrupt)
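The data pointers printed in the two outputs above can be checked by hand. The sketch below just does that arithmetic; the function name misalignment is hypothetical. The pointers from initAlignedArr are multiples of 32, while the ones from new float8[128] end in 0x10 and so are only 16-byte aligned. That would be consistent with a fault on a 32-byte aligned access, although the output alone does not confirm this as the cause.

```d
// Hypothetical helper: distance of `addr` past the previous multiple of `alignment`.
ulong misalignment(ulong addr, ulong alignment)
{
    return addr % alignment;
}

void main()
{
    // Data pointers from the first program (initAlignedArr).
    immutable ulong[3] manual = [0x7F90BAC35020, 0x7F90BAC37020, 0x7F90BAC39020];
    // Data pointers from the second program (new float8[128]).
    immutable ulong[3] viaNew = [0x7F15CB784010, 0x7F15CB786010, 0x7F15CB788010];

    foreach (p; manual)
        assert(misalignment(p, 32) == 0);  // 32-byte aligned

    foreach (p; viaNew)
    {
        assert(misalignment(p, 16) == 0);  // 16-byte aligned ...
        assert(misalignment(p, 32) == 16); // ... but 16 bytes off a 32-byte boundary
    }
}
```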
> By the way, a[i], b[i] = uniform01(), uniform01(); does not do what you think it does. Rewrite to
Oh, yesterday I became a little Pythonic :)
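For reference, a short sketch of the non-Pythonic form. D has no tuple assignment, and since assignment binds tighter than the comma operator, the quoted line groups as a[i], (b[i] = uniform01()), uniform01(), which leaves a[i] untouched. The function name fillPair is hypothetical and only serves the illustration.

```d
import std.random : uniform01;

// Hypothetical helper: fill both arrays element by element,
// one assignment statement per array.
void fillPair(float[] a, float[] b)
{
    assert(a.length == b.length);
    foreach (i; 0 .. a.length)
    {
        // Two separate statements instead of `a[i], b[i] = x, y;`
        a[i] = uniform01();
        b[i] = uniform01();
    }
}

void main()
{
    auto a = new float[8];
    auto b = new float[8];
    a[] = -1; // sentinel values, so an element that was skipped would be visible
    b[] = -1;
    fillPair(a, b);
    foreach (i; 0 .. a.length)
        assert(a[i] >= 0 && b[i] >= 0); // every element of both arrays was written
}
```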