core.simd and dynamic arrays
January 26

Hello everyone. I am a complete newbie to D and to programming in general, and I can't understand why dynamic arrays can't be used in the following D code:

import std.random : uniform01;
import core.simd;

void main() {
    align(32) float[] a = new float[128];
    align(32) float[] b = new float[128];
    align(32) float[] c = new float[128];

    /* filling input arrays with random numbers in [0, 1) range */
    for (size_t i = 0; i < c.length; ++i) {
        a[i], b[i] = uniform01(), uniform01();
    }

    for (size_t i = 0; i < c.length; i += 8) {
        /* seems that segfault reason hides below */
        auto va = *cast(float8 *)(&a[i]);
        auto vb = *cast(float8 *)(&b[i]);
        auto vc = va * vb;
        *cast(float8 *)(&c[i]) = vc;
    }
}

I have tested the same code with static arrays of size 8 instead, and it worked correctly. For bigger static arrays, the code above even outperformed the element-by-element iterative version.
I'm using the LDC compiler, version 1.36.0, on an x86_64 Linux system with the "-w -O3 -mattr=+avx" compiler flags.

January 26

LDC 1.36 = 1 year old

latest version is LDC 1.40

with LDC 1.40, your code works on my computer

now my turn to ask a question:

why were you using a 1-year-old compiler version?

common sense would be to make sure you are up to date before wondering why it's broken

January 26

On Sunday, 26 January 2025 at 12:56:55 UTC, ryuukk_ wrote:

>

common sense would be to make sure you are up to date before wondering why it's broken

This is the learn forum. People are learning here. Try to be nicer, there is no need for this.

-Steve

January 26

On Sunday, 26 January 2025 at 12:56:55 UTC, ryuukk_ wrote:

>

LDC 1.36 = 1 year old

latest version is LDC 1.40

with LDC 1.40, your code works on my computer

now my turn to ask a question:

why were you using a 1-year-old compiler version?

common sense would be to make sure you are up to date before wondering why it's broken

Sorry, but I'm not using a rolling-release Linux distribution, and the only version available in the default package repositories was 1.36. I tried switching to other package repositories by changing the software sources, but none of the available package updates included an ldc entry. I will try to update the compiler to the latest available GitHub release.

January 26

On Sunday, 26 January 2025 at 12:45:11 UTC, John C. wrote:

>

I'm using LDC compiler 1.36.0 on x86_64 Linux system with "-w -O3 -mattr=+avx" compiler flags.

I have tested this code with LDC on run.dlang.io: the segmentation fault occurs only if -mattr=+avx is used. Without this flag, no errors are produced.

January 26

On Sunday, 26 January 2025 at 13:59:09 UTC, John C. wrote:

>

I have tested this code with LDC on run.dlang.io: the segmentation fault occurs only if -mattr=+avx is used. Without this flag, no errors are produced.

Actually, if I use -mcpu=avx with DMD, no error is generated. However, if this flag is not specified, an "undefined identifier float8" error occurs.

January 26

On Sunday, 26 January 2025 at 12:45:11 UTC, John C. wrote:

>

Hello everyone. I am a complete newbie to D and to programming in general, and I can't understand why dynamic arrays can't be used in the following D code:

import std.random : uniform01;
import core.simd;

void main() {
    align(32) float[] a = new float[128];
...

The align(32) applies to the slice a itself, not to the contents of a (the memory that a points to).

Some things to try:

  • What exactly is the error reported? An out-of-bounds read/write would not result in a segfault. (but perhaps with optimization and UB for unaligned float8 access...)
  • Print out the pointer to a[0] to verify what the actual alignment is.
  • Does it work when you create an array of float8? (float8[] a = new float8[128/8];)
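
A minimal sketch of the second suggestion, assuming a 64-bit target; the helper name offsetFrom is made up for illustration. It prints where the slice variable lives and where its data lives, together with each address's remainder modulo 32. Only the second remainder matters for float8 loads, and align(32) on the variable does not constrain it.

```d
import std.stdio : writefln;

/// Hypothetical helper: how many bytes p lies past the previous `alignment` boundary.
size_t offsetFrom(const(void)* p, size_t alignment)
{
    return (cast(size_t) p) % alignment;
}

void main()
{
    align(32) float[] a = new float[128];

    // &a is the address of the slice itself (length + pointer), which align(32) controls.
    writefln("&a    = %s, offset from 32-byte boundary: %s", &a, offsetFrom(&a, 32));
    // a.ptr is the address of the GC-allocated elements, which align(32) does not control.
    writefln("a.ptr = %s, offset from 32-byte boundary: %s", a.ptr, offsetFrom(a.ptr, 32));
}
```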

By the way, a[i], b[i] = uniform01(), uniform01(); does not do what you think it does. Rewrite to

a[i] = uniform01();
b[i] = uniform01();

cheers,
Johan
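
To illustrate the remark about a[i], b[i] = uniform01(), uniform01(); here is a small sketch. Assignment binds tighter than the comma operator, so that statement groups as a[i], (b[i] = uniform01()), uniform01(): a[i] is never written and keeps the default float value, NaN. The function name fillBoth is hypothetical.

```d
import std.math : isNaN;
import std.random : uniform01;

/// Hypothetical helper: fills both arrays with random values in [0, 1).
void fillBoth(float[] a, float[] b)
{
    assert(a.length == b.length);
    foreach (i; 0 .. a.length)
    {
        // One assignment per statement, so both arrays are actually written.
        a[i] = uniform01!float();
        b[i] = uniform01!float();
    }
}

void main()
{
    float[] a = new float[8];
    float[] b = new float[8];

    // Elements of a freshly allocated float[] start out as NaN.
    assert(a[0].isNaN);

    fillBoth(a, b);

    foreach (i; 0 .. a.length)
        assert(a[i] >= 0 && a[i] < 1 && b[i] >= 0 && b[i] < 1);
}
```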

January 27

On Sunday, 26 January 2025 at 16:43:19 UTC, Johan wrote:

>

The align(32) applies to the slice a, not the contents of a (where a points to).

Thank you, that seems to be the reason for those errors. I remember that a dynamic array can be represented as a structure with a size_t length and a pointer to a memory location, so is it the memory behind that pointer that needs to be aligned, not the dynamic array itself? Even if we align the dynamic array structure, we only get five zero bits at the end of its own address, while the memory it points to is still unaligned, so do I have to align it manually? I have written this code, and it works without any errors with LDC and DMD on run.dlang.io:

import std.stdio : writeln, writefln;
import std.random : uniform01;
import core.memory : GC;
import core.simd;

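// Over-allocates by 32 bytes so that a 32-byte boundary always falls inside the block.
T[] initAlignedArr(T)(size_t length) {
    auto arr = GC.malloc(T.sizeof * length + 32);
    // Clearing the low five bits of (arr + 32) rounds it down to a 32-byte boundary.
    return (cast(T*)(cast(size_t)(arr + 32) & ~ 0x01F))[0..length];
}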

void main() {
    float[] a = initAlignedArr!float(1024);
    float[] b = initAlignedArr!float(1024);
    float[] c = initAlignedArr!float(1024);

    writeln(&a, " ", &b, " ", &c);
    writeln(a.ptr, " ", b.ptr, " ", c.ptr);

    writeln("Filling array...");
    for (size_t i = 0; i < c.length; ++i) {
        a[i] = uniform01();
        b[i] = uniform01();
    }

    writeln("Performing arithmetics...");
    for (size_t i = 0; i < c.length; i += 8) {
        auto va = *cast(float8 *)(&a[i]);
        auto vb = *cast(float8 *)(&b[i]);
        auto vc = va * vb;
        *cast(float8 *)(&c[i]) = vc;
    }

    writeln("Checking array...");
    for (size_t i = 0; i < c.length; i += 8) {
        if (c[i] != a[i] * b[i]) {
            writefln("Value in array c is not product (i = %s): %s != %s * %s", i, c[i], a[i], b[i]);
            break;
        }
    }
}

Output:

7FFE53D6FDC0 7FFE53D6FDB0 7FFE53D6FDA0
7F90BAC35020 7F90BAC37020 7F90BAC39020
Filling array...
Performing arithmetics...
Checking array...
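
An alternative to rounding a GC pointer by hand, sketched under the assumption that Phobos' std.experimental.allocator is acceptable here: AlignedMallocator can hand out memory at a requested alignment. The helper names allocAligned and freeAligned are hypothetical. The memory is not GC-managed, starts out uninitialized, and has to be released explicitly.

```d
import std.experimental.allocator.mallocator : AlignedMallocator;
import std.stdio : writeln;

/// Hypothetical helper: allocates `length` elements of T aligned to `alignment` bytes.
/// The elements are uninitialized.
T[] allocAligned(T)(size_t length, uint alignment = 32)
{
    void[] raw = AlignedMallocator.instance.alignedAllocate(T.sizeof * length, alignment);
    assert(raw.ptr !is null, "aligned allocation failed");
    return cast(T[]) raw;
}

/// Hypothetical helper: releases memory obtained from allocAligned.
void freeAligned(T)(T[] arr)
{
    AlignedMallocator.instance.deallocate(cast(void[]) arr);
}

void main()
{
    float[] a = allocAligned!float(1024);
    scope (exit) freeAligned(a);

    a[] = 0.0f; // initialize before use
    writeln(a.ptr, " aligned to 32 bytes: ", (cast(size_t) a.ptr & 31) == 0);
}
```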
>

What exactly is the error reported? An out-of-bounds read/write would not result in a segfault. (but perhaps with optimization and UB for unaligned float8 access...)

It seems that the optimization level does not change the error message (run.dlang.io LDC, with only the "-mattr=+avx" flag):

Error: /tmp/onlineapp-223f65 failed with status: -2
       message: Segmentation fault (core dumped)

Without this LDC flag, there are no errors.

>

Print out the pointer to a[0] to verify what the actual alignment is.

If we look at the output above, the addresses on the first line are aligned to 32 bytes, but that does not matter, since those are the addresses of the slice structures (size_t length first, then the pointer) and not of the array data itself, if I understand correctly? The addresses on the second line are aligned too, and their alignment is what matters.

>

Does it work when you create an array of float8? (float8[] a = new float8[128/8];)

No. I have modified the original version of the code, and the errors are the same, except for DMD with the "-mcpu=avx" flag set (the error changed to "program killed by signal 11" on run.dlang.io).

import std.stdio : writeln, writefln;
import std.random : uniform01;
import core.memory : GC;
import core.simd;

void main() {
    float8[] a = new float8[128];
    float8[] b = new float8[128];
    float8[] c = new float8[128];

    writeln(&a, " ", &b, " ", &c);
    writeln(a.ptr, " ", b.ptr, " ", c.ptr);

    writeln("Filling array...");
    for (size_t i = 0; i < c.length; ++i) {
        // If I understand correctly, the lines below broadcast one float value to all 8 lanes of a float8 (which should not matter for this test?)
        a[i] = uniform01();
        b[i] = uniform01();
    }

    writeln("Performing arithmetics...");
    for (size_t i = 0; i < c.length; ++i) {
        c[i] = a[i] * b[i];
    }

    writeln("Checking array...");
    for (size_t i = 0; i < c.length; i += 8) {
        if (c[i].array != (a[i] * b[i]).array) {
            writefln("Value in array c is not product (i = %s): %s != %s * %s", i, c[i], a[i], b[i]);
            break;
        }
    }
}

Output:

7FFF602EF5A0 7FFF602EF590 7FFF602EF580
7F15CB784010 7F15CB786010 7F15CB788010
Filling array...
Error: /tmp/onlineapp-835ef2 failed with status: -2
       message: Segmentation fault (core dumped)
Error: program received signal 2 (Interrupt)
>

By the way, a[i], b[i] = uniform01(), uniform01(); does not do what you think it does. Rewrite to

Oh, yesterday I got a little Pythonic :)

January 27

On Monday, 27 January 2025 at 05:53:09 UTC, John C. wrote:

> >

Print out the pointer to a[0] to verify what the actual alignment is.

If we look at the output above, the addresses on the first line are aligned to 32 bytes

Except for the address with B (1011) as its second hex digit from the right?
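
The observation can be checked arithmetically: an address is 32-byte aligned exactly when its low five bits are zero. The sketch below applies that test to addresses copied from the outputs earlier in the thread; isAligned is a hypothetical helper name.

```d
import std.stdio : writeln;

/// Hypothetical helper: true if `address` is a multiple of `alignment` (a power of two).
bool isAligned(ulong address, ulong alignment)
{
    return (address & (alignment - 1)) == 0;
}

void main()
{
    // Addresses of the slice variables a and b from the first output.
    assert( isAligned(0x7FFE53D6FDC0UL, 32)); // ...C0 = 1100 0000
    assert(!isAligned(0x7FFE53D6FDB0UL, 32)); // ...B0 = 1011 0000, bit 4 is set
    assert( isAligned(0x7FFE53D6FDB0UL, 16)); // still 16-byte aligned

    // Data pointers: the manually aligned array vs. the plain new float8[128].
    assert( isAligned(0x7F90BAC35020UL, 32));
    assert(!isAligned(0x7F15CB784010UL, 32));

    writeln("all alignment checks passed");
}
```

By the same test, the data pointers printed for the float8[] version (ending in ...010) are only 16-byte aligned, which would be consistent with the crash right after "Filling array...".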

January 27

On Monday, 27 January 2025 at 05:57:18 UTC, John C. wrote:

>

On Monday, 27 January 2025 at 05:53:09 UTC, John C. wrote:

> >

Print out the pointer to a[0] to verify what the actual alignment is.

If we look to output above, first line addresses are aligned to 32 bytes

Except for the address with B (1011) as its second hex digit from the right?

You may use intel-intrinsics, which:

  1. guarantees that float8 is available
  2. provides an aligned malloc, _mm_malloc
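
A rough sketch of how that could look, assuming the intel-intrinsics dub package is added as a dependency. The module and function names (inteli.xmmintrin, inteli.avxintrin, _mm_malloc, _mm_free, _mm256_load_ps, _mm256_mul_ps, _mm256_store_ps) are written from memory of that package and should be checked against its documentation.

```d
// Assumption: the intel-intrinsics dub package is available to the build.
import inteli.xmmintrin : _mm_malloc, _mm_free;
import inteli.avxintrin; // assumed to provide the _mm256_* functions

void main()
{
    enum size_t n = 128;

    // _mm_malloc(size, alignment) returns memory aligned to the requested boundary.
    float* a = cast(float*) _mm_malloc(n * float.sizeof, 32);
    float* b = cast(float*) _mm_malloc(n * float.sizeof, 32);
    float* c = cast(float*) _mm_malloc(n * float.sizeof, 32);
    scope (exit)
    {
        _mm_free(a);
        _mm_free(b);
        _mm_free(c);
    }

    foreach (i; 0 .. n)
    {
        a[i] = cast(float) i;
        b[i] = 2.0f;
    }

    // Process 8 floats per iteration; aligned loads are valid because of _mm_malloc.
    for (size_t i = 0; i < n; i += 8)
    {
        auto va = _mm256_load_ps(a + i);
        auto vb = _mm256_load_ps(b + i);
        _mm256_store_ps(c + i, _mm256_mul_ps(va, vb));
    }

    foreach (i; 0 .. n)
        assert(c[i] == a[i] * b[i]);
}
```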