Array concatenation & optimisation

Index » Learn » Array concatenation & optimisation

Array concatenation & optimisation
Jul 21, 2024 IchorDev
Jul 21, 2024 Nick Treleaven
Jul 21, 2024 Nick Treleaven
Jul 21, 2024 IchorDev
Jul 22, 2024 Nick Treleaven
Jul 21, 2024 Johan
Jul 21, 2024 IchorDev
Jul 22, 2024 Quirin Schroll
Jul 22, 2024 IchorDev

July 21, 2024

Posted by IchorDev

Permalink

IchorDev

Permalink

Obviously when writing optimised code it is desirable to reduce heap allocation frequency. With that in mind, I'm used to being told by the compiler that I can't do this in @nogc code:

void assign(ref int[4] a) @nogc{
	a[] = [1,3,6,9]; //Error: array literal in `@nogc` function `assign` may cause a GC allocation
}

'may cause' a GC allocation

Does this mean that array literals are always separately allocated first, or is this usually optimised out?
For instance, will this example always allocate a new dynamic array for the array literal, and then append it to the existing one, even in optimised builds?

void append(ref int[] a){
	a ~= [5, 4, 9];
}

Obviously for a long array literal, the benefit of knowing its length upfront (and the readability) would probably outweigh the allocation; but for small array literals, is splitting them into separate concatenations going to yield faster code, or will I waste my time and screen space?

P.S. I am mostly addressing LDC2 & GDC's output, since I am aware that DMD's optimisations are usually minimal.

July 21, 2024

Re: Array concatenation & optimisation

Posted by Nick Treleaven
in reply to IchorDev

Permalink

Nick Treleaven

Posted in reply to IchorDev

Permalink

On Sunday, 21 July 2024 at 05:43:32 UTC, IchorDev wrote:

Obviously when writing optimised code it is desirable to reduce heap allocation frequency. With that in mind, I'm used to being told by the compiler that I can't do this in @nogc code:

void assign(ref int[4] a) @nogc{
	a[] = [1,3,6,9]; //Error: array literal in `@nogc` function `assign` may cause a GC allocation
}

Just to mention that if you assign to the static array it works: a = [1,3,6,9];.

> >

'may cause' a GC allocation

Does this mean that array literals are always separately allocated first, or is this usually optimised out?

My understanding is that they do not allocate if used to initialize or assign a static array. That includes passing an array literal as an argument to a static array function parameter.

A scope slice can also be initialized from an array literal in @nogc code:
https://dlang.org/changelog/2.102.0.html#dmd.scope-array-on-stack

But assigning a literal to a scope slice is not allowed in @nogc code.

For instance, will this example always allocate a new dynamic array for the array literal, and then append it to the existing one, even in optimised builds?

void append(ref int[] a){
	a ~= [5, 4, 9];
}

If there is enough spare capacity in a's allocation, no allocation will occur.

Note that concatenation always allocates:

Concatenation always creates a copy of its operands, even if one of the operands is a 0 length array

https://dlang.org/spec/arrays.html#array-concatenation

P.S. I am mostly addressing LDC2 & GDC's output, since I am aware that DMD's optimisations are usually minimal.

While people may say that on the forum, dmd's optimizer does actually do data flow analysis:
https://forum.dlang.org/post/uqhgoi$31a7$1@digitalmars.com

July 21, 2024

Re: Array concatenation & optimisation

Posted by Nick Treleaven
in reply to Nick Treleaven

Permalink

Nick Treleaven

Posted in reply to Nick Treleaven

Permalink

On Sunday, 21 July 2024 at 10:33:38 UTC, Nick Treleaven wrote:

> >

For instance, will this example always allocate a new dynamic array for the array literal, and then append it to the existing one, even in optimised builds?

void append(ref int[] a){
	a ~= [5, 4, 9];
}

If there is enough spare capacity in a's allocation, no allocation will occur.

Sorry, I see what you mean. I've compared the -vasm output (without -O) for that function and this one:

void append(ref int[] a){
    enum e = [5, 4, 9];
    a ~= e;
}

The ASM output is the same. If you use runtime value elements I'm not sure but I think there's still no heap allocation.

I'm not good at reading ASM though.

July 21, 2024

Re: Array concatenation & optimisation

Posted by Johan
in reply to IchorDev

Permalink

Johan

Posted in reply to IchorDev

Permalink

On Sunday, 21 July 2024 at 05:43:32 UTC, IchorDev wrote:

Does this mean that array literals are always separately allocated first, or is this usually optimised out?

Not always allocated, see your example below.
I don't quite know what the heuristic is for allocation or not...

For instance, will this example always allocate a new dynamic array for the array literal, and then append it to the existing one, even in optimised builds?

void append(ref int[] a){
	a ~= [5, 4, 9];
}

https://d.godbolt.org/z/sG5Kancs4

The short array is not dynamically allocated (it's allocated on the stack, or for larger arrays it will be a hidden symbol in the binary image), even at -O0 (i.e. -O was not passed).

-Johan

July 21, 2024

Re: Array concatenation & optimisation

Posted by IchorDev
in reply to Nick Treleaven

Permalink

IchorDev

Posted in reply to Nick Treleaven

Permalink

On Sunday, 21 July 2024 at 10:33:38 UTC, Nick Treleaven wrote:

Just to mention that if you assign to the static array it works: a = [1,3,6,9];.

Bonkers. array[] is meant to be 'all of array as a slice', so you'd think that's how you copy a slice to a static array, but no!

My understanding is that they do not allocate if used to initialize or assign a static array. That includes passing an array literal as an argument to a static array function parameter.

D is pretty eager to make array literals into slices; thus have I developed a general distrust for using array literals in the vicinity of static arrays.

Case and point:

A scope slice can also be initialized from an array literal in @nogc code:
https://dlang.org/changelog/2.102.0.html#dmd.scope-array-on-stack
But assigning a literal to a scope slice is not allowed in @nogc code.

If there is enough spare capacity in a's allocation, no allocation will occur.

Note that concatenation always allocates:

Concatenation always creates a copy of its operands, even if one of the operands is a 0 length array

https://dlang.org/spec/arrays.html#array-concatenation

Thank you for all this info!

> >

P.S. I am mostly addressing LDC2 & GDC's output, since I am aware that DMD's optimisations are usually minimal.

While people may say that on the forum, dmd's optimizer does actually do data flow analysis:
https://forum.dlang.org/post/uqhgoi$31a7$1@digitalmars.com

People frequently come here to complain that 'D is slow' when they're using DMD, and often not even using -O. The responses will then usually contain some version of 'don't use DMD to generate optimised code'.

July 21, 2024

Re: Array concatenation & optimisation

Posted by IchorDev
in reply to Johan

Permalink

IchorDev

Posted in reply to Johan

Permalink

On Sunday, 21 July 2024 at 15:41:50 UTC, Johan wrote:

https://d.godbolt.org/z/sG5Kancs4

The short array is not dynamically allocated (it's allocated on the stack, or for larger arrays it will be a hidden symbol in the binary image), even at -O0 (i.e. -O was not passed).

Wow thanks, that's an impressive find! I didn't know LDC was quite so clever either.

July 22, 2024

Re: Array concatenation & optimisation

Posted by Quirin Schroll
in reply to IchorDev

Permalink

Quirin Schroll

Posted in reply to IchorDev

Permalink

On Sunday, 21 July 2024 at 05:43:32 UTC, IchorDev wrote:

Obviously when writing optimised code it is desirable to reduce heap allocation frequency. With that in mind, I'm used to being told by the compiler that I can't do this in @nogc code:

void assign(ref int[4] a) @nogc{
	a[] = [1,3,6,9]; //Error: array literal in `@nogc` function `assign` may cause a GC allocation
}

'may cause' a GC allocation

Optimization is unrelated to language semantics, except for what optimizations the language semantics allow for. Even if an allocation is optimized away, if the language semantics don’t require this optimization (which means it’s no optimization actually), it must pretend it’s not happening as far as diagnostics etc. are concerned.

My mental model of array literals is that array literals have their own, internal type. Think of __arrayLiteral!(T, n). If you ask an array literal its type (e.g. using typeof), it’ll say T[]. But when you use it to initialize a static array, it simply behaves differently than a T[] because it just isn’t one. The madness about this is that even casts don’t affect the typing.

void main() @nogc
{
    int x;
    enum int[] xs = [1,2,3];
    int[4] ys = cast(int[])(xs ~ x); // good
    int[4] zs = (b ? xs : xs) ~ x; // error
}

Here, neither the explicit type int[] for xs or the cast (you can remove any subset of them) make it so that the assignment isn’t @nogc. The whole __arrayLiteral!(T, n) is after some semi-constant folding that only applies to the length. In no way is xs ~ x a compile-time constant as x is a run-time value.

However, if you use (b ? xs : xs) instead of plain xs with a run-time boolean b, the language doesn’t see it as an array with compile-time-known length, and thus requires allocation.

In your example, you’re not assigning an array literal to a static array as far as the type system is concerned. The left-hand side is a[], which has type int[]. So, as far as the type system is concerned, you assign an array literal to an int[], and that requires allocating the literal on the GC heap, rendering the function non-@nogc. If the optimizer figures out that all of this ends up just putting some values in some static array and it removes the allocation, this has no effect on whether the function is @nogc.

July 22, 2024

Re: Array concatenation & optimisation

Posted by Nick Treleaven
in reply to Nick Treleaven

Permalink

Nick Treleaven

Posted in reply to Nick Treleaven

Permalink

On Sunday, 21 July 2024 at 10:33:38 UTC, Nick Treleaven wrote:

On Sunday, 21 July 2024 at 05:43:32 UTC, IchorDev wrote:

Does this mean that array literals are always separately allocated first, or is this usually optimised out?

My understanding is that they do not allocate if used to initialize or assign a static array. That includes passing an array literal as an argument to a static array function parameter.

A scope slice can also be initialized from an array literal in @nogc code:
https://dlang.org/changelog/2.102.0.html#dmd.scope-array-on-stack

Another (related) case that doesn't heap allocate is when passing a literal as a scope parameter:

void main() @nogc {
    f([1,2]);
}

void f(scope int[]) @nogc;

I will look at adding these cases to the spec.

July 22, 2024

Re: Array concatenation & optimisation

Posted by IchorDev
in reply to Quirin Schroll

Permalink

IchorDev

Posted in reply to Quirin Schroll

Permalink

On Monday, 22 July 2024 at 12:03:33 UTC, Quirin Schroll wrote:

this has no effect on whether the function is @nogc.

That is a highly amusing (but obviously understandable) logical contradiction that I’d never considered before.
‘Sometimes you can’t use non-GC code in @nogc code.’

Top | Forum index | About this forum

Forums