December 11, 2023

On Sunday, 10 December 2023 at 18:16:05 UTC, Nick Treleaven wrote:

> You can call alloca as a default argument to a function. The memory will be allocated on the caller's stack before calling the function:
> https://github.com/ntrel/stuff/blob/master/util.d#L113C1-L131C2
>
> I've just tested and it seems it works as a constructor default argument too.

Clever!
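
For readers skimming the thread, here is a minimal sketch of the idea. It is not the code behind the link above: `stackInts` is a hypothetical helper, and the `alias len` template parameter is an assumption borrowed from the VLA example posted later in this thread, so that the default argument can name a variable that lives in the caller.

```d
import core.stdc.stdlib : alloca;

// Hypothetical helper, for illustration only.
// The default argument is evaluated at the call site, so alloca reserves
// the memory in the caller's stack frame rather than in stackInts' own frame.
int[] stackInts(alias len)(void[] mem = alloca(len * int.sizeof)[0 .. len * int.sizeof])
{
    auto ints = cast(int[]) mem; // reinterpret the raw bytes as int[]
    ints[] = 0;                  // alloca memory is uninitialized
    return ints;
}

void main()
{
    import std.stdio : writeln;

    size_t n = 4;
    int[] buf = stackInts!n(); // buf points into main's stack frame
    buf[2] = 7;
    writeln(buf);
}
```

The returned slice is only valid until the calling function returns, so it should not be stored anywhere that outlives that call.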

December 11, 2023
On Sunday, 10 December 2023 at 22:59:06 UTC, Nicholas Wilson wrote:
> Always happy to help if you're interested in looking into using dcompute.

Thank you, I'll let you know!

> Or you could use grep with `--output-ll` as noted by Johan https://github.com/ldc-developers/ldc/issues/4265#issuecomment-1376424944 although this will be with that `workaroundIssue1356` applied.

Thanks for highlighting this, as I must have forgotten. I should be able to create a CI job that checks this as part of the release. This will give us the confidence that we need.

-- Bastiaan.
December 11, 2023

On Sunday, 10 December 2023 at 15:08:05 UTC, Bastiaan Veelo wrote:

> We are looking forward to being able to safely use LDC, because tests show that it has the potential to at least double the performance.

Yes, and that's before you consider its excellent SIMD capabilities :)

December 11, 2023
On 12/6/23 17:28, Mike Parker wrote:
> 
> 
> One way to do that in D is to use `alloca`, but that's an issue because the memory it allocates has to be used in the same function that calls the `alloca`. So you can't, e.g., use `alloca` to alloc memory in a constructor, and that prevents using it in a custom array implementation. He couldn't think of a way to translate it.

There is the following trick. Not ideal since the length cannot be inferred, but this successfully injects alloca into the caller's scope.

```d
import core.stdc.stdlib:alloca;
import std.range:ElementType;
import core.lifetime:moveEmplace;

struct VLA(T,alias len){
    T[] storage;
    // The default argument is evaluated at the call site, so alloca runs in the caller's stack frame.
    this(R)(R initializer,return void[] storage=alloca(len*T.sizeof)[0..len*T.sizeof]){
        this.storage=cast(T[])storage;
        foreach(ref element;this.storage){
            assert(!initializer.empty);
            auto init=initializer.front;
            moveEmplace!T(init,element);
            initializer.popFront();
        }
    }
    ref T opIndex(size_t i)return{ return storage[i]; }
    T[] opSlice()return{ return storage; }
}

auto vla(alias len,R)(R initializer,void[] storage=alloca(len*ElementType!R.sizeof)[0..len*ElementType!R.sizeof]){
    return VLA!(ElementType!R,len)(initializer,storage);
}

void main(){
    import std.stdio,std.string,std.conv,std.range;
    int x=readln.strip.to!int;
    writeln(vla!x(2.repeat(x))[]);
}
```

December 11, 2023
On 12/11/23 20:55, Timon Gehr wrote:
> ....
> There is the following trick. Not ideal since the length cannot be inferred, but this successfully injects alloca into the caller's scope.
> 

I see Nick already brought it up.

December 11, 2023
On Monday, 11 December 2023 at 08:24:55 UTC, Bastiaan Veelo wrote:
> On Sunday, 10 December 2023 at 22:59:06 UTC, Nicholas Wilson wrote:
>> Always happy to help if you're interested in looking into using dcompute.
>
> Thank you, I'll let you know!

And please do get in touch with Bruce Carneal if you want some tips and insight with the practical and applied side of dcompute (also with auto-vectorisation) as he has used it a lot more than I have.

>> Or you could use grep with `--output-ll` as noted by Johan https://github.com/ldc-developers/ldc/issues/4265#issuecomment-1376424944 although this will be with that `workaroundIssue1356` applied.
>
> Thanks for highlighting this, as I must have forgotten. I should be able to create a CI job that checks this as part of the release. This will give us the confidence that we need.
>
> -- Bastiaan.

Cheers, I look forward to some large speed increase reports.

December 11, 2023
On Monday, 11 December 2023 at 22:04:34 UTC, Nicholas Wilson wrote:
> And please do get in touch with Bruce Carneal if you want some tips and insight with the practical and applied side of dcompute (also with auto-vectorisation) as he has used it a lot more than I have.

dcompute needs some love: https://github.com/libmir/dcompute/pull/74

> Cheers, I look forward to some large speed increase reports.

It will be amazing to see such reports.
December 12, 2023
On Monday, 11 December 2023 at 08:24:55 UTC, Bastiaan Veelo wrote:
> On Sunday, 10 December 2023 at 22:59:06 UTC, Nicholas Wilson wrote:
>> Or you could use grep with `--output-ll` as noted by Johan https://github.com/ldc-developers/ldc/issues/4265#issuecomment-1376424944 although this will be with that `workaroundIssue1356` applied.
>
> Thanks for highlighting this, as I must have forgotten. I should be able to create a CI job that checks this as part of the release. This will give us the confidence that we need.

I should note that the regex will need some updating for the most recent LLVM versions, which have opaque pointers enabled:

`ptr byval\(%[a-zA-Z_][a-zA-Z0-9_\.]*\) align`
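
As a rough sketch of how a CI job might apply this pattern without relying on grep, the following D program scans `.ll` files and prints every matching line. The function name `byvalLines` and the command-line handling are hypothetical, and what a match should mean for the release (pass or fail) is left open here.

```d
import std.regex : matchFirst, regex;

/// Returns every line of the given LLVM IR text that matches the pattern.
/// The name byvalLines is hypothetical, chosen for this sketch.
string[] byvalLines(string ir)
{
    import std.string : lineSplitter;

    // Pattern quoted from the post above; a backtick string avoids double escaping.
    auto pattern = regex(`ptr byval\(%[a-zA-Z_][a-zA-Z0-9_\.]*\) align`);
    string[] hits;
    foreach (line; ir.lineSplitter)
        if (!line.matchFirst(pattern).empty)
            hits ~= line;
    return hits;
}

void main(string[] args)
{
    import std.file : readText;
    import std.stdio : writeln;

    // Each argument is assumed to be the path of an .ll file produced with --output-ll.
    foreach (path; args[1 .. $])
        foreach (hit; byvalLines(readText(path)))
            writeln(path, ": ", hit);
}
```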

December 12, 2023

On Monday, 11 December 2023 at 19:55:38 UTC, Timon Gehr wrote:

> ... this successfully injects alloca into the caller's scope.
>
> import core.stdc.stdlib:alloca;
> import std.range:ElementType;
> import core.lifetime:moveEmplace;
>
> struct VLA(T,alias len){
>     T[] storage;
>     this(R)(R initializer,return void[] storage=alloca(len*T.sizeof)[0..len*T.sizeof]){
>         this.storage=cast(T[])storage;
>         foreach(ref element;this.storage){
>             assert(!initializer.empty);
>             auto init=initializer.front;
>             moveEmplace!T(init,element);
>             initializer.popFront();
>         }
>     }
>     ref T opIndex(size_t i)return{ return storage[i]; }
>     T[] opSlice()return{ return storage; }
> }
>
> auto vla(alias len,R)(R initializer,void[] storage=alloca(len*ElementType!R.sizeof)[0..len*ElementType!R.sizeof]){
>     return VLA!(ElementType!R,len)(initializer,storage);
> }
>
> void main(){
>     import std.stdio,std.string,std.conv,std.range;
>     int x=readln.strip.to!int;
>     writeln(vla!x(2.repeat(x))[]);
> }

You guys are great!

December 12, 2023

On Monday, 11 December 2023 at 19:55:38 UTC, Timon Gehr wrote:

> There is the following trick. Not ideal since the length cannot be inferred, but this successfully injects alloca into the caller's scope.

Wow, what a great hack - I'd never have come up with that!