Sub-optimal code generated for certain calls to functions with auto ref qualified parameters
August 10
As a follow-up to recent discussions at

https://github.com/dlang/dmd/pull/11000

I wonder if the compiler generates sub-optimal code for functions with `auto ref` parameters when l-values are passed to parameters whose type fits in a register.

For instance,

T square_nt(T)(immutable scope T x)
{
    return x*x;
}

T square_t(T)(const auto ref scope T x)
{
    return x*x;
}

@safe pure nothrow @nogc unittest
{
    long x;
    auto y = square_nt(x); // passed via register
    auto z = square_t(x); // passed via indirection
}

According to my interpretation of

https://d.godbolt.org/z/5G4Eqn

it does.
August 10
On Monday, 10 August 2020 at 11:24:36 UTC, Per Nordlöw wrote:
> T square_t(T)(const auto ref scope T x)
> {
>     return x*x;
> }

The same difference in generated code between `square_nt` and `square_t` shows up when both the parameter and the argument are `immutable`:

T square_nt(T)(immutable scope T x)
{
    return x*x;
}

T square_t(T)(immutable auto ref scope T x)
{
    return x*x;
}

@safe pure nothrow @nogc unittest
{
    immutable long x;
    auto y = square_nt(x);
    auto z = square_t(x);
}
August 10
On Monday, 10 August 2020 at 11:24:36 UTC, Per Nordlöw wrote:
> As a follow-up to recent discussions at
>
> https://github.com/dlang/dmd/pull/11000
>
> I wonder if the compiler generates sub-optimal code for functions with `auto ref` parameters when l-values are passed to parameters whose type fits in a register.

It generates code according to the specification of `auto ref`: lvalues are passed by ref and rvalues by value, regardless of the type. That isn't the ideal way to pass input parameters efficiently, hence that PR.
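
For illustration, here is a minimal sketch (not from the thread) of how that binding rule can be observed from inside the function. The helper name `passedByRef` is hypothetical; `__traits(isRef, x)` reports whether the current instantiation takes `x` by reference.

```d
// Hypothetical helper, not part of the thread: reports how the
// `auto ref` parameter was bound for this particular call.
bool passedByRef(T)(const auto ref T x)
{
    // True only in the instantiation where x is a ref parameter.
    return __traits(isRef, x);
}

@safe pure nothrow @nogc unittest
{
    long x = 3;
    assert(passedByRef(x));      // lvalue: bound by ref, although long fits in a register
    assert(!passedByRef(x + 0)); // rvalue expression: passed by value
    assert(!passedByRef(3L));    // a literal is an rvalue as well
}
```

Compiling with `-unittest` runs the checks. The point is that the choice depends only on whether the argument is an lvalue or an rvalue, never on the size of `T`.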
August 10
On Monday, 10 August 2020 at 15:40:19 UTC, kinke wrote:
> It generates code according to the specification of `auto ref`: lvalues are passed by ref and rvalues by value, regardless of the type. That isn't the ideal way to pass input parameters efficiently, hence that PR.

Thanks