Thread overview
[DIP idea] out variables
Jan 26, 2021
Q. Schroll
Jan 26, 2021
12345swordy
Jan 26, 2021
Q. Schroll
Jan 26, 2021
Tobias Pankrath
Jan 26, 2021
Max Haughton
Jan 26, 2021
Luhrel
Jan 27, 2021
Ogi
Jan 27, 2021
Max Haughton
Jan 30, 2021
Jacob Carlborg
Jan 30, 2021
Afgdr
Jan 28, 2021
Dukc
Jan 31, 2021
Q. Schroll
January 26, 2021
Main goal: Make the `out` parameter storage class live up to promises.
In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail.
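
For readers who want to see the current behaviour, here is a minimal sketch in today's D. The function names `afterOut` and `afterRef` are made up for illustration. It shows that an `out` parameter is reset to its `.init` value on function entry, while `ref` sees the caller's value, and that nothing stops the caller from reading the variable beforehand.

```d
import std.stdio : writeln;

// `out`: the parameter is reset to int.init (0) on entry, before the body runs.
void viaOut(out int value)
{
    value += 1; // always 0 + 1, whatever the caller passed in
}

// `ref`: the caller's current value is visible inside the function.
void viaRef(ref int value)
{
    value += 1;
}

// Hypothetical helpers that wrap the calls so the results are easy to check.
int afterOut(int start)
{
    int v = start;
    int copy = v; // reading v before the out-call compiles fine today
    viaOut(v);
    return v;
}

int afterRef(int start)
{
    int v = start;
    viaRef(v);
    return v;
}

void main()
{
    writeln(afterOut(41)); // 1
    writeln(afterRef(41)); // 42
}
```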

General Idea
============

The idea of an out variable is one that **must** be passed to a function in an `out` parameter position. Basic example:

    int f(out int value);
    int g(int[] value...);
    int h(out int a, out int b);

    out int x;
    // g(x); // illegal: reads x, but x is not yet initialized.
    // h(x, x); // illegal:
        // reads the second x before the initialization of first x is complete.
    f(x); // initializes x.

An `out` variable cannot be read until initialized by a function call in an `out` parameter position. Since D has exact evaluation order, it is easily determined that one usage of `x` initializes it and another in the same overall expression reads it (and not the other way around):

    out int x, y;
    /*1*/ if (h(x, y) > 0 && x < y) { .. }
    /*2*/ g(f(x), f(y), x, y);

Evaluation order says in /*1*/ that h(x, y) is executed before x and y are read for testing `x < y`.
Evaluation order says in /*2*/ that f(x) and f(y) are executed before x and y are read for passing them to g.
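
The ordering argument can be tried out in current D with ordinary variables, since the proposed `out int x;` declaration does not exist yet. This is only a sketch: `setTo`, `pack`, `argumentOrder`, and `shortCircuitOrder` are hypothetical names, and `pack` stands in for the variadic `g` above.

```d
import std.stdio : writeln;

// Writes n through the out parameter and returns a different value,
// so the two effects can be told apart in the result.
int setTo(out int value, int n)
{
    value = n;
    return n * 10;
}

int h(out int a, out int b)
{
    a = 1;
    b = 2;
    return 1;
}

int[] pack(int a, int b, int c, int d)
{
    return [a, b, c, d];
}

// Case /*2*/: arguments are evaluated left to right, so both calls
// have written x and y before x and y are read as arguments.
int[] argumentOrder()
{
    int x, y;
    return pack(setTo(x, 1), setTo(y, 2), x, y);
}

// Case /*1*/: the left operand of && runs first, so h has written
// x and y before `x < y` is evaluated.
bool shortCircuitOrder()
{
    int x, y;
    return h(x, y) > 0 && x < y;
}

void main()
{
    writeln(argumentOrder());     // [10, 20, 1, 2]
    writeln(shortCircuitOrder()); // true
}
```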

Also, multiple execution paths can lead to different initialization points:

    out int x, y, z;
    if (g(0)) { f(x); f(y); f(z); } else h(x, y);
    // x, y are initialized.
    g(x, y); // okay: x and y initialized on both branches
    g(z); // invalid: z might not be initialized.

It is always possible to initialize `out` variables using an ordinary assignment:

    out int x, y, z;
    if (g(0)) { /*as above*/ } else { h(x, y); z = 0; }
    g(z); // valid: z initialized on both branches


Templates
=========

Similar to `ref`, there will be `auto out` which infers `out` based on the arguments passed. `auto out` can be combined with `ref` (meaning pass by reference always, but if the argument is an out value, this is its initialization) and `auto ref` (meaning pass by reference if possible, and if the argument is an out value, this is its initialization; it cannot be passed by value and be initialized).

With __traits(isOut, param) one can test whether `auto out` boiled down to `out` or not.
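
`__traits(isOut, param)` is part of the proposal and does not exist today. As a rough point of comparison, current D can already query the storage class of a declared parameter through `std.traits`. The helper `isOutParam` below is a hypothetical name, and the sketch inspects function declarations rather than an `auto out` template parameter.

```d
import std.traits : ParameterStorageClass, ParameterStorageClassTuple;

int f(out int value) { value = 1; return 0; }
int g(ref int value) { return value; }
int k(int value) { return value; }

// Hypothetical helper: true if parameter i of fn is declared `out`.
enum isOutParam(alias fn, size_t i) =
    ParameterStorageClassTuple!fn[i] == ParameterStorageClass.out_;

static assert( isOutParam!(f, 0));
static assert(!isOutParam!(g, 0));
static assert(!isOutParam!(k, 0));

void main()
{
    import std.stdio : writeln;
    writeln(isOutParam!(f, 0)); // true
    writeln(isOutParam!(g, 0)); // false
}
```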

After being (potentially|definitely|?) initialized, `out` variables do not trigger `auto out` to become `out`.


In-place `out` Variables
========================

When calling a function with an `out` parameter, instead of passing an argument, a fresh variable can be declared instead:

    if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
    if (g(0) && f(out x) > 0) { g(x); } else { .. }

The type of an in-place out variable can be left out, when it can be inferred from the called function. [Clearly it can be done in some cases and clearly it cannot be in all templates. Exact rules TBD.]
In the first else branch, `x` can be used: regardless of whether `f(out int x) > 0 && x > 0` is true or false, evaluating it initializes `x`.
In the second else branch, `x` cannot be used because `x` might not be initialized if g(0) is false.
The visibility of in-place out variables is limited to the statement they're declared in. For `if` statements this encompasses both branches, but for expression statements, it only encompasses that expression:

    x = f(out a) + a; // valid
    y = f(out b);
    // y += b; // error, b not visible
    out int c;
    f(c);
    z += c; // valid

One obvious use case is functions that return a bool indicating success and deliver the result through an `out` parameter. Usually, these functions' names begin with `try`:

    if (tryParseInt(str, out x)) { use(x); }
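
For comparison, this is roughly how the same pattern has to be written in current D, where the variable must be declared before the call. `tryParseInt` is not a Phobos function; the version below is an assumed implementation on top of `std.conv.to`, and `parseOr` is a hypothetical caller.

```d
import std.conv : to, ConvException;
import std.stdio : writeln;

// Assumed implementation: returns true and fills `result` on success.
bool tryParseInt(string str, out int result)
{
    try
    {
        result = to!int(str);
        return true;
    }
    catch (ConvException)
    {
        return false; // result keeps int.init (0)
    }
}

int parseOr(string str, int fallback)
{
    int x; // today the declaration has to precede the call
    if (tryParseInt(str, x))
        return x;
    return fallback;
}

void main()
{
    writeln(parseOr("42", -1)); // 42
    writeln(parseOr("4x", -1)); // -1
}
```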

Another could be unpacking:

    out T x;
    out S y;
    tuple.unpack(x, y);
    // or
    if (tuple.unpack(out a, out b) && condition(a, b)) { .. }

What do you think? Worth it?
January 26, 2021
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> Main goal: Make the `out` parameter storage class live up to promises.
> In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail.
>
> [...]
>
> What do you think? Worth it?

`in`, `out`, and `inout` badly need reworking. There is a preview for `in`, but sadly not for the others.

-Alex
January 26, 2021
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> In-place `out` Variables
> ========================
>
> When calling a function with an `out` parameter, instead of passing an argument, a fresh variable can be declared instead:
>
>     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
>     if (g(0) && f(out x) > 0) { g(x); } else { .. }
>

I recently started using C# professionally, which has this feature already. It makes functions with out parameters so much more pleasant to use.

Many argue that we should not overload D with even more features, but I'd say that if it makes D more fun to use and it is just syntax sugar / a simple lowering, then we should consider it.
January 26, 2021
On Tuesday, 26 January 2021 at 02:44:20 UTC, 12345swordy wrote:
>> What do you think? Worth it?
>
> `in`, `out`, and `inout` badly need reworking. There is a preview for `in`, but sadly not for the others.

While in and out are opposites in a sense, inout is something completely unrelated.
For the most part, I consider `in` to be fixed. With the preview, it works exactly as one would expect it did.
On the other hand, `out` is near useless: In the current state, making `out` an alias for `ref` wouldn't be that much of a breaking change.
January 26, 2021
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> Main goal: Make the `out` parameter storage class live up to promises.
> In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail.
>
> [...]

A few thoughts,

I like the concept of out applied to lvalues to catch things being used too early.

The concept of introducing a new variable *inside* an expression sounds like a nightmare.
I think the following construct is not only easier to implement but also more generally applicable elsewhere in the language:

    if (out x; expr(x))
    {

    }

-- lowers to --

    out x;
    if (expr(x))
    {

    }


I have left out any types from the above. Deferred type inference could be very useful, but it would also have to be considered very carefully.

Also, finally, this would be yet another thing that rhymes with dataflow analysis in the core language, so it needs to be specified carefully.
January 26, 2021
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> Main goal: Make the `out` parameter storage class live up to promises.
> In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail.
>
> General Idea
> ============
>
> The idea of an out variable is one that **must** be passed to a function in an `out` parameter position.
> Basic example:
>
>     int f(out int value);
>     int g(int[] value...);
>     int h(out int a, out int b);
>
>     out int x;
>     // g(x); // illegal: reads x, but x is not yet initialized.
>     // h(x, x); // illegal:
>         // reads the second x before the initialization of first x is complete.
>     f(x); // initializes x.


I would add "the icing on the cake": since DMD would know whether an `out` variable is initialized or not, we should be able to emit a generic error like "error: variable `d` is not initialized." for this kind of code:

```
class D
{
    int x;
    void foo()
    {
    }
}

void main()
{
    D d;
    d.foo(); // error: variable `d` is not initialized.
}
```

... instead of a raw crash with signal 11.

That would clearly save some time.

>
> An `out` variable cannot be read until initialized by a function call in an `out` parameter position. Since D has exact evaluation order, it is easily determined that one usage of `x` initializes it and another in the same overall expression reads it (and not the other way around):
>
>     out int x, y;
>     /*1*/ if (h(x, y) > 0 && x < y) { .. }
>     /*2*/ g(f(x), f(y), x, y);
>
> Evaluation order says in /*1*/ that h(x, y) is executed before x and y are read for testing `x < y`.
> Evaluation order says in /*2*/ that f(x) and f(y) are executed before x and y are read for passing them to g.
>
> Also, multiple execution paths can lead to different initialization points:
>
>     out int x, y, z;
>     if (g(0)) { f(x); f(y); f(z); } else h(x, y);
>     // x, y are initialized.
>     g(x, y); // okay: x and y initialized on both branches
>     g(z); // invalid: z might not be initialized.
>
> It is always possible to initialize `out` variables using an ordinary assignment:
>
>     out int x, y, z;
>     if (g(0)) { /*as above*/ } else { h(x, y); z = 0; }
>     g(z); // valid: z initialized on both branches
>

I imagine that it will still be possible to call f()/h() with a non-`out` variable?

>
> Templates
> =========
>
> Similar to `ref`, there will be `auto out` which infers `out` based on the arguments passed.
>
> `auto out` can be combined with `ref`

`void f(T)(auto out ref T t);` ?

> (meaning pass by reference always, but if the argument is an out value, this is its initialization) and `auto ref` (meaning pass by reference if possible, and if the argument is an out value, this is its initialization; it cannot be passed by value and be initialized).
>
>
> With __traits(isOut, param) one can test whether `auto out` boiled down to `out` or not.
>

ok.

> After being (potentially|definitely|?) initialized, `out` variables do not trigger `auto out` to become `out`.
>

That doesn't make sense.

>
> In-place `out` Variables
> ========================
>
> When calling a function with an `out` parameter, instead of passing an argument, a fresh variable can be declared instead:
>
>     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
>     if (g(0) && f(out x) > 0) { g(x); } else { .. }

I don't like that idea. That makes the code more difficult to read.

>
> The type of an in-place out variable can be left out, when it can be inferred from the called function. [Clearly it can be done in some cases and clearly it cannot be in all templates. Exact rules TBD.]
> In the first else branch, `x` can be used, since regardless whether the `f(out int x) > 0 && x > 0` is true or false, evaluating it will initialize `x`.
> In the second else branch, `x` cannot be used because `x` might not be initialized if g(0) is false.
> The visibility of in-place out variables is limited to the statement they're declared in. For `if` statements this encompasses both branches, but for expression statements, it only encompasses that expression:
>
>     x = f(out a) + a; // valid
>     y = f(out b);
>     // y += b; // error, b not visible
>     out int c;
>     f(c);
>     z += c; // valid
>

Meh, I really don't like the idea of declaring a variable inside a function call's argument list.
Also, I don't think it will be easy to implement.

> One obvious use-case is functions that return a bool value indicating success and the result is an `out` parameter. Usually, these functions' names begin with try:
>
>     if (tryParseInt(str, out x)) { use(x); }
>
> Another could be unpacking:
>
>     out T x;
>     out S y;
>     tuple.unpack(x, y);
>     // or
>     if (tuple.unpack(out a, out b) && condition(a, b)) { .. }
>

Same as above.

> What do you think? Worth it?

Yes, except the `in-place`.

January 27, 2021
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> Main goal: Make the `out` parameter storage class live up to promises.
> In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail.
>
> [...]

Is there any reason to use out parameters at all instead of returning a tuple?
January 27, 2021
On Wednesday, 27 January 2021 at 09:34:36 UTC, Ogi wrote:
> On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
>> Main goal: Make the `out` parameter storage class live up to promises.
>> In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail.
>>
>> [...]
>
> Is there any reason to use out parameters at all instead of returning a tuple?

Struct ABI can mean overhead in places you don't expect.
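
To make the two styles under discussion concrete, here is a small sketch of the same operation written both ways in current D. The function names are hypothetical, and the sketch makes no claim about which version generates better code; that depends on the compiler and the target ABI.

```d
import std.stdio : writeln;
import std.typecons : Tuple;

// Style 1: return both results in a tuple (a struct under the hood).
Tuple!(int, "quotient", int, "remainder") divRemTuple(int a, int b)
{
    return typeof(return)(a / b, a % b);
}

// Style 2: return one result, deliver the other through an out parameter.
int divRemOut(int a, int b, out int remainder)
{
    remainder = a % b;
    return a / b;
}

void main()
{
    auto r = divRemTuple(17, 5);
    writeln(r.quotient, " ", r.remainder); // 3 2

    int rem;
    int quot = divRemOut(17, 5, rem);
    writeln(quot, " ", rem); // 3 2
}
```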
January 28, 2021
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> In current semantics, `out` is basically `ref` but with documented intent.

It is more, at least potentially: an optimization aid. The calling function knows that the contents of the `out` variable won't affect the result, unlike with `ref`.

> General Idea
> ============
>
> The idea of an out variable is one that **must** be passed to a function in an `out` parameter position. Basic example:
>
>     int f(out int value);
>     int g(int[] value...);
>     int h(out int a, out int b);
>
>     out int x;
>     // g(x); // illegal: reads x, but x is not yet initialized.
>     // h(x, x); // illegal:
>         // reads the second x before the initialization of first x is complete.
>     f(x); // initializes x.

I don't like this. It is going to get annoying in cases like this:

```
int f(out int, int);

int func()
{  out int x;
   if(someCond) x.f(0);
   else if(someOtherCond) x.f(1);
   return x;
}
```

What should the compiler do? It cannot know whether x can be returned uninitialized. It can issue an error just in case, and we hate to refactor code due to false alarms like that. Or it can ignore it, in which case the `out` storage class will sometimes work and sometimes silently fail. One is still going to need to void-initialize variables to be sure the default initialization is elided.


> In-place `out` Variables
> ========================
>
> When calling a function with an `out` parameter, instead of passing an argument, a fresh variable can be declared instead:
>
>     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
>     if (g(0) && f(out x) > 0) { g(x); } else { .. }
>

This, however, sounds better. I'd only drop the requirement for the caller to specify `out`, and also allow this for `ref` parameters.


January 30, 2021
On 2021-01-27 19:25, Max Haughton wrote:

> Struct ABI can mean overhead in places you don't expect

If proper tuples are built into the language, the language can invent its own ABI for that type, just like it does for arrays and delegates.

On the other hand, there are a bunch of existing C functions that encode out parameters as pointers. When declaring these in D, they can be declared with `out`, which is more descriptive and safer than a pointer. It shows the intent better.
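
As an illustration of that last point, here is a sketch that binds the C standard library function `modf` (C prototype: `double modf(double x, double *iptr);`) with an `out` parameter instead of a pointer. It assumes that an `out` parameter in an `extern (C)` declaration is passed the same way as a pointer on the target platform. The wrapper `integerPart` is a hypothetical name.

```d
import std.stdio : writeln;

// Hand-written binding; the C side sees a plain double* here.
// Assumption: `out double` is ABI-compatible with `double*` for extern (C).
extern (C) double modf(double x, out double iptr);

// Hypothetical wrapper returning only the integral part.
double integerPart(double x)
{
    double whole;
    modf(x, whole); // no address-of operator needed at the call site
    return whole;
}

void main()
{
    double whole;
    double frac = modf(3.25, whole);
    writeln(whole, " ", frac); // 3 0.25
}
```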

-- 
/Jacob Carlborg