December 05, 2014
ixid:

>> void foo(int[2]) {}
>> void bar(int[]) {}
>> void main() @nogc {
>>    foo([1, 2]s);
>>    bar([1, 2]s);
>> }
>
> That is a rather unfriendly syntax, it is the kind that degenerates into noise with other structures.

Can you show an example of the noisy code it causes?

And are you able to invent something succinct that is better?

Bye,
bearophile
December 05, 2014
On Friday, 5 December 2014 at 02:38:48 UTC, Daniel Murphy wrote:
> "H. S. Teoh via Digitalmars-d"  wrote in message news:mailman.2709.1417745546.9932.digitalmars-d@puremagic.com...
>
>> I've often pondered about the possibility of a language where the
>> compiler will analyze each module and infer any number of attributes and
>> optimization opportunities for each symbol exported by that module, and
>> this information will be saved in the object file (or some other kind of
>> interfacing file). This includes any half-compiled template bodies and
>> whatever else that can't be fully codegen'd until actual use.  The
>> attributes will include all sorts of stuff that programmers normally
>> wouldn't want to deal with -- there could be 10+ or 50+ attributes
>> representing various optimization / static checking opportunities.  Then
>> every time a module is imported by another module, the compiler never
>> goes to the source code of the imported module anymore, but it will read
>> the object (interface) file, which is fully attributed, and the saved
>> attributes will be used internally for static checking, optimization,
>> and inferring attributes for the current module.
>
> This can't be used to infer attributes that can produce errors - those attributes have to be user-visible or the errors don't make any sense.  If it's purely for optimization, then that's basically what LTO does.

We could infer attributes if not specified instead of assuming a
default, for example @nogc and a possible @gc.

---

int[] foo1(int[] bar) @nogc // function is @nogc, error if gc is used
int[] foo2(int[] bar) @gc   // function is @gc, functions that call foo2 cannot be @nogc
int[] foo3(int[] bar)       // neither @nogc nor @gc, compiler infers attribute
---
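
For comparison, the compiler already does this kind of inference for function templates, so a rough sketch of the idea with today's rules could look like the following. The names `twice` and `caller` are made up for illustration, and `@gc` from the proposal above does not exist, so it is not used here.

```d
// Hypothetical names. No attributes are written on the template;
// the compiler infers them from the body when it is instantiated.
T twice(T)(T x)
{
    return x + x;
}

// A fully attributed caller: this only compiles because twice!int
// was inferred to be @safe, pure, nothrow and @nogc.
int caller(int x) @safe pure nothrow @nogc
{
    return twice(x);
}

void main()
{
    assert(caller(21) == 42);
}
```

The suggestion above would amount to extending the same treatment to ordinary functions such as foo3.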
December 05, 2014
On Friday, 5 December 2014 at 14:10:44 UTC, bearophile wrote:
> ixid:
>
>>> void foo(int[2]) {}
>>> void bar(int[]) {}
>>> void main() @nogc {
>>>   foo([1, 2]s);
>>>   bar([1, 2]s);
>>> }
>>
>> That is a rather unfriendly syntax, it is the kind that degenerates into noise with other structures.
>
> Can you show an example of the noisy code it causes?
>
> And are you able to invent something succinct that is better?
>
> Bye,
> bearophile

[1,2].stack
stack [1,2]
@stack [1,2]
[1,2]stack
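
For reference, here is a minimal sketch of what can be written without any new literal syntax, assuming the literal is first bound to a fixed-size array. The function names `sum2` and `sumAll` are hypothetical stand-ins for foo and bar.

```d
// Hypothetical stand-ins for foo(int[2]) and bar(int[]).
int sum2(int[2] a) @nogc nothrow
{
    return a[0] + a[1];
}

int sumAll(const(int)[] a) @nogc nothrow
{
    int total = 0;
    foreach (x; a)
        total += x;
    return total;
}

void main() @nogc nothrow
{
    // The literal initialises a fixed-size array on the stack,
    // so no GC allocation is needed here.
    int[2] pair = [1, 2];
    assert(sum2(pair) == 3);
    // Slicing the stack array gives an int[] view without copying.
    assert(sumAll(pair[]) == 3);
}
```

The cost is the extra named variable, which is what the suffix or prefix forms above try to avoid.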
December 05, 2014
On 12/4/14 5:48 PM, deadalnix wrote:
> On Thursday, 4 December 2014 at 14:58:47 UTC, Steven Schveighoffer wrote:
>> "There can be at most one owner for any piece of data."
>>
>> This doesn't seem right. For GC data, the GC owns the data, that is
>> true. But for Ref-counted data, there is more than one owner, and only
>> when all the owners disown the data can it be destroyed.
>>
>
> The RC mechanism is the owner. Ownership is loosely defined in this DIP
> so that it does not close any door for future language evolution.

Well, actually the DIP is pretty rigid, it speaks only of ownership in terms of variables -- which variable owns a piece of data. It doesn't allow this kind of ownership via a concept or condition.

I would change the DIP to reflect this clarification, if that is what is intended.

-Steve
December 05, 2014
On 12/4/14 4:24 AM, Walter Bright wrote:
> http://wiki.dlang.org/DIP69
>
> Despite its length, this is a fairly simple proposal. It adds the
> missing semantics for the 'scope' storage class in order to make it
> possible to pass a reference to a function without it being possible for
> it to escape.
>
> This, among other things, makes a ref counting type practical. It also
> makes it more practical to use other storage allocation schemes than
> garbage collection.
>
> It does not make scope into a type constructor, nor a general
> type-annotation system.
>
> It does not provide an ownership system, though it would complement one.

Can we take a step back here?

I read many people's comments and I understand only about half of them.

Can someone who knows what this new feature is supposed to do give some Ali Çehreli-like description of the feature? Basically, let's strip out the *proof* in the DIP (the how it works and why we have it), and focus on how it is to be used.

I still am having a hard time wrapping my head around the benefits and when to use scope, scope ref, why I would use it. I'm worried that we are adding all this complication and it will confuse the shit out of users, to the point where they won't use it.

-Steve
December 05, 2014
There are limitations this proposal has in comparison to my original one. These limitations might of course be harmless and play no role in practice, but on the other hand, they may, so I think it's good to list them here.

Additionally I have to agree with Steven Schveighoffer: This DIP is very complicated to understand. It's not obvious how the various parts play together, why and to what degree it "works", and what the limitations are. I don't think that's only because my brain is already locked on my proposal...

1) Escape detection is limited to `ref`.

    T* evil;
    ref T func(scope ref T t, ref T u) @safe {
      return t; // Error: escaping scope ref t
      return u; // ok
      evil = &u; // Error: escaping reference
    }

vs.

    T[] evil;
    T[] func(scope T[] t, T[] u) @safe {
      return t; // Error: cannot return scope
      return u; // ok
      evil = u; // !!! not good
    }

As can be seen, `ref T u` is protected from escaping (apart from returning it), while `T[] u` in the second example is not. There's no general way to express that `u` can only be returned from the function, but will not be retained otherwise by storing it in a global variable. Adding `pure` can express this in many cases, but is, of course, not always possible.

Another workaround is passing the parameters as `ref`, but this would introduce an additional indirection and has different semantics (e.g. when the lengths of the slices are modified).

2) `scope ref` return values cannot be stored.

    scope ref int foo();
    void bar(scope ref int a);

    foo().bar();        // allowed
    scope tmp = foo();  // not allowed
    tmp.bar();

Another example:

    struct Container(T) {
        scope ref T opIndex(size_t index);
    }

    void bar(scope ref int a);

    Container!int c;
    bar(c[42]);            // ok
    scope ref tmp = c[42]; // nope

Both cases should be fine theoretically; the "real" owner lives longer than `tmp`. Unfortunately the compiler doesn't know about this.

Both restrictions 1) and 2) are because there are no explicit lifetime/owner designations (the scope!identifier thingy in my proposal).

3) `scope` cannot be used for value types.

I can think of a few use cases for scoped value types (RC and file descriptors), but they might only be marginal.

4) No overloading on `scope`.

This is at least partially a consequence of `scope` inference. I think overloading can be made to work in the presence of inference, but I haven't thought it through.

5) `scope` is a storage class.

Manu complained about `ref` being a storage class. If I understand him right, one reason is that we have a large toolkit for dealing with type modifiers, but almost nothing for storage classes. I have to agree with him there. But I haven't understood his point fully, maybe he himself can post more about his problems with this?

6) There seem to be problems with chaining.

    scope ref int foo();
    scope ref int bar1(ref int a) {
        return a;
    }
    scope ref int bar2(scope ref int a) {
        return a;
    }
    ref int bar3(ref int a) {
        return a;
    }
    ref int bar4(scope ref int a) {
        return a;
    }
    void baz(scope ref int a);

Which of the following calls would work?

    foo().bar1().baz();
    foo().bar2().baz();
    foo().bar3().baz();
    foo().bar4().baz();

I'm not sure I understand this fully yet, but it could be that none of them work...
December 05, 2014
On 12/5/2014 7:27 AM, Steven Schveighoffer wrote:
> Can someone who knows what this new feature is supposed to do give some Ali
> Çehreli-like description of the feature? Basically, let's strip out the *proof*
> in the DIP (the how it works and why we have it), and focus on how it is to be
> used.
>
> I still am having a hard time wrapping my head around the benefits and when to
> use scope, scope ref, why I would use it. I'm worried that we are adding all
> this complication and it will confuse the shit out of users, to the point where
> they won't use it.

The tl;dr version is that when a declaration is tagged with 'scope', the contents of that variable will not escape the lifetime of that declaration.

It means that this code will be safe:

   void foo(scope int* p);

   p = malloc(n);
   foo(p);
   free(p);

The rest is all the nuts and bolts of making that work.
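
Spelled out as a complete program, the same idea might look like the sketch below. The function name `fill` is hypothetical, and how strictly the compiler checks the 'scope' promise depends on the implementation; the sketch only shows the intended usage.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical callee: 'scope' promises that p is not stored
// anywhere that outlives the call.
void fill(scope int* p, int value) @nogc nothrow
{
    *p = value;
}

void main() @nogc nothrow
{
    auto p = cast(int*) malloc(int.sizeof);
    if (p is null)
        return;        // allocation failed, nothing to do
    fill(p, 42);
    assert(*p == 42);
    free(p);           // fine only because fill kept no copy of p
}
```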

December 05, 2014
On 12/4/2014 6:56 PM, deadalnix wrote:
> On Friday, 5 December 2014 at 00:32:32 UTC, Walter Bright wrote:
>> On 12/4/2014 3:04 PM, deadalnix wrote:
>>> So as mentioned, there are various problems with this DIP:
>>>  - rvalues are defined as having a scope that goes to the end of the statement.
>>> That means they can never be assigned to anything as per spec.
>>
>> I don't believe this is correct. Rvalues can be assigned, just like:
>>
>>    __gshared int x;
>>    { int i; x = i; }
>>
>> i's scope ends at the } but it can still be assigned to x.
>>
>
> It works even better when i has indirections.

I understand what you're driving at, but only a scoped rvalue would not be copyable.


>>>  - It adds more special casing with & (as if it wasn't enough of a mess with
>>> @property, optional () and the fact that functions aren't first class). For
>>> instance *e has infinite lifetime when &(*e) is lifetime(e).
>>
>> That's right. I know you're worried about that, but I still don't see it as an
>> actual problem. (The optimizer makes use of this special case all the time.)
> Yes, this is the job of the optimizer to do this kind of stunt.
> Not the semantic analysis.

I don't see any other way, nor do I see the practical problem.


>> I originally had scope only apply to ref, but that made having scoped classes
>> impossible.
>>
>
> Promoting scoped class on stack is an ownership problem, and out
> of scope (!). It makes sense to allow it as an optimization.
>
> Problem is, lifetime goes to infinite after indirection, so I'm
> not sure what the guarantee is.

The guarantee is there will be no references to the class instance after the scoped class goes out of scope.


>>> During discussion, I proposed to differentiate lifetime calculation between
>>> lvalues and rvalues (which are inherently different beasts with different
>>> lifetime) and carry (or not) the scope flag with each expression.
>>
>> I'm not sure how that would be different from the DIP as it stands now.
>
> It causes everything reached through the view to be scope and
> obviates the need for things like &(*e) having special meaning.

Are you suggesting transitive scope?
December 05, 2014
On 12/4/2014 1:32 PM, Steven Schveighoffer wrote:
> On 12/4/14 3:58 PM, Walter Bright wrote:
>> On 12/4/2014 7:25 AM, Steven Schveighoffer wrote:
>>> int* bar(scope int*);
>>> scope int* foo();
>>>
>>> bar(foo());           // Ok, lifetime(foo()) > lifetime(bar())
>>>
>>> I'm trying to understand how foo can be implemented in any case. It
>>> has no scope
>>> ints to return, so where does it get the int from?
>>
>> Could be from a global variable. Or a new'd value.
>
> Well, OK, but why do that?

Why would a programmer do that? I often ask that question! But the language allows it, therefore we must support it.


>>> I don't see where the proposal defines what exactly can be returned
>>> via scope.
>>
>> The scope return value does not affect what can be returned. It affects
>> how that return value can be used. I.e. the return value cannot be used
>> in such a way that it escapes the lifetime of the expression.
>
> I assumed the scope return was so you could do things like:
>
> scope int *foo(scope int *x)
> {
>     return x;
> }
>
> which would be fine, I assume, right?

No. A scope parameter means the value does not escape the function. That means you can't return it.


> My question was about how this kind of allows declaring a ref variable in the
> middle of a function, which was never allowed before.

There's no technical reason it is disallowed - it's just that I didn't see a point to it.

December 05, 2014
On 12/5/2014 8:48 AM, "Marc Schütz" <schuetzm@gmx.net> wrote:
> There are limitations this proposal has in comparison to my original one. These
> limitations might of course be harmless and play no role in practice, but on the
> other hand, they may, so I think it's good to list them here.

Good idea. Certainly, this is less powerful than your proposal. The question, obviously, is what is good enough to get the job done. By "the job", I mean reference counting, migrating many allocations to RAII (meaning the stack), and eliminating a lot of closure GC allocations.


> Additionally I have to agree with Steven Schveighoffer: This DIP is very
> complicated to understand. It's not obvious how the various parts play together,
> why and to what degree it "works", and what the limitations are. I don't think
> that's only because my brain is already locked on my proposal...

I'm still looking for an easier way to explain it. The good news in this, however, is that if it is correctly implemented, the compiler should be a big help in using scope correctly.


> 1) Escape detection is limited to `ref`.
>
>      T* evil;
>      ref T func(scope ref T t, ref T u) @safe {
>        return t; // Error: escaping scope ref t
>        return u; // ok
>        evil = &u; // Error: escaping reference
>      }
>
> vs.
>
>      T[] evil;
>      T[] func(scope T[] t, T[] u) @safe {
>        return t; // Error: cannot return scope
>        return u; // ok
>        evil = u; // !!! not good

right, although:
     evil = t;  // Error: not allowed

>      }
>
> As can be seen, `ref T u` is protected from escaping (apart from returning it),
> while `T[] u` in the second example is not. There's no general way to express
> that `u` can only be returned from the function, but will not be retained
> otherwise by storing it in a global variable. Adding `pure` can express this in
> many cases, but is, of course, not always possible.

As you point out, 'ref' is designed for this.


> Another workaround is passing the parameters as `ref`, but this would introduce
> an additional indirection and has different semantics (e.g. when the lengths of
> the slices are modified).
>
> 2) `scope ref` return values cannot be stored.
>
>      scope ref int foo();
>      void bar(scope ref int a);
>
>      foo().bar();        // allowed
>      scope tmp = foo();  // not allowed
>      tmp.bar();

Right


> Another example:
>
>      struct Container(T) {
>          scope ref T opIndex(size_t index);
>      }
>
>      void bar(scope ref int a);
>
>      Container!int c;
>      bar(c[42]);            // ok
>      scope ref tmp = c[42]; // nope
>
> Both cases should be fine theoretically; the "real" owner lives longer than
> `tmp`. Unfortunately the compiler doesn't know about this.

Right, though the compiler can optimize to produce the equivalent.

> Both restrictions 1) and 2) are because there are no explicit lifetime/owner
> designations (the scope!identifier thingy in my proposal).
>
> 3) `scope` cannot be used for value types.
>
> I can think of a few use cases for scoped value types (RC and file descriptors),
> but they might only be marginal.

I suspect that one can encapsulate such values in a struct where access to them is strictly controlled.


> 4) No overloading on `scope`.
>
> This is at least partially a consequence of `scope` inference. I think
> overloading can be made to work in the presence of inference, but I haven't
> thought it through.

Right. Different overloads can have different semantic implementations, so what should inference do? I also suspect it is bad style to overload on 'scope'.


> 5) `scope` is a storage class.
>
> Manu complained about `ref` being a storage class. If I understand him right,
> one reason is that we have a large toolkit for dealing with type modifiers, but
> almost nothing for storage classes. I have to agree with him there. But I
> haven't understood his point fully, maybe he himself can post more about his
> problems with this?

I didn't fully understand Manu's issue, but it was about 'ref' not being inferred by template type deduction. I didn't understand why 'auto ref' did not work for him. I got the impression that he was trying to program in D the same way he'd do things in C++, and that's where the trouble came in.


> 6) There seem to be problems with chaining.
>
>      scope ref int foo();
>      scope ref int bar1(ref int a) {
>          return a;
>      }
>      scope ref int bar2(scope ref int a) {
>          return a;
>      }
>      ref int bar3(ref int a) {
>          return a;
>      }
>      ref int bar4(scope ref int a) {
>          return a;
>      }
>      void baz(scope ref int a);
>
> Which of the following calls would work?
>
>      foo().bar1().baz();

yes

>      foo().bar2().baz();

no - cannot return scope ref parameter

>      foo().bar3().baz();

yes

>      foo().bar4().baz();

no, cannot return scope ref parameter


> I'm not sure I understand this fully yet, but it could be that none of them work...

Well, you're half right :-)