September 04, 2012
On Monday, 3 September 2012 at 21:18:28 UTC, Carl Sturtivant wrote:
> On Monday, 3 September 2012 at 12:12:46 UTC, monarch_dodra wrote:
>> [SNIP]
>
> In a nutshell, I think you're broadly saying that you want to program with a struct S the same way whether it's stack or heap allocated. (Good code reuse, and no duplication of semantics with -> as in C++.) From this perspective the trouble is that "S()" and "new S()" don't have the same effect except for allocating one on the stack and one on the heap, and the language forbids you from overcoming this via reference variables, except by calling a function and passing "*r" to a "ref S" parameter.
>
> [SNIP]

Yeah, in a nutshell, that is pretty much it. There are several ways to "work around it", but, IMO, none are good enough:
*Function with ref: Too intrusive, especially for more complicated functions.
*Structs that implicitly alias this (such as RefCounted): I'm not a huge fan of RefCounted itself, since I don't see why I'd pay for RAII when I have a GC. As for the rest, they are not in the library, and I wouldn't want to roll my own.
**Furthermore, these wrapper structs have a way of "tainting" the type system: When you pass your struct to a template, the template will instantiate on your struct itself, and not on the wrapped type.
*structs with explicit dereference (those that have "get", for instance): That's just trading "*" for "get".

For now, I'll just (*s) it. It isn't broken or anything...
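
For reference, a minimal sketch of what "function with ref" and "(*s)" look like in practice. The struct S and the functions bump, addTo and demo are made up for illustration; they are not from my actual code:

```d
import std.stdio : writeln;

// Hypothetical struct, standing in for the "very big struct" in the thread.
struct S
{
    int value;

    void bump() { ++value; }   // member function: callable on S and on S*
}

// Free function taking the struct by reference, so no copy is made.
void addTo(ref S s, int n) { s.value += n; }

// Mutates a heap-allocated S through a pointer and returns the final value.
int demo()
{
    S* p = new S;      // GC-allocated
    p.bump();          // member call: the pointer is dereferenced implicitly
    addTo(*p, 10);     // ref parameter: explicit dereference required
    (*p).addTo(5);     // UFCS form, again with an explicit dereference
    return p.value;
}

void main()
{
    writeln(demo());   // prints 16
}
```

The member call needs no "*", while both calls to the free function do, which is the asymmetry being discussed.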

----

What I regret though, is that since D is Garbage Collected, it is just screaming to be able to write:
----
S& val = *(new S);
val.doSomething();
doSomething(val);
...
----
This is legal in C++, but it leaks* :/ D should be able to harness such expressiveness with no problems whatsoever though.

*Actually, I've done this in C++ for classes that have hefty attributes that need to be allocated, but otherwise don't need pointer functionalities. You just have to make sure to correctly implement the CC to avoid aliasing, and to "delete &val;" in the destructor. Once you've done this though, then for all intents and purposes, "val" is a value attribute. Nifty.
September 04, 2012
On Tuesday, 4 September 2012 at 02:42:42 UTC, Jonathan M Davis
wrote:
> On Monday, September 03, 2012 14:13:10 monarch_dodra wrote:
>> I was playing around with a very big struct, and told myself I
>> wanted it allocated on the heap. This meant I was now
>> manipulating S* instead of an S.
> [snip]
>
> Enhancement Request:
>
> http://d.puremagic.com/issues/show_bug.cgi?id=8616
>
> - Jonathan M Davis

That is a very well worded ER. Thank you.
September 04, 2012
On 09/03/12 20:45, Jonathan M Davis wrote:
>
> It's a  perfectly valid enhancement request to want
> 
> void func(S s, int i) {...}
> 
> to be callable with S*, given that normally function calls on an S* don't require you to dereference anything.

No, it's not. See http://d.puremagic.com/issues/show_bug.cgi?id=8490 for why this would be a very bad idea. However 'void func(ref S s, ...){}' should (be made to) work, at least for the UFCS case - if it already doesn't (old compiler here...) - if 'func' is supposed to emulate a method then it should behave like one.

artur
September 04, 2012
On 09/03/12 23:19, Carl Sturtivant wrote:
> So I'm wondering if a language extension along the following lines would solve your problem, simply asking the compiler to use heap allocation when the variable is declared, e.g.
> 
>  @heap auto s = S(); //secretly allocated with new, but used as if local
> 
> where the compiler would know that s is valid, just as in a normal declaration. And the compiler would automatically deallocate the variable's heap storage when it goes out of scope. Otherwise the variable would behave like a normal local variable as far as inference of any kind made by the compiler in a wider context goes.

It could be part of the type (annotating the object/instance isn't really any better than just using 'new'...), so

   struct S {
      new this(/*...*/) {/*...*/}
      /*...*/
   }

would let you skip the 'new' keyword. Might be useful when used with templates
(and a 'scope this(){}' ctor could ensure stack allocation). But as right now
struct allocations aren't properly supported yet (no custom allocators) adding
something like that shouldn't have high priority, even if it can be done w/o
breaking existing code (which does not use this feature).

artur
September 04, 2012
On Tuesday, September 04, 2012 12:51:53 Artur Skawina wrote:
> On 09/03/12 20:45, Jonathan M Davis wrote:
> > It's a perfectly valid enhancement request to want
> > 
> > void func(S s, int i) {...}
> > 
> > to be callable with S*, given that normally function calls on an S* don't require you to dereference anything.
> 
> No, it's not. See http://d.puremagic.com/issues/show_bug.cgi?id=8490 for why this would be a very bad idea.

I completely disagree with that assessment. You already get copies with UFCS all over the place. It's just that if you could use a function which took the struct as a value with a pointer to the struct with UFCS, then you'd get a copy whereas if it's taking it by pointer, you wouldn't.

> However 'void func(ref S s, ...){}' should
> (be made to) work, at least for the UFCS case - if it already doesn't (old
> compiler here...)

I completely disagree with this as well. If using UFCS with an S* and a function which takes an S works, then it should work with a function which takes ref S, but pointers are _not_ the same as ref at all, and I completely disagree with anything which tries to make pointers convert to ref in the general case. It only makes sense in this particular case, because of how calling member functions on pointers to struct works, and in that case, the ref is irrelevant IMHO. Having functions take ref is _annoying_, because you can't pass rvalues to them, and so ref should be used sparingly.

I don't see any real difference between having a function which takes an S being used with UFCS and having that same function used with an S*. A copy occurs in both cases. It's just that with S*, it means that the compiler has to implicitly dereference it for you (as it already does when accessing the struct's members). Other than that, the semantics are identical.

> - if 'func' is supposed to emulate a method then it
> should behave like one.

Which is exactly the point of this enhancement request. If you call a member function on a struct pointer, you don't need to dereference anything or really care that it's a pointer, but with UFCS, all of a sudden you do, which breaks the abstraction that UFCS is trying to provide.

- Jonathan M Davis
September 04, 2012
On Tuesday, 4 September 2012 at 10:51:36 UTC, Artur Skawina wrote:
> On 09/03/12 20:45, Jonathan M Davis wrote:
>>
>> It's a  perfectly valid enhancement request to want
>> 
>> void func(S s, int i) {...}
>> 
>> to be callable with S*, given that normally function calls on an S* don't require you to dereference anything.
>
> No, it's not. See http://d.puremagic.com/issues/show_bug.cgi?id=8490 for why this would be a very bad idea. However 'void func(ref S s, ...){}' should (be made to) work, at least for the UFCS case - if it already doesn't (old compiler here...) - if 'func' is supposed to emulate a method then it should behave like one.

 Hmmm... And here I consider the opposite true. With ref as it works now, you can only pass 'live' (absolutely guaranteed to be allocated) variables; you can't guarantee that with a normal pointer.

 On the other hand. Let's consider the two calls.

  struct S {
    int x, y;
    int gety(){ return y;}
  }

  int getx(S s) {return s.x;}

  S ss;
  S *sp = &ss; //without errors for now

  writeln(ss.gety());
  writeln(sp.gety()); /*pointer but as it currently works,
                        the same (technically) */

  writeln(ss.getx());
  writeln(sp.getx()); //currently breaks without dereferencing

  If getx is valid with a pointer, it will be more consistent (and doesn't change anything). Now let's void it.

  sp = null;

  writeln(sp.gety()); //null pointer, refers to calling site
  writeln(sp.getx()); /*null pointer, breaks during copying and
                        should refer to calling site*/

 This I think is totally acceptable since it would be the same error and the same reason. If we take ref and allow that, you may as well allow referencing variables (and not just in the function signature)

  struct S {
    //may as well allow during building of struct,
    //it's just as safe as the rest of the ref calls now.
    ref S prev;  //defaults to null
    this (ref S s) {prev = s;}

    int x;
  }

  int getx(ref S s) {return s.x;}

  //now breaks inside getx
  writeln(sp.getx());
  writeln(getx(sp));

  //might compile?; More likely it's no longer a named variable
  writeln((*sp).getx());
  writeln(getx(*sp));

  //might compile, but now it's old C/C++ pointers. Very risky?
  writeln(getx(sp[0]));

 I ask you, how do you check if it's a null pointer? &s?

  int getx(ref S s)
  //How does this make sense?? it looks wrong and is misleading
  in {assert(&s); }
  body {return s.x; }

 More importantly, if it's now a possibility do we have to start adding checks everywhere?

  int getx(S *s)
  in {assert(s); } //perfectly acceptable check, we know it's a pointer
  body {return s.x; }
September 05, 2012
On Monday, 3 September 2012 at 12:12:46 UTC, monarch_dodra wrote:
> I was playing around with a very big struct, and told myself I wanted it allocated on the heap. This meant I was now manipulating S* instead of an S.

I've been extensively trying things out since I first posted this. I'd like to give some feedback:

-------------------------------------
First off, working with an S* has been a complete failure. The structure I was working with is several K. The problem is that come the first function call, my pointer is dereferenced and passed by value to the function, and then everything grinds to a halt.

The approach works well in C++, because everything is passed by reference. D, on the other hand, which promotes *not* having copy constructors, also promotes pass-by-value, which is completely incompatible. I like D's approach, but it also means having to shift the way I design my patterns.

-------------------------------------
The conclusion here is that the pointer must indeed be wrapped in some sort of structure, that has cheap copy. Things like RefCounted are actually *OK*, but as I was trying to write a "Reference" wrapper, I realized both have one *Major* flaw: Calling functions that return new objects, such as dup, save, opBinary etc... will leak the new object out of the "ReferenceType Wrapper" :/

On the other hand, I took my original S structure, and re-built it with an internal "Payload", and gave it reference semantics. *THAT* worked like a *CHARM* !!!
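
Roughly, the shape of that rebuild looks like the sketch below. This is only an illustration under my own assumptions: the names Big, Payload, create, set, get, dup and touch are hypothetical, and the real structure is of course much bigger.

```d
import std.stdio : writeln;

// Hypothetical names throughout: Big, Payload, create, set, get, dup, touch.
struct Big
{
    private static struct Payload
    {
        int[1024] data;   // stands in for "several K" of state
    }

    private Payload* p;   // the only field, so copying a Big copies one pointer

    // Factory function, since D structs cannot define a no-argument constructor.
    static Big create()
    {
        Big b;
        b.p = new Payload;   // GC-allocated payload
        return b;
    }

    void set(size_t i, int v) { p.data[i] = v; }
    int get(size_t i) { return p.data[i]; }

    // Deep copy that returns another Big, so no raw Payload escapes.
    Big dup()
    {
        Big b = create();
        b.p.data[] = p.data[];   // element-wise copy of the array contents
        return b;
    }
}

// Takes Big by value: cheap, and it still refers to the caller's payload.
void touch(Big b) { b.set(0, 42); }

void main()
{
    auto a = Big.create();
    touch(a);                 // modifies a's payload through the copy
    auto c = a.dup();         // independent payload
    c.set(0, 7);
    writeln(a.get(0), " ", c.get(0));   // prints "42 7"
}
```

Because dup returns the wrapper type itself, the "leak out of the wrapper" problem described above does not arise in this layout.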

-------------------------------------
Regarding the first Enhancement Request, I now think it is a "Bad Idea ®" : Not because of any null pointer problems (IMO, that is actually a non-issue), but because of the implicit cast from S* to S. The thing with member functions is that they _Truly_Do_ take a pointer. The operator "." is not actually a "convenience dereference". When you think about it, it is actually the [(*p).member|p->member] syntax which is strange (All the way back to C, yes): why dereference an object, if behind the scenes, all you do is pass the address?

I mean:

struct S
{
    void foo();
}
void bar(S* p);

void main()
{
    S* p = new S;
    p.foo();
    p.bar();
    S s;
    (&s).foo();
    (&s).bar();
}

When you think about it, *that* makes a lot of sense (to me), and UFCS works correctly with it.

September 05, 2012
On 09/04/12 20:03, Jonathan M Davis wrote:
> On Tuesday, September 04, 2012 12:51:53 Artur Skawina wrote:
>> On 09/03/12 20:45, Jonathan M Davis wrote:
>>> It's a perfectly valid enhancement request to want
>>>
>>> void func(S s, int i) {...}
>>>
>>> to be callable with S*, given that normally function calls on an S* don't require you to dereference anything.
>>
>> No, it's not. See http://d.puremagic.com/issues/show_bug.cgi?id=8490 for why this would be a very bad idea.
> 
> I completely disagree with that assessment. You already get copies with UFCS all over the place.

The fact that something is already broken is not an argument for making things
even worse. As often in D, UFCS was done w/o properly thinking things through;
it should only work with free functions that take a ref argument.
Yes, accepting a value-passed arg is not technically "broken", but it /is/
unsound.

   struct S { int i; }
   ref other_func_expecting_an_S_ref(ref S s) { return s; }
   int* whatever;

   auto f1(S s) { s.i = 42; }
   auto f2(S s) { whatever = &s.i; }
   auto f3(S s) { return other_func_expecting_an_S_ref(s); }

are just a few types of bugs that will appear over and over again. If a
UFCS "method" wants to work on a copy, it can create one (or just call another
non-ref-arg-non-UFCS function). Allowing the above means those bugs are not even
warned about by the compiler and much programmer time will be wasted looking for
the one place where someone forgot to add a 'ref' keyword while writing or
modifying the code (w/o realizing that 's' is a private copy and that the code
has been working previously only by luck).
And yes - enforcing the only-ref-arg rule would probably restrict UFCS a bit, in
that things that can be chained right now might no longer be usable that way, but
it's just a matter of fixing them, possibly by allowing annotations that
explicitly enable the call-by-value semantics for the type and/or function.
The defaults should however be safe.

Not introducing unsound features is easier than later removing them.
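
To make the 'f1' case concrete, here is a small sketch (the function names are made up for this illustration) where a UFCS call on a by-value parameter silently modifies a copy:

```d
import std.stdio : writeln;

struct S { int i; }

// Takes a copy: the assignment is lost when the function returns.
void setByValue(S s) { s.i = 42; }

// Takes a reference: the caller's object is modified.
void setByRef(ref S s) { s.i = 42; }

// Both helpers use the same "method-like" UFCS call syntax.
int afterByValue() { S s; s.setByValue(); return s.i; }
int afterByRef()   { S s; s.setByRef();   return s.i; }

void main()
{
    writeln(afterByValue());   // prints 0
    writeln(afterByRef());     // prints 42
}
```

The two call sites look identical; only the missing 'ref' in the signature decides whether the caller sees the change.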


> It's just that if you could use a function which took the struct as a value with a pointer to the struct with UFCS, then you'd get a copy whereas if it's taking it by pointer, you wouldn't.

   S* s;
   auto f(S* sp) {/*...*/};
   s.f();

behaves as expected from an external POV, so there is no problem. Mistakenly causing a copy is not likely, so implementation bugs are not an issue either.

   S* s;
   auto f(S s) {/*...*/};
   s.f();

*would* cause problems both for the 'f()' implementer /and/ user.


>> However 'void func(ref S s, ...){}' should
>> (be made to) work, at least for the UFCS case - if it already doesn't (old
>> compiler here...)
> 
> I completely disagree with this as well. If using UFCS with an S* and a function which takes an S works, then it should work with a function which takes ref S, but pointers are _not_ the same as ref at all, and I completely disagree with anything which tries to make pointers convert to ref in the general case. It only makes sense in this particular case, because of how calling member functions on pointers to struct works, and in that case, the ref is irrelevant IMHO. Having functions take ref is _annoying_, because you can't pass rvalues to them, and so ref should be used sparingly.

Umm, UFCS *is* about calling "member functions" - this is exactly why UFCS
"methods" should be treated just like "normal" ones.
Allowing pass-this-by-value is an /extension/, which brings more harm than good.


> I don't see any real difference between having a function which takes an S being used with UFCS and having that same function used with an S*. A copy occurs in both cases. It's just that with S*, it means that the compiler has

People working with pointer-to-structs are not going to expect that

   S* s;
   // ...
   s.method();

makes a copy of '*s' and operates on that. It's not how methods normally work; it's way too easy for this to happen by mistake.


> to implicitly dereference it for you (as it already does when accessing the struct's members). Other than that, the semantics are identical.
> 
>> - if 'func' is supposed to emulate a method then it
>> should behave like one.
> 
> Which is exactly the point of this enhancement request. If you call a member function on a struct pointer, you don't need to dereference anything or really care that it's a pointer, but with UFCS, all of a sudden you do, which breaks the abstraction that UFCS is trying to provide.

If you call a member function on a struct pointer with UFCS all of a sudden you need to care if it's a free function or a real method, because in the first case the struct might be copied implicitly, and you get no help from the compiler to detect such issues. That is (would be) the problem.


On 09/04/12 20:19, Era Scarecrow wrote:
>  I ask you, how do you check if it's a null pointer? &s?

Yes, obviously. If you need to do that manually.

>   int getx(ref S s)
>   //How does this make sense?? it looks wrong and is misleading
>   in {assert(&s); }
>   body {return s.x; }

It looks correct and is perfectly obvious. But see below - you don't need to do this manually - the compiler does it for you when calling methods and could handle the UFCS case too.

>  More importantly, if it's now a possibility do we have to start adding checks everywhere?
> 
>   int getx(S *s)
>   in {assert(s); } //perfectly acceptable check, we know it's a pointer
>   body {return s.x; }


   struct S { int i; auto f() { return i; } }
   int main() {
      S* s;
      return s.f();
   }

This program will assert at runtime (and (correctly) segfault in a release-build). The compiler inserts not-null-this checks for "normal" methods in non-release mode, and could also do that before invoking any UFCS "method". So you wouldn't need to check for '!!&this' yourself.


The problem w/ these checks is that they can not be disabled per-type - which
prevents certain valid uses. The compiler-inserted assertions fire also when the
methods can deal with null-this themselves (happens eg. when dealing with 'C' APIs
and libraries, when you want to keep the C part as unmodified as possible).
But that is a different issue.

artur
September 05, 2012
On Wednesday, 5 September 2012 at 11:01:50 UTC, Artur Skawina wrote:
> On 09/04/12 20:19, Era Scarecrow wrote:
>>  I ask you, how do you check if it's a null pointer? &s?
>
> Yes, obviously. If you need to do that manually.

 But you shouldn't have to.

>>   int getx(ref S s)
>>   //How does this make sense?? it looks wrong and is misleading
>>   in {assert(&s); }
>>   body {return s.x; }
>
> It looks correct and is perfectly obvious. But see below - you don't need to do this manually - the compiler does it for you when calling methods and could handle the UFCS case too.

 The way I understand references in D (and perhaps I'm repeating myself in two different topics) is that they point to live variables (i.e. guaranteed pointers), and remain that way until you return from that function.

 This is entirely valid and simplifies things. Remember in D we want the language to 'do the right thing', but if you make references where it works to 'sometimes works' then it becomes a problem (pointers 'sometimes' work and are not @safe, while ref is @safe). Checking the address of a reference shouldn't be needed, since it should be dereferenced at where it was called at if need be (throwing it there).

 Any side effects that may rely on the validity of the 'pointer' (once the function ends) are compromised afterwards (I always assume you pass local variables for this logic, since that's likely the case 80% of the time).

 Would you REALLY want to mark every single function that uses ref as @trusted? You're then blindly telling the compiler to 'shut up' so you can use the code and ignoring the checking; Or am I wrong?

int getItem(string[] inp, string cmp) @safe {
  //foreach not @safe!
  //only works not @safe or blindly adding @trusted
  foreach(i, ref s; inp) {
    if (s == cmp)
      return i;
  }

  //still might not compile, if foreach calls opApply
  //(Even if it doesn't use the reference).
  foreach(i, s; inp) {
  }
}

>>  More importantly, if it's now a possibility do we have to start adding checks everywhere?
>> 
>>   int getx(S *s)
>>   in {assert(s); } //perfectly acceptable check, we know it's a pointer
>>   body {return s.x; }

>    struct S { int i; auto f() { return i; } }
>    int main() {
>       S* s;
>       return s.f();
>    }
>
> This program will assert at runtime (and (correctly) segfault in a release-build). The compiler inserts not-null-this checks for "normal" methods in non-release mode, and could also do that before invoking any UFCS "method". So you wouldn't need to check for '!!&this' yourself.

 I thought those checks weren't added (via the compiler) since if it causes a seg fault the CPU would catch it and kill the program on its own (just add debugging flags); If you added the checks they would do the same thing (More buck for the same bang).

> The problem w/ these checks is that they can not be disabled per-type - which prevents certain valid uses. The compiler-inserted assertions fire also when the methods can deal with null-this themselves (happens eg. when dealing with 'C' APIs and libraries, when you want to keep the C part as unmodified as possible). But that is a different issue.


September 06, 2012
On Wednesday, 5 September 2012 at 11:01:50 UTC, Artur Skawina wrote:
> On 09/04/12 20:19, Era Scarecrow wrote:
>>  I ask you, how do you check if it's a null pointer? &s?
>
> Yes, obviously. If you need to do that manually.
> 
>>   int getx(ref S s)
>>   //How does this make sense?? it looks wrong and is misleading
>>   in {assert(&s); }
>>   body {return s.x; }
>
> It looks correct and is perfectly obvious. But see below - you don't need to do this manually - the compiler does it for you when calling methods and could handle the UFCS case too.

 I've been thinking about this; It would definitely be the wrong thing to do. The assert would _Always_ succeed. The address you get would be of the pointer/reference for the stack (the pointer variable exists, where it points to isn't so much the case), so it would be the same as comparing it to this...

  int getx(S* s)
  in {assert(&s);} //always true, and always wrong.

 As I mentioned, it's wrong and is misleading. You'd have to work around the system to get the check correct; and even then, if the compiler decides to do something different, the result can only be implementation dependent.

  int getx(ref S s)
  in {
    S *ptr = cast(S*) s;
    assert(ptr);
  }

 I'm not even sure this would work (it's undefined after all). I hope I never have to start adding such odd looking checks, else I would throw out ref and use pointers instead; At least with them the checks are straight-forward in comparison.