September 06, 2012
On Thursday, 6 September 2012 at 06:18:11 UTC, Era Scarecrow wrote:
> On Wednesday, 5 September 2012 at 11:01:50 UTC, Artur Skawina wrote:
>> On 09/04/12 20:19, Era Scarecrow wrote:
>>> I ask you, how do you check if it's a null pointer? &s?
>>
>> Yes, obviously. If you need to do that manually.
>> 
>>>  int getx(ref S s)
>>>  //How does this make sense?? it looks wrong and is misleading
>>>  in {assert(&s); }
>>>  body {return s.x; }
>>
>> It looks correct and is perfectly obvious. But see below - you don't need to do this manually - the compiler does it for you when calling methods and could handle the UFCS case too.
>
>  I've been thinking about this; It would definitely be the wrong thing to do. The assert would _Always_ succeed. The address you get would be of the pointer/reference for the stack (the pointer variable exists, where it points to isn't so much the case), so it would be the same as comparing it to this...
>
>   int getx(S* s)
>   in {assert(&s);} //always true, and always wrong.

That is absolutely not true at all. Behind the scenes, *passing* a ref is the same as passing a pointer, yes, but afterwards, they are different entities. If you request the address of the reference, it *will* give you the address of the referenced object. It is NOT the same as what you just wrote:

--------
import std.stdio;

struct S
{
    int i;
}

void foo(ref S s)
{
    writeln("address of s is: ", &s);
    assert(&s);
}

void main()
{
    S* p;
    foo(*p);
}
--------
address of s is: null
core.exception.AssertError@main(11): Assertion failure
--------

>  As I mentioned, it's wrong and is misleading. You'd have to work around the system to get the check correct; and even then if the compiler decides to do something different you can only have it implementation dependent.
>
>   int getx(ref S s)
>   in {
>     S *ptr = cast(S*) s;
>     assert(ptr);
>   }
>
>  I'm not even sure this would even work (it's undefined after all). I hope I never have to start adding such odd looking checks, else I would throw out ref and use pointers instead; At least with them the checks are straight-forward in comparison.

Again, a reference and a pointer are not the same thing. That cast is illegal.

--------
main.d(10): Error: e2ir: cannot cast s of type S to type S*
--------

But *this* is legal and good though:

--------
int getx(ref S s)
in {
  S *ptr = &s;   // address of the referenced object, not of a stack slot
  assert(ptr);
}
body { return s.x; }
--------
Although it is just transforming the initial 1-liner into a 2-liner...
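
A minimal sketch of the same point without any null involved, assuming a hypothetical helper `sameObject` (not from the posts above): inside the callee, `&s` compares equal to the address of the caller's object, whether the argument was a named variable or a dereferenced pointer.

```d
struct S
{
    int i;
}

// Hypothetical helper: true if the ref parameter refers to the object at `expected`.
bool sameObject(ref S s, S* expected)
{
    return &s is expected;
}

void main()
{
    S local;
    S* p = &local;

    assert(sameObject(local, &local)); // named variable passed by ref
    assert(sameObject(*p, p));         // dereferenced pointer passed by ref
}
```

The helper compares inside the function rather than returning `&s`, so no pointer to the parameter escapes.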
September 06, 2012
On 09/06/12 00:50, Era Scarecrow wrote:
> On Wednesday, 5 September 2012 at 11:01:50 UTC, Artur Skawina wrote:
>> On 09/04/12 20:19, Era Scarecrow wrote:
>>>  I ask you, how do you check if it's a null pointer? &s?
>>
>> Yes, obviously. If you need to do that manually.
> 
>  But you shouldn't have to.
> 
>>>   int getx(ref S s)
>>>   //How does this make sense?? it looks wrong and is misleading
>>>   in {assert(&s); }
>>>   body {return s.x; }
>>
>> It looks correct and is perfectly obvious. But see below - you don't need to do this manually - the compiler does it for you when calling methods and could handle the UFCS case too.
> 
>  How I'm understanding references in D (And perhaps I'm repeating myself in two different topics) is they point to live variables (IE guaranteed pointers), and remains this way until you return from that function.

The struct example I gave previously (quoted below) shows how easily you can end up with a null reference in D; refs are *not* guaranteed to be live. It's not just about pointers:

   class C { int i; auto f() { return i; } }
   int main() {
      C c;
      return c.f();
   }

Here you won't even get an assert error, just a segfault. But the pointer-to-class (reference type in general, but so far there are only classes) model chosen in D is wrong; this probably contributes to the confusion about refs, because they behave differently for classes. Let's ignore classes for now, they're "special".

>  This is entirely valid and simplifies things. Remember in D we want the language to 'do the right thing', but if you make references where it works to 'sometimes works' then it becomes a problem (pointers 'sometimes' work and are not @safe, while ref is @safe). Checking the address of a reference shouldn't be needed, since it should be dereferenced at where it was called at if need be (throwing it there).

Pointers *are* @safe, it's just certain operations on them that are not.
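
A small sketch of that distinction, assuming a reasonably recent compiler (the exact rules in 2012 may have differed) and a hypothetical function `readThrough`: holding and dereferencing a pointer compiles under `@safe`, while pointer arithmetic is rejected.

```d
// Hypothetical example function: dereferencing a pointer is allowed in @safe code.
@safe int readThrough(int* p)
{
    return *p;
}

@safe void main()
{
    int* p = new int;
    *p = 42;
    assert(readThrough(p) == 42);

    // Pointer arithmetic is one of the operations @safe rejects,
    // so this function literal does not compile.
    static assert(!__traits(compiles, () @safe { int* q = new int; q = q + 1; }));
}
```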

>  Would you REALLY want to mark every single function that uses ref as @trusted?

No idea why you think that would be needed.

>>>  More importantly, if it's now a possibility do we have to start adding checks everywhere?
>>>
>>>   int getx(S *s)
>>>   in {assert(s); } //perfectly acceptable check, we know it's a pointer
>>>   body {return s.x; }
> 
>>    struct S { int i; auto f() { return i; } }
>>    int main() {
>>       S* s;
>>       return s.f();
>>    }
>>
>> This program will assert at runtime (and (correctly) segfault in a release-build). The compiler inserts not-null-this checks for "normal" methods in non-release mode, and could also do that before invoking any UFCS "method". So you wouldn't need to check for '!!&this' yourself.
> 
>  I thought those checks weren't added (via the compiler) since if it causes a seg fault the CPU would catch it and kill the program on its own (just add debugging flags); If you added the checks they would do the same thing (More buck for the same bang).

The checks happen for structs, and should be configurable, but right now are not, which sometimes causes trouble.


On 09/06/12 08:18, Era Scarecrow wrote:
> On Wednesday, 5 September 2012 at 11:01:50 UTC, Artur Skawina wrote:
>> On 09/04/12 20:19, Era Scarecrow wrote:
>>>  I ask you, how do you check if it's a null pointer? &s?
>>
>> Yes, obviously. If you need to do that manually.
>>
>>>   int getx(ref S s)
>>>   //How does this make sense?? it looks wrong and is misleading
>>>   in {assert(&s); }
>>>   body {return s.x; }
>>
>> It looks correct and is perfectly obvious. But see below - you don't need to do this manually - the compiler does it for you when calling methods and could handle the UFCS case too.
> 
>  I've been thinking about this; It would definitely be the wrong thing to do. The assert would _Always_ succeed. The address you get would be of the pointer/reference for the stack (the pointer

No, that's not how ref args work. '&s' will give you the address of the object (e.g. a struct). A reference type like 'class' has another (internal) level of indirection, so in that case you would get a pointer to the class-reference. But that's how classes work internally; the 'object' in that case is just the internal pointer to the "real" class data. Taking the address of an argument gives you a pointer to it in every case.
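
A short sketch of the class case, assuming a hypothetical helper `sameSlot`: for a `ref C` parameter, `&c` is the address of the caller's reference variable, so it is non-null even when the class reference itself is null.

```d
class C
{
    int i;
}

// Hypothetical helper: true if the ref parameter is the reference variable at `slot`.
bool sameSlot(ref C c, C* slot)
{
    return &c is slot;
}

void main()
{
    C c;                     // a null class reference
    assert(c is null);       // the reference holds null...
    assert(&c !is null);     // ...but the variable holding it has an address
    assert(sameSlot(c, &c)); // and that is the address the callee sees
}
```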

artur
September 06, 2012
On 09/06/12 12:29, Artur Skawina wrote:
> that case is just the internal pointer to the "real" class data. Taking the address of an argument gives you a pointer to it in every case.

...gives you a pointer to that argument...

would have been less ambiguous.

artur
September 06, 2012
On 09/06/12 12:29, Artur Skawina wrote:
>    class C { int i; auto f() { return i; } }
>    int main() {
>       C c;
>       return c.f();
>    }
> 
> Here you won't even get an assert error, just a segfault. But the pointer-to-class

Argh. The reason you won't get an assert is because I forgot to add 'final' when converting the struct example...

   class C { int i; final f() { return i; } }
   int main() {
      C c;
      return c.f();
   }


BTW, why doesn't GDC figure this out by itself? IIRC GCC gets these cases right both for C++ and C (!), but does not devirtualize D methods, not even in LTO/WPR mode...

artur
September 06, 2012
On Thursday, 6 September 2012 at 10:29:17 UTC, Artur Skawina wrote:
> On 09/06/12 00:50, Era Scarecrow wrote:
>> How I'm understanding references in D (And perhaps I'm repeating myself in two different topics) is they point to live variables (IE guaranteed pointers), and remains this way until you return from that function.
>
> The struct example I gave previously (quoted below) shows how easily you can end up with a null reference in D; refs are *not* guaranteed to be live. It's not just about pointers:

>
>    class C { int i; auto f() { return i; } }
>    int main() {
>       C c;
>       return c.f();
>    }
>
> Here you won't even get an assert error, just a segfault. But the pointer-to-class (reference type in general, but so far there are only classes) model chosen in D is wrong; this probably contributes to the confusion about refs, because they behave differently for classes. Let's ignore classes for now, they're "special".

 Yeah, they are allocated, and 'can' still contain a null reference (or be deallocated/voided) in some way, so are pointers automatically.

> Pointers *are* @safe, it's just certain operations on them that are not.

 To my understanding, pointers (almost everything about them) were not covered in SafeD/@safe code. True, as long as you don't mess with the pointer, then the object could remain valid (assuming it was allocated), but you automatically go into low-level code, and into trusted or system programming.

>>  Would you REALLY want to mark every single function that uses ref as @trusted?

> No idea why you think that would be needed.

 Because we aren't allocating everything on the heap. Maybe I'm just seeing things at a very different angle than you. Maybe I need a core dump for my head.

>> I've been thinking about this; It would definitely be the wrong thing to do. The assert would _Always_ succeed. The address you get would be of the pointer/reference for the stack (the pointer
>
> No, that's not how ref args work. '&s' will give you the address of the object (e.g. a struct). A reference type like 'class' has another (internal) level of indirection, so in that case you would get a pointer to the class-reference. But that's how classes work internally; the 'object' in that case is just the internal pointer to the "real" class data. Taking the address of an argument gives you a pointer to it in every case.

 Curious. Both ways could be correct. But somehow I don't think so...

 Alright let's go the opposite direction. Give me an example in which passing a variable (by reference to a function) would EVER require it to check the address to see if it was null. Class/allocated objects should fail before the function gets control. ie:

 void func(ref int i);

 class X {
   int i;
 }

 X x;
 int* i;
 int[10] a;

 func(x.i); /*should fail while dereferencing x to access i,
              so never gets to func*/
 func(*i);  //does this count as a lvalue? Probably not,
 func(a[0]);//none of these three should compile with that in mind
 func(0);

 Being named variables, and likely non-classes you are then left with mostly local variables, or arrays, or some type of pointer indirection issue. But every case I come up with says it would fail before the function was called.
September 06, 2012
On 09/06/12 13:34, Era Scarecrow wrote:
>  Alright let's go the opposite direction. Give me an example in which passing a variable (by reference to a function) would EVER require it to check the address to see if it was null. Class/allocated objects should fail before the function gets control. ie:
> 
>  void func(ref int i);
> 
>  class X {
>    int i;
>  }
> 
>  X x;
>  int* i;
>  int[10] a;
> 
>  func(x.i); /*should fail while dereferencing x to access i,
>               so never gets to func*/
>  func(*i);  //does this count as a lvalue? Probably not,
>  func(a[0]);//none of these three should compile with that in mind
>  func(0);
> 
>  Being named variables, and likely non-classes you are then left with mostly local variables, or arrays, or some type of pointer indirection issue. But every case I come up with says it would fail before the function was called.

Both '*i' and 'a[0]' count. (Even '0' could be made to work as a 'const ref' arg, but I'm not sure if that would be a good idea)
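
A minimal sketch of the lvalue cases, assuming a hypothetical function `bump` and a pointer that actually points at something: both `*p` and `a[0]` bind to a `ref int` parameter, and writes through the ref are visible to the caller.

```d
// Hypothetical example function: modifies its argument through the ref.
void bump(ref int i)
{
    ++i;
}

void main()
{
    int* p = new int;   // valid, zero-initialized int
    int[10] a;          // static array, elements zero-initialized

    bump(*p);           // a dereferenced pointer is an lvalue
    bump(a[0]);         // so is an array element

    assert(*p == 1);
    assert(a[0] == 1);
}
```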

artur
September 06, 2012
On 6 September 2012 11:58, Artur Skawina <art.08.09@gmail.com> wrote:
> On 09/06/12 12:29, Artur Skawina wrote:
>>    class C { int i; auto f() { return i; } }
>>    int main() {
>>       C c;
>>       return c.f();
>>    }
>>
>> Here you won't even get an assert error, just a segfault. But the pointer-to-class
>
> Argh. The reason you won't get an assert is because I forgot to add 'final' when converting the struct example...
>
>    class C { int i; final f() { return i; } }
>    int main() {
>       C c;
>       return c.f();
>    }
>
>
> BTW, why doesn't GDC figure this out by itself? IIRC GCC gets these cases right both for C++ and C (!), but does not devirtualize D methods, not even in LTO/WPR mode...
>
> artur

All methods are virtual by default in D.  If you feel there is something that can be improved in GDC's codegen, please send a testcase and a written example of the behaviour it should show, and I will look into it. :-)
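
A small sketch of what "virtual by default" means at the language level, using the compiler's `__traits(isVirtualMethod, ...)` query; the class and method names are illustrative only. It says nothing about whether a given backend devirtualizes the call.

```d
class C
{
    int foo(int i) { return i + 1; }        // virtual by default
    final int bar(int i) { return i + 1; }  // final and not overriding: non-virtual
}

static assert( __traits(isVirtualMethod, C.foo));
static assert(!__traits(isVirtualMethod, C.bar));

void main()
{
    auto c = new C;
    assert(c.foo(0) == 1);
    assert(c.bar(0) == 1);
}
```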

Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
September 06, 2012
On Thursday, 6 September 2012 at 12:00:05 UTC, Artur Skawina wrote:
> On 09/06/12 13:34, Era Scarecrow wrote:
>> Alright let's go the opposite direction. Give me an example in which passing a variable (by reference to a function) would EVER require it to check the address to see if it was null. Class/allocated objects should fail before the function gets control. ie:
>> 
>>  void func(ref int i);
>> 
>>  class X {
>>    int i;
>>  }
>> 
>>  X x;
>>  int* i;
>>  int[10] a;
>> 
>>  func(x.i); /*should fail while dereferencing x to access i,
>>               so never gets to func*/
>>  func(*i);  //does this count as a lvalue? Probably not,
>>  func(a[0]);//none of these three should compile with that in mind
>>  func(0);
>> 
>> Being named variables, and likely non-classes you are then left with mostly local variables, or arrays, or some type of pointer indirection issue. But every case I come up with says it would fail before the function was called.
>
> Both '*i' and 'a[0]' count. (Even '0' could be made to work as a 'const ref' arg, but I'm not sure if that would be a good idea)

 I wasn't sure about *i. I can see it going either way. *i would need to be dereferenced first, a[0] would need a bounds check which then ensures it exists (even if it was dynamic); So checking an address from ref wouldn't be needed in the func.
September 06, 2012
On 09/06/12 22:07, Era Scarecrow wrote:
> On Thursday, 6 September 2012 at 12:00:05 UTC, Artur Skawina wrote:
>> On 09/06/12 13:34, Era Scarecrow wrote:
>>> Alright let's go the opposite direction. Give me an example in which passing a variable (by reference to a function) would EVER require it to check the address to see if it was null. Class/allocated objects should fail before the function gets control. ie:
>>>
>>>  void func(ref int i);
>>>
>>>  class X {
>>>    int i;
>>>  }
>>>
>>>  X x;
>>>  int* i;
>>>  int[10] a;
>>>
>>>  func(x.i); /*should fail while dereferencing x to access i,
>>>               so never gets to func*/
>>>  func(*i);  //does this count as a lvalue? Probably not,
>>>  func(a[0]);//none of these three should compile with that in mind
>>>  func(0);
>>>
>>> Being named variables, and likely non-classes you are then left with mostly local variables, or arrays, or some type of pointer indirection issue. But every case I come up with says it would fail before the function was called.
>>
>> Both '*i' and 'a[0]' count. (Even '0' could be made to work as a 'const ref' arg, but I'm not sure if that would be a good idea)
> 
>  I wasn't sure about *i. I can see it going either way. *i would need to be dereferenced first, a[0] would need a bounds check which then ensures it exists (even if it was dynamic); So checking an address from ref wouldn't be needed in the func.

No, '*i' does not actually dereference the pointer when used as a ref argument.

This program won't assert

   auto func(ref int i) { assert(!&i); }
   void main() { int* i; func(*i); }

and would segfault if 'i' was accessed by 'func'.
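
Since forming the ref does not itself touch the pointee, one way to avoid relying on an `&i` check inside the callee is to test the pointer while it is still a pointer. A rough sketch, assuming a hypothetical wrapper `getxOr` and fallback value that are not from this thread:

```d
struct S
{
    int x;
}

int getx(ref S s)
{
    return s.x;
}

// Hypothetical wrapper: checks the pointer before it is turned into a ref.
int getxOr(S* p, int fallback)
{
    return p is null ? fallback : getx(*p);
}

void main()
{
    S s = S(7);
    assert(getxOr(&s, -1) == 7);
    assert(getxOr(null, -1) == -1); // the null pointer is never dereferenced
}
```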

artur
September 06, 2012
On 09/06/12 18:21, Iain Buclaw wrote:
> On 6 September 2012 11:58, Artur Skawina <art.08.09@gmail.com> wrote:
>> On 09/06/12 12:29, Artur Skawina wrote:
>>>    class C { int i; auto f() { return i; } }
>>>    int main() {
>>>       C c;
>>>       return c.f();
>>>    }
>>>
>>> Here you won't even get an assert error, just a segfault. But the pointer-to-class
>>
>> Argh. The reason you won't get an assert is because I forgot to add 'final' when converting the struct example...
>>
>>    class C { int i; final f() { return i; } }
>>    int main() {
>>       C c;
>>       return c.f();
>>    }
>>
>>
>> BTW, why doesn't GDC figure this out by itself? IIRC GCC gets these cases right both for C++ and C (!), but does not devirtualize D methods, not even in LTO/WPR mode...
> 
> All methods are virtual by default in D.  If you feel there is something that can be improved in GDC's codegen, please send a testcase and a written example of the behaviour it should show, and I will look into it. :-)

I'm just wondering /why/ the optimization doesn't happen for D; there are far more important issues like cross-module inlining. Eventually devirtualization will be needed exactly because all methods are virtual.

The test case would be something like the following two programs:

C++:
   class C {
      public:
      virtual int foo(int i) { return i+1; }
   };

   int main() {
      C* c = new C;
      int i = 0;
      i = c->foo(i);
      return i;
   }

D:
   class C {
      int foo(int i) { return i+1; }
   };

   int main() {
      C c = new C;
      int i = 0;
      i = c.foo(i);
      return i;
   }

The first compiles to:

 8048420:       83 ec 04                sub    $0x4,%esp
 8048423:       c7 04 24 04 00 00 00    movl   $0x4,(%esp)
 804842a:       e8 e1 ff ff ff          call   8048410 <_Znwj@plt>
 804842f:       c7 00 a0 85 04 08       movl   $0x80485a0,(%eax)
 8048435:       b8 01 00 00 00          mov    $0x1,%eax
 804843a:       83 c4 04                add    $0x4,%esp
 804843d:       c3                      ret

but the second results in:

 8049820:       55                      push   %ebp
 8049821:       89 e5                   mov    %esp,%ebp
 8049823:       83 ec 18                sub    $0x18,%esp
 8049826:       c7 04 24 c0 47 07 08    movl   $0x80747c0,(%esp)
 804982d:       e8 0e 4d 00 00          call   804e540 <_d_newclass>
 8049832:       8b 10                   mov    (%eax),%edx
 8049834:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%esp)
 804983b:       00
 804983c:       89 04 24                mov    %eax,(%esp)
 804983f:       ff 52 18                call   *0x18(%edx)
 8049842:       c9                      leave
 8049843:       c3                      ret

i.e. in the D version the call isn't devirtualized.

artur