September 06, 2012
Re: pointers, functions, and uniform call syntax
On Thursday, 6 September 2012 at 06:18:11 UTC, Era Scarecrow 
wrote:
> On Wednesday, 5 September 2012 at 11:01:50 UTC, Artur Skawina 
> wrote:
>> On 09/04/12 20:19, Era Scarecrow wrote:
>>> I ask you, how do you check if it's a null pointer? &s?
>>
>> Yes, obviously. If you need to do that manually.
>> 
>>>  int getx(ref S s)
>>>  //How does this make sense?? it looks wrong and is misleading
>>>  in {assert(&s); }
>>>  body {return s.x; }
>>
>> It looks correct and is perfectly obvious. But see below - you 
>> don't need to do this manually - the compiler does it for you 
>> when calling methods and could handle the UFCS case too.
>
>  I've been thinking about this; It would definitely be the 
> wrong thing to do. The assert would _Always_ succeed. The 
> address you get would be of the pointer/reference for the stack 
> (the pointer variable exists, where it points to isn't so much 
> the case), so it would be the same as comparing it to this...
>
>   int getx(S* s)
>   in {assert(&s);} //always true, and always wrong.

That is absolutely not true at all. Behind the scenes, *passing* 
a ref is the same as passing a pointer, yes, but afterwards, they 
are different entities. If you request the address of the 
reference, it *will* give you the address of the referenced 
object. It is NOT the same as what you just wrote:

--------
import std.stdio;

struct S
{
    int i;
}

void foo(ref S s)
{
    writeln("address of s is: ", &s);
    assert(&s);
}

void main()
{
    S* p;
    foo(*p);
}
--------
address of s is: null
core.exception.AssertError@main(11): Assertion failure
--------

>  As I mentioned, it's wrong and is misleading. You'd have to 
> work around the system to get the check correct; and even then 
> if the compiler decides to do something different you can only 
> have it implementation dependent.
>
>   int getx(ref S s)
>   in {
>     S *ptr = cast(S*) s;
>     assert(ptr);
>   }
>
>  I'm not even sure this would even work (it's undefined 
> after all). I hope I never have to start adding such odd looking 
> checks, else I would throw out ref and use pointers instead; At 
> least with them the checks are straight-forward in comparison.

Again, a reference and a pointer are not the same thing. That 
cast is illegal.

--------
main.d(10): Error: e2ir: cannot cast s of type S to type S*
--------

But *this* is legal and good though:

--------
int getx(ref S s)
in {
  S *ptr = &s;
  assert(ptr);
}
--------
Although it is just transforming the initial 1-liner into a 
2-liner...
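
To make the distinction concrete, here is a small sketch. The helper names `viaRef` and `viaPtr` are invented for illustration and do not come from the posts above. It only passes valid objects, so it does not depend on how a particular compiler treats a null reference:

```d
struct S
{
    int x;
}

// With a ref parameter, &s is the address of the caller's object.
void viaRef(ref S s, S* expected)
{
    assert(&s is expected);
}

// With a pointer parameter, s holds the address of the caller's object,
// while &s is the address of the local pointer variable itself.
void viaPtr(S* s, S* expected)
{
    assert(s is expected);
    S** slot = &s;                 // never null: s is a local variable
    assert(slot !is null);
    assert(cast(void*) slot !is cast(void*) expected);
}

void main()
{
    S obj;
    viaRef(obj, &obj);
    viaPtr(&obj, &obj);
}
```

So `assert(&s)` is a meaningful check only in the `ref` version; in the pointer version the equivalent check is `assert(s)`.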
September 06, 2012
Re: pointers, functions, and uniform call syntax
On 09/06/12 00:50, Era Scarecrow wrote:
> On Wednesday, 5 September 2012 at 11:01:50 UTC, Artur Skawina wrote:
>> On 09/04/12 20:19, Era Scarecrow wrote:
>>>  I ask you, how do you check if it's a null pointer? &s?
>>
>> Yes, obviously. If you need to do that manually.
> 
>  But you shouldn't have to.
> 
>>>   int getx(ref S s)
>>>   //How does this make sense?? it looks wrong and is misleading
>>>   in {assert(&s); }
>>>   body {return s.x; }
>>
>> It looks correct and is perfectly obvious. But see below - you don't need to do this manually - the compiler does it for you when calling methods and could handle the UFCS case too.
> 
>  How I'm understanding references in D (And perhaps I'm repeating myself in two different topics) is they point to live variables (IE guaranteed pointers), and remains this way until you return from that function. 

The struct example I gave previously (quoted below) shows how easily you can end
up with a null reference in D; refs are *not* guaranteed to be live. It's not
just about pointers:

  class C { int i; auto f() { return i; } }
  int main() {
     C c;
     return c.f();
  }

Here you won't even get an assert error, just a segfault. But the pointer-to-class
(reference type in general, but so far there are only classes) model chosen in D is
wrong; this probably contributes to the confusion about refs, because they behave
differently for classes. Let's ignore classes for now, they're "special".

>  This is entirely valid and simplifies things. Remember in D we want the language to 'do the right thing', but if you make references where it works to 'sometimes works' then it becomes a problem (pointers 'sometimes' work and are not @safe, while ref is @safe). Checking the address of a reference shouldn't be needed, since it should be dereferenced at where it was called at if need be (throwing it there).

Pointers *are* @safe, it's just certain operations on them that are not.
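
A rough sketch of that split, assuming a reasonably current D compiler (the function name `readThrough` is made up for the example): dereferencing is accepted in @safe code, while pointer arithmetic and integer-to-pointer casts are rejected.

```d
// Dereferencing a pointer is accepted in @safe code.
@safe int readThrough(int* p)
{
    return *p;
}

// Pointer arithmetic and integer-to-pointer casts are rejected in @safe code.
static assert(!__traits(compiles, () @safe { int* p; p++; }));
static assert(!__traits(compiles, () @safe { int* p; auto q = p + 1; }));
static assert(!__traits(compiles, () @safe { size_t n; auto p = cast(int*) n; }));

// The same arithmetic is accepted in @system code.
static assert(__traits(compiles, () @system { int* p; auto q = p + 1; }));

void main()
{
    auto p = new int;
    *p = 42;
    assert(readThrough(p) == 42);
}
```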

>  Would you REALLY want to mark every single function that uses ref as @trusted? 

No idea why you think that would be needed.

>>>  More importantly, if it's now a possibility do we have to start adding checks everywhere?
>>>
>>>   int getx(S *s)
>>>   in {assert(s); } //perfectly acceptable check, we know it's a pointer
>>>   body {return s.x; }
> 
>>    struct S { int i; auto f() { return i; } }
>>    int main() {
>>       S* s;
>>       return s.f();
>>    }
>>
>> This program will assert at runtime (and (correctly) segfault in a release-build). The compiler inserts not-null-this checks for "normal" methods in non-release mode, and could also do that before invoking any UFCS "method". So you wouldn't need to check for '!!&this' yourself.
> 
>  I thought those checks weren't added (via the compiler) since if it causes a seg fault the CPU would catch it and kill the program on its own (just add debugging flags); If you added the checks they would do the same thing (More buck for the same bang).

The checks happen for structs, and should be configurable, but right now
are not, which sometimes causes trouble.


On 09/06/12 08:18, Era Scarecrow wrote:
> On Wednesday, 5 September 2012 at 11:01:50 UTC, Artur Skawina wrote:
>> On 09/04/12 20:19, Era Scarecrow wrote:
>>>  I ask you, how do you check if it's a null pointer? &s?
>>
>> Yes, obviously. If you need to do that manually.
>>
>>>   int getx(ref S s)
>>>   //How does this make sense?? it looks wrong and is misleading
>>>   in {assert(&s); }
>>>   body {return s.x; }
>>
>> It looks correct and is perfectly obvious. But see below - you don't need to do this manually - the compiler does it for you when calling methods and could handle the UFCS case too.
> 
>  I've been thinking about this; It would definitely be the wrong thing to do. The assert would _Always_ succeed. The address you get would be of the pointer/reference for the stack (the pointer 

No, that's not how ref args work. '&s' will give you the address of the
object (eg struct). A reference type like 'class' has another (internal)
level of indirection so in that case you would get a pointer to the
class-reference. But that's how classes work internally, the 'object' in
that case is just the  internal pointer to the "real" class data. Taking
the address of an argument gives you a pointer to it in every case.
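
A short sketch of the struct versus class case, using only valid objects (the names `takeStruct` and `takeClass` are hypothetical):

```d
struct S { int i; }
class C { int i; }

void takeStruct(ref S s, S* expected)
{
    // &s is the address of the struct object itself.
    assert(&s is expected);
}

void takeClass(ref C c, C* expected)
{
    // &c is the address of the caller's class reference (a C*) ...
    assert(&c is expected);
    // ... which differs from the address of the instance it refers to.
    assert(cast(void*) &c !is cast(void*) c);
}

void main()
{
    S s;
    takeStruct(s, &s);

    C c = new C;
    takeClass(c, &c);
}
```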

artur
September 06, 2012
Re: pointers, functions, and uniform call syntax
On 09/06/12 12:29, Artur Skawina wrote:
> that case is just the  internal pointer to the "real" class data. Taking
> the address of an argument gives you a pointer to it in every case.

...gives you a pointer to that argument...

would have been less ambiguous.

artur
September 06, 2012
Re: pointers, functions, and uniform call syntax
On 09/06/12 12:29, Artur Skawina wrote:
>    class C { int i; auto f() { return i; } }
>    int main() {
>       C c;
>       return c.f();
>    }
> 
> Here you won't even get an assert error, just a segfault. But the pointer-to-class

Argh. The reason you won't get an assert is because I forgot to add
'final' when converting the struct example... 

  class C { int i; final f() { return i; } }
  int main() {
     C c;
     return c.f();
  }


BTW, why doesn't GDC figure this out by itself? IIRC GCC gets these
cases right both for C++ and C (!), but does not devirtualize D methods,
not even in LTO/WPR mode...

artur
September 06, 2012
Re: pointers, functions, and uniform call syntax
On Thursday, 6 September 2012 at 10:29:17 UTC, Artur Skawina 
wrote:
> On 09/06/12 00:50, Era Scarecrow wrote:
>> How I'm understanding references in D (And perhaps I'm 
>> repeating myself in two different topics) is they point to 
>> live variables (IE guaranteed pointers), and remains this way 
>> until you return from that function.
>
> The struct example I gave previously (quoted below) shows how 
> easily you can end up with a null reference in D; refs are 
> *not* guaranteed to be live. It's not just about pointers:

>
>    class C { int i; auto f() { return i; } }
>    int main() {
>       C c;
>       return c.f();
>    }
>
> Here you won't even get an assert error, just a segfault. But 
> the pointer-to-class (reference type in general, but so far 
> there are only classes) model chosen in D is wrong; this 
> probably contributes to the confusion about refs, because they 
> behave differently for classes. Let's ignore classes for now, 
> they're "special".

 Yeah, they are allocated, and 'can' still contain a null 
reference (or be deallocated/voided) in some way, so are pointers 
automatically.

> Pointers *are* @safe, it's just certain operations on them that 
> are not.

 To my understanding, pointers (almost everything about them) were 
not covered in SafeD/@safe code. True, as long as you don't mess 
with the pointer then the object could remain valid (assuming it 
was allocated), but you automatically go into low-level code, and 
into trusted or system programming.

>>  Would you REALLY want to mark every single function that uses 
>> ref as @trusted?

> No idea why you think that would be needed.

 Because we aren't allocating everything on the heap. Maybe I'm 
just seeing things at a very different angle than you. Maybe I 
need a core dump for my head.

>> I've been thinking about this; It would definitely be the 
>> wrong thing to do. The assert would _Always_ succeed. The 
>> address you get would be of the pointer/reference for the 
>> stack (the pointer
>
> No, that's not how ref args work. '&s' will give you the 
> address of the object (eg struct). A reference type like 
> 'class' has another (internal) level of indirection so in that 
> case you would get a pointer to the class-reference. But that's 
> how classes work internally, the 'object' in that case is just 
> the  internal pointer to the "real" class data. Taking the 
> address of an argument gives you a pointer to it in every case.

 Curious. Both ways could be correct. But somehow I don't think 
so...

 Alright let's go the opposite direction. Give me an example in 
which passing a variable (by reference to a function) would EVER 
require it to check the address to see if it was null. 
Class/allocated objects should fail before the function gets 
control. ie:

 void func(ref int i);

 class X {
   int i;
 }

 X x;
 int* i;
 int[10] a;

 func(x.i); /*should fail while dereferencing x to access i,
              so never gets to func*/
 func(*i);  //does this count as a lvalue? Probably not,
 func(a[0]);//none of these three should compile with that in mind
 func(0);

 Being named variables, and likely non-classes you are then left 
with mostly local variables, or arrays, or some type of pointer 
indirection issue. But every case I come up with says it would 
fail before the function was called.
September 06, 2012
Re: pointers, functions, and uniform call syntax
On 09/06/12 13:34, Era Scarecrow wrote:
>  Alright let's go the opposite direction. Give me an example in which passing a variable (by reference to a function) would EVER require it to check the address to see if it was null. Class/allocated objects should fail before the function gets control. ie:
> 
>  void func(ref int i);
> 
>  class X {
>    int i;
>  }
> 
>  X x;
>  int* i;
>  int[10] a;
> 
>  func(x.i); /*should fail while dereferencing x to access i,
>               so never gets to func*/
>  func(*i);  //does this count as a lvalue? Probably not,
>  func(a[0]);//none of these three should compile with that in mind
>  func(0);
> 
>  Being named variables, and likely non-classes you are then left with mostly local variables, or arrays, or some type of pointer indirection issue. But every case I come up with says it would fail before the function was called.

Both '*i' and 'a[0]' count. (Even '0' could be made to work as a 'const ref'
arg, but i'm not sure if that would be a good idea)
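
As a quick check of which of those calls are accepted, here is a sketch assuming default compiler switches. It uses a valid pointer, so only the lvalue question is tested:

```d
void func(ref int i)
{
    i += 1;
}

void main()
{
    int* p = new int;   // valid, zero-initialized int on the heap
    int[10] a;

    func(*p);           // accepted: *p is an lvalue
    func(a[0]);         // accepted: a[0] is an lvalue
    assert(*p == 1);
    assert(a[0] == 1);

    // A literal is an rvalue and does not bind to a plain ref parameter.
    static assert(!__traits(compiles, func(0)));
}
```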

artur
September 06, 2012
Re: pointers, functions, and uniform call syntax
On 6 September 2012 11:58, Artur Skawina <art.08.09@gmail.com> wrote:
> On 09/06/12 12:29, Artur Skawina wrote:
>>    class C { int i; auto f() { return i; } }
>>    int main() {
>>       C c;
>>       return c.f();
>>    }
>>
>> Here you won't even get an assert error, just a segfault. But the pointer-to-class
>
> Argh. The reason you won't get an assert is because I forgot to add
> 'final' when converting the struct example...
>
>    class C { int i; final f() { return i; } }
>    int main() {
>       C c;
>       return c.f();
>    }
>
>
> BTW, why doesn't GDC figure this out by itself? IIRC GCC gets these
> cases right both for C++ and C (!), but does not devirtualize D methods,
> not even in LTO/WPR mode...
>
> artur

All methods are virtual by default in D.  If you feel there is
something that can be improved in GDC's codegen, please send a
testcase and a written example of the behaviour it should show, and I
will look into it. :-)

Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
September 06, 2012
Re: pointers, functions, and uniform call syntax
On Thursday, 6 September 2012 at 12:00:05 UTC, Artur Skawina 
wrote:
> On 09/06/12 13:34, Era Scarecrow wrote:
>> Alright let's go the opposite direction. Give me an example in 
>> which passing a variable (by reference to a function) would 
>> EVER require it to check the address to see if it was null. 
>> Class/allocated objects should fail before the function gets 
>> control. ie:
>> 
>>  void func(ref int i);
>> 
>>  class X {
>>    int i;
>>  }
>> 
>>  X x;
>>  int* i;
>>  int[10] a;
>> 
>>  func(x.i); /*should fail while dereferencing x to access i,
>>               so never gets to func*/
>>  func(*i);  //does this count as a lvalue? Probably not,
>>  func(a[0]);//none of these three should compile with that in 
>> mind
>>  func(0);
>> 
>> Being named variables, and likely non-classes you are then 
>> left with mostly local variables, or arrays, or some type of 
>> pointer indirection issue. But every case I come up with says 
>> it would fail before the function was called.
>
> Both '*i' and 'a[0]' count. (Even '0' could be made to work as 
> a 'const ref' arg, but i'm not sure if that would be a good 
> idea)

 I wasn't sure about *i. I can see it going either way. *i would 
need to be dereferenced first, a[0] would need a bounds check 
which then ensures it exists (even if it was dynamic); So 
checking an address from ref wouldn't be needed in the func.
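
The array half of that claim can be sketched as follows. This assumes bounds checking is enabled (the default for a non-release build), and the helper name `failsBeforeCall` is invented for the example. It says nothing about the `*i` case:

```d
import core.exception : RangeError;

void func(ref int i)
{
    i = 1;
}

// Returns true if evaluating a[idx] as a ref argument raised RangeError,
// i.e. the failure happened at the call site and func never ran.
bool failsBeforeCall(int[] a, size_t idx)
{
    try
    {
        func(a[idx]);
    }
    catch (RangeError e)
    {
        return true;
    }
    return false;
}

void main()
{
    auto a = new int[](3);
    assert(!failsBeforeCall(a, 0));   // in range: func runs and writes a[0]
    assert(a[0] == 1);
    assert(failsBeforeCall(a, 3));    // out of range: stopped before the call
}
```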
September 06, 2012
Re: pointers, functions, and uniform call syntax
On 09/06/12 22:07, Era Scarecrow wrote:
> On Thursday, 6 September 2012 at 12:00:05 UTC, Artur Skawina wrote:
>> On 09/06/12 13:34, Era Scarecrow wrote:
>>> Alright let's go the opposite direction. Give me an example in which passing a variable (by reference to a function) would EVER require it to check the address to see if it was null. Class/allocated objects should fail before the function gets control. ie:
>>>
>>>  void func(ref int i);
>>>
>>>  class X {
>>>    int i;
>>>  }
>>>
>>>  X x;
>>>  int* i;
>>>  int[10] a;
>>>
>>>  func(x.i); /*should fail while dereferencing x to access i,
>>>               so never gets to func*/
>>>  func(*i);  //does this count as a lvalue? Probably not,
>>>  func(a[0]);//none of these three should compile with that in mind
>>>  func(0);
>>>
>>> Being named variables, and likely non-classes you are then left with mostly local variables, or arrays, or some type of pointer indirection issue. But every case I come up with says it would fail before the function was called.
>>
>> Both '*i' and 'a[0]' count. (Even '0' could be made to work as a 'const ref' arg, but i'm not sure if that would be a good idea)
> 
>  I wasn't sure about *i. I can see it going either way. *i would need to be dereferenced first, a[0] would need a bounds check which then ensures it exists (even if it was dynamic); So checking an address from ref wouldn't be needed in the func.

No, '*i' does not actually dereference the pointer when used as
a ref argument.

This program won't assert

  auto func(ref int i) { assert(!&i); }
  void main() { int* i; func(*i); }

and would segfault if 'i' was accessed by 'func'.

artur
September 06, 2012
Re: pointers, functions, and uniform call syntax
On 09/06/12 18:21, Iain Buclaw wrote:
> On 6 September 2012 11:58, Artur Skawina <art.08.09@gmail.com> wrote:
>> On 09/06/12 12:29, Artur Skawina wrote:
>>>    class C { int i; auto f() { return i; } }
>>>    int main() {
>>>       C c;
>>>       return c.f();
>>>    }
>>>
>>> Here you won't even get an assert error, just a segfault. But the pointer-to-class
>>
>> Argh. The reason you won't get an assert is because I forgot to add
>> 'final' when converting the struct example...
>>
>>    class C { int i; final f() { return i; } }
>>    int main() {
>>       C c;
>>       return c.f();
>>    }
>>
>>
>> BTW, why doesn't GDC figure this out by itself? IIRC GCC gets these
>> cases right both for C++ and C (!), but does not devirtualize D methods,
>> not even in LTO/WPR mode...
> 
> All methods are virtual by default in D.  If you feel there is
> something that can be improved in GDC's codegen, please send a
> testcase and a written example of the behaviour it should show, and I
> will look into it. :-)

I'm just wondering /why/ the optimization doesn't happen for D; there
are far more important issues like cross-module inlining. Eventually 
devirtualization will be needed exactly because all methods are virtual.

The test case would be something like the following two programs:

C++: 
  class C {
     public:
     virtual int foo(int i) { return i+1; }
  };

  int main() {
     C* c = new C;
     int i = 0;
     i = c->foo(i);
     return i;
  }

D:
  class C {
     int foo(int i) { return i+1; }
  };

  int main() {
     C c = new C;
     int i = 0;
     i = c.foo(i);
     return i;
  }

The first compiles to:

8048420:       83 ec 04                sub    $0x4,%esp
8048423:       c7 04 24 04 00 00 00    movl   $0x4,(%esp)
804842a:       e8 e1 ff ff ff          call   8048410 <_Znwj@plt>
804842f:       c7 00 a0 85 04 08       movl   $0x80485a0,(%eax)
8048435:       b8 01 00 00 00          mov    $0x1,%eax
804843a:       83 c4 04                add    $0x4,%esp
804843d:       c3                      ret    

but the second results in:

8049820:       55                      push   %ebp
8049821:       89 e5                   mov    %esp,%ebp
8049823:       83 ec 18                sub    $0x18,%esp
8049826:       c7 04 24 c0 47 07 08    movl   $0x80747c0,(%esp)
804982d:       e8 0e 4d 00 00          call   804e540 <_d_newclass>
8049832:       8b 10                   mov    (%eax),%edx
8049834:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%esp)
804983b:       00 
804983c:       89 04 24                mov    %eax,(%esp)
804983f:       ff 52 18                call   *0x18(%edx)
8049842:       c9                      leave  
8049843:       c3                      ret    

ie in the D version the call isn't devirtualized.
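
For comparison, the source-level way to opt out of virtual dispatch is `final`. The sketch below only checks what the front end considers virtual; it does not show what code any particular backend generates. The method name `bar` is added for illustration:

```d
class C
{
    int foo(int i) { return i + 1; }        // virtual by default
    final int bar(int i) { return i + 1; }  // final and not an override
}

static assert(__traits(isVirtualMethod, C.foo));
static assert(!__traits(isVirtualMethod, C.bar));

void main()
{
    auto c = new C;
    assert(c.foo(0) == 1);
    assert(c.bar(0) == 1);
}
```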

artur