C `restrict` keyword in D
September 03, 2017
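In C, the `restrict` qualifier on a pointer is a promise that, for the lifetime of that pointer, the object it points to is accessed only through it (or through pointers derived from it). In practice this means that 2 or more `restrict` pointer arguments in a function call do not point to the same data. This allows for some additional optimizations which were not possible before, and lets C compete with Fortran, where procedure arguments are already assumed not to alias.
For example, this is the declaration of memcpy as of C99: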
void* memcpy(void *restrict dst, const void *restrict src, size_t n);

`dst` and `src` must never point into overlapping blocks of memory. The compiler does not check this: it is up to the programmer to guarantee it, and breaking the promise is undefined behavior.
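
A minimal C99 sketch of what that promise looks like in practice. The names `ranges_overlap` and `scale_into` are made up for this illustration (they are not standard functions), and the overlap check compares integer addresses, which assumes a flat address space:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the byte ranges [a, a+n) and [b, b+n)
 * share at least one byte, 0 otherwise. */
static int ranges_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    return pa < pb + n && pb < pa + n;
}

/* `restrict` tells the compiler that dst and src never alias, so it may
 * keep src values in registers or vectorize the loop. Nothing checks the
 * promise at compile time; the assert is a debug-build check only. */
void scale_into(int *restrict dst, const int *restrict src, size_t n, int factor)
{
    assert(!ranges_overlap(dst, src, n * sizeof *dst));
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * factor;
}
```

Calling `scale_into(buf, buf + 1, n, 2)` would compile without complaint, which is exactly the gap a statically checked `restrict` in D would have to close.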

In D, it would make sense to add similar functionality that extends beyond performance optimizations. It could potentially be used to better guarantee the @safety of some code.
e.g. (from discussions about ref counting in D) :
void main() @safe
{
    auto arr = RCArray!int([0]);
    foo(arr, arr[0]);
}

void foo(ref RCArray!int arr, ref int val) @safe
{
    {
        auto copy = arr; // arr's (and copy's) reference counts are both 2
        arr = RCArray!int([]); // There is another owner, so arr
                               // forgets about the old payload
    } // Last owner of the array ('copy') gets destroyed and happily
      // frees the payload.
    val = 3; // Oops.
}

Here, adding `restrict` to foo's parameters like so :

void foo(restrict ref RCArray!int arr, restrict ref int val)

would make the compiler statically enforce that the two references do not point to the same data. This would cause an error in main, since arr[0] is from the same block of memory as arr.
The same would apply for pointers.

I just hope to have a nice discussion on this topic here.

Thanks!

Read more about `restrict` here : http://en.cppreference.com/w/c/language/restrict
September 03, 2017
On Sunday, 3 September 2017 at 03:04:58 UTC, Uknown wrote:
> [...]
>
> void foo(ref RCArray!int arr, ref int val) @safe
> {
>     {
> 	auto copy = arr; //arr's (and copy's) reference counts are both 2
> 	arr = RCArray!int([]); // There is another owner, so arr
> 			       // forgets about the old payload
>     } // Last owner of the array ('copy') gets destroyed and happily
>       // frees the payload.
>     val = 3; // Oops.
> }
>
> Here, adding `restrict` to foo's parameters like so :
>
> void foo(restrict ref RCArray!int arr, restrict ref int val)
>
> would make the compiler statically enforce that the two references do not point to the same data. This would cause an error in main, since arr[0] is from the same block of memory as arr.

How does the compiler know which member of RCArray!int to check for pointing to the same memory chunk as val?
September 03, 2017
On Sunday, 3 September 2017 at 03:49:21 UTC, Moritz Maxeiner wrote:
> On Sunday, 3 September 2017 at 03:04:58 UTC, Uknown wrote:
>> [...]
>>
>> void foo(ref RCArray!int arr, ref int val) @safe
>> {
>>     {
>> 	auto copy = arr; //arr's (and copy's) reference counts are both 2
>> 	arr = RCArray!int([]); // There is another owner, so arr
>> 			       // forgets about the old payload
>>     } // Last owner of the array ('copy') gets destroyed and happily
>>       // frees the payload.
>>     val = 3; // Oops.
>> }
>>
>> Here, adding `restrict` to foo's parameters like so :
>>
>> void foo(restrict ref RCArray!int arr, restrict ref int val)
>>
>> would make the compiler statically enforce that the two references do not point to the same data. This would cause an error in main, since arr[0] is from the same block of memory as arr.
>
> How does the compiler know which member of RCArray!int to check for pointing to the same memory chunk as val?

If I understand C's version of restrict correctly, the pointers must not refer to the same block. So extending the same here, val should not be allowed to be a reference to any members of RCArray!int.

This does seem to get more confusing when a struct has a member that points into the heap.
e.g.
void main() @safe
{
	struct HeapAsMember {
		int* _someArr;
	}
	HeapAsMember x;
	x._someArr = new int;
	void foo(restrict ref HeapAsMember x, restrict ref int val) @safe
	{
		x._someArr = new int;
		val = 0;
	}
	foo(x, x._someArr[0]);
}
I feel that in this case, the compiler should throw an error, since val would be a reference to a member pointed to by _someArr, which is a member of x. Although, I wonder if such analysis would be feasible? This case is trivial, but there could be more complicated cases.
September 03, 2017
On Sunday, 3 September 2017 at 06:11:10 UTC, Uknown wrote:
> On Sunday, 3 September 2017 at 03:49:21 UTC, Moritz Maxeiner wrote:
>> On Sunday, 3 September 2017 at 03:04:58 UTC, Uknown wrote:
>>> [...]
>>>
>>> void foo(ref RCArray!int arr, ref int val) @safe
>>> {
>>>     {
>>> 	auto copy = arr; //arr's (and copy's) reference counts are both 2
>>> 	arr = RCArray!int([]); // There is another owner, so arr
>>> 			       // forgets about the old payload
>>>     } // Last owner of the array ('copy') gets destroyed and happily
>>>       // frees the payload.
>>>     val = 3; // Oops.
>>> }
>>>
>>> Here, adding `restrict` to foo's parameters like so :
>>>
>>> void foo(restrict ref RCArray!int arr, restrict ref int val)
>>>
>>> would make the compiler statically enforce that the two references do not point to the same data. This would cause an error in main, since arr[0] is from the same block of memory as arr.
>>
>> How does the compiler know which member of RCArray!int to check for pointing to the same memory chunk as val?
>
> If I understand C's version of restrict correctly, the pointers must not refer to the same block. So extending the same here, val should not be allowed to be a reference to any members of RCArray!int.

AFAICT that's not what's needed here for safety, though: RCArray will have a member (`data`, `store`, or something like that) pointing to the actual elements (usually on the heap). You essentially want `val` not to point into the same memory chunk that `data` points into, which is different from `val` not pointing to a member of RCArray.

>
> This does seem to get more confusing when a struct has a member that points into the heap.
> e.g.
> [...]

Right, that's essentially what an RCArray does, as well.

> I feel that in this case, the compiler should throw an error, since val would be a reference to a member pointed to by _someArr, which is a member of x. Although, I wonder if such analysis would be feasible? This case is trivial, but there could be more complicated cases.

The main issue I see is that pointers/references can change at runtime, so I don't think a static analysis in the compiler can cover this in general (which, I think, is also why the C99 keyword is an optimization hint only).
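
A small C99 sketch of why this cannot be decided at compile time in general. The functions `add_into` and `combine` are hypothetical names for this illustration; whether the two pointers alias depends entirely on the runtime values of `i` and `j`:

```c
#include <stddef.h>

/* The restrict qualifiers promise that dst and src do not alias.
 * Passing the same address for both would be undefined behavior. */
void add_into(int *restrict dst, const int *restrict src)
{
    *dst += *src;
}

/* Whether &buf[i] and &buf[j] alias is only known once i and j are known,
 * so the caller has to branch at runtime to keep the promise. */
void combine(int *buf, size_t i, size_t j)
{
    if (i != j)
        add_into(&buf[i], &buf[j]); /* distinct elements: promise holds */
    else
        buf[i] += buf[i];           /* aliasing case handled without restrict */
}
```

A compiler that wanted to reject the aliasing call statically would have to prove `i != j` for every caller of `combine`, which is not possible when the indices come from input.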
September 03, 2017
On Sunday, 3 September 2017 at 12:59:25 UTC, Moritz Maxeiner wrote:
> [...]
> The main issue I see is that pointers/references can change at runtime, so I don't think a static analysis in the compiler can cover this in general (which, I think, is also why the C99 keyword is an optimization hint only).

Well, I thought about it, and I have to agree with you as far as pointers go. There seems to be no simple way for the compiler to safely ensure that two restrict pointers do not point to the same data. But for references, it seems trivial.

In order to do so, RCArray would first have to annotate its opIndex, opSlice and any other data-returning member functions with the restrict keyword, e.g.
struct RCArray(T) @safe
{
	private T[] _payload;
	/+some other functions needed to implement RCArray correctly+/
	restrict ref T opIndex(size_t i) {
		//implementation as usual
		return _payload[i];
	}
	restrict T[] opIndex() {
		return _payload;
	}
	//opSlice and the rest defined similarly
}

void main() @safe
{
	RCArray!int my_array;
	...
	auto t = my_array[0];//error: my_array.opIndex(0) is defined as restrict
	//This essentially prevents a second reference from existing in the same scope
	foo(my_array, my_array[0]);
	//error: call to foo introduces `restrict` data from the same container
	//into the scope of foo
}

void foo(ref RCArray!int arr, ref int val) @safe
{
	{
		auto copy = arr; //arr's (and copy's) reference counts are both 2
		arr = RCArray!int([]); // There is another owner, so arr
		// forgets about the old payload
	} // Last owner of the array ('copy') gets destroyed and happily
	// frees the payload.
	val = 3;//No longer an issue!
}

This is now no longer like the C99 keyword in behaviour, but on the bright side, with one annotation to the return types, RCArray suddenly doesn't need to worry about escaped references. Also, no modifications would be needed to the foo function.
This is potentially useful for other owning container types. The compiler could still use the information gained from the restrict annotation for optimizations, although such optimizations would be much less aggressive than in C.

Coming back to pointers, the only way I can see (short of bringing Rust's borrow checker to D) is to add additional annotations to function return values. The problem comes with code like this :

int * foo() @safe
{
	static int[1] data;
	return &data[0];
}
void main()
{
        int * restrict p1 = foo();
        int * restrict p2 = foo();//Should be error, but the compiler can't figure
                                  //this out without further annotations
}
September 03, 2017
On Sunday, 3 September 2017 at 15:39:58 UTC, Uknown wrote:
> On Sunday, 3 September 2017 at 12:59:25 UTC, Moritz Maxeiner wrote:
>> [...]
>> The main issue I see is that pointers/references can change at runtime, so I don't think a static analysis in the compiler can cover this in general (which, I think, is also why the C99 keyword is an optimization hint only).
>
> Well, I thought about it, and I have to agree with you as far as pointers go. There seems to be no simple way for the compiler to safely ensure that two restrict pointers do not point to the same data. But for references, it seems trivial.

References are just non-null syntax for pointers that take addresses implicitly on function call. Issues not related to null that pertain to pointers translate to references, as any (non-null) pointer can be turned into a reference (and vice versa):

---
void foo(int* a, bool b)
{
    if (b) bar(a);
    else baz(*a);
}

void bar(int* a) {}
void baz(ref int a) { bar(&a); }
---

>
> In order to do so, RCArray would first have to annotate its opIndex, opSlice and any other data-returning member functions with the restrict keyword, e.g.
> struct RCArray(T) @safe
> {
> 	private T[] _payload;
> 	/+some other functions needed to implement RCArray correctly+/
> 	restrict ref T opIndex(size_t i) {
> 		//implementation as usual
> 		return _payload[i];
> 	}
> 	restrict T[] opIndex() {
> 		return _payload;
> 	}
> 	//opSlice and the rest defined similarly
> }
> [...]

Note: There's no need to attribute the RCArray template as @safe (other than for debugging when developing the template). The compiler will derive it for each member if they are indeed @safe.

W.r.t. the rest: I don't think treating references as different from pointers can be done correctly, as any pointers/references can be interchanged at runtime.

>
> Coming back to pointers, the only way I can see (short of bringing Rust's borrow checker to D) is to add additional annotations to function return values. The problem comes with code like this :
>
> int * foo() @safe
> {
> 	static int[1] data;
> 	return &data[0];
> }
> void main()
> {
>         int * restrict p1 = foo();
>         int * restrict p2 = foo();//Should be error, but the compiler can't figure
>                                   //this out without further annotations
> }

Dealing with pointer aliasing in a generic way is a hard problem :p


September 04, 2017
On Sunday, 3 September 2017 at 16:55:51 UTC, Moritz Maxeiner wrote:
> On Sunday, 3 September 2017 at 15:39:58 UTC, Uknown wrote:
>> On Sunday, 3 September 2017 at 12:59:25 UTC, Moritz Maxeiner wrote:
>>> [...]
>>> The main issue I see is that pointers/references can change at runtime, so I don't think a static analysis in the compiler can cover this in general (which, I think, is also why the C99 keyword is an optimization hint only).
>>
>> Well, I thought about it, and I have to agree with you as far as pointers go. There seems to be no simple way for the compiler to safely ensure that two restrict pointers do not point to the same data. But for references, it seems trivial.
>
> References are just non-null syntax for pointers that take addresses implicitly on function call. Issues not related to null that pertain to pointers translate to references, as any (non-null) pointer can be turned into a reference (and vice versa):
>
> ---
> void foo(int* a, bool b)
> {
>     if (b) bar(a);
>     else baz(*a);
> }
>
> void bar(int* a) {}
> void baz(ref int a) { bar(&a); }
> ---

Yes. But this is what makes them so useful. You don't have to worry about null dereferences.

>>
>> In order to do so, RCArray would first have to annotate its opIndex, opSlice and any other data-returning member functions with the restrict keyword, e.g.
>> struct RCArray(T) @safe
>> {
>> 	private T[] _payload;
>> 	/+some other functions needed to implement RCArray correctly+/
>> 	restrict ref T opIndex(size_t i) {
>> 		//implementation as usual
>> 		return _payload[i];
>> 	}
>> 	restrict T[] opIndex() {
>> 		return _payload;
>> 	}
>> 	//opSlice and the rest defined similarly
>> }
>> [...]
>
> Note: There's no need to attribute the RCArray template as @safe (other than for debugging when developing the template). The compiler will derive it for each member if they are indeed @safe.

Indeed. I just wrote it to emphasize the fact that it's safe.

> W.r.t. the rest: I don't think treating references as different from pointers can be done correctly, as any pointers/references can be interchanged at runtime.

I'm not sure I understand how one could switch between pointers and refs at runtime. Could you please elaborate a bit or link to an example? Thanks.

>> Coming back to pointers, the only way I can see (short of bringing Rust's borrow checker to D) is to add additional annotations to function return values. The problem comes with code like this :
>>
>> int * foo() @safe
>> {
>> 	static int[1] data;
>> 	return &data[0];
>> }
>> void main()
>> {
>>         int * restrict p1 = foo();
>>         int * restrict p2 = foo();//Should be error, but the compiler can't figure
>>                                   //this out without further annotations
>> }
>
> Dealing with pointer aliasing in a generic way is a hard problem :p

Yep!
I feel there's little point in discussing the introduction of a new keyword if it only works on returning `ref` and has none of the original optimization advantages C brought.

On a side note, C99 added `inline` and `restrict`, 2 new keywords, without any apparent worry about breaking existing code. Normally they would have added _Restrict and _Inline, and then #defined those in a header (as was done with `_Bool` and `bool` in <stdbool.h>).
September 04, 2017
On Monday, 4 September 2017 at 02:43:48 UTC, Uknown wrote:
> On Sunday, 3 September 2017 at 16:55:51 UTC, Moritz Maxeiner wrote:
>> On Sunday, 3 September 2017 at 15:39:58 UTC, Uknown wrote:
>>> On Sunday, 3 September 2017 at 12:59:25 UTC, Moritz Maxeiner wrote:
>>>> [...]
>>>> The main issue I see is that pointers/references can change at runtime, so I don't think a static analysis in the compiler can cover this in general (which, I think, is also why the C99 keyword is an optimization hint only).
>>>
>>> Well, I thought about it, and I have to agree with you as far as pointers go. There seems to be no simple way for the compiler to safely ensure that two restrict pointers do not point to the same data. But for references, it seems trivial.
>>
>> References are just non-null syntax for pointers that take addresses implicitly on function call. Issues not related to null that pertain to pointers translate to references, as any (non-null) pointer can be turned into a reference (and vice versa):
>>
>> ---
>> void foo(int* a, bool b)
>> {
>>     if (b) bar(a);
>>     else baz(*a);
>> }
>>
>> void bar(int* a) {}
>> void baz(ref int a) { bar(&a); }
>> ---
>
> Yes. But this is what makes them so useful. You don't have to worry about null dereferences.

Indeed, but it also means that - other than null dereferencing - pointer issues can be made into reference issues by dereferencing a pointer and passing that into a function that takes that parameter by reference.

>
>>>
>>> In order to do so, RCArray would first have to annotate its opIndex, opSlice and any other data-returning member functions with the restrict keyword, e.g.
>>> struct RCArray(T) @safe
>>> {
>>> 	private T[] _payload;
>>> 	/+some other functions needed to implement RCArray correctly+/
>>> 	restrict ref T opIndex(size_t i) {
>>> 		//implementation as usual
>>> 		return _payload[i];
>>> 	}
>>> 	restrict T[] opIndex() {
>>> 		return _payload;
>>> 	}
>>> 	//opSlice and the rest defined similarly
>>> }
>>> [...]
>>
>> Note: There's no need to attribute the RCArray template as @safe (other than for debugging when developing the template). The compiler will derive it for each member if they are indeed @safe.
>
> Indeed. I just wrote it to emphasize the fact that it's safe.
>
>> W.r.t. the rest: I don't think treating references as different from pointers can be done correctly, as any pointers/references can be interchanged at runtime.
>
> I'm not sure I understand how one could switch between pointers and refs at runtime. Could you please elaborate a bit or link to an example? Thanks.

What I meant (and apparently expressed poorly) is that you can turn a pointer into a reference (as long as it's not null), and taking the address of a "ref" yields a pointer. As in my `foo` example above, which path is taken can change at runtime. You can, e.g., generate a reference to an object's member without the compiler being able to detect it, by calculating the appropriate pointer and then dereferencing it.
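
The "calculate the pointer, then dereference it" case can be sketched in C99 (where it is easiest to show; a D `ref` obtained from such a pointer would behave the same way). The struct `Counted` and the function `member_via_offset` are hypothetical names for this illustration:

```c
#include <stddef.h>

/* Hypothetical stand-in for a ref-counted object with a payload field. */
struct Counted {
    size_t refs;
    int value;
};

/* Reaches the `value` member through raw address arithmetic instead of
 * naming it. The result aliases obj->value, but nothing in the expression
 * tells a simple static check which member (if any) it refers to. */
int *member_via_offset(struct Counted *obj)
{
    unsigned char *raw = (unsigned char *)obj;
    return (int *)(raw + offsetof(struct Counted, value));
}
```

Any aliasing check that only looks at named member accesses would miss the pointer returned here.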

September 04, 2017
On Monday, 4 September 2017 at 04:10:44 UTC, Moritz Maxeiner wrote:
> What I meant (and apparently expressed poorly) is that you can turn a pointer into a reference (as long as it's not null), and taking the address of a "ref" yields a pointer. As in my `foo` example above, which path is taken can change at runtime. You can, e.g., generate a reference to an object's member without the compiler being able to detect it, by calculating the appropriate pointer and then dereferencing it.

I think I understand now. Thanks!
September 04, 2017
On 09/04/2017 06:10 AM, Moritz Maxeiner wrote:
> Indeed, but it also means that - other than null dereferencing - pointer issues can be made into reference issues by dereferencing a pointer and passing that into a function that takes that parameter by reference.

Why "other than null dereferencing"? You can dereference a null pointer and pass it in a ref parameter. That doesn't crash at the call site, but only when the callee accesses the parameter:

----
int f(ref int x, bool b) { return b ? x : 0; }
void main()
{
    int* p = null;

    /* Syntactically a null dereference, but doesn't crash: */
    f(*p, false);

    /* This crashes: */
    f(*p, true);
}
----