Pointer semantics in CTFE (page 2) - D Programming Language Discussion Forum


May 26, 2012

Re: Pointer semantics in CTFE

Posted by Don
in reply to Walter Bright

On 26.05.2012 01:09, Walter Bright wrote:
> On 5/24/2012 11:50 PM, Don Clugston wrote:
>> Opinions?
>
>
> My experience with such special cases is that users lose the ability to
> reason about what code will work in CTFE and what will not. Its
> operation will become a magical, unknowable black box. Meanwhile, you'll
> be endlessly chasing more special cases and rainbows.
>
> I suggest not supporting it unless it can be done in the general case.
>
> One solution might be to attach to each CTFE pointer a reference to the
> object it is pointing into. Then, pointer comparisons within the same
> object can work, and comparing pointers from different objects can
> explicitly and predictably be not supported.

That is exactly how it works right now.

The problem is, if pstart and pend point to the beginning and end of an array, then given another pointer q, there is AFAIK no defined way in C for checking if q is between pstart and pend, even though I'm sure everyone does it. (eg, I don't know how to implement memcpy otherwise).
If q is part of that array, (q >= pstart && q <= pend) is well-defined, and returns true.
But if it isn't part of that array, q >= pstart is undefined.

In reality of course, if q is unrelated to the array then EITHER q < pstart, OR q > pend. So (q >= pstart && q <= pend) is ALWAYS false, and the result is completely predictable.

ie, the full expression can be well-defined even though its two subexpressions are not.

This can be implemented.
Given an expression like q >= pstart, where pstart and q are unrelated, it's possible to make it have the value 'false_or_undefined'. If it is ANDed with another expression that is also false_or_undefined and goes in the other direction (q <= pend), then the whole thing is false. Otherwise, it generates an error.

The reason it would have to be a special case is because of things like:
bool b = false;
q >= pstart && (b = true, true) && q <= pend
which is undefined because the result of the first comparison has been stored.
So it would only be valid for one-comparison-in-each-direction-anded-together (and the equivalent for ||).
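
A minimal sketch of the 'false_or_undefined' idea, assuming a CTFE pointer can be modelled as a block identity plus an offset. The names CtfePtr, Cmp, geq, leq and inRange are hypothetical and are not part of DMD; the sketch only covers the one-comparison-in-each-direction case described above.

```d
// Hypothetical model of a CTFE pointer: the memory block it was created
// from, plus an element offset into that block.
struct CtfePtr
{
    int block;     // identity of the originating memory block
    size_t offset; // offset within that block
}

// Result of a one-sided ordered comparison.
enum Cmp { False, True, FalseOrUndefined }

// q >= p; not decidable when the pointers come from different blocks.
Cmp geq(CtfePtr q, CtfePtr p)
{
    if (q.block != p.block)
        return Cmp.FalseOrUndefined;
    return q.offset >= p.offset ? Cmp.True : Cmp.False;
}

// q <= p; same rule in the other direction.
Cmp leq(CtfePtr q, CtfePtr p)
{
    if (q.block != p.block)
        return Cmp.FalseOrUndefined;
    return q.offset <= p.offset ? Cmp.True : Cmp.False;
}

// Models (q >= pstart && q <= pend).
bool inRange(CtfePtr q, CtfePtr pstart, CtfePtr pend)
{
    auto lo = geq(q, pstart);
    auto hi = leq(q, pend);
    // Unrelated to both ends: the combined expression is simply false.
    if (lo == Cmp.FalseOrUndefined && hi == Cmp.FalseOrUndefined)
        return false;
    // Only one side is undecidable: reject, as an error would at compile time.
    if (lo == Cmp.FalseOrUndefined || hi == Cmp.FalseOrUndefined)
        assert(0, "unsupported pointer comparison");
    return lo == Cmp.True && hi == Cmp.True;
}
```

The mixed case (only one side undecidable) arises when pstart and pend themselves come from different blocks, which is why it is treated as an error rather than as false.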

May 26, 2012

Re: Pointer semantics in CTFE

Posted by Walter Bright
in reply to Don

On 5/25/2012 8:24 PM, Don wrote:
> The problem is, if pstart and pend point to the beginning and end of an array,
> then given another pointer q, there is AFAIK no defined way in C for checking if
> q is between pstart and pend, even though I'm sure everyone does it. (eg, I
> don't know how to implement memcpy otherwise).
> If q is part of that array, (q >= pstart && q <= pend) is well-defined, and
> returns true.
> But if it isn't part of that array, q >= pstart is undefined.
>
> In reality of course, if q is unrelated to the array then EITHER q < pstart, OR
> q > pend. So (q >= pstart && q <= pend) is ALWAYS false, and the result is
> completely predictable.

If q is unrelated to the array, which you can tell because q, pstart, and pend must all refer to the same object that CTFE knows about, then you can have CTFE refuse to execute it.


> ie, the full expression can be well-defined even though its two subexpressions
> are not.
>
> This can be implemented.
> Given an expression like q >= pstart, where pstart and q are unrelated, it's possible
> to make it have the value 'false_or_undefined'. If it is anded with another
> expression that is also false_or_undefined, and goes the other direction ( q <=
> pend) , then the whole thing is false. Otherwise, it generates an error.
>
> The reason it would have to be a special case is because of things like:
> bool b = false;
> q >= pstart && (b = true, true) && q <= pend
> which is undefined because the result of the first comparison has been stored.
> So it would only be valid for one-comparison-in-each-direction-anded-together
> (and the equivalent for ||).

I still don't think you need to make a special case for it. If upon initialization of q, pstart, and pend, CTFE makes a note of what memory object they are initialized from, then you can unambiguously tell if they still point within the same object or not.

May 26, 2012

Re: Pointer semantics in CTFE

Posted by Don
in reply to Walter Bright

On 26.05.2012 05:35, Walter Bright wrote:
> On 5/25/2012 8:24 PM, Don wrote:
>> The problem is, if pstart and pend point to the beginning and end of
>> an array,
>> then given another pointer q, there is AFAIK no defined way in C for
>> checking if
>> q is between pstart and pend, even though I'm sure everyone does it.
>> (eg, I
>> don't know how to implement memcpy otherwise).
>> If q is part of that array, (q >= pstart && q <= pend) is
>> well-defined, and
>> returns true.
>> But if it isn't part of that array, q >= pstart is undefined.
>>
>> In reality of course, if q is unrelated to the array then EITHER q <
>> pstart, OR
>> q > pend. So (q >= pstart && q <= pend) is ALWAYS false, and the
>> result is
>> completely predictable.
>
> If q is unrelated to the array, which you can tell because q, pstart,
> and pend must all refer to the same object that CTFE knows about, then
> you can have CTFE refuse to execute it.

Yes, that's what happens now. But that doesn't help the programmer.

If it is inside, no problem, the expression is true. But if it is not inside, the expression is not false -- it's a compile-time error.

So you can't use it as a test for if it is inside the same object.

I was confused about how memmove can work in C without relying on undefined behaviour. But I just read
http://www.cplusplus.com/reference/clibrary/cstring/memcpy/
which defines it in terms of an intermediate buffer.

So maybe the current CTFE implementation is _exactly_ consistent with the C spec. If that's true, though, I find it pretty incredible that there is no way to find out if a pointer points into a particular array, even if you have pointers to both the start and end of that array.

(OK, I guess you can iterate from start to end, checking for equality, but .. bleah .. it's a terrible abstraction inversion).
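
For reference, the iteration workaround could be sketched as below. The helper name pointsInto is hypothetical, and the sketch assumes that only equality (never ordering) is used on the pointers. It is linear in the array length, which is the abstraction inversion complained about above.

```d
// Returns true if q is the address of one of arr's elements.
// Uses identity comparison only, so no ordered comparison between
// possibly unrelated pointers is ever evaluated.
bool pointsInto(T)(const(T)* q, const(T)[] arr)
{
    foreach (i; 0 .. arr.length)
    {
        if (q is &arr[i])
            return true;
    }
    return false;
}
```

Note that a one-past-the-end pointer is reported as outside the array here, unlike the (q >= pstart && q <= pend) form.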

>
>
>> ie, the full expression can be well-defined even though its two
>> subexpressions
>> are not.
>>
>> This can be implemented.
>> Given an expression like q >= pstart, where pstart and q are unrelated,
>> it's possible
>> to make it have the value 'false_or_undefined'. If it is anded with
>> another
>> expression that is also false_or_undefined, and goes the other
>> direction ( q <=
>> pend) , then the whole thing is false. Otherwise, it generates an error.
>>
>> The reason it would have to be a special case is because of things like:
>> bool b = false;
>> q >= pstart && (b = true, true) && q <= pend
>> which is undefined because the result of the first comparison has been
>> stored.
>> So it would only be valid for
>> one-comparison-in-each-direction-anded-together
>> (and the equivalent for ||).
>
> I still don't think you need to make a special case for it. If upon
> initialization of q, pstart, and pend, CTFE makes a note of what memory
> object they are initialized from, then you can unambiguously tell if
> they still point within the same object or not.

May 26, 2012

Re: Pointer semantics in CTFE

Posted by deadalnix
in reply to Steven Schveighoffer

On 25/05/2012 17:38, Steven Schveighoffer wrote:
> On Fri, 25 May 2012 11:08:52 -0400, David Nadlinger <see@klickverbot.at>
> wrote:
>
>> On Friday, 25 May 2012 at 14:06:55 UTC, Steven Schveighoffer wrote:
>>> Remove the restriction. The code is unpredictable, but not invalid.
>>> It just means you need to take more care when writing such code.
>>
>> The question is whether we want to allow unpredictable behavior at
>> compile time.
>
> Given the alternative, yes I think we do.
>
> -Steve

What? We absolutely don't want that, and the alternative is pretty simple: use slices.

May 27, 2012

Re: Pointer semantics in CTFE

Posted by Walter Bright
in reply to Don

On 5/26/2012 3:59 AM, Don wrote:
> Yes, that's what happens now. But that doesn't help the programmer.
>
> If it is inside, no problem, the expression is true. But if it is not inside,
> the expression is not false -- it's a compile-time error.

Ok, I understand now what you meant.

> So you can't use it as a test for if it is inside the same object.
>
> I was confused about how memmove can work in C without relying on undefined
> behaviour. But I just read
> http://www.cplusplus.com/reference/clibrary/cstring/memcpy/
> which defines it in terms of an intermediate buffer.
>
> So maybe, the current CTFE implementation is _exactly_ consistent with the C
> spec. If that's true, though, I find it pretty incredible that there is no way
> to find out if a pointer points into a particular array, even if you have pointers
> to both the start and end of that array.
>
> (OK, I guess you can iterate from start to end, checking for equality, but ..
> bleah .. it's a terrible abstraction inversion).

You could implement it as simply comparing the addresses - you'd be no worse off than C is, and you would get the correct answer for pointers both in and out of the array without needing special cases.

May 27, 2012

Re: Pointer semantics in CTFE

Posted by Artur Skawina
in reply to Walter Bright

On 05/27/12 02:45, Walter Bright wrote:
> On 5/26/2012 3:59 AM, Don wrote:
>> Yes, that's what happens now. But that doesn't help the programmer.
>>
>> If it is inside, no problem, the expression is true. But if it is not inside, the expression is not false -- it's a compile-time error.
> 
> Ok, I understand now what you meant.
> 
>> So you can't use it as a test for if it is inside the same object.
>>
>> I was confused about how memmove can work in C without relying on undefined
>> behaviour. But I just read
>> http://www.cplusplus.com/reference/clibrary/cstring/memcpy/
>> which defines it in terms of an intermediate buffer.
>>
>> So maybe the current CTFE implementation is _exactly_ consistent with the C spec. If that's true, though, I find it pretty incredible that there is no way to find out if a pointer points into a particular array, even if you have pointers to both the start and end of that array.
>>
>> (OK, I guess you can iterate from start to end, checking for equality, but .. bleah .. it's a terrible abstraction inversion).
> 
> You could implement it as simply comparing the addresses - you'd be no worse off than C is, and you would get the correct answer for pointers both in and out of the array without needing special cases.
> 

Note that if pointer comparison is allowed, subtraction should be too. I.e., if 'p1>=p2' works, then it is reasonable to expect 'p1-p2<=i' to also work.

artur

May 29, 2012

Re: Pointer semantics in CTFE

Posted by Don Clugston
in reply to Walter Bright

On 27/05/12 02:45, Walter Bright wrote:
> On 5/26/2012 3:59 AM, Don wrote:
>> Yes, that's what happens now. But that doesn't help the programmer.
>>
>> If it is inside, no problem, the expression is true. But if it is not
>> inside,
>> the expression is not false -- it's a compile-time error.
>
> Ok, I understand now what you meant.
>
>> So you can't use it as a test for if it is inside the same object.
>>
>> I was confused about how memmove can work in C without relying on
>> undefined
>> behaviour. But I just read
>> http://www.cplusplus.com/reference/clibrary/cstring/memcpy/
>> which defines it in terms of an intermediate buffer.
>>
>> So maybe, the current CTFE implementation is _exactly_ consistent with
>> the C
>> spec. If that's true, though, I find it pretty incredible that there
>> is no way
>> to find out if a pointer points into a particular array, even if you have
>> pointers
>> to both the start and end of that array.
>>
>> (OK, I guess you can iterate from start to end, checking for equality,
>> but ..
>> bleah .. it's a terrible abstraction inversion).
>
> You could implement it as simply comparing the addresses - you'd be no
> worse off than C is, and you would get the correct answer for pointers
> both in and out of the array without needing special cases.

I think that's a no-go.
Implementation-specific behaviour at runtime is bad enough, but at compile time, it's truly horrible. Consider that any change to unrelated code can change the results. Something that makes it really terrible is that the same function can be called in CTFE once before inlining, and once after. Good luck tracking that down.
And at runtime, you have a debugger.

It's not hard to make it work in all cases of:

one-sided-comparison && one-sided-comparison
one-sided-comparison || one-sided-comparison

one-sided-comparison:
    ptr_expression RELOP ptr_expression
    ! one-sided-comparison

RELOP:
   >
   <
   >=
   <=

where ptr_expression is any expression of type pointer.

And by all cases, I really mean all (the code for the four pointer expressions need not have anything in common).

It's only when you allow other expressions to be present, that things get hairy.

May 29, 2012

Re: Pointer semantics in CTFE

Posted by Michel Fortin
in reply to Don Clugston

On 2012-05-29 13:29:35 +0000, Don Clugston <dac@nospam.com> said:

> On 27/05/12 02:45, Walter Bright wrote:
>> You could implement it as simply comparing the addresses - you'd be no
>> worse off than C is, and you would get the correct answer for pointers
>> both in and out of the array without needing special cases.
> 
> I think that's a no-go.
> Implementation-specific behaviour at runtime is bad enough, but at compile time, it's truly horrible. Consider that any change to unrelated code can change the results. Something that makes it really terrible is that the same function can be called in CTFE once before inlining, and once after. Good luck tracking that down.
> And at runtime, you have a debugger.

Wouldn't it be possible to just catch the case where you compare two pointers not assigned from the same memory block and issue an error?

Here's an idea: make each CTFE pointer some kind of struct with a pointer to the memory block and an offset. When comparing, if the memory block isn't the same, it's an error. If you add or subtract to a pointer, it'll still belong to the same memory block but with a different offset, thus it remains comparable. When dereferencing, you can make sure the pointer still points inside the block, assuming the block knows its length.
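
A rough sketch of that representation, under the assumption that a memory block is just an int array that knows its length. Block, FatPtr, advance, less and deref are hypothetical names for illustration only and do not describe how DMD's interpreter is actually written.

```d
// Hypothetical memory block; the slice carries the block's length.
struct Block
{
    int[] data;
}

// Hypothetical CTFE pointer: owning block plus an offset into it.
struct FatPtr
{
    Block* block;
    ptrdiff_t offset;
}

// Pointer arithmetic keeps the block and changes only the offset,
// so the result stays comparable with other pointers into that block.
FatPtr advance(FatPtr p, ptrdiff_t n)
{
    return FatPtr(p.block, p.offset + n);
}

// Ordered comparison is an error across different blocks.
bool less(FatPtr a, FatPtr b)
{
    if (a.block !is b.block)
        throw new Exception("comparing pointers into different memory blocks");
    return a.offset < b.offset;
}

// Dereferencing checks that the pointer still lies inside its block.
ref int deref(FatPtr p)
{
    if (p.offset < 0 || cast(size_t) p.offset >= p.block.data.length)
        throw new Exception("pointer is outside its memory block");
    return p.block.data[cast(size_t) p.offset];
}
```

In a real interpreter the two exceptions would presumably be compile-time errors; exceptions are used here only so the sketch can be exercised at runtime.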

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

May 29, 2012

Re: Pointer semantics in CTFE

Posted by Artur Skawina
in reply to Michel Fortin

On 05/29/12 16:20, Michel Fortin wrote:
> On 2012-05-29 13:29:35 +0000, Don Clugston <dac@nospam.com> said:
> 
>> On 27/05/12 02:45, Walter Bright wrote:
>>> You could implement it as simply comparing the addresses - you'd be no worse off than C is, and you would get the correct answer for pointers both in and out of the array without needing special cases.
>>
>> I think that's a no-go.
>> Implementation-specific behaviour at runtime is bad enough, but at compile time, it's truly horrible. Consider that any change to unrelated code can change the results. Something that makes it really terrible is that the same function can be called in CTFE once before inlining, and once after. Good luck tracking that down.
>> And at runtime, you have a debugger.
> 
> Wouldn't it be possible to just catch the case where you compare two pointers not assigned from the same memory block and issue an error?
> 
> Here's an idea: make each CTFE pointer some kind of struct with a pointer to the memory block and an offset. When comparing, if the memory block isn't the same, it's an error. If you add or subtract to a pointer, it'll still belong to the same memory block but with a different offset, thus it remains comparable. When dereferencing, you can make sure the pointer still points inside the block, assuming the block knows its length.
> 

   int[1024] a;
   int[] da = a[0..1024];

   if (whatever)
      da = da[3..14];
   if (something_else)
      da = [42] ~ da;
   // etc

   if (da_is_a_slice_of_a())
      still_inside_a();

How do you implement da_is_a_slice_of_a()?

Disallowing pointer comparisons means normal code needs __ctfe special casing in order to work at compile time at all, like it was done for the "bug" mentioned in this thread.

artur

May 29, 2012

Re: Pointer semantics in CTFE

Posted by Michel Fortin
in reply to Artur Skawina

On 2012-05-29 15:09:00 +0000, Artur Skawina <art.08.09@gmail.com> said:

>    int[1024] a;
>    int[] da = a[0..1024];
> 
>    if (whatever)
>       da = da[3..14];
>    if (something_else)
>       da = [42] ~ da;
>    // etc
> 
>    if (da_is_a_slice_of_a())
>       still_inside_a();
> 
> How do you implement da_is_a_slice_of_a()?

Indeed, for that to work you'd still need to handle this case specially. My bad for not catching that.

Personally, I think it'd be much cleaner to go with some kind of magic function than trying to match the condition against a predefined pattern. Something like da.isSliceOf(a), which could do the usual pointer thing at runtime and call some sort of CTFE intrinsic at compile-time.
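
A sketch of what such a function might look like. The name isSliceOf comes from the suggestion above, but the signature is an assumption, and the CTFE intrinsic does not exist, so the compile-time branch is only a placeholder that fails.

```d
// Hypothetical helper: true if `inner` lies entirely within `outer`.
bool isSliceOf(T)(const(T)[] inner, const(T)[] outer)
{
    if (__ctfe)
    {
        // Placeholder: a CTFE intrinsic would be called here.
        // No such intrinsic exists, so this sketch refuses to run in CTFE.
        assert(0, "isSliceOf needs a CTFE intrinsic");
    }
    else
    {
        // "The usual pointer thing" at runtime: compare raw addresses.
        auto lo = cast(size_t) outer.ptr;
        auto hi = lo + outer.length * T.sizeof;
        auto a = cast(size_t) inner.ptr;
        auto b = a + inner.length * T.sizeof;
        return a >= lo && b <= hi;
    }
}
```

With this, da_is_a_slice_of_a() in the example quoted above would become isSliceOf(da, a[]) at runtime.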

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
