Is D's pointer subtraction more permissive than C (and C++)?
April 01, 2022
As the following quote from a Microsoft document claims, and as I've already known, pointer subtraction is legal only if the pointers are into the same array: "ANSI 3.3.6, 4.1.1 The type of integer required to hold the difference between two pointers to elements of the same array, ptrdiff_t." ( https://docs.microsoft.com/en-us/cpp/c-language/pointer-subtraction?view=msvc-170 )

I suspect "array" means "a block of memory" there because arrays are ordinarily malloc'ed pieces of memory in C.

1) Is D more permissive (or anemic in documentation:))? I ask because paragraph 5 below does not mention the pointers should be related in any way:

  https://dlang.org/spec/expression.html#pointer_arithmetic

2) Is subtracting pointers that used to be in the same array legal?

void main() {
  auto a = [ 1, 2 ];
  auto b = a;
  assert(a.ptr - b.ptr == 0);    // i) Obviously legal?

  // Drop the first element
  a = a[1..$];
  assert(a.ptr - b.ptr == 1);    // ii) GC-behaviorally legal?

  // Save the pointer
  const old_aPtr = a.ptr;
  // and move the array to another memory
  a.length = 1_000_000;
  // Expect a and b are on different blocks of memory
  assert(a.ptr != old_aPtr);

  assert(old_aPtr - b.ptr == 1);  // iii) Practically legal?
}

Regardless of your answer, I will go ahead and perform that last subtraction :).

Ali

P.S. I am trying to implement a type where slices will follow the elements as the elements may be moved in memory:

  const old_aPtr = a.ptr;

  a ~= e;
  if (a.ptr != old_aPtr) {
    // Elements are moved; adjust the slice
    assert(b.ptr >= old_aPtr);
    const old_bOffset = b.ptr - old_aPtr;
    b = a[old_bOffset .. $];
  }

If you ask why I don't keep offsets instead of slices to begin with, I want to use pointers (implicitly in D slices) so that they participate in the ownership of array elements so that the GC does not free earlier elements as the buffer is popFronted as well.
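
A minimal sketch of the adjustment described in the P.S., under stated assumptions: `appendAndFollow` is a hypothetical helper name, and unlike the snippet above it computes the offset before the append (while both pointers still refer to the same block) and preserves the view's length rather than extending it to `$`.

```d
// Hypothetical helper: append to `buffer` and keep `view` pointing at the
// same elements even if the append relocates the buffer.
void appendAndFollow(T)(ref T[] buffer, ref T[] view, T e)
{
    auto oldPtr = buffer.ptr;
    // Capture the offset while both pointers are into the same block.
    assert(view.ptr >= oldPtr);
    const size_t offset = view.ptr - oldPtr;
    const size_t len = view.length;

    buffer ~= e;
    if (buffer.ptr != oldPtr)
    {
        // Elements were moved; rebuild the view at the same offset.
        view = buffer[offset .. offset + len];
    }
}

void main()
{
    int[] a = [1, 2, 3];
    int[] b = a[1 .. $];
    foreach (i; 0 .. 10_000)
        appendAndFollow(a, b, i);

    // The view still starts one element into the (possibly relocated) buffer.
    assert(b.ptr == a.ptr + 1);
    assert(b[0] == 2 && b[1] == 3);
}
```

Taking the difference before the append sidesteps the question of whether a pointer into the old block may still be subtracted afterwards.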
April 01, 2022

On 4/1/22 11:52 AM, Ali Çehreli wrote:

> As the following quote from a Microsoft document claims, and as I've already known, pointer subtraction is legal only if the pointers are into the same array: "ANSI 3.3.6, 4.1.1 The type of integer required to hold the difference between two pointers to elements of the same array, ptrdiff_t." ( https://docs.microsoft.com/en-us/cpp/c-language/pointer-subtraction?view=msvc-170 )
>
> I suspect "array" means "a block of memory" there because arrays are ordinarily malloc'ed pieces of memory in C.

I'm assuming this has to do with the ability to detect artifacts of how the compiler/library lays out memory, which shouldn't really figure into program behavior.

In practice, I don't see how it affects the behavior of the compiler. When you subtract two pointers, I don't see how the compiler/optimizer can make some other decision based on the subtraction not being between two pointers to the same block of memory.

> 1) Is D more permissive (or anemic in documentation:))? I ask because paragraph 5 below does not mention the pointers should be related in any way:
>
>   https://dlang.org/spec/expression.html#pointer_arithmetic

I assume this is because nobody thought about it? But I don't see a problem with omitting that requirement.

> 2) Is subtracting pointers that used to be in the same array legal?
>
> void main() {
>   auto a = [ 1, 2 ];
>   auto b = a;
>   assert(a.ptr - b.ptr == 0);    // i) Obviously legal?
>
>   // Drop the first element
>   a = a[1..$];
>   assert(a.ptr - b.ptr == 1);    // ii) GC-behaviorally legal?
>
>   // Save the pointer
>   const old_aPtr = a.ptr;
>   // and move the array to another memory
>   a.length = 1_000_000;
>   // Expect a and b are on different blocks of memory
>   assert(a.ptr != old_aPtr);
>
>   assert(old_aPtr - b.ptr == 1);  // iii) Practically legal?
> }

Assuming C rules, I still think all this is legal. I'd even hazard to guess it's legal to do this:

struct S
{
   int[5] arr1, arr2;
}
void foo() {
   S s;
   ptrdiff_t p = &s.arr1[0] - &s.arr2[0];
}

Because you know the relationship between the pointers is defined. I.e. this is NEVER going to change from run to run, or build to build.

> Regardless of your answer, I will go ahead and perform that last subtraction :).
>
> Ali
>
> P.S. I am trying to implement a type where slices will follow the elements as the elements may be moved in memory:
>
>   const old_aPtr = a.ptr;
>
>   a ~= e;
>   if (a.ptr != old_aPtr) {
>     // Elements are moved; adjust the slice
>     assert(b.ptr >= old_aPtr);
>     const old_bOffset = b.ptr - old_aPtr;
>     b = a[old_bOffset .. $];
>   }
>
> If you ask why I don't keep offsets instead of slices to begin with, I want to use pointers (implicitly in D slices) so that they participate in the ownership of array elements so that the GC does not free earlier elements as the buffer is popFronted as well.

This should be fine. I would suggest storing things as offsets anyway, and having accessors for the pointers.
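
A rough sketch of what storing offsets with pointer accessors could look like. `OffsetView` and its members are hypothetical names, and the sketch assumes the owning array outlives the view; it does not address the ownership concern raised in the P.S.

```d
// Hypothetical type: a view stored as offsets into a growable array.
struct OffsetView(T)
{
    T[]* buffer;    // pointer to the owning array; assumed to outlive the view
    size_t start;   // index of the first element in the view
    size_t length;  // number of elements in the view

    // Accessors recompute the slice and pointer on demand, so a
    // relocation of the underlying elements needs no adjustment.
    T[] slice() { return (*buffer)[start .. start + length]; }
    T* ptr() { return (*buffer).ptr + start; }
}

void main()
{
    int[] a = [1, 2, 3];
    auto v = OffsetView!int(&a, 1, 2);

    a.length = 1_000_000;   // very likely relocates the elements

    assert(v.slice == [2, 3]);
    assert(v.ptr == a.ptr + 1);
}
```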

-Steve

April 01, 2022
On Friday, 1 April 2022 at 15:52:39 UTC, Ali Çehreli wrote:
> 1) Is D more permissive (or anemic in documentation:))? I ask because paragraph 5 below does not mention the pointers should be related in any way:
>
>   https://dlang.org/spec/expression.html#pointer_arithmetic

The spec is permissive, but I would not be terribly surprised if the implementation (specifically, LDC and GDC, which share backends with C compilers) actually enforced the same restrictions as C. There are similar issues with null dereferences: D's spec says they have defined behavior, but actual D compilers fail to guarantee this in some cases.

> 2) Is subtracting pointers that used to be in the same array legal?
>
> void main() {
>   auto a = [ 1, 2 ];
>   auto b = a;
>   assert(a.ptr - b.ptr == 0);    // i) Obviously legal?
>
>   // Drop the first element
>   a = a[1..$];
>   assert(a.ptr - b.ptr == 1);    // ii) GC-behaviorally legal?
>
>   // Save the pointer
>   const old_aPtr = a.ptr;
>   // and move the array to another memory
>   a.length = 1_000_000;
>   // Expect a and b are on different blocks of memory
>   assert(a.ptr != old_aPtr);
>
>   assert(old_aPtr - b.ptr == 1);  // iii) Practically legal?
> }

According to the C rules, (i) and (ii) are legal, since they point to the same memory block, but (iii) is illegal.
April 01, 2022

On 4/1/22 2:44 PM, Paul Backus wrote:

> On Friday, 1 April 2022 at 15:52:39 UTC, Ali Çehreli wrote:
>> 2) Is subtracting pointers that used to be in the same array legal?
>>
>> void main() {
>>   auto a = [ 1, 2 ];
>>   auto b = a;
>>   assert(a.ptr - b.ptr == 0);    // i) Obviously legal?
>>
>>   // Drop the first element
>>   a = a[1..$];
>>   assert(a.ptr - b.ptr == 1);    // ii) GC-behaviorally legal?
>>
>>   // Save the pointer
>>   const old_aPtr = a.ptr;
>>   // and move the array to another memory
>>   a.length = 1_000_000;
>>   // Expect a and b are on different blocks of memory
>>   assert(a.ptr != old_aPtr);
>>
>>   assert(old_aPtr - b.ptr == 1);  // iii) Practically legal?
>> }
>
> According to the C rules, (i) and (ii) are legal, since they point to the same memory block, but (iii) is illegal.

(iii) is the same as (ii) because old_aPtr is the same as a.ptr at that time.

-Steve

April 01, 2022

On Friday, 1 April 2022 at 19:43:01 UTC, Steven Schveighoffer wrote:

> On 4/1/22 2:44 PM, Paul Backus wrote:
>> According to the C rules, (i) and (ii) are legal, since they point to the same memory block, but (iii) is illegal.
>
> (iii) is the same as (ii) because old_aPtr is the same as a.ptr at that time.
>
> -Steve

You're right; my mistake. I misread it as a.ptr - b.ptr, since that's what the other two asserts do.

April 01, 2022
On 4/1/22 10:39, Steven Schveighoffer wrote:

> I don't see how the compiler/optimizer
> can make some other decision based on the subtraction not being between
> two pointers to the same block of memory.

I think this rule is related to C's accepting wildly different platforms, some of which may have different kinds of memory. Two pointers to different kinds of memory may not be subtracted.

Ali

April 01, 2022

On 4/1/22 4:22 PM, Ali Çehreli wrote:

> On 4/1/22 10:39, Steven Schveighoffer wrote:
>> I don't see how the compiler/optimizer
>> can make some other decision based on the subtraction not being between
>> two pointers to the same block of memory.
>
> I think this rule is related to C's accepting wildly different platforms, some of which may have different kinds of memory. Two pointers to different kinds of memory may not be subtracted.

Well, can the pointers be subtracted? Yes. What is the result? If they are in the same block, the difference in elements between two pointers. If they are not in the same block, anything.
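
To make the same-block case concrete, here is a small sketch (the variable names are illustrative only) showing that the difference is counted in elements, not bytes:

```d
void main()
{
    int[4] arr = [10, 20, 30, 40];
    int* p = &arr[0];
    int* q = &arr[3];

    // Both pointers are into the same array: the result is the
    // number of elements between them.
    assert(q - p == 3);

    // Viewed as byte pointers, the distance scales by the element size.
    assert(cast(ubyte*) q - cast(ubyte*) p == 3 * int.sizeof);
}
```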

This is why I don't know that it's important to avoid it.

-Steve

April 01, 2022
On 4/1/22 13:47, Steven Schveighoffer wrote:

>> I think this rule is related to C's accepting wildly different
>> platforms, some of which may have different kinds of memory. Two
>> pointers to different kinds of memory may not be subtracted.
>
> Well, can the pointers be subtracted? Yes.

My understanding is that depending on the CPU, certain operations would make the CPU barf. I am not sure but the old protected memory, extended memory, etc. systems might not be able to subtract between the systems at all. (Not sure; I am making this up.)

> This is why I don't know that it's important to avoid it.

I will not avoid it. :)

Ali

April 01, 2022
On Fri, Apr 01, 2022 at 01:56:19PM -0700, Ali Çehreli via Digitalmars-d wrote:
> On 4/1/22 13:47, Steven Schveighoffer wrote:
> 
> >> I think this rule is related to C's accepting wildly different platforms, some of which may have different kinds of memory. Two pointers to different kinds of memory may not be subtracted.
> >
> > Well, can the pointers be subtracted? Yes.
> 
> My understanding is that depending on the CPU, certain operations would make the CPU barf. I am not sure but the old protected memory, extended memory, etc. systems might not be able to subtract between the systems at all. (Not sure; I am making this up.)
[...]

In the bad ole days of segmented protected memory (around the days of the 386 or 486, IIRC), you could have memory with different segment prefixes, referenced using a convoluted scheme of far ptrs and near ptrs. Near ptrs are relative to a particular segment; subtracting near ptrs associated with diverse segment pointers yields nonsensical values. You can subtract far ptrs, sorta-kinda, but the results are likely to be either garbage, or else refer to an address that can't be addressed with existing segment pointers. So basically, it's Trouble with a capital T.

On modern machines, though, this is no longer relevant. I think people figured out real quick that segmented addressing is just way more trouble than it's worth, so we came running back, tail between legs, to the flat memory model and embraced it like there's no tomorrow. :-D


T

-- 
What doesn't kill me makes me stranger.
April 01, 2022
On Friday, 1 April 2022 at 21:20:00 UTC, H. S. Teoh wrote:
> On Fri, Apr 01, 2022 at 01:56:19PM -0700, Ali Çehreli via Digitalmars-d wrote:
>> On 4/1/22 13:47, Steven Schveighoffer wrote:
>> 
>> >> I think this rule is related to C's accepting wildly different platforms, some of which may have different kinds of memory. Two pointers to different kinds of memory may not be subtracted.
>> >
>> > Well, can the pointers be subtracted? Yes.
>> 
>> My understanding is that depending on the CPU, certain operations would make the CPU barf. I am not sure but the old protected memory, extended memory, etc. systems might not be able to subtract between the systems at all. (Not sure; I am making this up.)
> [...]
>
> In the bad ole days of segmented protected memory (around the days of the 386 or 486, IIRC), you could have memory with different segment prefixes, referenced using a convoluted scheme of far ptrs and near ptrs. Near ptrs are relative to a particular segment; subtracting near ptrs associated with diverse segment pointers yields nonsensical values. You can subtract far ptrs, sorta-kinda, but the results are likely to be either garbage, or else refer to an address that can't be addressed with existing segment pointers. So basically, it's Trouble with a capital T.
>
> On modern machines, though, this is no longer relevant. I think people figured out real quick that segmented addressing is just way more trouble than it's worth, so we came running back, tail between legs, to the flat memory model and embraced it like there's no tomorrow. :-D
>
>
> T

Ahh the "good" old days were not that good really when it came to addressing :-)

All x86 CPUs still start up in real mode, and the "flat" modes (protected mode, long mode, etc.) are still reached by software putting the CPU into that mode. Today that happens in your boot loader, as pretty much all desktop software runs in 64-bit flat mode, but back in the 90s most software had to manage this itself.

There is an old, yet still interesting, blog post about Win95 and the world as it was when we were all slowly migrating from 16-bit real-mode DOS to 32-bit flat-mode OSes:

https://devblogs.microsoft.com/oldnewthing/20071224-00/?p=24063
(Any blog by Raymond Chen is well worth a read IMO, I always learn something new)

Sorry my post has gone completely off topic.