Pointers - Is it safe to point to invalid memory?
4 days ago
Brother Bill

It is obvious that reading or writing to invalid memory can result in "undefined behavior".
But is merely pointing to invalid memory "harmful"?

The documentation states that going one past the last element of a slice is acceptable.
But is it also safe to go 10, 100 or 1000 items past the last element of a slice?

4 days ago
On Sat, Aug 16, 2025 at 11:56:43AM +0000, Brother Bill via Digitalmars-d-learn wrote:
> It is obvious that reading or writing to invalid memory can result in
> "undefined behavior".
> But is merely pointing to invalid memory "harmful"?
> 
> The documentation states that going one past the last element of a slice is acceptable.

Where does it say this?  This is wrong.


> But is it also safe to go 10, 100 or 1000 items past the last element of a slice?

Of course not.

//

It all depends on the interpretation you're using.

Technically speaking, a pointer is just a memory address. An unsigned integer.  There's nothing inherently "harmful" about an integer.

The problem arises when you interpret it as a memory address.  Once you interpret it as an address, you're likely to pass it around to code that expects to be able to read or write to memory at that address.  And that's where the problem arises.  There are expectations placed upon an unsigned integer that's to be interpreted as an address, such as that you can read memory from that address.  The set of integers that are valid addresses is a subset of the set of all (representable) integers.
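As a minimal sketch of the distinction above, the following D snippet views a pointer as a plain unsigned integer and only then dereferences it. The helper name `asInteger` is hypothetical and exists only for illustration.

```d
import std.stdio : writefln;

/// Hypothetical helper: views a pointer's address as a plain unsigned integer.
size_t asInteger(const(int)* p)
{
    return cast(size_t) p;
}

void main()
{
    int value = 42;
    int* p = &value;

    // Converting to an integer never touches the memory behind the pointer.
    writefln("address: 0x%x", asInteger(p));

    // Dereferencing is the step that relies on the address being valid
    // and on an int actually being stored there.
    assert(*p == 42);
}
```

The conversion itself reads no memory; the expectations only come into play at the dereference.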

It's not just about pointing to "invalid memory" either; it's also about not breaking the expectations of the type system.  When handed a pointer to a string, for example, the expectation is that when you read memory at that address, you will find a valid sequence of values that represents a string.  If you treat a random unsigned integer as a pointer to a string, you may end up reading a sequence of values that *aren't* a string, thereby obtaining invalid data.  Or worse, if you write to that address, then somebody else (i.e. some other code) that put the previous data there may try to read it later, expecting valid data of the previous type, and get instead something that's no longer a valid value of that type.  The set of memory addresses containing data of the correct type is narrower than the set of valid addresses (addresses assigned to you by the OS), and the set of valid addresses is narrower than the set of all addresses, most of which will trigger an invalid memory access from your OS because that address wasn't assigned to your program and the OS will step in to terminate your program if you try to access it.

//

Now in theory you can allow arbitrary values in your pointers, and only check for validity when you actually dereference them, analogous to how, given a street address handed to you on a piece of paper, you'd check whether that address actually exists before heading out there. In practice, though, this is unworkable, because it means every pointer dereference your program makes would have to run through some global registry of valid addresses and check whether the data currently stored there is of the correct type.  This would be extremely slow and the simplest of operations would take forever to run. (Not to mention the issues of keeping said global registry up-to-date as the program runs and modifies its data.)

To eliminate this onerous overhead every time you dereference a pointer, programming languages make the simplification that *all* pointers must always contain a valid address of the correct type (or a special null value, that indicates that there is no address at all).  The idea being that before even assigning a given integer value to a pointer, you'd ensure that it was a valid address to begin with, so that by the time you try to dereference the pointer, you can be confident that it's a valid address and simply dereference it without further verification.
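A small sketch of that convention in D, assuming a hypothetical helper `valueOr`: the only check made before dereferencing is the null test, and any non-null pointer is trusted to hold a valid address of the right type.

```d
/// Hypothetical helper: returns the pointed-to value, or `fallback` when `p` is null.
int valueOr(const(int)* p, int fallback)
{
    // Null is the one "no address" value that can be tested cheaply.
    // Any other value is assumed valid; nothing here can verify it.
    return p is null ? fallback : *p;
}

void main()
{
    int x = 7;
    assert(valueOr(&x, 0) == 7);
    assert(valueOr(null, -1) == -1);
}
```

If a caller passed a garbage non-null address, `valueOr` would dereference it without complaint, which is exactly the broken assumption being described.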

This is essentially the whole point of a type system -- to ensure that a given piece of data is a valid representation of its intended type, so that you can safely manipulate it.

Doing things like assigning non-pointer values to a pointer breaks the guarantees that the type system gives you, because the assumptions made by all those places in your code that dereference this pointer are now invalid, and all bets are off as to what will happen when you run that code. This is why it's invalid to point to "invalid" memory.  The act of pointing itself is "harmless" -- since it's just some integer address -- but the harm comes from the broken assumptions of the rest of the code that assumes that the address contained in the pointer is a valid address containing data of the expected type.


T

-- 
A mathematician learns more and more about less and less, until he knows everything about nothing; whereas a philosopher learns less and less about more and more, until he knows nothing about everything.
4 days ago
On Saturday, 16 August 2025 at 14:43:25 UTC, H. S. Teoh wrote:
> [...]

I'm not sure we are on the same page.
My question is whether having a pointer to invalid memory causes problems.
I am fully aware that reading or writing to an invalid memory address causes problems.

I am grasping that even having a pointer pointing to an invalid memory address is a huge code smell.  But will it cause any undefined behavior merely to have a pointer with an invalid memory address?
4 days ago
On 17/08/2025 3:03 AM, Brother Bill wrote:
> I am grasping that even having a pointer pointing to an invalid memory address is a huge code smell.

Yes.

Why point the gun at your own foot?

Don't do that.

> But will it cause any undefined behavior merely having a pointer with an invalid memory address.

No.

4 days ago
On Sat, Aug 16, 2025 at 03:03:21PM +0000, Brother Bill via Digitalmars-d-learn wrote: [...]
> I'm not sure we are on the same page.
> My question is whether having a pointer to invalid memory causes
> problems.  I am fully aware that reading or writing to an invalid
> memory address causes problems.
> 
> I am grasping that even having a pointer pointing to an invalid memory address is a huge code smell.  But will it cause any undefined behavior merely having a pointer with an invalid memory address.

What do you mean by "cause any undefined behaviour"?  UB is defined by the language spec, which states that assigning a non-pointer value (or a random garbage value) to a pointer is UB.  What it actually does at runtime is orthogonal to this.  At the machine level, there is no UB, it's simply following the instructions you gave it literally.  That the results may not be what you expect is a problem at a higher level of abstraction.


T

-- 
Stop staring at me like that! It's offens... no, you'll hurt your eyes!
4 days ago

On Saturday, 16 August 2025 at 11:56:43 UTC, Brother Bill wrote:
> It is obvious that reading or writing to invalid memory can result in "undefined behavior".
> But is merely pointing to invalid memory "harmful"?
>
> The documentation states that going one past the last element of a slice is acceptable.
> But is it also safe to go 10, 100 or 1000 items past the last element of a slice?

The way D's GC works makes XOR pointers invalid, so even sorta-safe pointers are explicitly broken.

Probably store whatever in an int.
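One possible reading of that suggestion, sketched in D: keep the position as an integer index rather than a pointer, and check it against the slice length before use. The helper name `inBounds` is hypothetical.

```d
/// Hypothetical helper: true when `index` refers to an element of `items`.
bool inBounds(const(int)[] items, size_t index)
{
    return index < items.length;
}

void main()
{
    int[] items = [1, 2, 3];

    // Far past the end, but this is only an integer, not a pointer.
    size_t pos = items.length + 100;
    assert(!inBounds(items, pos));

    // Turn it into an element access only once it is known to be in range.
    pos = 2;
    assert(inBounds(items, pos) && items[pos] == 3);
}
```

An out-of-range index can be stored and passed around without ever forming a pointer to memory outside the array.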

> The documentation states that going one past the last element

If you read that D makes string literals C strings, that applies only to string literals. Most overflows are a crash, and I don't know how you're not running into that. You've read enough to go start syntax testing.

4 days ago
On Saturday, 16 August 2025 at 15:14:21 UTC, H. S. Teoh wrote:
> On Sat, Aug 16, 2025 at 03:03:21PM +0000, Brother Bill via Digitalmars-d-learn wrote: [...]
>> I'm not sure we are on the same page.
>> My question is whether having a pointer to invalid memory causes
>> problems.  I am fully aware that reading or writing to an invalid
>> memory address causes problems.
>> 
>> I am grasping that even having a pointer pointing to an invalid memory address is a huge code smell.  But will it cause any undefined behavior merely having a pointer with an invalid memory address.
>
> What do you mean by "cause any undefined behaviour"?  UB is defined by the language spec, which states that assigning a non-pointer value (or a random garbage value) to a pointer is UB.  What it actually does at runtime is orthogonal to this.  At the machine level, there is no UB, it's simply following the instructions you gave it literally.  That the results may not be what you expect is a problem at a higher level of abstraction.
>
>


So a good D developer should not store an invalid pointer address into a pointer,
with the single exception of storing a pointer address just past a slice or array.

4 days ago
On Sat, Aug 16, 2025 at 03:24:55PM +0000, Brother Bill via Digitalmars-d-learn wrote: [...]
> So a good D developer should not store an invalid pointer address into a pointer, with the single exception of storing a pointer address just past a slice or array.

Where does it say this in the spec?  Because this is wrong.

D arrays carry length with them; they do not rely on pointers pointing past the allocated memory region.
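A short sketch of what this looks like in code, assuming a hypothetical `sum` function: the slice itself knows where it ends, so iteration never needs an end pointer.

```d
/// Hypothetical example function: adds up the elements of a slice.
int sum(const(int)[] values)
{
    int total = 0;
    // foreach uses the slice's own length; no end pointer is involved.
    foreach (v; values)
        total += v;
    return total;
}

void main()
{
    int[] data = [10, 20, 30];
    assert(data.length == 3);
    assert(sum(data) == 60);

    // A sub-slice carries its own pointer and length as well.
    assert(sum(data[1 .. $]) == 50);

    // With default compiler settings, data[3] would fail the array
    // bounds check at run time instead of reading past the end.
}
```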


T

-- 
The diminished 7th chord is the most flexible and fear-instilling chord. Use it often, use it unsparingly, to subdue your listeners into submission!
4 days ago
On Saturday, 16 August 2025 at 15:30:43 UTC, H. S. Teoh wrote:
> On Sat, Aug 16, 2025 at 03:24:55PM +0000, Brother Bill via Digitalmars-d-learn wrote: [...]
>> So a good D developer should not store an invalid pointer address into a pointer, with the single exception of storing a pointer address just past a slice or array.
>
> Where does it say this in the spec?  Because this is wrong.
>
> D arrays carry length with them; they do not rely on pointers pointing past the allocated memory region.
>
>
> T

Source: Programming in D book, page 432, chapter 68.8
Quote: It is valid to point at the imaginary element one past the end of an array.
4 days ago
On 17/08/2025 3:44 AM, Brother Bill wrote:
> On Saturday, 16 August 2025 at 15:30:43 UTC, H. S. Teoh wrote:
>> On Sat, Aug 16, 2025 at 03:24:55PM +0000, Brother Bill via Digitalmars-d-learn wrote: [...]
>>> So a good D developer should not store an invalid pointer address into a pointer, with the single exception of storing a pointer address just past a slice or array.
>>
>> Where does it say this in the spec?  Because this is wrong.
>>
>> D arrays carry length with them; they do not rely on pointers pointing past the allocated memory region.
>>
>>
>> T
> 
> Source: Programming in D book, page 432, chapter 68.8
> Quote: It is valid to point at the imaginary element one past the end of an array.

That is not the spec, the book is wrong.