How does D distinguish managed pointers from raw pointers? - D Programming Language Discussion Forum



October 03, 2019

How does D distinguish managed pointers from raw pointers?

Posted by IGotD-

According to the GC documentation this code snippet

char* p = new char[10];
char* q = p + 6; // ok
q = p + 11;      // error: undefined behavior
q = p - 1;       // error: undefined behavior

suggests that char *p is really a "fat pointer" with size information.

However, if we get memory from a C library that allocated it with malloc, we have no size information. We would get a char * without any size information, and according to the documentation we can do anything with it, including out-of-bounds access.

How does D internally know that a pointer was previously allocated by the GC or malloc?

If we replaced the GC with reference counting, how would D be able to distinguish a reference-counted pointer from a raw pointer at compile time in order to insert the code associated with the reference counting?

This brings me back to MS managed C++, where they actually had two types of "pointers": a managed pointer and the normal C++ pointer. Like this:

MyType^ instance = gcnew MyType();

In this case it was obvious what was handled by the GC and what wasn't (past tense since managed C++ is deprecated). It would be trivial to replace the GC algorithm with whatever you want, since the compiler knows the type at compile time.

October 03, 2019

Re: How does D distinguish managed pointers from raw pointers?

Posted by Adam D. Ruppe
in reply to IGotD-

On Thursday, 3 October 2019 at 14:13:55 UTC, IGotD- wrote:
> suggests that char *p is really a "fat pointer" with size information.

D pointers are plain naked pointers. What that doc segment is saying is it works like C - in-bounds arithmetic will work, out of bounds is undefined behavior. You can do it, but it might crash you or whatever.

There's no difference in the language between a GC pointer and any other pointer. But....

> How does D internally know that a pointer was previously allocated by the GC or malloc?

But, this is a bit more nuanced. D, the language, does not know how it was allocated, there's no difference in the type system, but the runtime can figure it out based on the pointer value, if it falls inside the range of the GC's allocated area.

It does NOT use that for bounds checking though! It is just an internal detail that some of the GC functions use to help with sweeps, and that backs some of the interface functions.
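To make the point about the runtime concrete, here is a minimal sketch using `core.memory.GC.addrOf`, which returns the base address of the GC block containing a pointer, or null if the GC does not own that memory. The helper name `isGcOwned` is made up for this illustration; it is not part of druntime.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Hypothetical helper: true when the GC reports that `p` points
// into a block it manages. The type system sees no difference.
bool isGcOwned(const(void)* p)
{
    return GC.addrOf(p) !is null;
}

void main()
{
    auto gcMem = cast(ubyte*) GC.malloc(64);
    auto cMem = cast(ubyte*) malloc(64);
    assert(cMem !is null);
    scope (exit) free(cMem);

    // Both variables have the same static type: ubyte*.
    static assert(is(typeof(gcMem) == typeof(cMem)));

    assert(isGcOwned(gcMem));
    assert(isGcOwned(gcMem + 10)); // interior pointers are resolved too
    assert(!isGcOwned(cMem));      // malloc'd memory is unknown to the GC
}
```

The lookup happens purely on the pointer value at run time, which is why it cannot drive compile-time decisions such as inserting reference-counting code.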

> If we replaced the GC with reference counting, how would D be able to distinguish a reference-counted pointer from a raw pointer at compile time in order to insert the code associated with the reference counting?

It won't. Reference counting in D is done with distinct types today, and it would still have to be done that way.
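As an illustration of "done by different types", here is a small sketch using Phobos' `std.typecons.RefCounted`. It is one existing library type, not necessarily the one the poster had in mind, and the helper `owners` is a name invented for this example.

```d
import std.typecons : RefCounted;

// Hypothetical helper: number of RefCounted handles sharing the payload.
size_t owners(ref RefCounted!int rc)
{
    return rc.refCountedStore.refCount;
}

void main()
{
    auto a = RefCounted!int(42);
    assert(owners(a) == 1);

    {
        auto b = a; // copying the wrapper type bumps the count
        assert(owners(a) == 2);
        assert(b.refCountedPayload == 42);
    } // b's destructor runs here and decrements the count

    assert(owners(a) == 1);

    // A raw pointer to the payload is an ordinary int*; copying it
    // involves no counting because its type carries no such logic.
    int* raw = &a.refCountedPayload();
    int* raw2 = raw;
    assert(*raw2 == 42);
    assert(owners(a) == 1);
}
```

The counting code lives in the wrapper's copy and destruction logic, so the compiler needs no special knowledge of how the memory was allocated.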

October 03, 2019

Re: How does D distinguish managed pointers from raw pointers?

Posted by Andrea Fontana
in reply to IGotD-

On Thursday, 3 October 2019 at 14:13:55 UTC, IGotD- wrote:
> According to the GC documentation this code snippet
>
> char* p = new char[10];
> char* q = p + 6; // ok
> q = p + 11;      // error: undefined behavior
> q = p - 1;       // error: undefined behavior
>
> suggests that char *p is really a "fat pointer" with size information.

No it's not. char* is a plain pointer.

The example is wrong, since you can't assign a new char[10] to char*.

Probably they mean something like:
auto arr = new char[10];
char* p = arr.ptr;
...

This code actually compiles, but its behaviour is undefined, so it is a logical error.

In D, arrays are fat pointers instead:

int[10] my_array;

my_array is actually a pair ptr+length.

October 04, 2019

Re: How does D distinguish managed pointers from raw pointers?

Posted by rikki cattermole
in reply to IGotD-

On 04/10/2019 3:13 AM, IGotD- wrote:
> According to the GC documentation this code snippet
> 
> char* p = new char[10];
> char* q = p + 6; // ok
> q = p + 11;      // error: undefined behavior
> q = p - 1;       // error: undefined behavior
> 
> suggests that char *p is really a "fat pointer" with size information.

The pointer is raw.
There is no size information stored with it.

The GC will store size information separately from it so it can know about reallocation and what its memory range is to search for.
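A short sketch of that separation, assuming the `core.memory` functions `GC.sizeOf` and `GC.query`: the pointer stays one machine word, and the block size has to be asked of the GC. The helper `blockSize` is a name invented here.

```d
import core.memory : GC;

// Hypothetical helper: size of the GC block that starts at `p`,
// or 0 when the GC does not track `p` as a block start.
size_t blockSize(const(void)* p)
{
    return GC.sizeOf(p);
}

void main()
{
    void* p = GC.malloc(100);

    // The pointer itself carries no size information.
    static assert((void*).sizeof == size_t.sizeof);

    // The GC keeps the bookkeeping on its side; the block may be
    // larger than the 100 bytes requested.
    assert(blockSize(p) >= 100);

    auto info = GC.query(p);
    assert(info.base is p);
    assert(info.size >= 100);

    // Memory the GC does not own reports no size.
    int onStack;
    assert(blockSize(&onStack) == 0);
}
```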

> However, if we get memory from a C library that allocated it with malloc, we have no size information. We would get a char * without any size information, and according to the documentation we can do anything with it, including out-of-bounds access.

Out-of-bounds access is just as doable with a pointer allocated by the GC:

int[] array;
array.length = 5;

int* arrayPointer = array.ptr;
int value = arrayPointer[10]; // compiles!!! but is undefined behavior at runtime (may segfault)

And of course that won't compile in @safe code.
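The @safe remark can be checked at compile time. The sketch below assumes a reasonably recent compiler; `at` is a hypothetical helper, and the function literals are only compiled, never run.

```d
// Slice indexing is bounds-checked and therefore allowed in @safe code.
@safe int at(const int[] a, size_t i)
{
    return a[i];
}

// Indexing a raw pointer at a non-zero offset is rejected in @safe code...
static assert(!__traits(compiles, () @safe { int* p; return p[10]; }));

// ...but accepted in @system code, where staying in bounds is on you.
static assert(__traits(compiles, () @system { int* p; return p[10]; }));

void main()
{
    int[] array = [1, 2, 3, 4, 5];
    assert(at(array, 4) == 5);
}
```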

> How does D internally know that a pointer was previously allocated by the GC or malloc?

Either the GC has that information or it doesn't.

> If we replaced the GC with reference counting, how would D be able to distinguish a reference-counted pointer from a raw pointer at compile time in order to insert the code associated with the reference counting?

It can't.

> This brings me back to MS managed C++, where they actually had two types of "pointers": a managed pointer and the normal C++ pointer. Like this:
> 
> MyType^ instance = gcnew MyType();
> 
> In this case it was obvious what was handled by the GC and what wasn't (past tense since managed C++ is deprecated). It would be trivial to replace the GC algorithm with whatever you want, since the compiler knows the type at compile time.

There is only one type of pointer in D.

The GC is a library with language hooks. Nothing more than that.
It is easily swappable from within druntime.

But it does need to hook into threads and control them (e.g. thread-local storage and pausing them), so there are a few restrictions. For example, it must be chosen at the start of druntime initialization, immediately after libc initialization.

October 04, 2019

Re: How does D distinguish managed pointers from raw pointers?

Posted by Johan Engelen
in reply to Andrea Fontana

On Thursday, 3 October 2019 at 14:21:37 UTC, Andrea Fontana wrote:
>
> In D, arrays are fat pointers instead:
>
> int[10] my_array;
>
> my_array is actually a pair ptr+length.

```
int[10] my_static_array;
int[] my_dynamic_array;
```

my_static_array will not be a fat pointer. Length is known at compile time. Address is known at link/load time so it's also not a pointer but just a normal variable (& will give you a pointer to the array data).
my_dynamic_array will be a pair for ptr+length.

-Johan
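The layout difference described above can be verified with a few compile-time checks. This is only a sketch; the variable names are arbitrary.

```d
// A raw pointer is one machine word.
static assert((char*).sizeof == size_t.sizeof);

// A dynamic array (slice) is a length plus a pointer: two words.
static assert((int[]).sizeof == 2 * size_t.sizeof);

// A static array is just its elements, with no pointer or length stored.
static assert((int[10]).sizeof == 10 * int.sizeof);

void main()
{
    int[10] myStaticArray;
    int[] myDynamicArray = myStaticArray[]; // slice over the same storage

    assert(myDynamicArray.ptr == &myStaticArray[0]);
    assert(myDynamicArray.length == myStaticArray.length);
}
```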

October 04, 2019

Re: How does D distinguish managed pointers from raw pointers?

Posted by IGotD-
in reply to Johan Engelen

On Friday, 4 October 2019 at 15:03:04 UTC, Johan Engelen wrote:
> On Thursday, 3 October 2019 at 14:21:37 UTC, Andrea Fontana wrote:
>>
>> In D, arrays are fat pointers instead:
>>
>> int[10] my_array;
>>
>> my_array is actually a pair ptr+length.
>
> ```
> int[10] my_static_array;
> int[] my_dynamic_array;
> ```
>
> my_static_array will not be a fat pointer. Length is known at compile time. Address is known at link/load time so it's also not a pointer but just a normal variable (& will give you a pointer to the array data).
> my_dynamic_array will be a pair for ptr+length.
>
> -Johan

What if you pass a static array to a function that expects a dynamic array? Will D automatically create a dynamic array from the static array?

October 04, 2019

Re: How does D distinguish managed pointers from raw pointers?

Posted by Dennis
in reply to IGotD-

On Friday, 4 October 2019 at 18:30:17 UTC, IGotD- wrote:
> What if you pass a static array to a function that expects a dynamic array? Will D automatically create a dynamic array from the static array?

No, you have to append [] to create a slice from the static array.

October 04, 2019

Re: How does D distinguish managed pointers from raw pointers?

Posted by H. S. Teoh
in reply to Dennis

On Fri, Oct 04, 2019 at 06:34:40PM +0000, Dennis via Digitalmars-d-learn wrote:
> On Friday, 4 October 2019 at 18:30:17 UTC, IGotD- wrote:
> > What if you pass a static array to a function that expects a dynamic array? Will D automatically create a dynamic array from the static array?
> 
> No, you have to append [] to create a slice from the static array.

Actually, it *does* automatically convert the static array to a slice. Which is actually a bug, because you get problems like this:

	int[] func() {
		int[5] data = [ 1, 2, 3, 4, 5 ];
		return data; // implicit conversion to int[]
	}
	void main() {
		auto data = func();
		// Oops: data now references out-of-scope elements on the stack.
		// Expect garbage values and stack corruption exploits.
	}

See:
	https://issues.dlang.org/show_bug.cgi?id=15932
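For contrast, here is a sketch of the harmless side of the same implicit conversion: passing a static array to a function taking a slice, where the slice does not outlive the array. The function `fill` is made up for this example.

```d
// Hypothetical helper that takes a dynamic array (slice).
void fill(int[] dst, int value)
{
    dst[] = value; // writes through to whatever storage dst refers to
}

void main()
{
    int[4] buf;

    fill(buf, 7); // static array converts to int[] implicitly
    assert(buf == [7, 7, 7, 7]);

    fill(buf[], 9); // explicit slicing does the same thing
    assert(buf == [9, 9, 9, 9]);
}
```

No copy is made in either call: the slice aliases the static array's storage, which is exactly why returning such a slice from the function that owns the array is dangerous.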


T

-- 
"How are you doing?" "Doing what?"

October 04, 2019

Re: How does D distinguish managed pointers from raw pointers?

Posted by H. S. Teoh

On Fri, Oct 04, 2019 at 11:43:34AM -0700, H. S. Teoh via Digitalmars-d-learn wrote:
> On Fri, Oct 04, 2019 at 06:34:40PM +0000, Dennis via Digitalmars-d-learn wrote:
> > On Friday, 4 October 2019 at 18:30:17 UTC, IGotD- wrote:
> > > What if you pass a static array to a function that expects a dynamic array? Will D automatically create a dynamic array from the static array?
> > 
> > No, you have to append [] to create a slice from the static array.
> 
> Actually, it *does* automatically convert the static array to a slice.
[...]

Here's an actual working example that illustrates the pitfall of this implicit conversion:

-----
	struct S {
		int[] data;
		this(int[] _data) { data = _data; }
	}
	S makeS() {
		int[5] data = [ 1, 2, 3, 4, 5 ];
		return S(data);
	}
	void func(S s) {
		import std.stdio;
		writeln("s.data = ", s.data);
	}
	void main() {
		S s = makeS();
		func(s);
	}
-----

Expected output:
	s.data = [1, 2, 3, 4, 5]

Actual output:
	s.data = [-2111884160, 32766, 1535478075, 22053, 5]
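One possible way to get the expected output, sketched here as an assumption rather than as the fix proposed in the thread, is to copy the stack data into GC memory with `.dup` before it escapes:

```d
struct S
{
    int[] data;
    this(int[] _data) { data = _data; }
}

S makeS()
{
    int[5] local = [1, 2, 3, 4, 5];
    // .dup allocates a GC-owned copy, so the slice stored in S
    // no longer points into this function's stack frame.
    return S(local[].dup);
}

void main()
{
    S s = makeS();
    assert(s.data == [1, 2, 3, 4, 5]);
}
```

The cost is one allocation and copy per call, which is the price of letting the data outlive the stack frame.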


T

-- 
MSDOS = MicroSoft's Denial Of Service

October 04, 2019

Re: How does D distinguish managed pointers from raw pointers?

Posted by Dennis
in reply to H. S. Teoh

On Friday, 4 October 2019 at 18:43:34 UTC, H. S. Teoh wrote:
> Actually, it *does* automatically convert the static array to a slice.

You're right, I'm confused. I recall there was a situation where you had to explicitly slice a static array, but I can't think of it now.


Copyright © 1999-2021 by the D Language Foundation