Thread overview
GC/non-GC memory as part of data type?
Nov 13, 2019: Gregor Mückl
Nov 13, 2019: James Lu
Nov 14, 2019: Gregor Mückl
Nov 15, 2019: Nick Treleaven
Nov 14, 2019: Patrick Schluter
Nov 14, 2019: Gregor Mückl
Nov 14, 2019: jmh530
Nov 14, 2019: Gregor Mückl
Nov 15, 2019: Walter Bright
November 13, 2019
Hi!

This is an attempt to follow up on the DIP1025 discussion: what happens if all pointers/arrays/references carry the origin of the pointed to memory region as part of their type? The goal is to have a cleaner, more explicit separation of GC and non-GC heaps.

Pointers allocated through malloc() or acquired from external code would carry the information that they point to non-GC memory as part of their type. Let's just designate that with a @native attribute for now.

This attribute limits what you can do safely:

- No overwriting a @native pointer's address with the result of pointer arithmetic. Don't overwrite the pointer that needs to be freed.
- Casting pointers to different types (e.g. to array types) carries the @native-ness of the input pointer over to the result.
- No assignment between @native and non-@native pointer-like types. Casting to or from @native is a @system operation.
- ~ and ~= for @native arrays always produce a copy on the GC heap and the result is therefore not @native. No in-place appending. All other implicit language-level copy operations perform similar conversions.
- Pointers returned from extern(C)/extern(C++) functions are always @native (if that is not the case, then an unsafe wrapper must be written).

Unresolved:
- @native-ness could probably be inferred by static analysis within a single non-template function. It would have to be declared at function interfaces. This grows the attribute zoo.
- Some functions can only work on @native pointers, others only on non-@native ones, and a third kind can work with both.
- Pointer arithmetic on @native pointers would need some serious lifetime analysis to make that safer.
- No claim of completeness or soundness of these rules is made at this time :)
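
To make the assignment and cast rules above a little more concrete, here is a rough library-level approximation. Native, nativeAlloc, nativeFree and unsafeRaw are made-up names for illustration only. The idea described above is a type attribute checked by the compiler, not a wrapper struct, and this sketch covers only the "no mixing without an explicit @system step" rule.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical stand-in for a @native pointer: the origin (C heap) is part
// of the type, so it cannot be silently mixed up with a GC pointer.
struct Native(T)
{
    private T* ptr;

    // Dereferencing is allowed; the address itself is never exposed implicitly.
    ref T opUnary(string op)() if (op == "*")
    {
        return *ptr;
    }

    // The only way to drop the tag is an explicit @system call.
    T* unsafeRaw() @system
    {
        return ptr;
    }
}

// Allocates one T on the C heap and tags the result as native.
Native!T nativeAlloc(T)()
{
    auto p = cast(T*) malloc(T.sizeof);
    assert(p !is null, "malloc failed");
    return Native!T(p);
}

// Accepts only tagged pointers, so GC memory cannot reach free() by accident.
void nativeFree(T)(ref Native!T p)
{
    free(p.ptr);
    p.ptr = null;
}

void main()
{
    auto n = nativeAlloc!int();
    *n = 42;
    assert(*n == 42);

    int* gc = new int(1);
    static assert(!__traits(compiles, n = gc));         // GC -> native: rejected
    static assert(!__traits(compiles, gc = n));         // native -> GC: rejected
    static assert(!__traits(compiles, nativeFree(gc))); // freeing GC memory: rejected

    nativeFree(n);
}
```

The static asserts show which mix-ups the type system would catch. Pointer arithmetic and slices are not modelled here.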
November 13, 2019
On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl wrote:
> Hi!
>
> This is an attempt to follow up on the DIP1025 discussion: what happens if all pointers/arrays/references carry the origin of the pointed to memory region as part of their type? The goal is to have a cleaner, more explicit separation of GC and non-GC heaps.
>
> [...]

We could add @nogc types. That would tell the implementation
that the memory held by such a type MAY be ignored by the GC and
that any pointers held in that type MUST NOT be moved by a moving
GC.

Alternatively, "special" pointers could have a .toPointer function
that converts them into the memory address they represent. Applications
include XOR pointers.
November 14, 2019
On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl wrote:
> Hi!
>
> This is an attempt to follow up on the DIP1025 discussion: what happens if all pointers/arrays/references carry the origin of the pointed to memory region as part of their type? The goal is to have a cleaner, more explicit separation of GC and non-GC heaps.
>
> [...]

In fact, it's more or less just const/immutability for the pointer, isn't it?
November 14, 2019
On Thursday, 14 November 2019 at 14:01:50 UTC, Patrick Schluter wrote:
> On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl wrote:
>> Hi!
>>
>> This is an attempt to follow up on the DIP1025 discussion: what happens if all pointers/arrays/references carry the origin of the pointed to memory region as part of their type? The goal is to have a cleaner, more explicit separation of GC and non-GC heaps.
>>
>> [...]
>
> In fact, it's more or less just const/immutability for the pointer, isn't it?

No, const is different from what I'm trying to describe. Today you can take a const pointer to GC-allocated memory and still hand it to free(), for example. If the type distinguished native from GC pointers, that would be impossible without an explicit, unsafe cast.
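
A small sketch of the difference, using only druntime: const restricts what may be done through the pointer, but two const pointers of different origin still have the same static type. The helper name isGcOwned is made up for this example. It relies on GC.addrOf returning null for addresses that do not belong to the GC heap.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

// Hypothetical helper: runtime check whether an address lies inside a GC block.
bool isGcOwned(const(void)* p)
{
    return GC.addrOf(cast(void*) p) !is null;
}

void main()
{
    const(int)* fromGc = new int(1);
    const(int)* fromC = cast(const(int)*) malloc(int.sizeof);
    assert(fromC !is null, "malloc failed");

    // Same static type: const says nothing about where the memory came from.
    static assert(is(typeof(fromGc) == typeof(fromC)));

    // Only a runtime query can tell the two apart today.
    assert(isGcOwned(fromGc));
    assert(!isGcOwned(fromC));

    // Nothing in the type keeps the wrong pointer away from free(); a cast
    // that strips const is all it takes. Only the malloc'd block is freed here.
    free(cast(void*) fromC);
}
```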
November 14, 2019
On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl wrote:
> Hi!
>
> This is an attempt to follow up on the DIP1025 discussion: what happens if all pointers/arrays/references carry the origin of the pointed to memory region as part of their type? The goal is to have a cleaner, more explicit separation of GC and non-GC heaps.
> [snip]

I'm sure there's a lot that I haven't considered, but it seems like that should be possible; you might just have to write some of your own functionality. You could do something like the code below and then write your own versions of malloc etc., of the GC allocation functions, and of anything that uses the GC (like dynamic arrays), and probably your own ref too.

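import std.traits : isPointer;

// Where the memory behind a pointer came from.
enum AllocStrategy
{
    GC,
    malloc,
    other
}

// Pointer wrapper that carries the allocation strategy in its type.
struct Ptr(T, AllocStrategy allocStrategy)
    if (isPointer!T)
{
    T x;
    alias x this; // forwards dereferencing etc. to the wrapped pointer
}

void main() {
    int x = 1;
    auto y = Ptr!(int*, AllocStrategy.GC)(new int(x));
    assert(*y == 1);
}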
November 14, 2019
On Thursday, 14 November 2019 at 20:06:53 UTC, jmh530 wrote:
> On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl wrote:
>> Hi!
>>
>> This is an attempt to follow up on the DIP1025 discussion: what happens if all pointers/arrays/references carry the origin of the pointed to memory region as part of their type? The goal is to have a cleaner, more explicit separation of GC and non-GC heaps.
>> [snip]
>
> I'm sure there's a lot that I haven't considered, but it seems like that should be possible; you might just have to write some of your own functionality. You could do something like the code below and then write your own versions of malloc etc., of the GC allocation functions, and of anything that uses the GC (like dynamic arrays), and probably your own ref too.
>
> import std.traits : isPointer;
>
> enum AllocStrategy
> {
>     GC,
>     malloc,
>     other
> }
>
> struct Ptr(T, AllocStrategy allocStrategy)
>     if (isPointer!T)
> {
>     T x;
>     alias x this;
> }
>
> void main() {
>     int x = 1;
>     auto y = Ptr!(int*, AllocStrategy.GC)(new int(x));
>     assert(*y == 1);
> }

Pushing the problem into a library type doesn't really solve anything. Any interesting library that you might want to use will not use these wrappers. If you unwrap the pointers (or worse - your slices) and pass them in, you lose all information about the guarantees that should be tracked. Remember that you don't have control over how that external library reallocates memory for data that you pass in, e.g. when it appends to a slice.
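
The slice case can be reproduced with today's D. In this sketch, libraryAppend is a made-up stand-in for third-party code. After the call, the caller's slice no longer refers to the malloc'd block, and nothing in its type says so.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

// Hypothetical stand-in for a library function that knows nothing about
// where the caller's slice was allocated.
void libraryAppend(ref int[] data)
{
    data ~= 5; // a non-GC block cannot grow in place, so this copies to the GC heap
}

void main()
{
    auto raw = cast(int*) malloc(4 * int.sizeof);
    assert(raw !is null, "malloc failed");
    int[] s = raw[0 .. 4];
    s[] = 0;

    assert(GC.addrOf(s.ptr) is null);  // backed by malloc'd memory

    libraryAppend(s);

    assert(s.length == 5 && s[4] == 5);
    assert(s.ptr !is raw);             // the slice silently moved
    assert(GC.addrOf(s.ptr) !is null); // and now lives on the GC heap

    // Calling free(s.ptr) here would be a bug. The original block has to be
    // released through the pointer that was kept separately.
    free(raw);
}
```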
November 14, 2019
On Wednesday, 13 November 2019 at 15:25:46 UTC, James Lu wrote:
> On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl wrote:
>> Hi!
>>
>> This is an attempt to follow up on the DIP1025 discussion: what happens if all pointers/arrays/references carry the origin of the pointed to memory region as part of their type? The goal is to have a cleaner, more explicit separation of GC and non-GC heaps.
>>
>> [...]
>
> We could add @nogc types. That would tell the implementation
> that the memory held by such a type MAY be ignored by the GC and
> that any pointers held in that type MUST NOT be moved by a moving
> GC.
>
> Alternatively, "special" pointers could have a .toPointer function
> that converts them into the memory address they represent. Applications
> include XOR pointers.

I haven't thought about a GC that moves allocated memory yet. Does the current D spec even allow this? The way you're allowed to take pointers to GC memory and store them in places unknown to the GC should prevent it.
November 15, 2019
On Wednesday, 13 November 2019 at 15:25:46 UTC, James Lu wrote:
> We could add @nogc types. That would tell the implementation
> that the memory held by such a type MAY be ignored by the GC and
> that any pointers held in that type MUST NOT be moved by a moving
> GC.

But a moving GC would only update a pointer that pointed to GC allocated memory.

@nogc on a field declaration could be useful.
November 15, 2019
There are many ways to manage memory, and a non-trivial program will often use multiple methods:

1. gc
2. stack
3. static
4. malloc/free
5. reference counting
6. various custom allocators

Using type constructors to create two categories, gc and all the others, is in the end not particularly helpful, because the number of categories is unbounded and relying on such a system would require supporting all of them.