Thread overview
how to pass a malloc'd C string over to be managed by the GC
Feb 28, 2019
Sam Johnson
Feb 28, 2019
Sam Johnson
Feb 28, 2019
Sam Johnson
Feb 28, 2019
Adam D. Ruppe
Feb 28, 2019
evilrat
Feb 28, 2019
Adam D. Ruppe
Feb 28, 2019
Adam D. Ruppe
Feb 28, 2019
Nicholas Wilson
Feb 28, 2019
Kagamin
February 28, 2019
```
string snappyCompress(const string plaintext) {
	import deimos.snappy.snappy : snappy_compress, snappy_max_compressed_length, SNAPPY_OK;
	import core.stdc.stdlib : malloc, free;
	import std.string : fromStringz, toStringz;
	char *input = cast(char *) toStringz(plaintext);
	size_t output_length = snappy_max_compressed_length(plaintext.length);
	char *output = cast(char *) malloc(output_length);
	if(snappy_compress(input, plaintext.length, output, &output_length) == SNAPPY_OK) {
		string ret = (cast(string) fromStringz(output)).clone();
                // <---- do something magical here
		return ret;
	}
	assert(0);
}
```

How can I get the GC to automatically garbage collect the `output` malloc call by tracking the returned `ret` reference? Or is this already managed by the gc because I cast to a `string`?
February 28, 2019
On Thursday, 28 February 2019 at 03:33:25 UTC, Sam Johnson wrote:
> ```
> string snappyCompress(const string plaintext) {
> 	import deimos.snappy.snappy : snappy_compress, snappy_max_compressed_length, SNAPPY_OK;
> 	import core.stdc.stdlib : malloc, free;
> 	import std.string : fromStringz, toStringz;
> 	char *input = cast(char *) toStringz(plaintext);
> 	size_t output_length = snappy_max_compressed_length(plaintext.length);
> 	char *output = cast(char *) malloc(output_length);
> 	if(snappy_compress(input, plaintext.length, output, &output_length) == SNAPPY_OK) {
> 		string ret = (cast(string) fromStringz(output)).clone();
>                 // <---- do something magical here
> 		return ret;
> 	}
> 	assert(0);
> }
> ```
>
> [...]

Ignore the `.clone()` call -- that wasn't supposed to be here -- I thought maybe string.clone() might exist but it turns out it does not.
February 28, 2019
On Thursday, 28 February 2019 at 03:35:45 UTC, Sam Johnson wrote:
> On Thursday, 28 February 2019 at 03:33:25 UTC, Sam Johnson wrote:
>> [...]
>
> Ignore the `.clone()` call -- that wasn't supposed to be here -- I thought maybe string.clone() might exist but it turns out it does not.

Update: it seems that all I need to do is GC.addRoot(output); and memory leak goes away. I think I have answered my own question.
February 28, 2019
On Thursday, 28 February 2019 at 03:33:25 UTC, Sam Johnson wrote:
> How can I get the GC to automatically garbage collect the `output` malloc call by tracking the returned `ret` reference?

If you want it GC managed, just GC allocate it instead of mallocing it.

char *output = cast(char *) malloc(output_length);

replace with

char* output = (new char[](output_length)).ptr;

and the rest of your code can remain the same.


> Or is this already managed by the gc because I cast to a `string`?

no, a cast only affects the type system
February 28, 2019
On Thursday, 28 February 2019 at 03:35:45 UTC, Sam Johnson wrote:
> Ignore the `.clone()` call -- that wasn't supposed to be here -- I thought maybe string.clone() might exist but it turns out it does not.

It is called `.dup` ( for a mutable copy) or `.idup` (for an immutable copy).

Though note that while it would copy it into a GC array, it would leave the original - you'd still want to free() that.
February 28, 2019
On Thursday, 28 February 2019 at 04:26:47 UTC, Sam Johnson wrote:
> Update: it seems that all I need to do is GC.addRoot(output); and memory leak goes away. I think I have answered my own question.

That shouldn't have any effect at all. GC.addRoot makes the GC consider that pointer to always be live. If it isn't already a GC'd pointer, the call does nothing, and if it is a GC pointer, it means it will never be collected!

Your test must be seeing something else...
February 28, 2019
On Thursday, 28 February 2019 at 03:33:25 UTC, Sam Johnson wrote:
> ```
> string snappyCompress(const string plaintext) {
> 	import deimos.snappy.snappy : snappy_compress, snappy_max_compressed_length, SNAPPY_OK;
> 	import core.stdc.stdlib : malloc, free;
> 	import std.string : fromStringz, toStringz;
> 	char *input = cast(char *) toStringz(plaintext);
> 	size_t output_length = snappy_max_compressed_length(plaintext.length);
> 	char *output = cast(char *) malloc(output_length);
> 	if(snappy_compress(input, plaintext.length, output, &output_length) == SNAPPY_OK) {
> 		string ret = (cast(string) fromStringz(output)).clone();
>                 // <---- do something magical here
> 		return ret;
> 	}
> 	assert(0);
> }
> ```
>
> How can I get the GC to automatically garbage collect the `output` malloc call by tracking the returned `ret` reference?

I'm surprised that GC.addRoot(output) actually helped you here, GC.addRoot is for when the thing you are adding may hold pointers to GC allocated stuff and you don't want non-GC allocated references to dangle when the GC does a collection pass.

If you want the GC to handle it just use new:

char[] output = new char[output_length];
...
return assumeUnique(output);

> Or is this already managed by the gc because I cast to a `string`?

No, string is just an alias for immutable(char)[], which is just
struct
{
    size_t length;
    immutable(char)* ptr;
}

it can refer to any span of memory, it has no concept of ownership.
February 28, 2019
On Thursday, 28 February 2019 at 04:26:47 UTC, Sam Johnson wrote:
>
> Update: it seems that all I need to do is GC.addRoot(output); and memory leak goes away. I think I have answered my own question.


If you know what you are doing. Otherwise you just postpone troubles due to mixed allocator/deallocator which will result in memory corruption or just crash. I mean you definitely need to remove that root and free() when you're done with it at some point later.

Also fromStringz() may internally do implicit GC copy, inspect its source, if it just appends to an array or something similar, this will likely to implicitly allocate using GC.

If you just want to copy string then arrays has a useful .dup()/.idup() methods(properties?)[1] that should give you a GC managed copy of original, if I'm not missing something...


If I understood problem correctly then the safest in this case will add scope(exit)[1] to free() array.

   char *output = cast(char *) malloc(output_length);
   scope(exit) free(output);  // will go away when leave scope, both normal/exception ways

another option will be making RAII struct wrapper that free() on destruction, possibly also using RefCounted[2]

To know where GC allocations are made there is -vgc flag for DMD, it was added to help writting GC-less code if that's absolutely needed, but you can use it to know if you are not messing up malloc-free with GC alloc/free.


And finally, don't take what I say too seriously, I do D on ocassion these days so there might be some changes in how things works or I might just forgot how it really works.


[1] https://dlang.org/spec/arrays.html#array-properties
[2] https://dlang.org/spec/statement.html#scope-guard-statement
[3] https://dlang.org/phobos/std_typecons.html#RefCounted
February 28, 2019
byte[] snappyCompress(in byte[] plaintext) {
	import deimos.snappy.snappy;
	size_t output_length = snappy_max_compressed_length(plaintext.length);
	byte[] output = new byte[output_length];
	if(snappy_compress(cast(char*)plaintext.ptr, plaintext.length, cast(char*)output.ptr, &output_length) == SNAPPY_OK) {
		byte[] compressed = output[0..output_length];
                // <---- do something magical here
		return compressed;
	}
	assert(0);
}

Snappy works on bytes, not text, char is a wrong type there.