Thread overview
Using char* and C code
Mar 07, 2013
Jeremy DeHaan
Mar 07, 2013
Jonathan M Davis
Mar 07, 2013
Jeremy DeHaan
Mar 07, 2013
Jeremy DeHaan
Mar 07, 2013
simendsjo
Mar 07, 2013
Jonathan M Davis
March 07, 2013
Hey guys!

Today while browsing std.string, I read this:

Important Note: When passing a char* to a C function, and the C function keeps it around for any reason, make sure that you keep a reference to it in your D code. Otherwise, it may go away during a garbage collection cycle and cause a nasty bug when the C code tries to use it.

Good to know! Now, I have seen things like this before.

extern (C) void someCFunction(const(char)* stuff);

void main()
{
     someCFunction("string!");
}

I know that string literals implicitly cast to this type and even have the '\0' at the end, but couldn't this cause the bug described above?

I'm currently working on a port of a C library into D, so I'm trying to have the end user avoid using pointers altogether. I might write the above code as something like:

void someFunction(string stuff)
{
     stuff ~= "\0";
     someCFunction(stuff.ptr);
}

But that was before I read the warning! Obviously if I know 100% that the C function doesn't keep a copy of this pointer, the above would be OK to do. I already had some ideas on how to deal with this, but I was wondering what other people have done. Do you just make some placeholder string variable to make sure it won't get GC'd? Or is there a more elegant way?
March 07, 2013
On Thursday, March 07, 2013 06:58:23 Jeremy DeHaan wrote:
> I'm currently working on a port of a C library into D, so I'm trying to have the end user avoid using pointers altogether. I might write the above code as something like:
> 
> void someFunction(string stuff)
> {
>       stuff ~= "\0";
>       someCFunction(stuff.ptr);
> }

That's what toStringz is for, and it'll avoid appending the '\0' if it can (e.g. if the code unit one past the end of the string is '\0' as it is with string literals).
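
As a rough sketch of that, here is a small wrapper using std.string.toStringz. The wrapper name cLength is made up for illustration, and core.stdc.string.strlen only stands in for a C function that reads the string during the call and does not keep the pointer:

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical wrapper; strlen stands in for a C function that only
// reads the string during the call and does not keep the pointer.
size_t cLength(string stuff)
{
    // toStringz returns a pointer to zero-terminated data. Whether it
    // copies or reuses the original memory is an implementation detail.
    return strlen(stuff.toStringz);
}

void main()
{
    assert(cLength("string!") == 7);

    // A slice is not followed by '\0' in memory, so passing slice.ptr
    // directly would make strlen read past the end of the slice.
    string whole = "hello world";
    string slice = whole[0 .. 5];
    assert(cLength(slice) == 5);
}
```

The caller never sees a pointer, and the terminator is handled in one place.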

> But that was before I read the warning! Obviously if I know 100% that the C function doesn't keep a copy of this pointer, the above would be OK to do. I already had some ideas on how to deal with this, but I was wondering what other people have done. Do you just make some placeholder string variable to make sure it won't get GC'd? Or is there a more elegant way?

Very few C functions will keep the strings around, but if you think that there's a possibility that they will, then you'll need to keep a reference to the char* that you're passing in. If you're dealing with a class or struct, then that's as simple as having a member variable for it, but if you're dealing with free functions, that's likely to mean that whoever is using those functions is going to have to worry about it. And since string literals are part of the program itself, you shouldn't need to worry about keeping references to those. They should exist for the duration of the program.
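
A minimal sketch of the member-variable approach might look like the following. Everything named here (Widget, c_set_name, c_name_length, storedName) is hypothetical; the two extern (C) functions are defined in D only so the example runs on its own, and in a real port they would be declarations for the C library:

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical stand-ins for a C library that stores the pointer it is
// given. In a real port these would be extern (C) declarations only.
__gshared const(char)* storedName;
extern (C) void c_set_name(const(char)* name) { storedName = name; }
extern (C) size_t c_name_length() { return strlen(storedName); }

struct Widget
{
    // Holding the pointer in a member keeps the GC allocation reachable
    // for as long as the Widget itself is reachable.
    private immutable(char)* nameZ;

    void setName(string name)
    {
        nameZ = name.toStringz;
        c_set_name(nameZ);
    }
}

void main()
{
    Widget w;
    w.setName("gadget");
    assert(c_name_length() == 6);
}
```

If there is no natural owner for the reference, core.memory.GC.addRoot (paired with GC.removeRoot once the C side is done with the pointer) is another option worth looking at.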

- Jonathan M Davis
March 07, 2013
>> void someFunction(string stuff)
>> {
>>       stuff ~= "\0";
>>       someCFunction(stuff.ptr);
>> }
>
> That's what toStringz is for, and it'll avoid appending the '\0' if it can
> (e.g. if the code unit one past the end of the string is '\0' as it is with
> string literals).


I actually wasn't sure if it was best to just append the character or use toStringz, but now that I know it does that I will always go with that function. I think I was going to ask that as well, but forgot or something.

> Very few C functions will keep the strings around, but if you think that
> there's a possibility that they will, then you'll need to keep a reference to
> the char* that you're passing in. If you're dealing with a class or struct,
> then that's as simple as having a member variable for it, but if you're
> dealing with free functions, that's likely to mean that whoever is using those
> functions is going to have to worry about it. And since string literals are
> part of the program itself, you shouldn't need to worry about keeping
> references to those. They should exist for the duration of the program.
>
> - Jonathan M Davis

Keeping a private member variable when using a class/struct was what I was thinking. As for free functions, I was considering having the reference be a static variable inside the function, though I'm not sure how often I would need to do that in my port. The snippets of code I wrote were just to illustrate what I was talking about. I suppose I could have written a better example, since most (if not all) of the wrapping of C functions is inside classes/structs. :P
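
Roughly what I have in mind for the free-function case, as an untested-in-the-real-port sketch. The names setGlobalName, c_set_name, c_name_length and storedName are all made up, and the extern (C) functions are defined in D here only so the snippet is self-contained:

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical stand-ins for a C library that stores the pointer.
__gshared const(char)* storedName;
extern (C) void c_set_name(const(char)* name) { storedName = name; }
extern (C) size_t c_name_length() { return strlen(storedName); }

void setGlobalName(string name)
{
    // A function-level static in D is thread-local. It holds only the
    // most recent pointer, which is enough if the C side also keeps
    // just one pointer at a time.
    static immutable(char)* keepAlive;

    keepAlive = name.toStringz;
    c_set_name(keepAlive);
}

void main()
{
    setGlobalName("abc");
    assert(c_name_length() == 3);

    setGlobalName("hello");
    assert(c_name_length() == 5);
}
```

One assumption to watch: because the static is per-thread, the reference only lives as long as the thread that made the call.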

Thanks for all the info though!




March 07, 2013
> That's what toStringz is for, and it'll avoid appending the '\0' if it can
> (e.g. if the code unit one past the end of the string is '\0' as it is with
> string literals).
>


I actually have a different question related to this now that I think about it. Is there a similar function to go from a '\0' terminated char* to a D string? Lately I have been using std.conv.text, but I have also made a function that just parses the pointer and copies its data into a string. I'm actually kind of surprised that there isn't anything built into the string class like this.
March 07, 2013
On Thursday, 7 March 2013 at 07:20:16 UTC, Jeremy DeHaan wrote:
>
>> That's what toStringz is for, and it'll avoid appending the '\0' if it can
>> (e.g. if the code unit one past the end of the string is '\0' as it is with
>> string literals).
>>
>
>
> I actually have a different question related to this now that I think about it. Is there a similar function to go from a '\0' terminated char* to a D string? Lately I have been using std.conv.text, but I have also made a function that just parses the pointer and copies its data into a string. I'm actually kind of surprised that there isn't anything built into the string class like this.

You can use "blah\0".to!string(). You can easily use slices too: "blah\0"[0..$-1].
Remember that D doesn't have a string class. string is defined as this:
  alias immutable(char)[] string;
So it's just an array (but with some compiler knowledge).
March 07, 2013
On Thursday, March 07, 2013 08:19:57 Jeremy DeHaan wrote:
> > That's what toStringz is for, and it'll avoid appending the
> > '\0' if it can
> > (e.g. if the code unit one past the end of the string is '\0'
> > as it is with
> > string literals).
> 
> I actually have a different question related to this now that I think about it. Is there a similar function to go from a '\0' terminated char* to a D string? Lately I have been using std.conv.text, but I have also made a function that just parses the pointer and copies its data into a string. I'm actually kind of surprised that there isn't anything built into the string class like this.

There is no string class. A string is simply immutable(char)[]. Nothing is built into string which isn't built into arrays in general.

But if you want to convert from a char* to string, then just use std.conv.to:

auto str = to!string(ptr);
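
A slightly fuller sketch, where ptr is only a stand-in for a pointer that a C function handed back. It also uses std.string.fromStringz, which as far as I know exists only in more recent Phobos releases and returns a slice of the C data instead of a copy:

```d
import std.conv : to;
import std.string : fromStringz, toStringz;

void main()
{
    // Stand-in for a zero-terminated pointer returned by a C function.
    const(char)* ptr = "blah".toStringz;

    // The pointer overload of to!string reads up to the '\0' and
    // returns a D string without the terminator.
    string str = to!string(ptr);
    assert(str == "blah");
    assert(str.length == 4);

    // fromStringz slices the existing memory, so the result is only
    // valid for as long as the C data is.
    const(char)[] view = fromStringz(ptr);
    assert(view == "blah");
    assert(view.ptr is ptr);

    // Converting something that is already a D string does not look
    // for a terminator, so an embedded '\0' stays part of the result.
    assert("blah\0".to!string.length == 5);
}
```

So the conversion that stops at the terminator is the one that takes a pointer; a D string that already contains '\0' has to be sliced by hand.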

- Jonathan M Davis