Thread overview
what's the right way to get char* from string?
May 05, 2016
aki
May 05, 2016
pineapple
May 05, 2016
Jonathan M Davis
May 05, 2016
aki
May 06, 2016
Alex Parrill
May 06, 2016
ZombineDev
May 05, 2016
Hello,

When I need to call a C function, I often need to
get a char* pointer from a string.

The "Interfacing to C++" page:
https://dlang.org/spec/cpp_interface.html
has the following example.

extern (C) int strcmp(char* string1, char* string2);
import std.string;
int myDfunction(char[] s)
{
    return strcmp(std.string.toStringz(s), "foo");
}

but this is incorrect because toStringz() returns an immutable pointer.
One way is to write a mutable version of toStringz():

char* toStringzMutable(string s) @trusted pure nothrow {
    auto copy = new char[s.length + 1];
    copy[0..s.length] = s[];
    copy[s.length] = 0;
    return copy.ptr;
}

But I think this is a common need.
Why is it not provided by Phobos?
(Or tell me if it is.)

Thanks,
aki

May 05, 2016
On Thursday, 5 May 2016 at 07:49:46 UTC, aki wrote:
> Hello,
>
> When I need to call C function, often need to
> have char* pointer from string.

This might help:

import std.traits : isSomeString;
import std.string : toStringz;

extern (C) int strcmp(char* string1, char* string2);

int strcmpD0(S)(in S lhs, in S rhs) if(is(S == string) || is(S == const(char)[])) { // Best
    return strcmp(
        cast(char*) toStringz(lhs),
        cast(char*) toStringz(rhs)
    );
}
int strcmpD1(S)(in S lhs, in S rhs) if(is(S == string) || is(S == const(char)[])) { // Works
    return strcmp(
        cast(char*) lhs.ptr,
        cast(char*) rhs.ptr
    );
}

/+
int strcmpD2(S)(in S lhs, in S rhs) if(is(S == string) || is(S == const(char)[])) { // Breaks
    return strcmp(
        toStringz(lhs),
        toStringz(rhs)
    );
}
+/

void main(){
    import std.stdio;
    writeln(strcmpD0("foo", "bar")); // Best
    writeln(strcmpD1("foo", "bar")); // Works
    //writeln(strcmpD2("foo", "bar")); // Breaks
}


May 05, 2016
On Thu, 05 May 2016 07:49:46 +0000
aki via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:

> Hello,
>
> When I need to call C function, often need to
> have char* pointer from string.
>
> "Interfacing to C++" page:
> https://dlang.org/spec/cpp_interface.html
> have following example.
>
> extern (C) int strcmp(char* string1, char* string2);
> import std.string;
> int myDfunction(char[] s)
> {
>      return strcmp(std.string.toStringz(s), "foo");
> }
>
> but this is incorrect because toStringz() returns immutable
> pointer.
> One way is to write mutable version of toStringz()
>
> char* toStringzMutable(string s) @trusted pure nothrow {
>      auto copy = new char[s.length + 1];
>      copy[0..s.length] = s[];
>      copy[s.length] = 0;
>      return copy.ptr;
> }
>
> But I think this is common needs,
> why it is not provided by Phobos?
> (or tell me if it has)

If you want a different mutability, then use the more general function std.utf.toUTFz. e.g. from the documentation:

    auto p1 = toUTFz!(char*)("hello world");
    auto p2 = toUTFz!(const(char)*)("hello world");
    auto p3 = toUTFz!(immutable(char)*)("hello world");
    auto p4 = toUTFz!(char*)("hello world"d);
    auto p5 = toUTFz!(const(wchar)*)("hello world");
    auto p6 = toUTFz!(immutable(dchar)*)("hello world"w);
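
As a minimal sketch of how the mutable variant might be used: `toUTFz!(char*)` has to hand back a fresh, null-terminated copy when the source is an immutable `string`, so a C function that writes through the pointer does not touch the original. The function `upcaseInPlace` below is a hypothetical stand-in (written in D with C linkage) for a real C function that modifies its argument; it is not part of any library.

```d
import std.utf : toUTFz;
import core.stdc.string : strlen;
import core.stdc.ctype : toupper;

// Hypothetical stand-in for a C function that writes through its char* argument.
extern (C) void upcaseInPlace(char* s)
{
    for (; *s != '\0'; ++s)
        *s = cast(char) toupper(*s);
}

void main()
{
    string original = "hello";

    // Requesting a mutable pointer yields a null-terminated copy of the data.
    char* p = toUTFz!(char*)(original);
    upcaseInPlace(p);

    assert(p[0 .. strlen(p)] == "HELLO");
    assert(original == "hello"); // the immutable source is untouched
}
```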

- Jonathan M Davis
May 05, 2016
On 5/5/16 11:53 AM, pineapple wrote:
> On Thursday, 5 May 2016 at 07:49:46 UTC, aki wrote:
>> Hello,
>>
>> When I need to call C function, often need to
>> have char* pointer from string.
>
> This might help:
>
> import std.traits : isSomeString;
> import std.string : toStringz;
>
> extern (C) int strcmp(char* string1, char* string2);
>
> int strcmpD0(S)(in S lhs, in S rhs) if(is(S == string) || is(S ==
> const(char)[])) { // Best
>      return strcmp(
>          cast(char*) toStringz(lhs),
>          cast(char*) toStringz(rhs)
>      );
> }

This is likely a correct solution, because strcmp does not modify any data in the string itself.

Practically speaking, you can define strcmp as taking const(char)*. This is what druntime does: http://dlang.org/phobos/core_stdc_string.html#.strcmp

> int strcmpD1(S)(in S lhs, in S rhs) if(is(S == string) || is(S ==
> const(char)[])) { // Works
>      return strcmp(
>          cast(char*) lhs.ptr,
>          cast(char*) rhs.ptr
>      );
> }

Note, this only works if the strings are literals. Do not use this mechanism in general.

> /+
> int strcmpD2(S)(in S lhs, in S rhs) if(is(S == string) || is(S ==
> const(char)[])) { // Breaks
>      return strcmp(
>          toStringz(lhs),
>          toStringz(rhs)
>      );
> }
> +/

Given the possibility that you are calling a C function that may actually modify the data, there isn't a really good way to do this.

The only thing I can think of is.. um... horrible:

char *toCharz(string s)
{
   auto cstr = s.toStringz;
   return cstr[0 .. s.length + 1].dup.ptr;
}

-Steve
May 05, 2016
On Thursday, 5 May 2016 at 11:35:09 UTC, Jonathan M Davis wrote:
> If you want a different mutability, then use the more general function std.utf.toUTFz. e.g. from the documentation:
>
>     auto p1 = toUTFz!(char*)("hello world");
>     auto p2 = toUTFz!(const(char)*)("hello world");
>     auto p3 = toUTFz!(immutable(char)*)("hello world");
>     auto p4 = toUTFz!(char*)("hello world"d);
>     auto p5 = toUTFz!(const(wchar)*)("hello world");
>     auto p6 = toUTFz!(immutable(dchar)*)("hello world"w);
>
> - Jonathan M Davis

Ah! This can be a solution.
Thanks Jonathan.

-- aki.

May 05, 2016
On 5/5/16 3:36 PM, Steven Schveighoffer wrote:
> Only thing I can think of is.. um... horrible:
>
> char *toCharz(string s)
> {
>     auto cstr = s.toStringz;
>     return cstr[0 .. s.length + 1].dup.ptr;
> }

Ignore this. What Jonathan said :)

-Steve

May 06, 2016
On Thursday, 5 May 2016 at 07:49:46 UTC, aki wrote:
> extern (C) int strcmp(char* string1, char* string2);

This signature of strcmp is incorrect. strcmp accepts const char* arguments [1], which in D would be written as const(char)*. The immutable(char)* values returned from toStringz are implicitly convertible to const(char)* and are therefore useable as-is as arguments to strcmp.

import std.string;
extern (C) int strcmp(const(char)* string1, const(char)* string2);
auto v = strcmp(somestring1.toStringz, somestring2.toStringz);

[1] http://linux.die.net/man/3/strcmp
May 06, 2016
On Thursday, 5 May 2016 at 07:49:46 UTC, aki wrote:
> Hello,
>
> When I need to call C function, often need to
> have char* pointer from string.
>
> "Interfacing to C++" page:
> https://dlang.org/spec/cpp_interface.html
> have following example.
>
> extern (C) int strcmp(char* string1, char* string2);
> import std.string;
> int myDfunction(char[] s)
> {
>     return strcmp(std.string.toStringz(s), "foo");
> }
>
> but this is incorrect because toStringz() returns immutable pointer.
> One way is to write mutable version of toStringz()
>
> char* toStringzMutable(string s) @trusted pure nothrow {
>     auto copy = new char[s.length + 1];
>     copy[0..s.length] = s[];
>     copy[s.length] = 0;
>     return copy.ptr;
> }
>
> But I think this is common needs,
> why it is not provided by Phobos?
> (or tell me if it has)
>
> Thanks,
> aki

In this particular case, if you `import core.stdc.string : strcmp` instead of providing your own extern declaration, it should work, because there the signature is correctly typed as `in char*`, which is essentially the same as `const(char*)` and can accept mutable, const, and immutable arguments. It also has the correct attributes, so you can call it from `pure`, `nothrow` and `@nogc` code.
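
A small sketch of that approach, assuming nothing beyond druntime and Phobos: the druntime declaration of `strcmp` is imported, and the `immutable(char)*` returned by `toStringz` is passed directly with no cast. The wrapper name `compareWithC` is made up for illustration.

```d
import core.stdc.string : strcmp;
import std.string : toStringz;

// Illustrative wrapper: works for string, const(char)[] and char[] alike,
// because toStringz accepts const(char)[] and strcmp takes const pointers.
int compareWithC(const(char)[] a, const(char)[] b)
{
    return strcmp(a.toStringz, b.toStringz);
}

void main()
{
    char[] mutableText = "foo".dup;

    assert(compareWithC("foo", "foo") == 0);
    assert(compareWithC("bar", "foo") < 0);
    assert(compareWithC(mutableText, "foo") == 0);
}
```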

As others have said, when you do need to convert a string slice to a pointer to a null terminated char/wchar/dchar string, `toUTFz` can be very useful.

But where possible, you should prefer functions that take an explicit length parameter, so you can avoid memory allocation:

```
string s1, s2;
import std.algorithm : min;
import core.stdc.string : strncmp;
strncmp(s1.ptr, s2.ptr, min(s1.length, s2.length));
// (`min` is used to prevent the C function from
// accessing data beyond the smallest
// of the two string slices).
```
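
One caveat with comparing only the first `min(s1.length, s2.length)` bytes is that a string and its own prefix compare as equal. A possible way to handle this, sketched here with a made-up helper name `compareSlices`, is to fall back to comparing the lengths when the common prefix matches. Note that `strncmp` still stops at an embedded `'\0'`, so this is only a sketch for text without embedded zero bytes.

```d
import core.stdc.string : strncmp;
import std.algorithm : min;

// Illustrative helper: three-way comparison of two slices without allocating.
int compareSlices(const(char)[] a, const(char)[] b)
{
    immutable n = min(a.length, b.length);

    // Skip the C call for empty slices, whose .ptr may be null.
    immutable r = n ? strncmp(a.ptr, b.ptr, n) : 0;
    if (r != 0)
        return r;

    // Common prefix is equal: the shorter slice sorts first.
    return (a.length > b.length) - (a.length < b.length);
}

void main()
{
    string lit = "asdf";

    assert(compareSlices("abc", "abd") < 0);
    assert(compareSlices("foo", "foobar") < 0);  // prefix is not "equal"
    assert(compareSlices(lit[0 .. 2], "as") == 0); // sub-slices are fine
}
```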

Also, string slices that point to a **whole** string literal are automatically null-terminated:

```
import core.stdc.string : strlen;
// lit is zero-terminated
string lit = "asdf";
assert (lit.ptr[lit.length] == '\0');
assert (strlen(lit.ptr) == lit.length);
```

However you need to be very careful, because as soon as you make a sub-slice, this property disappears:

```
// slice is not zero-terminated.
string slice = lit[0..2];
assert (slice.ptr[slice.length] == 'd');
assert (strlen(slice.ptr) != slice.length);
```

This means that you can't be sure that a string slice is zero-terminated unless you can see in your code that it points to a string literal and you're sure that it will never be changed to point to something else (like something returned from a function).
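
When the origin of a slice is not known, a safer route is to go through `toStringz`, which returns a pointer to properly terminated data whatever the slice points into. A minimal sketch (the helper name `cLength` is only for illustration):

```d
import std.string : toStringz;
import core.stdc.string : strlen;

// Illustrative helper: length as seen by C after a safe conversion.
size_t cLength(const(char)[] s)
{
    return strlen(s.toStringz);
}

void main()
{
    string lit = "asdf";
    string slice = lit[0 .. 2];

    // Passing slice.ptr directly would let C read on into "df";
    // going through toStringz gives C exactly the two characters.
    assert(cLength(slice) == slice.length);
    assert(cLength(lit) == lit.length);
}
```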