Thread overview
[Proposal] Add module for C-strings support in Phobos
Mar 20, 2014
Denis Shelomovskij
Mar 20, 2014
Rikki Cattermole
Mar 20, 2014
Denis Shelomovskij
Mar 20, 2014
Rikki Cattermole
Mar 20, 2014
Denis Shelomovskij
Mar 21, 2014
angel
Mar 21, 2014
Adam D. Ruppe
Mar 21, 2014
Adam D. Ruppe
Mar 22, 2014
Nordlöw
Mar 22, 2014
Nordlöw
Mar 22, 2014
Nordlöw
Mar 22, 2014
Adam D. Ruppe
Mar 22, 2014
Nordlöw
Mar 22, 2014
Nordlöw
Mar 22, 2014
Nordlöw
Mar 22, 2014
Andrej Mitrovic
Mar 22, 2014
Nordlöw
Mar 22, 2014
Nordlöw
Mar 22, 2014
Andrej Mitrovic
Mar 22, 2014
Andrej Mitrovic
Mar 22, 2014
Andrej Mitrovic
March 20, 2014
It's filed as enhancement 12418 [2]:

C-string processing is a special and common case, so:
1. C-strings should be supported with both performance and usability.
2. There should be a dedicated module for C-strings (instead of adding such functions here and there in other modules).

Current state: Phobos has no good support for C-strings. There is a slow and broken `toStringz` (Issue 12417 [3]), and there is no standard way to perform many common operations, such as converting a returned C-string to a string and releasing its memory, or creating a C-string from a string using an allocation function.

So I propose adding the `unstd.c.string` [1] module to Phobos. It covers all the use-cases I have seen while implementing C library wrappers, and it is correct and fast, in contrast to existing wrappers like GtkD (yes, it's both incorrect and slow because of tons of GC allocations).
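
To make the two operations named above concrete, here is a minimal sketch using only `core.stdc`. The helper names `takeCString` and `toMallocedCString` are hypothetical and are not the API of `unstd.c.string`; the sketch also assumes the C side allocates with `malloc`, so `free` is the matching release function.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy, strlen;

/// Hypothetical helper: copies a malloc'ed C string into a GC-owned
/// D string and then frees the original C allocation.
string takeCString(char* cstr)
{
    if (cstr is null)
        return null;
    scope (exit) free(cstr);             // runs after the copy below is made
    return cstr[0 .. strlen(cstr)].idup; // a single GC allocation
}

/// Hypothetical helper: builds a zero-terminated copy of `s` with malloc.
/// The caller owns the result and must free it.
char* toMallocedCString(const(char)[] s)
{
    auto p = cast(char*) malloc(s.length + 1);
    if (p is null)
        return null;
    if (s.length)
        memcpy(p, s.ptr, s.length);
    p[s.length] = '\0';
    return p;
}

void main()
{
    char* c = toMallocedCString("hello");
    assert(c !is null && strlen(c) == 5);
    string d = takeCString(c); // c is freed here and must not be used again
    assert(d == "hello");
}
```

A real module would presumably take the allocation and release functions as parameters rather than hard-coding `malloc` and `free`.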


[1] http://denis-sh.bitbucket.org/unstandard/unstd.c.string.html
[2] https://d.puremagic.com/issues/show_bug.cgi?id=12418
[3] https://d.puremagic.com/issues/show_bug.cgi?id=12417

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
March 20, 2014
Looks like it wouldn't be very useful with the Windows API, given that wstrings are more common there.

Another thing that would be nice to have is a wrapper struct for the pointer that allows access via e.g. opIndex, opSlice, etc.
Use case: store the struct on the D side to make sure the GC doesn't collect the memory, while still being able to access and modify it easily like a normal string.
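
A rough sketch of what such a wrapper might look like follows. The name `CStringView` and its members are hypothetical; nothing like it is claimed to exist in `unstd.c.string` or Phobos, and the sketch assumes a mutable, zero-terminated `char*`.

```d
import core.stdc.string : strlen;

/// Hypothetical wrapper around a zero-terminated char pointer.
struct CStringView
{
    private char* ptr;
    private size_t len;

    this(char* p)
    {
        ptr = p;
        len = p is null ? 0 : strlen(p); // length is measured once, up front
    }

    @property size_t length() const { return len; }
    alias opDollar = length;             // enables `$` inside index/slice

    /// Element access; returns by ref so elements can be assigned.
    ref char opIndex(size_t i)
    {
        assert(i < len, "index out of bounds");
        return ptr[i];
    }

    /// Whole-string slice, excluding the terminator.
    char[] opSlice() { return ptr[0 .. len]; }

    /// Sub-slice with bounds checking against the measured length.
    char[] opSlice(size_t lo, size_t hi)
    {
        assert(lo <= hi && hi <= len, "slice out of bounds");
        return ptr[lo .. hi];
    }
}

void main()
{
    char[6] buf = "hello\0";
    auto s = CStringView(buf.ptr);
    s[0] = 'J';
    assert(s.length == 5);
    assert(s[] == "Jello");
    assert(s[1 .. $] == "ello");
}
```

Because the struct holds the pointer itself, a GC-allocated buffer stays reachable for as long as the struct is stored somewhere the GC scans.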
March 20, 2014
20.03.2014 13:20, Rikki Cattermole wrote:
> Looks like it wouldn't be very useful with the Windows API, given that
> wstrings are more common there.

You misunderstand the terminology. A C string is a zero-terminated string. It also looks like you didn't even visit the docs page, as the second example there is a WinAPI one.
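
To illustrate that "zero-terminated" is independent of the character width, here is a small sketch with a UTF-16 C string of the kind WinAPI functions take. `std.utf.toUTF16z` is an existing Phobos function; `wideLength` is a hypothetical helper written only for this example.

```d
import std.utf : toUTF16z;

/// Hypothetical helper: counts UTF-16 code units up to the terminator.
size_t wideLength(const(wchar)* p)
{
    size_t n = 0;
    while (p[n] != 0)
        ++n;
    return n;
}

void main()
{
    // Produces a zero-terminated UTF-16 copy, suitable for a LPCWSTR parameter.
    const(wchar)* w = "héllo".toUTF16z;
    assert(wideLength(w) == 5);
    assert(w[0 .. 5] == "héllo"w);
}
```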

> Another thing that would be nice to have is a wrapper struct for the
> pointer that allows access via e.g. opIndex, opSlice, etc.
> Use case: store the struct on the D side to make sure the GC doesn't
> collect the memory, while still being able to access and modify it easily
> like a normal string.

I don't understand the use-case. If you have implemented some C library wrappers and have personal experience, I'd like to hear your opinion on the problem of calling C functions and your proposal to solve it, if you dislike mine. With examples, please, of where my solution fails and yours rocks. )


-- 
Денис В. Шеломовский
Denis V. Shelomovskij
March 20, 2014
On Thursday, 20 March 2014 at 09:32:33 UTC, Denis Shelomovskij wrote:
> You misunderstand the terminology. A C string is a zero-terminated string. It also looks like you didn't even visit the docs page, as the second example there is a WinAPI one.

I understand how C strings work. It would be nice to have more unittests for dstring/wstring, because the module looks more geared towards char/string, which is why at first glance it looks less likely to work.

> I don't understand the use-case. If you have implemented some C library wrappers and have personal experience, I'd like to hear your opinion on the problem of calling C functions and your proposal to solve it, if you dislike mine. With examples, please, of where my solution fails and yours rocks. )

I don't dislike your approach at all. I just feel that it needs to allow for a few more use cases, given that the proposal is for Phobos.

What you have done looks fine for most cases involving C libraries. I'm just worried that it covers fewer use cases than it could.
I'm just nitpicking so don't mind me too much :)
March 20, 2014
20.03.2014 13:52, Rikki Cattermole wrote:
> I understand how C strings work. It would be nice to have more unittests
> for dstring/wstring, because the module looks more geared towards
> char/string, which is why at first glance it looks less likely to work.

I'd say most unittests do test the UTF-16 and UTF-32 versions. As for documentation, the function signatures contain a template parameter for the character type, but there is probably a lack of ddoc unittests and/or documentation.
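
As an illustration of a function templated on the character type and checked for all three widths, consider the following sketch. `zlength` is a hypothetical name, not a function from `unstd.c.string`; `std.meta.AliasSeq` is the current Phobos name for what was `TypeTuple` in 2014.

```d
import std.meta : AliasSeq;

/// Hypothetical generic length of a zero-terminated string of any D character type.
size_t zlength(C)(const(C)* p)
    if (is(C == char) || is(C == wchar) || is(C == dchar))
{
    size_t n = 0;
    while (p[n] != 0)
        ++n;
    return n;
}

void main()
{
    // The loop body is instantiated once per character type at compile time.
    foreach (C; AliasSeq!(char, wchar, dchar))
    {
        immutable(C)[] s = "abc\0";      // explicit terminator inside the slice
        assert(zlength!C(s.ptr) == 3);
    }
}
```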

> I don't dislike your approach at all. I just feel that it needs to allow
> for a few more use cases, given that the proposal is for Phobos.
>
> What you have done looks fine for most cases involving C libraries. I'm just
> worried that it covers fewer use cases than it could.
> I'm just nitpicking so don't mind me too much :)

Thanks. So the algorithm is like this: find a C library which needs more love and file an issue for me [1], as I have just added all the common use-cases I have seen.

[1] https://bitbucket.org/denis-sh/unstandard/issues

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
March 21, 2014
Going slightly beyond the code of a new module, it might be useful to enable zero-terminated string creation at the core language level, with:
    auto mystr = "hello"z;

The 'z' at the end is much the same as the 'L' in '5L' ...
March 21, 2014
On Friday, 21 March 2014 at 19:59:51 UTC, angel wrote:
> Going slightly beyond the code of a new module, it might be useful to enable zero-terminated string creation at the core language level, with:
>     auto mystr = "hello"z;

The core language already knows zero-terminated strings:

void main() {
        immutable(char)* s = "lol";
}

Regular 8-bit string literals implicitly convert to pointers without needing to explicitly use the .ptr property, and they are always zero-terminated automatically.

This is why you can write printf("foo"); in D and have it just work without complaining about needing toStringz.

You can also write:

        const char* s = "lol";

and that works too. Not quite auto, but not a big hassle.
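
A short sketch of the distinction, assuming only `core.stdc.string.strlen` and `std.string.toStringz`: the implicit conversion and the guaranteed terminator apply to literals, while a string built or sliced at run time still needs an explicit conversion.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    // A literal converts implicitly and carries a trailing '\0'.
    const(char)* lit = "lol";
    assert(strlen(lit) == 3);

    // A runtime slice has no such guarantee and no implicit conversion.
    string whole = "lolcat";
    string part = whole[0 .. 3];
    // const(char)* p = part;   // does not compile: string is not a pointer
    const(char)* p = toStringz(part);
    assert(strlen(p) == 3);
}
```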
March 21, 2014
You could also write:

alias toStringz z;

auto foo = "bar".z;

and that would work too!
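
As a self-contained sketch of that idea: UFCS only considers symbols visible at module scope, so the example below assumes the alias is declared there rather than inside a function. It uses the newer `alias z = toStringz;` spelling, which is equivalent to the form above.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Declared at module scope so that UFCS lookup can find it.
alias z = toStringz;

void main()
{
    auto foo = "bar".z;       // treated as z("bar"), i.e. toStringz("bar")
    assert(strlen(foo) == 3);
}
```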
March 22, 2014
> alias toStringz z;
>
> auto foo = "bar".z;

In this case "bar" is already zero-terminated, right?

See

"String literals already have a 0 appended to them"

in http://dlang.org/arrays.html
March 22, 2014
> You could also write:
>
> alias toStringz z;
>
> auto foo = "bar".z;
>
> and that would work too!

Unfortunately, DMD currently cannot infer that aliases are callable using UFCS:

#!/usr/bin/env rdmd-dev-module

unittest {
    import std.stdio: wln = writeln;
    import std.string;
    wln(typeof("a".z).stringof);
}

errors with

t_string.d(19,19): Error: no property 'z' for type 'string'

Shouldn't be too hard to fix, though.