June 07, 2013 odd behavior of split() function

I would like to split "A+B+C+D" into "A", "B", "C", "D"

but when using split() I get

"A+B+C+D", "B+C+D", "C+D", "D"

the code is below

    import std.stdio;
    import std.string;
    import std.array;

    int main()
    {
        string [] str_list;
        string test_str = "A+B+C+D";
        str_list = test_str.split("+");
        foreach(item; str_list)
            printf("%s\n", cast(char*)item);

        return 0;
    }

June 07, 2013 Re: odd behavior of split() function

Posted in reply to Bedros

On Friday, June 07, 2013 09:18:57 Bedros wrote:
> I would like to split "A+B+C+D" into "A", "B", "C", "D"
>
> but when using split() I get
>
> "A+B+C+D", "B+C+D", "C+D", "D"
>
>
> the code is below
>
>
> import std.stdio;
> import std.string;
> import std.array;
>
> int main()
> {
> string [] str_list;
> string test_str = "A+B+C+D";
> str_list = test_str.split("+");
> foreach(item; str_list)
> printf("%s\n", cast(char*)item);
>
> return 0;
> }

That would be because of your misuse of printf. If you had used

    foreach(item; str_list)
        writeln(item);

you would have been fine. D string literals do happen to have a null character one past their end so that you can pass them directly to C functions, but D strings in general are _not_ null-terminated, and printf expects strings to be null-terminated. If you want to convert a D string to a null-terminated string, you need to use std.string.toStringz, not a cast. You should pretty much never cast a D string to char* or const char* or any variant thereof. So, you could have done

    printf("%s\n", toStringz(item));

but I don't know why you'd want to use printf rather than writeln or writefln, both of which (unlike printf) are type-safe and understand D types.

You got

"A+B+C+D", "B+C+D", "C+D", "D"

because the original string (being a string literal) had a null character one past its end, and each of the strings returned by split was a slice of that original string. printf blithely ignored the actual boundaries of each slice and kept reading until the next null character that it happened to find in memory, which (because they were all slices of the same string literal) happened to be the one at the end of the original literal. The printed strings differed because each slice started at a different position in the underlying array.

- Jonathan M Davis
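
The slicing behavior described in the reply above can be made visible with a small sketch. This is only an illustration, not code from the thread: the helper name `asCWouldSeeIt` is hypothetical, and the sketch assumes (as the reply states) that split returns slices of the original literal. It uses std.string.fromStringz to read from a slice's pointer up to the next null byte, which is roughly what printf("%s") does.

```d
import std.array : split;
import std.stdio : writeln;
import std.string : fromStringz;

/// Hypothetical helper: returns what a C function would read starting at
/// `piece.ptr`, i.e. everything up to the next null byte. It deliberately
/// ignores `piece.length`, which is the mistake being illustrated.
const(char)[] asCWouldSeeIt(const(char)[] piece)
{
    return fromStringz(piece.ptr);
}

void main()
{
    // A string literal, so a '\0' is stored right after the final 'D'.
    string text = "A+B+C+D";
    string[] parts = text.split("+");

    foreach (part; parts)
    {
        // Each part points into the memory of `text`; nothing was copied.
        immutable offset = part.ptr - text.ptr;
        writeln(part, " (offset ", offset, ") -> C reads: ", asCWouldSeeIt(part));
    }
}
```

Note that this only behaves predictably because every slice comes from one null-terminated literal. For a string built at run time there may be no null byte nearby at all, and reading from the bare pointer is then undefined behavior.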

June 07, 2013 Re: odd behavior of split() function

Posted in reply to Jonathan M Davis

First of all, many thanks for the quick reply.
I'm learning D, and I used printf instead of writef purely out of habit.
Thanks again.
-Bedros

June 07, 2013 Re: odd behavior of split() function

Posted in reply to Bedros

On 07.06.2013 09:53, Bedros wrote:
> first of all, many thanks for the quick reply.
>
> I'm learning D and it's just because of the habit I unconsciously used
> printf instead of writef
>
> thanks again.
>
> -Bedros

You can use printf if you want to; the correct usage is not so nice, though:

    string str = "test";
    // "%.*s" expects an int precision, so the size_t length needs a cast
    printf("%.*s", cast(int) str.length, str.ptr);

Kind Regards
Benjamin Thaut
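
For anyone who does need to hand a D slice to a C formatting function, the two safe options mentioned in this thread can be sketched as follows. This is a minimal illustration under stated assumptions: the helper `formatViaC` and the 64-byte buffer size are hypothetical choices made for the example, and snprintf is used only so that the result can be inspected instead of being printed.

```d
import core.stdc.stdio : printf, snprintf;
import std.string : toStringz;

/// Hypothetical helper: formats `s` through C's "%.*s", which reads at most
/// `precision` bytes and so never runs past the end of the slice.
string formatViaC(const(char)[] s)
{
    char[64] buf; // arbitrary size chosen for the sketch
    // The precision consumed by ".*" must be a C int, not a size_t.
    const n = snprintf(buf.ptr, buf.length, "%.*s", cast(int) s.length, s.ptr);
    if (n < 0)
        return null; // encoding or output error reported by snprintf
    // snprintf returns the untruncated length, so clamp it to the buffer.
    const len = cast(size_t) n < buf.length ? cast(size_t) n : buf.length - 1;
    return buf[0 .. len].idup;
}

void main()
{
    string text = "A+B+C+D";
    const(char)[] slice = text[2 .. 3]; // "B", with no null byte right after it

    // Option 1: pass the length explicitly.
    printf("%.*s\n", cast(int) slice.length, slice.ptr);

    // Option 2: make a null-terminated copy first.
    printf("%s\n", toStringz(slice));
}
```

Option 1 avoids an allocation, while option 2 is easier to read but copies the string. For ordinary output, writeln and writefln remain the simpler choice.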