June 08, 2015 Utf8 to Utf32 cast cost
I want to use my char array with awesome, cool std.algorithm functions. Since many of these algorithms require things like slicing, I prefer to create my string with Utf32 chars. But by default all string literals are Utf8 for performance.

With my current knowledge I use to!dchar to convert Utf8[] (or char[]) to Utf32[] (or dchar[])

dchar[] range = to!dchar("erdem".dup)

How costly is this?
Is there a way I can have a Utf32 string directly without a cast?
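
The slicing concern in the question can be made concrete with a minimal sketch. It assumes current Phobos behaviour, where narrow strings are decoded on the fly and are therefore not random-access ranges; `sortedCodePoints` is a hypothetical helper name, not something from the thread.

```d
import std.algorithm.sorting : sort;
import std.conv : to;
import std.range.primitives : hasSlicing, isRandomAccessRange;

// Narrow strings are decoded to dchar on the fly, so the range API
// does not treat them as random-access or sliceable ranges.
static assert(!isRandomAccessRange!string);
static assert(!hasSlicing!string);

// A dchar array stores one code point per element and supports both.
static assert(isRandomAccessRange!(dchar[]));
static assert(hasSlicing!(dchar[]));

// Hypothetical helper: return the code points of a UTF-8 string, sorted.
dchar[] sortedCodePoints(string s)
{
    auto buf = s.to!(dchar[]); // allocates and decodes once
    sort(buf);                 // sort needs a random-access range
    return buf;
}

void main()
{
    assert(sortedCodePoints("erdem") == "deemr"d);
}
```

The conversion allocates a new array, which is the cost the rest of the thread tries to measure.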

June 08, 2015 Re: Utf8 to Utf32 cast cost
Posted in reply to Kadir Erdem Demir
On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
> I want to use my char array with awesome, cool std.algorithm functions. Since many of these algorithms require things like slicing, I prefer to create my string with Utf32 chars. But by default all string literals are Utf8 for performance.
>
> With my current knowledge I use to!dchar to convert Utf8[] (or char[]) to Utf32[] (or dchar[])
>
> dchar[] range = to!dchar("erdem".dup)
>
> How costly is this?
> Is there a way I can have a Utf32 string directly without a cast?
1. dstring range = to!dstring("erdem"); //without dup
2. dchar[] range = to!(dchar[])("erdem"); //mutable
3. dstring range = "erdem"d; //directly
4. dchar[] range = "erdem"d.dup; //mutable
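
The four options above can be checked side by side. This is a minimal sketch; the variable names and the "söz" example are illustrative additions, not part of the reply.

```d
import std.conv : to;

void main()
{
    // 1. Run-time conversion to an immutable UTF-32 string.
    dstring a = to!dstring("erdem");
    // 2. Run-time conversion to a mutable array.
    dchar[] b = to!(dchar[])("erdem");
    // 3. UTF-32 literal: no run-time conversion is needed.
    dstring c = "erdem"d;
    // 4. Mutable copy of a UTF-32 literal.
    dchar[] d = "erdem"d.dup;

    // All four hold the same code points.
    assert(a == c && b == c && d == c);

    // .length counts code units: "ö" takes 2 in UTF-8 but 1 in UTF-32.
    assert("söz".length == 4);
    assert("söz"d.length == 3);

    // Only the mutable variants can be modified in place.
    b[0] = 'E';
    assert(b == "Erdem"d);
}
```

Options 3 and 4 only apply when the text is a literal; text arriving at run time as UTF-8 still needs option 1 or 2.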

June 08, 2015 Re: Utf8 to Utf32 cast cost
Posted in reply to Kadir Erdem Demir
On Mon, 08 Jun 2015 10:41:59 +0000 Kadir Erdem Demir via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> I want to use my char array with awesome, cool std.algorithm functions. Since many of these algorithms require things like slicing, I prefer to create my string with Utf32 chars. But by default all string literals are Utf8 for performance.
>
> With my current knowledge I use to!dchar to convert Utf8[] (or char[]) to Utf32[] (or dchar[])
>
> dchar[] range = to!dchar("erdem".dup)
>
> How costly is this?
> Is there a way I can have a Utf32 string directly without a cast?
dstring str = "erdem"d;
dstring str2 = std.utf.toUTF32(someUtf8Or16Or32String);
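
A short sketch of the `std.utf.toUTF32` call above, showing that it accepts any of the three string types. The variable names are made up for illustration.

```d
import std.utf : toUTF32;

void main()
{
    string  s8  = "erdem";   // UTF-8
    wstring s16 = "erdem"w;  // UTF-16
    dstring s32 = "erdem"d;  // UTF-32

    // Each input encoding yields the same UTF-32 result.
    assert(toUTF32(s8)  == s32);
    assert(toUTF32(s16) == s32);
    assert(toUTF32(s32) == s32);

    // toUTF32 returns an immutable dstring; .dup gives a mutable copy.
    dchar[] editable = toUTF32(s8).dup;
    editable[0] = 'E';
    assert(editable == "Erdem"d);
}
```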

June 08, 2015 Re: Utf8 to Utf32 cast cost
Posted in reply to Ilya Yaroshenko
On Monday, 8 June 2015 at 10:49:59 UTC, Ilya Yaroshenko wrote:
> On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
>> I want to use my char array with awesome, cool std.algorithm functions. Since many of these algorithms require things like slicing, I prefer to create my string with Utf32 chars. But by default all string literals are Utf8 for performance.
>>
>> With my current knowledge I use to!dchar to convert Utf8[] (or char[]) to Utf32[] (or dchar[])
>>
>> dchar[] range = to!dchar("erdem".dup)
>>
>> How costly is this?
>> Is there a way I can have a Utf32 string directly without a cast?
>
> 1. dstring range = to!dstring("erdem"); //without dup
> 2. dchar[] range = to!(dchar[])("erdem"); //mutable
> 3. dstring range = "erdem"d; //directly
> 4. dchar[] range = "erdem"d.dup; //mutable
what's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32

June 08, 2015 Re: Utf8 to Utf32 cast cost
Posted in reply to Kadir Erdem Demir
On Mon, 08 Jun 2015 10:41:59 +0000 Kadir Erdem Demir via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> I want to use my char array with awesome, cool std.algorithm functions. Since many of these algorithms require things like slicing, I prefer to create my string with Utf32 chars. But by default all string literals are Utf8 for performance.
>
> With my current knowledge I use to!dchar to convert Utf8[] (or char[]) to Utf32[] (or dchar[])
>
> dchar[] range = to!dchar("erdem".dup)
>
> How costly is this?
import std.conv;
import std.utf;
import std.datetime;
import std.stdio;

void f0() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
}

void f1() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
}

void main() {
    auto r = benchmark!(f0,f1)(1_000_000);
    auto f0Result = to!Duration(r[0]);
    auto f1Result = to!Duration(r[1]);
    writeln("f0 time: ",f0Result);
    writeln("f1 time: ",f1Result);
}

/// output ///
f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
f1 time: 600 ms, 979 μs, and 8 hnsecs

June 08, 2015 Re: Utf8 to Utf32 cast cost
Posted in reply to weaselcat
Thanks a lot, your answers are very useful for me. Nothing wrong with toUTF32, I just didn't know about it.

June 08, 2015 Re: Utf8 to Utf32 cast cost
Posted in reply to weaselcat
On Mon, 08 Jun 2015 10:51:53 +0000 weaselcat via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> On Monday, 8 June 2015 at 10:49:59 UTC, Ilya Yaroshenko wrote:
> > On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
> >> I want to use my char array with awesome, cool std.algorithm functions. Since many of these algorithms require things like slicing, I prefer to create my string with Utf32 chars. But by default all string literals are Utf8 for performance.
> >>
> >> With my current knowledge I use to!dchar to convert Utf8[] (or char[]) to Utf32[] (or dchar[])
> >>
> >> dchar[] range = to!dchar("erdem".dup)
> >>
> >> How costly is this?
> >> Is there a way I can have a Utf32 string directly without a cast?
> >
> > 1. dstring range = to!dstring("erdem"); //without dup
> > 2. dchar[] range = to!(dchar[])("erdem"); //mutable
> > 3. dstring range = "erdem"d; //directly
> > 4. dchar[] range = "erdem"d.dup; //mutable
>
> what's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32
from: http://dlang.org/phobos/std_encoding.html#.transcode

Supersedes:
This function supersedes std.utf.toUTF8(), std.utf.toUTF16() and std.utf.toUTF32() (but note that to!() supersedes it more conveniently).

June 08, 2015 Re: Utf8 to Utf32 cast cost
Posted in reply to Daniel Kozák
On Monday, 8 June 2015 at 11:06:07 UTC, Daniel Kozák wrote:
>
> On Mon, 08 Jun 2015 10:51:53 +0000
> weaselcat via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com>
> wrote:
>
>> On Monday, 8 June 2015 at 10:49:59 UTC, Ilya Yaroshenko wrote:
>> > On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
>> >> I want to use my char array with awesome, cool std.algorithm functions. Since many of these algorithms require things like slicing, I prefer to create my string with Utf32 chars. But by default all string literals are Utf8 for performance.
>> >>
>> >> With my current knowledge I use to!dchar to convert Utf8[] (or char[]) to Utf32[] (or dchar[])
>> >>
>> >> dchar[] range = to!dchar("erdem".dup)
>> >>
>> >> How costly is this?
>> >> Is there a way I can have a Utf32 string directly without a cast?
>> >
>> > 1. dstring range = to!dstring("erdem"); //without dup
>> > 2. dchar[] range = to!(dchar[])("erdem"); //mutable
>> > 3. dstring range = "erdem"d; //directly
>> > 4. dchar[] range = "erdem"d.dup; //mutable
>>
>> what's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32
>
> from: http://dlang.org/phobos/std_encoding.html#.transcode
>
> Supersedes:
> This function supersedes std.utf.toUTF8(), std.utf.toUTF16() and
> std.utf.toUTF32() (but note that to!() supersedes it more conveniently).
BTW, on ldc (ldc -O3 -singleobj -release -boundscheck=off) transcode is the fastest:
f0 time: 1 sec, 115 ms, 48 μs, and 7 hnsecs // to!dstring
f1 time: 449 ms and 329 μs // toUTF32
f2 time: 272 ms, 969 μs, and 1 hnsec // transcode
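
The post does not show the code behind `f2`. The following is a sketch of what a three-way comparison might look like, assuming `f2` calls `std.encoding.transcode` and using the newer `std.datetime.stopwatch.benchmark` instead of the `std.datetime` version used earlier in the thread. The iteration count is reduced here (the thread used 1_000_000), and no timings are claimed.

```d
import std.conv : to;
import std.datetime.stopwatch : benchmark;
import std.encoding : transcode;
import std.stdio : writeln;
import std.utf : toUTF32;

void f0() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
}

void f1() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
}

// Assumed shape of f2: transcode writes its result into an out parameter.
void f2() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str;
    transcode(somestr, str);
}

void main() {
    // benchmark returns one Duration per function, in argument order.
    auto r = benchmark!(f0, f1, f2)(100_000);
    writeln("f0 time: ", r[0]); // to!dstring
    writeln("f1 time: ", r[1]); // toUTF32
    writeln("f2 time: ", r[2]); // transcode
}
```

Results depend heavily on compiler and flags, as the dmd and ldc figures in the thread already suggest.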

June 08, 2015 Re: Utf8 to Utf32 cast cost
Posted in reply to Daniel Kozák
On Monday, 8 June 2015 at 10:59:45 UTC, Daniel Kozák wrote:
> import std.conv;
> import std.utf;
> import std.datetime;
> import std.stdio;
>
> void f0() {
> string somestr = "some not so long utf8 string forbenchmarking";
> dstring str = to!dstring(somestr);
> }
>
>
> void f1() {
> string somestr = "some not so long utf8 string forbenchmarking";
> dstring str = toUTF32(somestr);
> }
>
> void main() {
> auto r = benchmark!(f0,f1)(1_000_000);
> auto f0Result = to!Duration(r[0]);
> auto f1Result = to!Duration(r[1]);
> writeln("f0 time: ",f0Result);
> writeln("f1 time: ",f1Result);
> }
>
>
> /// output ///
> f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
> f1 time: 600 ms, 979 μs, and 8 hnsecs
Chances are you're benchmarking the GC. Try benchmark!(f0,f1,f0,f1,f0,f1);

June 08, 2015 Re: Utf8 to Utf32 cast cost
Posted in reply to Kagamin
On Mon, 08 Jun 2015 11:32:07 +0000 Kagamin via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> On Monday, 8 June 2015 at 10:59:45 UTC, Daniel Kozák wrote:
> > import std.conv;
> > import std.utf;
> > import std.datetime;
> > import std.stdio;
> >
> > void f0() {
> > string somestr = "some not so long utf8 string forbenchmarking";
> > dstring str = to!dstring(somestr);
> > }
> >
> >
> > void f1() {
> > string somestr = "some not so long utf8 string forbenchmarking";
> > dstring str = toUTF32(somestr);
> > }
> >
> > void main() {
> > auto r = benchmark!(f0,f1)(1_000_000);
> > auto f0Result = to!Duration(r[0]);
> > auto f1Result = to!Duration(r[1]);
> > writeln("f0 time: ",f0Result);
> > writeln("f1 time: ",f1Result);
> > }
> >
> >
> > /// output ///
> > f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
> > f1 time: 600 ms, 979 μs, and 8 hnsecs
>
> Chances are you're benchmarking the GC. Try benchmark!(f0,f1,f0,f1,f0,f1);
No difference: even with GC.disable() the results are the same.
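
The exact code used for the GC check is not shown in the thread. One way it might be set up is sketched below, combining the interleaved `benchmark!(f0,f1,f0,f1,f0,f1)` call suggested above with `core.memory.GC.disable()`. The use of `std.datetime.stopwatch` and the reduced iteration count are assumptions; memory grows while collections are suppressed, so the count is kept small.

```d
import core.memory : GC;
import std.conv : to;
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;
import std.utf : toUTF32;

void f0() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
}

void f1() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
}

void main() {
    // Suppress automatic collections while measuring, then restore them.
    GC.disable();
    scope (exit) GC.enable();

    // Interleave the candidates so neither always runs on a fresh heap.
    auto r = benchmark!(f0, f1, f0, f1, f0, f1)(50_000);
    foreach (i, d; r)
        writeln("f", i % 2, " run ", i / 2, ": ", d);
}
```

If the ordering between `f0` and `f1` stays the same across all three runs, the difference is unlikely to come from collection pauses alone.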