Utf8 to Utf32 cast cost
June 08, 2015
I want to use my char arrays with the std.algorithm functions. Since many of these algorithms require things like slicing, I prefer to build my strings from UTF-32 characters. But by default all string literals are UTF-8, for performance.

With my current knowledge, I use std.conv.to to convert UTF-8 (char[]) to UTF-32 (dchar[]):

dchar[] range = to!(dchar[])("erdem".dup);

How costly is this?
Is there a way to get a UTF-32 string directly, without a cast?
June 08, 2015
On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
> How costly is this?
> Is there a way which I can have Utf32 string directly without a cast?

1. dstring range = to!dstring("erdem"); //without dup
2. dchar[] range = to!(dchar[])("erdem"); //mutable
3. dstring range = "erdem"d; //directly
4. dchar[] range = "erdem"d.dup; //mutable
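
A minimal sketch of the four options above, side by side. The asserts and the non-ASCII sample string are illustrative additions, not part of the original reply:

```d
import std.conv : to;
import std.stdio : writeln;

void main()
{
    // 1. Convert an immutable UTF-8 string to an immutable UTF-32 string.
    dstring a = to!dstring("erdem");

    // 2. Convert to a mutable UTF-32 array; to! builds a new array.
    dchar[] b = to!(dchar[])("erdem");

    // 3. A UTF-32 literal: the d suffix makes the compiler emit UTF-32,
    //    so no conversion is needed at run time.
    dstring c = "erdem"d;

    // 4. A mutable copy of a UTF-32 literal.
    dchar[] d = "erdem"d.dup;

    // Only the mutable variants can be modified in place.
    b[0] = 'E';
    d[0] = 'E';

    assert(a == c);
    assert(b == d);

    // Illustrative sample (assumption): U+00E7 takes two code units in
    // UTF-8 but one in UTF-32, so the lengths differ.
    assert("\u00E7ay".length == 4);
    assert(to!dstring("\u00E7ay").length == 3);

    writeln(a, " ", b, " ", c, " ", d);
}
```

Options 3 and 4 avoid the run-time transcoding step entirely, but they only apply when the text is known at compile time.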
June 08, 2015
On Mon, 08 Jun 2015 10:41:59 +0000
Kadir Erdem Demir via Digitalmars-d-learn
<digitalmars-d-learn@puremagic.com> wrote:

> How costly is this?
> Is there a way which I can have Utf32 string directly without a
> cast?

dstring str = "erdem"d;
dstring str2 = std.utf.toUTF32(someUtf8Or16Or32String);
June 08, 2015
On Monday, 8 June 2015 at 10:49:59 UTC, Ilya Yaroshenko wrote:
> 1. dstring range = to!dstring("erdem"); //without dup
> 2. dchar[] range = to!(dchar[])("erdem"); //mutable
> 3. dstring range = "erdem"d; //directly
> 4. dchar[] range = "erdem"d.dup; //mutable

What's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32 ?
June 08, 2015
On Mon, 08 Jun 2015 10:41:59 +0000
Kadir Erdem Demir via Digitalmars-d-learn
<digitalmars-d-learn@puremagic.com> wrote:

> How costly is this?

import std.conv;
import std.utf;
import std.datetime;
import std.stdio;

void f0() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
}


void f1() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
}

void main() {
    auto r = benchmark!(f0,f1)(1_000_000);
    auto f0Result = to!Duration(r[0]);
    auto f1Result = to!Duration(r[1]);
    writeln("f0 time: ",f0Result);
    writeln("f1 time: ",f1Result);
}


/// output ///
f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
f1 time: 600 ms, 979 μs, and 8 hnsecs

June 08, 2015
Thanks a lot, your answers are very useful to me.
Nothing is wrong with toUTF32; I just didn't know about it.
June 08, 2015
On Mon, 08 Jun 2015 10:51:53 +0000
weaselcat via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com>
wrote:

> what's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32

from: http://dlang.org/phobos/std_encoding.html#.transcode

Supersedes:
This function supersedes std.utf.toUTF8(), std.utf.toUTF16() and
std.utf.toUTF32() (but note that to!() supersedes it more conveniently).
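
For reference, a small sketch of how the three conversions compare in use. Unlike to!dstring and toUTF32, std.encoding.transcode returns its result through an out parameter. The wrapper name viaTranscode is hypothetical, introduced here only for illustration:

```d
import std.conv : to;
import std.encoding : transcode;
import std.utf : toUTF32;

// Hypothetical helper (not from the thread or from Phobos): wraps
// transcode's out-parameter style so it reads like a function call.
dstring viaTranscode(string s)
{
    dstring result;
    transcode(s, result); // fills result with the UTF-32 form of s
    return result;
}

void main()
{
    string s = "erdem";

    // All three routes should yield the same UTF-32 text.
    assert(to!dstring(s) == "erdem"d);
    assert(toUTF32(s) == "erdem"d);
    assert(viaTranscode(s) == "erdem"d);

    // Same check with a non-ASCII code point (U+00E7).
    assert(viaTranscode("\u00E7ay") == "\u00E7ay"d);
}
```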
June 08, 2015
On Monday, 8 June 2015 at 11:06:07 UTC, Daniel Kozák wrote:
> from: http://dlang.org/phobos/std_encoding.html#.transcode
>
> Supersedes:
> This function supersedes std.utf.toUTF8(), std.utf.toUTF16() and
> std.utf.toUTF32() (but note that to!() supersedes it more conveniently).

BTW, with ldc (ldc -O3 -singleobj -release -boundscheck=off), transcode is the fastest:

f0 time: 1 sec, 115 ms, 48 μs, and 7 hnsecs // to!dstring
f1 time: 449 ms and 329 μs // toUTF32
f2 time: 272 ms, 969 μs, and 1 hnsec // transcode
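
The code for f2 was not posted. A plausible sketch of the three-way benchmark is shown below, assuming f2 simply calls std.encoding.transcode on the same input. It is written against current Phobos, where benchmark lives in std.datetime.stopwatch and returns Duration values directly, so it differs slightly from the 2015 std.datetime version used earlier in the thread:

```d
import std.conv : to;
import std.datetime.stopwatch : benchmark;
import std.encoding : transcode;
import std.stdio : writeln;
import std.utf : toUTF32;

void f0()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
}

void f1()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
}

// Assumption: the original f2 was not shown; this is a guess at its shape.
void f2()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str;
    transcode(somestr, str);
}

void main()
{
    // Run each function one million times and report the total per function.
    auto r = benchmark!(f0, f1, f2)(1_000_000);
    writeln("f0 time: ", r[0]); // to!dstring
    writeln("f1 time: ", r[1]); // toUTF32
    writeln("f2 time: ", r[2]); // transcode
}
```

Absolute timings depend on the compiler, flags, and Phobos version, so the numbers above should not be expected to reproduce exactly.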
June 08, 2015
On Monday, 8 June 2015 at 10:59:45 UTC, Daniel Kozák wrote:
> /// output ///
> f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
> f1 time: 600 ms, 979 μs, and 8 hnsecs

Chances are you're benchmarking the GC. Try benchmark!(f0,f1,f0,f1,f0,f1);
June 08, 2015
On Mon, 08 Jun 2015 11:32:07 +0000
Kagamin via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com>
wrote:

> Chances are you're benchmarking the GC. Try benchmark!(f0,f1,f0,f1,f0,f1);

No difference: even with GC.disable() the results are the same.
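
The exact code used for this check was not posted. A sketch of how both suggestions (interleaved runs and a disabled GC) might be combined is shown below. The reduced iteration count is an assumption made here to bound memory growth while collections are off, and the std.datetime.stopwatch import reflects current Phobos rather than the 2015 API:

```d
import core.memory : GC;
import std.conv : to;
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;
import std.utf : toUTF32;

void f0()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
}

void f1()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
}

void main()
{
    // Assumption: fewer iterations than in the thread, because nothing is
    // collected while the GC is disabled and every call allocates.
    enum iterations = 100_000;

    // Suppress automatic collections so they cannot skew a single run.
    GC.disable();
    scope (exit) GC.enable();

    // Interleave the two functions so that allocator warm-up or heap
    // growth does not systematically favor one of them.
    auto r = benchmark!(f0, f1, f0, f1, f0, f1)(iterations);

    foreach (i, d; r)
        writefln("run %s (f%s): %s", i, i % 2, d);
}
```

If the GC were the main cost, the f0 and f1 timings would be expected to move closer together here; the reply above reports that they did not.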
