July 12, 2007 YASQ - Proper way to convert byte[] <--> string | ||||
---|---|---|---|---|
| ||||
I have a byte[] A that contains an AJP13 packet, presumably including UTF8 strings. I need to extract such strings and to place strings in such a buffer. I'm using:

string s = A[n .. m].dup; // n and m from prefixed string length/position
return s;

to get strings, and

byte[] ba = cast(byte[]) s;
A[n .. n+ba.length] = ba[0 .. $].dup;

to put them. Are these a) sensible, b) optimal?
|
July 12, 2007 Re: YASQ - Proper way to convert byte[] <--> string | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steve Teale | Steve Teale wrote:
> I have a byte[] A that contains an AJP13 packet, presumably including UTF8 strings. I need to extract such strings and to place strings in such a buffer. I'm using:
>
> string s = A[n .. m].dup; // n and m from prefixed string length/position
> return s;
>
> to get strings, and
That should work, and be optimal unless you can be sure the A array doesn't change while you still need the string (in which case the .dup is unnecessary).
> byte[] ba = cast(byte[]) s;
> A[n .. n+ba.length] = ba[0 .. $].dup;
>
> to put them. Are these a) sensible, b) optimal?
This one should work as well, but isn't optimal; the .dup is unnecessary. This should be equivalent but more efficient:
---
A[n .. n+s.length] = cast(byte[]) s;
---
|
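For readers on a current D compiler, the two operations discussed above can be sketched as follows. This is only an illustration, assuming D2, where `string` is `immutable(char)[]`, so the byte slice needs an explicit cast and `.idup` rather than a plain `.dup`. The helper names `getString` and `putString` are hypothetical and not part of the thread, and the offsets are assumed to have been computed already from the packet's length prefix.

```d
import std.stdio;

// Hypothetical helper: copy the UTF-8 bytes in buf[n .. m] into a new,
// independent string, so later changes to buf do not affect the result.
string getString(const(byte)[] buf, size_t n, size_t m)
{
    // Reinterpret the bytes as chars (same element size), then copy.
    return (cast(const(char)[]) buf[n .. m]).idup;
}

// Hypothetical helper: copy the bytes of s into buf starting at offset n
// and return the offset just past the copied bytes.
size_t putString(byte[] buf, size_t n, string s)
{
    // Slice assignment copies the elements, so no intermediate .dup is needed.
    buf[n .. n + s.length] = cast(const(byte)[]) s;
    return n + s.length;
}

void main()
{
    auto buf = new byte[16];
    auto end = putString(buf, 2, "h\u00E9llo"); // the accented e takes two bytes
    writeln(end);
    writeln(getString(buf, 2, end));
}
```

The copy in `getString` is what keeps the returned string valid if the buffer is reused for the next packet; when the buffer is known not to change, the slice could be cast without copying, as noted in the reply above.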
July 12, 2007 Re: YASQ - Proper way to convert byte[] <--> string | ||||
---|---|---|---|---|
| ||||
Posted in reply to Frits van Bommel | Frits van Bommel Wrote:
> Steve Teale wrote:
> > I have a byte[] A that contains an AJP13 packet, presumably including UTF8 strings. I need to extract such strings and to place strings in such a buffer. I'm using:
> >
> > string s = A[n .. m].dup; // n and m from prefixed string length/position
> > return s;
> >
> > to get strings, and
>
> That should work, and be optimal unless you can be sure the A array doesn't change while you still need the string (in which case the .dup is unnecessary).
>
> > byte[] ba = cast(byte[]) s;
> > A[n .. n+ba.length] = ba[0 .. $].dup;
> >
> > to put them. Are these a) sensible, b) optimal?
>
> This one should work as well, but isn't optimal; the .dup is unnecessary. This should be equivalent but more efficient:
> ---
> A[n .. n+s.length] = cast(byte[]) s;
> ---
Can I use n+s.length? In my experimentation I noticed that a UTF8 string containing a character using a two-byte representation definitely had an s.length of the number of characters, which was one less than the number of bytes.
|
July 12, 2007 Re: YASQ - Proper way to convert byte[] <--> string | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steve Teale | Steve Teale wrote:
> Frits van Bommel Wrote:
>
>> ---
>> A[n .. n+s.length] = cast(byte[]) s;
>> ---
>
> Can I use n+s.length? In my experimentation I noticed that a UTF8 string containing a character using a two-byte representation definitely had an s.length of the number of characters, which was one less than the number of bytes.
You noticed wrong...
char[]s in D aren't very special, they're just specific array types that happen to be handled specially by some functions (such as writef*)[1]. The .length is the number of elements, and each element is a fixed size. A char is just a type representing a byte from UTF-8 text.
---
import std.stdio;

void main() {
    auto s = "\u0100";
    writefln(s);
    writefln(s.length);
    writefln((cast(byte[])s).length);
}
---
Outputs a weird character (an A with a - on top) and two times the number 2.

[1]: and by foreach statements as well; they can automagically extract char/wchar/dchar elements from char[]/wchar[]/dchar[], in any combination.
|
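The program above was written for the compiler and library of the time. A rough equivalent for a current D2 toolchain might look like the sketch below, which assumes `writeln` in place of `writefln` (the latter now expects a format string as its first argument) and adds `walkLength` from `std.range` to show the character count that `.length` was mistaken for. The function name `byteAndPointCounts` is made up for this illustration.

```d
import std.stdio;
import std.range : walkLength;

// Hypothetical helper: return [number of UTF-8 code units, number of code points].
size_t[2] byteAndPointCounts(string s)
{
    // .length counts char elements, i.e. bytes of UTF-8;
    // walkLength decodes the string and counts whole code points.
    return [s.length, s.walkLength];
}

void main()
{
    string s = "\u0100"; // one character, encoded as two UTF-8 bytes
    writeln(s);
    writeln(s.length);                          // elements of the char array
    writeln((cast(const(ubyte)[]) s).length);   // same data viewed as raw bytes
    writeln(byteAndPointCounts(s));
}
```

Since the buffer offsets in the original question are byte offsets, `s.length` is the right quantity to add to `n`; the decoded count is only of interest when characters, not bytes, are being counted.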
July 12, 2007 Re: YASQ - Proper way to convert byte[] <--> string | ||||
---|---|---|---|---|
| ||||
Posted in reply to Frits van Bommel | Frits van Bommel Wrote:
> Steve Teale wrote:
> > Frits van Bommel Wrote:
> >
> >> ---
> >> A[n .. n+s.length] = cast(byte[]) s;
> >> ---
> >
> > Can I use n+s.length? In my experimentation I noticed that a UTF8 string containing a character using a two-byte representation definitely had an s.length of the number of characters, which was one less than the number of bytes.
>
> You noticed wrong...
> char[]s in D aren't very special, they're just specific array types that
> happen to be handled specially by some functions (such as writef*)[1].
> The .length is the number of elements, and each element is a fixed size.
> A char is just a type representing a byte from UTF-8 text.
> ---
> import std.stdio;
>
> void main() {
>     auto s = "\u0100";
>     writefln(s);
>     writefln(s.length);
>     writefln((cast(byte[])s).length);
> }
> ---
> Outputs a weird character (an A with a - on top) and two times the number 2.
>
>
> [1]: and by foreach statements as well; they can automagically extract char/wchar/dchar elements from char[]/wchar[]/dchar[], in any combination.
You are correct, I had misinterpreted my own test program.
|
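The footnote quoted above, about foreach extracting char or dchar elements from the same string, can be illustrated with a short sketch. It assumes a current D2 compiler; the function names `countUnits` and `countPoints` are invented for the example.

```d
import std.stdio;

// Hypothetical helper: iterate over the raw UTF-8 code units (bytes).
size_t countUnits(string s)
{
    size_t n;
    foreach (char c; s)
        ++n;
    return n;
}

// Hypothetical helper: asking for dchar makes the loop decode the UTF-8
// data and yield one whole code point per iteration.
size_t countPoints(string s)
{
    size_t n;
    foreach (dchar d; s)
        ++n;
    return n;
}

void main()
{
    string s = "A\u0100"; // a one-byte character followed by a two-byte one
    writeln(countUnits(s), " code units, ", countPoints(s), " code points");
}
```

The element type named in the foreach header is what selects between the two behaviours; the string itself is the same array of char in both loops.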
July 13, 2007 Re: YASQ - Proper way to convert byte[] <--> string | ||||
---|---|---|---|---|
| ||||
Posted in reply to Frits van Bommel | Frits van Bommel wrote:
> Outputs a weird character (an A with a - on top) [...]
Hah, Null-A! Reading A.E. van Vogt?
Regards, Frank
|
July 13, 2007 Re: YASQ - Proper way to convert byte[] <--> string | ||||
---|---|---|---|---|
| ||||
Posted in reply to 0ffh | 0ffh wrote:
> Frits van Bommel wrote:
>> Outputs a weird character (an A with a - on top) [...]
>
> Hah, Null-A! Reading A.E. van Vogt?
No, never heard of him. I just picked \u0100 because it was a round character code and it happened to be that character...
|