Thread overview
String range to dchar.
Oct 31, 2014
Samuel Pike
Oct 31, 2014
Meta
Oct 31, 2014
H. S. Teoh
October 31, 2014
Hi all.

First time posting here. I recently downloaded the dmd compiler and started doing a few exercises with the language. The language features are nice, but I'm still somewhat confused by the library.

If I use byDchar() over a "string", is it possible to get only part of a character, or is it guaranteed that the entire visual character will be in the dchar?

Also, is there a way to peek into a range? Maybe a range that buffers its items when calling peek()?

Thank you
October 31, 2014
On Friday, 31 October 2014 at 00:17:02 UTC, Samuel Pike wrote:
> Hi all.
>
> First time posting here. I recently downloaded the dmd compiler and started doing a few exercises with the language. The language features are nice, but I'm still somewhat confused by the library.
>
> If I use byDchar() over a "string", is it possible to get only part of a character, or is it guaranteed that the entire visual character will be in the dchar?
>
> Also, is there a way to peek into a range? Maybe a range that buffers its items when calling peek()?
>
> Thank you

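You should just be able to use the range's .front property, which does the decoding. It also works as a peek: .front returns the current element without consuming it, and the range only advances when you call popFront(). Calling .front on a normal string without using byDchar also works, because front decodes a full code point by default.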

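import std.range;    // provides front for strings
import std.stdio;

void main()
{
	string s = "中文汉字";
	writeln(s[0]);       // A single UTF-8 code unit (0xE4), usually shown as '?'
	writeln(s.front);    // Prints '中' (the whole decoded code point)
}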
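
To make the "peek" part concrete, here is a minimal sketch, assuming a reasonably recent Phobos where byDchar lives in std.utf. The helper name peekThenPop is made up for illustration and is not a library function. It reads front twice to show that looking at the element does not advance the range, and only then calls popFront.

```d
import std.range;           // front, popFront, empty for strings
import std.utf : byDchar;

// Hypothetical helper (not part of Phobos): peeks at the next code point
// twice, then consumes it and returns the peeked value.
dchar peekThenPop(R)(ref R r)
{
    dchar first = r.front;   // peek: does not advance the range
    dchar again = r.front;   // still the same element
    assert(first == again);
    r.popFront();            // only popFront moves to the next element
    return first;
}

void main()
{
    import std.stdio : writeln;

    auto r = "中文汉字".byDchar;
    writeln(peekThenPop(r));   // 中
    writeln(r.front);          // 文

    string s = "中文汉字";     // a plain string decodes in front as well
    writeln(peekThenPop(s));   // 中
    writeln(s.front);          // 文
}
```

So for most ranges there is no need for a separate buffering peek(): front already is the peek, and popFront is the explicit "consume" step.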
October 31, 2014
On Fri, Oct 31, 2014 at 12:17:00AM +0000, Samuel Pike via Digitalmars-d-learn wrote:
> Hi all.
> 
> First time posting here. I recently downloaded the dmd compiler and started doing a few exercises with the language. The language features are nice, but I'm still somewhat confused by the library.
> 
> If I use byDchar() over a "string", is it possible to get only part of a character, or is it guaranteed that the entire visual character will be in the dchar?
[...]

A dchar corresponds to a Unicode code point, so byDchar() never hands you part of a code point. But a code point doesn't always correspond to a "visual character" (e.g., if you have a base character followed by a combining diacritic, they come out as two dchars). The Unicode term for "visual character" is "grapheme". If you want to process the string by grapheme, use byGrapheme() from std.uni.
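
A small sketch of the difference, assuming std.utf.byDchar and std.uni.byGrapheme are available in your Phobos version. The helper names countCodePoints and countGraphemes are invented for this example. The test string is 'e' followed by U+0301 (combining acute accent), which displays as a single character.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar;

// Hypothetical helpers for illustration only.
size_t countCodePoints(string s)
{
    return s.byDchar.walkLength;     // one element per Unicode code point
}

size_t countGraphemes(string s)
{
    return s.byGrapheme.walkLength;  // one element per grapheme cluster
}

void main()
{
    import std.stdio : writeln;

    string s = "e\u0301";            // 'e' + combining acute accent
    writeln(s.length);               // 3 (UTF-8 code units)
    writeln(countCodePoints(s));     // 2 (code points, i.e. dchars)
    writeln(countGraphemes(s));      // 1 (what the user sees)
}
```

Whether you need byGrapheme depends on the task: for parsing or searching, code points are usually enough, while anything that counts or slices by what is displayed needs graphemes.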


T

-- 
EMACS = Extremely Massive And Cumbersome System