Thread overview
strings and ranges
Aug 15, 2013
Jason den Dulk
Aug 15, 2013
anonymous
Aug 15, 2013
Jonathan M Davis
Aug 15, 2013
monarch_dodra
August 15, 2013
Hello.

While working on my code I noticed that if I use front on a char[], it yields a dchar. Am I correct in concluding that it does a UTF-8 to UTF-32 conversion, and that popFront will skip the whole character, not just a code unit?

Also, does this mean that if I'm creating an output range for char[], will I need to implement a put(dchar) as well as a put(char)?

Thanks
Regards
Jason
August 15, 2013
On Thursday, 15 August 2013 at 00:49:00 UTC, Jason den Dulk wrote:
> While working on my code I noticed that if I use front on a char[], it yields a dchar. Am I correct in concluding that it does a UTF-8 to UTF-32 conversion, and that popFront will skip the whole character, not just a code unit?

yup
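
For reference, here is a minimal sketch of that behaviour, assuming a Phobos where narrow strings are auto-decoded by the range primitives. The helper name countByRange is made up for illustration.

```d
import std.range : ElementType, empty, front, popFront;

/// Hypothetical helper: counts elements the way the range primitives see them.
size_t countByRange(const(char)[] s)
{
    size_t n;
    for (; !s.empty; s.popFront())
        ++n;
    return n;
}

void main()
{
    // As a range, char[] has dchar elements: front decodes UTF-8.
    static assert(is(ElementType!(char[]) == dchar));

    char[] s = "h\u00E9llo".dup;  // U+00E9 takes two UTF-8 code units
    assert(s.length == 6);        // .length counts code units
    assert(countByRange(s) == 5); // the range sees five code points

    s.popFront();                 // drops 'h'
    assert(s.front == '\u00E9');  // front yields the decoded code point
    s.popFront();                 // drops both code units of U+00E9
    assert(s.length == 3);        // "llo" is left
}
```

Note that .length and indexing still work on code units; only front, popFront and friends decode.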

> Also, does this mean that if I'm creating an output range for char[], will I need to implement a put(dchar) as well as a put(char)?

I think you don't need put(char). put(char[]) or put(const(char)[]) could be worthwhile to prevent decoding. But put(dchar) alone would suffice.
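
A rough sketch of what that could look like, not a definitive implementation. Utf8Sink and its data field are names invented for this example; std.utf.encode does the dchar to UTF-8 conversion.

```d
import std.range : isOutputRange;
import std.utf : encode;

/// Hypothetical sink that stores UTF-8 but accepts whole code points.
struct Utf8Sink
{
    char[] data;

    /// One code point: encode it to UTF-8 and append the code units.
    void put(dchar c)
    {
        char[4] buf;
        immutable n = encode(buf, c); // number of code units written
        data ~= buf[0 .. n];
    }

    /// Text that is already UTF-8: append it as-is, no decoding needed.
    void put(const(char)[] s)
    {
        data ~= s;
    }
}

// put(dchar) alone is enough to qualify as an output range of dchar.
static assert(isOutputRange!(Utf8Sink, dchar));

void main()
{
    Utf8Sink sink;
    sink.put("na");      // goes through the const(char)[] overload
    sink.put('\u00EF');  // goes through the dchar overload
    sink.put("ve");
    assert(sink.data == "na\u00EFve");
    assert(sink.data.length == 6); // U+00EF became two code units
}
```

The const(char)[] overload is only an optimization here; dropping it leaves a sink that still works when fed one dchar at a time.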
August 15, 2013
On Thursday, August 15, 2013 02:48:58 Jason den Dulk wrote:
> Hello.
> 
> While working on my code I noticed that if I use front on a char[], it yields a dchar. Am I correct in concluding that it does a UTF-8 to UTF-32 conversion, and that popFront will skip the whole character, not just a code unit?
> 
> Also, does this mean that if I'm creating an output range for
> char[], will I need to implement a put(dchar) as well as a
> put(char)?

All strings are treated as ranges of dchar when using the range APIs, so you pretty much don't do anything with char or wchar where ranges are concerned unless you're optimizing a particular function for narrow strings. There is no reason to implement put(char), just put(dchar). Range-based code shouldn't generally care what type of string it's dealing with, so you wouldn't normally be writing any range-based code that cared about char[] unless you're optimizing a particular function's implementation (in which case, all of that would be internal to the function and wouldn't affect its semantics).
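
To illustrate the point about range-based code not caring which string type it gets, here is a small sketch. The function countNonAscii is hypothetical; it only assumes the standard range primitives and that string, wstring and dstring all present dchar elements.

```d
import std.range : ElementType, empty, front, isInputRange, popFront;

/// Hypothetical helper: counts code points outside the ASCII range.
/// It is written against the range API only, so it never sees code units.
size_t countNonAscii(R)(R r)
if (isInputRange!R && is(ElementType!R == dchar))
{
    size_t n;
    for (; !r.empty; r.popFront())
    {
        if (r.front > 0x7F)
            ++n;
    }
    return n;
}

void main()
{
    // Same function, same answer, whatever the encoding of the argument.
    assert(countNonAscii("na\u00EFve") == 1);  // UTF-8
    assert(countNonAscii("na\u00EFve"w) == 1); // UTF-16
    assert(countNonAscii("na\u00EFve"d) == 1); // UTF-32
}
```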

Here are a couple of Stack Overflow questions that discuss ranges and strings. Perhaps you'll find them useful.

http://stackoverflow.com/questions/16590650/how-to-read-a-string-character-by-character-as-a-range-in-d

http://stackoverflow.com/questions/12288465/std-algorithm-joinerstring-string-why-result-elements-are-dchar-and-not-ch


- Jonathan M Davis

P.S. I really should finish writing the article that I started explaining ranges. So much to do, so little time.
August 15, 2013
On Thursday, 15 August 2013 at 00:49:00 UTC, Jason den Dulk wrote:
> Also, does this mean that if I'm creating an output range for char[], will I need to implement a put(dchar) as well as a put(char)?

Unfortunately, right now, yes. "put" doesn't know how to convert on the fly to the right type.

However, I have an open pull request so that anything that accepts some form of character, or character string, can be fed any form of character, or character stream.
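
In the meantime, the conversion can be done by hand before the data reaches the sink. The sketch below is only one way to do it and says nothing about how the pull request works; DcharSink and putText are names made up for this example.

```d
import std.traits : isSomeString;
import std.utf : encode;

/// Hypothetical sink that only knows how to accept a dchar.
struct DcharSink
{
    char[] data;

    void put(dchar c)
    {
        char[4] buf;
        immutable n = encode(buf, c); // UTF-8 code units for this code point
        data ~= buf[0 .. n];
    }
}

/// Hypothetical adapter: feeds any string type to a sink with put(dchar).
/// foreach with a dchar loop variable decodes UTF-8 and UTF-16 on the fly.
void putText(Sink, S)(ref Sink sink, S text)
if (isSomeString!S)
{
    foreach (dchar c; text)
        sink.put(c);
}

void main()
{
    DcharSink sink;
    putText(sink, "a\u00E9");      // UTF-8 input
    putText(sink, "b\u00E9"w);     // UTF-16 input
    putText(sink, "c\U0001F600"d); // UTF-32 input
    assert(sink.data == "a\u00E9b\u00E9c\U0001F600");
}
```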