January 11, 2007 latin-1 encoding
I'm just starting to look at D, but I can't seem to find any encodings for latin-1 in the standard library...

January 12, 2007 Re: latin-1 encoding
Posted in reply to Simen Haugen | Simen Haugen wrote:
> I'm just starting to look at D, but I can't seem to find any encodings for latin-1 in the standard library...
What are you trying to do? It would be helpful to know whether you want to read files in latin-1 or want your whole program to use it internally.

January 12, 2007 Re: latin-1 encoding
Posted in reply to Johan Granberg | "Johan Granberg" wrote:
> What are you trying to do? It would be helpful to know if you want to read
> files in latin-1 or if you want your whole program to use it internally.
Reading and writing files.

January 12, 2007 Re: latin-1 encoding
Posted in reply to Simen Haugen | Simen Haugen wrote:
> "Johan Granberg" wrote:
>> What are you trying to do? It would be helpful to know if you want to read
>> files in latin-1 or if you want your whole program to use it internally.
>
> Reading and writing files.
There are no string manipulation functions in the standard library that will help you there, but you can read the files as usual and store the data in ubyte[] instead of char[]. If you want to use the string manipulation functions, the easiest approach is to convert to UTF-8; there was some discussion of how to do that a couple of weeks ago.
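
A minimal sketch of the "read them as ubyte[]" suggestion, assuming the whole file fits in memory. The helper names `loadLatin1` and `saveLatin1` are made up for illustration; `std.file.read` and `std.file.write` are the standard library calls doing the work.

```d
import std.file : read, write;

/// Loads a file's raw bytes without interpreting them as UTF-8.
/// (Hypothetical helper name, not part of the standard library.)
ubyte[] loadLatin1(string path)
{
    // std.file.read returns void[]; the cast only reinterprets the bytes.
    return cast(ubyte[]) read(path);
}

/// Writes the bytes back unchanged, so the file stays Latin-1.
/// (Hypothetical helper name, not part of the standard library.)
void saveLatin1(string path, const(ubyte)[] data)
{
    write(path, data);
}
```

Keeping the data in ubyte[] means nothing will mistake it for UTF-8, but it also means the char[]-based string functions cannot be used on it directly.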

January 12, 2007 Re: latin-1 encoding
Posted in reply to Simen Haugen | Simen Haugen wrote:
> I'm just starting to look at D, but I can't seem to find any encodings for latin-1 in the standard library...
>
>
You can try the mango project. It has a package called ICU that does conversions between various encodings and Unicode.

January 12, 2007 Re: latin-1 encoding
Posted in reply to Simen Haugen | Simen Haugen wrote:
> "Johan Granberg" wrote:
>> What are you trying to do? It would be helpful to know if you want to read
>> files in latin-1 or if you want your whole program to use it internally.
>
> Reading and writing files.
Now I'm no expert in character encodings, but isn't Latin-1 just the first 256 codepoints (or whatever they're called) of Unicode, packed into a single byte per character?
If so, it should be pretty trivial to convert latin-1 characters to Unicode, either to wchar[]/dchar[] by direct one-to-one assignment (no multibyte sequences possible) or to char[] by using std.utf.encode, like this:
-----
// warning: incomplete, untested code
import std.utf;

ubyte[] data_lat1;
// ... fill data_lat1 array
char[] data_utf8; // perhaps preallocate this to a reasonable length
foreach(c; data_lat1) {
    // each Latin-1 byte is its own code point; encode appends it as UTF-8
    std.utf.encode(data_utf8, cast(dchar) c);
}
-----
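
For the wchar[]/dchar[] route mentioned above, a rough sketch of the one-to-one assignment could look like this. The function name `latin1ToUtf32` is made up for illustration, and the code is as untested as the rest:

```d
/// Widens Latin-1 bytes to UTF-32: each byte value is the code point itself.
/// (Hypothetical helper name, not part of the standard library.)
dchar[] latin1ToUtf32(const(ubyte)[] latin1)
{
    auto result = new dchar[latin1.length];
    foreach (i, b; latin1)
        result[i] = cast(dchar) b; // no multi-unit sequences are possible
    return result;
}
```

The same loop should work for wchar[], since every Latin-1 character also fits in a single UTF-16 code unit.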
And UTF to Latin-1 should be pretty easy too:
-----
// again: incomplete, untested code
import std.utf;

char[] data_utf; // wchar[] and dchar[] should work as well
ubyte[] data_lat1; // again, preallocate a reasonable array if you want
size_t i = 0;
while(i < data_utf.length) {
    dchar c = std.utf.decode(data_utf, i); // advances i
    assert(c < 0x100); // make sure it fits
    data_lat1 ~= cast(ubyte) c; // explicit narrowing cast, guarded by the assert
}
-----
I should note that by 'preallocate' I mean '"new" an array and set the length to 0'.
Setting the length to 0 is important since otherwise your output will get appended to the end of a default-initialized array, which isn't what you want ;)
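
Putting the two directions together, here is one possible sketch as a pair of functions. Several things in it are assumptions rather than anything from this thread: the names `latin1ToUtf8` and `utf8ToLatin1` are made up, the preallocation uses the built-in `reserve` array property found in newer D compilers instead of the "new, then set length to 0" trick, and characters that do not fit in Latin-1 are replaced with a fallback byte instead of tripping an assert.

```d
import std.utf : decode, encode;

/// Latin-1 -> UTF-8. Each input byte is a code point below 0x100.
/// (Hypothetical helper name.)
char[] latin1ToUtf8(const(ubyte)[] latin1)
{
    char[] utf8;
    // Lower bound only: bytes >= 0x80 take two UTF-8 bytes each.
    utf8.reserve(latin1.length);
    foreach (b; latin1)
        encode(utf8, cast(dchar) b); // appends the encoded character
    return utf8;
}

/// UTF-8 -> Latin-1. Code points that do not fit become `fallback`.
/// (Hypothetical helper name; 0x3F is '?'.)
ubyte[] utf8ToLatin1(const(char)[] utf8, ubyte fallback = 0x3F)
{
    ubyte[] latin1;
    latin1.reserve(utf8.length); // upper bound: at most one byte per input byte
    size_t i = 0;
    while (i < utf8.length)
    {
        dchar c = decode(utf8, i); // advances i past the whole sequence
        if (c < 0x100)
            latin1 ~= cast(ubyte) c;
        else
            latin1 ~= fallback;
    }
    return latin1;
}
```

Whether silently substituting a fallback byte is acceptable depends on the data; throwing an exception for out-of-range characters is the stricter alternative.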
Copyright © 1999-2021 by the D Language Foundation