April 26, 2013 Internationalization vs. Unicode

There are myriad encoding schemes. D natively supports Unicode and provides functionality for it via Phobos. A byproduct of this is that, since ASCII is a subset of Unicode, it also natively supports ASCII. This is a plus for the language, but what of the other encoding schemes? What library functionality is provided to manipulate or convert between those encoding schemes and Unicode?

I have a need to convert from CJK encodings (presently EUC-JP and Shift-JIS) to Unicode. How do I accomplish this using D/Phobos? Is there a standalone library that does this? If so, can someone point me to it? If not, is there planned functionality for inclusion in Phobos, or am I doomed to resorting to Java or some other language to accomplish this task (or at least until I'm educated enough to do it myself)?

Thanks,
Andrew
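For the single-byte legacy encodings, Phobos does ship `std.encoding`, which can transcode to and from the UTF string types. The following is a minimal sketch, assuming the `Latin1String` type and `transcode` function from that module; `latin1ToUtf8` is a hypothetical helper name. As far as I know, `std.encoding` has no EUC-JP or Shift-JIS support, so this illustrates the kind of functionality being asked about rather than a solution to the CJK case.

```d
import std.encoding : Latin1String, transcode;
import std.stdio : writeln;

/// Hypothetical helper (not part of Phobos): decodes ISO-8859-1 bytes
/// into a UTF-8 D string using std.encoding.
string latin1ToUtf8(immutable(ubyte)[] raw)
{
    // Reinterpret the bytes as Latin-1 text; the element size is the same.
    auto latin1 = cast(Latin1String) raw;

    string utf8;
    transcode(latin1, utf8); // fills utf8 with the re-encoded text
    return utf8;
}

void main()
{
    // "café" in ISO-8859-1: 0xE9 is 'é', which takes two bytes in UTF-8.
    immutable(ubyte)[] raw = [0x63, 0x61, 0x66, 0xE9];

    auto text = latin1ToUtf8(raw);
    assert(text == "café");
    assert(text.length == 5); // 5 UTF-8 code units for 4 characters
    writeln(text);
}
```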
April 26, 2013 Re: Internationalization vs. Unicode

Posted in reply to Tyro[17]

On Fri, Apr 26, 2013 at 06:09:48PM -0400, Tyro[17] wrote:
> There are myriad encoding schemes. D natively supports Unicode and provides functionality for it via Phobos. A byproduct of this is that, since ASCII is a subset of Unicode, it also natively supports ASCII. This is a plus for the language, but what of the other encoding schemes? What library functionality is provided to manipulate or convert between those encoding schemes and Unicode?
>
> I have a need to convert from CJK encodings (presently EUC-JP and Shift-JIS) to Unicode. How do I accomplish this using D/Phobos? Is there a standalone library that does this? If so, can someone point me to it? If not, is there planned functionality for inclusion in Phobos, or am I doomed to resorting to Java or some other language to accomplish this task (or at least until I'm educated enough to do it myself)?
[...]

If you're using a POSIX system, you could look into the 'recode' utility to convert from those legacy formats to Unicode before running your program on them. You may be able to figure out how to do the conversion yourself by looking at recode's source code. But AFAIK there is currently no way to do it in D.

Maybe someone should write std.recode and submit it for inclusion into Phobos. ;-)

T

--
People tell me that I'm paranoid, but they're just out to get me.
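The "convert with an external tool first" approach can also be driven from inside a D program by shelling out. Below is a minimal sketch, assuming a POSIX system with the `iconv` command-line tool on the PATH; `iconv` is swapped in for `recode` here only because its `-f`/`-t` flags are standard, and `readAsUtf8` is a hypothetical helper, not a Phobos function. The encoding names accepted (for example `SHIFT_JIS` or `EUC-JP`) depend on the installed iconv.

```d
import std.exception : enforce;
import std.process : execute;
import std.stdio : write;
import std.utf : validate;

/// Hypothetical helper (not part of Phobos): asks the external `iconv`
/// tool to convert a file and returns the result as a UTF-8 string.
string readAsUtf8(string path, string sourceEncoding)
{
    // Assumption: `iconv` is installed and knows `sourceEncoding`.
    auto result = execute(["iconv", "-f", sourceEncoding, "-t", "UTF-8", path]);
    enforce(result.status == 0, "iconv failed: " ~ result.output);

    // Throws if the captured output is not well-formed UTF-8.
    validate(result.output);
    return result.output;
}

void main(string[] args)
{
    enforce(args.length == 3, "usage: app <file> <source-encoding>");
    write(readAsUtf8(args[1], args[2]));
}
```

Run as, say, `./app input.txt SHIFT_JIS`. This keeps all legacy-encoding knowledge outside the D program, at the cost of a process spawn per file and a dependency on the host's converter.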
April 27, 2013 Re: Internationalization vs. Unicode

Posted in reply to Tyro[17]

On 2013-04-27 00:09, Tyro[17] wrote:
> There are myriad encoding schemes. D natively supports Unicode and
> provides functionality for it via Phobos. A byproduct of this is that,
> since ASCII is a subset of Unicode, it also natively supports ASCII.
> This is a plus for the language, but what of the other encoding schemes?
> What library functionality is provided to manipulate or convert between
> those encoding schemes and Unicode?
>
> I have a need to convert from CJK encodings (presently EUC-JP and
> Shift-JIS) to Unicode. How do I accomplish this using D/Phobos? Is there
> a standalone library that does this? If so, can someone point me to it?
> If not, is there planned functionality for inclusion in Phobos, or am I
> doomed to resorting to Java or some other language to accomplish this
> task (or at least until I'm educated enough to do it myself)?

Would ICU do the work? If so, you can take a look at this:

https://github.com/d-widget-toolkit/com.ibm.icu

It will most likely not compile with the latest version of DMD. Also, I don't know how complete it is.

--
/Jacob Carlborg
April 29, 2013 Re: Internationalization vs. Unicode

Posted in reply to Jacob Carlborg

On 4/27/13 6:37 AM, Jacob Carlborg wrote:
> On 2013-04-27 00:09, Tyro[17] wrote:
>> There are myriad encoding schemes. D natively supports Unicode and
>> provides functionality via Phobos. A byproduct of this is that since
>> ASCII is a subset of Unicode, it also natively supports ASCII. This is a
>> plus for the language but what of the other encoding schemes? What
>> library functionality is provided to manipulate or convert between those
>> encoding schemes and Unicode?
>>
>> I have a need to convert from CJK encodings (presently EUC-JP and
>> Shift-JIS) to Unicode. How do I accomplish this using D/Phobos? Is there
>> a standalone library that does this? If so, can someone point me to it?
>> If not, is there planned functionality for inclusion in phobos or am I
>> doomed to resorting to Java or some other language to accomplish this
>> task (or at least until I'm educated enough to do it myself)?
>
> Would ICU do the work? If that's the case you can take a look at this:
>
> https://github.com/d-widget-toolkit/com.ibm.icu
>
> It will most likely not compile with the latest version of DMD. Also I
> don't know how complete it is.
>
This might work. Not sure yet. The first thing that caught my eye was
import java.lang.all;
import java.math.BigInteger;
import java.text.CharacterIterator;
import java.text.ParsePosition;
import java.util.Comparator;
import java.util.Date;
and I was immediately confused. What? We can directly import and use Java in D? Let me try this... Oh! No! Not really! We can't.

Well, since D uses the file system to organize its modules, I should be able to find a java folder with these class signatures, or the D equivalent, somewhere in the project folder. No... I don't see one anywhere.

Looks like I will have to file ICU on my list of things to get educated about. For now I will continue to use the Java implementation I've got.

Thanks.
April 29, 2013 Re: Internationalization vs. Unicode

Posted in reply to Tyro[17]

On Monday, 29 April 2013 at 18:36:32 UTC, Tyro[17] wrote:
> This might work. Not sure yet. The first thing that caught my eye was

You'll find the ported Java source here:

https://github.com/d-widget-toolkit/base/tree/master/src