June 06, 2013 Re: Why UTF-8/16 character encodings?
On 6/5/13 6:11 PM, Timothee Cour wrote:
> currently std.demangle.demangle doesn't work with unicode (see example below)
>
> If we decide to keep allowing unicode symbols (as opposed to just unicode strings/comments), we must
> address this issue. Will supporting this negatively impact performance (of both compile time and
> runtime) ?
>
> Likewise, will linkers + other tools (gdb etc) be happy with unicode in mangled names?
>
> ----
> struct A {
>     int z;
>     void foo(int x) {}
>     void さいごの果実(int x) {}
>     void ªå(int x) {}
> }
> mangledName!(A.さいごの果実).demangle.writeln; => _D4util13demangle_funs1A18さいごの果実MFiZv
> ----
>
Filed in bugzilla?
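
For anyone who wants to reproduce the report locally, here is a minimal, self-contained sketch; it uses druntime's core.demangle rather than the std.demangle call in the report, and the struct/module names are illustrative, so the mangled prefix will differ from the _D4util13demangle_funs one in the example:

----
import std.stdio : writeln;
import std.traits : mangledName;
import core.demangle : demangle;

struct A
{
    int z;
    void foo(int x) {}
    void さいごの果実(int x) {}
}

void main()
{
    // The mangled name embeds the identifier's raw UTF-8 bytes, preceded by
    // their byte count (18 here: six code points of three bytes each).
    writeln(mangledName!(A.さいごの果実));

    // Feed it back through the demangler. demangle() falls back to returning
    // its input unchanged when it cannot parse the name, which is the
    // behavior being reported for Unicode identifiers.
    mangledName!(A.さいごの果実).demangle.writeln;
}
----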
June 17, 2013 Re: Why UTF-8/16 character encodings?
On Jun 5, 2013, at 6:21 PM, Brad Roberts <braddr@puremagic.com> wrote:

> On 6/5/13 6:11 PM, Timothee Cour wrote:
>> currently std.demangle.demangle doesn't work with unicode (see example below)
>>
>> If we decide to keep allowing unicode symbols (as opposed to just unicode strings/comments), we must address this issue. Will supporting this negatively impact performance (of both compile time and runtime) ?
>>
>> Likewise, will linkers + other tools (gdb etc) be happy with unicode in mangled names?
>>
>> ----
>> struct A {
>>     int z;
>>     void foo(int x) {}
>>     void さいごの果実(int x) {}
>>     void ªå(int x) {}
>> }
>> mangledName!(A.さいごの果実).demangle.writeln; => _D4util13demangle_funs1A18さいごの果実MFiZv
>> ----
>
> Filed in bugzilla?

http://d.puremagic.com/issues/show_bug.cgi?id=10393
https://github.com/D-Programming-Language/druntime/pull/524
June 17, 2013 Re: Why UTF-8/16 character encodings?
On Mon, Jun 17, 2013 at 11:37:18AM -0700, Sean Kelly wrote:
> On Jun 5, 2013, at 6:21 PM, Brad Roberts <braddr@puremagic.com> wrote:
>
>> On 6/5/13 6:11 PM, Timothee Cour wrote:
>>> currently std.demangle.demangle doesn't work with unicode (see example below)
>>>
>>> If we decide to keep allowing unicode symbols (as opposed to just unicode strings/comments), we must address this issue. Will supporting this negatively impact performance (of both compile time and runtime) ?
>>>
>>> Likewise, will linkers + other tools (gdb etc) be happy with unicode in mangled names?
>>>
>>> ----
>>> struct A {
>>>     int z;
>>>     void foo(int x) {}
>>>     void さいごの果実(int x) {}
>>>     void ªå(int x) {}
>>> }
>>> mangledName!(A.さいごの果実).demangle.writeln; => _D4util13demangle_funs1A18さいごの果実MFiZv
>>> ----
>>
>> Filed in bugzilla?
>
> http://d.puremagic.com/issues/show_bug.cgi?id=10393
> https://github.com/D-Programming-Language/druntime/pull/524

Do linkers actually support 8-bit symbol names? Or do these have to be translated into ASCII somehow?

T

--
We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare. Now, thanks to the Internet, we know this is not true. -- Robert Wilensky
June 17, 2013 Re: Why UTF-8/16 character encodings?
On Jun 17, 2013, at 11:47 AM, "H. S. Teoh" <hsteoh@quickfur.ath.cx> wrote:
>
> Do linkers actually support 8-bit symbol names? Or do these have to be translated into ASCII somehow?

Good question. It looks like the linker on OSX does:

public _D3abc1A18さいごの果実MFiZv
public _D3abc1A4ªåMFiZv

The object file linked just fine. I haven't tried OPTLINK on Win32 though.
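
For reference, a module of roughly the following shape would produce these two symbols; this is reconstructed from the mangled names themselves, not Sean's actual test file:

----
module abc;

struct A
{
    // _D3abc1A18さいごの果実MFiZv: the 18 is the byte length of the
    // identifier's UTF-8 encoding (six code points, three bytes each).
    void さいごの果実(int x) {}

    // _D3abc1A4ªåMFiZv: ª and å are two bytes each in UTF-8, hence the 4.
    void ªå(int x) {}
}
----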
June 17, 2013 Re: Why UTF-8/16 character encodings?
Posted in reply to Simen Kjaeraas

On 05/31/2013 05:11 AM, Simen Kjaeraas wrote:
> On Fri, 31 May 2013 07:57:37 +0200, Walter Bright
> <newshound2@digitalmars.com> wrote:
>
>> On 5/30/2013 5:00 PM, Peter Williams wrote:
>>> On 31/05/13 05:07, Walter Bright wrote:
>>>> On 5/30/2013 4:24 AM, Manu wrote:
>>>>> We don't all know English. Plenty of people don't.
>>>>> I've worked a lot with Sony and Nintendo code/libraries, for instance,
>>>>> it almost
>>>>> always looks like this:
>>>>>
>>>>> {
>>>>>     // E: I like cake.
>>>>>     // J: ケーキが好きです。
>>>>>     player.eatCake();
>>>>> }
>>>>>
>>>>> Clearly someone doesn't speak English in these massive codebases that
>>>>> power an
>>>>> industry worth 10s of billions.
>>>>
>>>> Sure, but the code itself is written using ASCII!
>>>
>>> Because they had no choice.
>>
>> Not true, D supports Unicode identifiers.
>
> I doubt Sony and Nintendo use D extensively.
>
June 18, 2013 Re: Why UTF-8/16 character encodings?
On 6/17/13 11:58 AM, Sean Kelly wrote:
> On Jun 17, 2013, at 11:47 AM, "H. S. Teoh" <hsteoh@quickfur.ath.cx> wrote:
>>
>> Do linkers actually support 8-bit symbol names? Or do these have to be
>> translated into ASCII somehow?
>
> Good question. It looks like the linker on OSX does:
>
> public _D3abc1A18さいごの果実MFiZv
> public _D3abc1A4ªåMFiZv
>
> The object file linked just fine. I haven't tried OPTLINK on Win32 though.
>
Don't symbol names from dmd/win32 get compressed if they're too long, resulting in essentially arbitrary random binary data being used as symbol names? Assuming my memory on that is correct then it's already demonstrated that optlink doesn't care what the data is.
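
As a rough illustration of why overly long symbols come up at all, the following sketch (names invented here, nothing dmd-specific) prints how quickly mangled names grow as templates nest; that growth is what makes some form of compression attractive:

----
import std.stdio : writeln;
import std.traits : mangledName;

struct Wrap(T)
{
    void f(int x) {}
}

void main()
{
    // Each level of nesting repeats the mangled form of its type argument,
    // so the symbol length climbs quickly with depth.
    writeln(mangledName!(Wrap!int.f).length);
    writeln(mangledName!(Wrap!(Wrap!(Wrap!(Wrap!int))).f).length);
}
----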
June 18, 2013 Re: Why UTF-8/16 character encodings?
Posted in reply to Brad Roberts

On 6/17/2013 6:28 PM, Brad Roberts wrote:
> Don't symbol names from dmd/win32 get compressed if they're too long, resulting
> in essentially arbitrary random binary data being used as symbol names?
> Assuming my memory on that is correct then it's already demonstrated that
> optlink doesn't care what the data is.
Optlink doesn't care what the symbol byte contents are.
June 18, 2013 Re: Why UTF-8/16 character encodings?
Posted in reply to Walter Bright

On Mon, Jun 17, 2013 at 06:49:19PM -0700, Walter Bright wrote:
> On 6/17/2013 6:28 PM, Brad Roberts wrote:
>> Don't symbol names from dmd/win32 get compressed if they're too long, resulting in essentially arbitrary random binary data being used as symbol names? Assuming my memory on that is correct then it's already demonstrated that optlink doesn't care what the data is.
>
> Optlink doesn't care what the symbol byte contents are.

It seems ld on Linux doesn't, either. I just tested separate compilation on some code containing functions and modules with Cyrillic names, and it worked fine. But my system locale is UTF-8; I'm not sure if there may be a problem on other system locales (not that modern systems would actually use anything else, though!).

Might this cause a problem with the VS linker?

T

--
It only takes one twig to burn down a forest.
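
For the record, that kind of separate-compilation check looks roughly like the following two files (the file names and the Cyrillic identifier are made up here, not the actual test case):

----
// lib.d -- compile on its own, e.g. dmd -c lib.d
module lib;

// The mangled symbol for this function embeds the identifier's raw UTF-8
// bytes, so the linker has to accept a non-ASCII symbol name.
int сложить(int a, int b) { return a + b; }

// main.d -- compile and link against the object file, e.g. dmd main.d lib.o
import lib;
import std.stdio : writeln;

void main()
{
    writeln(сложить(2, 3)); // resolved against the Unicode-named symbol at link time
}
----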
June 18, 2013 Re: Why UTF-8/16 character encodings?
Posted in reply to H. S. Teoh

On 6/18/2013 9:44 AM, H. S. Teoh wrote:
> Might this cause a problem with the VS linker?
I doubt it, but try it and see!
June 18, 2013 Re: Why UTF-8/16 character encodings?
Posted in reply to Walter Bright

On Tue, Jun 18, 2013 at 04:33:54PM -0700, Walter Bright wrote:
> On 6/18/2013 9:44 AM, H. S. Teoh wrote:
>> Might this cause a problem with the VS linker?
>
> I doubt it, but try it and see!

Sadly I don't have access to a Windows dev machine. Anybody else care to try?

T

--
Study gravitation, it's a field with a lot of potential.