Thread overview
invalid UTF-8 sequence compiler error
Apr 02, 2007
Cesar Rabak
Apr 02, 2007
Carlos Santander
Apr 02, 2007
Cesar Rabak
Apr 06, 2007
Cesar Rabak
Apr 07, 2007
Thomas Kuehne
Apr 07, 2007
Cesar Rabak
Apr 07, 2007
Lars Ivar Igesund
Apr 08, 2007
Cesar Rabak
Jun 30, 2018
0xFFFFFFFF
April 02, 2007
While testing gdc-0.22-1 for Linux (result of uname -a: Linux fuba 2.6.12-12mdksmp #1 SMP Fri Sep 9 17:43:23 CEST 2005 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz GNU/Linux), I got the error "invalid UTF-8 sequence" even when the 'offending' character is inside a comment.

Since the gcc counterpart does not complain about similar code with locale-specific characters (mainly accented chars), I wonder:

Is there a way to have a gdc that can work with accented characters in strings and comments?

Regards,

--
Cesar Rabak
April 02, 2007
Cesar Rabak wrote:
> While testing gdc-0.22-1 for Linux (result of uname -a: Linux fuba 2.6.12-12mdksmp #1 SMP Fri Sep 9 17:43:23 CEST 2005 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz GNU/Linux), I got the error "invalid UTF-8 sequence" even when the 'offending' character is inside a comment.
> 
> Since the gcc counterpart does not complain about similar code with locale-specific characters (mainly accented chars), I wonder:
> 
> Is there a way to have a gdc that can work with accented characters in strings and comments?
> 
> Regards,
> 
> -- 
> Cesar Rabak

This is a D feature, not a GDC problem. Save your file as UTF-8, -16, or -32, and it'll work.
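
To see why the compiler objects even inside a comment, it helps to look at the bytes. The sketch below is an illustration only, assuming a current D2 compiler with Phobos (not the 2007 gdc from this thread); `isValidUtf8` is a hypothetical helper name. It checks the same comment line stored as ISO-8859-1 and as UTF-8, using `std.utf.validate`:

```d
import std.stdio : writeln;
import std.utf : validate, UTFException;

/// Hypothetical helper: true if `bytes` is well-formed UTF-8.
bool isValidUtf8(const(ubyte)[] bytes)
{
    try
    {
        // validate throws UTFException on the first malformed sequence.
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // The line "// á" saved as ISO-8859-1: 'á' is the single byte 0xE1.
    immutable ubyte[] latin1 = [0x2F, 0x2F, 0x20, 0xE1, 0x0A];
    // The same line saved as UTF-8: 'á' is the two bytes 0xC3 0xA1.
    immutable ubyte[] utf8 = [0x2F, 0x2F, 0x20, 0xC3, 0xA1, 0x0A];

    writeln(isValidUtf8(latin1)); // prints false
    writeln(isValidUtf8(utf8));   // prints true
}
```

In ISO-8859-1 the letter á is the byte 0xE1, which UTF-8 reads as the start of a three-byte sequence. The newline that follows is not a continuation byte, so the sequence is malformed. Re-saving the file as UTF-8 stores á as 0xC3 0xA1 instead.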

-- 
Carlos Santander Bernal
April 02, 2007
Carlos Santander wrote:
[snipped]
> 
> This is a D feature, not a GDC problem. Save your file as UTF-8, -16, or -32, and it'll work.
> 

It will take time to convince my folks it is a feature but I'll try :-)

Thanks,

--
Cesar Rabak
April 06, 2007
Cesar Rabak wrote:
> Carlos Santander wrote:
> [snipped]
>>
>> This is a D feature, not a GDC problem. Save your file as UTF-8, -16, or -32, and it'll work.
>>
> 
> It will take time to convince my folks it is a feature but I'll try :-)
> 
In a chat with the stakeholders I learned that for the source I can use whatever encoding I want, but for the message strings I'll need to abide by the sysadmin's decisions, which may not be UTF-.

How is this solved with the D compiler?

-- 
 Cesar Rabak
April 07, 2007
Cesar Rabak wrote on 2007-04-06:
> Cesar Rabak wrote:
>> Carlos Santander wrote:
>> [snipped]
>>>
>>> This is a D feature, not a GDC problem. Save your file as UTF-8, -16, or -32, and it'll work.
>>>
>> 
>> It will take time to convince my folks it is a feature but I'll try :-)
>> 
> In a chat with the stakeholders I learned that for the source I can use whatever encoding I want, but for the message strings I'll need to abide by the sysadmin's decisions, which may not be UTF-.
>
> How is this solved with the D compiler?

If you simply store and print messages, you could use "ubyte[]". For everything
else, like searching, concatenating, etc., more information about the
encoding(s) is required.
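
A minimal sketch of the store-and-print case, assuming a current D2 compiler with Phobos and a terminal that expects ISO-8859-1 (both are assumptions, not details from this thread). The name `greeting` is only illustrative. The message is kept as raw bytes and written unchanged, so the source file stays plain ASCII and D never treats the bytes as text:

```d
import std.stdio : stdout;

// "Olá, mundo\n" encoded as ISO-8859-1; 0xE1 is 'á' in that encoding.
// Written as numeric bytes so the source file itself contains only ASCII.
immutable ubyte[] greeting = [
    0x4F, 0x6C, 0xE1, 0x2C, 0x20,       // "Olá, "
    0x6D, 0x75, 0x6E, 0x64, 0x6F, 0x0A  // "mundo\n"
];

void main()
{
    // rawWrite copies the bytes to stdout without any text conversion.
    stdout.rawWrite(greeting);
}
```

This only covers passing bytes through. Anything that depends on character boundaries (searching, slicing, case changes) still needs to know the encoding.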

Thomas

April 07, 2007
Thomas Kuehne wrote:
> Cesar Rabak wrote on 2007-04-06:
>> Cesar Rabak wrote:
>>> Carlos Santander wrote:
>>> [snipped]
>>>> This is a D feature, not a GDC problem. Save your file as UTF-8, -16, or -32, and it'll work.
>>>>
>>> It will take time to convince my folks it is a feature but I'll try :-)
>>>
>> In a chat with the stakeholders I learned that for the source I can use whatever encoding I want, but for the message strings I'll need to abide by the sysadmin's decisions, which may not be UTF-.
>>
>> How is this solved with the D compiler?
> 
> If you simply store and print messages, you could use "ubyte[]". For everything
> else, like searching, concatenating, etc., more information about the
> encoding(s) is required.
> 
Thanks Thomas.

To go further in evaluating D for new projects, storing and printing messages is all I need for now.

If a lot of text data needs to be processed later, I might have to consider an input method that converts it to an internal representation (and an output one, too).
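
A rough sketch of what such an input/output pair could look like, assuming a current D2 compiler with Phobos's `std.encoding` and assuming the external encoding is ISO-8859-1 (neither is confirmed in this thread). `fromLatin1` and `toLatin1` are hypothetical names. UTF-8 `string` is the internal representation, and conversion happens only at the edges:

```d
import std.encoding : Latin1String, transcode;
import std.stdio : stdout;

/// Input side (hypothetical helper): ISO-8859-1 bytes -> UTF-8 string.
string fromLatin1(immutable(ubyte)[] raw)
{
    string utf8;
    transcode(cast(Latin1String) raw, utf8);
    return utf8;
}

/// Output side (hypothetical helper): UTF-8 string -> ISO-8859-1 bytes.
immutable(ubyte)[] toLatin1(string text)
{
    Latin1String latin1;
    transcode(text, latin1);
    return cast(immutable(ubyte)[]) latin1;
}

void main()
{
    immutable ubyte[] raw = [0x4F, 0x6C, 0xE1]; // "Olá" in ISO-8859-1

    string text = fromLatin1(raw);   // now safe for normal string handling
    assert(text == "Ol\u00E1");      // 'á' occupies two bytes in UTF-8
    assert(toLatin1(text) == raw);   // round trip restores the input bytes

    // Write the converted bytes unchanged, as the external side expects.
    stdout.rawWrite(toLatin1(text ~ "\n"));
}
```

Characters with no ISO-8859-1 equivalent cannot survive the output step, so it is worth checking how the chosen conversion routine treats them before relying on it.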

--
Cesar Rabak
April 07, 2007
Cesar Rabak wrote:

> Thomas Kuehne wrote:
>> Cesar Rabak wrote on 2007-04-06:
>>> Cesar Rabak wrote:
>>>> Carlos Santander wrote:
>>>> [snipped]
>>>>> This is a D feature, not a GDC problem. Save your file as UTF-8, -16, or -32, and it'll work.
>>>>>
>>>> It will take time to convince my folks it is a feature but I'll try :-)
>>>>
>>> In a chat with the stakeholders I learned that for the source I can use whatever encoding I want, but for the message strings I'll need to abide by the sysadmin's decisions, which may not be UTF-.
>>>
>>> How is this solved with the D compiler?
>> 
>> If you simply store and print messages, you could use "ubyte[]". For everything else, like searching, concatenating, etc., more information about the encoding(s) is required.
>> 
> Thanks Thomas.
> 
> To go further in evaluating D for new projects, storing and printing messages is all I need for now.
> 
> If a lot of text data needs to be processed later, I might have to consider an input method that converts it to an internal representation (and an output one, too).
> 
> --
> Cesar Rabak

If you should find yourself in need of such conversion routines, consider using the ICU (IBM package for such things) bindings in the Mango library, see http://www.dsource.org/projects/mango

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
April 08, 2007
Lars Ivar Igesund wrote:
> Cesar Rabak wrote:
> 
>> Thomas Kuehne wrote:
[snipped]

>> Thanks Thomas.
>>
>> To go further in evaluating D for new projects, storing and printing
>> messages is all I need for now.
>>
>> If a lot of text data needs to be processed later, I might have to
>> consider an input method that converts it to an internal representation
>> (and an output one, too).
>>
>> --
>> Cesar Rabak
> 
> If you should find yourself in need of such conversion routines, consider
> using the ICU (IBM package for such things) bindings in the Mango library,
> see http://www.dsource.org/projects/mango
> 

Thank you, Lars.
June 30, 2018
On Monday, 2 April 2007 at 01:37:27 UTC, Carlos Santander wrote:
> Cesar Rabak wrote:
>> While testing gdc-0.22-1 for Linux (result of uname -a: Linux fuba 2.6.12-12mdksmp #1 SMP Fri Sep 9 17:43:23 CEST 2005 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz GNU/Linux), I got the error "invalid UTF-8 sequence" even when the 'offending' character is inside a comment.
>> 
>> Since the gcc counterpart does not complain about similar code with locale-specific characters (mainly accented chars), I wonder:
>> 
>> Is there a way to have a gdc that can work with accented characters in strings and comments?
>> 
>> Regards,
>> 
>> --
>> Cesar Rabak
>
> This is a D feature, not a GDC problem. Save your file as UTF-8, -16, or -32, and it'll work.

Even though I didn't ask the question, this solves my problem too. Thanks.