char vs ascii (page 2)

Walter wrote:
>
> What do people think about using the keyword:
>
> ascii or char?
> unicode or wchar?
>

My votes would be "char" and "unicode". Erik makes a good case against "ascii". "wchar" and "wchar_t" are ugly C-committeeisms, IMO.

-Russell B

Sheldon Simms wrote in message <9lgvsh$2jb7$1@digitaldaemon.com>...
>In article <9levtq$10ji$1@digitaldaemon.com>, "Walter" <walter@digitalmars.com> wrote:
>
>> I've found I've wanted to support both ascii and unicode simultaneously in programs, hence I thought two different types was appropriate. I was constantly irritated by having to go through and either subtract or add L's in front of the strings. The macros to do it automatically are ugly. Hence, the idea that the string literals should be implicitly convertible to either char[] or wchar[].
>
>Well it seems to me that you already have standard-size integral types: byte, short, int, long.
>
>Why not make char be a 2 or 4-byte unicode char and use the syntax
>
>byte[] str = "My ASCII string";
>
>for ascii?

It seems useful to be able to overload char and byte separately.
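For readers who have not run into the problem Walter describes, here is a minimal sketch of it in C, not anything taken from D itself. It shows the same text spelled once as a narrow literal and once with the L prefix, plus a switch macro of the kind he calls ugly. The names TEXT_LIT, text_char, USE_WIDE_STRINGS and text_length are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time switch: TEXT_LIT adds or omits the L prefix
   so that one source line can yield either a narrow or a wide literal. */
#ifdef USE_WIDE_STRINGS
#define TEXT_LIT(s) L##s
typedef wchar_t text_char;
#else
#define TEXT_LIT(s) s
typedef char text_char;
#endif

/* Without the macro, having both forms at once means spelling the text twice. */
const char    narrow_greeting[] = "hello";
const wchar_t wide_greeting[]   = L"hello";

/* Count the characters before the terminating zero, for whichever
   character type the switch selected. */
size_t text_length(const text_char *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != 0) {
        n++;
    }
    return n;
}
```

A call such as text_length(TEXT_LIT("hello")) compiles in either configuration, but every literal in the program has to be wrapped in the macro. That wrapping is what a literal implicitly convertible to either char[] or wchar[] would make unnecessary.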

Jeff Frohwein wrote:
> Walter wrote:
>
>> What do people think about using the keyword:
>>
>> ascii or char?
>> unicode or wchar?
>>
>
> ...
>
> u8,s8,u16,s16,u32,s32,...
>
> ...
> As 128 bit and 256 bit systems are released, adding new types
> would be as easy as u128,s128,u256,s256... rather than have to
> consider something like "long long long long", or a new name in
> general. Those that want to use vague types can always typedef
> their own types.
>
> Thanks for listening, :)
>
> Jeff
>

That's a good idea. These could be the basic language defined types, and then a "standard library" could include typedefs for the types that people are more familiar with. This would allow code to be written that could easily adapt to changing word sizes, be fixed for particular sizes, or both, and still be fairly portable.
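As a rough illustration of the layering suggested here, the following C sketch builds Jeff's size-explicit names on top of the C99 <stdint.h> types and then defines friendlier aliases over them. The aliases ascii_char and wide_char, the BITS_OF macro, and both helper functions are hypothetical; the thread does not say how D would spell any of this.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Size-explicit names in the style Jeff proposes. */
typedef uint8_t  u8;
typedef int8_t   s8;
typedef uint16_t u16;
typedef int16_t  s16;
typedef uint32_t u32;
typedef int32_t  s32;

/* A "standard library" layer could map familiar names onto them.
   These two aliases are assumptions made for this sketch. */
typedef u8  ascii_char;
typedef u16 wide_char;

/* Width of a type in bits. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Runtime check that each name matches its advertised width. */
void check_widths(void)
{
    assert(BITS_OF(u8) == 8 && BITS_OF(s8) == 8);
    assert(BITS_OF(u16) == 16 && BITS_OF(s16) == 16);
    assert(BITS_OF(u32) == 32 && BITS_OF(s32) == 32);
}

/* Unsigned fixed-width arithmetic wraps around predictably. */
u8 next_u8(u8 value)
{
    return (u8)(value + 1u);
}
```

Code that needs an exact width uses the u/s names directly, while code that only cares about "a character" uses an alias that the library can point at a different width later.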

Jeff Frohwein <"jeff "@ SPAMLESSdevrs.com> wrote:
> Thanks for listening, :)

Oh, I am reading all of this stuff. It's a lot of fun, and people have great ideas. I'm a little surprised at the sheer volume of replies and comments!

-Walter

Oh, I hate the "_t" suffix too. I'd love to name it unicode, but since there is a Unicode, Inc., I don't think I can.

Russell Bornschlegel wrote in message <3B7C4455.ADFB4496@estarcion.com>...
>
>Walter wrote:
>>
>> What do people think about using the keyword:
>>
>> ascii or char?
>> unicode or wchar?
>>
>
>My votes would be "char" and "unicode". Erik makes a good case against "ascii". "wchar" and "wchar_t" are ugly C-committeeisms, IMO.
>
>-Russell B

Walter wrote in message <9lk4ij$2d7a$2@digitaldaemon.com>...
>Oh, I hate the "_t" suffix too. I'd love to name it unicode, but since there
>is a Unicode, Inc., I don't think I can.

I checked. Unicode is a registered trademark of Unicode, Inc. They specifically say that "unicode" can't be included in a product. Oh well.

I guess that's why the ANSI committee picked "wchar_t".

Looks like "wchar" is what D will use.

-Walter

"Walter" <walter@digitalmars.com> wrote in message news:9lk4vh$2dj3$1@digitaldaemon.com... > I checked. Unicode is a registered trademark of Unicode, Inc. They specifically say that "unicode" can't be included in a product. Oh well. > > I guess that's why the ANSI committee picked "wchar_t". XML uses UTF, so you could think about using 'utf' as one possible keyword. --Kent

Kent Sandvik wrote:
> "Walter" <walter@digitalmars.com> wrote in message news:9lk4vh$2dj3$1@digitaldaemon.com...
>
> > I checked. Unicode is a registered trademark of Unicode, Inc. They specifically say that "unicode" can't be included in a product. Oh well.
> >
> > I guess that's why the ANSI committee picked "wchar_t".
>
> XML uses UTF, so you could think about using 'utf' as one possible keyword. --Kent

Any clarification what UTF might mean? It's not necessarily obvious. Neither is wchar...but it's closer.

August 18, 2001

Re: char vs ascii

Posted by Kent Sandvik
in reply to Russ Lewis

Google is our friend. UTF, or actually UTF-8, is one encoding scheme; UTF stands
for UCS Transformation Format, and UCS, or Universal Character Set, is actually
more in line with the Unicode definition. Anyway, if those buzz words are too
unknown, then wchar_t maybe is the way to go. --Kent

"Russ Lewis" <russ@deming-os.org> wrote in message news:3B7D9CA1.6A25385C@deming-os.org...
> Kent Sandvik wrote:
>
> > "Walter" <walter@digitalmars.com> wrote in message news:9lk4vh$2dj3$1@digitaldaemon.com...
> >
> > > I checked. Unicode is a registered trademark of Unicode, Inc. They specifically say that "unicode" can't be included in a product. Oh well.
> > >
> > > I guess that's why the ANSI committee picked "wchar_t".
> >
> > XML uses UTF, so you could think about using 'utf' as one possible keyword. --Kent
>
> Any clarification what UTF might mean?  It's not necessarily obvious. Neither
> is wchar...but it's closer.
>

In article <9lk4vh$2dj3$1@digitaldaemon.com>, Walter wrote:
>
> I checked. Unicode is a registered trademark of Unicode, Inc. They specifically say that "unicode" can't be included in a product. Oh well. I guess that's why the ANSI committee picked "wchar_t".
>
> Looks like "wchar" is what D will use.

Please don't. I say, make form follow function. wchar is a throwback to some weird ansi'ism, having "wide char's". That's stupid. If you want to have D handle strings natively, *and* you want it to be some sort of internationalized version of a string, make it be a string, or even a "char", or "character". Make it sufficiently different from C, such that people will know. For 1-byte things, use the type "byte".

Say what you mean, mean what you say. wchar? If you use UTF, it could be vchar (variable length), etc...

--Toby.
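To make the "variable length" remark concrete, here is a small C sketch of a UTF-8 encoder that turns one code point into one to four bytes. It follows UTF-8 as it is defined today (RFC 3629, which caps sequences at four bytes), which is narrower than the definition in use when this thread was written. The function name utf8_encode is hypothetical and is not part of any D or C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8 into out, which must hold 4 bytes.
   Returns the number of bytes written, or 0 if cp is a surrogate
   or lies beyond U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                       /* 7 bits: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* up to 11 bits */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {    /* surrogates are not encodable */
        return 0;
    }
    if (cp < 0x10000) {                    /* up to 16 bits */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* up to 21 bits */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

ASCII text costs one byte per character under this scheme, which is why a byte-sized character type and a Unicode string type are not necessarily in conflict.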
