Thread overview: drop ASCII characters from D?
Apr 05, 2002  J. Daniel Smith
Apr 05, 2002  Pavel Minayev
Apr 05, 2002  J. Daniel Smith
Apr 05, 2002  roland
Apr 05, 2002  Walter
Apr 06, 2002  OddesE
Apr 06, 2002  Walter
Apr 08, 2002  J. Daniel Smith
Apr 08, 2002  Walter
Apr 09, 2002  J. Daniel Smith
Apr 09, 2002  Walter
Apr 06, 2002  Pavel Minayev
April 05, 2002
Walter's comment in the "Delegates" thread about the code

    void foo(char[]);
    void foo(wchar[]);
    ...
    foo("hello");
being ambiguous made me wonder about the point of supporting ASCII
characters in the first place.



Why not drop "wchar" and make "char" always mean a 2-byte UNICODE character (or even a 4-byte ISO 10646 character)?  With the release of Windows XP last fall, the need for ASCII support is going to diminish as WinXP replaces Win98/WinME.



If you really need a single-byte character to interface with legacy APIs, use "ubyte" (or "ulong" for "wchar_t" APIs on *IX) and convert to/from "char" as needed.  Yes, that makes such code more difficult, but it should all be buried in some class anyway.



It seems that supporting both in D goes against current trends (Java, VB and C# are all UNICODE-only); it also implicitly encourages the continued use of ASCII, a decision that is usually regretted in real-world applications.



   Dan





April 05, 2002
"J. Daniel Smith" <j_daniel_smith@HoTMaiL.com> wrote in message news:a8kprh$4h2$1@digitaldaemon.com...

> Walter's comment in the "Delegates" thread about the code
>
>     void foo(char[]);
>     void foo(wchar[]);
>     ...
>     foo("hello");
> being ambiguous made me wonder about the point of supporting ASCII
> characters in the first place.
>
>
>
> Why not drop "wchar" and make "char" always means a 2-byte UNICODE
> character (or even a 4-byte ISO10646 character).  With the release of
> Windows XP last fall, the need for ASCII support is going to diminish as
> WinXP replaces Win98/WinME.

I believe no more than 10% of my friends have WinNT, 2K or XP. Your suggestion would make it very hard to write programs that run on the 9x series, which is still the most popular.

> If you really need a single byte character to interface with legacy APIs,
> use "ubyte" (or "ulong" for "wchar_t" APIs on *IX) convert to/from "char"
> as needed.  Yea, that makes such code more difficult, but it should all be
> burried in some class anyway.

You can convert a single char; but what about strings? D doesn't convert arrays, AFAIK...
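[For the case Pavel raises, a plain loop suffices when the bytes are known to be ASCII, since ASCII code points map one-to-one onto Unicode. A minimal C sketch; the helper name `widen_ascii` is made up for illustration and is not part of any D or C library:]

```c
#include <stddef.h>

/* Hypothetical helper: widen a NUL-terminated single-byte string into a
 * caller-supplied buffer of 4-byte code points.  Only valid when every
 * byte is plain ASCII (0x00..0x7F), where code point == byte value. */
static void widen_ascii(const char *src, unsigned int *dst, size_t dst_len)
{
    size_t i = 0;
    if (dst_len == 0)
        return;
    while (src[i] != '\0' && i + 1 < dst_len) {
        dst[i] = (unsigned char)src[i];  /* ASCII maps 1:1 to Unicode */
        i++;
    }
    dst[i] = 0;  /* terminate the wide string */
}
```

[Going the other direction needs a real transcoding step for anything outside the ASCII range, which is exactly why a language-level conversion between char[] and wchar[] is not trivial.]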

> It seems that supporting both in D goes against current trends (Java, VB
> and C# are all UNICODE-only); it also implicitly encourages the continued
> use of

VB is bloated, partially because of UNICODE-only strings. Java is platform-independent and doesn't really care about the underlying system. C# is Microsoft's reply to Java, and is bloated as well.

D is a practical tool. Since most systems and most programs today still work with ASCII strings, they should be in the language.






April 05, 2002
Today there are still a lot of Win9x/WinME boxes out there, but that's not going to be the case for long.  I don't know what Walter's timeline is for officially releasing D to the world, but let's just say it's 1-Jan-2003. Add another year onto that for people to actually start adopting the language and developing/shipping programs en masse and we're at 2004.  I think the Win9x/WinME numbers will look a lot different in 18+ months.  I don't think it's much of a stretch to say that in the not-too-distant future, ASCII will largely be considered legacy.

If you don't want to drop ASCII support completely from D, how about making it (much) more difficult to use by making UNICODE the default? "char" is UNICODE, "achar" is ASCII; a string/character literal is UNICODE, you have to use an ugly A prefix to get ASCII; there are no implicit conversions between UNICODE/ASCII - you've got to call some library routine (or maybe cast) instead.

D is a new language, it should look to the future.

   Dan

"Pavel Minayev" <evilone@omen.ru> wrote in message news:a8kttp$ued$1@digitaldaemon.com...
> "J. Daniel Smith" <j_daniel_smith@HoTMaiL.com> wrote in message news:a8kprh$4h2$1@digitaldaemon.com...
>
> > Walter's comment in the "Delegates" thread about the code
> >
> >     void foo(char[]);
> >     void foo(wchar[]);
> >     ...
> >     foo("hello");
> > being ambiguous made me wonder about the point of supporting ASCII
> > characters in the first place.
> >
> >
> >
> > Why not drop "wchar" and make "char" always means a 2-byte UNICODE
> > character (or even a 4-byte ISO10646 character).  With the release of
> > Windows XP last fall, the need for ASCII support is going to diminish as
> > WinXP replaces Win98/WinME.
>
> I believe no more than 10% of my friends have WinNT, 2K or XP. Your suggestion would make it very hard to write programs that run on 9x series, which is still most popular.
>
> > If you really need a single byte character to interface with legacy APIs,
> > use "ubyte" (or "ulong" for "wchar_t" APIs on *IX) convert to/from "char"
> > as needed.  Yea, that makes such code more difficult, but it should all
> > be burried in some class anyway.
>
> You can convert a single char; but what about strings? D doesn't convert arrays, AFAIK...
>
> > It seems that supporting both in D goes against current trends (Java, VB
> > and C# are all UNICODE-only); it also implicitly encourages the
> > continued use of
>
> VB is bloated, partially because of UNICODE-only strings. Java is platform-independent and doesn't really care of underlying system. C# is Microsoft's reply to Java, and is bloated as well.
>
> D is a practical tool. Since most systems and most programs today still work with ASCII strings, they should be in the language.
>
>
>
>
>
>


April 05, 2002
"J. Daniel Smith" wrote:

> Today there are still a lot of Win9x/WinME boxes out there, but that's not going to be the case for long.  I don't know what Walter's timeline is for officially releasing D to the world, but let's just say it's 1-Jan-2003. Add another year onto that for people to actually start adopting the language and developing/shipping programs en-masse and we're to 2004.  I think the Win9x/WinME numbers will look a lot different in 18+ months.  I don't think it's much of a stretch to say that in the not too distant future, ASCII will largely be considered legacy.
>
> If you don't want to drop ASCII support completely from D, how about making it (much) more difficult to use by making UNICODE the default? "char" is UNICODE, "achar" is ASCII; a string/character literal is UNICODE, you have to use an ugly A prefix to get ASCII; there are no implicit conversions between UNICODE/ASCII - you've got to call some library routine (or maybe cast) instead.
>
> D is a new language, it should look to the future.
>
>    Dan
>
> "Pavel Minayev" <evilone@omen.ru> wrote in message news:a8kttp$ued$1@digitaldaemon.com...
> > "J. Daniel Smith" <j_daniel_smith@HoTMaiL.com> wrote in message news:a8kprh$4h2$1@digitaldaemon.com...
> >
> > > Walter's comment in the "Delegates" thread about the code
> > >
> > >     void foo(char[]);
> > >     void foo(wchar[]);
> > >     ...
> > >     foo("hello");
> > > being ambiguous made me wonder about the point of supporting ASCII
> > > characters in the first place.
> > >
> > >
> > >
> > > Why not drop "wchar" and make "char" always means a 2-byte UNICODE
> > > character (or even a 4-byte ISO10646 character).  With the release of
> > > Windows XP last fall, the need for ASCII support is going to diminish
> > > as WinXP replaces Win98/WinME.
> >
> > I believe no more than 10% of my friends have WinNT, 2K or XP. Your suggestion would make it very hard to write programs that run on 9x series, which is still most popular.
> >
> > > If you really need a single byte character to interface with legacy
> > > APIs, use "ubyte" (or "ulong" for "wchar_t" APIs on *IX) convert
> > > to/from "char" as needed.  Yea, that makes such code more difficult,
> > > but it should all be burried in some class anyway.
> >
> > You can convert a single char; but what about strings? D doesn't convert arrays, AFAIK...
> >
> > > It seems that supporting both in D goes against current trends (Java,
> > > VB and C# are all UNICODE-only); it also implicitly encourages the
> > > continued use of
> >
> > VB is bloated, partially because of UNICODE-only strings. Java is platform-independent and doesn't really care of underlying system. C# is Microsoft's reply to Java, and is bloated as well.
> >
> > D is a practical tool. Since most systems and most programs today still work with ASCII strings, they should be in the language.
> >
> >
> >
> >
> >
> >

How does Linux handle character size?
I personally see my future closer to Linux than to XP.

roland




April 05, 2002
"roland" <nancyetroland@free.fr> wrote in message news:3CAE2667.1375F510@free.fr...
> How does Linux handle character size?
> I personally see my future closer to Linux than to XP.

Linux uses 4 byte wchars. This uses up memory real fast. Unicode may be the future, but it is still many years away, and D should be agnostic about whether the app is ASCII or Unicode.


April 06, 2002
"J. Daniel Smith" <j_daniel_smith@HoTMaiL.com> wrote in message news:a8l5o5$17di$1@digitaldaemon.com...

> D is a new language, it should look to the future.

Unicode is not necessarily the future. Not in the next few years, at least.

And after all, you can always use alias in your programs:

    alias wchar Char;



April 06, 2002
"Walter" <walter@digitalmars.com> wrote in message news:a8lbqb$1d7s$1@digitaldaemon.com...
>
> "roland" <nancyetroland@free.fr> wrote in message news:3CAE2667.1375F510@free.fr...
> > How does Linux handle character size?
> > I personally see my future closer to Linux than to XP.
>
> Linux uses 4 byte wchars. This uses up memory real fast. Unicode may be the
> future, but it is still many years away, and D should be agnostic about
> whether the app is ASCII or Unicode.
>
>

I think memory shouldn't be a concern. I don't think text, as in characters
and strings of characters, is the real memory user in today's computing, is
it? A typical e-mail or document isn't big at all. It's things like images,
textures in games and audio and video that consume most space on disk or in
memory.

I agree that ASCII, although it isn't dead, deserves to die. I think the
idea to make 32-bit characters the standard is a good one, although I have
to admit I don't know much about the standardisation that is going on in
that field. But I do know that 256 characters is way too little!

Maybe it is just too early to make a decision, when the standardisation
hasn't settled down...


--
Stijn
OddesE_XYZ@hotmail.com
http://OddesE.cjb.net
_________________________________________________
Remove _XYZ from my address when replying by mail



April 06, 2002
"OddesE" <OddesE_XYZ@hotmail.com> wrote in message news:a8n3tp$18pi$1@digitaldaemon.com...
> I think memory shouldn't be a concern. I
> don't think text, as in characters and
> strings of characters, is the real memory
> user in today's computing is it? A
> typical e-mail or document isn't big at
> all. It's things like images, textures in
> games and audio and video that consume
> most space on disk or in memory.

It still is a concern. I have an app on Linux with wchars, and it still uses 200 megs of RAM, mostly because of the 4 bytes per char. Secondly, if you're distributing an executable with a lot of text strings in it, it can bloat up the download size quite a bit. I can also neatly fit all my source code on a CD. I don't want it 4 times bigger <g>.
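[Walter's figure is easy to sanity-check: storage scales linearly with bytes per code unit, so the same text in 4-byte wide characters costs exactly four times its single-byte footprint. A small C sketch of that arithmetic; the helper name `text_bytes` is purely illustrative:]

```c
#include <stddef.h>

/* Bytes needed to store n_chars characters at a given code-unit width,
 * counting one terminating code unit as well. */
static size_t text_bytes(size_t n_chars, size_t unit_width)
{
    return (n_chars + 1) * unit_width;
}
```

[On that model, roughly 50 million characters of text stored as 4-byte wchars lands around 200 MB, which lines up with the observation above.]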

> I agree that ASCII, although it isn't
> dead, deserves to die. I think the idea
> to make 32-bit characters the standard
> is a good one, although I have to admit
> I don't know much about the standardisation
> that is going on in that field.
> But I do know that 256 characters is way
> too little!
> maybe it is just too early to make a
> decision, when the standardisation
> hasn't settled down...

Another huge reason to support ASCII in D is that D is meant to interface with C functions. C apps are nearly all written to use ASCII. ASCII support isn't going away anytime soon in operating systems, so D must support it easily.


April 08, 2002
So what about my suggestion of making ASCII a bit more difficult to use - that is, Unicode is the preferred/default character type in D.  'char' is a Unicode character and "abc" is a Unicode string.

With the release of Windows XP, it's not going to be very long (months, not years) before a Unicode-enabled platform is the norm for most people.

And I'm not sure I buy the "memory" argument - my PocketPC, which is easily more memory-constrained than any desktop PC, only supports Unicode.

   Dan

"Walter" <walter@digitalmars.com> wrote in message news:a8lbqb$1d7s$1@digitaldaemon.com...
>
> "roland" <nancyetroland@free.fr> wrote in message news:3CAE2667.1375F510@free.fr...
> > How does Linux handle character size?
> > I personally see my future closer to Linux than to XP.
>
> Linux uses 4 byte wchars. This uses up memory real fast. Unicode may be the
> future, but it is still many years away, and D should be agnostic about
> whether the app is ASCII or Unicode.
>
>


April 08, 2002
"J. Daniel Smith" <j_daniel_smith@HoTMaiL.com> wrote in message news:a8s2ng$1jg0$1@digitaldaemon.com...
> So what about my suggestion of making ASCII a bit more difficult to use - that is, Unicode is the prefered/default character type in D.  'char' is a Unicode character and "abc" is a Unicode string.

There is no default char type in D. char is ASCII, wchar is Unicode. The type of "abc" depends on context. The source text can be ASCII or Unicode (try it!).

> With the release of Windows XP, it's not going to be very long (months,
> not years) before a Unicode-enabled platform is the norm for most people.

All Win32 platforms support Unicode already.

> And I'm not sure I buy the "memory" argument - my PocketPC which is easily more memory constrainted than any desktop PC only supports Unicode.

You can shrink down the memory for Unicode quite a bit by using UTF-8, at the expense of slowing things down.
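[The reason UTF-8 shrinks mostly-ASCII text is that it is a variable-width encoding: ASCII stays at one byte per character, and only rarer characters cost more. A hedged C sketch of the per-code-point length rule (the helper name `utf8_len` is illustrative, not a standard library function):]

```c
/* Number of bytes UTF-8 uses to encode one Unicode code point.
 * ASCII stays at a single byte, which is why UTF-8 costs little
 * for mostly-ASCII text compared to fixed 2- or 4-byte encodings. */
static int utf8_len(unsigned long cp)
{
    if (cp < 0x80)    return 1;  /* ASCII range              */
    if (cp < 0x800)   return 2;  /* e.g. Latin supplements   */
    if (cp < 0x10000) return 3;  /* most CJK characters      */
    return 4;                    /* supplementary planes     */
}
```

[The slowdown Walter mentions follows from the same property: code points are no longer a fixed width, so indexing into a string means scanning it.]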


