Thread overview
pleading for String
Sep 13, 2003
Helmut Leitner
Sep 13, 2003
Matthew Wilson
Sep 13, 2003
Mike Wynn
Sep 15, 2003
Matthew Wilson
Sep 15, 2003
Sean L. Palmer
Sep 13, 2003
Sean L. Palmer
Sep 13, 2003
Helmut Leitner
Sep 14, 2003
Matthew Wilson
September 13, 2003
Currently there is no official String identifier in the D language. One can only guess why this is so: I would assume that this is a void left for an object String to come. For now "char []" fills its place for all practical purposes.

I would plead for an official

  alias char [] String;

to fill this void.

I'll try to add a few arguments.

First, it's effortless. Anyone can define it on their own and use it seamlessly even now, as in:

  int main(String [] args)

There are no hidden problems. I used "Str" for my Venus library and had no problems anywhere. I could rename it to "String" but I would prefer an official solution.
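
A minimal sketch of what the alias provides, assuming a present-day D compiler (where `alias String = char[];` is the newer spelling of the same declaration). The helper name countSpaces is made up for illustration. Because an alias introduces no new type, values pass between String and char[] without any cast:

```d
import std.stdio : writeln;

// Newer alias syntax; the declaration proposed above is the older form.
alias String = char[];

// Hypothetical helper: declared with the alias, but accepts any char[].
size_t countSpaces(String s)
{
    size_t n = 0;
    foreach (c; s)
        if (c == ' ')
            n++;
    return n;
}

void main()
{
    char[] plain = "hello big world".dup; // .dup yields a mutable char[]
    String aliased = plain;               // no cast needed: same type

    static assert(is(String == char[]));  // checked at compile time
    writeln(countSpaces(plain));
    writeln(countSpaces(aliased));
}
```

Both calls go to the same function with the same argument type; the alias only changes how the type is spelled.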

Second, the "String" type is already deeply engraved in the
current API system. There are Phobos identifiers like
   - toString
   - writeString
   - readString
   - ...
and DIG identifiers like
   - getString
   - saveString
   - colorString
   - ...
that all use "char []". If some new String class were defined,
these APIs would have to be renamed or left with a serious
inconsistency.

On the other hand: a big and powerful String class might look
attractive. It could include lots of functions usable by calls
like
   s.cvtUpper(); or s.toDouble();

But:

  - Using
      cvtUpper(s); or toDouble(s);
    isn't much worse. Technically it's identical. You wouldn't be
    able to inherit from "String", but you also can't from "int".
    An alias would just give "String" the status of a primitive.

  - A String class can never be complete. You may provide a hundred
    functions and people will still add utilities of their own.
    And people will cry because they can't use this functionality
    easily for some StringBuffer (outbuffer) class that they need
    for performance reasons.

  - Such a class would bloat the code. As far as I know, the
    compiler / linker / system has no way to get rid of unneeded
    methods. It's clear that this is hard, because the method
    addresses must be part of some vtable that's needed in case
    of inheritance. So the linker would have to know about vtables
    and clean them up and strip methods during linking.

    So the situation is: any method of a String class would add to
    the footprint of almost any statically linked D executable.
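
The first point above can be sketched as follows, assuming a present-day D compiler, where uniform function call syntax lets a module-level free function be called with method syntax. The bodies of cvtUpper and toDouble are illustrative guesses (only the names come from the post), and this cvtUpper handles ASCII only:

```d
import std.conv : to;
import std.stdio : writeln;

alias String = char[];

// Hypothetical free function; converts ASCII letters to upper case.
String cvtUpper(String s)
{
    String result = s.dup; // copy, so the caller's data stays untouched
    foreach (ref c; result)
        if (c >= 'a' && c <= 'z')
            c = cast(char)(c - 'a' + 'A');
    return result;
}

// Hypothetical free function; parses the text as a floating-point number.
double toDouble(String s)
{
    return to!double(s);
}

void main()
{
    String s = "abc".dup;
    writeln(cvtUpper(s));  // ordinary call
    writeln(s.cvtUpper()); // same function, method-style call
    writeln("2.5".dup.toDouble());
}
```

No class and no vtable are involved: both call styles resolve to the same free function, and anyone can add further functions of this kind without touching a central String definition.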

====

Therefore I think "alias char [] String;" is the way to go. I suggest adding it to the Phobos library as soon as possible.

-- 
Helmut Leitner    leitner@hls.via.at
Graz, Austria   www.hls-software.com
September 13, 2003
I could be convinced, if it was string_t. The reason is that this would then be unambiguously a type(def) rather than a fully-fledged class.

Especially so, since String would be the first name of any future string class.


September 13, 2003
> "Helmut Leitner" <helmut.leitner@chello.at> wrote in message
> news:3F62F30A.73C49276@chello.at...
>
>>Currently there is no official String identifier in the D language.
>>One can only guess why this is so: I would assume that this is a
>>void left for an object String to come. For now "char []" fills its
>>place for all practical purposes.
>>
>>I would plead for an official
>>
>>  alias char [] String;
>>
>>to fill this void.
>>
>>I'll try to add a few arguments.
>>
>>First, it's effortless. Anyone can define it on their own and use
>>it seamlessly even now, as in:
>>
>>  int main(String [] args)
>>
Matthew Wilson wrote:
> I could be convinced, if it was string_t. The reason is that this would then
> be unambiguously a type(def) rather than a fully-fledged class.
>
> Especially so, since String would be the first name of any future string
> class.
>

I would like to see a true "string" type, not just an alias to char[],
so that a string can be Unicode (UTF-8, UTF-16 or UTF-32 internally, as required),
ideally with a format function (sprintf / Delphi Format),
something usable as
x = String.format( "%d, %x", a, b );
x = String.format( "%d, %x", [a, b] );
or even as a member function
x = "%d, %x".format( a, b );
x = "%d, %x".format( [a, b] );
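
For comparison, a short sketch of how both call shapes look with the format function from std.format, assuming a present-day D compiler and its Phobos library. The variable names are placeholders:

```d
import std.format : format;
import std.stdio : writeln;

void main()
{
    int a = 10;
    int b = 255;

    // Free-function call.
    string x = format("%d, %x", a, b);

    // The same function called with method syntax on the format string.
    string y = "%d, %x".format(a, b);

    assert(x == "10, ff");
    assert(y == x);
    writeln(x);
}
```

Both forms call one and the same free function, so the member-style spelling does not require a String class.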

September 13, 2003
I would want it to be called "string", not "String", just to be consistent with the rest of the basic types. It's for that same reason that I don't like "string_t". If you have _t on the end of the type names it should be on all the type names, and I don't think that is a good idea.

I think I'm with Mike though; a string should be more than a simple typedef for char[]. It should support Unicode, for one. We should make all the string functions work on string instead of char[], and have an implicit conversion from char[] to string. String literals should be of type string instead of char[] as well.

If you want user extensibility and no bloat, then all the "methods" of the string should in fact be global functions taking a string as argument.

I'd like to get away from the printf-style formatting and go to something more like:

string format(string formatstring, formatobject[]);

used like

string res = format("The %0 is %1 %2.", "moon", "very", "bright");

Then you can translate it:

string res = format("La %0 es %1 %2.", "luna", "muy", "brillante");

Even rearrange the text during translation:

string res = format("%1 %2 la %0 es.", "luna", "muy", "brillante");

And formatobject can be made to support any kind of formatting.

string res = format("%0", rightjustify("foo",12));

string res = format("%0", floatprecision(math.pi,16,12));
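
A rough sketch of the index-based substitution described above, assuming a present-day D compiler. The function name formatIndexed is invented here, it only accepts string arguments, and it only understands single-digit placeholders %0 to %9; the formatobject idea is not covered:

```d
import std.ascii : isDigit;
import std.stdio : writeln;

// Hypothetical helper: replaces each %N (N = 0..9) with args[N].
// A placeholder whose index has no matching argument is dropped.
string formatIndexed(string fmt, string[] args...)
{
    string result;
    for (size_t i = 0; i < fmt.length; i++)
    {
        if (fmt[i] == '%' && i + 1 < fmt.length && isDigit(fmt[i + 1]))
        {
            size_t index = fmt[i + 1] - '0';
            if (index < args.length)
                result ~= args[index];
            i++; // skip the digit that followed '%'
        }
        else
        {
            result ~= fmt[i];
        }
    }
    return result;
}

void main()
{
    writeln(formatIndexed("The %0 is %1 %2.", "moon", "very", "bright"));
    // Same arguments, different order in the output.
    writeln(formatIndexed("%1 %2 la %0 es.", "luna", "muy", "brillante"));
}
```

Because each placeholder names its argument by position, a translator can reorder the words in the format string without any change to the call site.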

Sean


September 13, 2003

"Sean L. Palmer" wrote:
> 
> I would want it to be called "string" not "String" just to be consistent with the rest of the basic types.  It's for that same reason that I don't like "string_t".  If you have _t on the end of the type names it should be on all the type names, and I don't think that is a good idea.

OK. It makes no difference whether it is "string" or "String", as long
as we agree that there should never be a situation where a
  - string primitive    and a
  - String class
exist at the same time.

> I think I'm with Mike though;   a string should be more than a simple typedef for char[].  It should support unicode for one.  We should make all the string functions work on string instead of char[], and have an implicit conversion from char[] to string.  String literals should be of type string instead of char[] as well.

But that means that you are proposing a major redesign of the language that will affect almost all existing code! D is just becoming popular. When do you want to do this?

Why not add another, more powerful string class, named e.g. Ustring, at any time, without any problems, and allow for a gradual transition?

> If you want user extendability and no bloat, then all the "methods" of the string should in fact be global functions taking a string as argument.

That's right.

> I'd like to get away from the printf-style formatting and go to something more like:
> 
> string format(string formatstring, formatobject[]);
> 
> used like
> 
> string res = format("The %0 is %1 %2.", "moon", "very", "bright");
> 
> Then you can translate it:
> 
> string res = format("La %0 es %1 %2.", "luna", "muy", "brillante");
> 
> Even rearrange the text during translation:
> 
> string res = format("%1 %2 la %0 es.", "luna", "muy", "brillante");
> 
> And formatobject can be made to support any kind of formatting.
> 
> string res = format("%0", rightjustify("foo",12));
> 
> string res = format("%0", floatprecision(math.pi,16,12));

I like this too, but I think it has nothing to do with the current "String" discussion. There will always be a need to do this on a low level (to a char [], to an outbuffer) as well.

-- 
Helmut Leitner    leitner@hls.via.at
Graz, Austria   www.hls-software.com
September 14, 2003
"Sean L. Palmer" <palmer.sean@verizon.net> wrote in message news:bjvnjq$1opq$1@digitaldaemon.com...
> I would want it to be called "string" not "String" just to be consistent with the rest of the basic types.  It's for that same reason that I don't like "string_t".  If you have _t on the end of the type names it should be on all the type names, and I don't think that is a good idea.

_t is for typedef, not for type (at least in this case)

> I think I'm with Mike though;   a string should be more than a simple typedef for char[].  It should support unicode for one.  We should make all the string functions work on string instead of char[], and have an implicit conversion from char[] to string.  String literals should be of type string instead of char[] as well.
>
> If you want user extendability and no bloat, then all the "methods" of the string should in fact be global functions taking a string as argument.

Yes, this would have to be the way. There will always be the "one essential method that is missing", and it's just instinctively wrong to keep lumping stuff into the one class. Look at std::basic_string!




September 15, 2003
"Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bjutpt$n4f$1@digitaldaemon.com...
> I could be convinced, if it was string_t. The reason is that this would then be unambiguously a type(def) rather than a fully-fledged class.

If string was a base type I don't think there would be any need for the _t suffix, just as int is not int_t.

Furthermore, the reason why String classes exist is mainly because there's no string type...

Ric


September 15, 2003
> > I could be convinced, if it was string_t. The reason is that this would then be unambiguously a type(def) rather than a fully-fledged class.
>
> If string was a base type I don't think there would be any need for the _t suffix, just as int is not int_t.

Sure. My point was that, because what Helmut wanted was specifically not a unique type, the _t was appropriate: a visual reminder to all users that they're using an alias.

However, today I've made a hypocrite of myself by defining a "boolean" alias (from
int, of course :) ) in the SynSoft libraries.

What can ya do??

> Furthermore, the reason why String classes exist is mainly because there's no string type...

I don't know enough about the various localisation issues to comment on that side of things, but I'm very nervous about having a string type, purely out of a fear of feature-creep.



September 15, 2003
"Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk448q$1j2a$2@digitaldaemon.com...
> However, today I've hypocrited myself by defining a "boolean" alias (from
> int, of course :) ) in the SynSoft libraries.
>
> What can ya do??

Now _that_'s another story... I wish there were bool8, bool16 and bool32, and surely not as aliases of bit... They are simply ugly, but a great help to serialization and interfacing to APIs.

> I don't know enough about the various localisation issues to comment on that side of things, but I'm very nervous about having a string type, purely out of a fear of feature-creep.

If it were a type (not a class) and functions acting on it were simply
global functions, there shouldn't be much to worry about.
Localization (case insensitivity, collating order, non-Latin alphabets...)
must be dealt with anyway; IMO a string type would even help the development
of localization functions.

Ric


September 15, 2003
Why should you care if it's a builtin or a typedef? It works the same either way; it's opaque to the programmer. You need to visit the declaration for the type and familiarize yourself with it. This is always true; you can't just assume anything about it. Little 'hints' like _t are often misleading; they're just going to encourage people like me to "alias string_t string;"

People are moving away from Hungarian notation and decorated names. It's hard to maintain, and it's an eyesore, and nowadays you just put your cursor on a symbol and hit a button and it brings you right to the declaration... how much more can it hold your hand than that? Plus, "typedef char[] string_t;" is incompatible with char[] without a cast, so you'd want "alias char[] string_a;" (hehe), and if string was a class instead, it'd have to be what, "class CString {}"?! Here we go with the name proliferation again.
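
The alias-versus-distinct-type difference mentioned above can be sketched like this, assuming a present-day D compiler. A one-field struct stands in for the distinct type that "typedef" created, since the typedef keyword is no longer part of the language; all names below are made up for illustration:

```d
import std.stdio : writeln;

alias StringAlias = char[];            // just another name for char[]
struct StringDistinct { char[] data; } // a separate type wrapping char[]

// Hypothetical function that only knows about plain char[].
size_t byteLength(char[] s)
{
    return s.length;
}

void main()
{
    StringAlias a = "abc".dup;
    StringDistinct d = StringDistinct("abc".dup);

    static assert(is(StringAlias == char[]));
    static assert(!is(StringDistinct == char[]));

    // The alias passes straight through.
    writeln(byteLength(a));

    // The distinct type is rejected unless it is unwrapped explicitly.
    static assert(!__traits(compiles, byteLength(d)));
    writeln(byteLength(d.data));
}
```

Nothing in either name tells the reader which of the two situations applies; the declaration has to be consulted in both cases.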

I just don't see much point in cluttering up the type names.  If *you* want to do it, well there's always alias... that's what people tell me when I bitch about not liking the identifiers.

Sean