August 21, 2004
"Matthew" <admin.hat@stlsoft.dot.org> wrote in message
> Of course, this still requires use of argument-based string-type deduction, as in
>
>     char[] toString(int, char[]);
>     dchar[] toString(int, dchar[]);
>
> but I reckon that facility is going to be needed anyway.

What follows is just my opinion, and could be completely misguided:

Part of the problem here, as noted earlier, is the covariant return type. To provide what you suggest above, Walter will probably need to hack the method-resolution code to perform a special case.

IMO, this is just another stake through the heart of the C++ method resolution approach ... such resolution should use everything about the method signature (including its return type) as part of the matching process. But it doesn't.

I'd guess it currently doesn't because it would be possible to get confused over which of two (or more) methods to select when the return value is not used, and where the rest of the signature matches exactly. However, there are ways of making that operate acceptably. I'm not sure that it's such a big issue.

Still, I don't see the method-matching algorithm ever changing in D. What's currently there is clearly considered to be the most appropriate approach. Hence, the above will likely never become a reality without introducing more special-cases and more confusion. That's rather unfortunate, IMHO.

If D were strongly-typed (as is claimed), said algorithm would be quite different, and would likely lend itself more towards this topic without further confusion. The need for method-aliasing would also go away completely (and would likely satisfy 90% of folks who have an opinion on that subject). That should be reserved for a different topic though.

Another way to approach the current subject matter is like so:

char[] toString (out dchar[])

  and/or

char[] toString (out wchar[])

These produce both outputs at the same time. That's seriously fugly, and as inefficient as one might expect. Please ... let's not do that!

"Hackety Hack ~ Don't Look Back"



August 22, 2004
>"Batman" <Batman_member@pathlink.com> wrote in message news:cg6pcs$1cch$1@digitaldaemon.com...
>> In article <cg68tk$11on$1@digitaldaemon.com>, antiAlias says...
>>
>> >providing
>> >a "dchar toString()" in each class is not covariant with the "char
>> >toString()" living in the root Object.
>>
>> What does covariant mean exactly?

It's just the opposite of contravariant.


>Well, I won't proffer a definition of the word, but it is usually used in software engineering to describe the following condition:
>
>class B
>{}
>class D : public B
>{}
>
>class X
>{
>    virtual B clone();
>}
>
>class Y : public X
>{
>    virtual D clone();
>}
>
>Because D is (publicly) derived from B, it is legitimate for Y's override of X's clone() method to return D instead of B.
>
>This is because inheritance is an "Is-A" relationship. Hence, any D is-a B.
>
>Since X's clone() requires a B, Y's clone() can return a D, because a D is-a B.
>
>Make sense? (I hope so, 'cos that's my top shelf explanation. You'll have to hope for other posts if not. <G>)

Well, in D (and in C++) derived classes are covariant with their base classes; in other languages they possibly aren't.

Even in D, classes aren't covariant with the interfaces they implement.
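
For what it's worth, the covariant-return case quoted above can be tried out directly. This is only a minimal sketch in present-day D, which postdates this thread: the `override` keyword and the names `Base`, `Derived`, `X`, `Y` and `describe` are my own assumptions for illustration, not anyone's actual code.

```d
import std.stdio : writeln;

class Base {}
class Derived : Base {}

class X
{
    // The base-class method promises a Base.
    Base clone() { return new Base; }
}

class Y : X
{
    // Covariant override: a Derived is-a Base, so narrowing the
    // return type is accepted by the compiler.
    override Derived clone() { return new Derived; }
}

// Hypothetical helper: report which dynamic type clone() produced.
string describe(X x)
{
    return (cast(Derived) x.clone() !is null) ? "Derived" : "Base";
}

void main()
{
    writeln(describe(new X));
    writeln(describe(new Y));
}
```

Callers that only know about `X` still get something usable as a `Base`, which is the whole point of the is-a argument.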


-- Matthias Becker


August 22, 2004
In article <cg68tk$11on$1@digitaldaemon.com>, antiAlias says...
>
>The problems with that particular approach are twofold:
>
>2) more importantly: it doesn't work for unicode strings, because providing
>a "dchar toString()" in each class is not covariant with the "char
>toString()" living in the root Object. I wish there was an nice, clean,
>elegant solution to this ...

I've wondered about that myself, but I guess having toString() return char[] is
not so bad. The magic of UTF-8 does, after all, allow us to store every
character in a char[] (even though not in a char).

But it would be really, really, /really/ cool, if all string types would *implicitly* cast to one another, *and* go through the relevant std.utf conversion routine to do so. Then classes could implement any of the following at their choice:

#    char[] toString();
#    wchar[] toString();
#    dchar[] toString();

Walter has opposed the notion that even /explicit/ casts from string to string should not do any conversion. I suggest:

#    char[] c;
#    dchar[] d;
#
#    c = d;                            // calls toUTF8()
#    c = cast(char[]) d;               // calls toUTF8()
#    c = cast(char[]) cast(void[]) d;  // does not call toUTF8()

The last case is to cover the unlikely circumstance that you might want to "paint" the { length, address } structure. Obviously, casting from string to non-string or vice-versa should use the paint method as now.

The current syntax is confusing. For example:

#    char[] c;
#    dchar[] d;
#
#    d = "hello";                      // converts
#    d = cast(dchar[]) "hello";        // converts
#
#    c = "hello";
#    d = c;                            // does not convert
#    d = cast(dchar[]) c;              // does not convert

The explanation is that in the cases that do convert, toUTFxx() is not called; instead, the
conversion happens at compile time (the string being a compile-time constant).
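
To make the literal-versus-runtime difference concrete, here is a rough sketch in present-day D, where `string` and `dstring` are the immutable counterparts of `char[]` and `dchar[]`. The helper name `widen` is made up for this example; `toUTF8` and `toUTF32` are the std.utf routines mentioned above.

```d
import std.utf : toUTF8, toUTF32;

// Hypothetical helper: widen a runtime UTF-8 string to UTF-32.
dstring widen(string s)
{
    return toUTF32(s);
}

void main()
{
    // A literal adopts the target type at compile time; no runtime call.
    dstring fromLiteral = "hello";

    // A runtime string does not implicitly convert to a dstring ...
    static assert(!is(string : dstring));

    // ... so the transcoding has to be spelled out.
    string c = "hello";
    dstring d = widen(c);
    assert(d == fromLiteral);
    assert(toUTF8(d) == c);
}
```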

But I would prefer it if strings were much easier to interchange. Maybe there is a need for a String class after all? Something like:

#    struct String
#    {
#        enum { UTF8, UTF16, UTF32 } Encoding;
#        Encoding encoding;
#        union
#        {
#            char[] c;
#            wchar[] w;
#            dchar[] d;
#        }
#        /* loads of useful functions */
#    }

I dunno. Just throwing ideas around.
Arcane Jill


August 22, 2004
In article <cg9rfu$brl$1@digitaldaemon.com>, Arcane Jill says...

Typo correction:

>Walter has opposed the notion that even /explicit/ casts from string to string should not do any conversion.

should read:

"Walter has opposed the notion that even /explicit/ casts from string to string should do any conversion."

(I don't normally bother to correct typos, but in this case I didn't want anyone being accidentally misrepresented).

Jill


August 22, 2004
Hmm. It's always a bit dodgy to have casts do any implicit work, but since short->double does work, I don't really see why casting dchar[]->wchar[] shouldn't.

I can't think of any major objection why we shouldn't have it, at least in explicit form.

"Arcane Jill" <Arcane_member@pathlink.com> wrote in message news:cg9rfu$brl$1@digitaldaemon.com...
> In article <cg68tk$11on$1@digitaldaemon.com>, antiAlias says...
> >
> >The problems with that particular approach are twofold:
> >
> >2) more importantly: it doesn't work for unicode strings, because providing
> >a "dchar toString()" in each class is not covariant with the "char
> >toString()" living in the root Object. I wish there was an nice, clean,
> >elegant solution to this ...
>
> I've wondered about that myself, but I guess having toString() return char[] is
> not so bad. The magic of UTF-8 does, after all, allow us to store every
> character in a char[] (even though not in a char).
>
> But it would be really, really, /really/ cool, if all string types would *implicitly* cast to one another, *and* go through the relevant std.utf conversion routine to do so. Then classes could implement any of the following at their choice:
>
> #    char[] toString();
> #    wchar[] toString();
> #    dchar[] toString();
>
> Walter has opposed the notion that even /explicit/ casts from string to string should not do any conversion. I suggest:
>
> #    char[] c;
> #    dchar[] d;
> #
> #    c = d;                            // calls toUTF8()
> #    c = cast(char[]) d;               // calls toUTF8()
> #    c = cast(char[]) cast(void[]) d;  // does not call toUTF8()
>
> The last case is to cover the unlikely circumstance that you might want to "paint" the { length, address } structure. Obviously, casting from string to non-string or vice-versa should use the paint method as now.
>
> The current syntax is confusing. For example:
>
> #    char[] c;
> #    dchar[] d;
> #
> #    d = "hello";                      // converts
> #    d = cast(dchar[]) "hello";        // converts
> #
> #    c = "hello";
> #    d = c;                            // does not convert
> #    d = cast(dchar[]) c;              // does not convert
>
> The explanation is that in these cases, toUTFxx() is not called - instead, the
> conversion happens at compile-time (the string being a compile-time constant).
>
> But I would prefer it if strings were much easier to interchange. Maybe there is a need for a String class after all? Something like:
>
> #    struct String
> #    {
> #        enum { UTF8, UTF16, UTF32 } Encoding;
> #        Encoding encoding;
> #        union
> #        {
> #            char[] c;
> #            wchar[] w;
> #            dchar[] d;
> #        }
> #        /* loads of useful functions */
> #    }
>
> I dunno. Just throwing ideas around.
> Arcane Jill
>
>


August 22, 2004
In article <cg9u7c$dtp$1@digitaldaemon.com>, Matthew says...

>Hmm. It's always a bit dodgy to have casts do any implicit work,

Not "always", but often. The following line of D code has an implicit cast in it:

#    double x = 42;

and I don't actually consider that dodgy at all. What I'm suggesting is similar, because there is no loss of information.

But I don't have the answer - I'm only citing the problem. The problem is that Object defines this function:

#    char[] toString();

That's a problem because an internationalized class like - oh, let's say Name (of a human being) is likely to have an internal representation as a dchar[] - and this is precisely what internationalization writer classes are going to require - a succession of dchars. toString() is called in format() because it's in Object and therefore in everything. The upshot of all this is that to send our Name to a console using transcoding techniques you would have to:

(1) have toString() call toUTF8() on the internal dchar[]
(2) call toUTF32() to get back the dchars to feed into the writer
(3) transcode yet again between the writer and the console
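
The first two steps of that round trip can be sketched roughly as follows, in present-day D. The class body and the `feedWriter` function are stand-ins I've invented for illustration (there is no such writer API implied here), and step (3), the console transcoding, is left out.

```d
import std.utf : toUTF8, toUTF32;

class Name
{
    private dstring text;   // internal UTF-32 representation

    this(dstring text) { this.text = text; }

    // Step (1): Object.toString() forces a trip through UTF-8.
    override string toString() { return toUTF8(text); }
}

// Hypothetical stand-in for a writer that wants a succession of dchars.
dstring feedWriter(Object o)
{
    // Step (2): decode the UTF-8 straight back into UTF-32.
    return toUTF32(o.toString());
}

void main()
{
    auto n = new Name("Ren\u00E9e"d);
    assert(n.toString().length == 6);      // 5 characters, 6 UTF-8 code units
    assert(feedWriter(n) == "Ren\u00E9e"d); // back where we started
}
```

Nothing is lost along the way, but the data is encoded and decoded once each just to pass through toString().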

It would be nicer if toString() could return /any/ kind of string. Or if
toString() returned dchar[] only (but that would break existing code). Or if
there were a String class which abstracted away the implementation, and
toString() returned that (which would also break existing code). Or if we had
member template functions and could define toString!(dchar[])() or something.

Like I said, I don't have answers, just concerns. All I know is that forcing everything to go via UTF-8 at every toString() juncture is not the most efficient way of going about things.

Arcane Jill



August 22, 2004
>>Hmm. It's always a bit dodgy to have casts do any implicit work,
>
>Not "always", but often. The following line of D code has an implicit cast in it:
>
>#    double x = 42;
>
>and I don't actually consider that dodgy at all. What I'm suggesting is similar, because there is no loss of information.
>
>But I don't have the answer - I'm only citing the problem. The problem is that Object defines this function:
>
>#    char[] toString();
>
>That's a problem because an internationalized class like - oh, let's say Name (of a human being) is likely to have an internal representation as a dchar[] - and this is precisely what internationalization writer classes are going to require - a succession of dchars. toString() is called in format() because it's in Object and therefore in everything. The upshot of all this is that to send our Name to a console using transcoding techniques you would have to:
>
>(1) have toString() call toUTF8() on the internal dchar[]
>(2) call toUTF32() to get back the dchars to feed into the writer
>(3) transcode yet again between the writer and the console
>
>It would be nicer if toString() could return /any/ kind of string.
How to do that? All versions would have to be defined in Object. But what happens if you implement some of them but not all? This could be very confusing.

>Or if
>toString() returned dchar[] only (but that would break existing code).

Yes, but I like this idea more than the previous one.

>Or if
>there were a String class which abstracted away the implementation, and
>toString() returned that (which would also break existing code).

Many people suggested string classes from time to time, but Walter seems not to like that idea. Well, you would get my vote.


>Like I said, I don't have answers, just concerns. All I know is that forcing everything to go via UTF-8 at every toString() juncture is not the most efficient way of going about things.

Right.

-- Matthias Becker


August 22, 2004
In article <cgaip0$sqf$1@digitaldaemon.com>, Matthias Becker says...

>>It would be nicer if toString() could return /any/ kind of string.
>How to do that? All versions would have to be defined in Object.

No, no. Object can stay as it is. Subclasses would only define /one/ toString() function. Look at it like this. Right now, the following piece of code:

#   class A
#   {
#       dchar[] toString()
#       {
#           return cast(dchar[])"A";
#       }
#   }

won't compile. The error is "function toString overrides but is not covariant with toString". But, you see, if they /were/ considered covariant, then it would.


>But what
>happens if you implement some of them but not all?

If you implemented more than one, it would be a compile error, because you can't overload on return type. If you implemented precisely one, an implicit conversion would happen from the type you actually return to the type expected in the calling expression (which could be no conversion at all). If you implemented no versions, the superclass toString() would be used.


>>Or if
>>toString() returned dchar[] only (but that would break existing code).
>
>Yes, but I like this idea more than the previous one.

Me too. But Walter would take some convincing.


>>Or if
>>there were a String class which abstracted away the implementation, and
>>toString() returned that (which would also break existing code).
>
>Many people suggested string classes from time to time, but Walter seems not to like that idea. Well, you would get my vote.

Sure, but in my view, the problem isn't defining a string class per se, it's that Object.toString() would have to return it. Ergo, it would have to be a Phobos-defined string, and I don't think people would like to be tied to a particular string implementation (unless it was /really/ good).

Jill


August 22, 2004
On Sun, 22 Aug 2004 10:11:10 +0000 (UTC), Arcane Jill <Arcane_member@pathlink.com> wrote:
> In article <cg68tk$11on$1@digitaldaemon.com>, antiAlias says...
>>
>> The problems with that particular approach are twofold:
>>
>> 2) more importantly: it doesn't work for unicode strings, because providing
>> a "dchar toString()" in each class is not covariant with the "char
>> toString()" living in the root Object. I wish there was an nice, clean,
>> elegant solution to this ...
>
> I've wondered about that myself, but I guess having toString() return char[] is
> not so bad. The magic of UTF-8 does, after all, allow us to store every
> character in a char[] (even though not in a char).
>
> But it would be really, really, /really/ cool, if all string types would
> *implicitly* cast to one another, *and* go through the relevant std.utf
> conversion routine to do so. Then classes could implement any of the following
> at their choice:
>
> #    char[] toString();
> #    wchar[] toString();
> #    dchar[] toString();
>
> Walter has opposed the notion that even /explicit/ casts from string to string
> should not do any conversion. I suggest:

I believe Walter's opposition was due to the fact that a conversion would create inconsistency between string types and ubyte etc., and also that the ability to 'paint' one type as another is desired.

I think the fact that char, wchar, and dchar have a specified encoding sets them apart from other types. This makes painting one string type as another completely useless; I cannot think of a reason to paint a char[], wchar[] or dchar[] as each other. Can you?

If you can then I suggest something like:

> #    c = cast(char[]) cast(void[]) d;  // does not call toUTF8()

will suffice.

It *does* make sense to paint char[], wchar[] or dchar[] as ubyte[] or void[] etc., so what I suggest is that conversion does occur, but only if both the source type and destination type have a specified encoding, i.e. char, wchar and dchar to char, wchar or dchar.
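
As a rough illustration of the two operations being distinguished (written in present-day D, and not how the proposed cast would actually be implemented): painting reinterprets the same memory, while conversion goes through std.utf. The helper name `paint` is my own.

```d
import std.utf : toUTF32;

// Reinterpret ("paint") the same memory as raw bytes; nothing is transcoded.
immutable(ubyte)[] paint(string s)
{
    return cast(immutable(ubyte)[]) s;
}

void main()
{
    string c = "\u00E9";   // one character, two UTF-8 code units

    // Painting: the two code units show through unchanged.
    immutable ubyte[] expected = [0xC3, 0xA9];
    assert(paint(c) == expected);

    // Conversion: one UTF-32 code unit holding the code point itself.
    dstring d = toUTF32(c);
    assert(d.length == 1 && d[0] == 0xE9);
}
```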

In conclusion, I cannot see any valid reason not to make this change. I believe it makes string handling:
 - simpler
 - more consistent
 - less error prone

This change would make a string class totally useless, which I believe was Walter's original intention when creating these types.

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/


August 22, 2004
Right. But I don't think this stuff should be done by a cast(). I mean, you can just as easily convert them "manually" using the functions in utf.d, can't you? Where automated conversion would really help is in stringizing (the original topic, I think):

class A {}

int x;
long y;
dchar[] z;
A a = new A;

char[] narrow = "string some stuff together " ~ z ~ y ~ x ~ a;

and the wide version:

dchar[] wide = "string some stuff together " ~ z ~ y ~ x ~ a;

If the ~ concatenators could convert between dchar[] and char[] appropriately, then Matthew's idea about the type being specified by the left-hand side would probably work. Though, on reflection, this seems like an awful lot of work for an operator to perform. Particularly so when it's a special case, as it is here (e.g. int[] does not do anything fancy like this).

Instead, how about a concat(...) method? It's not hard to make a typesafe
one that can do whatever conversion one desires (including calling
toString() and converting as necessary). Hell, you could have two concat()
methods: one for a dchar[] result and one for a char[] result.

return "my granny is "~age~" old";

becomes

return concat ("my granny is ", age, " old");

Is that really so awful?
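
Something along these lines, perhaps. This is just a sketch in present-day D using templates and `std.conv.to`; the `concat` signature, the result-type parameter `S` and the sample class `A` are all assumptions of mine, and it folds the two result types into one template instead of two separate methods.

```d
import std.conv : to;

// Hypothetical concat(): converts every argument to the requested
// string type S (string by default) and appends it to the result.
S concat(S = string, Args...)(Args args)
{
    S result;
    foreach (arg; args)
        result ~= to!S(arg);   // transcodes strings, formats numbers
    return result;
}

class A
{
    override string toString() { return "an A"; }
}

void main()
{
    int age = 90;
    assert(concat("my granny is ", age, " old") == "my granny is 90 old");
    assert(concat!dstring("my granny is ", age, " old") == "my granny is 90 old"d);
    assert(concat("I hold ", new A) == "I hold an A");   // goes via toString()
}
```

The caller picks the result type explicitly, so the ~ operator doesn't need any special-casing.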


Regardless, I think there's still an issue about toString() not handling dchar[]. Although you can UTF-8 encode the content, that's hardly a convenience, or exactly efficient.



"Regan Heath" <regan@netwin.co.nz> wrote in message news:opsc5nxehm5a2sq9@digitalmars.com...
> On Sun, 22 Aug 2004 10:11:10 +0000 (UTC), Arcane Jill
> <Arcane_member@pathlink.com> wrote:
> > In article <cg68tk$11on$1@digitaldaemon.com>, antiAlias says...
> >>
> >> The problems with that particular approach are twofold:
> >>
> >> 2) more importantly: it doesn't work for unicode strings, because
> >> providing
> >> a "dchar toString()" in each class is not covariant with the "char
> >> toString()" living in the root Object. I wish there was an nice, clean,
> >> elegant solution to this ...
> >
> > I've wondered about that myself, but I guess having toString() return
> > char[] is
> > not so bad. The magic of UTF-8 does, after all, allow us to store every
> > character in a char[] (even though not in a char).
> >
> > But it would be really, really, /really/ cool, if all string types would
> > *implicitly* cast to one another, *and* go through the relevant std.utf
> > conversion routine to do so. Then classes could implement any of the
> > following
> > at their choice:
> >
> > #    char[] toString();
> > #    wchar[] toString();
> > #    dchar[] toString();
> >
> > Walter has opposed the notion that even /explicit/ casts from string to
> > string
> > should not do any conversion. I suggest:
>
> I believe Walters opposition was due to the fact that a conversion would create inconsistency between string types and ubyte etc, also that the ability to 'paint' one type as another is desired.
>
> I think the fact that char, wchar, and dchar have a specified encoding sets them apart from other types, this fact makes painting one string type to another completely useless, I cannot think of a reason to paint a char[], wchar[] or dchar[] to each other? can you?
>
> If you can then I suggest something like:
>
> > #    c = cast(char[]) cast(void[]) d;  // does not call toUTF8()
>
> will suffice.
>
> It *does* make sense to paint char[], wchar[] or dchar[] as ubyte[] or void[] etc so what I suggest is that conversion does occur, but, only if both the source type and destination type have a specified encoding, i.e. char, wchar and dchar to char, wchar or dchar.
>
> In conclusion I cannot see any valid reason not to make this change, I
> believe it makes string handling:
>   - simpler
>   - more consistent
>   - less error prone
>
> This change would make a string class totally useless, which I believe was Walters original intention when creating these types.
>
> Regan
>
> --
> Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/