Thread overview
char[] auto-conversion conflicts with method overloading
Sep 03, 2004
antiAlias
Sep 03, 2004
David L. Davis
Sep 06, 2004
Regan Heath
Sep 06, 2004
Arcane Jill
Sep 06, 2004
antiAlias
Sep 06, 2004
Regan Heath
Sep 06, 2004
antiAlias
Sep 06, 2004
Regan Heath
Sep 07, 2004
antiAlias
Sep 07, 2004
Regan Heath
Sep 07, 2004
antiAlias
Sep 07, 2004
Sean Kelly
Sep 07, 2004
antiAlias
Sep 07, 2004
Regan Heath
Sep 07, 2004
Arcane Jill
Sep 07, 2004
Regan Heath
Sep 07, 2004
antiAlias
Sep 08, 2004
Regan Heath
Sep 08, 2004
antiAlias
Sep 08, 2004
Regan Heath
Sep 09, 2004
Matthew
Sep 09, 2004
pragma
Sep 10, 2004
Matthew
Sep 10, 2004
Sean Kelly
Sep 10, 2004
Matthew
Sep 09, 2004
Matthew
Sep 07, 2004
Sean Kelly
Sep 07, 2004
Regan Heath
Sep 07, 2004
Derek Parnell
Sep 10, 2004
David L. Davis
Sep 10, 2004
Matthew
September 03, 2004
There was some commentary about the benefits of auto-conversion between char[], wchar[] and dchar[], where the compiler will automatically convert array literals as it sees fit. So, for example, if you have a method:

# void myFunc (wchar[] string) {...}

and call it with:

# myFunc ("a char array");

the compiler will convert the literal to a wchar[] instead of a char[]. This is apparently considered a GoodThing. Recently, there's been a lot of talk regarding additional automatic conversion, between UTF8 and its wchar[] and dchar[] representations.

Unfortunately, all this implicit conversion conflicts badly with method resolution. For instance, if I have two methods:

# myFunc (char[] string) { ... }
# myFunc (wchar[] string) { ... }

the compiler now can't tell which one should be called (vis-a-vis the prior example). To get around this, one has to cast the string literal like so:

# myFunc (cast(wchar[]) "a char array");
# myFunc (cast(char[]) "a char array");

Ugly. What some folks do to get around this is to add different method names for the same functionality, a la Win32:

# writeString (char[] x);
# writeStringW (wchar[] x);

They are forced into this approach to avoid having to use those ugly casts everywhere. This is what Streams.d does, along with others. Okay, so some might ask why this is a problem? Well, there are certain method names that are fixed in stone by the compiler, and you can't add a suffix even if you wanted to:

# class MyClass
# {
#     this (char[] initialContent) { ... }
#     this (wchar[] initialContent) { ... }
# }

See the problem? The compiler cannot resolve which constructor to use when you write

# new MyClass ("blah blah blah");

Instead, one is forced to use a cast:

# new MyClass (cast(char[]) "blah blah blah");

One can hardly add a "W" suffix to the keyword "this". The same thing happens for operator overloads too. One way around this is to introduce string prefixes, such as w"this is a wchar string". This has been suggested before, and it would certainly get rid of those ugly casts (I was under the misguided impression that casts should not be used on a general basis). BTW: the w"string" prefix is not a cast; it's a storage-attribute. I'd like to suggest such prefixes be supported and adopted.
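
For illustration only: a minimal sketch of how a typed literal removes the ambiguity, assuming a present-day D compiler. Such compilers accept a postfix on string literals (`c`, `w`, `d`) rather than the `w"..."` prefix proposed above, and literals are immutable, so the parameters are written as `string` and `wstring`. The names `describe` and `kind` are hypothetical.

```d
import std.stdio : writeln;

// `string` and `wstring` are D's aliases for immutable(char)[] and
// immutable(wchar)[]. `describe` and `kind` are hypothetical names.
string describe(string s)  { return "char"; }
string describe(wstring s) { return "wchar"; }

class MyClass
{
    string kind;
    this(string initialContent)  { kind = "char"; }
    this(wstring initialContent) { kind = "wchar"; }
}

void main()
{
    // The postfix fixes the literal's type, so each call matches exactly
    // one overload and no cast is needed.
    writeln(describe("a char array"c));
    writeln(describe("a wchar array"w));

    // The same applies to constructors, which cannot be given a suffix.
    auto obj = new MyClass("blah blah blah"w);
    writeln(obj.kind);
}
```

The literal carries its own type, so the choice of overload is visible at the call site without a cast.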

Anyway; the reason for the post is to make folk aware of the kinds of problems introduced when a compiler performs implicit conversions. This is bound to become more troublesome if additional "convenience" conversions occur implicitly within the D language, such as the oft discussed UTF8 conversions.

Please consider.




September 03, 2004
antiAlias: I for one vote +1 for the addition of w"" and d"" prefixes for literal strings. (Thanks for pointing out all the issues!)


-------------------------------------------------------------------
"Dare to reach for the Stars...Dare to Dream, Build, and Achieve!"
September 06, 2004
On Fri, 3 Sep 2004 10:51:58 -0700, antiAlias <fu@bar.com> wrote:

<snip>

> Unfortunately, all this implicit conversion conflicts badly with method
> resolution. For instance, if I have two methods:
>
> # myFunc (char[] string) { ... }
> # myFunc (wchar[] string) { ... }
>
> the compiler now can't tell which one should be called (vis-a-vis the prior example).

Why does it matter which method is chosen?
Presumably they both do the same thing?
If not, why not? (Bad design?)

In the event that it does matter (I cannot think of one), a cast is the best way to select the 'right' method, if only because the next guy to look at the code will know exactly what you intended.

<snip>

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
September 06, 2004
In article <opsdvuwlje5a2sq9@digitalmars.com>, Regan Heath says...
>On Fri, 3 Sep 2004 10:51:58 -0700, antiAlias <fu@bar.com> wrote:

>> Unfortunately, all this implicit conversion conflicts badly with method resolution. For instance, if I have two methods:
>>
>> # myFunc (char[] string) { ... }
>> # myFunc (wchar[] string) { ... }
>>
>> the compiler now can't tell which one should be called (vis-a-vis the prior example).
>
>Why does it matter which method is chosen?

I think antiAlias was referring to current behavior, not future possible behavior. If/when we have implicit run-time conversion between the three D string types, it won't matter. But right now, we only have implicit compile-time conversion for string constants, and this conflicts with method resolution.

AntiAlias needs two separate functions because we don't have implicit run-time conversion. He needs two separate functions because, if the parameters to myFunc are not compile-time-constants, they won't be implicitly converted.

The fix is, as you say, implicit run-time conversion. Then only one function would be required. (Though you might want to provide two or even three for efficiency - hopefully the compiler would be smart enough to call the right overload if it can avoid a conversion).
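
For illustration only: a sketch of the compile-time versus run-time distinction, assuming a present-day D compiler and Phobos' `std.utf.toUTF16`. The function name `countUnits` is hypothetical. A literal is converted by the compiler to fit the parameter, while a run-time value has to be converted explicitly.

```d
import std.utf : toUTF16;

// Hypothetical function that only accepts UTF-16 text.
size_t countUnits(wstring s) { return s.length; }

void main()
{
    // A literal has no fixed width yet, so the compiler stores it as UTF-16.
    assert(countUnits("abc") == 3);

    // A variable already has a type; there is no implicit string -> wstring
    // conversion, so passing it directly does not compile.
    string runtimeValue = "abc";
    static assert(!__traits(compiles, countUnits(runtimeValue)));

    // The conversion has to be requested explicitly at run time.
    assert(countUnits(toUTF16(runtimeValue)) == 3);
}
```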

Jill


September 06, 2004
Unfortunately, it's not quite that simple. When the compiler sees more than one potentially matching method, it throws an error; implicit conversions tend to expose more than one match. Perhaps the compiler can be made a bit smarter in this respect, but there's often subtlety involved ~ keep in mind that a goal of the D compiler is ease of implementation. It's usually better to provide a clean mechanism to /support/ the conversion rather than try to second-guess intent. However, using a cast() is not, IMO, the right way.


"Arcane Jill" <Arcane_member@pathlink.com> wrote in message
news:chh2ss$13j6$1@digitaldaemon.com...
In article <opsdvuwlje5a2sq9@digitalmars.com>, Regan Heath says...
>On Fri, 3 Sep 2004 10:51:58 -0700, antiAlias <fu@bar.com> wrote:

>> Unfortunately, all this implicit conversion conflicts badly with method resolution. For instance, if I have two methods:
>>
>> # myFunc (char[] string) { ... }
>> # myFunc (wchar[] string) { ... }
>>
>> the compiler now can't tell which one should be called (vis-a-vis the
>> prior example).
>
>Why does it matter which method is chosen?

i think antiAlias was referring to current behavior, not future possible behavior. If/when we have implicit run-time conversion between the three D string types, it won't matter. But right now, we only have implicit compile-time conversion for string constants, and this conflicts with method resolution.

AntiAlias needs two separate functions because we don't have implicit run-time conversion. He needs two separate functions because, if the parameters to myFunc are not compile-time-constants, they won't be implicitly converted.

The fix is, as you say, implicit run-time conversion. Then only one function would be required. (Though you might want to provide two or even three for efficiency - hopefully the compiler would be smart enough to call the right overload if it can avoid a conversion).

Jill



September 06, 2004
On Mon, 6 Sep 2004 07:16:44 +0000 (UTC), Arcane Jill <Arcane_member@pathlink.com> wrote:
> In article <opsdvuwlje5a2sq9@digitalmars.com>, Regan Heath says...
>> On Fri, 3 Sep 2004 10:51:58 -0700, antiAlias <fu@bar.com> wrote:
>
>>> Unfortunately, all this implicit conversion conflicts badly with method
>>> resolution. For instance, if I have two methods:
>>>
>>> # myFunc (char[] string) { ... }
>>> # myFunc (wchar[] string) { ... }
>>>
>>> the compiler now can't tell which one should be called (vis-a-vis the
>>> prior example).
>>
>> Why does it matter which method is chosen?
>
> i think antiAlias was referring to current behavior, not future possible
> behavior.

I know. Currently it conflicts; I was suggesting a fix (by asking leading questions): the compiler could simply take the first one it finds. For example, if you have:

myFunc("hello world");

the constant string has no specific type, so it could potentially match any of the 3 methods. However, it doesn't matter which method it matches, as they all (presumably) do the same thing. (*)

If you instead have:

char[] a;
myFunc(a);

then it should match the char[] one. If it cannot find that method, it should implicitly convert to the one it does find, either wchar[] or dchar[]; it doesn't matter which, as (presumably) they all do the same thing. (*)

If a specific method is desired for some reason, then a cast appeals to me as the right solution: using it explicitly tells the next programmer to look at the code what you intended.

(*) Can anyone think of a potential bug this could cause? To me this implicit conversion idea seems identical to the implicit conversion that happens between long, int, short etc. during method resolution.

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
September 06, 2004
"Regan Heath" <regan@netwin.co.nz> wrote in message
(*) Can anyone think of a potential bug this could cause? to me this
implicit conversion idea seems identical to the implicit conversion that
happens between long,int,short etc during method resolution.

Yep; you pointed out the issue yourself, Regan (in the "Alias Peek-a-Boo Game") ~ implicit conversion of primitive types is the reason why all those wacky inherited-override edge-condition-examples exist, why the method-name resolution is castrated, and why method aliasing was 'invented' to cover it all up. That's a band-aid on top of a band-aid, and what you're suggesting is that the notion be extended to array-content also. I have to agree with the principle of consistency, but consistently broken is hardly an ideal.

Implicit conversion might look fine & dandy, but its beauty is truly skin-deep only. It simply does not agree with overloading, particularly where inheritance is involved.



September 06, 2004
On Mon, 6 Sep 2004 15:52:36 -0700, antiAlias <fu@bar.com> wrote:
> "Regan Heath" <regan@netwin.co.nz> wrote in message
> (*) Can anyone think of a potential bug this could cause? to me this
> implicit conversion idea seems identical to the implicit conversion that
> happens between long,int,short etc during method resolution.
>
> Yep; you pointed out the issue yourself, Regan (in the "Alias Peek-a-Boo
> Game") ~ implicit conversion of primitive types is the reason why all those wacky inherited-override edge-condition-examples exist, why the method-name resolution is castrated, and why method aliasing was 'invented' to cover it all up. That's a band-aid on top of a band-aid, and what you're suggesting is that the notion be extended to array-content also. I have to agree with the principal of consistency, but consistently broken is hardly an ideal.
>
> Implicit conversion might look fine & dandy, but its beauty is truly
> skin-deep only. It simply does not agree with overloading, particularly
> where inheritance is involved.

The name resolution rules are identical to those in C/C++ and Walter has given reasons for them being that way.

I would personally prefer a perfect solution to the problem, one where everyone is happy. Given that no-one has voiced one yet, I am happy with the status quo.

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
September 07, 2004
On Fri, 3 Sep 2004 10:51:58 -0700, antiAlias wrote:

> There was some commentary about the benefits of auto-conversion between char[], wchar[] and dchar[], where the compiler will automatically convert array literals as it sees fit. So, for example, if you have a method:
> 
> # void myFunc (wchar[] string) {...}
> 
> and call it with:
> 
> # myFunc ("a char array");
> 
> the compiler will convert the literal to a wchar[] instead of a char[]. This is apparently considered a GoodThing. Recently, there's been a lot of talk regarding additional automatic conversion, between UTF8 and its wchar[] and dchar[] representations.
> 
> Unfortunately, all this implicit conversion conflicts badly with method resolution. For instance, if I have two methods:
> 
> # myFunc (char[] string) { ... }
> # myFunc (wchar[] string) { ... }
> 
> the compiler now can't tell which one should be called (vis-a-vis the prior example). To get around this, one has to cast the string literal like so:
> 
> # myFunc (cast(wchar[]) "a char array");
> # myFunc (cast(char[]) "a char array");
> 
> Ugly. What some folks do to get around this is to add different method names for the same functionality, a la Win32:
> 
> # writeString (char[] x);
> # writeStringW (wchar[] x);
> 
> They are forced into this approach to avoid having to use those ugly casts everywhere. This is what Streams.d does, along with others. Okay, so some might ask why this is a problem? Well, there are certain method names that are fixed in stone by the compiler, and you can't add a suffix even if you wanted to:
> 
> # class MyClass
> # {
> #     this (char[] initialContent) { ... }
> #     this (wchar[] initialContent) { ... }
> # }
> 
> See the problem? The compiler cannot resolve which constructor to use when you write
> 
> # new MyClass ("blah blah blah");
> 
> Instead, one is forced to use a cast:
> 
> # new MyClass (cast(char[]) "blah blah blah");
> 
> One can hardly add a "W" suffix to the keyword "this". The same thing happens for operator overloads too. One way around this is to introduce string prefixes, such as w"this is a wchar string". This has been suggested before, and it would certainly get rid of those ugly casts (I was under the misguided impression that casts should not be used on a general basis). BTW: the w"string" prefix is not a cast; it's a storage-attribute. I'd like to suggest such prefixes be supported and adopted.
> 
> Anyway; the reason for the post is to make folk aware of the kinds of problems introduced when a compiler performs implicit conversions. This is bound to become more troublesome if additional "convenience" conversions occur implicitly within the D language, such as the oft discussed UTF8 conversions.
> 
> Please consider.

So in the general case, the compiler has to deal with literals in the source code text. Unless otherwise specified, the compiler needs to work out how to store and use the literal data.

D already has suffixes for integer and floating point literals, so the concept of decorating a literal to express its storage type is not new.

I think that a similar decoration (suffix?) for string literals makes a lot of sense.

I also note an inconsistency in DMD. As you point out

 # myFunc (char[] string) { ... }
 # myFunc (wchar[] string) { ... }
 # myFunc( "abc" );

will cause a compiler error because it can't work out the type of literal you are using, but ...

 # myFunc (long string) { ... }
 # myFunc (int string) { ... }
 # myFunc( 2 );

does not cause an error and the compiler actually selects the correct 'myFunc' to use.

If you argue that in the second case, '2' is always an int and can never be a long, then why can't "abc" always be a char[] and never a wchar[]? But of course this breaks down because if 'myFunc(int)' was not defined then the literal '2' is assumed to be a long and not an int.

Inconsistent and not documented, I think.
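
For illustration only: a sketch of the integer-literal behaviour described above, assuming a present-day D compiler. The names `pick` and `onlyLong` are hypothetical. An undecorated `2` is an exact match for the `int` overload, the `L` suffix selects the `long` overload, and when only a `long` overload exists the literal is converted implicitly.

```d
// Hypothetical overload pair mirroring myFunc(long) / myFunc(int).
string pick(int x)  { return "int"; }
string pick(long x) { return "long"; }

// Hypothetical function with no int overload at all.
string onlyLong(long x) { return "long"; }

void main()
{
    // The undecorated literal is an exact match for the int overload.
    assert(pick(2) == "int");

    // The existing L suffix decorates the literal as a long.
    assert(pick(2L) == "long");

    // With no int overload available, the literal converts to long.
    assert(onlyLong(2) == "long");
}
```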

-- 
Derek
Melbourne, Australia
7/Sep/04 9:50:33 AM
September 07, 2004
"Regan Heath" <regan@netwin.co.nz> wrote in message
The name resolution rules are identical to those in C/C++ and Walter has
given reasons for them being that way.

====================

That doesn't mean it's the best way to do it. Or even that it's a /good/ approach. I rather suspect the driving force behind all that was backward compatibility with existing C implicit conversions (rather than make such conversion-usage be explicit, for a more strongly-typed language). Note that D claims to be more strongly-typed ...

Heck; Dennis Ritchie even noted that while C types were influenced heavily by Algol-68, the designers of the latter would hardly approve of the type-implementation. I'd wager that he was talking about implicit conversions <g>

Of course, that doesn't mean D has to follow that same old route, since it doesn't have to compile raw, poorly written, C code. Nor does it mean we have to /like/ the current name-resolution implementation :-)

This name-resolution issue is one of the few things about D that feels flat-out wrong; especially when other modern languages are not bothered by such arcane nonsense. The one common thing about any design that makes my skin crawl is the "band-aid". I doubt very much that I'm alone in that respect.


