November 14, 2018
On Wednesday, 14 November 2018 at 18:11:59 UTC, Neia Neutuladh wrote:
> On Tue, 13 Nov 2018 20:27:05 -0800, Walter Bright wrote:
>> There have been various attempts over the years to "fix" various things in the D matching system by adding "just one more" match level.
>
> I kind of feel like, if something would be confusing like this, maybe the compiler shouldn't be making an automatic decision. Not "just one more" match level, but just...don't match. If there are multiple matching overloads, just error out. Don't try to be clever and surprise people, just tell the user to be more explicit.

That type of behavior is best left to the programmer defining the public interface.

-Alex
November 14, 2018
On Wed, Nov 14, 2018 at 06:59:30PM +0000, Carl Sturtivant via Digitalmars-d-announce wrote:
> On Monday, 12 November 2018 at 10:05:09 UTC, Jonathan M Davis wrote:
> > *sigh* Well, I guess that's the core issue right there. A lot of us would strongly disagree with the idea that bool is an integral type and consider code that treats it as such as inviting bugs. We _want_ bool to be considered as being completely distinct from integer types. The fact that you can ever pass 0 or 1 to a function that accepts bool without a cast is a problem in and of itself.

+1.

Honestly, I think 'bool' as understood by Walter & Andrei ought to be renamed to 'bit', i.e., a numerical, rather than logical, value.

Of course, that still doesn't address the conceptually awkward behaviour of && and || returning a numerical value rather than a logical true/false state.

The crux of the issue is whether we look at it from an implementation POV, or from a conceptual POV.  Since there's a trivial 1-to-1 mapping from a logical true/false state to a binary digit, it's tempting to conflate the two, but they are actually two different things. It just so happens that in D, a true/false state is *implemented* as a binary value of 0 or 1.  Hence, if you think of it from an implementation POV, it sort of makes sense to treat it as a numerical entity, since after all, at the implementation level it's just a binary digit, a numerical entity. However, if you look at it from a conceptual POV, the mapping true=>1, false=>0 is an arbitrary one, and nothing about the truth values true/false entails an ability to operate on them as numerical values, much less promotion to multi-bit binary numbers like int.

I argue that viewing it from an implementation POV is a leaky abstraction, whereas enforcing the distinction of bool from integral types is more encapsulated -- because it hides away the implementation detail that a truth value is implemented as a binary digit.
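
To make the leak concrete, here's a minimal sketch (my own example, reflecting dmd's behavior as I understand it) of a logical value quietly behaving like a number:

```d
void main()
{
    bool t = true;

    // bool promotes to int in arithmetic: a logical value
    // suddenly behaves like the number 1.
    int sum = t + t;    // compiles without complaint
    assert(sum == 2);

    // comparison results are bools, yet they can be summed
    // like any other small integers.
    int count = (1 < 2) + (3 < 4);
    assert(count == 2);
}
```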

It's a similar situation with char vs. ubyte: if we look at it from an implementation point of view, there is no need for the existence of char at all, since at the implementation level it's not any different from a ubyte.  But clearly, it is useful to distinguish between them, since otherwise why would Walter & Andrei have introduced distinct types for them in the first place?  The usefulness is that we can define char to be a UTF-8 code unit, with a different .init value, and this distinction lets the compiler catch potentially incorrect usages of the types in user code.  (Unfortunately, even here there's a fly in the ointment: char also implicitly converts to int -- again you see the symptoms of viewing things from an implementation POV, and the trouble that results, such as the wrong overload being invoked when you pass a char literal that, thanks to VRP, magically becomes an integral value.)
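
A small sketch of the char leak (again my own example, only showing the uncontroversial conversions):

```d
void main()
{
    char c = 'a';

    // char implicitly converts to int: the "code unit" abstraction
    // leaks its numeric representation with no cast required.
    int n = c;
    assert(n == 97);

    // a char literal silently participates in integer arithmetic.
    assert('a' + 1 == 98);
}
```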


> > But it doesn't really surprise me that Walter doesn't agree on that point, since he's never agreed on that point, though I was hoping that this DIP was convincing enough, and its failure is certainly disappointing.

I am also disappointed.  One of the reasons I like D so much is its powerful abstraction mechanisms, and the ability of user types to behave (almost) like built-in types.  This conflation of bool with its implementation as a binary digit seems antithetical to abstraction and encapsulation, and frankly leaves a bad taste in the mouth. (Though I will concede that it's a minor enough point that it wouldn't be grounds for deserting D.)


> I'm at a loss to see any significant advantage to having bool as a part of the language itself if it isn't deliberately isolated from `integral types`.

Same thing with implicit conversion to/from char types and integral types.  I understand the historical / legacy reasons behind both cases, but I have to say it's rather disappointing from a modern programming language design point of view.


T

-- 
Written on the window of a clothing store: No shirt, no shoes, no service.
November 14, 2018
On Wednesday, 14 November 2018 at 02:45:38 UTC, Walter Bright wrote:
> On 11/13/2018 3:29 PM, Rubn wrote:
> > enum A : int { a = 127 }
>
> `a` is a manifest constant of type `A` with a value of `127`.
>
> Remember that `A` is not an `int`. It is implicitly convertible to an integer type that its value will fit in (Value Range Propagation). Other languages do not have VRP, so expectations from how those languages behave do not apply to D. VRP is a nice feature, it is why:
>
>     enum s = 100;     // typed as int
>     enum t = 300;     // also typed as int
>     ubyte u = s + 50; // works, no cast required,
>                       // although the type is implicitly converted
>     ubyte v = t + 50; // fails
>
> In your articles, it is crucial to understand the difference between a manifest constant of type `int` and one of type `A`.

Can you at least understand where the problem lies? If you have code like this:

foo(Enum.value);

Then it gets changed:

// oops: might be calling a different function now
foo(runtimeCond ? Enum.value : Enum.otherValue);

Or how about if we just add another enum to our list:

enum Enum : int {
    // ...
    // add new enum here, shifting the values down
    value,      // 126 -> 127
    otherValue, // 127 -> 128 - oops, now we are calling a different function ~somewhere~
    // ...
}

From your implementation perspective I can see why it is a good thing. But from a user's perspective this just screams unreliable chaotic mess, even in the most trivial examples.

What D does is only suitable for the absolute most trivial example:

enum int s = 100;
ubyte v = s; // ok no cast required

But with even a slightly less trivial example, like the one we have now, it falls apart:

enum int s = 100;

void foo(int);
void foo(byte);

foo(s); // Not suitable for determining overloads,
        // though it is fine for variable initialization

No one's really asking to add another layer to anything, merely to stop treating named enum types as if they were just constants like anonymous enums.

ubyte a = Enum.value; // this is ok
foo(Enum.value);      // this needs to be x1000 more reliable
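
Spelled out as a compilable sketch (names made up; the asserts only cover the uncontested cases, since the overload actually picked for foo(Enum.value) is exactly what's in dispute):

```d
enum Enum : int { value = 127 }

string foo(int)   { return "int"; }
string foo(ubyte) { return "ubyte"; }

void main()
{
    // Narrowing a named enum constant at initialization is fine:
    // the value is known at compile time and fits in a ubyte.
    ubyte a = Enum.value;
    assert(a == 127);

    // Overload selection is where it bites: because 127 fits in a
    // ubyte, VRP lets Enum.value match foo(ubyte) rather than the
    // overload for its declared base type. A cast makes it explicit:
    assert(foo(cast(int)Enum.value) == "int");
}
```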

November 14, 2018
On Wed, 14 Nov 2018 13:40:46 -0500, Steven Schveighoffer wrote:
> You don't think this is confusing?
> 
> enum A : int {
>      val
> }
> 
> A a;
> foo(a); // error: be more specific
> int x = a;
> foo(x); // Sure

I find this confusing:

    void foo(int i) {}
    void foo(ubyte b) {}
    enum A : int { val = 0 }
    foo(A.val);  // calls foo(ubyte)
    A a = A.val;
    foo(a);      // calls foo(int)

If it instead produced an error, the error would look like:

    Error: foo called with argument types (A) matches both:
    example.d(1): foo(int i)
    and:
    example.d(2): foo(ubyte b)

Or else:

    Error: none of the overloads of foo are callable using
    argument types (A), candidates are:
    example.d(1): foo(int i)
    example.d(2): foo(ubyte i)

Neither of these is the intuitively obvious thing to me, but they're not going to surprise me by calling the wrong function, and there are obvious ways to make the code work as I want. Of the two, I'd prefer the former.

The intuitively obvious thing for me is:

* Don't use VRP to select an overload. Only use it if there's only one
candidate with the right number of arguments.
* Don't use VRP if the argument is a ctor, cast expression, or symbol
expression referring to a non-builtin. Maybe disallow with builtins.
* Don't use VRP if the argument is a literal with explicitly indicated type
(0UL shouldn't match to byte, for instance).

I think this would make things more as most people expect:

    foo(A.val);  // A -> int, but no A -> byte; calls foo(int)
    foo(0);      // errors (currently calls foo(int))
    foo(0L);     // errors (currently calls foo(ubyte))
    foo(cast(ulong)0);  // errors (currently calls foo(ubyte))

And when there's only one overload:

    void bar(byte b) {}
    bar(A.val);  // errors; can't convert A -> byte
    bar(0);      // type any-number and fits within byte, so should work
    bar(0UL);    // errors; explicit incorrect type
    bar(0UL & 0x1F);    // bitwise and expression can do VRP
    bar("foo".length);  // length is a builtin; maybe do VRP?
    bar(byte.sizeof);   // sizeof is a builtin; maybe do VRP?
November 15, 2018
On 11/14/18 4:32 PM, Neia Neutuladh wrote:
> On Wed, 14 Nov 2018 13:40:46 -0500, Steven Schveighoffer wrote:
>> You don't think this is confusing?
>>
>> enum A : int {
>>       val
>> }
>>
>> A a;
>> foo(a); // error: be more specific
>> int x = a;
>> foo(x); // Sure
> 
> I find this confusing:
> 
>      void foo(int i) {}
>      void foo(ubyte b) {}
>      enum A : int { val = 0 }
>      foo(A.val);  // calls foo(ubyte)
>      A a = A.val;
>      foo(a);      // calls foo(int)
> 
> If it instead produced an error, the error would look like:
> 
>      Error: foo called with argument types (A) matches both:
>      example.d(1): foo(int i)
>      and:
>      example.d(2): foo(ubyte b)

I'm reminded of my son: sometimes when he asks for a glass of water and I hand it to him, he asks, "should I drink this?"

To me, making the user jump through these hoops will be insanely frustrating.

> 
> Or else:
> 
>      Error: none of the overloads of foo are callable using
>      argument types (A), candidates are:
>      example.d(1): foo(int i)
>      example.d(2): foo(ubyte i)
> 
> These aren't the intuitively obvious thing to me, but they're not going to
> surprise me by calling the wrong function, and there are obvious ways to
> make the code work as I want. Of the two, I'd prefer the former.

I prefer the correct version, which calls foo(int). I truly think this is a misapplication of the overload rules, and can be fixed.

> The intuitively obvious thing for me is:
> 
> * Don't use VRP to select an overload. Only use it if there's only one
> candidate with the right number of arguments.
> * Don't use VRP if the argument is a ctor, cast expression, or symbol
> expression referring to a non-builtin. Maybe disallow with builtins.
> * Don't use VRP if the argument is a literal with explicitly indicated type
> (0UL shouldn't match to byte, for instance).

To me, an enum based on int is an int before it's a typeless integer value. VRP should be trumped by type (which it normally is). I would say an enum *derives* from int, and is a more specialized form of int. It is not derived from byte or ubyte. For it to match those overloads *over* int is surprising.

Just like you wouldn't expect the `alias this` inside a class to match an overload over its base class.

Oh, crap, it actually does... But I think that's a different bug.

> 
> I think this would make things more as most people expect:
> 
>      foo(A.val);  // A -> int, but no A -> byte; calls foo(int)
>      foo(0);      // errors (currently calls foo(int))

If you have foo(int) and for some reason you can't call it with foo(0), nobody is going to expect or want that.

>      foo(0L);     // errors (currently calls foo(ubyte))
>      foo(cast(ulong)0);  // errors (currently calls foo(ubyte))

I'm actually OK with this being the way it is, or even if it called the foo(int) version. Either way, as long as there is a clear definition of why it does that.

> 
> And when there's only one overload:
> 
>      void bar(byte b) {}
>      bar(A.val);  // errors; can't convert A -> byte
>      bar(0);      // type any-number and fits within byte, so should work
>      bar(0UL);    // errors; explicit incorrect type
>      bar(0UL & 0x1F);    // bitwise and expression can do VRP
>      bar("foo".length);  // length is a builtin; maybe do VRP?
>      bar(byte.sizeof);   // sizeof is a builtin; maybe do VRP?
> 

I am OK with VRP calls, even in the case of the enum when there's no int overload.

-Steve