November 05, 2018
On Monday, November 5, 2018 2:49:32 PM MST bachmeier via Digitalmars-d wrote:
> On Monday, 5 November 2018 at 21:11:27 UTC, Jonathan M Davis
>
> wrote:
> > I don't know how reasonable it is to fix it at this point, much as I would love to see it fixed.
>
> It's hard for me to see how it would be reasonable to not fix it. This is one of those ugly parts of the language that need to be evolved out of the language. If there's a reason to support this, it should be done with a compiler switch. I'm pretty sure that this was one of the weird things that hit me when I started with the language, it was frustrating, and it didn't make a good impression.

It really comes down to what code would break due to the change, how that code breakage could be mitigated, what the transition process would look like, and how Walter views the issue at this point. Historically, he has not seen this as a problem like many of us have, but his views have evolved somewhat over time. However, he's also become far more wary of breaking code with language changes.

If this change were proposed in a DIP, a clean transition would be required, and if a compiler flag were required, I don't know that it would ever happen. I don't recall a single transition that has required a compiler switch in D that has ever been completed. Some that have had a compiler switch to get the old behavior back have worked, but stuff like -property or -dip25 have never reached the finish line. -dip25 may yet get there given how it's tied into -dip1000, and I expect that Walter will find a way to push -dip1000 through given its importance for @safe, but it's still an open question how on earth we're going to transition to DIP 1000 being the normal behavior given how big a switch that is.

So, if someone can figure out how to cleanly transition behavior to get rid of the implicit conversion between character types and integer types without needing a -dipxxx switch to enable the new behavior, and they can argue it well enough to convince Walter, then we may very well get there, but otherwise, I expect that we're stuck. Either way, I think that we have to see how https://github.com/dlang/DIPs/blob/master/DIPs/DIP1015.md does first. If _that_ can't get through, I don't think that a DIP to fix implicit conversions and char stands a chance. I'm currently expecting that DIP to be accepted, but you never know.

- Jonathan M Davis



November 05, 2018
On 11/5/18 4:11 PM, Jonathan M Davis wrote:
> On Monday, November 5, 2018 9:31:56 AM MST H. S. Teoh via Digitalmars-d
> wrote:
>> On Mon, Nov 05, 2018 at 03:58:40PM +0000, Adam D. Ruppe via Digitalmars-d
> wrote:
>>> On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
>>>> I believe that following code shouldn't even compile, but it does
>>>> and gives non-printable symbol appended at the end of string.
>>>
>>> Me too, this is a design flaw in the language. Following C's example,
>>> int and char can convert to/from each other. So string ~ int will
>>> convert int to char (as in reinterpret cast) and append that.
>>>
>>> It is just the way it is, alas.
>>
>> I have said before, and will continue to say, that I think implicit
>> conversion between char and non-char types in D does not make sense.
>>
>> In C, converting between char and int is very common because of the
>> conflation of char with byte, but in D we have explicit types for byte
>> and ubyte, which should take care of any of those kinds of use cases,
>> and char is explicitly defined to be a UTF8 code unit.  Now sure, there
>> are cases where you want to get at the numerical value of a char --
>> that's what cast(int) and cast(char) is for.  But *implicitly*
>> converting between char and int, especially when we went through the
>> trouble of defining a separate type for char that stands apart from
>> byte/ubyte, does not make any sense to me.
>>
>> This problem is especially annoying with function overloads that take
>> char vs. byte: because of implicit conversion, often the wrong overload
>> ends up getting called WITHOUT ANY WARNING.  Once, while refactoring
>> some code, I changed a representation of an object from char to a byte
>> ID, but in order to do the refactoring piecemeal, I needed to overload
>> between byte and char so that older code will continue to compile while
>> the refactoring is still in progress.  Bad idea.  All sorts of random
>> problems and runtime crashes happened because C's stupid int conversion
>> rules were liberally applied to D types, causing a gigantic mess where
>> you never know which overload will get called. (Well OK, it's
>> predictable if you sit down and work it out, but it's just plain
>> annoying when a lousy char literal calls the byte overload whereas a
>> char variable calls the char overload.)  I ended up having to wrap the
>> type in a struct just to stop the implicit conversion from tripping me
>> up.
> 
> +1
> 
> Unfortunately, I don't know how reasonable it is to fix it at this point,
> much as I would love to see it fixed. Historically, I don't think that
> Walter could have been convinced, but based on some of the stuff he's said
> in recent years, I think that he'd be much more open to the idea now.
> However, even if he could now be convinced that ideally the conversion
> wouldn't exist, I don't know how easy it would be to get a DIP through when
> you consider the potential code breakage. But maybe it's possible to do it
> in a smooth enough manner that it could work - especially when many of the
> kind of cases where you might actually _want_ such a conversion already
> require casting anyway thanks to the rules about integer promotions and
> narrowing conversions (e.g. when adding or subtracting from chars).
> Regardless, it would have to be well-written DIP with a clean transition
> scheme. Having that DIP on removing the implicit conversion of integer and
> character literals to bool be accepted would be a start in the right
> direction though. If that gets rejected (which I sure hope that it isn't),
> then there's probably no hope for a DIP fixing the char situation.

It's not just ints to chars, but chars to wchars or dchars, and wchars to dchars.

Basically, a character type should not convert from any other type. Period. Because it's not "just a number" in a different format.
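For illustration, here is a minimal sketch of the conversions in question, assuming present-day compiler behavior as described in this thread:

```d
void main()
{
    char c = 'a';
    wchar w = c;       // char -> wchar: implicit, compiles today
    dchar d = w;       // wchar -> dchar: implicit, compiles today

    char fromInt = 65; // int constant -> char: implicit when the value fits
    assert(fromInt == 'A');
}
```

All four of these are exactly the conversions the proposal would disallow.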

Do we need a DIP? Probably. But we have changed these types of things in the past from what I remember (I seem to recall we had at one point implicit truncation for adding 2 smaller numbers together). It is still possible to fix.

-Steve
November 05, 2018
On Monday, November 5, 2018 3:08:24 PM MST 12345swordy via Digitalmars-d wrote:
> We need to avoid the situation where we have to create a DIP for
> every unwanted implicit conversion that causes the wrong overload
> to be called; we need a better way of doing this. No one wants to
> wait a year for DIP approval for something as minor as deprecating
> an implicit conversion for native data types.
>
> I think a better course of action is to introduce the keywords explicit and implicit. Not as attributes though! I don't want to see functions with @nogc @nothrow safe pure @explicit as that is too much verbiage and hard to read! Which brings up the question of which parameter exactly is explicit?
>
> It is much easier to read: void example(int bar, explicit int bob)
>
> The explicit keyword will become very important if we are to introduce the implicit keyword, as both of them are instrumental in creating types with structs.
>
> I don't mind writing a DIP regarding this, as I think it is much easier for this DIP to be accepted than the other one that I currently have.

This really shouldn't be decided on a per function basis. It's an issue on the type level and should be fixed with the types themselves. The OP's problem didn't even happen with a function. It happened with a built-in operator.

Regardless, if you attempt to add keywords to the language at this point, you will almost certainly lose. I would be _very_ surprised to see Walter or Andrei go for it. Whether you think attributes are easy to read or not, they don't eat up an identifier, and Walter and Andrei consider that to be very important. AFAIK, they also don't consider attributes to be a readability problem. So, even if trying to add some sort of implicit or explicit marker to parameters made sense (and I really don't think that it does), I think that Walter and Andrei have made it pretty clear that that sort of thing would have to be an attribute and not a keyword.

And honestly, I think that any DIP trying to add general control over implicit and explicit conversions in the language has a _way_ lower chance of being accepted than one that gets rid of implicit conversions between character types and integer types. However, in the end, one does not depend on the other or even really have much to do with the other. A DIP to fix the implicit conversions between character types and integer types would be a DIP to fix precisely that, whereas a DIP to mark parameters with implicit or explicit would be about trying to control implicit or explicit conversions in general and not about character or integer types specifically, so while they might be tangentially related, they're really separate issues.

Given the recent DIP on copy constructors and the discussion there, it would not surprise me to see a future DIP about adding @implicit to constructors to allow for implicit construction, though I don't know how likely it is for such a DIP to be accepted given that D's approach (outside of built-in types anyway) has generally been to avoid implicit conversions to reduce the risk of bugs, and when combined with alias this, things really start to get interesting. But I would think that the chances of that getting accepted are far greater than adding attributes to parameters (be they keywords or actual attributes). Regardless, that's an issue of conversions in general, and not just implicit conversions between character types and integer types, which is really what the discussion is about fixing here, and that can be fixed regardless of what happens with providing additional control over implicit conversions in general.

- Jonathan M Davis



November 05, 2018
On Mon, Nov 05, 2018 at 05:43:19PM -0500, Steven Schveighoffer via Digitalmars-d wrote: [...]
> It's not just ints to chars, but chars to wchars or dchars, and wchars to dchars.
> 
> Basically a character type should not convert from any other type. period.  Because it's not "just a number" in a different format.

+1.  I recall having this conversation before.  Was this ever filed as a bug?  I couldn't find it this morning when I tried to look.


> Do we need a DIP? Probably. but we have changed these types of things in the past from what I remember (I seem to recall we had at one point implicit truncation for adding 2 smaller numbers together). It is possible to still fix.
[...]

If it's possible to fix, I'd like to see it fixed.  So far, I don't recall hearing anyone strongly oppose such a change; all objections appear to be only coming from the fear of breaking existing code.

Some things to consider:

- What this implies for the "if C code is compilable as D, it must have
  the same semantics" philosophy that Walter appears to be strongly
  insistent on.  Basically, anything that depends on C's conflation of
  char and (u)byte must either give an error, or give the correct
  semantics.

- The possibility of automatically fixing code broken by the change
  (possibly partial, leaving corner cases as errors to be handled by the
  user -- the idea being to eliminate the rote stuff and only require
  user intervention in the tricky cases).  This may be a good and simple
  use-case for building a tool that could do something like that.  This
  isn't the first time potential code breakage threatens an otherwise
  beneficial language change, where having an automatic source upgrade
  tool could alleviate many of the concerns.

- Once we start making a clear distinction between char types and
  non-char types, will char types still obey C-like int promotion rules,
  or should we consider discarding old baggage that's no longer so
  applicable to modern D?  For example, I envision that this DIP would
  make int + char or char + int illegal, but what should the result of
  char + char or char + wchar be?  I'm tempted to propose outright
  banning char arithmetic without casting, but for some applications
  this might be too onerous.  If we continue to follow C rules, char + char
  would implicitly promote to dchar, which arguably could be annoying.
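For reference, a quick check of what the rules actually do today with char + char (a sketch, assuming present-day compiler behavior):

```d
void main()
{
    char a = 'a';
    char b = 'b';
    auto sum = a + b; // C-style integer promotion kicks in
    static assert(is(typeof(sum) == int)); // the result is int today, not char or dchar
    assert(sum == 195); // 97 + 98
}
```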


T

-- 
"Computer Science is no more about computers than astronomy is about telescopes." -- E.W. Dijkstra
November 05, 2018
On Monday, November 5, 2018 4:14:18 PM MST H. S. Teoh via Digitalmars-d wrote:
> On Mon, Nov 05, 2018 at 05:43:19PM -0500, Steven Schveighoffer via Digitalmars-d wrote: [...]
>
> > It's not just ints to chars, but chars to wchars or dchars, and wchars to dchars.
> >
> > Basically a character type should not convert from any other type. period.  Because it's not "just a number" in a different format.
>
> +1.  I recall having this conversation before.  Was this ever filed as a bug?  I couldn't find it this morning when I tried to look.
>
> > Do we need a DIP? Probably. but we have changed these types of things in the past from what I remember (I seem to recall we had at one point implicit truncation for adding 2 smaller numbers together). It is possible to still fix.
>
> [...]
>
> If it's possible to fix, I'd like to see it fixed.  So far, I don't recall hearing anyone strongly oppose such a change; all objections appear to be only coming from the fear of breaking existing code.
>
> Some things to consider:
>
> - What this implies for the "if C code is compilable as D, it must have
>   the same semantics" philosophy that Walter appears to be strongly
>   insistent on.  Basically, anything that depends on C's conflation of
>   char and (u)byte must either give an error, or give the correct
>   semantics.

I'm pretty sure that the change would just result in more errors, so I don't think that it would cause problems on this front.

> - The possibility of automatically fixing code broken by the change
>   (possibly partial, leaving corner cases as errors to be handled by the
>   user -- the idea being to eliminate the rote stuff and only require
>   user intervention in the tricky cases).  This may be a good and simple
>   use-case for building a tool that could do something like that.  This
>   isn't the first time potential code breakage threatens an otherwise
>   beneficial language change, where having an automatic source upgrade
>   tool could alleviate many of the concerns.

An automatic tool would be nice, but I don't know that focusing on that would be helpful, since it would be making it seem like the amount of breakage was large, which would make the change seem less acceptable. Regardless, the breakage couldn't be immediate. It would have to be some sort of deprecation warning first - possibly similar to whatever was done with the integer promotion changes a few releases back, though I never understood what happened there.

> - Once we start making a clear distinction between char types and
>   non-char types, will char types still obey C-like int promotion rules,
>   or should we consider discarding old baggage that's no longer so
>   applicable to modern D?  For example, I envision that this DIP would
>   make int + char or char + int illegal, but what should the result of
>   char + char or char + wchar be?  I'm tempted to propose outright
>   banning char arithmetic without casting, but for some applications
>   this might be too onerous.  If we continue to follow C rules, char + char
>   would implicitly promote to dchar, which arguably could be annoying.

Well, as I understand it, the fact that char + char -> int is related to how the CPU works, and having it become char + char -> char would be a problem from that perspective. Having char + char -> dchar would also go against the whole idea that char is an encoding, because adding two chars together isn't necessarily going to get you a valid dchar. In reality though, I would expect reasonable code to be adding ints to char, because you're going to get stuff like x - 48 to convert ASCII numbers to integers. And honestly, adding two chars together doesn't even make sense. What does that even mean? 'A' + 'Q' does what? It's nonsense. Ultimately, I think that it would be too large a change to disallow it (and _maybe_ someone out there has some weird use case where it sort of makes sense), but I don't see how it makes any sense to actually do it. So, making it so that adding two chars together continues to result in an int makes the most sense to me, as does adding an int and a char (which is the operation that code is actually going to be doing). Code can then cast back to char (which is what it already has to do now anyway). It allows code to continue to function as it has (thus reducing how disruptive the changes are), but if we eliminate the implicit conversions, we eliminate the common bugs. I think that you'll get _far_ stronger opposition to trying to change the arithmetic operations than to changing the implicit conversions, and I also think that the gains are far less obvious.

So basically, I wouldn't advise mucking around with the arithmetic operations. I'd suggest simply making it so that implicitly converting between character types and any other type (unless explicitly defined by something like alias this) be disallowed.
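A small sketch of the pattern described above: the arithmetic already yields int, and narrowing back down to char is already explicit today.

```d
void main()
{
    char x = '7';
    int digit = x - '0'; // char arithmetic promotes to int; no cast needed here
    assert(digit == 7);

    // Getting back to char already requires a cast under current rules:
    char next = cast(char)(x + 1);
    assert(next == '8');
}
```

Removing only the implicit conversions would leave both of these lines compiling as-is.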

Given that you already have to cast with the arithmetic stuff (at least to get it back into char), I'm pretty sure the result would actually be that almost all of the code that would have to be changed as a result would be code that was either broken or a code smell, which would probably make it a lot easier to convince Walter to make the change.

- Jonathan M Davis



November 06, 2018
On Mon, 05 Nov 2018 15:14:08 -0700, Jonathan M Davis wrote:
> It really comes down to what code would break due to the change, how that code breakage could be mitigated, what the transition process would look like, and how Walter views the issue at this point.

Get a patch and I can make dubautotester run on it to see what breaks.

I originally intended to use it to determine which patches we could safely backport in order to construct more stable DMDFE versions without vastly increasing the amount of human work. Like, we get a patch against master; try to apply it against 2.080.1 and see if it (a) still works (b) compiles everything that 2.080.1 did (c) doesn't compile anything that 2.080.1 didn't.

Automatically apply it to the next patch version if all that passes. Automatically apply the patch to the next minor version if only (c) fails. Make a report for human intervention. Or something like that. The main issue is that it takes a *lot* of time to run these tests, so more than one patch per day would require me to set up parallelism and upgrade the build box.

Evaluating the effects of a proposal like this is pretty similar.
November 06, 2018
On Monday, 5 November 2018 at 23:00:23 UTC, Jonathan M Davis wrote:

> Regardless, if you attempt to add keywords to the language at this point, you will almost certainly lose. I would be _very_ surprised to see Walter or Andrei go for it.
What exactly makes you say that? I skimmed the old DIPs that were rejected on the wiki and on GitHub, and they seem to have been rejected for other reasons.
Is there a previous discussion that you (or others) can link to?

> Whether you think attributes are easy to read or not, they don't eat up an identifier, and Walter and Andrei consider that to be very important.
The feature that I have in mind needs to be a keyword, as it is heavily inspired by C#'s explicit and implicit keywords.
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/explicit
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/implicit
Even C++ has type conversions.
http://www.cplusplus.com/doc/tutorial/typecasting/
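For comparison, here is a rough sketch of what D offers today without any new keywords: alias this opts a struct into implicit conversion, while opCast permits only explicit conversion. The struct names are hypothetical, purely illustrative:

```d
struct LooseId
{
    int value;
    alias value this; // opts in to implicit conversion to int
}

struct StrictId
{
    int value;
    int opCast(T : int)() const { return value; } // explicit conversion only
}

void takesInt(int) {}

void main()
{
    auto a = LooseId(1);
    takesInt(a);           // compiles: implicit via alias this
    auto b = StrictId(2);
    // takesInt(b);        // would not compile: no implicit conversion
    takesInt(cast(int) b); // compiles: conversion is explicit
}
```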

Bear in mind, though, that I am still in the brainstorming and research process regarding this. Don't expect the details from me, as I haven't figured everything out yet.
> AFAIK, they also don't consider attributes to be a readability problem. So, even if trying to add some sort of implicit or explicit marker to parameters made sense (and I really don't think that it does), I think that Walter and Andrei have made it pretty clear that that sort of thing would have to be an attribute and not a keyword.
I don't want the implicit and explicit keywords to be attributes in the DIP I am going to write, unless I really have to in order to get the DIP approved by Walter and Andrei.

> And honestly, I think that any DIP trying to add general control over implicit and explict conversions in the language has a _way_ lower chance of being accepted than one that gets rid of implicit conversions between character types and integer types. However, in the end, one does not depend on the other or even really have much to do with the other. A DIP to fix the implicit conversions between character types and integer types would be a DIP to fix precisely that, whereas a DIP to mark parameters with implicit or explicit would be about trying to control implicit or explicit conversions in general and not about character or integer types specifically, so while they might be tangentially related, they're really separate issues.

That is a very good point. I'll consider that when writing the DIP.

> Given the recent DIP on copy constructors and the discussion there, it would not surprise me to see a future DIP about adding @implicit to constructors to allow for implicit construction, though I don't know how likely it is for such a DIP to be accepted given that D's approach (outside of built-in types anyway) has generally been to avoid implicit conversions to reduce the risk of bugs, and when combined with alias this, things really start to get interesting.
Gah, don't remind me of alias this. That can of worms has yet to be opened with multi alias this. Which, btw, is STILL not implemented yet! Hell, I will sign up for the next round of community fund projects just to implement the damn thing, because I am that impatient.

> But I would think that the chances of that getting accepted are far greater than adding attributes to parameters (be they keywords or actual attributes). Regardless, that's an issue of conversions in general, and not just implicit conversions between character types and integer types, which is really what the discussion is about fixing here, and that can be fixed regardless of what happens with providing additional control over implicit conversions in general.
>
> - Jonathan M Davis

Sure thing, though the DIP process for deprecation of small features shouldn't be that slow!

-Alex
November 05, 2018
On Monday, November 5, 2018 7:47:42 PM MST 12345swordy via Digitalmars-d wrote:
> On Monday, 5 November 2018 at 23:00:23 UTC, Jonathan M Davis
>
> wrote:
> > Regardless, if you attempt to add keywords to the language at this point, you will almost certainly lose. I would be _very_ surprised to see Walter or Andrei go for it.
>
> What exactly makes you say that? I had scan read the old dips
> that were rejected on the wiki and on the github, and it seems to
> be rejected for other reasons.
> Is there previous discussion that you(or others) can linked to?

I would have to go digging through the newsgroup history. It's come up on a number of occasions in various threads, and I couldn't say which at this point. But the very reason that we started putting @ on things in the first place was to avoid creating keywords. We did it before user-defined attributes were even a thing. And for years now, any time that adding any kind of attribute has been considered, it has always started with @. It has been years since anything involving adding a new keyword has gotten anywhere. Similarly, Walter has shot down the idea of using contextual keywords (which you can probably find discussions on pretty easily by searching the newsgroup history).

So, if you want to create a DIP that proposes adding implicit and explicit as keywords, feel free to do so, but from what I know of Walter and Andrei's position on the topic of keywords from what they've said in the newsgroup or in anything they've said in any discussions that I've had with them in person, they're not going to be interested in adding new keywords when adding an attribute that starts with @ will work, because adding keywords means restricting the list of available identifiers, whereas adding new attributes does not. I honestly do not expect that D2 will _ever_ get any additional keywords and would be very surprised if it ever did.

> Sure thing, though the DIP process for deprecation of small features shouldn't be that slow!

I won't claim that the DIP process couldn't or shouldn't be improved, but at least we now have a DIP process that actually works, even if it can be slow. With the old process, DIPs basically almost never went anywhere. A few did, but most weren't ever even seriously reviewed by Walter and Andrei. While it may not be perfect, the current process is an _enormous_ improvement.

- Jonathan M Davis



November 12, 2018
On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
> Hello to everyone! By mistake I typed some code like the following without using [std.conv: to] and got a strange result. I believe that the following code shouldn't even compile, but it does, and it gives a non-printable symbol appended at the end of the string.
> The same problem is encountered even without [enum]. Just using a plain integer value gives the same result. Is it a bug, or could someone really rely on this behaviour?
>
> import std.stdio;
>
> enum TestEnum: ulong {
>    Item1 = 2,
>    Item3 = 5
> }
>
> void main()
> {
>     string res = `Number value: ` ~ TestEnum.Item1;
>     writeln(res);
> }
>
> Output:
> Number value: 

Welp, with the recent rejection of DIP 1005, I don't see this being deprecated any time soon.

-Alex
November 12, 2018
On 11/12/18 3:01 PM, 12345swordy wrote:
> On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
>> Hello to everyone! By mistake I typed some code like the following without using [std.conv: to] and got a strange result. I believe that the following code shouldn't even compile, but it does, and it gives a non-printable symbol appended at the end of the string.
>> The same problem is encountered even without [enum]. Just using a plain integer value gives the same result. Is it a bug, or could someone really rely on this behaviour?
>>
>> import std.stdio;
>>
>> enum TestEnum: ulong {
>>    Item1 = 2,
>>    Item3 = 5
>> }
>>
>> void main()
>> {
>>     string res = `Number value: ` ~ TestEnum.Item1;
>>     writeln(res);
>> }
>>
>> Output:
>> Number value: 
> 
> Welp, with the recent rejection of DIP 1005, I don't see this being deprecated any time soon.
> 
> -Alex

If we deprecate that we also need to deprecate:

    string res = `Number value: ` ~ 65;

Not saying we shouldn't, just that there are many implications.
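For concreteness, 65 is the code unit for 'A', so that line compiles today and appends a character rather than the digits (a quick sketch of current behavior):

```d
void main()
{
    string res = `Number value: ` ~ 65;
    assert(res == "Number value: A"); // 65 is appended as the char 'A', not as "65"
}
```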


Andrei