Dealing with Autodecode (page 2) - D Programming Language Discussion Forum

June 01, 2016

Re: Dealing with Autodecode

Posted by tsbockman
in reply to Brad Roberts

On Wednesday, 1 June 2016 at 02:58:36 UTC, Brad Roberts wrote:
> ...the rate of bug fixing which exceeds the rate of fix pulling.

Speaking of which:
    https://github.com/dlang/phobos/pull/4345
    https://github.com/dlang/phobos/pull/3973

June 01, 2016

Re: Dealing with Autodecode

Posted by Kirill Kryukov
in reply to Adam D. Ruppe

On Wednesday, 1 June 2016 at 01:36:43 UTC, Adam D. Ruppe wrote:
> D USERS **WANT** BREAKING CHANGES THAT INCREASE OVERALL CODE QUALITY WITH A SIMPLE MIGRATION PATH!!!!!!!!!!!!!!!!!!!!

This.

I only recently started full-scale use of D, but I lurked here for years. D has a few quirks here and there, but overall it's a fantastic language. However, the biggest off-putting factor for me is the attitude of the leadership towards fixing the issues and completing the language.

The idea of autodecoding comes naturally to someone who has only recently discovered Unicode. Whoa, instead of code pages we now have "Unicode code points". Great. Only much later does the person realize that working with code points isn't always correct. So I don't blame anyone for designing/implementing autodecoding years ago. But not acknowledging now that autodecoding is seriously wrong looks like complete brain damage.

The entire community seems united in the view that autodecoding is both slow and usually wrong. The users are begging for this breaking change. There are a number of approaches to handling the deprecation. Even code that for some reason really needs to work with code points will benefit from explicitly stating that it needs code points. But no, we must endure this madness forever.

I realize that the priorities of a language user might be different from those of the language leadership. With autodecoding fixed (removed), the user gets a cleaner language. Their programs will run faster and be easier to reason about. The user's brain cycles are not wasted on useless crap like working around autodecoding.

On the other hand, the language/stdlib designer now has to admit their initial design was sub-optimal. Their books and articles are now obsolete. And they will be the ones who receive complaints from the inevitable few upset with the change.

However keeping the current situation means for me personally: 1. Not switching to D wholesale, but just toying with it. 2. Even when using D for work I don't want to talk about it to others. I was seriously thinking about starting a D-learning seminar at work, and I still might, but the thought that autodecoding is going to stay is cooling my enthusiasm.

I just did a numerical app in D, where it shines, I think. However, much of my work code deals with huge texts. I don't want to fight with autodecode at every step. I'd like arrays of chars to be arrays of chars without any magic crap auto-inserted behind my back. I don't want to become an expert in avoiding language pitfalls (the reason I abandoned C++ years ago). I also don't want to re-implement the staple string processing routines (though I might, if at least the language constructs worked without autodecode, which seems not to be the case here).

Think about it. 99% of code working with code points is _broken_ anyway (in the sense that the usual assumption is that a code point represents a character, while in fact it does not). When working with code units, the developer will notice the problem right away. When working with code points, the problem is not apparent until years later (essentially what happened to D itself).
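
To make the code point vs. character distinction concrete, here is a minimal sketch. The helper names (codeUnits, codePoints, graphemes) are made up for illustration; the Phobos calls are byCodeUnit from std.utf, walkLength from std.range and byGrapheme from std.uni.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

/// Number of UTF-8 code units (bytes) in s.
size_t codeUnits(string s) { return s.byCodeUnit.walkLength; }

/// Number of code points; walking a bare string autodecodes to dchar.
size_t codePoints(string s) { return s.walkLength; }

/// Number of grapheme clusters, the closest match to "characters".
size_t graphemes(string s) { return s.byGrapheme.walkLength; }

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT: displayed as one "é".
    string s = "e\u0301";

    assert(codeUnits(s) == 3);  // 1 byte for 'e', 2 bytes for U+0301
    assert(codePoints(s) == 2); // still not "one character"
    assert(graphemes(s) == 1);  // what the user actually sees
}
```

The code point count (2) matches neither the storage size nor what a reader would call one character, which is the trap described above.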

Feel free to ignore my non-D-core-dev comment. Even though I suspect many D users may agree with me. An even larger number of potential D users does not want autodecoding either.

Thanks,
Kirill

May 31, 2016

Re: Dealing with Autodecode

Posted by H. S. Teoh

On Tue, May 31, 2016 at 07:28:04PM -0700, Jonathan M Davis via Digitalmars-d wrote: [...]
> The other critical thing is to make sure that Phobos in general works with byDchar, byCodeUnit, etc. For instance, pretty much as soon as I started trying to use byCodeUnit instead of naked strings, I ran into this:
> 
> https://issues.dlang.org/show_bug.cgi?id=15800

This is an example of current Phobos code assuming (sometimes implicitly) that strings are ranges of dchar, which leads to subtle breakage like this one:

	https://issues.dlang.org/show_bug.cgi?id=15972


T

-- 
"640K ought to be enough" -- Bill G. (allegedly), 1984.
"The Internet is not a primary goal for PC usage" -- Bill G., 1995.
"Linux has no impact on Microsoft's strategy" -- Bill G., 1999.

May 31, 2016

Re: Dealing with Autodecode

Posted by H. S. Teoh
in reply to Stefan Koch

On Wed, Jun 01, 2016 at 12:56:03AM +0000, Stefan Koch via Digitalmars-d wrote:
> On Wednesday, 1 June 2016 at 00:46:04 UTC, Walter Bright wrote:
> > It is not practical to just delete or deprecate autodecode - it is too embedded into things.
>
> Which Things ?
>
> > The way to deal with it is to replace reliance on autodecode with .byDchar (.byDchar has a bonus of not throwing an exception on invalid UTF, but using the replacement dchar instead.)
> 
> > To that end, and this will be an incremental process:
> > ....
> 
> So does this mean we intend to carry the auto-decoding wart with us
> into the future. And telling everyone :
> "The oblivious way is broken we just have it for backwards
> compatibility ?"

If we can pull off what Walter proposed, it will put us one step closer to killing autodecode for good. Killing autodecode today is very drastic and unwise to do in one fell swoop. I see .byDchar as a first step.

First it's introduced as an optional feature so that people can start using it. We promote its usage everywhere.

Then we make it a deprecation to *not* use it, perhaps with a migration compiler switch so that people are not forced to migrate immediately, but they are warned beforehand.

After enough time elapses, the compiler switch becomes the default, with an option to disable it if the user so chooses.

Then after another while the switch is removed and using .byDchar becomes required.

Finally, autodecoding is relegated to the dustbin of history and there will be much rejoicing. :-P  I will personally savor every moment of pressing the delete-line command in my editor while making the PR to finally kill off the last of the autodecoding code.
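
For reference, a rough sketch of what opting in looks like at a call site today. The three function names are hypothetical; count comes from std.algorithm.searching, and byDchar and byCodeUnit from std.utf.

```d
import std.algorithm.searching : count;
import std.utf : byCodeUnit, byDchar;

/// Today's default: the bare string is silently decoded to dchar.
size_t countImplicit(string s, dchar c) { return s.count(c); }

/// Same iteration level, but stated explicitly at the call site.
size_t countCodePoints(string s, dchar c) { return s.byDchar.count(c); }

/// No decoding at all; sufficient when the needle is ASCII.
size_t countCodeUnits(string s, char c) { return s.byCodeUnit.count(c); }

void main()
{
    string s = "\u00E9t\u00E9, 2016"; // "été, 2016"

    assert(countImplicit(s, '\u00E9') == 2);
    assert(countCodePoints(s, '\u00E9') == 2);
    assert(countCodeUnits(s, ',') == 1);
}
```

Under the staged plan above, only the first form would eventually be flagged; the other two already say which level of iteration they mean.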

T

-- 
Famous last words: I *think* this will work...

June 01, 2016

Re: Dealing with Autodecode

Posted by default0
in reply to Adam D. Ruppe

On Wednesday, 1 June 2016 at 01:36:43 UTC, Adam D. Ruppe wrote:
> D USERS **WANT** BREAKING CHANGES THAT INCREASE OVERALL CODE QUALITY WITH A SIMPLE MIGRATION PATH!!!!!!!!!!!!!!!!!!!!

Agree with that very much.
Yes, you still have to think about cost/benefit for breaking changes, but in general when I sign up for D I expect it to throw out mistakes of the past so long as the correction of them is worth the cost of breakage.

So the cost of breakage for autodecoding is that the behaviour of roughly all string handling code changes. Now, most of this string handling code was broken to begin with, since VERY VERY VERY little string handling code ever cares about code points.
This means the amount of code that was not buggy before the change but becomes buggy after it is probably small.
The other cost of breakage is forcing users to go through potentially thousands of LoC and update their string handling code. Personally, I find that cost dramatically reduced if two prerequisites are met: compiler errors everywhere we have relied on the feature before (we can apparently do that, so check) and error/deprecation messages detailed enough to point to further reading so I can make meaningful decisions about it (we can also do that, I am sure, so check). If I just have to hop from one compiler error to the next and fix my broken code with confidence after having read about the context for 30-60 minutes, even going through vast amounts of code is not actually that big of a deal, since you really only have to inspect a fraction of it (the fraction the compiler tells you about).
Another cost is that unmaintained 3rd party libraries will stop compiling on recent compiler versions once we actually make the change the default. I suppose a tool could be made that tracks the specific compiler errors and simply inserts .byDchar to make the code "just work" exactly the way it used to work (i.e. unreliably, slowly and with bugs in string handling) before the change.

The cost of backwards-compatibility is also two-fold from what I can see:
- We will continue to be inefficient and waste time autodecoding by default (mobile users are going to be especially happy about that).
- By default, string handling code is still broken, just more subtly, meaning more string handling bugs in D code make it to production.

June 01, 2016

Re: Dealing with Autodecode

Posted by Jacob Carlborg
in reply to Walter Bright

On 2016-06-01 02:46, Walter Bright wrote:
> It is not practical to just delete or deprecate autodecode - it is too
> embedded into things. What we can do, however, is stop using it
> ourselves and stop relying on it in the documentation, much like [] is
> eschewed in favor of std::vector in C++.
>
> The way to deal with it is to replace reliance on autodecode with
> .byDchar

Don't you get the same behavior using byDchar as with autodecode?

-- 
/Jacob Carlborg

June 01, 2016

Re: Dealing with Autodecode

Posted by Walter Bright
in reply to Jacob Carlborg

On 5/31/2016 11:57 PM, Jacob Carlborg wrote:
>> The way to deal with it is to replace reliance on autodecode with
>> .byDchar
> Don't you get the same behavior using byDchar as with autodecode?

Yes (except that byDchar returns the replacement char on invalid Unicode, while autodecode throws an exception). But the point is that byDchar is opt-in.
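
A small sketch of that difference, assuming a string that holds a byte which is never valid in UTF-8. The helper names autodecodeThrows and firstByDchar are invented for this example; front is the std.range.primitives overload for strings, and byDchar and UTFException come from std.utf.

```d
import std.range.primitives : front;
import std.utf : byDchar, UTFException;

/// True if reading the first element via autodecoding throws.
bool autodecodeThrows(string s)
{
    try
    {
        dchar unused = s.front; // the string overload of front decodes here
        return false;
    }
    catch (UTFException)
    {
        return true;
    }
}

/// First code point as seen through the opt-in byDchar range.
dchar firstByDchar(string s)
{
    return s.byDchar.front;
}

void main()
{
    // 0xFF never occurs in well-formed UTF-8.
    immutable(ubyte)[] raw = [0xFF];
    string bad = cast(string) raw;

    assert(autodecodeThrows(bad));          // implicit path: exception
    assert(firstByDchar(bad) == '\uFFFD');  // opt-in path: U+FFFD
    assert(!autodecodeThrows("ok"));
}
```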

June 01, 2016

Re: Dealing with Autodecode

Posted by Guillaume Chatelet
in reply to Adam D. Ruppe

On Wednesday, 1 June 2016 at 01:36:43 UTC, Adam D. Ruppe wrote:
> I have a better one, that we discussed on IRC last night:
>
> 1) put the string overloads for front and popFront on a version switch:
>
> D USERS **WANT** BREAKING CHANGES THAT INCREASE OVERALL CODE QUALITY WITH A SIMPLE MIGRATION PATH!!!!!!!!!!!!!!!!!!!!
>
> 2) After a while, we swap the version conditions, so opting into it preserves the old behavior for a while.
>
> 3) A wee bit longer, we exterminate all this autodecoding crap and enjoy Phobos being a smaller, more efficient library.

+1
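
For readers who have not seen the pattern, step 1 could look roughly like the sketch below. This is not the snippet from the quoted post: the version identifier StringMigration, the function name decodingFront and the deprecation text are all made up, and a standalone function stands in for the real front overload.

```d
import std.utf : decode;

// StringMigration is a made-up version identifier for this sketch.
version (StringMigration)
{
    // Opted in: every call that still decodes implicitly gets flagged.
    deprecated("implicit decoding is going away; use byCodeUnit or byDchar")
    dchar decodingFront(const(char)[] s)
    {
        size_t i = 0;
        return decode(s, i);
    }
}
else
{
    // Default for now: unchanged behaviour, no diagnostics.
    dchar decodingFront(const(char)[] s)
    {
        size_t i = 0;
        return decode(s, i);
    }
}

void main()
{
    // Compiles either way; with -version=StringMigration the call below
    // also produces a deprecation message pointing at the migration path.
    assert(decodingFront("\u00E9t\u00E9") == '\u00E9');
}
```

Step 2 would then amount to swapping which branch carries the deprecated attribute's counterpart, so that the old behaviour becomes the opt-in one.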

June 01, 2016

Re: Dealing with Autodecode

Posted by Andrea Fontana
in reply to Guillaume Chatelet

On Wednesday, 1 June 2016 at 08:21:36 UTC, Guillaume Chatelet wrote:
> On Wednesday, 1 June 2016 at 01:36:43 UTC, Adam D. Ruppe wrote:
>> I have a better one, that we discussed on IRC last night:
>>
>> 1) put the string overloads for front and popFront on a version switch:
>>
>> D USERS **WANT** BREAKING CHANGES THAT INCREASE OVERALL CODE QUALITY WITH A SIMPLE MIGRATION PATH!!!!!!!!!!!!!!!!!!!!
>>
>> 2) After a while, we swap the version conditions, so opting into it preserves the old behavior for a while.
>>
>> 3) A wee bit longer, we exterminate all this autodecoding crap and enjoy Phobos being a smaller, more efficient library.
>
> +1

+1

June 01, 2016

Re: Dealing with Autodecode

Posted by poliklosio
in reply to Kirill Kryukov

On Wednesday, 1 June 2016 at 05:46:29 UTC, Kirill Kryukov wrote:
> On Wednesday, 1 June 2016 at 01:36:43 UTC, Adam D. Ruppe wrote:
>> D USERS **WANT** BREAKING CHANGES THAT INCREASE OVERALL CODE QUALITY WITH A SIMPLE MIGRATION PATH!!!!!!!!!!!!!!!!!!!!
>
> This.
> (...)
> I don't want to become an expert in avoiding language pitfalls (The reason I abandoned C++ years ago).

+1
If you have too many pitfalls in the language, it's not easier to learn than C++, just different (regardless of the maximum productivity you have when using the language, that's another issue).
The worst case is you just want to use ASCII text and suddenly you have to spend weeks reading a ton of confusing stuff about Unicode, D and autodecoding, just to know how to use char[] correctly in D.
Compare that to how trivial it is to process ASCII text in, say, C++.
And processing just plain ASCII is a very common case, e.g. processing textual logs from tools.
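
For the plain-ASCII case, one way to sidestep decoding is to go through byCodeUnit explicitly. A minimal sketch, where asciiUpper is a hypothetical helper and the rest is std.utf, std.ascii, std.algorithm and std.array:

```d
import std.algorithm.iteration : map;
import std.array : array;
import std.ascii : toUpper;
import std.utf : byCodeUnit;

/// Upper-cases ASCII letters one code unit at a time.
/// Code units outside the ASCII range pass through unchanged.
char[] asciiUpper(const(char)[] line)
{
    return line.byCodeUnit.map!(c => cast(char) toUpper(c)).array;
}

void main()
{
    assert(asciiUpper("warn: disk 91% full") == "WARN: DISK 91% FULL");
}
```

It works, but knowing that byCodeUnit is needed here is exactly the kind of up-front reading the paragraph above complains about.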

Copyright © 1999-2021 by the D Language Foundation