May 12, 2016 The Case Against Autodecode
On 5/12/2016 9:29 AM, Andrei Alexandrescu wrote:
> I am as unclear about the problems of autodecoding as I am about the necessity to remove curl. Whenever I ask I hear some arguments that work well emotionally but are scant on reason and engineering. Maybe it's time to rehash them? I just did so about curl, no solid argument seemed to come together. I'd be curious of a crisp list of grievances about autodecoding. -- Andrei

Here are some that are not matters of opinion.

1. Ranges of characters do not autodecode, but arrays of characters do. This is a glaring inconsistency.

2. Every time one wants an algorithm to work with both strings and ranges, you wind up special-casing the strings to defeat the autodecoding, or to decode the ranges. Having to constantly special-case it makes for more special cases when plugging together components. These issues often escape detection when unittesting because it is convenient to unittest only with arrays.

3. Wrapping an array in a struct with an alias this to an array turns off autodecoding, another special case.

4. Autodecoding is slow and has no place in high-speed string processing.

5. Very few algorithms require decoding.

6. Autodecoding has two choices when encountering invalid code units: throw, or produce an error dchar. Currently it throws, meaning no algorithm using autodecoding can be made nothrow.

7. Autodecoding cannot be used with Unicode paths/filenames, because it is legal (at least on Linux) to have invalid UTF-8 in filenames. It turns out that in the wild pure Unicode is not universal; there's lots of dirty Unicode that should remain unmolested, and autodecoding does not play well with that.

8. In my work with UTF-8 streams, dealing with autodecoding has caused me considerable extra work every time. A convenient timesaver it ain't.

9. Autodecoding cannot be turned off, i.e. it isn't practical to avoid importing std.array one way or another, and then autodecoding is there.

10. Autodecoded arrays cannot be RandomAccessRanges, losing a key benefit of being arrays in the first place.

11. Indexing an array produces different results than autodecoding, another glaring special case.
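The distinction behind points 10 and 11 — indexing by code unit vs. iterating by decoded code point — is a property of UTF-8 itself, not of D, so it can be sketched in Python over a UTF-8 byte string (a rough analogue of D's char[]; variable names here are illustrative only):

```python
# UTF-8 bytes are the code units; decoding yields code points (D's dchar).
s = "héllo"                      # 5 code points
b = s.encode("utf-8")            # 6 UTF-8 code units: 'é' takes two bytes

# Indexing the byte array gives a code unit, not a character:
assert len(b) == 6
assert b[1] == 0xC3              # first byte of the two-byte encoding of 'é'

# Decoding gives code points, so the same index means something different:
assert s[1] == "é"
assert ord(s[1]) == 0xE9

# Random access by code point over the raw bytes would require a linear
# scan from the start, which is why a decoded view of a char array cannot
# itself be a random-access range.
```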
May 12, 2016 Re: The Case Against Autodecode
Posted in reply to Walter Bright | On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote:
> [Walter's list of grievances 1-11, quoted in full, snipped]
12. The result of autodecoding, a range of Unicode code points, is rarely actually useful, and code that relies on autodecoding is rarely actually universally correct. Graphemes are occasionally useful for a subset of scripts, and a subset of that subset has all graphemes mapped to single code points, but this only applies to some scripts/languages.
In the majority of cases, autodecoding provides only the illusion of correctness.
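Point 12's distinction between code points and graphemes can be demonstrated outside D as well; here is a small Python sketch (Python's str iterates by code point, much like an autodecoded D string):

```python
import unicodedata

# One user-perceived character ("é"), two canonically equivalent encodings:
composed   = "\u00e9"        # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"       # 'e' + U+0301 COMBINING ACUTE ACCENT

# Code-point iteration sees them differently, even though a reader sees
# the same grapheme in both cases:
assert len(composed) == 1
assert len(decomposed) == 2
assert composed != decomposed

# Normalization maps one to the other, confirming they are "the same text":
assert unicodedata.normalize("NFC", decomposed) == composed

# So "one code point == one character" holds only for strings that happen
# to be fully composed -- the subset-of-a-subset that point 12 describes.
```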
May 12, 2016 Re: The Case Against Autodecode
Posted in reply to Vladimir Panteleev | On Thu, May 12, 2016 at 08:24:23PM +0000, Vladimir Panteleev via Digitalmars-d wrote:
> On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote:
[...]
> > 1. Ranges of characters do not autodecode, but arrays of characters do. This is a glaring inconsistency.
> >
> > 2. Every time one wants an algorithm to work with both strings and ranges, you wind up special casing the strings to defeat the autodecoding, or to decode the ranges. Having to constantly special case it makes for more special cases when plugging together components. These issues often escape detection when unittesting because it is convenient to unittest only with arrays.

Example of string special-casing leading to bugs:

https://issues.dlang.org/show_bug.cgi?id=15972

This particular issue highlights the problem quite well: one would hardly expect '#'.repeat(i) to return anything but a range of char. After all, how could a single char need to be "auto-decoded" to a dchar? Unfortunately, because Phobos algorithms assume autodecoding, the resulting range of char is not recognized as "string-like" data by .joiner, causing a compile error.

The workaround (described in the bug comments) also illustrates the inconsistency in handling ranges of char vs. ranges of dchar: writing .joiner("\n".byCodeUnit) actually fixes the problem, basically by explicitly disabling autodecoding. We can, of course, fix .joiner to recognize this case and handle it correctly, but the fact that using .byCodeUnit works perfectly proves that autodecoding is not necessary here. Which begs the question: why have autodecoding at all, and then require .byCodeUnit to work around the issues it causes?

T

--
It is widely believed that reinventing the wheel is a waste of time; but I disagree: without wheel reinventers, we would still be stuck with wooden horse-cart wheels.
May 12, 2016 Re: The Case Against Autodecode
Posted in reply to Vladimir Panteleev | On Thu, May 12, 2016 at 08:24:23PM +0000, Vladimir Panteleev via Digitalmars-d wrote:
[...]
> 12. The result of autodecoding, a range of Unicode code points, is rarely actually useful, and code that relies on autodecoding is rarely actually universally correct. Graphemes are occasionally useful for a subset of scripts, and a subset of that subset has all graphemes mapped to single code points, but this only applies to some scripts/languages.
>
> In the majority of cases, autodecoding provides only the illusion of correctness.

A range of Unicode code points is not the same as a range of graphemes (a grapheme is what a layperson would consider to be a "character"). Autodecoding returns dchar, a code point, rather than a grapheme. Therefore, autodecoding actually produces intuitively correct results only when your string has a 1-to-1 correspondence between grapheme and code point. In general, this is only true for a small subset of languages, mainly a few common European languages and a handful of others. It doesn't work for Korean, and it doesn't work for any language that uses combining diacritics or other modifiers. You need byGrapheme to get correct results.

So basically autodecoding, as currently implemented, fails to meet its goal of segmenting a string by "character" (i.e., grapheme), and yet imposes a performance penalty that is difficult to "turn off" (you have to sprinkle your code with byCodeUnit everywhere, and many Phobos algorithms just return a range of dchar anyway). Not to mention that a good number of string algorithms don't actually *need* autodecoding at all.

(One could make a case for auto-segmenting by grapheme, but that's even worse in terms of performance: it requires a non-trivial Unicode algorithm involving lookup tables, and may need memory allocation. At the end of the day, we're back to square one: iterate by code unit, and explicitly ask for byGrapheme where necessary.)

T

--
"I'm running Windows '98." "Yes." "My computer isn't working now." "Yes, you already said that." -- User-Friendly
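The combining-diacritics case above can be made concrete. The Python sketch below counts "clusters" with a deliberately naive rule (a new cluster starts at every code point whose canonical combining class is zero); real grapheme segmentation follows UAX #29, which is what D's byGrapheme implements, and handles far more (Hangul jamo, ZWJ emoji sequences, etc.):

```python
import unicodedata

# "e" + COMBINING ACUTE, then "o" + COMBINING DIAERESIS:
# 2 graphemes for a reader, but 4 code points.
s = "e\u0301o\u0308"

# Code-point count -- what an autodecoding-style iteration would see:
assert len(s) == 4

# Naive cluster count: combining marks (nonzero combining class) attach
# to the preceding base character instead of starting a new cluster.
# NOTE: illustration only; this is NOT full UAX #29 segmentation.
clusters = sum(1 for ch in s if unicodedata.combining(ch) == 0)
assert clusters == 2
```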
May 12, 2016 Re: The Case Against Autodecode
Posted in reply to Walter Bright | On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote:
> [Walter's list of grievances 1-11, quoted in full, snipped]
For me it is not about autodecoding. I would like to have a String type which does that. But what I am really pissed off about is that the current string type is an alias to immutable(char)[] (so it is not usable at all). This is a real problem for me, because it makes working on arrays of chars almost impossible.

Even char[] is unusable, so I am forced to use ubyte[], but that is really not an array of chars.

ATM D does not support full Unicode strings or even a basic array of chars :(.

I hope this will be fixed one day, so I could start to promote D in Czech; until then I am unable to do that.
May 12, 2016 Re: The Case Against Autodecode
Posted in reply to Daniel Kozak | On 5/12/2016 4:23 PM, Daniel Kozak wrote:
> But what I am really pissed off about is that the current string type is an alias to immutable(char)[] (so it is not usable at all). This is a real problem for me, because it makes working on arrays of chars almost impossible.
>
> Even char[] is unusable, so I am forced to use ubyte[], but that is really not an array of chars.
>
> ATM D does not support full Unicode strings or even a basic array of chars :(.
>
> I hope this will be fixed one day, so I could start to promote D in Czech; until then I am unable to do that.

I can't find any actionable information in this.
May 13, 2016 Re: The Case Against Autodecode
Posted in reply to Walter Bright | On Thu, 12 May 2016 13:15:45 -0700, Walter Bright <newshound2@digitalmars.com> wrote:

> 7. Autodecode cannot be used with unicode path/filenames, because it is legal (at least on Linux) to have invalid UTF-8 as filenames.

More precisely, they are byte strings, with '/' reserved to separate path elements. While on an out-of-the-box Linux system nowadays everything is typically presented as UTF-8, there are still die-hards who use code pages, corrupted file systems, or incorrectly mounted network shares displaying with the wrong charset. It is safer to work with them as a ubyte[], which also bypasses autodecoding.

I'd like 'string' to mean valid UTF-8 in D as far as the encoding goes. A filename should not be a 'string'.

-- Marco
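Marco's point — a Linux filename is a byte string that need not be valid UTF-8 — is easy to demonstrate in Python, which hit the same design problem and solved it with the surrogateescape error handler (PEP 383) so that arbitrary filename bytes survive a decode/encode round trip:

```python
# On Linux, a filename is just bytes (only '/' and NUL are reserved); it
# need not be valid UTF-8. The byte 0xFF can never occur in valid UTF-8:
name = b"report\xff.txt"

# Decoding it strictly as UTF-8 fails -- which is exactly why an API that
# force-decodes filenames cannot represent every legal file on disk:
try:
    name.decode("utf-8")
    raised = False
except UnicodeDecodeError:
    raised = True
assert raised

# Python's escape hatch: smuggle the invalid bytes through as lone
# surrogates, so the original byte string round-trips unmolested.
text = name.decode("utf-8", errors="surrogateescape")
assert text.encode("utf-8", errors="surrogateescape") == name
```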
May 13, 2016 Re: The Case Against Autodecode
Posted in reply to Walter Bright | On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote:
> Here are some that are not matters of opinion.
If you're serious about removing auto-decoding, which I think you and others have shown has merit, you have to have THE SIMPLEST migration path ever, or you will kill D. I'm talking a simple press of a button.

I'm not exaggerating here. Python, a language which was much more popular than D at the time, split into two incompatible lines in 2008: Python 2, which had numerous Unicode problems, and Python 3.0, which fixed those problems. Almost eight years later, Python 2 is STILL the more popular version, despite Py3 having had five major point releases since and Python 2 getting only security patches. Think the tango vs phobos problem, only a little worse.

D is much less popular now than Python was at the time, and the Python 2 problems were more straightforward than the auto-decoding problem. You'll need a very clear migration path, years-long deprecations, and automatic tools to make the transition work, or else D's usage will be permanently damaged.
May 12, 2016 Re: The Case Against Autodecode
Posted in reply to Marco Leise | On 5/12/2016 4:52 PM, Marco Leise wrote:
> I'd like 'string' to mean valid UTF-8 in D as far as the
> encoding goes. A filename should not be a 'string'.
I would have agreed with you in the past, but more and more it just doesn't seem practical. UTF-8 is dirty in the real world, and D code will have to deal with it.
By dealing with it I mean not crashing, throwing exceptions, or having other tantrums when encountering it. Unless it matters, code should pass the invalid encodings along unmolested and without comment. For example, if you're searching for 'a' in a UTF-8 string, what does it matter if there are invalid encodings in that string?
For filenames/paths in particular, having redone the file/path code in Phobos, I realized that invalid encodings are completely immaterial.
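Walter's search example can be checked concretely. Because UTF-8 continuation and lead bytes all have the high bit set, an ASCII byte can never appear inside a multi-byte sequence, so a byte-level search is unaffected by invalid encodings. A Python sketch (on bytes, the analogue of working without decoding):

```python
# A byte string containing invalid UTF-8 (a stray 0xFF, and a truncated
# two-byte sequence 0xC3 at the end) around the ASCII byte we want:
data = b"\xffxa y\xc3"

# Byte-level search for 'a' works regardless of the surrounding garbage:
# ASCII bytes (< 0x80) never occur inside a multi-byte UTF-8 sequence,
# so no decoding -- and no validation -- is needed.
assert data.find(b"a") == 2

# A decode-first approach, by contrast, must either throw or mangle:
try:
    data.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
assert not decoded_ok
```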
May 13, 2016 Re: The Case Against Autodecode
Posted in reply to Jack Stouffer | On Friday, 13 May 2016 at 00:47:04 UTC, Jack Stouffer wrote:

> I'm not exaggerating here. Python, a language which was much more popular than D at the time, split into two incompatible lines in 2008: Python 2, which had numerous Unicode problems, and Python 3.0, which fixed those problems. Almost eight years later, Python 2 is STILL the more popular version, despite Py3 having had five major point releases since and Python 2 getting only security patches. Think the tango vs phobos problem, only a little worse.

To hammer this home a little more: Python 3 had a really useful library to abstract most of the differences automatically. Despite that, here is a list of the top 200 Python packages in 2011, three years after the fork, showing whether each supported Python 3:

https://web.archive.org/web/20110215214547/http://python3wos.appspot.com/

This is _three years_ later, and only 18 of the top 200 supported Python 3. And here it is now, eight years later, at 174 out of 200:

https://python3wos.appspot.com/
Copyright © 1999-2021 by the D Language Foundation