September 26, 2014
On Thursday, 25 September 2014 at 21:49:43 UTC, H. S. Teoh via Digitalmars-d wrote:
> It's not just about performance.

Something I recently realized: because of auto-decoding, std.algorithm.find("foo", 'o') cannot be implemented using memchr. I think this points to a huge design fail, performance-wise.

There are also subtle correctness problems: haystack[0..haystack.countUntil(needle)] is wrong, even if it works right on ASCII input.
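
(A minimal sketch of that pitfall, assuming a current Phobos in which strings are still autodecoded. The sample strings and the prefixBefore helper are made up for illustration and are not part of Phobos.)

```d
import std.algorithm : countUntil, find;

// Hypothetical helper: the part of haystack before needle, measured in
// code units. If needle is absent, the whole haystack is returned.
string prefixBefore(string haystack, string needle)
{
    // find returns the tail starting at the match (empty if there is none),
    // so the length difference is the match's code-unit offset.
    auto offset = haystack.length - haystack.find(needle).length;
    return haystack[0 .. offset];
}

void main()
{
    string haystack = "héllo wörld"; // 'é' and 'ö' take two code units each
    string needle = "wörld";

    // countUntil on a string counts decoded code points ...
    auto count = haystack.countUntil(needle);
    assert(count == 6);

    // ... but slicing indexes code units, so the slice stops one byte short.
    assert(haystack[0 .. count] == "héllo");

    // Working in code units throughout gives the intended prefix.
    assert(prefixBefore(haystack, needle) == "héllo ");
}
```

On pure ASCII input the two numbers coincide, which is why the bug tends to go unnoticed.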

For once I agree with Walter Bright - regarding the chair throwing :)
September 26, 2014
On Thu, Sep 25, 2014 at 08:11:02PM -0700, Walter Bright via Digitalmars-d wrote:
> On 9/25/2014 2:47 PM, H. S. Teoh via Digitalmars-d wrote:
> >Not a bad idea. If we do it right, we could (mostly) avoid user outrage.  E.g., start with a "soft deprecation" (a compile-time message, but not an actual warning, to the effect that byCodeUnit / byCodePoint should be used with strings from now on), then a warning, then an actual deprecation, then remove autodecoding code from Phobos algorithms (leaving only byCodePoint for those who still want autodecoding).
> 
> Consider this PR:
> 
> https://github.com/D-Programming-Language/phobos/pull/2423
> 
> which is blocked because several people do not agree with using byCodeUnit.

Actually, several committers have already agreed that this particular PR shouldn't be blocked pending the decision on whether to autodecode or not. What's actually blocking it right now is that it calls stripExtension, which only works with arrays, not general ranges.  (Speaking of which, thanks for reminding me that I need to work on that.  :-P)


T

-- 
The fact that anyone still uses AOL shows that even the presence of options doesn't stop some people from picking the pessimal one. - Mike Ellis
September 26, 2014
"Walter Bright"  wrote in message news:m02lt5$2hg4$1@digitalmars.com...

> I should add that this impasse has COMPLETELY stalled changes to Phobos to remove dependency on the GC.

Maybe it would be more successful if it didn't try to do both at once. 

September 26, 2014
On Thu, Sep 25, 2014 at 08:44:23PM -0700, Andrei Alexandrescu via Digitalmars-d wrote:
> On 9/25/14, 8:17 PM, Walter Bright wrote:
> >On 9/25/2014 8:11 PM, Walter Bright wrote:
> >>Consider this PR:
> >>
> >>https://github.com/D-Programming-Language/phobos/pull/2423
> >>
> >>which is blocked because several people do not agree with using byCodeUnit.
> >>
> >
> >I should add that this impasse has COMPLETELY stalled changes to Phobos to remove dependency on the GC.
> 
> I think the way to break that stalemate is to add RC strings and reference counted exceptions. -- Andrei

But it does nothing to bring us closer to a decision about the autodecoding issue.


T

-- 
Heads I win, tails you lose.
September 26, 2014
"Iain Buclaw"  wrote in message news:dqgkcmdmxekzqpvfbcim@forum.dlang.org...

On Saturday, 20 September 2014 at 12:39:23 UTC, Tofu Ninja wrote:
>
> What do you think are the worst parts of D?
>

> 1) D Inline Assembler.

Relying on the system assembler sucks too.

> 7) Interfacing with C++
> - A new set of features that is in danger of falling into the same "let's get it working" pit.  First warning sign was C++ template mangling, then DMD gave up on mangling D 'long' in any predictable way.  It's all been downhill from there.

C++ template mangling is fine.  'long' mangling is messy, but it's better than what was there before.

> 8) Real
> - What a pain.

Oh yeah.  D's one variable-sized type. 

September 26, 2014
On 9/25/2014 9:12 PM, Daniel Murphy wrote:
> "Walter Bright"  wrote in message news:m02lt5$2hg4$1@digitalmars.com...
>
>> I should add that this impasse has COMPLETELY stalled changes to Phobos to
>> remove dependency on the GC.
>
> Maybe it would be more successful if it didn't try to do both at once.

What would the 3rd version of setExtension be named, then?
September 26, 2014
On Fri, Sep 26, 2014 at 04:05:18AM +0000, Vladimir Panteleev via Digitalmars-d wrote:
> On Thursday, 25 September 2014 at 21:49:43 UTC, H. S. Teoh via Digitalmars-d wrote:
> >It's not just about performance.
> 
> Something I recently realized: because of auto-decoding, std.algorithm.find("foo", 'o') cannot be implemented using memchr. I think this points to a huge design fail, performance-wise.

Well, if you really want to talk performance, we've already failed. Any string operation that starts from a narrow string and ends with a narrow string (of the same width) will incur the overhead of decoding / reencoding *every single character*, even if it's mostly redundant.
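
(A small sketch of where that decoding comes from, assuming a Phobos that provides std.utf.byCodeUnit. The sample string is made up for illustration.)

```d
import std.range : ElementType, walkLength;
import std.traits : Unqual;
import std.utf : byCodeUnit;

// Viewed as a range, a string yields decoded code points (dchar) ...
static assert(is(ElementType!string == dchar));
// ... while byCodeUnit yields the underlying UTF-8 code units unchanged.
static assert(is(Unqual!(ElementType!(typeof("".byCodeUnit))) == char));

void main()
{
    string s = "héllo"; // 'é' occupies two UTF-8 code units

    // Walking the string directly decodes each character: 5 code points.
    assert(s.walkLength == 5);

    // Walking the code-unit view needs no decoding: 6 code units.
    assert(s.byCodeUnit.walkLength == 6);
}
```

An algorithm that only copies or compares characters does not need the dchar view, which is why the decode / reencode round trip is mostly redundant.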

What bugs me even more is the fact that every single Phobos algorithm that might conceivably deal with characters has to be special-cased for narrow strings in order to be performant. That's a mighty high price to pay for what's a relatively small benefit -- note that autodecoding does *not* guarantee Unicode correctness, even if, according to the argument of some, it helps. So we're paying a high price in terms of performance and code maintainability in Phobos, for the dubious benefit of only partial Unicode conformance.


> There are also subtle correctness problems: haystack[0..haystack.countUntil(needle)] is wrong, even if it works right on ASCII input.
> 
> For once I agree with Walter Bright - regarding the chair throwing :)

Not to mention that autodecoding *still* doesn't fix the following problem:

	assert("á".canFind("á")); // fails

(Note: you may need to save this message verbatim and edit it into a D source file to see this effect; cut-n-paste on some systems may erase the effect.)

And the only way to fix this would be so prohibitively expensive, I don't think even Andrei would agree to it. :-P
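
(A sketch of the effect with the two spellings written as escapes so that they survive cut-n-paste. It assumes the two literals in the assert above are a precomposed and a decomposed form of the same character, and that the expensive fix alluded to is normalizing both sides, here with std.uni.normalize.)

```d
import std.algorithm : canFind;
import std.uni : normalize, NFC;

void main()
{
    string precomposed = "\u00E1";  // U+00E1, a single code point
    string decomposed  = "a\u0301"; // 'a' followed by U+0301 combining acute

    // Autodecoding compares code points, so the two spellings never match.
    assert(!precomposed.canFind(decomposed));

    // Normalizing both sides to the same form makes them comparable, at the
    // cost of an extra pass over (and possibly a copy of) each string.
    assert(normalize!NFC(precomposed).canFind(normalize!NFC(decomposed)));
}
```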

So basically, we're paying (1) lower performance, (2) non-random access
for strings, (3) subtle distinction between index and count and other
such gotchas, and (4) tons of special-cased Phobos code with the
associated maintenance costs, all for incomplete Unicode correctness.
Doesn't seem like the benefit measures up to the cost. :-(


T

-- 
We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare.  Now, thanks to the Internet, we know this is not true. -- Robert Wilensky
September 26, 2014
"Walter Bright"  wrote in message news:m02qcm$2mmn$1@digitalmars.com...

> On 9/25/2014 9:12 PM, Daniel Murphy wrote:
> > "Walter Bright"  wrote in message news:m02lt5$2hg4$1@digitalmars.com...
> >
> >> I should add that this impasse has COMPLETELY stalled changes to Phobos to
> >> remove dependency on the GC.
> >
> > Maybe it would be more successful if it didn't try to do both at once.
>
> What would the 3rd version of setExtension be named, then?

setExtension.  Making up new clever names for functions that do the same thing with different types is a burden for the users. 

September 26, 2014
On Thu, Sep 25, 2014 at 09:34:30PM -0700, Walter Bright via Digitalmars-d wrote:
> On 9/25/2014 9:12 PM, Daniel Murphy wrote:
> >"Walter Bright"  wrote in message news:m02lt5$2hg4$1@digitalmars.com...
> >
> >>I should add that this impasse has COMPLETELY stalled changes to Phobos to remove dependency on the GC.
> >
> >Maybe it would be more successful if it didn't try to do both at once.
> 
> What would the 3rd version of setExtension be named, then?

I think that PR, and the others slated to follow it, should just merge with autodecoding in conformance to the rest of Phobos right now, independently of however the decision on the autodecoding issue will turn out. If we do decide eventually to get rid of autodecoding, we're gonna have to rewrite much of Phobos anyway, so it's not as though merging PRs now is going to make it significantly worse. I don't see why the entire burden of deciding upon autodecoding should rest upon a PR that merely introduces a new string function.


T

-- 
Unix was not designed to stop people from doing stupid things, because that would also stop them from doing clever things. -- Doug Gwyn
September 26, 2014
On Friday, 26 September 2014 at 03:44:23 UTC, Andrei Alexandrescu wrote:
> I think the way to break that stalemate is to add RC strings and reference counted exceptions. -- Andrei

I don't want GC, exceptions or RC strings. You really need to make sure RC is optional throughout. RC inc/dec abort read-transactions. That's a bad long-term strategy.