March 06, 2014 Lots of low hanging fruit in Phobos | ||||
---|---|---|---|---|
| ||||
A major goal for D in the short term is to reduce reliance in Phobos on the GC. I was looking at std.string last night, and I noticed a couple things:
1. The inputs are constrained to being strings. This is overly restrictive, the inputs should be InputRanges.
2. The outputs should be a range, too. In fact, the string functions should become algorithms. Then they won't need to allocate any memory at all.
The existing functions should not be removed, but perhaps rewritten as wrappers around the algorithm versions.
I've found that rewriting traditional code, which is what std.string is now, in terms of algorithms is a bit mind-bending. But it's well worth it, and fun.
So who wants to step up? Don't have to do the whole thing in one go, just pick a function and do that one.
|
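To make the proposal above concrete, here is a minimal sketch of what an "algorithm version plus wrapper" pair might look like. The names `lazyToUpper` and `eagerToUpper` are hypothetical and not part of Phobos; the sketch assumes a simple per-character mapping via `std.uni.toUpper(dchar)`, which does not handle multi-character case mappings.

```d
import std.algorithm : equal, map;
import std.array : appender;
import std.range.primitives : ElementType, isInputRange;
import std.traits : isSomeChar;
import std.uni : toUpper;

// Hypothetical algorithm version: accepts any input range of characters
// and returns a lazy range. Nothing is allocated here.
auto lazyToUpper(R)(R input)
if (isInputRange!R && isSomeChar!(ElementType!R))
{
    // map is lazy: each character is converted only when it is read.
    return input.map!(c => toUpper(c));
}

// Hypothetical wrapper that keeps the traditional string-in, string-out
// shape. The only allocation is the result buffer built here.
string eagerToUpper(string s)
{
    auto result = appender!string();
    foreach (c; lazyToUpper(s))
        result.put(c); // encodes each dchar back to UTF-8
    return result.data;
}

void main()
{
    assert(lazyToUpper("phobos").equal("PHOBOS"));
    assert(eagerToUpper("phobos") == "PHOBOS");
}
```

Callers that only iterate or compare the result can use the lazy version and skip the allocation entirely; the wrapper is there for code that really needs a `string`.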
March 06, 2014 Re: Lots of low hanging fruit in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Thursday, 6 March 2014 at 21:26:45 UTC, Walter Bright wrote:
> The existing functions should not be removed, but perhaps rewritten as wrappers around the algorithm versions.
How does one handle case sensitivity for ranges of abstract types?
> I've found that rewriting traditional code, which is what std.string is now, in terms of algorithms is a bit mind-bending. But it's well worth it, and fun.
Would be pretty neat if std.string and std.regex would work with char-like types which actually carry more data per character. That way, it'd be possible to do string/regex transforms (search & replace, etc.) but keep track where exactly each character came from.
|
March 06, 2014 Re: Lots of low hanging fruit in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Thursday, 6 March 2014 at 21:26:45 UTC, Walter Bright wrote:
> A major goal for D in the short term is to reduce reliance in Phobos on the GC. I was looking at std.string last night, and I noticed a couple things:
>
> 1. The inputs are constrained to being strings. This is overly restrictive, the inputs should be InputRanges.
>
> 2. The outputs should be a range, too. In fact, the string functions should become algorithms. Then they won't need to allocate any memory at all.
>
> The existing functions should not be removed, but perhaps rewritten as wrappers around the algorithm versions.
>
> I've found that rewriting traditional code, which is what std.string is now, in terms of algorithms is a bit mind-bending. But it's well worth it, and fun.
>
> So who wants to step up? Don't have to do the whole thing in one go, just pick a function and do that one.
Seems like a good idea to reduce memory usage wherever possible, but I thought that the reason std.string exists (and duplicates some functionality that exists elsewhere) is to provide string-specific versions that were either optimized specifically for strings, or have completely different functionality due to working with strings.
|
March 06, 2014 Re: Lots of low hanging fruit in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vladimir Panteleev | On Thu, 06 Mar 2014 16:30:46 -0500, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:
> On Thursday, 6 March 2014 at 21:26:45 UTC, Walter Bright wrote:
>> The existing functions should not be removed, but perhaps rewritten as wrappers around the algorithm versions.
>
> How does one handle case sensitivity for ranges of abstract types?
Perhaps he means many of the functions only accept const(Char)[], where Char can be templatized, but they really should accept any range.
-Steve
|
March 06, 2014 Re: Lots of low hanging fruit in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vladimir Panteleev | 07-Mar-2014 01:30, Vladimir Panteleev wrote:
> On Thursday, 6 March 2014 at 21:26:45 UTC, Walter Bright wrote:
>> The existing functions should not be removed, but perhaps rewritten as
>> wrappers around the algorithm versions.
>
> How does one handle case sensitivity for ranges of abstract types?
+1
>> I've found that rewriting traditional code, which is what std.string
>> is now, in terms of algorithms is a bit mind-bending. But it's well
>> worth it, and fun.
>
> Would be pretty neat if std.string and std.regex would work with
> char-like types which actually carry more data per character. That way,
> it'd be possible to do string/regex transforms (search & replace, etc.)
> but keep track where exactly each character came from.
Exactly. I've been toying with the idea of having a generic notion of Alphabets for more than a year now. That would also generalize the code unit / code point stuff of Unicode to legacy encodings and beyond. One case I had in mind is the very limited (A, C, T, G) alphabet in bioinformatics.
--
Dmitry Olshansky
|
March 06, 2014 Re: Lots of low hanging fruit in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vladimir Panteleev | On Thursday, 6 March 2014 at 21:30:47 UTC, Vladimir Panteleev wrote:
> On Thursday, 6 March 2014 at 21:26:45 UTC, Walter Bright wrote:
>> The existing functions should not be removed, but perhaps rewritten as wrappers around the algorithm versions.
>
> How does one handle case sensitivity for ranges of abstract types?
The same way std.algorithm does: through equality predicates.
|
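A rough sketch of the equality-predicate approach, for anyone following along. The `Tagged` struct and both predicate functions are made up for illustration (the struct stands in for the "char-like type carrying more data" idea from earlier in the thread); `equal`, `startsWith` and `toLower` are the existing Phobos functions.

```d
import std.algorithm : equal, startsWith;
import std.uni : toLower;

// Case-insensitive equality for plain characters.
bool sameLetter(dchar a, dchar b)
{
    return toLower(a) == toLower(b);
}

// Hypothetical char-like element that also records where it came from.
struct Tagged
{
    dchar ch;
    size_t origin;
}

// Compare only the character, ignoring case and the origin field.
bool sameTagged(Tagged a, Tagged b)
{
    return toLower(a.ch) == toLower(b.ch);
}

void main()
{
    // The predicate decides what "equal" means, not the algorithm.
    assert(equal!sameLetter("Phobos", "PHOBOS"));
    assert(startsWith!sameLetter("Phobos", "pho"));

    auto x = [Tagged('D', 0), Tagged('!', 7)];
    auto y = [Tagged('d', 3), Tagged('!', 9)];
    assert(equal!sameTagged(x, y)); // same letters, different origins
    assert(!equal(x, y));           // default == compares every field
}
```

The algorithm itself never needs to know what a "character" is; the element type only has to work with whatever predicate the caller passes in.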
March 06, 2014 Re: Lots of low hanging fruit in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vladimir Panteleev | On 3/6/2014 1:30 PM, Vladimir Panteleev wrote:
> On Thursday, 6 March 2014 at 21:26:45 UTC, Walter Bright wrote:
>> The existing functions should not be removed, but perhaps rewritten as
>> wrappers around the algorithm versions.
>
> How does one handle case sensitivity for ranges of abstract types?
Use 'static if' to detect the element type.
And besides, I had meant that an algorithm should work on an InputRange of chars, and not be restricted to char[].
|
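A minimal sketch of the 'static if' idea, assuming a made-up helper called `foldCase` (not a Phobos function). It lower-cases ranges whose elements are characters and passes every other range through untouched.

```d
import std.algorithm : equal, map;
import std.range : only;
import std.range.primitives : ElementType, isInputRange;
import std.traits : isSomeChar;
import std.uni : toLower;

// Hypothetical helper: the element type decides, at compile time,
// whether case folding applies at all.
auto foldCase(R)(R r)
if (isInputRange!R)
{
    static if (isSomeChar!(ElementType!R))
        return r.map!(c => toLower(c)); // character range: fold lazily
    else
        return r;                       // anything else: pass through
}

void main()
{
    // Works on a string...
    assert(foldCase("ABC").equal("abc"));
    // ...and on a character range that is not an array.
    assert(foldCase(only('A', 'b')).equal("ab"));
    // Non-character ranges come back unchanged, with the same type.
    assert(foldCase([1, 2, 3]).equal([1, 2, 3]));
    static assert(is(typeof(foldCase([1, 2, 3])) == int[]));
}
```

Each instantiation compiles only one branch, so the two branches are free to return different types.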
March 06, 2014 Re: Lots of low hanging fruit in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Meta | On 3/6/2014 1:40 PM, Meta wrote:
> Seems like a good idea to reduce memory usage wherever possible, but I thought
> that the reason std.string exists (and duplicates some functionality that exists
> elsewhere) is to provide string-specific versions that were either optimized
> specifically for strings, or have completely different functionality due to
> working with strings.
1. By using template specializations, algorithms can be custom optimized for certain inputs.
2. I expect the developer of these algorithms to do performance profiling of them vs the originals, and address any problems.
3. I've done some similar algorithms, and was able to achieve performance parity, and sometimes even do better (because no memory needed to be allocated).
std.string was one of the first Phobos modules written, was written by myself, and long predates notions of ranges and algorithms. It has evolved since then, but its fundamental nature has not changed. It's time for that to change to a modern, D style.
|
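As an illustration of point 1 above, here is one way an algorithm might get a string-specific fast path while still accepting arbitrary ranges. This sketch uses constraint-based overloads rather than literal template specialization syntax, and `countChar` is a hypothetical name; whether the fast path actually wins would have to be profiled, as point 2 says.

```d
import std.range : only;
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;
import std.traits : isSomeChar;

// Generic version: any input range of characters that is not a UTF-8 array.
size_t countChar(R)(R r, dchar needle)
if (isInputRange!R && isSomeChar!(ElementType!R) && !is(R : const(char)[]))
{
    size_t n;
    for (; !r.empty; r.popFront())
        if (r.front == needle)
            ++n;
    return n;
}

// Version picked for UTF-8 arrays such as string and char[].
size_t countChar(R)(R r, dchar needle)
if (is(R : const(char)[]))
{
    size_t n;
    if (needle < 0x80)
    {
        // ASCII bytes never occur inside a multi-byte UTF-8 sequence,
        // so code units can be compared directly without decoding.
        foreach (char c; r)
            if (c == needle)
                ++n;
    }
    else
    {
        // Non-ASCII needle: fall back to decoding code points.
        foreach (dchar c; r)
            if (c == needle)
                ++n;
    }
    return n;
}

void main()
{
    assert(countChar("a,b,c", ',') == 2);             // UTF-8 fast path
    assert(countChar("a\u00F1b", '\u00F1') == 1);     // UTF-8 decoding path
    assert(countChar("a,b,c"w, ',') == 2);            // generic path (wstring)
    assert(countChar(only('x', ',', ','), ',') == 2); // generic path (non-array)
}
```

No memory is allocated on any of the paths, and the caller sees a single function name.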
March 07, 2014 Re: Lots of low hanging fruit in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Thu, Mar 06, 2014 at 01:26:46PM -0800, Walter Bright wrote:
> A major goal for D in the short term is to reduce reliance in Phobos on the GC. I was looking at std.string last night, and I noticed a couple things:
>
> 1. The inputs are constrained to being strings. This is overly restrictive, the inputs should be InputRanges.
>
> 2. The outputs should be a range, too. In fact, the string functions should become algorithms. Then they won't need to allocate any memory at all.
[...]
What about using output ranges?
T
--
What do you call optometrist jokes? Vitreous humor.
|
March 07, 2014 Re: Lots of low hanging fruit in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Friday, 7 March 2014 at 00:44:45 UTC, H. S. Teoh wrote:
> What about using output ranges?
I think most of the string functions should be transformative, like std.algorithm.map, so they take an input range and return an input range.
This lets them chain most easily, letting the user sink them into a particular range at the end.
Though we could do a bit of magic to both take an output range and return an input range for it (which can also be backward-compatible, as we talked about in a thread a month or so ago), the most straightforward way is surely to treat it all like map.
|
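A small sketch of the "treat it all like map" style described above. The function name `withoutSpacesUpper` is invented for the example; it just chains two existing lazy algorithms, and the caller decides at the very end where the characters go.

```d
import std.algorithm : equal, filter, map;
import std.array : appender;
import std.uni : isWhite, toUpper;

// Hypothetical transformative function: input range in, lazy range out.
auto withoutSpacesUpper(R)(R text)
{
    return text
        .filter!(c => !isWhite(c)) // drop whitespace
        .map!(c => toUpper(c));    // upper-case what is left
}

void main()
{
    auto pipeline = withoutSpacesUpper("low hanging fruit");

    // Nothing has been computed or allocated so far. The user picks the
    // sink last; here it is an Appender, so this loop is the only place
    // where memory is allocated.
    auto sink = appender!string();
    foreach (c; pipeline)
        sink.put(c);

    assert(sink.data == "LOWHANGINGFRUIT");
}
```

Any other sink (a fixed buffer, a file writer, another algorithm) could be swapped in at the end without touching the transformation itself.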