| Thread overview | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
April 03, 2009 design question | ||||
|---|---|---|---|---|
| ||||
I'm toying with a rather new approach to a certain problem, and wanted to ask for some feedback before coding with it.
The STL has two functions with a "stable" variant: stable_sort and stable_partition. They differ from sort and partition in that they preserve the order of equivalent objects. They are more costly, hence the default is "unstable".
I started with the same approach in std.algorithm, but before long I figured two things:
a) There are many more algorithms that can benefit from stable vs. unstable versions: remove, topN, schwartzSort, partialSort, completeSort (take a sorted range and an unsorted one and make a sorted range out of them), and makeIndex.
b) Aside from stable and unstable, there's a semistable notion: when partitioning, the left half stays stable and the right half is unstable.
Clearly encoding stability in the function name isn't faring very well, so I made a policy of it:
enum SwapStrategy { stable, semistable, unstable }
and parameterized the appropriate functions with SwapStrategy. A use case looks like this:
// even to the left, odd to the right
auto arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
auto p = partition!("(a & 1) == 0", SwapStrategy.stable)(arr);
assert(arr == [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]);
assert(p == arr[5 .. $]);
Ok, until now there's only been background :o|. The new part starts here.
I think the stability request could be better encoded as a wrapper of the range making the request. For example:
auto p = partition!("(a & 1) == 0")(keepStable(arr));
So in order to tell partition you want stability, you just wrap the range with a call to keepStable. (The runtime cost is negligible). Similarly, say you want a stable sort. You'd say:
sort(keepStable(arr));
instead of:
sort!(SwapStrategy.stable)(arr);
I have the feeling this approach will scale better in the future. Also it is less verbose and in sync with the current way of doing searches:
auto f = find(assumeSorted(haystack), needle);
In that case, assumeSorted uses the same means of transmitting information about a range into an algorithm.
So... I'd appreciate any thoughts you might have. Thanks.
Andrei
| ||||
April 03, 2009 Re: design question | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s article > I think the stability request could be better encoded as a wrapper of > the range making the request. For example: > auto p = partition!("(a & 1) == 0")(keepStable(arr)); > So in order to tell partition you want stability, you just wrap the > range with a call to keepStable. (The runtime cost is negligible). > Similarly, say you want a stable sort. You'd say: > sort(keepStable(arr)); > instead of: > sort!(SwapStrategy.stable)(arr); Can you elaborate a little? In particular, would the KeepStable wrapper actually _do_ anything in itself to make modifications to the underlying range stable, or would it just wrap the range in a new type so that you could stick a static if statement in the sort algorithm to decide what to? | |||
April 03, 2009 Re: design question | ||||
|---|---|---|---|---|
| ||||
Posted in reply to dsimcha | dsimcha wrote:
> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s article
>> I think the stability request could be better encoded as a wrapper of
>> the range making the request. For example:
>> auto p = partition!("(a & 1) == 0")(keepStable(arr));
>> So in order to tell partition you want stability, you just wrap the
>> range with a call to keepStable. (The runtime cost is negligible).
>> Similarly, say you want a stable sort. You'd say:
>> sort(keepStable(arr));
>> instead of:
>> sort!(SwapStrategy.stable)(arr);
>
> Can you elaborate a little? In particular, would the KeepStable wrapper actually _do_ anything in itself to
> make modifications to the underlying range stable, or would it just wrap the range in a new type so that
> you could stick a static if statement in the sort algorithm to decide what to?
keepStable is a template function that simply wraps the range in a different type. It's a way to pass information about the range (e.g., "keep this stable", or "this is sorted already"), to the function understanding it.
Andrei
| |||
April 03, 2009 Re: design question | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | Andrei Alexandrescu wrote: > keepStable is a template function that simply wraps the range in a different type. It's a way to pass information about the range (e.g., "keep this stable", or "this is sorted already"), to the function understanding it. Would these be composable? I.e.: thisAttribute(thatAttribute(T)) is the same as thatAttribute(thisAttribute(T)) ? Would the algorithm detect the 'inner' attribute? | |||
April 03, 2009 Re: design question | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright wrote:
> Andrei Alexandrescu wrote:
>> keepStable is a template function that simply wraps the range in a different type. It's a way to pass information about the range (e.g., "keep this stable", or "this is sorted already"), to the function understanding it.
>
> Would these be composable? I.e.:
>
> thisAttribute(thatAttribute(T)) is the same as thatAttribute(thisAttribute(T)) ? Would the algorithm detect the 'inner' attribute?
Great question. Not sure whether I can implement composition to be entirely order-invariant, but yes, the algorithm should "see" various attributes of the wrapped range.
Andrei
| |||
April 03, 2009 Re: design question | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | Reply to Andrei,
> Walter Bright wrote:
>
>> Andrei Alexandrescu wrote:
>>
>>> keepStable is a template function that simply wraps the range in a
>>> different type. It's a way to pass information about the range
>>> (e.g., "keep this stable", or "this is sorted already"), to the
>>> function understanding it.
>>>
>> Would these be composable? I.e.:
>>
>> thisAttribute(thatAttribute(T)) is the same as
>> thatAttribute(thisAttribute(T)) ? Would the algorithm detect the
>> 'inner' attribute?
>>
> Great question. Not sure whether I can implement composition to be
> entirely order-invariant, but yes, the algorithm should "see" various
> attributes of the wrapped range.
>
> Andrei
>
you could do composition by tuple/set unioning and always sort the set (by .stringof)
| |||
April 03, 2009 Re: design question | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | I don't like it. You're introducing types for things that should only exist at the call site. Now I can declare variables with those types, and even start composing them together. If a user zips two stable ranges together, they should be pissed if they don't get a stable range out. Even if std.algorithm gets it right, 3rd party libs might not. Additionally, std.algorithm should support others mimicing the same type-based design style.
Overall, i don't think it makes implementing sort easier, but it opens a lot of undesirable issues.
Andrei Alexandrescu Wrote:
> I'm toying with a rather new approach to a certain problem, and wanted to ask for some feedback before coding with it.
>
> The STL has two functions with a "stable" variant: stable_sort and stable_partition. They differ from sort and partition in that they preserve the order of equivalent objects. They are more costly, hence the default is "unstable".
>
> I started with the same approach in std.algorithm, but before long I figured two things:
>
> a) There are many more algorithms that can benefit from stable vs. unstable versions: remove, topN, schwartzSort, partialSort, completeSort (take a sorted range and an unsorted one and make a sorted range out of them), and makeIndex.
>
> b) Aside from stable and unstable, there's a semistable notion: when partitioning, the left half stays stable and the right half is unstable.
>
> Clearly encoding stability in the function name isn't faring very well, so I made a policy of it:
>
> enum SwapStrategy { stable, semistable, unstable }
>
> and parameterized the appropriate functions with SwapStrategy. A use case looks like this:
>
> // even to the left, odd to the right
> auto arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
> auto p = partition!("(a & 1) == 0", SwapStrategy.stable)(arr);
> assert(arr == [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]);
> assert(p == arr[5 .. $]);
>
> Ok, until now there's only been background :o|. The new part starts here.
>
> I think the stability request could be better encoded as a wrapper of the range making the request. For example:
>
> auto p = partition!("(a & 1) == 0")(keepStable(arr));
>
> So in order to tell partition you want stability, you just wrap the range with a call to keepStable. (The runtime cost is negligible). Similarly, say you want a stable sort. You'd say:
>
> sort(keepStable(arr));
>
> instead of:
>
> sort!(SwapStrategy.stable)(arr);
>
> I have the feeling this approach will scale better in the future. Also it is less verbose and in sync with the current way of doing searches:
>
> auto f = find(assumeSorted(haystack), needle);
>
> In that case, assumeSorted uses the same means of transmitting information about a range into an algorithm.
>
> So... I'd appreciate any thoughts you might have. Thanks.
>
>
> Andrei
| |||
April 04, 2009 Re: design question | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Fri, 03 Apr 2009 20:53:18 +0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote: > I'm toying with a rather new approach to a certain problem, and wanted to ask for some feedback before coding with it. > > The STL has two functions with a "stable" variant: stable_sort and stable_partition. They differ from sort and partition in that they preserve the order of equivalent objects. They are more costly, hence the default is "unstable". > > I started with the same approach in std.algorithm, but before long I figured two things: > > a) There are many more algorithms that can benefit from stable vs. unstable versions: remove, topN, schwartzSort, partialSort, completeSort (take a sorted range and an unsorted one and make a sorted range out of them), and makeIndex. > > b) Aside from stable and unstable, there's a semistable notion: when partitioning, the left half stays stable and the right half is unstable. > > Clearly encoding stability in the function name isn't faring very well, so I made a policy of it: > > enum SwapStrategy { stable, semistable, unstable } > > and parameterized the appropriate functions with SwapStrategy. A use case looks like this: > > // even to the left, odd to the right > auto arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; > auto p = partition!("(a & 1) == 0", SwapStrategy.stable)(arr); > assert(arr == [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]); > assert(p == arr[5 .. $]); > > Ok, until now there's only been background :o|. The new part starts here. > > I think the stability request could be better encoded as a wrapper of the range making the request. For example: > > auto p = partition!("(a & 1) == 0")(keepStable(arr)); > > So in order to tell partition you want stability, you just wrap the range with a call to keepStable. (The runtime cost is negligible). Similarly, say you want a stable sort. You'd say: > > sort(keepStable(arr)); > > instead of: > > sort!(SwapStrategy.stable)(arr); > > I have the feeling this approach will scale better in the future. Also it is less verbose and in sync with the current way of doing searches: > > auto f = find(assumeSorted(haystack), needle); > > In that case, assumeSorted uses the same means of transmitting information about a range into an algorithm. > > So... I'd appreciate any thoughts you might have. Thanks. > > > Andrei This sounds interesting, but I don't thing it's good idea overall. assumeSorted(range) is fine, because "sorted" is a property of a range. auto r = assumeSorted(range); // makes sense keepStable(range) is not okay, because stable is not a property of a range. It's a property of an algorithm. auto r = keepStable(range); // makes no sense to me | |||
Copyright © 1999-2021 by the D Language Foundation
Permalink
Reply