Jump to page: 1 2
Thread overview
[Issue 5076] New: std.algorithm.sorted / schwartzSorted
Oct 19, 2010
Peter Alexander
Oct 19, 2010
Peter Alexander
Oct 19, 2010
Peter Alexander
Oct 20, 2010
Peter Alexander
Feb 07, 2012
Andrej Mitrovic
October 19, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5076

           Summary: std.algorithm.sorted / schwartzSorted
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Keywords: patch
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody@puremagic.com
        ReportedBy: bearophile_hugs@eml.cc


--- Comment #0 from bearophile_hugs@eml.cc 2010-10-18 19:05:59 PDT ---
I propose to add two new little functions to Phobos, named
sorted()/schwartzSorted(). Their purpose is to copy the input items, sort them
and return them.

The -ed versions are more functional, they don't modify the input data, so they may work with a immutable input sequence too. They may be used as expressions instead of as statements, so in theory they allow code like (currently this doesn't work, see bug 5074 ):


auto foo(immutable(int)[] data) {
    return map!q{ -a }(sorted(data));
}


Instead of:

auto foo(immutable(int)[] data) {
    int[] arr = data.dup;
    sort(arr);
    return map!q{ -a }(arr);
}


sorted()/schwartzSorted() may be seen as less efficient than
sort()/schwartzSort() because they copy the input data, but there are many
situations (like script-like D code) where the programmer wants short and quick
code, and knows the number of input items will be low enough to not cause
memory shortage. Python too has both list.sort() and sorted() built-ins.


A possible simple implementation, for DMD 2.049 (Code not tested much):


import std.algorithm: SwapStrategy, schwartzSort, sort;
import std.range: isRandomAccessRange, hasLength;
import std.array: array;

auto sorted(alias less = "a < b", SwapStrategy ss = SwapStrategy.unstable,
        Range)(Range r)
{
    auto auxr = array(r);
    sort!(less, ss)(auxr);
    return auxr;
}

auto schwartzSorted(alias transform, alias less = "a < b",
        SwapStrategy ss = SwapStrategy.unstable, Range)(Range r)
    if (isRandomAccessRange!(Range) && hasLength!(Range))
{
    auto auxr = array(r);
    schwartzSort!(transform, less, ss)(auxr);
    return auxr;
}

import std.typecons: Tuple;
import std.stdio: writeln;

alias Tuple!(int, "x", int, "y") P;

void main() {
    P[] data = [P(1,4), P(2,3), P(3,1), P(4,0)];
    writeln(data);
    writeln(schwartzSorted!((x) { return x.y; })(data));
    writeln(data);
    writeln(sorted!q{ a.y < b.y }(data));
    writeln(data);
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
October 19, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5076


Peter Alexander <peter.alexander.au@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter.alexander.au@gmail.co
                   |                            |m


--- Comment #1 from Peter Alexander <peter.alexander.au@gmail.com> 2010-10-19 00:20:20 PDT ---
This actually seems to be a common pattern.

By "this", I mean:

auto foo(T value)
{
  T copy = value.dup;
  modify(copy);
  return copy;
}

"modify" here could be sort, reverse, schwartzSort, partition etc. which would give you sorted, reversed, schwartzSorted and partitioned. It's also the same as defining op+ in terms of op+=.

I have no idea what you would call foo though :-(

arr2 = transformed!(sort)(arr1);
arr2 = mutated!(sort)(arr1);
arr2 = modified!(sort)(arr1);
arr2 = copyModify!(sort)(arr1);

???

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
October 19, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5076


Andrei Alexandrescu <andrei@metalanguage.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrei@metalanguage.com


--- Comment #2 from Andrei Alexandrescu <andrei@metalanguage.com> 2010-10-19 06:22:08 PDT ---
One issue is that there's no standardized "create a copy" function for ranges.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
October 19, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5076



--- Comment #3 from Peter Alexander <peter.alexander.au@gmail.com> 2010-10-19 09:15:12 PDT ---
(In reply to comment #2)
> One issue is that there's no standardized "create a copy" function for ranges.

auto copy = array(input);

?

Ideally you'd be able to specify what container you want to copy into, but that should do as a default.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
October 19, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5076



--- Comment #4 from Andrei Alexandrescu <andrei@metalanguage.com> 2010-10-19 09:42:37 PDT ---
(In reply to comment #3)
> (In reply to comment #2)
> > One issue is that there's no standardized "create a copy" function for ranges.
> 
> auto copy = array(input);
> 
> ?
> 
> Ideally you'd be able to specify what container you want to copy into, but that should do as a default.

array creates an array from anything. We should have a way to say "duplicate and preserve type".

Probably the best idea is to define an optional property ".dup" for ranges. Then arrays implement it automatically, and other ranges may define it as they find fit.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
October 19, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5076



--- Comment #5 from Peter Alexander <peter.alexander.au@gmail.com> 2010-10-19 11:15:30 PDT ---
(In reply to comment #4)
> array creates an array from anything. We should have a way to say "duplicate and preserve type".
> 
> Probably the best idea is to define an optional property ".dup" for ranges. Then arrays implement it automatically, and other ranges may define it as they find fit.

What if you don't want to preserve type?

e.g. you have a list, but you want to get a sorted array of the elements in that list?

Why not take the best of both worlds, allowing the user to specify the new container type, but have it default to the original type, something like:

auto transformed(alias xform, OutputRange = InputRange, InputRange)(InputRange
range)
{
  ...
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
October 19, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5076



--- Comment #6 from bearophile_hugs@eml.cc 2010-10-19 13:32:04 PDT ---
(In reply to comment #4)

> array creates an array from anything. We should have a way to say "duplicate and preserve type".

After thinking about your words for some time I have understood your point.

So after your change, for the semantics I am looking for, I'll need to write a
bit longer code:
sorted(array(some_linked_list))

What if the input collection is not sortable? Like:
sorted(some_hash_set)

In that case I presume the compilation will fail, and I'll have to use
something like:
sorted(array(some_hash_set))

Or even:
sorted(toList(some_hash_set))

(Where toList() is similar to array() but produces some kind of list out of an
iterable).

A problem in using this:
sorted(array(some_linked_list))
is that array() is supposed to create an array duplicate of the collection, and
then sorted() is supposed to create a second useless copy of it.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
October 19, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5076



--- Comment #7 from bearophile_hugs@eml.cc 2010-10-19 13:33:57 PDT ---
(In reply to comment #6)

Another problem is that after your change this code will fail to compile even
if some_linked_list is immutable:
sorted(some_linked_list)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
October 20, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5076



--- Comment #8 from bearophile_hugs@eml.cc 2010-10-20 04:04:21 PDT ---
See another case:

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=22381

This is supposed to not work:
sort(map(...))

This is supposed to work:
sorted(map(...))

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
October 20, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5076



--- Comment #9 from Peter Alexander <peter.alexander.au@gmail.com> 2010-10-20 08:49:43 PDT ---
(In reply to comment #8)
> See another case:
> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=22381
> This is supposed to not work:
> sort(map(...))
> This is supposed to work:
> sorted(map(...))

Just implement sorted et al. something like this:

auto sorted(Output = ElementType!InputRange[], InputRange)(InputRange range)
{
  Output output = Output(range); // copy range into new container
  sort(output);
  return output;
}

I don't know if you can construct built-in arrays like that, but you should, and you can always specialise for it if necessary.

This allows the input range to be whatever it likes (including Map), and also gives you the choice of the output range.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
« First   ‹ Prev
1 2