January 23, 2015 [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
So H.S. Teoh awesomely took https://github.com/D-Programming-Language/phobos/pull/2878 to completion. We now have a working and fast relational "group by" facility.

See it at work!

----
#!/usr/bin/rdmd

void main()
{
    import std.algorithm, std.stdio;
    [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
        .groupBy!(a => a & 1)
        .writeln;
}
----

[[293, 453], [600], [929, 339], [812, 222, 680], [529], [768]]

The next step is to define an aggregate() function, which is a lot similar to reduce() but works on ranges of ranges and aggregates a function over each group. Continuing the previous example:

    [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
        .groupBy!(a => a & 1)
        .aggregate!max
        .writeln;

should print:

[453, 600, 929, 812, 529, 768]

The aggregate function should support aggregating several functions at once, e.g. aggregate!(min, max) etc.

Takers?

Andrei
|
January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Fri, 23 Jan 2015 10:08:30 -0800, Andrei Alexandrescu wrote:
> So H.S. Teoh awesomely took https://github.com/D-Programming-Language/phobos/pull/2878 to completion. We now have a working and fast relational "group by" facility.
>
This is great news. It seems like every time I make use of component programming, I need groupBy at least once. I have a D file with an old copy of a groupBy implementation (I think it's Andrei's original stab at it) and it gets copied around to the various projects.
|
January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Fri, Jan 23, 2015 at 10:08:30AM -0800, Andrei Alexandrescu via Digitalmars-d wrote:
> So H.S. Teoh awesomely took https://github.com/D-Programming-Language/phobos/pull/2878 to completion. We now have a working and fast relational "group by" facility.

Unfortunately it doesn't work in pure/@safe/nothrow code because of limitations in the current RefCounted implementation.

[...]
> The next step is to define an aggregate() function, which is a lot
> similar to reduce() but works on ranges of ranges and aggregates a
> function over each group. Continuing the previous example:
>
>     [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
>         .groupBy!(a => a & 1)
>         .aggregate!max
>         .writeln;
>
> should print:
>
> [453, 600, 929, 812, 529, 768]
>
> The aggregate function should support aggregating several functions at once, e.g. aggregate!(min, max) etc.
>
> Takers?
[...]

Isn't that just a simple matter of defining aggregate() in terms of map() and reduce()? Working example:

    import std.algorithm.comparison : max;
    import std.algorithm.iteration;
    import std.stdio;

    auto aggregate(alias func, RoR)(RoR ror) {
        return ror.map!(reduce!func);
    }

    void main() {
        [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
            .groupBy!(a => a & 1)
            .aggregate!max
            .writeln;
    }

Output is as expected.

T

--
Verbing weirds language. -- Calvin (& Hobbes)
|
January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
On Fri, Jan 23, 2015 at 10:29:13AM -0800, H. S. Teoh via Digitalmars-d wrote:
> On Fri, Jan 23, 2015 at 10:08:30AM -0800, Andrei Alexandrescu via Digitalmars-d wrote:
[...]
> > The next step is to define an aggregate() function, which is a lot
> > similar to reduce() but works on ranges of ranges and aggregates a
> > function over each group. Continuing the previous example:
> >
> >     [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
> >         .groupBy!(a => a & 1)
> >         .aggregate!max
> >         .writeln;
> >
> > should print:
> >
> > [453, 600, 929, 812, 529, 768]
> >
> > The aggregate function should support aggregating several functions at once, e.g. aggregate!(min, max) etc.
[...]

Here's a working variadic implementation:

    import std.algorithm.comparison : max, min;
    import std.algorithm.iteration;
    import std.stdio;

    template aggregate(funcs...) {
        auto aggregate(RoR)(RoR ror) {
            return ror.map!(reduce!funcs);
        }
    }

    void main() {
        [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
            .groupBy!(a => a & 1)
            .aggregate!(max,min)
            .writeln;
    }

Output (kinda ugly, but it works):

[Tuple!(int, int)(453, 293), Tuple!(int, int)(600, 600), Tuple!(int, int)(929, 339), Tuple!(int, int)(812, 222), Tuple!(int, int)(529, 529), Tuple!(int, int)(768, 768)]

Of course, it will require a little more polish before merging into Phobos, but the core implementation is nowhere near the complexity of groupBy.

T

--
The best compiler is between your ears. -- Michael Abrash
|
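Editorial sketch, not part of the original post: the snippets above target the Phobos snapshot of January 2015. Assuming a current Phobos, where this grouping facility is available as `chunkBy` and a binary predicate yields plain subranges, the same idea might look like the code below. The `aggregate` template and the `sameParity` helper are illustrative names for this sketch only, not Phobos functions.

```d
import std.algorithm.comparison : max, min;
import std.algorithm.iteration : chunkBy, map, reduce;
import std.array : array;
import std.typecons : tuple;

// Hypothetical helper mirroring the thread's idea; not a Phobos function.
template aggregate(funcs...)
{
    auto aggregate(RoR)(RoR ror)
    {
        // Reduce every group with all the given functions at once.
        return ror.map!(group => group.reduce!funcs);
    }
}

// Binary predicate: adjacent elements with equal parity share a group.
bool sameParity(int a, int b)
{
    return (a & 1) == (b & 1);
}

void main()
{
    auto data = [293, 453, 600, 929, 339, 812, 222, 680, 529, 768];

    // One function: one result per group.
    auto maxima = data.chunkBy!sameParity.aggregate!max.array;
    assert(maxima == [453, 600, 929, 812, 529, 768]);

    // Several functions: each group yields a tuple of results.
    auto extremes = data.chunkBy!sameParity.aggregate!(max, min).array;
    assert(extremes[0] == tuple(453, 293));
    assert(extremes[3] == tuple(812, 222));
}
```

The lambda inside `map` is used here instead of passing `reduce!funcs` directly; that is a stylistic choice for the sketch, and the post's original form expresses the same thing.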
January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On 1/23/15 10:29 AM, H. S. Teoh via Digitalmars-d wrote:
> On Fri, Jan 23, 2015 at 10:08:30AM -0800, Andrei Alexandrescu via Digitalmars-d wrote:
>> So H.S. Teoh awesomely took
>> https://github.com/D-Programming-Language/phobos/pull/2878 to
>> completion. We now have a working and fast relational "group by"
>> facility.
>
> Unfortunately it doesn't work in pure/@safe/nothrow code because of
> limitations in the current RefCounted implementation.
>
>
> [...]
>> The next step is to define an aggregate() function, which is a lot
>> similar to reduce() but works on ranges of ranges and aggregates a
>> function over each group. Continuing the previous example:
>>
>>     [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
>>         .groupBy!(a => a & 1)
>>         .aggregate!max
>>         .writeln;
>>
>> should print:
>>
>> [453, 600, 929, 812, 529, 768]
>>
>> The aggregate function should support aggregating several functions at
>> once, e.g. aggregate!(min, max) etc.
>>
>> Takers?
> [...]
>
> Isn't that just a simple matter of defining aggregate() in terms of
> map() and reduce()? Working example:
>
>     import std.algorithm.comparison : max;
>     import std.algorithm.iteration;
>     import std.stdio;
>
>     auto aggregate(alias func, RoR)(RoR ror) {
>         return ror.map!(reduce!func);
>     }
>
>     void main() {
>         [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
>             .groupBy!(a => a & 1)
>             .aggregate!max
>             .writeln;
>     }
>
> Output is as expected.
Clever! Or, conversely, I'm not that bright! Yes, this is awesome. Probably the actual name "aggregate" should be defined even with that trivial implementation to help folks like me :o). -- Andrei
|
January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On 1/23/15 10:34 AM, H. S. Teoh via Digitalmars-d wrote:
> Of course, it will require a little more polish before merging into
> Phobos, but the core implementation is nowhere near the complexity of
> groupBy.

open https://github.com/D-Programming-Language/phobos/pulls

[F5]... [F5]... [F5]...

Andrei
|
January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Fri, Jan 23, 2015 at 10:47:28AM -0800, Andrei Alexandrescu via Digitalmars-d wrote:
> On 1/23/15 10:34 AM, H. S. Teoh via Digitalmars-d wrote:
> >Of course, it will require a little more polish before merging into Phobos, but the core implementation is nowhere near the complexity of groupBy.
>
> open https://github.com/D-Programming-Language/phobos/pulls
>
> [F5]... [F5]... [F5]...
[...]

    void main() {
        foreach (_; 0 .. 60 * 60 * F5sPerSecond)
            writeln("[F5]...");
        writeln(q"ENDMSG
    https://github.com/D-Programming-Language/phobos/pull/2899
    ENDMSG");
    }

;-)

T

--
No! I'm not in denial!
|
January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 1/23/15 3:08 PM, Andrei Alexandrescu wrote:
> So H.S. Teoh awesomely took
> https://github.com/D-Programming-Language/phobos/pull/2878 to
> completion. We now have a working and fast relational "group by" facility.
>
> See it at work!
>
> ----
> #!/usr/bin/rdmd
>
> void main()
> {
>     import std.algorithm, std.stdio;
>     [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
>         .groupBy!(a => a & 1)
>         .writeln;
> }
> ----
>
> [[293, 453], [600], [929, 339], [812, 222, 680], [529], [768]]
>
> The next step is to define an aggregate() function, which is a lot
> similar to reduce() but works on ranges of ranges and aggregates a
> function over each group. Continuing the previous example:
>
>     [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
>         .groupBy!(a => a & 1)
>         .aggregate!max
>         .writeln;
>
> should print:
>
> [453, 600, 929, 812, 529, 768]
>
> The aggregate function should support aggregating several functions at
> once, e.g. aggregate!(min, max) etc.
>
> Takers?
>
>
> Andrei

In most languages group by yields a tuple of {group key, group values}.

For example (Ruby or Crystal):

    a = [1, 4, 2, 4, 5, 2, 3, 7, 9]
    groups = a.group_by { |x| x % 3 }
    puts groups #=> {1 => [1, 4, 4, 7], 2 => [2, 5, 2], 0 => [3, 9]}

In C# it's also called group by: http://www.dotnetperls.com/groupby

Java: http://docs.oracle.com/javase/8/docs/api/java/util/stream/Collectors.html#groupingBy-java.util.function.Function-

SQL: http://www.w3schools.com/sql/sql_groupby.asp

So I'm not sure "groupBy" is a good name for this.
|
January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ary Borenszweig | On 1/23/15 12:19 PM, Ary Borenszweig wrote:
> In most languages group by yields a tuple of {group key, group values}.
Interesting, thanks. Looks like we're at a net loss of information with our current approach.
@quickfur, do you think you could expose a tuple with "key" and "values"? The former would be the function value, the latter would be what we offer right now.
That would apply only to the unary version of groupBy.
Andrei
|
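Editorial sketch, not part of the original post: the "key" and "values" pairing requested above can be illustrated as follows. This assumes a current Phobos, where the facility is named `chunkBy` and a unary predicate yields tuples of (predicate value, subrange). `KeyedGroup` and `parityGroups` are hypothetical names used only for this sketch.

```d
import std.algorithm.iteration : chunkBy, map;
import std.array : array;
import std.typecons : Tuple;

// Hypothetical named tuple for this sketch: the function value plus its group.
alias KeyedGroup = Tuple!(int, "key", int[], "values");

// Groups adjacent elements by parity and keeps the key next to each group.
KeyedGroup[] parityGroups(int[] data)
{
    return data
        .chunkBy!(a => a & 1)   // unary predicate: yields (key, subrange)
        .map!(chunk => KeyedGroup(chunk[0], chunk[1].array))
        .array;
}

void main()
{
    auto groups = parityGroups([293, 453, 600, 929, 339, 812, 222, 680, 529, 768]);
    assert(groups.length == 6);
    assert(groups[0].key == 1 && groups[0].values == [293, 453]);
    assert(groups[3].key == 0 && groups[3].values == [812, 222, 680]);
}
```

Copying each group into an array is done here only to make the result easy to inspect; a lazy pipeline could keep the subranges as they are.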
January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ary Borenszweig | Ary Borenszweig:
> In most languages group by yields a tuple of {group key, group values}.
I've been saying this for some years... (and those languages probably don't use sorting to perform the aggregation).
Bye,
bearophile
|
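Editorial sketch, not part of the original thread: grouping without sorting, as in the Ruby example quoted earlier, can be approximated in D with a built-in associative array. Unlike the range-based facility discussed above, this collects every element with the same key, adjacent or not, and it is eager rather than lazy. `groupByKey` is a hypothetical helper name, not a Phobos function.

```d
// Hypothetical helper, not part of Phobos: groups all elements by key
// using an associative array, so neither sorting nor adjacency is needed.
int[][int] groupByKey(alias keyOf)(int[] items)
{
    int[][int] groups;
    foreach (item; items)
        groups[keyOf(item)] ~= item;   // creates the entry on first use
    return groups;
}

void main()
{
    auto groups = groupByKey!(x => x % 3)([1, 4, 2, 4, 5, 2, 3, 7, 9]);
    assert(groups[1] == [1, 4, 4, 7]);
    assert(groups[2] == [2, 5, 2]);
    assert(groups[0] == [3, 9]);
}
```

The trade-off is memory: the whole input is materialized per key, whereas the adjacent-group approach can stream over its input.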