February 13, 2015 groupBy/chunkBy redux

Looks like there's a backlog of stuff to finalize for groupBy and aggregate:

* Perhaps rename groupBy to chunkBy. People coming from SQL and other languages might expect groupBy to do hash-based grouping.

* The unary function implementation must return for each group a tuple consisting of the key and the lazy range of values. The binary function implementation should continue to only return the lazy range of values.

* SortedRange should add a method called group(). Invoked with no predicate, group() should do what chunkBy does, using the sorting predicate.

* aggregate() should detect the two kinds of results per group (well, chunk) and process them accordingly: for unary-predicate chunks, pass the key through and only process the lazy range. Meaning:

    auto data = [
        tuple("John", 100),
        tuple("John", 35),
        tuple("Jane", 200),
        tuple("Jane", 87),
    ];
    auto r = data.chunkBy!(x => x[0]).aggregate!sum;

yields a range of tuples: tuple("John", 135), tuple("Jane", 187).

Andrei
February 13, 2015 Re: groupBy/chunkBy redux

Posted in reply to Andrei Alexandrescu

On Friday, 13 February 2015 at 18:32:35 UTC, Andrei Alexandrescu wrote:
> * Perhaps rename groupBy to chunkBy. People coming from SQL and other languages might expect groupBy to do hash-based grouping.

Agreed.

> * The unary function implementation must return for each group a tuple consisting of the key and the lazy range of values. The binary function implementation should continue to only return the lazy range of values.

Is the purpose of this just to avoid the user potentially needing to evaluate the key function twice?

> * SortedRange should add a method called group(). Invoked with no predicate, group() should do what chunkBy does, using the sorting predicate.

Will need to be called something else since there may be existing code trying to call std.algorithm.group using UFCS. This would change its behaviour.

> * aggregate() should detect the two kinds of results per group (well, chunk) and process them accordingly: for unary-predicate chunks, pass the key through and only process the lazy range. Meaning:
>
> auto data = [
>     tuple("John", 100),
>     tuple("John", 35),
>     tuple("Jane", 200),
>     tuple("Jane", 87),
> ];
> auto r = data.chunkBy!(x => x[0]).aggregate!sum;
>
> yields a range of tuples: tuple("John", 135), tuple("Jane", 187).

Not sure I understand how this is meant to work.

With your second bullet implemented, data.chunkBy!(x => x[0]) will return:

    tuple("John", [tuple("John", 100), tuple("John", 35)]),
    tuple("Jane", [tuple("Jane", 200), tuple("Jane", 87)])

(here [...] denotes the sub-range, not an array).

So aggregate will ignore the key part, but how does it know to ignore the name in sub-ranges?
February 14, 2015 Re: groupBy/chunkBy redux

Posted in reply to Peter Alexander

On 2/13/15 3:45 PM, Peter Alexander wrote:
> On Friday, 13 February 2015 at 18:32:35 UTC, Andrei Alexandrescu wrote:
>> * Perhaps rename groupBy to chunkBy. People coming from SQL and other
>> languages might expect groupBy to do hash-based grouping.
>
> Agreed.
>
>
>> * The unary function implementation must return for each group a tuple
>> consisting of the key and the lazy range of values. The binary
>> function implementation should continue to only return the lazy range
>> of values.
>
> Is the purpose of this just to avoid the user potentially needing to
> evaluate the key function twice?

Yah. Also in many cases of grouping you need the key anyway.

>> * SortedRange should add a method called group(). Invoked with no
>> predicate, group() should do what chunkBy does, using the sorting
>> predicate.
>
> Will need to be called something else since there may be existing code
> trying to call std.algorithm.group using UFCS. This would change its
> behaviour.

Oops, I thought that's groups. I guess we could call it groupBy as well, even though it has no predicate so "by" does not participate to a sentence.

>> * aggregate() should detect the two kinds of results per group (well,
>> chunk) and process them accordingly: for unary-predicate chunks, pass
>> the key through and only process the lazy range. Meaning:
>>
>> auto data = [
>>     tuple("John", 100),
>>     tuple("John", 35),
>>     tuple("Jane", 200),
>>     tuple("Jane", 87),
>> ];
>> auto r = data.chunkBy!(x => x[0]).aggregate!sum;
>>
>> yields a range of tuples: tuple("John", 135), tuple("Jane", 187).
>
> Not sure I understand how this is meant to work.
>
> With your second bullet implemented, data.chunkBy!(x => x[0]) will return:
>
> tuple("John", [tuple("John", 100), tuple("John", 35)]),
> tuple("Jane", [tuple("Jane", 200), tuple("Jane", 87)])

Correct.

> (here [...] denotes the sub-range, not an array).
>
> So aggregate will ignore the key part, but how does it know to ignore
> the name in sub-ranges?

Oops, I was wrong here. Let's think about aggregate() integration post-2.067 and remove it for now.

Peter, could you please take this?

Andrei
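With aggregate() set aside, the per-name totals from the example can still be computed by projecting the value out of each sub-range by hand. The following is only a minimal sketch, assuming chunkBy behaves as described in the second bullet (unary predicate yields tuples of key and lazy sub-range, binary predicate yields only the sub-range). The helper names totalsByName and totalsOnly are hypothetical and not part of Phobos.

```d
import std.algorithm.iteration : chunkBy, map, sum;
import std.array : array;
import std.typecons : Tuple, tuple;

// Hypothetical helper: unary-predicate chunkBy yields (key, sub-range) tuples,
// so the key is passed through and only the sub-range is reduced.
Tuple!(string, int)[] totalsByName(Tuple!(string, int)[] data)
{
    return data
        .chunkBy!(x => x[0])                               // group adjacent equal names
        .map!(g => tuple(g[0], g[1].map!(x => x[1]).sum))  // project the amount, then sum
        .array;
}

// Hypothetical helper: binary-predicate chunkBy yields only the sub-ranges,
// so the key has to be recovered separately if it is needed.
int[] totalsOnly(Tuple!(string, int)[] data)
{
    return data
        .chunkBy!((a, b) => a[0] == b[0])
        .map!(g => g.map!(x => x[1]).sum)
        .array;
}

void main()
{
    auto data = [
        tuple("John", 100),
        tuple("John", 35),
        tuple("Jane", 200),
        tuple("Jane", 87),
    ];

    assert(totalsByName(data) == [tuple("John", 135), tuple("Jane", 187)]);
    assert(totalsOnly(data) == [135, 187]);
}
```

The explicit map!(x => x[1]) is exactly the step raised in the question above: the sub-range elements still carry the name, so something has to say which field to reduce. Note also that chunkBy only groups adjacent elements, so unsorted input would produce several chunks for the same name.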
February 15, 2015 Re: groupBy/chunkBy redux

Posted in reply to Andrei Alexandrescu

On Saturday, 14 February 2015 at 19:39:44 UTC, Andrei Alexandrescu wrote:
>
> Peter, could you please take this?

Yep, I have some time.

https://issues.dlang.org/show_bug.cgi?id=14183
February 15, 2015 Re: groupBy/chunkBy redux

Posted in reply to Peter Alexander

On 2/15/15 11:34 AM, Peter Alexander wrote:
> On Saturday, 14 February 2015 at 19:39:44 UTC, Andrei Alexandrescu wrote:
>>
>> Peter, could you please take this?
>
> Yep, I have some time.
>
> https://issues.dlang.org/show_bug.cgi?id=14183
Fantastic, thanks! Remember we plan to release on March 1. -- Andrei
April 17, 2015 Re: groupBy/chunkBy redux

Posted in reply to Andrei Alexandrescu

On Sunday, 15 February 2015 at 19:42:16 UTC, Andrei Alexandrescu wrote:
> On 2/15/15 11:34 AM, Peter Alexander wrote:
>>
>> https://issues.dlang.org/show_bug.cgi?id=14183
>
> Fantastic, thanks! Remember we plan to release on March 1. --

I am somewhat confused. I know these changes have been done. The function has been renamed to chunkBy and the return type of the unary version has been changed. I am surprised to learn, however, that the old version is included in the 2.067.0 release:

http://dlang.org/phobos/std_algorithm_iteration.html#.groupBy

What has happened?
April 18, 2015 Re: groupBy/chunkBy redux

Posted in reply to Ulrich Küttler

On 4/17/15 2:30 PM, "Ulrich Küttler" <kuettler@gmail.com> wrote:
> On Sunday, 15 February 2015 at 19:42:16 UTC, Andrei Alexandrescu wrote:
>> On 2/15/15 11:34 AM, Peter Alexander wrote:
>>>
>>> https://issues.dlang.org/show_bug.cgi?id=14183
>>
>> Fantastic, thanks! Remember we plan to release on March 1. --
>
> I am somewhat confused. I know these changes have been done. The
> function has been renamed to chunkBy and the return type of the unary
> version has been changed. I am surprised to learn, however, that the old
> version is included in the 2.067.0 release:
>
> http://dlang.org/phobos/std_algorithm_iteration.html#.groupBy
>
> What has happened?
Sighhhh... the master has chunkBy but 2.067.0 has groupBy. Martin? -- Andrei
April 18, 2015 Re: groupBy/chunkBy redux

Posted in reply to Andrei Alexandrescu

I wonder what it's going to look like to see byChunk and chunkBy next to each other.