January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On 1/23/15 1:36 PM, H. S. Teoh via Digitalmars-d wrote:
> On Fri, Jan 23, 2015 at 08:44:05PM +0000, via Digitalmars-d wrote:
> [...]
>> You are talking about two different functions here. group by and
>> partition by. The function that has been implemented is often called
>> partition by.
> [...]
>
> It's not too late to rename it, since we haven't released it yet. We
> still have a little window of time to make this change if necessary.
> Andrei?
>
> Returning each group as a tuple sounds like a distinct, albeit related,
> function. It can probably be added separately.
We already have partition() functions that actually partition a range into two subranges, so adding partitionBy with a different meaning may be confusing. -- Andrei
|
January 23, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to MattCoder | On 1/23/15 1:56 PM, MattCoder wrote:
> On Friday, 23 January 2015 at 18:08:30 UTC, Andrei Alexandrescu wrote:
>> So H.S. Teoh awesomely took
>> https://github.com/D-Programming-Language/phobos/pull/2878 to
>> completion. We now have a working and fast relational "group by"
>> facility.
>>
>> See it at work!
>>
>> ----
>> #!/usr/bin/rdmd
>>
>> void main()
>> {
>> import std.algorithm, std.stdio;
>> [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
>> .groupBy!(a => a & 1)
>> .writeln;
>> }
>> ----
>>
>> [[293, 453], [600], [929, 339], [812, 222, 680], [529], [768]]
>
> Sorry if this is a dumb question, but since you're grouping an array
> according to some rule, shouldn't this be:
>
> [293, 453, 929, 339, 529][600, 812, 222, 680, 768]
>
> ?
>
> Because then you have the array of "trues" and "falses" according to the
> condition (a & 1).
Yah, that would be partition(). -- Andrei
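For comparison, here is a minimal sketch of what partition() does with the same data. The helper name `oddsFirst` is made up for illustration, and `SwapStrategy.stable` is chosen here only so that the relative order inside each half is preserved; the default strategy is unstable and may reorder elements.

```d
import std.algorithm : partition, SwapStrategy;
import std.stdio : writeln;

// Hypothetical helper: rearranges `data` in place so that odd elements come
// first, and returns the tail slice that holds the even elements.
int[] oddsFirst(int[] data)
{
    // Stable partitioning keeps the original relative order within each half.
    return partition!(a => (a & 1) == 1, SwapStrategy.stable)(data);
}

void main()
{
    auto data = [293, 453, 600, 929, 339, 812, 222, 680, 529, 768];
    auto evens = oddsFirst(data);
    auto odds = data[0 .. $ - evens.length];
    writeln(odds, evens);
    // prints: [293, 453, 929, 339, 529][600, 812, 222, 680, 768]
}
```

Unlike groupBy, this mutates the array and always yields exactly two subranges, regardless of how the odd and even elements were interleaved.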
|
January 24, 2015 proper groupBy | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu |
On Friday, 23 January 2015 at 20:28:32 UTC, Andrei Alexandrescu wrote:
> On 1/23/15 12:19 PM, Ary Borenszweig wrote:
>> In most languages group by yields a tuple of {group key, group values}.
>
> Interesting, thanks. Looks like we're at a net loss of information with our current approach.
>
> @quickfur, do you think you could expose a tuple with "key" and "values"? The former would be the function value, the latter would be what we offer right now.
>
> That would apply only to the unary version of groupBy.
>
>
> Andrei
A groupBy hack is below. I haven't yet read the source code and don't feel I understand ranges deeply enough to know whether this will work in the general case, but it at least works for the example (I think).
Laeeth.
#!/usr/bin/rdmd
void main()
{
    import std.algorithm, std.stdio, std.range;
    auto index = [293, 453, 600, 929, 339, 812, 222, 680, 529, 768];
    auto vals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    // Pair each element with its position so the position travels into the group.
    auto zippy = zip(index, vals);
    zippy.groupBy!(a => a[0] & 1)
         .writeln;
}
[root@fedorabox test]# ./groupby
[[Tuple!(int, int)(293, 1), Tuple!(int, int)(453, 2)], [Tuple!(int, int)(600, 3)], [Tuple!(int, int)(929, 4), Tuple!(int, int)(339, 5)], [Tuple!(int, int)(812, 6), Tuple!(int, int)(222, 7), Tuple!(int, int)(680, 8)], [Tuple!(int, int)(529, 9)], [Tuple!(int, int)(768, 10)]]
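The "key" plus "values" shape discussed above can be sketched as follows. This assumes a current Phobos, where the function from this thread is available as `chunkBy` in std.algorithm and its unary form yields a tuple of (function value, group). The helper name `keyedRuns` is made up for illustration.

```d
import std.algorithm : chunkBy, map;
import std.array : array;
import std.stdio : writeln;
import std.typecons : tuple;

// Hypothetical helper: materializes each consecutive run as (key, values).
auto keyedRuns(int[] data)
{
    return data
        .chunkBy!(a => a & 1)                 // unary form: tuple of (key, run)
        .map!(g => tuple(g[0], g[1].array))   // copy each lazy run into an array
        .array;
}

void main()
{
    auto runs = keyedRuns([293, 453, 600, 929, 339, 812, 222, 680, 529, 768]);
    foreach (r; runs)
        writeln(r[0], ": ", r[1]);
    // first line printed: 1: [293, 453]
}
```

The runs are still consecutive ones, so the same key can appear several times; only the key is now carried along with each run.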
|
January 25, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 1/23/15 8:54 PM, Andrei Alexandrescu wrote:
> On 1/23/15 1:36 PM, H. S. Teoh via Digitalmars-d wrote:
>> On Fri, Jan 23, 2015 at 08:44:05PM +0000, via Digitalmars-d wrote:
>> [...]
>>> You are talking about two different functions here. group by and
>>> partition by. The function that has been implemented is often called
>>> partition by.
>> [...]
>>
>> It's not too late to rename it, since we haven't released it yet. We
>> still have a little window of time to make this change if necessary.
>> Andrei?
>>
>> Returning each group as a tuple sounds like a distinct, albeit related,
>> function. It can probably be added separately.
>
> We already have partition() functions that actually partition a range
> into two subranges, so adding partitionBy with a different meaning may
> be confusing. -- Andrei
Another name might be chunkBy: it returns chunks that are grouped by some logic.
|
January 25, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On 1/23/15 7:30 PM, bearophile wrote:
> H. S. Teoh:
>
>> What you describe could be an interesting candidate to add, though. It
>> could iterate over distinct values of the predicate, and traverse the
>> forward range (input ranges obviously can't work unless you allocate,
>> which makes it no longer lazy) each time. This, however, has O(n*k)
>> complexity where k is the number of distinct predicate values.
>
> Let's allocate, creating an associative array inside the grouping
> function :-)
>
> Bye,
> bearophile
All languages I know do this for `group by` (because of the complexity involved), and I think it's ok to do so.
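A minimal sketch of the allocating approach, assuming an associative array keyed by the function value is acceptable. The name `groupIntoAA` is hypothetical and is not a Phobos function.

```d
import std.range.primitives : ElementType, isInputRange;
import std.stdio : writeln;

// Hypothetical helper: eagerly collects every element under its key.
// One pass over the input, one associative array allocated; not lazy.
auto groupIntoAA(alias keyFn, R)(R range)
if (isInputRange!R)
{
    alias E = ElementType!R;
    alias K = typeof(keyFn(E.init));
    E[][K] groups;
    foreach (e; range)
        groups[keyFn(e)] ~= e;   // appending creates the entry on first use
    return groups;
}

void main()
{
    auto data = [293, 453, 600, 929, 339, 812, 222, 680, 529, 768];
    auto groups = data.groupIntoAA!(a => a & 1);
    writeln(groups[1]);   // [293, 453, 929, 339, 529]
    writeln(groups[0]);   // [600, 812, 222, 680, 768]
}
```

This works on plain input ranges because nothing is traversed twice, at the cost of laziness. The iteration order of the keys in the resulting associative array is unspecified.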
|
January 25, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ary Borenszweig | On Sun, Jan 25, 2015 at 01:39:59AM -0300, Ary Borenszweig via Digitalmars-d wrote:
> On 1/23/15 8:54 PM, Andrei Alexandrescu wrote:
> >On 1/23/15 1:36 PM, H. S. Teoh via Digitalmars-d wrote:
> >>On Fri, Jan 23, 2015 at 08:44:05PM +0000, via Digitalmars-d wrote:
> >>[...]
> >>>You are talking about two different functions here. group by and
> >>>partition by. The function that has been implemented is often called
> >>>partition by.
> >>[...]
> >>
> >>It's not too late to rename it, since we haven't released it yet. We
> >>still have a little window of time to make this change if necessary.
> >>Andrei?
> >>
> >>Returning each group as a tuple sounds like a distinct, albeit related,
> >>function. It can probably be added separately.
> >
> >We already have partition() functions that actually partition a range
> >into two subranges, so adding partitionBy with a different meaning may
> >be confusing. -- Andrei
>
> Another name might be chunkBy: it returns chunks that are grouped by some logic.
Incidentally, that was the original name I implemented it under.
T
--
What do you get if you drop a piano down a mineshaft? A flat minor.
|
January 25, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu |
> We already have partition() functions that actually partition a range into two subranges, so adding partitionBy with a different meaning may be confusing. -- Andrei
In ruby, the closest to D's currently-named groupBy method is a set of three methods:
slice_before
slice_after
slice_when
http://ruby-doc.org/core-2.2.0/Enumerable.html#method-i-slice_when
Your example in ruby would be:
2.2.0 > [293, 453, 600, 929, 339, 812, 222, 680, 529, 768].slice_when { |x,y| x & 1 != y & 1 }.to_a
=> [[293, 453], [600], [929, 339], [812, 222, 680], [529], [768]]
O.
|
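A rough D counterpart of the slice_when call, as a sketch. It assumes a current Phobos, where the function is available as `chunkBy` and accepts a binary predicate that says whether two adjacent elements belong to the same run (the inverse of Ruby's "slice here" test). The helper name `parityRuns` is made up for illustration.

```d
import std.algorithm : chunkBy, map;
import std.array : array;
import std.stdio : writeln;

// Hypothetical helper: splits `data` into runs of equal parity.
int[][] parityRuns(int[] data)
{
    return data
        .chunkBy!((a, b) => (a & 1) == (b & 1))   // true: a and b stay together
        .map!(run => run.array)                   // materialize each lazy run
        .array;
}

void main()
{
    parityRuns([293, 453, 600, 929, 339, 812, 222, 680, 529, 768]).writeln;
    // prints: [[293, 453], [600], [929, 339], [812, 222, 680], [529], [768]]
}
```

With a binary predicate the result holds only the runs themselves, with no key attached.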
January 26, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Fri, 2015-01-23 at 10:08 -0800, Andrei Alexandrescu via Digitalmars-d wrote:
[…]
> #!/usr/bin/rdmd
>
> void main()
> {
> import std.algorithm, std.stdio;
> [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
> .groupBy!(a => a & 1)
> .writeln;
> }
> ----
>
> [[293, 453], [600], [929, 339], [812, 222, 680], [529], [768]]
>
[…]
I think I must be missing something, for me the result of a groupBy operation on the above input data should be:
[1:[293, 453, 929, 339, 529], 0:[600, 812, 222, 680, 768]]
i.e. a map with keys being the cases and values being the values that meet the case. In this example a & 1 asks for cases "lowest bit 0 or 1" aka "odd or even".
There is nothing wrong with the semantics of the result above, but is its name "group by" as understood by the rest of the world?
--
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
|
January 26, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to Russel Winder | Russel Winder:
> but is its name "group by" as understood by the rest of the world?
Nope...
Bye,
bearophile
|
January 26, 2015 Re: [WORK] groupBy is in! Next: aggregate | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On Mon, Jan 26, 2015 at 11:26:04AM +0000, bearophile via Digitalmars-d wrote:
> Russel Winder:
>
> >but is its name "group by" as understood by the rest of the world?
>
> Nope...
[...]
I proposed to rename it but it got shot down. *shrug*
We still have a short window of time to sort this out, before 2.067 is released...
T
--
Don't drink and derive. Alcohol and algebra don't mix.
|