Thread overview | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
October 24, 2014 [Issue 13595] A possible problem with std.algorithm.groupBy | ||||
---|---|---|---|---|
| ||||
https://issues.dlang.org/show_bug.cgi?id=13595 hsteoh@quickfur.ath.cx changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |hsteoh@quickfur.ath.cx -- |
October 24, 2014 [Issue 13595] A possible problem with std.algorithm.groupBy | ||||
---|---|---|---|---|
| ||||
https://issues.dlang.org/show_bug.cgi?id=13595 hsteoh@quickfur.ath.cx changed: What |Removed |Added ---------------------------------------------------------------------------- OS|Windows |All --- Comment #1 from hsteoh@quickfur.ath.cx --- Definitely a bug, the returned subranges should not have empty elements. -- |
October 24, 2014 [Issue 13595] A possible problem with std.algorithm.groupBy | ||||
---|---|---|---|---|
| ||||
https://issues.dlang.org/show_bug.cgi?id=13595 hsteoh@quickfur.ath.cx changed: What |Removed |Added ---------------------------------------------------------------------------- Severity|normal |enhancement --- Comment #2 from hsteoh@quickfur.ath.cx --- Actually, I take that back. This is not a bug (w.r.t. the current documentation) because the predicate is expected to be an equivalence relation, but (x,y)=>((x*y)%3)==0 is not a reflexive relation. So the issue is more, should groupBy be extended to support non-equivalence relations? I have considered this before but decided against it because it would be more inefficient to implement. However, allowing non-equivalence relations between adjacent elements *does* allow for much more interesting groupings (e.g., you could implement word breaks with it), so arguably it should be supported somehow. Marking this as as enhancement request. -- |
October 27, 2014 [Issue 13595] Extend std.algorithm.groupBy to support non-equivalence relations | ||||
---|---|---|---|---|
| ||||
https://issues.dlang.org/show_bug.cgi?id=13595 hsteoh@quickfur.ath.cx changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|A possible problem with |Extend |std.algorithm.groupBy |std.algorithm.groupBy to | |support non-equivalence | |relations -- |
November 04, 2014 [Issue 13595] Extend std.algorithm.groupBy to support non-equivalence relations | ||||
---|---|---|---|---|
| ||||
https://issues.dlang.org/show_bug.cgi?id=13595 hsteoh@quickfur.ath.cx changed: What |Removed |Added ---------------------------------------------------------------------------- Keywords| |pull --- Comment #3 from hsteoh@quickfur.ath.cx --- https://github.com/D-Programming-Language/phobos/pull/2654 Note that the Haskell example is actually wrong, because the Haskell groupBy function also assumes that the predicate is an equivalence relation. It just so happens that Haskell's groupBy only assumes transitivity, whereas the original implementation of std.algorithm.groupBy also assumes reflexivity, which causes an infinite loop when that assumption fails to hold. In the above referenced PR, groupBy now defaults to assuming a non-equivalence relation, meaning that the predicate is evaluated for every pair of adjacent elements, so the OP's predicate (x,y) => (x*y)%3 == 0 will produce the grouping [[1], [2,3,4], [5,6,7], [8,9]], because both 2*3 and 3*4 are divisible by 3, as are 5*6 and 6*7. (Notice that Haskell's groupBy evaluates the predicate against the first element of the subrange, thus it checks 2*3 and 2*4 instead of 2*3 and 3*4 for the second subrange.) -- |
November 04, 2014 [Issue 13595] Extend std.algorithm.groupBy to support non-equivalence relations | ||||
---|---|---|---|---|
| ||||
https://issues.dlang.org/show_bug.cgi?id=13595 --- Comment #4 from hsteoh@quickfur.ath.cx --- P.S. Due to the extra performance overhead in evaluating the predicate for every adjacent pair of elements (and also additional logic to ensure correctness), groupBy now takes a compile-time parameter that can be used to get the original performance when the predicate is an equivalence relation. This way we have the best of both worlds. :-) -- |
November 04, 2014 [Issue 13595] Extend std.algorithm.groupBy to support non-equivalence relations | ||||
---|---|---|---|---|
| ||||
https://issues.dlang.org/show_bug.cgi?id=13595 --- Comment #5 from bearophile_hugs@eml.cc --- (In reply to hsteoh from comment #4) > P.S. Due to the extra performance overhead in evaluating the predicate for every adjacent pair of elements (and also additional logic to ensure correctness), groupBy now takes a compile-time parameter that can be used to get the original performance when the predicate is an equivalence relation. This way we have the best of both worlds. :-) Thank you. But there's a risk of "Boolean Blindness" (http://existentialtype.wordpress.com/2011/03/15/boolean-blindness/ ), so perhaps it's better to use std.typecons.Flag instead of a bool. Regarding groupBy, I hope someday we'll also have a hashGroupBy or something similar (Issue 9842 ). -- |
December 01, 2014 [Issue 13595] Extend std.algorithm.groupBy to support non-equivalence relations | ||||
---|---|---|---|---|
| ||||
https://issues.dlang.org/show_bug.cgi?id=13595 --- Comment #6 from github-bugzilla@puremagic.com --- Commit pushed to master at https://github.com/D-Programming-Language/phobos https://github.com/D-Programming-Language/phobos/commit/5986e740873d6af793610381a942df78ab41f137 Merge pull request #2654 from quickfur/issue13595b Issue 13595: Extend groupBy to handle non-equivalence relations. -- |
December 01, 2014 [Issue 13595] Extend std.algorithm.groupBy to support non-equivalence relations | ||||
---|---|---|---|---|
| ||||
https://issues.dlang.org/show_bug.cgi?id=13595 hsteoh@quickfur.ath.cx changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution|--- |FIXED -- |
December 01, 2014 [Issue 13595] Extend std.algorithm.groupBy to support non-equivalence relations | ||||
---|---|---|---|---|
| ||||
https://issues.dlang.org/show_bug.cgi?id=13595 --- Comment #7 from bearophile_hugs@eml.cc --- As second try I've used groupBy to find the longest anagrams in a dictionary of words. "words.txt" is a text file that contains one different word each line. void main() { import std.stdio, std.algorithm, std.string, std.file, std.array; auto anags = "words.txt" .readText .split .schwartzSort!((string w) => w.dup.representation.sort().release) .groupBy!((w1, w2) => w1.dup.representation.sort().equal(w2.dup.representation.sort())) .map!array .array .sort!((g1, g2) => g1.length > g2.length) .release .groupBy!((g1, g2) => g1.length == g2.length) .front; writefln("%(%s\n%)", anags); } If I replace "(string w)" with "w" in the first schwartzSort I receive the errors: ...\dmd2\src\phobos\std\algorithm.d(4659,5): Error: field _prev must be initialized in constructor, because it is nested struct ...\dmd2\src\phobos\std\algorithm.d(4770,51): Error: template instance anagrams1c.main.groupBy!(__lambda2, cast(Flag)false, SortedRange!(string[], (a, b) => binaryFun!less(unaryFun!transform(a), unaryFun!transform(b)))) error instantiating anagrams1c.d(9,10): instantiated from here: groupBy!((w1, w2) => w1.dup.representation.sort().equal(w2.dup.representation.sort()), SortedRange!(string[], (a, b) => binaryFun!less(unaryFun!transform(a), unaryFun!transform(b)))) If I remove the last ".release" (at line 13) I receive the similar errors: ...\dmd2\src\phobos\std\algorithm.d(4659,5): Error: field _prev must be initialized in constructor, because it is nested struct ...\dmd2\src\phobos\std\algorithm.d(4770,51): Error: template instance anagrams1c.main.groupBy!(__lambda4, cast(Flag)false, SortedRange!(string[][], __lambda3)) error instantiating anagrams1c.d(14,10): instantiated from here: groupBy!((g1, g2) => g1.length == g2.length, SortedRange!(string[][], __lambda3)) Is this a problem of groupBy? -- |
Copyright © 1999-2021 by the D Language Foundation