Thread overview |
---|

December 03 Intersection of two sets | ||||
---|---|---|---|---|

| ||||

It seems i don't google the right keywords. What i want to do: I have two sets. (I didn't find how to do sets, so i have two associative boolean arrays `bool[<something>]`). And i want to join them, via an intersection. I know how to code this. Loop over one AA, if the key is also in the other one, we add that to the third array which is going to be returned. pseudocode: alias set = bool[<something>] set foo = ... set bar = ... set result; foreach(key; foo) { if (key in bar) { result[key] = true; } } return result; 1) Is there a better way for creating a set, other then `alias set = bool[keyClass]`? 2) Are there some build-in function for handling such sets? |

December 03 Re: Intersection of two sets | ||||
---|---|---|---|---|

| ||||

Posted in reply to Jan Hönig | On Tuesday, 3 December 2019 at 13:43:26 UTC, Jan Hönig wrote: > It seems i don't google the right keywords. > > What i want to do: I have two sets. (I didn't find how to do sets, so i have two associative boolean arrays `bool[<something>]`). And i want to join them, via an intersection. > > I know how to code this. Loop over one AA, if the key is also in the other one, we add that to the third array which is going to be returned. > > pseudocode: > alias set = bool[<something>] > set foo = ... > set bar = ... > set result; > > foreach(key; foo) > { > if (key in bar) > { > result[key] = true; > } > } > return result; > > > 1) Is there a better way for creating a set, other then `alias set = bool[keyClass]`? This depends on the available accesses on your sets. In terms of ranges: Are your sets InputRanges, ForwardRange, ... ? > 2) Are there some build-in function for handling such sets? This is maybe what you are looking for: https://dlang.org/phobos/std_algorithm_setops.html |

December 03 Re: Intersection of two sets | ||||
---|---|---|---|---|

| ||||

Posted in reply to Jan Hönig | ```
On Tuesday, 3 December 2019 at 13:43:26 UTC, Jan Hönig wrote:
> It seems i don't google the right keywords.
>
> What i want to do: I have two sets. (I didn't find how to do sets, so i have two associative boolean arrays `bool[<something>]`). And i want to join them, via an intersection.
>
> I know how to code this. Loop over one AA, if the key is also in the other one, we add that to the third array which is going to be returned.
>
> pseudocode:
> alias set = bool[<something>]
> set foo = ...
> set bar = ...
> set result;
>
> foreach(key; foo)
> {
> if (key in bar)
> {
> result[key] = true;
> }
> }
> return result;
>
>
> 1) Is there a better way for creating a set, other then `alias set = bool[keyClass]`?
> 2) Are there some build-in function for handling such sets?
Never tried, but depending of the nature of your "something" you can try bit sets. There are efficient algorithms for large bit arrays (see "roaring" for example).
``` |

December 03 Re: Intersection of two sets | ||||
---|---|---|---|---|

| ||||

Posted in reply to ikod | ```
On Tuesday, 3 December 2019 at 15:14:03 UTC, ikod wrote:
>
> Never tried, but depending of the nature of your "something" you can try bit sets. There are efficient algorithms for large bit arrays (see "roaring" for example).
"roaring" is massive overkill for my case, but thanks for suggesting it. I didn't know about it.
``` |

December 03 Re: Intersection of two sets | ||||
---|---|---|---|---|

| ||||

Posted in reply to Alex | ```
On Tuesday, 3 December 2019 at 13:55:51 UTC, Alex wrote:
>
> This depends on the available accesses on your sets. In terms of ranges:
> Are your sets InputRanges, ForwardRange, ... ?
>
>
>> 2) Are there some build-in function for handling such sets?
>
> This is maybe what you are looking for: https://dlang.org/phobos/std_algorithm_setops.html
In terms of ranges, i need to understand ranges properly first.
The std.algorithm.setops have definitely the functionality i need.
I guess my current implementation would be a simple array.
I just need to make sure delete, or not create double entries.
``` |

December 03 Re: Intersection of two sets | ||||
---|---|---|---|---|

| ||||

Posted in reply to Jan Hönig | On Tuesday, 3 December 2019 at 18:45:18 UTC, Jan Hönig wrote: > On Tuesday, 3 December 2019 at 13:55:51 UTC, Alex wrote: >> >> This depends on the available accesses on your sets. In terms of ranges: >> Are your sets InputRanges, ForwardRange, ... ? >> >> >>> 2) Are there some build-in function for handling such sets? >> >> This is maybe what you are looking for: https://dlang.org/phobos/std_algorithm_setops.html > > In terms of ranges, i need to understand ranges properly first. The top of https://dlang.org/phobos/std_range.html has some talks and tutorials to consider. > The std.algorithm.setops have definitely the functionality i need. > I guess my current implementation would be a simple array. > I just need to make sure delete, or not create double entries. A problem with the unittest examples is that they assume the reader is very familiar with the language :) Here are some things to keep in mind with setIntersection: 1. setIntersection expects its inputs to be sorted. If they aren't, it doesn't notice; you just don't get the behavior you want. int[] a = [4, 2, 3, 1], b = [8, 4, 2, 6]; assert(setIntersection(a, b).array == []); // ! assert(setIntersection(a.sort, b.sort).array == [2, 4]); 2. setIntersection returns a range, not an array. This probably lets you return the infinite intersections of two infinite ranges, for example. If you're looking at this for Advent of Code you'll probably want to pass the intersections through some other operators that will accept a range. (Also, sort mutates *and* returns the array. if you're depending on the positions of 2 and 4 in the original arrays, you should get the intersection of a sorted copy of them.) In general, pay attention to the static constraints: if (Rs.length >= 2 && allSatisfy!(isInputRange, Rs) && !is(CommonType!(staticMap!(ElementType, Rs)) == void)); In English: 1. if you're getting an intersection of ranges, you need to provide at least two ranges. 2. all of the ranges need to be input ranges 3. the ranges' element types need to have a common type that isn't 'void'. You can use these same constraints in your code if you're not sure if they apply static assert (isInputRange!(typeof(a))); assert(setIntersection(a, b).array == []); or you can test them interactively: $ rdmd --eval 'isInputRange!(typeof([1, 2, 3])).writeln' true And you can also check on their documentation and see if it's what you expect: https://dlang.org/phobos/std_range_primitives.html D doesn't have 500+ line errors for the simplest template misuse like C does, but its much shorter template error messages still could be friendlier. Usually they're in terms of these constraints, or you've a more basic error like having a range type (like the SetIntersection struct that setIntersection actually returns) where some other code expects a simple array, or vice versa. |

December 04 Re: Intersection of two sets | ||||
---|---|---|---|---|

| ||||

Posted in reply to Jan Hönig | On Tuesday, 3 December 2019 at 13:43:26 UTC, Jan Hönig wrote: > pseudocode: > alias set = bool[<something>] > set foo = ... > set bar = ... > set result; One simple optimization is to set* smallest; set* largest; if (foo.length < bar.length) { smallest = &foo; largest = &bar; } else { smallest = &bar; largest = &foo; } > foreach(key; *smallest) > { > if (key in *largest) > { > result[key] = true; > } > } > return result; Provided that your set type has an `in`-operator with time-complexity O(1), this will, in the worst case, reduce time complexity from O(max(foo.length, bar.length)) to O(min(foo.length, bar.length)) |

December 06 Re: Intersection of two sets | ||||
---|---|---|---|---|

| ||||

Posted in reply to Per Nordlöw | ```
On Wednesday, 4 December 2019 at 08:01:59 UTC, Per Nordlöw wrote:
> On Tuesday, 3 December 2019 at 13:43:26 UTC, Jan Hönig wrote:
>> pseudocode:
>> alias set = bool[<something>]
>> set foo = ...
>> set bar = ...
>> set result;
>
> One simple optimization is to
>
> set* smallest;
> set* largest;
>
> if (foo.length < bar.length)
> {
> smallest = &foo;
> largest = &bar;
> }
> else
> {
> smallest = &bar;
> largest = &foo;
> }
>
>> foreach(key; *smallest)
>> {
>> if (key in *largest)
>> {
>> result[key] = true;
>> }
>> }
>> return result;
>
> Provided that your set type has an `in`-operator with time-complexity O(1), this will, in the worst case, reduce time complexity from
>
> O(max(foo.length, bar.length))
>
> to
>
> O(min(foo.length, bar.length))
Thanks!
I didn't even thought about optimizations like this :)
``` |

Copyright © 1999-2018 by the D Language Foundation