Thread overview
Set Intersection of SortedRanges
Dec 05, 2016
Nordlöw
Dec 05, 2016
Nordlöw
Dec 05, 2016
Nordlöw
Dec 06, 2016
Nicholas Wilson
Dec 06, 2016
Nicholas Wilson
Dec 06, 2016
Nicholas Wilson
Dec 06, 2016
Nordlöw
December 05, 2016
What's the fastest way of calculating a set-intersection of two or more SortedRanges (all containing unique values)?

Any typical branchings of the algorithm depending on the lengths of the SortedRanges?
December 05, 2016
On Monday, 5 December 2016 at 21:34:39 UTC, Nordlöw wrote:
> What's the fastest way of calculating a set-intersection of two or more SortedRanges (all containing unique values)?

Ahh, setops has intersection aswell:

https://dlang.org/phobos/std_algorithm_setops.html#setIntersection

I should have a guessed that.
December 05, 2016
On Monday, 5 December 2016 at 21:48:49 UTC, Nordlöw wrote:
> Ahh, setops has intersection aswell:
>
> https://dlang.org/phobos/std_algorithm_setops.html#setIntersection
>
> I should have a guessed that.

Ahh again, but current Phobos is currently not optimized for the case when all inputs are SortedArrays. In that case a double binary search algorithm as describe here

http://cs.stackexchange.com/a/37124/25769

might be faster.

Has anybody already done this?
December 06, 2016
On Monday, 5 December 2016 at 22:10:34 UTC, Nordlöw wrote:
> On Monday, 5 December 2016 at 21:48:49 UTC, Nordlöw wrote:
>> Ahh, setops has intersection aswell:
>>
>> https://dlang.org/phobos/std_algorithm_setops.html#setIntersection
>>
>> I should have a guessed that.
>
> Ahh again, but current Phobos is currently not optimized for the case when all inputs are SortedArrays. In that case a double binary search algorithm as describe here
>
> http://cs.stackexchange.com/a/37124/25769
>
> might be faster.
>
> Has anybody already done this?

The double binary search linked only finds *one* element of the intersection.

It should be as simple as a linear find (over the range of ranges) that returns (i.e. caches to .front) if the elements intersect.

December 06, 2016
On Tuesday, 6 December 2016 at 01:46:38 UTC, Nicholas Wilson wrote:
> On Monday, 5 December 2016 at 22:10:34 UTC, Nordlöw wrote:
>> On Monday, 5 December 2016 at 21:48:49 UTC, Nordlöw wrote:
>>> Ahh, setops has intersection aswell:
>>>
>>> https://dlang.org/phobos/std_algorithm_setops.html#setIntersection
>>>
>>> I should have a guessed that.
>>
>> Ahh again, but current Phobos is currently not optimized for the case when all inputs are SortedArrays. In that case a double binary search algorithm as describe here
>>
>> http://cs.stackexchange.com/a/37124/25769
>>
>> might be faster.
>>
>> Has anybody already done this?
>
> The double binary search linked only finds *one* element of the intersection.
>
> It should be as simple as a linear find (over the range of ranges) that returns (i.e. caches to .front) if the elements intersect.

Hmm not sure that will work.

The annoying property of this problem is calculating .empty and having .empty not mutate the ranges. however if you settle for an eager empty something like

auto setIntersect(RoR,alias _pred)(RoR ror) if (allSatisfy!(RoR, isSortedRange!_pred) &&
                                                                  allSatisfy!(RoR, isSame!(ElementType) &&
                                                                 !allSatisfy!(RoR, isRandomAccessRange))
{
    enum lengths = RoR.length;
    alias elem = ElementType!(RoR[0]);
    alias pred = BinaryFun!_pred;
    struct SetIntersect
    {
        RoR ror;
        elem[lengths] fronts = void;
        elem front = void;
        bool valid = false;
        this(RoR r)
        {
            ror = r;
            populate();
        }
        void populate()
        {
            foreach(i, ref r; ror)
                  fronts[i] = r.front;
        }
        void advance()
        {
            foreach( ref r; ror)
                r.popFront();
        }
        void popFront() { valid = false; }
        bool empty()
        {
            if (valid) return false;
            advance();
            populate();
            if (reduce!(equal)(fronts)) return false;
            elem m = reduce!(pred)(fronts);
            foreach(i, ref r; ror)
            {
                  while(!pred(r.front,m))
                         r.popFront();

            }
            populate();
            valid = reduce!(equal)(fronts));
            return !valid;
        }

    }
}

December 06, 2016
Whoops forgot to add checks for .empty on the ranges, and I don't think reduce!equal work but you get the point.
December 06, 2016
On Monday, 5 December 2016 at 22:10:34 UTC, Nordlöw wrote:
> Has anybody already done this?

This gives good guidance on three different approaches.

http://stackoverflow.com/questions/2400157/the-intersection-of-two-sorted-arrays/4601106#4601106

Luckily we already have galloping search in Phobos :)

I'll do a couple of benchmarks.