Jump to page: 1 2
Thread overview
Keeping a dynamic sorted range
Nov 07, 2014
bearophile
Nov 07, 2014
Max Klyga
Nov 07, 2014
bearophile
Nov 07, 2014
bearophile
Nov 10, 2014
bearophile
Nov 10, 2014
Vladimir Panteleev
Nov 10, 2014
bearophile
Nov 07, 2014
Ali Çehreli
Nov 07, 2014
bearophile
Nov 08, 2014
Jakob Ovrum
Nov 08, 2014
bearophile
Nov 10, 2014
Sean Kelly
Nov 10, 2014
Sean Kelly
Nov 10, 2014
Sean Kelly
November 07, 2014
(This is a partial repost from a recent D.learn thread.)

In Phobos we have SortedRange and assumeSorted, but I do find them not very good for a common enough use case.

The use case is to keep a sorted array, keep adding items to it (adding larger and larger items at the end. Or sometimes even inserting items in the middle. In both cases I keep the sorting invariant). And while I add items, I also now and then want to perform a binary search on the sorted range.

So sometimes I'd like to do something like this (but a SortedRange doesn't have append):

struct Foo { int x; }
SortedRange!(Foo[], q{ a.x < b.x }) data;
data ~= Foo(5);
immutable n = data.upperBound(Foo(2)).length;

Bye,
bearophile
November 07, 2014
On 2014-11-07 14:11:30 +0000, bearophile said:

> (This is a partial repost from a recent D.learn thread.)
> 
> In Phobos we have SortedRange and assumeSorted, but I do find them not very good for a common enough use case.
> 
> The use case is to keep a sorted array, keep adding items to it (adding larger and larger items at the end. Or sometimes even inserting items in the middle. In both cases I keep the sorting invariant). And while I add items, I also now and then want to perform a binary search on the sorted range.
> 
> So sometimes I'd like to do something like this (but a SortedRange doesn't have append):
> 
> struct Foo { int x; }
> SortedRange!(Foo[], q{ a.x < b.x }) data;
> data ~= Foo(5);
> immutable n = data.upperBound(Foo(2)).length;
> 
> Bye,
> bearophile

Ranges are not container. They are meant for traversing. If you want a sorted range - use an underlying container that preserves ordering (trees, heaps)

November 07, 2014
Max Klyga:

> Ranges are not container. They are meant for traversing. If you want a sorted range - use an underlying container that preserves ordering (trees, heaps)

Let's asssume that the underlying container is a sorted built-in dynamic array (that has more locality than a tree and allows very fast binary searches, unlike a heap).

So what solution do you suggest? A pair of functions that work on a sorted container of that type for Phobos?

Bye,
bearophile
November 07, 2014
> Let's asssume that the underlying container is a sorted built-in dynamic array (that has more locality than a tree and allows very fast binary searches, unlike a heap).

An array, a sorted array, is a simple data structure that often wins in terms of memory, simplicity, and efficiency. Generally it's better to not use a more complex data structure if a simpler one suffices.

Bye,
bearophile
November 07, 2014
On 11/07/2014 06:11 AM, bearophile wrote:
> (This is a partial repost from a recent D.learn thread.)
>
> In Phobos we have SortedRange and assumeSorted, but I do find them not
> very good for a common enough use case.
>
> The use case is to keep a sorted array, keep adding items to it (adding
> larger and larger items at the end. Or sometimes even inserting items in
> the middle. In both cases I keep the sorting invariant). And while I add
> items, I also now and then want to perform a binary search on the sorted
> range.
>
> So sometimes I'd like to do something like this (but a SortedRange
> doesn't have append):
>
> struct Foo { int x; }
> SortedRange!(Foo[], q{ a.x < b.x }) data;
> data ~= Foo(5);
> immutable n = data.upperBound(Foo(2)).length;
>
> Bye,
> bearophile

If an array is sorted every time an element is added, then insertion is N.log(N) and searching is log(N). I don't know when that penalty is better than data locality that an array brings.

A more traditional data structure is a binary tree in this case because it has log(N) for both insertion and search.

On the other hand, array wins if the insertions are batched and then there is a single sort before many searches. As Max Klyga said, that single sort better be applied on the container, not on the range.

Ali

November 07, 2014
Ali Çehreli:

> If an array is sorted every time an element is added,

Items are most times added at the end, and they respect the sortness of the whole array. The array never gets sorted.


> On the other hand, array wins if the insertions are batched and

Insertions are not batched, and they are interspersed with searches. This is the common use case.


> As Max Klyga said, that single sort better be applied
> on the container, not on the range.

The container unfortunately doesn't know it is sorted, so I have to verify the sortness invariant manually, and before the search I need to use an assumeSorted(). This is not good.

Bye,
bearophile
November 08, 2014
On Friday, 7 November 2014 at 14:11:32 UTC, bearophile wrote:
> (This is a partial repost from a recent D.learn thread.)
>
> In Phobos we have SortedRange and assumeSorted, but I do find them not very good for a common enough use case.
>
> The use case is to keep a sorted array, keep adding items to it (adding larger and larger items at the end. Or sometimes even inserting items in the middle. In both cases I keep the sorting invariant). And while I add items, I also now and then want to perform a binary search on the sorted range.
>
> So sometimes I'd like to do something like this (but a SortedRange doesn't have append):
>
> struct Foo { int x; }
> SortedRange!(Foo[], q{ a.x < b.x }) data;
> data ~= Foo(5);
> immutable n = data.upperBound(Foo(2)).length;
>
> Bye,
> bearophile

Facing this same problem, a while ago I started work on a generic, higher-order container that provides insertion, deletion and search while keeping itself sorted:

https://gist.github.com/JakobOvrum/f1738d31bb7ba7a46581

The above is just a WIP; it's not complete. Of course, positional container primitives like `insertFront` and `insertBack` will not be supported.

The implementation is fairly messy due to the lack of traits for containers, as well as due to some deficiencies in `SortedRange`.

It's obviously useful for arrays, and it's kind of clever how it can merge lists efficiently, but I'm not sure if it's really worth all the effort; is it really useful to have something like this that aims to support such a wide range of underlying containers? Is it actually useful in real programs for anything but arrays? So, I stopped working on it...
November 08, 2014
Jakob Ovrum:

> Of course, positional container primitives like `insertFront` and `insertBack` will not be supported.

Sometimes (or often) "insertBack" (or better the ~= operator) is what I want, because I add items larger than ones already present. insertBack has to verify the array is empty, or the last one is smaller (or verifies the sorting predicate) than the item I'm going to append.


> is it really useful to have something like this that aims to support such a wide range of underlying containers? Is it actually useful in real programs for anything but arrays? So, I stopped working on it...

In most cases a built-in dynamic array is fine. Once in a while you want to use an std.array.Array. When they are not enough, I use a sorted tree or something else.

Bye,
bearophile
November 10, 2014
On 11/7/14 7:54 AM, bearophile wrote:
> Max Klyga:
>
>> Ranges are not container. They are meant for traversing. If you want a
>> sorted range - use an underlying container that preserves ordering
>> (trees, heaps)
>
> Let's asssume that the underlying container is a sorted built-in dynamic
> array (that has more locality than a tree and allows very fast binary
> searches, unlike a heap).
>
> So what solution do you suggest? A pair of functions that work on a
> sorted container of that type for Phobos?
>
> Bye,
> bearophile

One possibility is a function sortedChain that takes two or more sorted ranges and returns a lazy sorted range. -- Andrei
November 10, 2014
Andrei Alexandrescu:

> One possibility is a function sortedChain that takes two or more sorted ranges and returns a lazy sorted range. -- Andrei

Perhaps you are answering another thread and another problem. In this problem there is only one range.

Bye,
bearophile
« First   ‹ Prev
1 2