August 05, 2015 Re: Creating a Priority Queue: An Adventure | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Per Nordlöw | On Wednesday, 5 August 2015 at 11:09:29 UTC, Per Nordlöw wrote:
> On Wednesday, 5 August 2015 at 09:04:54 UTC, Nordlöw wrote:
>> This will however duplicate the underlying array aswell, which is probably not what we want. How do we avoid this?
>
> Correction: the underlying storage array *must* be duplicated whenever we want to iterate over it without side effects in the original instance. That's just the way binary heaps work.
Crazy idea: what about a range that lazily copies as it needs to? I.e. copy-on-write
| |||
August 05, 2015 Re: Creating a Priority Queue: An Adventure | ||||
|---|---|---|---|---|
| ||||
Posted in reply to John Colvin | On Wednesday, 5 August 2015 at 13:37:19 UTC, John Colvin wrote:
> On Wednesday, 5 August 2015 at 11:09:29 UTC, Per Nordlöw wrote:
>> On Wednesday, 5 August 2015 at 09:04:54 UTC, Nordlöw wrote:
>>> This will however duplicate the underlying array aswell, which is probably not what we want. How do we avoid this?
>>
>> Correction: the underlying storage array *must* be duplicated whenever we want to iterate over it without side effects in the original instance. That's just the way binary heaps work.
>
> Crazy idea: what about a range that lazily copies as it needs to? I.e. copy-on-write
I guess you mean that popFront should copy on demand then. We then an extra bool to keep track of whether the copying has been done then. One problem though:
auto x = PQ;
x.insert(...); // one element
auto y = x; // no copying of underlying storage
x.popFront; // modified both x and y!
y.popFront; // copied on demands, but underlying storage is already empty. oops!
I don't think this is a desired behaviour.
| |||
August 05, 2015 Re: Creating a Priority Queue: An Adventure | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Nordlöw | On Wednesday, 5 August 2015 at 15:29:39 UTC, Nordlöw wrote:
> On Wednesday, 5 August 2015 at 13:37:19 UTC, John Colvin wrote:
>> On Wednesday, 5 August 2015 at 11:09:29 UTC, Per Nordlöw wrote:
>>> On Wednesday, 5 August 2015 at 09:04:54 UTC, Nordlöw wrote:
>>>> This will however duplicate the underlying array aswell, which is probably not what we want. How do we avoid this?
>>>
>>> Correction: the underlying storage array *must* be duplicated whenever we want to iterate over it without side effects in the original instance. That's just the way binary heaps work.
>>
>> Crazy idea: what about a range that lazily copies as it needs to? I.e. copy-on-write
>
> I guess you mean that popFront should copy on demand then. We then an extra bool to keep track of whether the copying has been done then. One problem though:
>
> auto x = PQ;
> x.insert(...); // one element
> auto y = x; // no copying of underlying storage
> x.popFront; // modified both x and y!
> y.popFront; // copied on demands, but underlying storage is already empty. oops!
>
> I don't think this is a desired behaviour.
in my vision, either x.popFront would also create a copy or you would have to go "auto y = x.nonModifyingView" or similar. What I don't want is something that copies 10000 elements just to use x.take(6)
| |||
August 05, 2015 Re: Creating a Priority Queue: An Adventure | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Nordlöw | On Wednesday, 5 August 2015 at 09:04:54 UTC, Nordlöw wrote: > On Wednesday, 5 August 2015 at 01:02:56 UTC, DarthCthulhu wrote: >> I must be doing something really stupid here, but I have no clue what it is. Anyone know? > > > For functional behaviour I prefer a postblit that duplicates the underlying BinaryHeap. > > https://github.com/nordlow/justd/blob/master/priority_queue.d > > Now the foreach won't consume the queue. Oh, neat! I stumbled on the same thing (making a .dup of the BinaryHeap), but didn't know about the postblit. That makes things simplier. > > This will however duplicate the underlying array aswell, which is probably not what we want. How do we avoid this? I was wondering that, myself, when I stumbled on the .dup solution. My first thought was to instantiate a templated Array! first, then use BinaryHeap.assume or .acquire to make it a BinaryHeap while also keeping a reference to the underlining array. Then one could just return the array reference in a separate function rather than destructively iterating through the BinaryHeap. My experiments didn't bear this out, however. Maybe I'm misunderstanding what the .assume and .acquire functions do? Incidentally, I also discovered the wonderful opDot function which allows the PriorityQueue to shunt most of the work down to the BinaryHeap directly. // Forward most function calls to the underlying queue. BinaryHeap!(Array!(PV), predicate)* opDot() { return &_q; } > Yeah, I think there is no way to "traverse" a binary heap in order without manipulating it. However, you can print the underlying storage. There's a way to get at the underlining store? I couldn't find any means to do so in the BinaryHeap documentation. >in my vision, either x.popFront would also create a copy or you would have to go "auto y = x.nonModifyingView" or similar. What I don't want is something that copies 10000 elements just to use x.take(6) Yeah, that's it exactly! | |||
August 06, 2015 Re: Creating a Priority Queue: An Adventure | ||||
|---|---|---|---|---|
| ||||
Posted in reply to John Colvin | On Wednesday, 5 August 2015 at 18:20:41 UTC, John Colvin wrote: > in my vision, either x.popFront would also create a copy or you would have to go "auto y = x.nonModifyingView" or similar. What I don't want is something that copies 10000 elements just to use x.take(6) I suggest you rehearse on how a binary heap works. A binary heap with array storage trades speed for memory compactness, a bit similar to how quick sort relates to heap sort. See also: https://en.wikipedia.org/wiki/Binary_heap The underlying heap array in D's `BinaryHeap` contains both the data leaves and their inter-orderings in a memory-packed manner. This makes heaps very space-efficient and running `dup` on the underlying array is very cheap (in D), especially for value types. When you pop an element from the heap an in-place reorganisation (balancing) will be performed on the array. This means `popFront` will potentially (in the worst case) require a complete copy of the array in order to not have to modify the original array. AFAIK, the 10000 elements will always have to be copied even when we just need x.take(2). Peeking the front (using `front()`) is O(1), but *popping* the front (to get the next front) in the range may, in the worst cast, have to require reorganisation of *all* the 10000 elements. See also: https://en.wikipedia.org/wiki/Binary_heap#Extract Please correct me if I'm wrong. | |||
August 06, 2015 Re: Creating a Priority Queue: An Adventure | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Per Nordlöw | On Thursday, 6 August 2015 at 08:44:17 UTC, Per Nordlöw wrote:
> On Wednesday, 5 August 2015 at 18:20:41 UTC, John Colvin wrote:
>> [...]
>
> I suggest you rehearse on how a binary heap works. A binary heap with array storage trades speed for memory compactness, a bit similar to how quick sort relates to heap sort.
>
> See also: https://en.wikipedia.org/wiki/Binary_heap
>
> The underlying heap array in D's `BinaryHeap` contains both the data leaves and their inter-orderings in a memory-packed manner. This makes heaps very space-efficient and running `dup` on the underlying array is very cheap (in D), especially for value types.
>
> When you pop an element from the heap an in-place reorganisation (balancing) will be performed on the array. This means `popFront` will potentially (in the worst case) require a complete copy of the array in order to not have to modify the original array.
>
> AFAIK, the 10000 elements will always have to be copied even when we just need x.take(2). Peeking the front (using `front()`) is O(1), but *popping* the front (to get the next front) in the range may, in the worst cast, have to require reorganisation of *all* the 10000 elements.
>
> See also: https://en.wikipedia.org/wiki/Binary_heap#Extract
>
> Please correct me if I'm wrong.
Worst case for getting root off a binary heap is O(log(N)), copying the whole thing is O(N).
In practice the copy may be very cheap, but it does mean more memory usage and won't scale to very large N. Perhaps it's an OK tradeoff, but you want to be careful.
| |||
August 06, 2015 Re: Creating a Priority Queue: An Adventure | ||||
|---|---|---|---|---|
| ||||
Posted in reply to John Colvin | On Thursday, 6 August 2015 at 09:08:04 UTC, John Colvin wrote:
> Worst case for getting root off a binary heap is O(log(N)), copying the whole thing is O(N).
Those numbers does not take into account the special properties of *in-array-packed* implementation of a binary heap. They are *theoretical properties* related to a graph implementation of a `BinaryHeap`.
The important thing in our D case is *which elements* in the array that needs to be changed or *moved*.
| |||
August 07, 2015 Re: Creating a Priority Queue: An Adventure | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Per Nordlöw | Okay, so, I decided to scrap the BinaryHeap version of the priority queue, going back to basics and utilizing a simple array. It works, huzzah! Code: module data_structures.priority_queue; import std.array; import std.range: assumeSorted; import std.typecons: Tuple; /* Templated Priority Queue Usage: PriorityQueue!(PRIORITY_TYPE, VALUE_TYPE, OPTIONAL_PREDICATE) */ struct PriorityQueue(P, V, alias predicate = "a < b") { // To make the code a bit more readable alias Tuple!(P, V) PV; PV _q[]; // Forward most function calls to the underlying array. PV[]* opDot() { return &_q; } // Determine if the queue is empty @property bool empty () { return (_q.length == 0); } // Needed so foreach can work @property PV front() { return _q.front; } // Chop off the front of the array @property void popFront() { _q = _q[1 .. $]; } // Insert a record via a template tuple void insert(ref PV rec) { // Empty queue? if (_q.length == 0 ) { // just put the record into the queue _q ~= rec; return; } // Assume the queue is already sorted according to PREDICATE auto a = assumeSorted!(predicate)(_q); // Find a slice containing records with priorities less than the insertion rec auto p = a.lowerBound(rec); int location = p.length; // Insert the record _q.insertInPlace(location, rec); } void insert(PV rec) { insert(rec); } // Insert a record via decomposed priority and value void insert(P priority, V value) { PV rec = PV(priority, value); // Insert the record insert(rec); } // Merge two Priority Queues, returning the merge. // The two queues must obviously be of the same type in Priority and Value, and predicate; ref PriorityQueue!(P, V, predicate) merge(ref PriorityQueue!(P, V, predicate) qmerge) { // Make a copy of this PriorityQueue PriorityQueue!(P, V, predicate)* qreturn = new PriorityQueue!(P, V, predicate); qreturn._q = _q.dup; // Add in all the elements of the merging queue foreach(rec; qmerge) { qreturn.insert(rec); } // Return the resulting merged queue return *qreturn; } } unittest { alias P = int; alias V = string; alias PV = Tuple!(P, V); alias PQ = PriorityQueue!(P, V, "a < b"); PQ pq, pq2, pq3; import std.typecons: tuple; // Test basic insertion pq.insert(10, "HELLO10"); pq.insert(11, "HELLO11"); pq.insert(3, "HELLO3"); pq.insert(31, "HELLO31"); pq.insert(5, "HELLO5"); pq.insert(10, "HELLO10-2"); assert(pq.length == 6); foreach (const e; pq) {} // iteration assert(!pq.empty); // shouldn't consume queue // Copy by value pq2 = pq; foreach (priority, value; pq) { pq.popFront(); } // pq and pq2 should be independent assert( !pq2.empty); assert( pq.empty ); // Test merging pq3.insert(tuple(12, "HELLO12")); pq3.insert(Tuple!(int, string)(17, "HELLO17")); pq3.insert(tuple(7, "HELLO7")); pq = pq2.merge(pq3); assert ( !pq.empty); assert(pq.front == tuple(3, "HELLO3")); pq.popFront; assert(pq.front == tuple(5, "HELLO5")); pq.popFront; assert(pq.front == tuple(7, "HELLO7")); pq.popFront; assert( pq.length == 6 ); } And a little driver main() to illustrate the queue a bit better: main() { PriorityQueue!(int, string) pq, pq2, pq3; pq.insert(10, "HELLO10"); pq.insert(11, "HELLO11"); pq.insert(Tuple!(int, string)(3, "HELLO3")); pq.insert(5, "HELLO5"); pq.insert(Tuple!(int, string)(12, "HELLO12")); pq.insert(Tuple!(int, string)(17, "HELLO17")); pq2.insert(Tuple!(int, string)(15, "HELLO15")); pq2.insert(Tuple!(int, string)(21, "HELLO21")); writefln("\tPQ: %s \n\tPQ2: %s \n\tPQ3: %s", pq, pq2, pq3); pq3 = pq.merge(pq2); foreach(priority, value; pq3) { writefln("Priority: %s \tValue: %s \tLength: %s", priority, value, pq3.length); pq3.popFront(); } } I think I like this version better than the BinaryHeap one. It's a bit simpler and can be extended easier. For instance, I think it would be pretty easy to use mixins to make a seperate PriorityQueueChangable version which allows the alteration of priorities. It's also pretty easy to search for priorities or values of certain requirements. I don't know enough about D to really say if its performant in any way, but it does do the job. It took about a week to get to this final form, coming from someone familiar with programming but not D. A quick review of this project from my perspective: The first day was mostly taken up with reading about D, then doing an abortive attempt with associative arrays until I realized that associative arrays are not and can not be sorted. The next day involved finding BinaryHeap and puzzling out how to use it. I think I got really hung up on the template syntax for a bit. It took a little to get used to. I also over engineered the thing with a separate helper struct until I embarrassingly discovered tuples would do the job easier. The third day had me hitting a wall with not thinking that the BinaryHeap eats its elements on traversal. Coming here was a great help. The fourth day was mostly cleaning up the BinaryHeap version and becoming increasingly dissatisfied with it. And now here we are today, with a new version written with just standard arrays. The big problems I had were screwing up the sorting predicate (I couldn't figure out why everything would work correctly in one direction, but not the other until I realized that assumeSorted needed to be told which predicate to use). The predicate was also needed for the merge portion otherwise changing the direction of the queue would cause a type mismatch. Those two problems plus a few other minor ones later, and the queue is complete. I have to say, I really like D a lot! Thanks to you all for the assistance! | |||
Copyright © 1999-2021 by the D Language Foundation
Permalink
Reply