Jump to page: 1 2 3
Thread overview
Re: [OT] Finding longest documents
Oct 14, 2008
bearophile
Oct 14, 2008
Christopher Wright
Oct 15, 2008
bearophile
Oct 14, 2008
Christopher Wright
Oct 15, 2008
Christopher Wright
Oct 15, 2008
Christopher Wright
Oct 15, 2008
Benji Smith
Oct 15, 2008
Benji Smith
Oct 15, 2008
Sergey Gromov
Oct 15, 2008
Sergey Gromov
Oct 15, 2008
Sergey Gromov
Oct 15, 2008
Bill Baxter
Oct 15, 2008
Sergey Gromov
Oct 15, 2008
Bill Baxter
Oct 16, 2008
Sergey Gromov
Oct 16, 2008
bearophile
Oct 16, 2008
Benji Smith
Oct 16, 2008
Benji Smith
Oct 15, 2008
bearophile
October 14, 2008
Christopher Wright:
> http://dsource.org/projects/tango/browser/trunk/tango/util/container/more/Heap.d?rev=3959

Very nice, thank you. Few quick comments (no deep comments, I haven't used it nor read it carefully):

In this lines:
assert (other.pop is 4);

I think it's better to use
assert(other.pop == 4);


This lines show that a bulk addition method may be useful, that accepts any lazy/eagar iterable:
MaxHeap!(uint) h;
h ~= 1;
h ~= 3;
h ~= 2;
h ~= 4;


The following aliases can be moved just below their respective methods, with a /// ditto before them:
alias pop       remove;
alias push      opCatAssign;


This style of comments:
/** Inserts all elements in the given array into the heap. */

Can sometimes be replaced by:
/// Inserts all elements in the given array into the heap.


The class ddoc misses the explanation for the Min template argument:
struct Heap(T, bool Min) {


Some lines of comments are too much long, they may be broken at 80-95 chars long.


An optional 'key' callable can be added; it can be used as sorting key mapper (Swartz-style). If not specified it can be "null" that means the current behaviour.


Few other handy methods may be added:

A merge() (~ and ~=) method can be added, maybe.

heapreplace(heap, item) (pop and return the smallest item from the heap, and also push the new item. The heap size doesn't change)

heapify(iterable)
nlargest(n, iterable)
nsmallest(n, iterable)


The following is not a critic, just a note, I think programs with such low code density are harder to read to me, they seem like fresh snow:

/** Inserts all elements in the given array into the heap. */
void push (T[] array)
{
        while (heap.length < next + array.length)
        {
                heap.length = 2 * heap.length + 32;
        }
        foreach (t; array) push (t);
}


I'd write that as this, allowing me to have more code on the screen:

/// Inserts all elements in the given array into the heap.
void push(T[] array) {
    while (heap.length < next + array.length)
        heap.length = 2 * heap.length + 32;
    foreach (t; array)
        push(t);
}


In the future FibonacciHeap too may be useful to be added to the Std libraries.

Bye,
bearophile
October 14, 2008
bearophile wrote:
> Christopher Wright:
>> http://dsource.org/projects/tango/browser/trunk/tango/util/container/more/Heap.d?rev=3959
> 
> Very nice, thank you. Few quick comments (no deep comments, I haven't used it nor read it carefully):
> 
> In this lines:
> assert (other.pop is 4);
> 
> I think it's better to use
> assert(other.pop == 4);

Any particular reason?

> This lines show that a bulk addition method may be useful, that accepts any lazy/eagar iterable:
> MaxHeap!(uint) h;
> h ~= 1;
> h ~= 3;
> h ~= 2;
> h ~= 4;

Agreed. The implementation that I submitted had a push method that took an array. If I had given it more thought, I would've accepted any Tango container, as well, and maybe have a vararg version. (Except, ugh, the new containers are all structs. That means writing a template method for the task rather than using an interface.)

> The following aliases can be moved just below their respective methods, with a /// ditto before them:
> alias pop       remove;
> alias push      opCatAssign;

That was changed after I submitted the code.

> This style of comments:
> /** Inserts all elements in the given array into the heap. */
> 
> Can sometimes be replaced by:
> /// Inserts all elements in the given array into the heap.

And then it takes slightly more work to change it into a multiline comment.

> The class ddoc misses the explanation for the Min template argument:
> struct Heap(T, bool Min) {

The version I submitted didn't have the Min template argument; it had a virtual comparison method instead. That method was documented. The added argument was not documented.

> Some lines of comments are too much long, they may be broken at 80-95 chars long.

Dunno how that slipped by me.

> An optional 'key' callable can be added; it can be used as sorting key mapper (Swartz-style). If not specified it can be "null" that means the current behaviour.

I actually need this behavior.

The version I submitted had a virtual comparison method, so that would have been reasonably simple to implement:

class ComparatorHeap (T, alias comparator) : Heap!(T)
{
   override int comp (T left, T right) {
      return comparator (left, right);
   }
}

Or:
class HeapMap (TKey, TValue) : Heap!(KeyValuePair!(TKey, TValue))
{
   // add methods for inserting a key and a value without having to
   // construct a KeyValuePair
}

As it stands, you'd need to use composition and forward a lot of methods to an internal struct -- that's just hideous.

Tango has been on a struct rampage. It's part of their campaign against extensibility. I guess making everything private or final just wasn't enough.

I jest. But it is a serious concern of mine. If I see anything labeled "package" rather than "protected", my liver will burst. At the very least, I'll write a script to change anything private to protected, and anything protected or package to public.

> Few other handy methods may be added:
> 
> A merge() (~ and ~=) method can be added, maybe.
> 
> heapreplace(heap, item) (pop and return the smallest item from the heap, and also push the new item. The heap size doesn't change)
> 
> heapify(iterable)
> nlargest(n, iterable) nsmallest(n, iterable) 
> 
> 
> The following is not a critic, just a note, I think programs with such low code density are harder to read to me, they seem like fresh snow:
<snip>
> I'd write that as this, allowing me to have more code on the screen:

I use a large monitor. Whitespace makes things more readable for me. Though the giant tabs are annoying; I use a tabstop of four. Still, this is only a display issue.

> In the future FibonacciHeap too may be useful to be added to the Std libraries.
> 
> Bye,
> bearophile

Thanks for pointing out those issues. I'll correct those I can, but even after that, the Tango heap will not suffice for my needs.
October 14, 2008
Actually, I STRONGLY urge that nobody use this heap.

It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap.

If it were a class, this would not be possible. As I implemented it, it would not be possible.

Containers as value types is such an idiotic idea for precisely this reason.
October 15, 2008
"Christopher Wright" wrote
> Actually, I STRONGLY urge that nobody use this heap.
>
> It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap.
>
> If it were a class, this would not be possible. As I implemented it, it would not be possible.
>
> Containers as value types is such an idiotic idea for precisely this reason.

Unless such value types have copy constructors which make a copy of their elements...

Or are you calling the authors of STL idiots ;)

-Steve


October 15, 2008
Christopher Wright wrote:
> Actually, I STRONGLY urge that nobody use this heap.
> 
> It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap.
> 
> If it were a class, this would not be possible. As I implemented it, it would not be possible.
> 
> Containers as value types is such an idiotic idea for precisely this reason.

Correction: as partial value types. If they were completely value types, it'd be stupid for a much different reason -- namely, that they're very difficult to resize. But this class of bugs would not be an issue.

D's builtin arrays have the same problem. In their case, the only real way around the issue is setting the first sizeof(size_t) bytes to the length rather than putting that on the stack.

Any situation in which only part of your state is shared is a potential for getting data structures into an erroneous state.
October 15, 2008
Christopher Wright wrote:
> Christopher Wright wrote:
>> Actually, I STRONGLY urge that nobody use this heap.
>>
>> It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap.
>>
>> If it were a class, this would not be possible. As I implemented it, it would not be possible.
>>
>> Containers as value types is such an idiotic idea for precisely this reason.
> 
> Correction: as partial value types. If they were completely value types, it'd be stupid for a much different reason -- namely, that they're very difficult to resize. But this class of bugs would not be an issue.

Argh, I'm making more of an idiot of myself. If they're value types by merit of being entirely on the stack, yes. If they're value types by merit of the behavior of assignment, then I'm full of crap and you should pay me no mind.
October 15, 2008
Christopher Wright wrote:
> Actually, I STRONGLY urge that nobody use this heap.
> 
> It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap.
> 
> If it were a class, this would not be possible. As I implemented it, it would not be possible.
> 
> Containers as value types is such an idiotic idea for precisely this reason.

I'd say this statement reflects a lack of understanding of value types, rather than value containers are an idiotic idea.

Andrei
October 15, 2008
Andrei Alexandrescu wrote:
> Christopher Wright wrote:
>> Actually, I STRONGLY urge that nobody use this heap.
>>
>> It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap.
>>
>> If it were a class, this would not be possible. As I implemented it, it would not be possible.
>>
>> Containers as value types is such an idiotic idea for precisely this reason.
> 
> I'd say this statement reflects a lack of understanding of value types, rather than value containers are an idiotic idea.
> 
> Andrei

I'll bite.

Why is it a good idea that containers are implements as structs?

I too was pretty astonished to discover this pattern in tango. And why just for a few container types? Stack, Heap, and Vector are structs, while all other container types are classes.

--benji
October 15, 2008
Benji Smith wrote:
> Andrei Alexandrescu wrote:
>> Christopher Wright wrote:
>>> Actually, I STRONGLY urge that nobody use this heap.
>>>
>>> It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap.
>>>
>>> If it were a class, this would not be possible. As I implemented it, it would not be possible.
>>>
>>> Containers as value types is such an idiotic idea for precisely this reason.
>>
>> I'd say this statement reflects a lack of understanding of value types, rather than value containers are an idiotic idea.
>>
>> Andrei
> 
> I'll bite.
> 
> Why is it a good idea that containers are implements as structs?

As the Mexican would say: why not?

Andrei
October 15, 2008
Andrei Alexandrescu wrote:
> Benji Smith wrote:
>> Andrei Alexandrescu wrote:
>>> Christopher Wright wrote:
>>>> Containers as value types is such an idiotic idea for precisely this reason.
>>>
>>> I'd say this statement reflects a lack of understanding of value types, rather than value containers are an idiotic idea.
>>
>> I'll bite.
>>
>> Why is it a good idea that containers are implements as structs?
> 
> As the Mexican would say: why not?

Well... here are a few reasons why not:

Structs can't implement interfaces, and a collection API is the poster-child of good interface usage.

Structs can't inherit from abstract classes. A good deal of functionality in a collection class can be implemented once in an abstract base class, making it simpler for subclass authors to implement the API without duplicating a bunch of boilerplate.

Conceptually -- and there's no hard and fast rule about this -- structs usually either represent small logically-atomic values (DateTime), or some fixed-size collection of logically-atomic values (Vector3). Using a struct as an arbitrarily-sized container seems, on the face of it, to break those conventions.

--benji
« First   ‹ Prev
1 2 3