March 05, 2010 Re: Container hierarchy vs. container types | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Fri, 05 Mar 2010 11:53:47 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote: > Steven Schveighoffer wrote: >> On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote: >> >>> >>> Needless to say, I'd be very curious to hear other opinions in this matter. Please speak up - the future of D depends, in small part, on this too. >> As you know, you and I disagree on some aspects of containers. However, I think our philosophies are converging. > [snip] > > To summarize my understanding of your points: > > (a) When you only want to tweak a container's behavior in small ways, inheritance is great. > > (b) Pass by reference is more natural for containers than pass by value. > > (c) Proprietary code hiding is better with interfaces. > > (d) Dynamically-linked code works best with interfaces. > > (e) Interface-based code requires less recompilation and may save on generated code. > > (f) Interfaces simplify understanding. > > At the same time, you argue that ranges aren't fit as interfaces because: > > (a) Inlinining is difficult/impossible > > (b) Value semantics is more natural for ranges A range is not a container. A range is a small reference-like construct that points to a containers elements. I stand by what I said, a container is more natural as a reference, ranges are more natural with value semantics. If I pass an abstract range into something like std.algorithm.sort, the performance is going to be much worse than using a value-type range. The implementation has the option of doing the latter, so why not make it an interface function? In addition, std.algorithm.sort is going to expect value-type ranges, so you have to make copies everywhere, incurring more performance hits. > One other thing I'm worried about, and was an absolute pain in my preliminary tests of container implementations, is this issue of "iterators/ranges/cursors are not part of the container interface". It is absolutely fundamental that you can get an abstract range from an abstract container because most of what you can do to a container you must do with ranges. Otherwise the duplication of functionality is horrid. This failure alone is a reason to abandon the container hierarchy design. It depends on what you want to do with the container. std.algorithm is not the only reason to have containers. Duplication of functionality is unavoidable if you do not have abstract ranges. However, by "duplication" it most likely means forwarding calls to range-accepting functions. For example, a List interface may have a sort function. The implementation can forward to std.algorithm.sort if the list is an array, or to an internal merge sort if it is a linked list (in fact, this is similar to what I do now in dcollections). I don't really know how you would do this without knowing one is a linked list and one isn't. It's possible to take a flat non-interfaced collection library and build an interface on top of it, I don't see why it detracts from the original non-interface API. -Steve | |||
March 05, 2010 Re: Container hierarchy vs. container types | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin | Michel Fortin wrote:
> On 2010-03-05 11:32:27 -0500, Jonathan M Davis <jmdavisProg@gmail.com> said:
>
>>> Why couldn't a container have the same interface as a range, only with the added ability to push elements into it?
>>
>> Probably because it would become way too easy to accidentally use the container as a range, which in many cases would end up emptying it when passing it to one of the std algorithms.
>
> Yes, I understand that.
>
> My point is that the problem already exists with ranges: by exposing a stream as a range, you'll empty the stream when iterating over it. If that's acceptable for streams, why is it not for containers?
>
> And perhaps you actually want to empty the container as you iterate over it. If you use your container as a queue, that's what you'll expect. In fact, a stream is just like a queue, and you want algorithms to work on streams don't you?
>
It would probably be better in such a case to have a range wrap the container. That way, you can have range semantics for the container if you want them, but you don't risk using them on a container where you don't.
As for streams, I don't know. I do like the idea of using ranges with streams, but I haven't really looked at what the downsides to that might be. It depends a lot on what the std algorithms do. With some it might be fine while with others it would be a really bad idea. I'd have to look into it more to say, but at least with a stream, it's expected that iterating over it will alter it. With a container, that's not usually the case. And in cases where it is, either that container could be a special case and be a range as well, or it you could have a wrapper range do it. Personally, I'd prefer the wrapper idea. It would probably allow you to create generic wrappers to use on pretty much any container (or at least containers with a particular set of functions) rather than special casing particular containers. Regardless, I think that making containers ranges in the general case is a bad idea.
- Jonathan M Davis
| |||
March 05, 2010 Re: Container hierarchy vs. container types | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | Andrei Alexandrescu wrote:
> (b) Pass by reference is more natural for containers than pass by value.
I concur with this. It rarely makes sense to outright copy containers, and when you do, you're explicitly looking to do so. If a container were a struct, then returning it from a function would be expensive. And it could get really messy trying to have one container as an element in another - particularly if you can't get a reference to the element in the container. You'd have to copy it out of the container, modify it, and copy it back in again. That's way too much unnecessary copying. And it would be easy to think that after you'd changed the element after getting it out of the container that it had changed the one in the container when you hadn't.
Container!(Container!(E)) cont;
...
auto element = cont.get(x);
element.add(someValue);
//With a class, the element in the container would now be altered.
vs
Container!(Container!(E)) cont;
...
auto element = cont.get(x); //this involves a copy
element.add(someValue);
cont.set(x, element); //this involves a copy
Now, by getting at the elements of a container by reference, that reduces the problem, but generally speaking, it makes no sense to be copying containers, and it's expensive to do so. I'd vote to just make them final classes. You gain the efficiency of inlining and have reference semantics which is really what typically makes sense for containers. If you want it copied, you can clone it or construct a new one with the elements of the first one. I really don't see what would be gained by making a container a struct.
- Jonathan M Davis
| |||
March 05, 2010 Re: Container hierarchy vs. container types | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin | Michel Fortin wrote:
> On 2010-03-05 11:32:27 -0500, Jonathan M Davis <jmdavisProg@gmail.com> said:
>
>>> Why couldn't a container have the same interface as a range, only with
>>> the added ability to push elements into it?
>>
>> Probably because it would become way too easy to accidentally use the
>> container as a range, which in many cases would end up emptying it when
>> passing it to one of the std algorithms.
>
> Yes, I understand that.
>
> My point is that the problem already exists with ranges: by exposing a stream as a range, you'll empty the stream when iterating over it. If that's acceptable for streams, why is it not for containers?
>
> And perhaps you actually want to empty the container as you iterate over it. If you use your container as a queue, that's what you'll expect. In fact, a stream is just like a queue, and you want algorithms to work on streams don't you?
A range is a view on a container. As such, popping off a range does not affect the topology of the underlying container. Consider, for contrast, a balanced binary tree. Removing an element triggers a rebalance of the tree, but popping an element off an iterator in that tree does not affect the tree itself.
Fair enough?
Andrei
| |||
March 05, 2010 Re: Container hierarchy vs. container types | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote: > On Fri, 05 Mar 2010 11:53:47 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote: > >> Steven Schveighoffer wrote: >>> On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote: >>> >>>> >>>> Needless to say, I'd be very curious to hear other opinions in this matter. Please speak up - the future of D depends, in small part, on this too. >>> As you know, you and I disagree on some aspects of containers. However, I think our philosophies are converging. >> [snip] >> >> To summarize my understanding of your points: >> >> (a) When you only want to tweak a container's behavior in small ways, inheritance is great. >> >> (b) Pass by reference is more natural for containers than pass by value. >> >> (c) Proprietary code hiding is better with interfaces. >> >> (d) Dynamically-linked code works best with interfaces. >> >> (e) Interface-based code requires less recompilation and may save on generated code. >> >> (f) Interfaces simplify understanding. >> >> At the same time, you argue that ranges aren't fit as interfaces because: >> >> (a) Inlinining is difficult/impossible >> >> (b) Value semantics is more natural for ranges > > A range is not a container. A range is a small reference-like construct that points to a containers elements. I stand by what I said, a container is more natural as a reference, ranges are more natural with value semantics. I wasn't debating the point, just organizing ideas. > If I pass an abstract range into something like std.algorithm.sort, the performance is going to be much worse than using a value-type range. The implementation has the option of doing the latter, so why not make it an interface function? In addition, std.algorithm.sort is going to expect value-type ranges, so you have to make copies everywhere, incurring more performance hits. I agree. Interface ranges are unfortunately a poor practical proposition. >> One other thing I'm worried about, and was an absolute pain in my preliminary tests of container implementations, is this issue of "iterators/ranges/cursors are not part of the container interface". It is absolutely fundamental that you can get an abstract range from an abstract container because most of what you can do to a container you must do with ranges. Otherwise the duplication of functionality is horrid. This failure alone is a reason to abandon the container hierarchy design. > > It depends on what you want to do with the container. std.algorithm is not the only reason to have containers. It's not the only reason, but it is the primordial reason - just like "transportation" is a reason for buying a car. It would be a failure to define containers that can't work with std.algorithm. > Duplication of functionality is unavoidable if you do not have abstract ranges. However, by "duplication" it most likely means forwarding calls to range-accepting functions. For example, a List interface may have a sort function. The implementation can forward to std.algorithm.sort if the list is an array, or to an internal merge sort if it is a linked list (in fact, this is similar to what I do now in dcollections). I don't really know how you would do this without knowing one is a linked list and one isn't. The sort example is good, but I wonder how many of those are out there. > It's possible to take a flat non-interfaced collection library and build an interface on top of it, I don't see why it detracts from the original non-interface API. Perfect. So I take it we can focus on the federation of containers approach? Andrei | |||
March 05, 2010 Re: Container hierarchy vs. container types | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Fri, 05 Mar 2010 16:16:15 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote: > Steven Schveighoffer wrote: >> It depends on what you want to do with the container. std.algorithm is not the only reason to have containers. > > It's not the only reason, but it is the primordial reason - just like "transportation" is a reason for buying a car. It would be a failure to define containers that can't work with std.algorithm. I disagree. I generally use containers for collecting items for later lookup. That's generally a container-specific function that can be part of an interface. Other functions I usually use are iteration, which again can be part of the interface. Besides, making containers have interfaces does not preclude them from using std.algorithm. I agree std.algorithm is definitely a reason to have containers, but that's not mutually exclusive with interfaces. -Steve | |||
March 05, 2010 Re: Container hierarchy vs. container types | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 2010-03-05 16:09:08 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said: > A range is a view on a container. As such, popping off a range does not affect the topology of the underlying container. Consider, for contrast, a balanced binary tree. Removing an element triggers a rebalance of the tree, but popping an element off an iterator in that tree does not affect the tree itself. > > Fair enough? That's a fine definition, except for the word "topology". Whether it changes the topology should be an implementation detail. Popping of a C++ vector or a hash table doesn't change the topology, will you make them ranges? That doesn't make much sense. Also, do you consider that popping off a stream affects its topology? If your stream is buffered, you're probably advancing a pointer through a buffer with 'pop', and deallocating old buffers from time to time. So this definition would forbid a stream from being a range. That doesn't make much sense. And similarly you could have a queue container where you insert things at the back and pop things from the front. But that's basically the same thing as a stream. Is a queue a container or a range? I think this definition is more on the spot: a range is something you consume, a container is something used to hold data. Those definition are not mutually exclusive, but is this really a problem? -- Michel Fortin michel.fortin@michelf.com http://michelf.com/ | |||
March 05, 2010 Re: Container hierarchy vs. container types | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin | Michel Fortin wrote: > On 2010-03-05 16:09:08 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said: > >> A range is a view on a container. As such, popping off a range does not affect the topology of the underlying container. Consider, for contrast, a balanced binary tree. Removing an element triggers a rebalance of the tree, but popping an element off an iterator in that tree does not affect the tree itself. >> >> Fair enough? > > That's a fine definition, except for the word "topology". > > Whether it changes the topology should be an implementation detail. Popping of a C++ vector or a hash table doesn't change the topology, will you make them ranges? That doesn't make much sense. It's essential. The way ranges work, they can never change the topology of the collection they operate on. That is a fundamental aspect. A container operation may or may not affect its topology. > Also, do you consider that popping off a stream affects its topology? A stream has no topology I can think of. Within this context, I define topology loosely as the way the data of a container is stored and interconnected in memory. > If your stream is buffered, you're probably advancing a pointer through a buffer with 'pop', and deallocating old buffers from time to time. So this definition would forbid a stream from being a range. That doesn't make much sense. > > And similarly you could have a queue container where you insert things at the back and pop things from the front. But that's basically the same thing as a stream. Is a queue a container or a range? > > I think this definition is more on the spot: a range is something you consume, a container is something used to hold data. Those definition are not mutually exclusive, but is this really a problem? I think it is. For starters, copying a forward range copies its state, whereas copying a container and then removing stuff from it results in two containers with less net state. By and large, clearly there are important distinctions between ranges and containers. My perception is that you do have that understanding, but are doing a Plato on me. I'm a bit tired of having a Plato done on me, particularly in this case when I don't have the answers. Would be great to strengthen the weak definitions we have now instead of challenging them. Andrei | |||
Copyright © 1999-2021 by the D Language Foundation
Permalink
Reply