October 21, 2015
On Wednesday, 21 October 2015 at 18:38:10 UTC, Jonathan M Davis wrote:
>
> A static array is of a fixed size, which almost no other containers are. It also lives entirely on the stack, which almost no other containers do. If there's a container that lives entirely on the stack, then maybe it would make sense for it to be a value type, but _very_ few containers fall in that category, and all of the classic containers like vector, linked list, map, etc. have no business being value types IMHO. It's just error-prone. Heck, static arrays are quite error-prone thanks to the fact that they convert to dynamic arrays, but they do serve a purpose. So, maybe there are containers that fall in the same category, but I expect that such containers are pretty obviously value types and not reference types, because their nature makes them that way. Regardless, I don't see how it's reasonable in general to make a container be a value type. It's just asking for trouble. If there's any question at all whether a container should be a value type or a reference type, IMHO, it should be a reference type.
>

I do find myself mixing up static and dynamic arrays...

I'll generally defer to you. I suppose within the context of Andrei working on containers with std.allocator, then containers that only live on the stack seems beyond the scope anyway.

Your statement just had struck me as weird as I had thought of people like manu saying at various points that they only kept data structures on the stack. My next thought was that static arrays are usually on the stack and are value types. So I figured that whatever the performance people are using were similar.
October 21, 2015
On 10/21/2015 01:05 PM, Andrei Alexandrescu wrote:
>
>
> I'll attempt to implement a few versions of each and see what they look
> like. The question here is what containers are of interest for D's
> standard library.

There should probably also be wrappers that implement optimizations for small containers.
October 21, 2015
On Wednesday, 21 October 2015 at 20:13:31 UTC, jmh530 wrote:
> On Wednesday, 21 October 2015 at 18:38:10 UTC, Jonathan M Davis wrote:
>>
>> A static array is of a fixed size, which almost no other containers are. It also lives entirely on the stack, which almost no other containers do. If there's a container that lives entirely on the stack, then maybe it would make sense for it to be a value type, but _very_ few containers fall in that category, and all of the classic containers like vector, linked list, map, etc. have no business being value types IMHO. It's just error-prone. Heck, static arrays are quite error-prone thanks to the fact that they convert to dynamic arrays, but they do serve a purpose. So, maybe there are containers that fall in the same category, but I expect that such containers are pretty obviously value types and not reference types, because their nature makes them that way. Regardless, I don't see how it's reasonable in general to make a container be a value type. It's just asking for trouble. If there's any question at all whether a container should be a value type or a reference type, IMHO, it should be a reference type.
>>
>
> I do find myself mixing up static and dynamic arrays...
>
> I'll generally defer to you. I suppose within the context of Andrei working on containers with std.allocator, then containers that only live on the stack seems beyond the scope anyway.
>
> Your statement just had struck me as weird as I had thought of people like manu saying at various points that they only kept data structures on the stack. My next thought was that static arrays are usually on the stack and are value types. So I figured that whatever the performance people are using were similar.

Static arrays are a bit of a special case, because they're essentially a chunk of memory on the stack. They can be extremely useful and efficient, but you have to be careful with them. Most containers don't act anything like them.

But like I said, there are some container types which inherently make sense on a stack, but most don't. For the most part, the kinds of containers that we're talking about here - vectors/array lists, linked lists, maps, sets, etc. - have to have most of their guts living on the heap. You'd have to _really_ contort things to put them entirely on the stack, and you'd be seriously limiting how many elements that they could hold if you did. If you make a vector or a linked list a value type, then it'll have a few members on the stack, but most of them will be on the heap, and it'll be a value type only because its postblit constructor explicitly copies the portion that's on the heap. It's not really naturally a value type. It's naturally more of a reference type or pseudo-reference type.

If the vector is implemented as a reference type, then the few member variables that were on the stack in the value type version end up on the heap with the rest of the container's guts, so you use slightly more memory for them, but it's much more natural that way, since the main guts of the container were on the heap anyway. And if the container is a value type, you risk copying unnecessarily and/or keeping around references/pointers/iterators/ranges to its elements after it's been destroyed, whereas with a reference type, it'll only be copied when you specifically request it. Treating them as value types is inefficient, error-prone, and unnatural.

Now, folks like Manu do do some crazy stuff eke out every bit of performance that they can, and they do sometimes use stuff like alloca to put stuff that would otherwise be on the heap on the stack. But a lot of what they do isn't very general purpose either, and they generally avoid stuff like the containers in standard libraries, because they use the heap too much and/or aren't tailored to their needs. They also tend to do stuff that's way more error-prone in an effort to eke out that extra bit of performance.

If we can do stuff to support them, then great, but in general - especially when you're talking about the library and not the language - the kinds of stuff that they want don't work for other folks. If someone like Manu wants an efficient container, he _might_ want one that's not a full-on reference type simply so that less of it is on the heap, but he sure isn't going to want to copy it except when he absolutely has to. So, the fact that it's a value type isn't necessarily useful. Having it simply live where it's constructed (be it on the stack or as a member variable) with only the stuff that has to be on the heap on the heap but have the container have a disabled postblit would probably be just as useful in most cases, and scoped or region allactors should be able to help with that. I'm not super familiar with all of the tricks that they pull to make their container use efficient, but I have a hard time seeing how std::vector's semantics are desirable to them. Certainly, when they care about efficiency to the point that they're using static arrays everywhere, the fact that vector has its guts on the heap means that it's in a different class of containers entirely from static arrays and does not have the same desirable properties, even if it is a value type like static arrays. And even then, the fact that static arrays are value types isn't really the gain. It's more that they're on the stack, and that naturally turns them into value types.

In any case, I'm pretty sure that we do have a region allocator in std.experimental.allocator, so if someone wants to try and put a container that normally lives on the heap on the stack, it should be possible (at least, with a limited number of elements). And it might actually be easier to do that with a container that's entirely on the heap instead of part of it living on the stack and part of it living on the heap (as occurs with the C++ containers).

If Manu (or others like him) have strong feelings about what kinds of containers would be best for them, they can speak up and point out what they need, but I don't see how having value type containers would generally be useful to them except in the cases where the container actually, fully lives on the stack (which is almost never the case), and it seems even more questionable for everyone else.

- Jonathan M Davis
October 21, 2015
On Wednesday, 21 October 2015 at 18:55:10 UTC, Andrei Alexandrescu wrote:
> On 10/21/2015 01:42 PM, Ulrich Kuettler wrote:
>> On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei Alexandrescu wrote:
>>> I'm finally getting the cycles to get to thinking about Design by
>>> Introspection containers. First off, there are several general
>>> categories of containers. I think D should support all properly. One
>>> question is which we go for in the standard library.
>>>
>>> 1. Functional containers.
>>
>> Very much in favor of functional (or persistent) containers. Immutable.
>> Value semantics.
>>
>> This might take some getting used to, but if properly designed these
>> containers would be killer for D. A dream come true.
>
> I agree they're cool.
>
>>> These are immutable; once created, neither their topology nor their
>>> elements may be observably changed. Manipulating a container entails
>>> creating an entire new container, often based on an existing container
>>> (e.g. append takes a container and an element and creates a whole new
>>> container).
>>>
>>> Internally, functional containers take advantage of common
>>> substructure and immutability to share actual data. The classic
>>> resource for defining and implementing functional containers is
>>> http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
>>>
>>>
>>
>> An insanely beautiful example is clojure's PersistantVector. AFAIK it
>> made its way into Scala, too.
>
> Far as I understand from http://hypirion.com/musings/understanding-persistent-vector-pt-1, it's that tree thing with carefully chosen slot sizes for cache friendliness?

Yes. It is about cache friendliness, memory efficiency and speed. Since each node consists of 32 pointers, there are just three inner levels to store 1M values.

The bit operations to navigate to the leaf node are a thing of beauty:

https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java#L147-L163

It is a remarkable design on the JVM. Of course, it serves a very particular purpose. D's design space is very different. Obviously.

October 21, 2015
On 10/21/2015 11:02 PM, Jonathan M Davis wrote:
>
> Static arrays are a bit of a special case, because they're essentially a
> chunk of memory on the stack. They can be extremely useful and
> efficient, but you have to be careful with them. Most containers don't
> act anything like them.
>
> But like I said, there are some container types which inherently make
> sense on a stack, but most don't. For the most part, the kinds of
> containers that we're talking about here - vectors/array lists, linked
> lists, maps, sets, etc. - have to have most of their guts living on the
> heap. You'd have to _really_ contort things to put them entirely on the
> stack,

In many applications, most containers are small. Small containers can have all their guts on the stack.

> and you'd be seriously limiting how many elements that they could
> hold if you did.

Not really. Just allocate on the heap only if there are too many elements.

> If you make a vector or a linked list a value type,
> then it'll have a few members on the stack, but most of them will be on
> the heap, and it'll be a value type only because its postblit
> constructor explicitly copies the portion that's on the heap. It's not
> really naturally a value type.

Value semantics does not mean slow copy, I assume you conflate value semantics and eager copy?

> It's naturally more of a reference type
> or pseudo-reference type.

What you describe is "naturally" (which I interpret to mean: if postblit must be O(1)) a unique reference type. I.e. the most natural way out would be to add @disable this(this). But implicit copy can be convenient. Maybe have implicit copy as a wrapper?
October 21, 2015
On Wednesday, 21 October 2015 at 20:19:27 UTC, Timon Gehr wrote:
> On 10/21/2015 01:05 PM, Andrei Alexandrescu wrote:
>>
>>
>> I'll attempt to implement a few versions of each and see what they look
>> like. The question here is what containers are of interest for D's
>> standard library.
>
> There should probably also be wrappers that implement optimizations for small containers.

I think the allocators can probably deal with that detail. If I remember correctly there is an in-place allocator.

I've been using Boost.Container's small_vector lately (basically a generalization of the now ubiquitous short string optimization...though I'm not sure if it repurposes the pointer when the item count is sufficiently small). It's a nice option to have. The Container library now also has static_vector (fixed capacity, changeable size) that fills a use gap between std::array and std::vector;
October 21, 2015
On Wednesday, 21 October 2015 at 20:06:41 UTC, Andrei Alexandrescu wrote:
> "More Catholic than the Pope" = overdoing something all too pedantically. What I mean is: let's get things started and if there's any need for isXxx we'll add them. -- Andrei

+1
October 21, 2015
On Wednesday, 21 October 2015 at 21:53:11 UTC, Brad Anderson wrote:
> to have. The Container library now also has static_vector (fixed capacity, changeable size) that fills a use gap between std::array and std::vector;

Yes, in C++ I reimplement this over and over... Trivial, but boring.


October 21, 2015
On 10/21/2015 11:53 PM, Brad Anderson wrote:
> On Wednesday, 21 October 2015 at 20:19:27 UTC, Timon Gehr wrote:
>> On 10/21/2015 01:05 PM, Andrei Alexandrescu wrote:
>>>
>>>
>>> I'll attempt to implement a few versions of each and see what they look
>>> like. The question here is what containers are of interest for D's
>>> standard library.
>>
>> There should probably also be wrappers that implement optimizations
>> for small containers.
>
> I think the allocators can probably deal with that detail. If I remember
> correctly there is an in-place allocator.
> ...

Good point, but might it not be the case that alternative implementations of some operations are faster for a small number of elements?

October 21, 2015
On Wednesday, 21 October 2015 at 19:17:52 UTC, Andrei Alexandrescu wrote:
> On 10/21/2015 02:18 PM, Zz wrote:
>>
>> While looking at containers take a look at Jiri Soukup's work some good
>> ideas could come from there.
>>
>> http://www.amazon.ca/Serialization-Persistent-Objects-Structures-Efficient/dp/3642393225/ref=sr_1_1?ie=UTF8&qid=1386946808&sr=8-1&keywords=SERIALIZATION+AND+PERSISTENT+OBJECTS#productPromotions%22%20target=%22_blank
>>
>>
>> Intrusive Data Structures.
>> http://www.codefarms.com/home
>> http://www.codefarms.com/dol
>> http://www.codefarms.com/ppf
>> http://www.codefarms.com/ptl
>
> Ah, Codefarms is still around. Nice. I remember I've been reading Soukup's articles back in the days of the C++ Report magazine. I couldn't make sense of them, nor could any of my friends. -- Andrei

The following links give an out line of DOL (The main library), most of the articles talked about the newer PTL (at that time).

For DOL
http://www.codefarms.com/dolclasses
http://www.codefarms.com/docs/dol/index.htm for the list of available organisations.

As for PTL the following gives an outline.
http://www.codefarms.com/docs/ptl/chap4.htm

DOL was Macro based while PTL used templates.

Zz