Pandas [was Kinds of containers] (page 4) - D Programming Language Discussion Forum

October 21, 2015

Re: Pandas [was Kinds of containers]

Posted by jmh530
in reply to Russel Winder

On Wednesday, 21 October 2015 at 15:44:36 UTC, Russel Winder wrote:
> On Wed, 2015-10-21 at 09:11 -0400, Andrei Alexandrescu via Digitalmars-d wrote:
>> 
> […]
>> My finance folks also rave about Pandas. I wish I could fork myself to look into it.
>
> Pandas is what makes Python a viable competitor to R. Most data science people will use one or the other these days. Though some (with budgets) will use Matlab or Mathematica.

Yeah, R has had the functionality that Pandas provides for a while. Data frames and zoo/xts cover most of Pandas. I think there has also been some effort to bring over some of the functionality of dplyr (blaze).

October 21, 2015

Re: Kinds of containers

Posted by Jonathan M Davis
in reply to Brad Anderson

On Wednesday, 21 October 2015 at 16:36:46 UTC, Brad Anderson wrote:
> On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei Alexandrescu wrote:
> [snip]
>> 2. Reference containers.
>>
>> These have classic reference semantics (à la Java). Internally, they may be implemented either as class objects or as reference counted structs.
>>
>> They're by default mutable. Qualifiers should apply to them gracefully.
>>
>> 3. Eager value containers.
>>
>> These are STL-style. Somewhat surprisingly I think these are the worst of the pack; they expensively duplicate at the drop of a hat and need to be carefully passed around by reference lest performance silently drops. Nevertheless, when used as members inside other data structures value semantics might be the appropriate choice. Also, thinking of them as values often makes code simpler.
>>
>> By default eager value containers are mutable. They should support immutable and const meaningfully.
>
> Having both reference and value semantics for containers would be great. I don't understand why reference semantics would be implemented by the container themselves though. Why not a general purpose RC! (or RefCounted! if the current design is deemed sufficient) that can apply to anything, including containers? Then you'd only need to implement the value semantic containers (and maybe throw in some RC version aliases to promote the use of the RC versions so the option isn't overlooked). It seems kind of crazy that anything in D that wants to be reference counted would need to implement the logic themselves.
>
> If there are performance advantages (I haven't thought of any but perhaps there are) to baking the RC right into the container, it might also be possible to use DbI to take advantage of it in RC! when appropriate.
>
> It just seems so wrong to implement reference counting dozens of times independently, especially when that means implementing all the containers twice too.
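The wrapper idea in the quoted post can be sketched roughly as follows. This is only an illustration under stated assumptions: `ValueStack` and `RcStack` are hypothetical names invented here, and `std.typecons.RefCounted` stands in for the general-purpose RC wrapper being proposed.

```d
import std.typecons : RefCounted;

// Hypothetical value-type container: every copy duplicates the storage.
struct ValueStack
{
    int[] items;

    // Postblit: runs on each copy and eagerly duplicates the payload.
    this(this) { items = items.dup; }

    void push(int x) { items ~= x; }
    size_t length() const { return items.length; }
}

// Reference semantics obtained by wrapping, not by reimplementing the container.
alias RcStack = RefCounted!ValueStack;

unittest
{
    ValueStack a;
    a.push(1);
    auto b = a;          // silent deep copy via the postblit
    b.push(2);
    assert(a.length == 1 && b.length == 2);

    RcStack r;
    r.push(1);           // payload is auto-initialized on first access
    auto s = r;          // copies the handle; both refer to the same payload
    s.push(2);
    assert(r.length == 2);
}
```

Only the value-semantic container is written by hand here; the shared variant is a one-line alias.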

If we had value type containers and reference type containers, I would assume that they would at least share implementation, and maybe the reference types would just be wrappers around the value types. However, I completely fail to understand why you'd ever want a container that was a value type. In my experience, it's very error-prone and adds no value. It just makes it too easy to accidentally copy a container, and it can be pretty easy to have an iterator, range, etc. referring to a container that's already been destroyed (similar to having a dynamic array referring to a static array that's left scope). As long as the containers have a dup method (or whatever we call it) so that they can be copied when you do want to copy them, I would think that that was more than enough. What do you get with a value type container that you consider better than a reference type? It's not like it lives on the stack as a value type. Most of the container's guts are on the heap regardless.
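A minimal sketch of what that could look like, assuming a hypothetical class-based `IntList` (not an existing Phobos type): plain assignment shares the container, and a copy is made only when `dup` is called.

```d
// Hypothetical reference-type container with an explicit copy operation.
final class IntList
{
    private int[] items;

    void append(int x) { items ~= x; }
    size_t length() const { return items.length; }

    // Explicit, visible copy: the only way to get an independent container.
    IntList dup() const
    {
        auto copy = new IntList;
        copy.items = items.dup;
        return copy;
    }
}

unittest
{
    auto a = new IntList;
    a.append(1);

    auto b = a;          // no copy: a and b are the same container
    b.append(2);
    assert(a.length == 2);

    auto c = a.dup;      // deliberate copy
    c.append(3);
    assert(a.length == 2 && c.length == 3);
}
```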

- Jonathan M Davis

October 21, 2015

Re: Kinds of containers

Posted by jmh530
in reply to Jonathan M Davis

On Wednesday, 21 October 2015 at 18:05:07 UTC, Jonathan M Davis wrote:
>
> However, I completely fail to understand why you'd ever want a container that was a value type. In my experience, it's very error-prone and adds no value.

Are you saying there isn't a reason to use static arrays?

October 21, 2015

Re: Kinds of containers

Posted by Zz
in reply to Andrei Alexandrescu

On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei Alexandrescu wrote:
> I'm finally getting the cycles to get to thinking about Design by Introspection containers. First off, there are several general categories of containers. I think D should support all properly. One question is which we go for in the standard library.
>
> 1. Functional containers.
>
> These are immutable; once created, neither their topology nor their elements may be observably changed. Manipulating a container entails creating an entire new container, often based on an existing container (e.g. append takes a container and an element and creates a whole new container).
>
> Internally, functional containers take advantage of common substructure and immutability to share actual data. The classic resource for defining and implementing functional containers is http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
>
> 2. Reference containers.
>
> These have classic reference semantics (à la Java). Internally, they may be implemented either as class objects or as reference counted structs.
>
> They're by default mutable. Qualifiers should apply to them gracefully.
>
> 3. Eager value containers.
>
> These are STL-style. Somewhat surprisingly I think these are the worst of the pack; they expensively duplicate at the drop of a hat and need to be carefully passed around by reference lest performance silently drops. Nevertheless, when used as members inside other data structures value semantics might be the appropriate choice. Also, thinking of them as values often makes code simpler.
>
> By default eager value containers are mutable. They should support immutable and const meaningfully.
>
> 4. Copy-on-write containers.
>
> These combine the advantages of value and reference containers: you get to think of them as values, yet they're not expensive to copy. Copying only occurs by necessity upon the first attempt to change them.
>
> The disadvantage is implementations get somewhat complicated. Also, they are shunned in C++ because there is no proper support for COW; for example, COW strings have been banned starting with C++11 which is quite the bummer.
>
> Together with Scott Meyers, Walter figured out a way to change D to support COW properly. The language change consists of two attributes.
>
> =======
>
> I'll attempt to implement a few versions of each and see what they look like. The question here is what containers are of interest for D's standard library.
>
>
> Andrei
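To make category 1 in the quoted post concrete, here is a rough sketch of a functional (persistent) singly linked list. The names `Node`, `List`, `prepend` and `listLength` are made up for this illustration and are not taken from any proposed library design.

```d
// Hypothetical persistent list: nodes are immutable once constructed.
struct Node
{
    int value;
    immutable(Node)* next;
}

// A list is a (rebindable) pointer to an immutable node; null is the empty list.
alias List = immutable(Node)*;

// Creates a new list whose tail is the existing list; nothing is copied.
List prepend(List tail, int value)
{
    return new immutable(Node)(value, tail);
}

size_t listLength(List list)
{
    size_t n = 0;
    for (; list !is null; list = list.next)
        ++n;
    return n;
}

unittest
{
    List a = prepend(prepend(null, 3), 2);   // 2 -> 3
    List b = prepend(a, 1);                  // 1 -> 2 -> 3
    assert(listLength(a) == 2 && listLength(b) == 3);
    assert(b.next is a);                     // b shares a's nodes
}
```

Because the nodes can never change, sharing the tail between `a` and `b` is not observable, which is the substructure sharing the post describes.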

While looking at containers, take a look at Jiri Soukup's work; some good ideas could come from there.

http://www.amazon.ca/Serialization-Persistent-Objects-Structures-Efficient/dp/3642393225/ref=sr_1_1?ie=UTF8&qid=1386946808&sr=8-1&keywords=SERIALIZATION+AND+PERSISTENT+OBJECTS#productPromotions

Intrusive Data Structures.
http://www.codefarms.com/home
http://www.codefarms.com/dol
http://www.codefarms.com/ppf
http://www.codefarms.com/ptl

Zz

October 21, 2015

Re: Kinds of containers

Posted by Robert burner Schadek
in reply to Andrei Alexandrescu

On Wednesday, 21 October 2015 at 17:23:15 UTC, Andrei Alexandrescu wrote:
> Even simpler, hasMethod!(Container, "append") -- Andrei

I know this goes against your talk at DConf, but having to write string parameters does not feel good. I will probably not be the only one who mistypes "apend" and wonders why the other template function gets chosen. I'd rather have the compiler scream at me, telling me there is no symbol hasApend. And hasAppend!Container is less typing than hasMethod!(Container, "append").
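The difference can be sketched like this. Both traits are hypothetical: `hasMethod` is written here on top of the built-in `__traits(hasMember, ...)` purely for illustration, and `Stack` is a made-up container.

```d
// Hypothetical string-based trait (name check only).
enum hasMethod(T, string name) = __traits(hasMember, T, name);

// Hypothetical dedicated trait for one specific primitive.
enum hasAppend(T) = __traits(hasMember, T, "append");

struct Stack
{
    int[] data;
    void append(int x) { data ~= x; }
}

static assert( hasMethod!(Stack, "append"));
// The misspelled string is accepted by the compiler and simply yields false.
static assert(!hasMethod!(Stack, "apend"));

static assert( hasAppend!Stack);
// The misspelled trait name is an undefined symbol, so it does not compile.
static assert(!__traits(compiles, hasApend!Stack));
```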

October 21, 2015

Re: Kinds of containers

Posted by Jonathan M Davis
in reply to Robert burner Schadek

On Wednesday, 21 October 2015 at 18:24:30 UTC, Robert burner Schadek wrote:
> On Wednesday, 21 October 2015 at 17:23:15 UTC, Andrei Alexandrescu wrote:
>> Even simpler, hasMethod!(Container, "append") -- Andrei
>
> I know this goes against your talk at DConf, but having to write string parameters does not feel good. I will probably not be the only one who mistypes "apend" and wonders why the other template function gets chosen. I'd rather have the compiler scream at me, telling me there is no symbol hasApend. And hasAppend!Container is less typing than hasMethod!(Container, "append").

The other concern is that hasMethod isn't going to be checking anything other than the name, whereas a hasAppend could actually check that the function accepts the right arguments and returns the correct type.

And it's not like the list of the functions to check for is going to be infinite. It needs to be a known list of functions where each of those functions has a signature that meets certain requirements. You can't just be checking for a random function name and expect it to do what you want. So, having a list of explicit traits to use for testing for the expected set of functions will both allow for the tests to be more thorough than simply checking the name, _and_ it will serve as a way to help document what the list of expected functions is for a particular domain.

I really think that using hasMethod by itself like that is getting too ad hoc and doesn't really benefit us over having an explicit list of traits for the specific domain that we're dealing with. I totally buy that testing for each of the specific functions rather than trying to put all of that information in the type itself (e.g. ContainerWithAppendAndRemove) simply isn't going to scale, but I don't at all buy that using hasMethod to test for the existence of member functions is a good way to go.
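As a rough sketch of that distinction (all names here are hypothetical, and `hasMethod` is again approximated with `__traits(hasMember, ...)`), a dedicated trait can require that the call actually compiles with the expected argument type, while a name check cannot tell the two structs below apart:

```d
// Hypothetical name-only check.
enum hasMethod(T, string name) = __traits(hasMember, T, name);

// Hypothetical signature-aware trait: C must have an append callable with one E.
enum hasAppend(C, E) = is(typeof((C c, E e) { c.append(e); }));

struct Good
{
    int[] data;
    void append(int x) { data ~= x; }
}

struct Odd
{
    // Same name, unrelated signature.
    void append(string a, string b) {}
}

// By name alone, both types look like they support append.
static assert(hasMethod!(Good, "append") && hasMethod!(Odd, "append"));

// The dedicated trait accepts only the one that can really be called.
static assert( hasAppend!(Good, int));
static assert(!hasAppend!(Odd, int));
```

A fuller trait could also constrain the return type in the same way; that is left out here to keep the sketch short.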

- Jonathan M Davis

October 21, 2015

Re: Kinds of containers

Posted by Jonathan M Davis
in reply to jmh530

On Wednesday, 21 October 2015 at 18:14:39 UTC, jmh530 wrote:
> On Wednesday, 21 October 2015 at 18:05:07 UTC, Jonathan M Davis wrote:
>>
>> However, I completely fail to understand why you'd ever want a container that was a value type. In my experience, it's very error-prone and adds no value.
>
> Are you saying there isn't a reason to use static arrays?

A static array is of a fixed size, which almost no other containers are. It also lives entirely on the stack, which almost no other containers do. If there's a container that lives entirely on the stack, then maybe it would make sense for it to be a value type, but _very_ few containers fall in that category, and all of the classic containers like vector, linked list, map, etc. have no business being value types IMHO. It's just error-prone. Heck, static arrays are quite error-prone thanks to the fact that they convert to dynamic arrays, but they do serve a purpose. So, maybe there are containers that fall in the same category, but I expect that such containers are pretty obviously value types and not reference types, because their nature makes them that way. Regardless, I don't see how it's reasonable in general to make a container be a value type. It's just asking for trouble. If there's any question at all whether a container should be a value type or a reference type, IMHO, it should be a reference type.
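For reference, a small sketch of the two behaviours mentioned above: a static array copies by value, while slicing it yields a dynamic array that refers to the same storage. The helper `withFirstReplaced` is a name invented for this example.

```d
// Static arrays are passed by value, so arr is a private copy here.
int[3] withFirstReplaced(int[3] arr, int value)
{
    arr[0] = value;
    return arr;
}

unittest
{
    int[3] a = [1, 2, 3];
    auto b = withFirstReplaced(a, 99);
    assert(a == [1, 2, 3] && b == [99, 2, 3]);   // the original is untouched

    int[] view = a[];    // dynamic array referring to a's storage
    view[0] = 42;
    assert(a[0] == 42);  // the write goes through to a
}
```

The second half is the error-prone part: if `view` outlives `a`, it refers to storage that no longer exists.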

- Jonathan M Davis

October 21, 2015

Re: Kinds of containers

Posted by Andrei Alexandrescu
in reply to Timon Gehr

On 10/21/2015 11:58 AM, Timon Gehr wrote:
> For which of the containers we want to support is "2." not a (wrapper around a)
> pointer to "3."?

For those that need reference counting. -- Andrei

October 21, 2015

Re: Kinds of containers

Posted by H. S. Teoh
in reply to Jonathan M Davis

On Wed, Oct 21, 2015 at 06:38:08PM +0000, Jonathan M Davis via Digitalmars-d wrote:
> On Wednesday, 21 October 2015 at 18:14:39 UTC, jmh530 wrote:
[...]
> >Are you saying there isn't a reason to use static arrays?
> 
> A static array is of a fixed size, which almost no other containers are. It also lives entirely on the stack, which almost no other containers do.
[...]

You forget that static arrays can also be struct/class members, and in the latter case they are not necessarily on the stack. You do have a point that they are of fixed size, though, and as such are of limited use as containers.
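A quick way to see this, sketched with hypothetical `Board` and `Matrix2` types and a small helper built on `core.memory.GC.addrOf` (which returns null for memory the GC did not allocate):

```d
import core.memory : GC;

final class Board { int[4] cells; }      // static array embedded in a class
struct Matrix2   { double[4] cells; }    // static array embedded in a struct

// True when p points into a block allocated by the GC.
bool isGcAllocated(const void* p)
{
    return GC.addrOf(p) !is null;
}

unittest
{
    auto board = new Board;
    assert(isGcAllocated(board.cells.ptr));   // lives inside the heap object

    auto m = new Matrix2;                     // struct allocated on the heap
    assert(isGcAllocated(m.cells.ptr));

    int[4] local;                             // plain local: on the stack
    assert(!isGcAllocated(local.ptr));
}
```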

T

-- 
"I'm running Windows '98." "Yes." "My computer isn't working now." "Yes, you already said that." -- User-Friendly

October 21, 2015

Re: Kinds of containers

Posted by Andrei Alexandrescu
in reply to H. S. Teoh

On 10/21/2015 12:12 PM, H. S. Teoh via Digitalmars-d wrote:
> On Wed, Oct 21, 2015 at 09:11:38AM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> [...]
>> My finance folks also rave about Pandas. I wish I could fork myself to
>> look into it.
> [...]
>
> Unfortunately, forking a human process takes 9 months to spawn the new
> process (slow OS, y'know), and the new process's module constructor
> takes about 18 or so years to run before it's ready for use. :-D  Also,
> the specs are ambiguous in some key areas of the semantics, so
> implementations in practice don't quite manage to replicate the original
> process exactly.

As one with two spawned forks underway, I totally agree. Besides, early versions tend to consume a lot of resources, emit many nonmaskable interrupts, and leak a lot. -- Andrei
