September 07, 2016
On Wednesday, 7 September 2016 at 21:01:59 UTC, deXtoRious wrote:
> That's just typical press nonsense, and even they quote Bezanson saying how Julia isn't at all suited to a whole host of applications. Julia certainly has (justifiable, imho, though only time will tell) ...

Don't get me wrong, I still think Julia is a very cool language. My opinion is that we should have more languages.


September 07, 2016
On Wednesday, 7 September 2016 at 21:01:59 UTC, deXtoRious wrote:
>
> That's just typical press nonsense, and even they quote Bezanson saying how Julia isn't at all suited to a whole host of applications. Julia certainly has (justifiable, imho, though only time will tell) aspirations of being useful in certain areas of general computing, not just scientific code, but they are far from universal applicability, let alone optimality. If nothing else, it's an interesting example of thinking rather far outside the usual box of language design, one with demonstrable real world applications.

It's also from 2014...
September 07, 2016
On Wednesday, 7 September 2016 at 21:07:20 UTC, data pulverizer wrote:
> Don't get me wrong, I still think Julia is a very cool language. My opinion is that we should have more languages.

Let me correct myself ... I think that hyper-meta-programming as in Sparrow could certainly revolutionize computing. I think that is a big deal.
September 07, 2016
On Wednesday, 7 September 2016 at 20:49:42 UTC, data pulverizer wrote:
>
> You're quite right that D doesn't need to change at all to implement something like pandas or dataframes in R, but I am thinking of how to go further. Very often in data science applications, types will turn up that are required but are not currently configured for your table. The choice you have is either to modify the code or, as Scala does, to give programmers the ability to write their own interface to the type so that it can be stored in their DataFrame.

I think part of the difficulty is that you're thinking in terms of everything being dynamic. If all your data is statically typed in the first place, then I don't see what the issue is.

Consider a potential use case. You have an existing data frame and you want to add a column of data to it that has a different type than the existing frame. I imagine the function call would look something like:
auto newFrame = oldFrame.addCol(newData);
So you just need to ensure that the data frame struct or class has an addCol method that returns a new frame with the correct type when you add a column.
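
A minimal sketch of that idea, assuming a frame that stores one slice per column in a std.typecons.Tuple. The names DataFrame and dataFrame are hypothetical and only there for illustration; addCol is the method from the call above:

```d
import std.typecons : Tuple, tuple;

/// Hypothetical statically typed data frame: one slice per column.
struct DataFrame(Cols...)
{
    Tuple!Cols columns; // each element is a column slice, e.g. int[] or string[]

    /// Returns a new frame whose type has one extra column.
    /// The existing column slices are shared with the old frame, not copied.
    auto addCol(T)(T[] newData)
    {
        return DataFrame!(Cols, T[])(tuple(columns.expand, newData));
    }
}

/// Hypothetical helper that infers the column types from its arguments.
auto dataFrame(Cols...)(Cols cols)
{
    return DataFrame!Cols(tuple(cols));
}

void main()
{
    auto oldFrame = dataFrame([1, 2, 3], ["a", "b", "c"]);
    auto newFrame = oldFrame.addCol([1.5, 2.5, 3.5]);

    // The new column type is part of the frame's static type.
    static assert(is(typeof(newFrame) == DataFrame!(int[], string[], double[])));
    assert(newFrame.columns[2] == [1.5, 2.5, 3.5]);
}
```

Because the result has a different type from oldFrame, it cannot be assigned back to the same variable, which is why the call produces a new one.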

I'm not familiar with Scala's data frames.

> The best solution is that the data table is able to cope with an arbitrary number of types, which can be done in Sparrow.

D has support for an arbitrary number of types (tuple, variant, algebraic). It's just a matter of putting it together.
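
As a rough illustration, here is how std.variant could hold values of several types in one row. The alias Cell and the function describe are hypothetical names, not part of any existing data frame library:

```d
import std.variant : Algebraic, Variant, visit;

// Hypothetical cell type: a closed set of allowed value types.
alias Cell = Algebraic!(int, double, string);

// Dispatch on whichever type the cell currently holds.
string describe(Cell c)
{
    return c.visit!(
        (int i) => "int",
        (double d) => "double",
        (string s) => "string");
}

void main()
{
    Cell[] row = [Cell(42), Cell(3.14), Cell("text")];

    assert(row[0].get!int == 42);        // get throws if the held type differs
    assert(row[1].peek!int is null);     // peek returns null on a mismatch
    assert(describe(row[2]) == "string");

    // Variant is the open-ended version: it accepts any type.
    Variant anything = 1;
    anything = "now a string";
    assert(anything.type == typeid(string));
}
```

Algebraic keeps the set of types closed and checkable, while Variant trades that away for full flexibility at run time.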

Anyway, given that Sparrow is still in its early stages, if you actually want to get some work done, D might be a better fit.

On Wednesday, 7 September 2016 at 20:52:26 UTC, data pulverizer wrote:
>
> p.s. it goes beyond just tables, ... having dynamic capability in a static compiled language really does take computing to a different place indeed.

There are some dynamic capabilities in D, such as variant/algebraic and Adam Ruppe's jsvar. I only wonder if you would lose performance if you wanted something fully dynamic. A static approach is a good starting place.
September 07, 2016
On Wednesday, 7 September 2016 at 20:29:42 UTC, jmh530 wrote:
> Thanks for the reply. It looks like an interesting idea. You might consider adding this (or a modified version) to a README in the range subfolder.

Fuck it, I took an hour to document the most significant modules.

https://github.com/pineapplemachine/mach.d/tree/master/mach/range

September 07, 2016
On Wednesday, 7 September 2016 at 21:25:30 UTC, jmh530 wrote:
> Consider a potential use case. You have an existing data frame and you want to add a column of data to it that has a different type than the existing frame. I imagine the function call would look something like:
> auto newFrame = oldFrame.addCol(newData);

Yes, but from a usability point of view this would be very poor, since it forces the user to create a new variable each time they modify a table. I am aware that databases do this, but it is hidden away.

> ... I only wonder if you would lose performance if you wanted something fully dynamic. A static approach is a good starting place.

Yes, you would, which is why I see the hyper-meta route as the potential solution to this issue.


September 07, 2016
On Wednesday, 7 September 2016 at 20:57:15 UTC, bachmeier wrote:
>> I too come from the R world, and for some time I have been playing the game of flitting between R and C++, using C++ (through Rcpp) to speed up slow things in R. I have been looking for a better solution.
>
> What are you doing with Rcpp that you can't do with D?

Sorry, I'll correct myself again! Because R is a dynamic programming language, you can do things in it that you cannot do in D; however, they would be very inefficient. Hyper-meta-programming takes this barrier away.
September 07, 2016
On Wednesday, 7 September 2016 at 22:11:05 UTC, data pulverizer wrote:
> On Wednesday, 7 September 2016 at 20:57:15 UTC, bachmeier wrote:
>>> I too come from the R world, and for some time I have been playing the game of flitting between R and C++, using C++ (through Rcpp) to speed up slow things in R. I have been looking for a better solution.
>>
>> What are you doing with Rcpp that you can't do with D?
>
> Sorry, I'll correct myself again! Because R is a dynamic programming language, you can do things in it that you cannot do in D; however, they would be very inefficient. Hyper-meta-programming takes this barrier away.

I meant use a combination of R + D rather than R + C++. Any bottlenecks can be handled in D as easily as C++. However, if you want to go beyond what you can do with Rcpp, it's a different story.
September 07, 2016
On Wednesday, 7 September 2016 at 21:33:25 UTC, pineapple wrote:
>
> Fuck it, I took an hour to document the most significant modules.
>
> https://github.com/pineapplemachine/mach.d/tree/master/mach/range

Looks like a step in the right direction!
September 07, 2016
On Wednesday, 7 September 2016 at 21:41:20 UTC, data pulverizer wrote:
>
> Yes, but from a usability point of view this would be very poor, since it forces the user to create a new variable each time they modify a table. I am aware that databases do this, but it is hidden away.
>

To be fair, you can still mutate values within the table. In this approach, it's only appending new columns (or inserting them or something) that requires a new variable. It shouldn't be an issue for adding rows, assuming the underlying table is made from slices. It might be possible to do this without creating a variable, but I haven't thought about it that carefully.

Moreover, if you're working with slices, then it's a reference type. This means that the new variable is not a copy of the old. It shouldn't take up much space.
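
A small sketch of what that means in practice, assuming columns are plain D slices. The helper sharesStorage is a hypothetical name used only for this example:

```d
// Hypothetical helper: true when both slices start at the same memory.
bool sharesStorage(T)(const(T)[] a, const(T)[] b)
{
    return a.ptr is b.ptr;
}

void main()
{
    int[] column = [1, 2, 3];
    int[] view = column;          // copies only the pointer and the length

    assert(sharesStorage(column, view));

    view[0] = 99;                 // writes through to the shared elements
    assert(column[0] == 99);

    // Caveat: appending may reallocate, and it never changes the length
    // seen through the other slice.
    view ~= 4;
    assert(view.length == 4);
    assert(column.length == 3);
}
```

So a new frame variable that reuses the old column slices costs a pointer and a length per column rather than a copy of the data.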