Thread overview
What D Needs…
Jun 21, 2015
Russel Winder
Jun 22, 2015
Laeeth Isharc
Jun 26, 2015
Russel Winder
Jun 26, 2015
John Colvin
Jun 26, 2015
jmh530
Jun 27, 2015
stax76
June 21, 2015
Contributing to the "What D needs to get traction" debate ongoing in various threads, a bit of feedback from the PyData London 2015 day yesterday (I couldn't get there Friday or today).

Data science folk use Python because of NumPy/SciPy/Matplotlib/Pandas. And IPython (soon to be Jupyter). Julia is on the radar, but…

NumPy is actually relatively easy to crack (it is just an n-dimensional array type with algorithms), which means most of SciPy is straightforward (it just adds stuff on NumPy). Matplotlib cannot be competed against, so D needs to ensure it can very trivially interwork with Python and Matplotlib. C-linkage and CFFI attack much of this; PyD attacks much of the rest. This leaves Pandas (which is about time series and n-dimensional equivalents) and Jupyter (which is about creating Markdown or LaTeX documents with embedded executable code fragments).
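
To make the "just an n-dimensional array type" point concrete, here is a minimal sketch in pure Python of the core idea: a flat buffer plus a shape and strides. The class name `NDView` and its methods are hypothetical, chosen for illustration only; this is not NumPy's implementation, and it ignores dtypes, slicing and broadcasting.

```python
from itertools import product


class NDView:
    """Toy n-dimensional view over a flat list (hypothetical, for illustration)."""

    def __init__(self, data, shape, strides=None, offset=0):
        self.data = data
        self.shape = tuple(shape)
        if strides is None:
            # Row-major layout: the last axis is contiguous in the buffer.
            computed, step = [], 1
            for extent in reversed(self.shape):
                computed.append(step)
                step *= extent
            strides = reversed(computed)
        self.strides = tuple(strides)
        self.offset = offset

    def __getitem__(self, index):
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("expected one index per axis")
        for i, extent in zip(index, self.shape):
            if not 0 <= i < extent:
                raise IndexError("index out of range")
        # Map the n-dimensional index to a position in the flat buffer.
        pos = self.offset + sum(i * s for i, s in zip(index, self.strides))
        return self.data[pos]

    def transpose(self):
        # No copy: reversing shape and strides reinterprets the same buffer.
        return NDView(self.data, self.shape[::-1], self.strides[::-1], self.offset)

    def sum(self):
        # An "algorithm" is then just a loop over every valid index.
        return sum(self[idx] for idx in product(*map(range, self.shape)))


a = NDView(list(range(6)), (2, 3))  # logically [[0, 1, 2], [3, 4, 5]]
t = a.transpose()                   # logically [[0, 3], [1, 4], [2, 5]]
```

The transpose shares the underlying list with the original, which is the property that makes views cheap; the hard part of a real library is making the loops fast, not the data structure.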

If D had a library that attacked the capabilities offered by Pandas and could be a language usable in Jupyter, there is an angle for serious usage as long as D performs orders of magnitude faster than NumPy and faster than Cython code.

At the heart of all this is a review of std.parallelism to make sure we can get better performance than we currently do.

-- 
Russel. ============================================================================= Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net 41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder


June 22, 2015
On Sunday, 21 June 2015 at 16:17:57 UTC, Russel Winder wrote:
> Contributing to the "What D needs to get traction" debate ongoing in various threads, a bit of feedback from the PyData London 2015 day yesterday (I couldn't get there Friday or today).
>
> Data science folk use Python because of NumPy/SciPy/Matplotlib/Pandas. And IPython (soon to be Jupyter). Julia is on the radar, but…
>
> NumPy is actually relatively easy to crack (it is just an n-dimensional array type with algorithms), which means most of SciPy is straightforward (it just adds stuff on NumPy). Matplotlib cannot be competed against so D needs to ensure it can very trivially interwork with Python and Matplotlib. C-linkage and CFFI attacks much of this, PyD attack much of the rest. This leaves Pandas (which is about time series and n-dimensional equivalents) and Jupyter (which is about creating Markdown or LaTeX documents with embedded executable code fragments).
>
> If D had a library that attacked the capabilities offered by Pandas and could be a language usable in Jupyter, there is an angle for serious usage as long as D performs orders of magnitude faster than NumPy and faster than Cython code.
>
> At the heart of all this is a review of std.parallelism to make sure we
> can get better performance than we currently do.


Thanks for the colour, Russel.

I agree about NumPy and Pandas - the foundations are not so hard to replicate (but better!).  John Colvin and Ilya seem to be working on this now (and Vlad Levenfeld's stuff too).

I don't know about matplotlib.  It's pretty easy to use D to chart using it, but I didn't find it the friendliest library for what I wanted to do.  And bokeh is nice for interactivity (which is easy to talk to via python, but wouldn't be hard to write a D wrapper for - something I made a start on - since it is only object representation, and no real hard work on the server side).

Is matplotlib better than mathgl?  (I don't have enough experience of either to have a view).

But D is a language usable in Jupyter - I have been playing with it for a few days now.  Main thing missing for it to be very usable is seeing the compiler output in a pretty manner (well, actually just making it visible, would be a start) and making a nice way to be able to use dub with PyD/PyDmagic.

If you review std.parallelism, would it be worth adding fork/processes there, as it seems that for some purposes they may be better than threading?


Laeeth.
June 26, 2015
On Mon, 2015-06-22 at 02:40 +0000, Laeeth Isharc via Digitalmars-d wrote:
> 
[…]
> I agree about NumPy and Pandas - the foundations are not so hard to replicate (but better!)  John Colvin and Ilya seem to be working on this now (and Vlad Levenfeld's stuff too).

I'll see if I can check this out.

> I don't know about matplotlib.  It's pretty easy to use D to chart using it, but I didn't find it the friendliest library for what I wanted to do.  And bokeh is nice for interactivity (which is easy to talk to via python, but wouldn't be hard to write a D wrapper for - something I made a start on - since it is only object representation, and no real hard work on the server side).

Bokeh is also getting traction in the data science world, but matplotlib remains the major player.

> Is matplotlib better than mathgl?  (I don't have enough experience of either to have a view).

I haven't tried mathgl. It is now on my agenda.

> But D is a language usable in Jupyter - I have been playing with it for a few days now.  Main thing missing for it to be very usable is seeing the compiler output in a pretty manner (well, actually just making it visible, would be a start) and making a nice way to be able to use dub with PyD/PyDmagic.

I wonder if Buck is a better route. Facebook use this and it has D support.

I suspect anything to do with PyD will have to go via setuptools.

> If you review std.parallelism, would it be worth adding fork/processes there as seems like for some purposes that may be better than threading?

I think I may not be doing that review now as I have started on trying to get Chapel working well with Python 3. C++, D, Rust, Go are millennia behind Chapel in terms of managing local and cluster parallelism: Chapel is a PGAS language so handles clusters of multi-multicore processors in a single program.



June 26, 2015
On Friday, 26 June 2015 at 09:12:04 UTC, Russel Winder wrote:
> On Mon, 2015-06-22 at 02:40 +0000, Laeeth Isharc via Digitalmars-d wrote:
>> [...]
> […]
>> [...]
>
> I'll see if I can check this out.

n-dimensional slices: https://github.com/D-Programming-Language/phobos/pull/3397

DlangScience: https://dlangscience.github.io/

Early days, limited developer time, but things are happening.

See also https://github.com/DlangScience and http://
>> [...]
>
> Bokeh is also getting traction in the data science world, but matplotlib remains the major player.
>
>> [...]
>
> I haven't tried mathgl. It is now on my agenda.

http://vispy.org/ is also possibly interesting

>> [...]
>
> I wonder if Buck is a better route. Facebook use this and it has D support.
>
> I suspect anything to do with PyD will have to go via setuptools.

The problems with compiler output are just because of my bad code, not anything more fundamental.

>> [...]
>
> I think I may not be doing that review now as I have started on trying to get Chapel working well with Python 3. C++, D, Rust, Go are millennia behind Chapel in terms of managing local and cluster parallelism: Chapel is a PGAS language so handles clusters of multi-multicore processors in a single program.
June 26, 2015
On Friday, 26 June 2015 at 09:12:04 UTC, Russel Winder wrote:
>
> I think I may not be doing that review now as I have started on trying to get Chapel working well with Python 3. C++, D, Rust, Go are millennia behind Chapel in terms of managing local and cluster parallelism: Chapel is a PGAS language so handles clusters of multi-multicore processors in a single program.


I'm not really that familiar with Chapel (I read your thread a few weeks ago here and a simple tutorial but that's it). What do you think it does better?
June 27, 2015
What prevents me personally most from using new or alternative languages is probably lack of 100% complete IntelliSense.