Pandas like features
October 23, 2020
As a researcher in bioinformatics, I use Python, NumPy, pandas, and SciPy a lot. But I am fed up with the slowness of Python, even with cpython code, thanks to the GIL and unoptimized tail recursion.

So I really think that D could play a big role in this field with Mir and dcompute.

1/ What is the state of Magpie, which was a GSoC 2019 project:
 - Mir Data Analysis and Processing Library

2/ Is scientific computing a field that the D language wants to grow in?

Thanks

Best regards
October 23, 2020
On Friday, 23 October 2020 at 19:31:08 UTC, bioinfornatics wrote:
> As a researcher in bioinformatics, I use Python, NumPy, pandas, and SciPy a lot. But I am fed up with the slowness of Python, even with cpython code, thanks to the GIL and unoptimized tail recursion.
>
> So I really think that D could play a big role in this field with Mir and dcompute.
>
> 1/ What is the state of Magpie, which was a GSoC 2019 project:
>  - Mir Data Analysis and Processing Library
>
> 2/ Is scientific computing a field that the D language wants to grow in?
>
> Thanks
>
> Best regards

2. Yes!
October 23, 2020
On Friday, 23 October 2020 at 19:31:08 UTC, bioinfornatics wrote:
> As a researcher in bioinformatics, I use Python, NumPy, pandas, and SciPy a lot. But I am fed up with the slowness of Python, even with cpython code, thanks to the GIL and unoptimized tail recursion.
>
> So I really think that D could play a big role in this field with Mir and dcompute.
>
> 1/ What is the state of Magpie, which was a GSoC 2019 project:
>  - Mir Data Analysis and Processing Library
>
> 2/ Is scientific computing a field that the D language wants to grow in?
>

I think it's definitely the biggest area and opportunity for D to become more popular. The GIL, the lack of performance, and the huge memory bloat are such a pain in Python.

Probably the best way to move forward is to provide libmir as a NumPy/pandas *drop-in* replacement. (And I've suggested renaming Mir to NumD from a marketing/promotional perspective.)

For the time being, from the language/lib user's perspective, we can just use D/libmir to pre-process the data, and maybe save the result as csv/npz for further processing (by ... Python). Building or wrapping something like TensorFlow will, I think, need many more resources than the D community currently has, and I'm not sure it is worth the effort.
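
The Python end of such a handoff could look roughly like the sketch below. It assumes the D side wrote a plain CSV with a header row; the column names and values are made up for illustration, and an in-memory string stands in for the file.

```python
import csv
import io

# Hypothetical CSV, as a D preprocessing step might have written it.
# The column names and values are invented for this example.
preprocessed = io.StringIO(
    "gene,count\n"
    "BRCA1,12\n"
    "TP53,30\n"
)


def load_counts(handle):
    """Read the 'gene' and 'count' columns into a dict of ints."""
    return {row["gene"]: int(row["count"]) for row in csv.DictReader(handle)}


counts = load_counts(preprocessed)
print(counts)
```

With a real file, `open(path, newline="")` would replace the `io.StringIO` object.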


And from the language perspective, maybe D should adopt Python/NumPy's array indexing syntax, specifically:

1) use Python's arr[start:end], in addition to D's arr[start..end]

2) and also allow negative indices, instead of [$-1]. (The $ is an improvement over Java/C++'s arr[arr.length - 1], but it is still less convenient than Python's negative index syntax.)
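
For readers less familiar with the Python side, the two forms mentioned in 1) and 2) behave as in this small sketch on a built-in list. The D equivalents appear only in the comments.

```python
arr = [10, 20, 30, 40, 50]

# Half-open slice: Python's arr[1:3] corresponds to D's arr[1..3].
middle = arr[1:3]

# Last element: Python's arr[-1] corresponds to D's arr[$-1].
last = arr[-1]
also_last = arr[len(arr) - 1]  # the spelled-out form that $ and -1 both avoid

# Negative bounds also work inside slices: everything but the last two.
head = arr[:-2]

print(middle, last, also_last, head)
```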

That Python gained such popularity in scientific computing over the past ~10 years is not an accident; Guido actually made that happen by extending Python's syntax:

https://en.wikipedia.org/wiki/NumPy#History

"""
The Python programming language was not originally designed for numerical computing, but attracted the attention of the scientific and engineering community early on. In 1995 the special interest group (SIG) matrix-sig was founded with the aim of defining an array computing package; among its members was Python designer and maintainer Guido van Rossum, who extended Python's syntax (in particular the indexing syntax) to make array computing easier.[6]
"""

Maybe Walter should join one such SIG as well :-)

October 23, 2020
On Friday, 23 October 2020 at 22:38:39 UTC, mw wrote:
> And from the language perspective, maybe D should adopt Python/NumPy's array indexing syntax, specifically:
>
> 1) use Python's arr[start:end], in addition to D's arr[start..end]
>
> 2) and also allow negative indices, instead of [$-1]. (The $ is an improvement over Java/C++'s arr[arr.length - 1], but it is still less convenient than Python's negative index syntax.)
>
> That Python gained such popularity in scientific computing over the past ~10 years is not an accident; Guido actually made that happen by extending Python's syntax:
>
> https://en.wikipedia.org/wiki/NumPy#History
>
> """
> The Python programming language was not originally designed for numerical computing, but attracted the attention of the scientific and engineering community early on. In 1995 the special interest group (SIG) matrix-sig was founded with the aim of defining an array computing package; among its members was Python designer and maintainer Guido van Rossum, who extended Python's syntax (in particular the indexing syntax) to make array computing easier.[6]
> """
>
> Maybe Walter should join one such SIG as well :-)

Let me quote further from [6]:

"""
During these early years, there was considerable interaction between the standard and scientific Python communities. In fact, Guido van Rossum, Python's Benevolent Dictator For Life (BDFL), was an active member of the matrix-sig. This close interaction resulted in Python gaining new features and syntax specifically needed by the scientific Python community. While there were miscellaneous changes, such as the addition of complex numbers, many changes focused on providing a more succinct and easier to read syntax for array manipulation. For instance, the parenthesis around tuples were made optional so that array elements could be accessed through, for example, a[0,1] instead of a[(0,1)]. The slice syntax gained a step argument— a[::2] instead of just a[:], for example—and an ellipsis operator, which is useful when dealing with multidimensional data structures.
"""


[6] https://www.computer.org/csdl/magazine/cs/2011/02/mcs2011020009/13rRUx0xPMx


October 23, 2020
On Friday, 23 October 2020 at 19:31:08 UTC, bioinfornatics wrote:
> As a researcher in bioinformatics, I use Python, NumPy, pandas, and SciPy a lot. But I am fed up with the slowness of Python, even with cpython code, thanks to the GIL and unoptimized tail recursion.
>
> So I really think that D could play a big role in this field with Mir and dcompute.
>
> 1/ What is the state of Magpie, which was a GSoC 2019 project:
>  - Mir Data Analysis and Processing Library
>
> 2/ Is scientific computing a field that the D language wants to grow in?
>
> Thanks
>
> Best regards

There is some activity in this space:
https://code.dlang.org/?sort=updated&category=library.scientific

This project doesn't seem too active, but it was an earlier attempt:
http://dlangscience.github.io/
October 23, 2020
On Friday, 23 October 2020 at 22:38:39 UTC, mw wrote:
> 1) use Python's arr[start:end], in addition to D's arr[start..end]

BTW, Python has arr[start:end:step]; is this `step` possible in D now, and if so, how?
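
For reference, the Python behaviour being asked about can be sketched as below; it only illustrates what `step` does on the Python side. As a pointer rather than something stated in this thread, D's standard library has `std.range.stride`, which covers the positive-step case at the library level.

```python
arr = list(range(10))

# start=1, end=8 (exclusive), step=3 selects indices 1, 4 and 7.
every_third = arr[1:8:3]

# The same selection written as an explicit index loop.
by_hand = [arr[i] for i in range(1, 8, 3)]

# A negative step walks backwards; with no bounds it reverses the list.
reversed_arr = arr[::-1]

print(every_third, by_hand, reversed_arr)
```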
October 23, 2020
On Friday, 23 October 2020 at 22:48:16 UTC, bachmeier wrote:
> On Friday, 23 October 2020 at 19:31:08 UTC, bioinfornatics wrote:
>> As a researcher in bioinformatics, I use Python, NumPy, pandas, and SciPy a lot. But I am fed up with the slowness of Python, even with cpython code, thanks to the GIL and unoptimized tail recursion.
>>
>> So I really think that D could play a big role in this field with Mir and dcompute.
>>
>> 1/ What is the state of Magpie, which was a GSoC 2019 project:
>>  - Mir Data Analysis and Processing Library
>>
>> 2/ Is scientific computing a field that the D language wants to grow in?
>>
>> Thanks
>>
>> Best regards
>
> There is some activity in this space:
> https://code.dlang.org/?sort=updated&category=library.scientific
>
> This project doesn't seem too active, but it was an earlier attempt:
> http://dlangscience.github.io/

To me, a scientific library needs to be HPC-oriented, able
- to perform parallel computation on CPU or GPU
- to use a divide-and-conquer strategy to compute across multiple nodes
- to have dataframe features
- to have SciPy features
Such a library would be awesome, since Python's slowness matters more and more as data grows exponentially year after year.
October 23, 2020
On Friday, 23 October 2020 at 22:53:29 UTC, mw wrote:
> On Friday, 23 October 2020 at 22:38:39 UTC, mw wrote:
>> 1) use Python's arr[start:end], in addition to D's arr[start..end]
>
> BTW, Python has arr[start:end:step]; is this `step` possible in D now, and if so, how?


(Today I'm in the mood of a language historian :-)

Some of Guido's early discussions of Python array indexing:

Slices
https://mail.python.org/pipermail/matrix-sig/1996-April/000553.html

Pseudo Indices
https://mail.python.org/pipermail/matrix-sig/1996-January/000331.html

Mutli-dimensional indexing and other comments
https://mail.python.org/pipermail/matrix-sig/1995-October/000077.html

A problem with slicing
https://mail.python.org/pipermail/matrix-sig/1995-September/000042.html
October 24, 2020
On Fri, 2020-10-23 at 23:00 +0000, bioinfornatics via Digitalmars-d wrote: […]
> To me, a scientific library needs to be HPC-oriented, able
> - to perform parallel computation on CPU or GPU
> - to use a divide-and-conquer strategy to compute across
> multiple nodes
> - to have dataframe features
> - to have SciPy features
> Such a library would be awesome, since Python's slowness
> matters more and more as data grows exponentially year
> after year.

Acting somewhat as "Devil's Advocate"…

Why not just use Chapel (https://chapel-lang.org/)? It is a programming language designed to run in parallel contexts and has an awful lot of the stuff that other (invariably sequential, cf. C++, D, Rust) programming languages have trouble providing.

I am not sure Chapel has pandas style data frames explicitly but I'll bet something equivalent is already in there.

-- 
Russel.
October 24, 2020
On Saturday, 24 October 2020 at 09:29:46 UTC, Russel Winder wrote:
> On Fri, 2020-10-23 at 23:00 +0000, bioinfornatics via Digitalmars-d wrote: […]
>> To me, a scientific library needs to be HPC-oriented, able
>> - to perform parallel computation on CPU or GPU
>> - to use a divide-and-conquer strategy to compute across
>> multiple nodes
>> - to have dataframe features
>> - to have SciPy features
>> Such a library would be awesome, since Python's slowness
>> matters more and more as data grows exponentially year
>> after year.
>
> Acting somewhat as "Devil's Advocate"…
>
> Why not just use Chapel (https://chapel-lang.org/)? It is a programming language designed to run in parallel contexts and has an awful lot of the stuff that other (invariably sequential, cf. C++, D, Rust) programming languages have trouble providing.
>
> I am not sure Chapel has pandas style data frames explicitly but I'll bet something equivalent is already in there.

Maybe. Anyway, D has been searching for its killer app for years. I really think this area is perfect for D.
Data and business analysis is so important these days in science, the economy, and other fields that D could be a good choice.