January 18, 2022
On Tuesday, 18 January 2022 at 17:03:33 UTC, sfp wrote:

> You must also consider that the items that bioinfornatics listed are all somewhat contingent on each other. In isolation they aren't nearly as useful. You might have a numpy/scipy clone, but if you don't also have a matplotlib clone (or some other means of doing data visualization from D) their utility is a bit limited.

To my knowledge pyd still works. There's not much to be gained from rewriting a plotting library from scratch. It's not common that you're plotting 100 million times for each run of your program.
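To give a flavor (a rough sketch using pyd's embedded-interpreter API; treat the details as illustrative rather than canonical), handing plotting off to matplotlib from D is on the order of:

```
import pyd.embedded;

// Start the embedded Python interpreter once, at load time.
shared static this() {
    py_init();
}

void main() {
    // Let matplotlib do the plotting instead of rewriting it in D.
    py_stmts("
import matplotlib.pyplot as plt
plt.plot([1, 2, 3], [1, 4, 9])
plt.savefig('plot.png')
");
}
```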

I see too much NIH syndrome here. If you can call another language, all you need to do is write convenience wrappers on top of the many thousands of hours of work done in that language. You can replace the pieces where it makes sense to do so. The goal of the D program is whatever analysis you're doing on top of those libraries, not the libraries themselves.

We call C libraries all the time. Nobody thinks that's a problem. A bunch of effort has gone into calling C++ libraries and there's tons of support for that effort. When it comes to calling any other language, even for things that don't require performance, there's no interest. The ability to interoperate with other languages is the number one reason I started using D and the main reason I still use it.

January 18, 2022
On Tue, Jan 18, 2022 at 08:28:52PM +0000, bachmeier via Digitalmars-d wrote:
> On Tuesday, 18 January 2022 at 17:03:33 UTC, sfp wrote:
> 
> > You must also consider that the items that bioinfornatics listed are all somewhat contingent on each other. In isolation they aren't nearly as useful. You might have a numpy/scipy clone, but if you don't also have a matplotlib clone (or some other means of doing data visualization from D) their utility is a bit limited.
> 
> To my knowledge pyd still works. There's not much to be gained from rewriting a plotting library from scratch. It's not common that you're plotting 100 million times for each run of your program.
>
> I see too much NIH syndrome here. If you can call another language, all you need to do is write convenience wrappers on top of the many thousands of hours of work done in that language. You can replace the pieces where it makes sense to do so. The goal of the D program is whatever analysis you're doing on top of those libraries, not the libraries themselves.
[...]

+1.  Why do we need to reinvent numpy/scipy? One of the advantages conferred by D's metaprogramming capabilities is easier integration with other languages.  Adam Ruppe's jni.d is one prime example of how metaprogramming can abstract away the nasty amounts of boilerplate you're otherwise forced to write when interfacing with Java via JNI. D's C ABI compatibility also means you can leverage the tons of C libraries out there right now, instead of waiting for somebody to reinvent the same libraries in D years down the road.  D's capabilities make it very amenable to being a "glue" language for interfacing with other languages.
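For example, calling into zlib from D is nothing more than transcribing the C prototype (druntime even ships this particular binding as etc.c.zlib):

```
import core.stdc.config : c_ulong;

// The C prototype, transcribed: uLong crc32(uLong, const Bytef*, uInt).
extern(C) c_ulong crc32(c_ulong crc, const(ubyte)* buf, uint len);

void main() {
    auto data = cast(const(ubyte)[]) "hello, world";
    // crc32(0, null, 0) yields the initial CRC value, per zlib convention.
    auto sum = crc32(crc32(0, null, 0), data.ptr, cast(uint) data.length);
    // Build with something like: dmd app.d -L-lz
}
```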


T

-- 
Caffeine underflow. Brain dumped.
January 18, 2022
On Tuesday, 18 January 2022 at 21:22:11 UTC, H. S. Teoh wrote:
> On Tue, Jan 18, 2022 at 08:28:52PM +0000, bachmeier via Digitalmars-d wrote:
>> On Tuesday, 18 January 2022 at 17:03:33 UTC, sfp wrote:
>> 
>> > You must also consider that the items that bioinfornatics listed are all somewhat contingent on each other. In isolation they aren't nearly as useful. You might have a numpy/scipy clone, but if you don't also have a matplotlib clone (or some other means of doing data visualization from D) their utility is a bit limited.
>> 
>> To my knowledge pyd still works. There's not much to be gained from rewriting a plotting library from scratch. It's not common that you're plotting 100 million times for each run of your program.
>>
>> I see too much NIH syndrome here. If you can call another language, all you need to do is write convenience wrappers on top of the many thousands of hours of work done in that language. You can replace the pieces where it makes sense to do so. The goal of the D program is whatever analysis you're doing on top of those libraries, not the libraries themselves.
> [...]
>
> +1.  Why do we need to reinvent numpy/scipy? One of the advantages conferred by D's metaprogramming capabilities is easier integration with other languages.  Adam Ruppe's jni.d is one prime example of how metaprogramming can abstract away the nasty amounts of boilerplate you're otherwise forced to write when interfacing with Java via JNI. D's C ABI compatibility also means you can leverage the tons of C libraries out there right now, instead of waiting for somebody to reinvent the same libraries in D years down the road.  D's capabilities make it very amenable to being a "glue" language for interfacing with other languages.

The next release of my embedr library (which I've been able to do now that my work life is finally returning to normal) will make it trivial to call D functions from R. What I mean by that is that you write a file of D functions and by the magic of metaprogramming, you don't need to write any boilerplate at all.

Example:

```
import mir.random;
import mir.random.variable;

RVector rngexample(int n) {
	auto gen = Random(unpredictableSeed);
	auto rv = uniformVar(-10, 10); // [-10, 10]
	auto result = RVector(n);
	foreach(ii; 0..n) {
		result[ii] = rv(gen);
	}
	return result;
}
mixin(createRFunction!rngexample);
```

The only way you can do better is if someone else writes the program for you. But then it doesn't make much difference which language is used.
January 18, 2022
On Tuesday, 18 January 2022 at 22:00:42 UTC, bachmeier wrote:
> On Tuesday, 18 January 2022 at 21:22:11 UTC, H. S. Teoh wrote:
>> [...]
>
> The next release of my embedr library (which I've been able to do now that my work life is finally returning to normal) will make it trivial to call D functions from R. What I mean by that is that you write a file of D functions and by the magic of metaprogramming, you don't need to write any boilerplate at all.
>
> Example:
>
> ```
> import mir.random;
> import mir.random.variable;
>
> RVector rngexample(int n) {
> 	auto gen = Random(unpredictableSeed);
> 	auto rv = uniformVar(-10, 10); // [-10, 10]
> 	auto result = RVector(n);
> 	foreach(ii; 0..n) {
> 		result[ii] = rv(gen);
> 	}
> 	return result;
> }
> mixin(createRFunction!rngexample);
> ```
>
> The only way you can do better is if someone else writes the program for you. But then it doesn't make much difference which language is used.

This is all news to me. It's a shame these libraries and their capabilities aren't advertised more prominently.

How hard would it be to automatically wrap a D library and expose it to Python, MATLAB, and Julia simultaneously? Say the library has just a simple C-style API, or a very simple single-inheritance OO hierarchy with no templates.
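By a simple C-style API I have in mind nothing fancier than a few extern(C) functions over an opaque handle (the names below are purely hypothetical):

```
// Hypothetical library surface: Python's ctypes, MATLAB's loadlibrary,
// and Julia's ccall can all consume plain C functions like these.
struct Solver; // opaque to the caller

extern(C) Solver* solver_create(size_t n);
extern(C) int solver_step(Solver* s, double dt);
extern(C) void solver_destroy(Solver* s);
```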
January 18, 2022
On Tuesday, 18 January 2022 at 21:22:11 UTC, H. S. Teoh wrote:
> [snip]
>
> +1.  Why do we need to reinvent numpy/scipy? One of the advantages conferred by D's metaprogramming capabilities is easier integration with other languages.  Adam Ruppe's jni.d is one prime example of how metaprogramming can abstract away the nasty amounts of boilerplate you're otherwise forced to write when interfacing with Java via JNI. D's C ABI compatibility also means you can leverage the tons of C libraries out there right now, instead of waiting for somebody to reinvent the same libraries in D years down the road.  D's capabilities make it very amenable to being a "glue" language for interfacing with other languages.
>
>
> T

I'm all for leveraging C libraries in D, but if you have code that needs to be performant then you may run into limitations with Python. If you're building one chart with matplotlib, it's probably fine. If you have some D code that takes longer to run (e.g. a simulation that deals with a lot of data and many threads), then you might be a little more careful about what Python code to incorporate and how. I don't know the technical details needed to get the best performance in that situation (are there benchmarks?), but I have seen some work on using the Python buffer protocol when calling D functions from Python.
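As a sketch of the cheap path (hypothetical function; I'm assuming the pointer and length get handed over from Python, e.g. via ctypes or the buffer protocol), the D side can wrap the NumPy buffer as a slice without copying:

```
// Receives the NumPy array's data pointer and length and sums it.
// The slice is a view over the Python-owned memory; nothing is copied.
extern(C) double sumArray(const(double)* data, size_t n) {
    double total = 0;
    foreach (x; data[0 .. n])
        total += x;
    return total;
}
```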

In addition, the Python code might itself be calling the same C libraries that D can (e.g. LAPACK), though potentially with different defaults that trade accuracy for performance, which can make Python faster than D in some cases. In that case, Python is also a glue language. Taking the same approach in D can simplify your code base a little, and you don't need to worry about any additional overhead or the limitations the GIL might introduce.
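For instance (a sketch; you'd link against your BLAS of choice), calling the same Fortran routine NumPy ultimately reaches is one extern(C) declaration away:

```
// Fortran BLAS ddot: every argument is passed by reference.
extern(C) double ddot_(const int* n, const double* x, const int* incx,
                       const double* y, const int* incy);

double dot(const(double)[] x, const(double)[] y) {
    int n = cast(int) x.length;
    int inc = 1;
    return ddot_(&n, x.ptr, &inc, y.ptr, &inc);
}
```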

Again, not something you need to worry about when performance is not a big issue.
January 18, 2022
On Tuesday, 18 January 2022 at 20:28:52 UTC, bachmeier wrote:

> There's not much to be gained from rewriting a plotting library from scratch. It's not common that you're plotting 100 million times for each run of your program.

It is not uncommon to interact with plots that are too big for matplotlib to handle well. The Python visualization solutions are very primitive. Having something better than numpy+matplotlib is obviously an advantage, a selling point for other offerings.

Having the exact same thing? Not so much.

> You can replace the pieces where it makes sense to do so. The goal of the D program is whatever analysis you're doing on top of those libraries, not the libraries themselves.

You don't get a unified API with good usability by collecting a hodgepodge of libraries. You also don't get any performance or quality advantage over other solutions. Borrowing is OK; replicating APIs? Probably not. What, then, is the argument for not using the original language directly?

The reason for moving to a new language (like Julia or Python) is that you get something that better fits what you want to do, and that transitioning provides a smoother workflow in the end.

If everything you achieve by switching is replacing one set of trade-offs with another, then you are generally better off using the more mainstream, supported, and well-documented alternative.

So where do you start? With a niche, e.g. signal processing or some other "mainstream" niche.

> We call C libraries all the time. Nobody thinks that's a problem. A bunch of effort has gone into calling C++ libraries and there's tons of support for that effort.

So, libraries are often written in C precisely in order to support other languages, and those are structured in a very basic way as far as C code goes. C-only libraries are sometimes not as easy to interface with, as they rely heavily on macros, dedicated runtimes, or specifics of the underlying platform.

I also think the C++ interop D offers is a bit clunky. It is more suitable for people who write C-like C++ than people who try to write idiomatic C++. D has to align itself more with C++ semantics for this to be a good selling point.

I am somewhat impressed that Python has many solutions for binding to C++ (e.g. Binder), even though Python is semantically a very poor fit for C++… D's potential strength here is not so much in being able to bind to C++ in a limited fashion (like Python), but being able to port C++ to D and improve on it. To get there you need feature parity, which is what this thread is about.

We now know that C++ will eventually get more powerful parallel-computing abilities built into the language, supported by the hardware manufacturer Nvidia for their hardware (nvc++). That said, Apple has shown little interest in making their version of C++ work well with parallel computing, and the C++ standard library is not very good for numeric operations. For example, the SIMD code I wrote for an inner product (using generic LLVM SIMD) turned out to be three times faster than the generic C++ standard-library solution.
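For reference, the shape of that kind of code (a minimal core.simd sketch, not the code I benchmarked; it assumes x86, 16-byte-aligned data, and a length divisible by 4):

```
import core.simd;

float dot(const(float)[] a, const(float)[] b) {
    float4 acc = 0;
    foreach (i; 0 .. a.length / 4) {
        // Aligned vector loads, four lanes multiplied per iteration.
        auto va = *cast(const(float4)*)(a.ptr + 4 * i);
        auto vb = *cast(const(float4)*)(b.ptr + 4 * i);
        acc += va * vb;
    }
    // Horizontal sum of the four lanes.
    const lanes = acc.array;
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```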

Yet, we see "change is coming" written on the horizon, I think.

So either D has to move in a different direction rather than competing head-to-head with C++, or one has to be more strategic in how the development process is structured. Or, well, just more strategic in general.

January 18, 2022
On Tuesday, 18 January 2022 at 22:21:40 UTC, Ola Fosheim Grøstad wrote:

> On Tuesday, 18 January 2022 at 20:28:52 UTC, bachmeier wrote:
>
>> There's not much to be gained from rewriting a plotting library from scratch. It's not common that you're plotting 100 million times for each run of your program.
>
> It is not uncommon to interact with plots that are too big for matplotlib to handle well. The Python visualization solutions are very primitive. Having something better than numpy+matplotlib is obviously an advantage, a selling point for other offerings.

To add to this: matplotlib has many pain points. It has an inconsistent API, it is very slow, and its 3D plotting is hacked together (and very slow). Making animations isn't straightforward (and is also very slow). Making just several hundred plots typically takes several minutes (at least); it should take <1s. That said, matplotlib is very powerful and handles essentially all the important use cases. There is definitely room for improvement. If someone with NIH syndrome came along and wrote a plotting library that actually improved on matplotlib significantly, it would be to D's benefit, especially since it would be trivial to consume from other languages that would be interested in using it.

January 19, 2022
On Tuesday, 18 January 2022 at 22:21:40 UTC, Ola Fosheim Grøstad wrote:

> It is not uncommon to interact with plots that are too big for matplotlib to handle well. The Python visualization solutions are very primitive. Having something better than numpy+matplotlib is obviously an advantage, a selling point for other offerings.

Wow, this is the first time I've read that matplotlib is inadequate. Can you please give an example of a visualisation library (in any language) that you consider good?

January 19, 2022
On Tuesday, 18 January 2022 at 22:21:40 UTC, Ola Fosheim Grøstad wrote:
> ...D's potential strength here is not so much in being able to bind to C++ in a limited fashion (like Python), but being able to port C++ to D and improve on it. To get there you need feature parity, which is what this thread is about.

Not just 'feature' parity, but 'performance' parity too:

"Broad adoption of high-level languages by the scientific community is unlikely without compiler optimizations to mitigate the performance penalties these languages abstractions impose." - https://www.cs.rice.edu/~vs3/PDF/Joyner-MainThesis.pdf

January 19, 2022
On Wednesday, 19 January 2022 at 04:45:20 UTC, forkit wrote:
> On Tuesday, 18 January 2022 at 22:21:40 UTC, Ola Fosheim Grøstad wrote:
>> ...D's potential strength here is not so much in being able to bind to C++ in a limited fashion (like Python), but being able to port C++ to D and improve on it. To get there you need feature parity, which is what this thread is about.
>
> Not just 'feature' parity, but 'performance' parity too:
>
> "Broad adoption of high-level languages by the scientific community is unlikely without compiler optimizations to mitigate the performance penalties these languages abstractions impose." - https://www.cs.rice.edu/~vs3/PDF/Joyner-MainThesis.pdf

That paper is from 2008. Meanwhile, in 2021:

https://www.hpcwire.com/off-the-wire/julia-joins-petaflop-club//

This is what D has to compete against, not only C++ with its existing SYCL/CUDA tooling and their ongoing integration into ISO C++.