D for project in computational chemistry (page 4)

On Thursday, 6 August 2015 at 08:11:49 UTC, Gerald Jansen wrote:
> Is the Dscience GitHub project an adequate platform? How can other people get involved? Is a dedicated discussion group needed? Can we develop a plan of some sort rather than just a scatter of individual efforts?

If a group does go ahead with this, it would be good to add a discussion group on this forum for it to maximize visibility and involvement.

On Thursday, 6 August 2015 at 08:11:49 UTC, Gerald Jansen wrote:
> On Wednesday, 5 August 2015 at 18:49:21 UTC, bachmeier wrote:
>
>> Yes. The question is whether we can put together a group of developers to build the infrastructure, which is a lot more than just code. That means, in particular, good documentation and using it for our own projects.
>
> Right on! I would be willing to help with documentation if there were a concerted effort in this direction. There have been a number of failed individual efforts over the years. So how can a group effort be promoted?
>
> Is the Dscience GitHub project an adequate platform? How can other people get involved? Is a dedicated discussion group needed? Can we develop a plan of some sort rather than just a scatter of individual efforts?

Yes, come join https://github.com/DlangScience. Ilya and I have a plan of sorts, but it needs formally writing down. We have been using a private Gitter room for discussion, which has been OK so far but likely won't scale. Anyone serious about getting involved, drop a message here: https://gitter.im/DlangScience/public and we can start to build a picture of what expertise we have and what we're missing. A proper public forum could be great, but I don't personally have time to set up something like that at the moment.

Good afternoon, gentlemen,

I just want to describe my very limited experience. I have rewritten about half of my Python code in D, and it became 6 times faster. This is good news.

However, I was amazed by the performance of D vs Python for the following simple nested loops (see below). D was faster by two orders of magnitude!

Bearing in mind that Python is heavily used in computational chemistry/bioinformatics, I am sure D can be a good option in this field. In the modern approach to computational software, Python is used as a glue language and the number-crunching parts are usually written in Fortran or C/C++. Apparently, with D one language can be used to write the entire code. Please also look at this article:

http://www.worldcomp-proceedings.com/proc/p2012/PDP3426.pdf

Also, I wonder about the results of this internship:

http://forum.dlang.org/post/laha9j$pc$1@digitalmars.com

With kind regards,
Yury

Python:

    #!/usr/bin/python
    import sys, string, os, glob, random
    from math import *

    a = 0

    l = 1000

    for i in range(l):
        for j in range(l):
            for m in range(l):
                a = a +i*i*0.7+j*j*0.8+m*m*0.9

    print a

D:

    import std.stdio;
    // command line argument
    import std.getopt;
    import std.string;
    import std.array;
    import std.conv;
    import std.math;

    // main program starts here
    void main(string[] args) {
        int l = 1000;
        double a = 0;
        for (auto i=0;i<l;i++){
            for (auto j=0;j<l;j++) {
                for (auto m=0;m<l;m++) {
                    a = a + i*i*0.7+j*j*0.8+m*m*0.9;
                }
            }
        }
        writeln(a);
    }

On 17/08/2015 1:11 a.m., Yura wrote:
> Good afternoon, gentlemen,
>
> just want to describe my very limited experience. I have re-written
> about half of my Python code into D. I got it faster by 6 times. This is
> good news.
>
> However, I was amazed by performance of D vs Python for following simple
> nested loops (see below). D was faster by two orders of magnitude!
>
> Bearing in mind that Python is really used in computational
> chemistry/bioinformatics, I am sure D can be a good option in this
> field. In the modern strategy for the computational software python is
> used as a glue language and the number crunching parts are usually
> written in Fortran or C/C++. Apparently, with D one language can be used
> to write the entire code. Please, also look at this article:
>
> http://www.worldcomp-proceedings.com/proc/p2012/PDP3426.pdf
>
> Also, I wonder about the results of this internship:
>
> http://forum.dlang.org/post/laha9j$pc$1@digitalmars.com
>
> With kind regards,
> Yury
>
>
> Python:
>
> #!/usr/bin/python
> import sys, string, os, glob, random
> from math import *
>
> a = 0
>
> l = 1000
>
> for i in range(l):
>     for j in range(l):
>         for m in range(l):
>             a = a +i*i*0.7+j*j*0.8+m*m*0.9
>
> print a
>
> D:
>
> import std.stdio;
> // command line argument
> import std.getopt;
> import std.string;
> import std.array;
> import std.conv;
> import std.math;
>
> // main program starts here
> void main(string[] args) {
>
>
> int l = 1000;
> double a = 0;
> for (auto i=0;i<l;i++){
>     for (auto j=0;j<l;j++) {
>         for (auto m=0;m<l;m++) {
>             a = a + i*i*0.7+j*j*0.8+m*m*0.9;
>         }
>
>     }
> }
> writeln(a);
> }

Any chance that, when you have the time and the content, you could write a research paper based on your use case? It would be amazing publicity, even more so if it got published!

Otherwise, we could always do with another user story :)

August 16, 2015

Re: D for project in computational chemistry

Posted by Idan Arye
in reply to Yura


On Sunday, 16 August 2015 at 13:11:12 UTC, Yura wrote:
> Good afternoon, gentlemen,
>
> just want to describe my very limited experience. I have re-written about half of my Python code into D. I got it faster by 6 times. This is good news.
>
> However, I was amazed by performance of D vs Python for following simple nested loops (see below). D was faster by two orders of magnitude!
>
> Bearing in mind that Python is really used in computational chemistry/bioinformatics, I am sure D can be a good option in this field. In the modern strategy for the computational software python is used as a glue language and the number crunching parts are usually written in Fortran or C/C++. Apparently, with D one language can be used to write the entire code. Please, also look at this article:
>
> http://www.worldcomp-proceedings.com/proc/p2012/PDP3426.pdf
>
> Also, I wonder about the results of this internship:
>
> http://forum.dlang.org/post/laha9j$pc$1@digitalmars.com
>
> With kind regards,
> Yury
>
>
> Python:
>
> #!/usr/bin/python
> import sys, string, os, glob, random
> from math import *
>
> a = 0
>
> l = 1000
>
> for i in range(l):
>         for j in range(l):
>                 for m in range(l):
>                         a = a +i*i*0.7+j*j*0.8+m*m*0.9
>
> print a
>
> D:
>
> import std.stdio;
> // command line argument
> import std.getopt;
> import std.string;
> import std.array;
> import std.conv;
> import std.math;
>
> // main program starts here
> void main(string[] args) {
>
>
> int l = 1000;
> double a = 0;
> for (auto i=0;i<l;i++){
>         for (auto j=0;j<l;j++) {
>                 for (auto m=0;m<l;m++) {
>                         a = a + i*i*0.7+j*j*0.8+m*m*0.9;
>                         }
>
>         }
> }
> writeln(a);
> }

Initially I thought the Python version was so slow because it uses `range` instead of `xrange`, but I tried both and they take about the same time, so I guess the Python JIT (or even the interpreter!) can optimize these allocations away.
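One way to check the `range` vs `xrange` observation is to time both styles directly. The sketch below is an assumption-laden stand-in, not the measurement from this post: it targets Python 3, where `range` is already lazy like Python 2's `xrange`, so `list(range(l))` is used to imitate the list-building `range` of Python 2. The function names and the reduced loop size are hypothetical.

```python
import timeit


def loop_lazy(l):
    """Triple loop over lazy range objects (comparable to Python 2's xrange)."""
    a = 0.0
    for i in range(l):
        for j in range(l):
            for m in range(l):
                a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9
    return a


def loop_list(l):
    """Same loop, but every level first builds a full list (comparable to Python 2's range)."""
    a = 0.0
    for i in list(range(l)):
        for j in list(range(l)):
            for m in list(range(l)):
                a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9
    return a


if __name__ == "__main__":
    l = 100  # smaller than the thread's 1000 so the comparison finishes quickly
    for fn in (loop_lazy, loop_list):
        seconds = timeit.timeit(lambda: fn(l), number=3) / 3
        print(f"{fn.__name__}: {seconds:.3f} s per run")
```

If the two timings come out close, the likely reading is that the arithmetic in the loop body dominates the cost of building the lists, rather than that the allocations are removed.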

BTW - if you want to iterate over a range of numbers in D, you can use a foreach loop:

    foreach (i; 0 .. l) {
        foreach (j; 0 .. l) {
            foreach (m; 0 .. l) {
                a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
            }

        }
    }

Or, to make it look more like the Python version, you can iterate over a range-returning function:

    import std.range : iota;
    foreach (i; iota(l)) {
        foreach (j; iota(l)) {
            foreach (m; iota(l)) {
                a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
            }

        }
    }

There are also functions for building ranges from other ranges:

    import std.algorithm : cartesianProduct;
    import std.range : iota;
    foreach (i, j, m; cartesianProduct(iota(l), iota(l), iota(l))) {
        a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
    }

Keep in mind, though, that using these functions, while making the code more readable (to those with some experience in D, at least), is bad for performance: for my first version I got about 5 seconds when building with DMD in debug mode, while for the last version I got 13 seconds when building with LDC in release mode.
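Before comparing timings of several loop variants, it may help to confirm that they all compute the same number. The sum in this benchmark separates: each index contributes its coefficient times `l * l` copies of the sum of squares from 0 to `l - 1`, so the total is `(0.7 + 0.8 + 0.9) * l * l * sum_sq`. The Python 3 sketch below uses hypothetical function names and is only meant as a reference value, which comes to about 7.988e14 for `l = 1000`.

```python
import math


def closed_form(l):
    """Value of the triple sum without visiting all l**3 index triples."""
    # Sum of k*k for k = 0 .. l-1, computed exactly with integers.
    sum_sq = (l - 1) * l * (2 * l - 1) // 6
    # Each of the three indices pairs with l*l combinations of the other two.
    return (0.7 + 0.8 + 0.9) * l * l * sum_sq


def brute_force(l):
    """Direct triple loop, as in the benchmark from this thread."""
    a = 0.0
    for i in range(l):
        for j in range(l):
            for m in range(l):
                a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9
    return a


if __name__ == "__main__":
    l = 50  # small enough for the brute-force loop to finish quickly
    # Compare with a relative tolerance: the loop accumulates rounding error.
    assert math.isclose(closed_form(l), brute_force(l), rel_tol=1e-9)
    print(closed_form(1000))
```

The comparison uses a relative tolerance because the looped version adds many floating-point terms in sequence, so the last digits can differ from the closed form.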

On Sunday, 16 August 2015 at 13:11:12 UTC, Yura wrote:
>
> Python:
>
> #!/usr/bin/python
> import sys, string, os, glob, random
> from math import *
>
> a = 0
>
> l = 1000
>
> for i in range(l):
>     for j in range(l):
>         for m in range(l):
>             a = a +i*i*0.7+j*j*0.8+m*m*0.9
>
> print a
>

While starting over with D might make a better framework going forward, it might be less work than you think to speed up your existing Python code base. Loops in Python are notoriously slow. The code you're using seems like a classic example of something that could be sped up. You could write the slow parts in C with Cython. Alternatively, you could play with Numba's @jit.
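As a rough illustration of the Numba suggestion, the loop can be moved into a function and decorated. This is a minimal sketch, assuming Python 3 and that the third-party `numba` package is installed; `njit` is Numba's shorthand for `@jit(nopython=True)`. The function name and the fallback decorator are hypothetical additions so the file still runs (uncompiled) without Numba. No timings are claimed here.

```python
try:
    from numba import njit  # third-party package, assumed to be installed
except ImportError:
    def njit(func):
        """Fallback: leave the function uncompiled if Numba is missing."""
        return func


@njit
def nested_sum(l):
    """The benchmark's triple loop, written as a function so it can be compiled."""
    a = 0.0
    for i in range(l):
        for j in range(l):
            for m in range(l):
                a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9
    return a


if __name__ == "__main__":
    # The first call includes compilation time when Numba is available.
    print(nested_sum(100))
```

When timing this, it is probably fairest to call the function once beforehand, so that the compilation step is not counted in the measurement.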

On Sunday, 16 August 2015 at 13:59:33 UTC, Idan Arye wrote:
> Initially I thought the Python version was so slow because it uses `range` instead of `xrange`, but I tried both and they take about the same time, so I guess the Python JIT (or even the interpreter!) can optimize these allocations away.
>
> BTW - if you want to iterate over a range of numbers in D, you can use a foreach loop:
>
>     foreach (i; 0 .. l) {
>         foreach (j; 0 .. l) {
>             foreach (m; 0 .. l) {
>                 a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
>             }
>
>         }
>     }
>
> Or, to make it look more like the Python version, you can iterate over a range-returning function:
>
>     import std.range : iota;
>     foreach (i; iota(l)) {
>         foreach (j; iota(l)) {
>             foreach (m; iota(l)) {
>                 a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
>             }
>
>         }
>     }
>
> There are also functions for building ranges from other ranges:
>
>     import std.algorithm : cartesianProduct;
>     import std.range : iota;
>     foreach (i, j, m; cartesianProduct(iota(l), iota(l), iota(l))) {
>         a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
>     }
>
> Keep in mind, though, that using these functions, while making the code more readable (to those with some experience in D, at least), is bad for performance: for my first version I got about 5 seconds when building with DMD in debug mode, while for the last version I got 13 seconds when building with LDC in release mode.

There is a new implementation of cartesianProduct that makes the performance difference disappear for me with LDC and DMD. It's not in LDC's Phobos yet, so I had to copy it manually, but hopefully it will be in the next release.
