On 2 June 2013 21:46, Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> wrote:
On 06/02/2013 08:33 AM, Manu wrote:
> Most of these guys are mathematicians and physicists first, and programmers second.

You've hit the nail on the head, but it's also a question of priorities.  It's
_essential_ that the maths or physics be understood and done right.

Well, this is another classic point actually. I've been asked by my friends at Cambridge to give their code a once-over for them on many occasions, and while I may not understand exactly what their code does, I can often spot boat-loads of simple functional errors. Basic programming bugs: out-by-ones, broken pointer logic, a clear lack of understanding of floating point, or logical structure that will clearly lead to incorrect/unexpected behaviour in edge cases.
And it blows my mind that they then run this code on their big sets of data, write some big analysis/conclusions, and present this statistical data in some journal somewhere, and are generally accepted as an authority and taken seriously!

*brain asplode*

I can tell you I usually offer more in the way of fixing basic logical errors than actually making it run faster ;)
And they don't come to me with everything, just the occasional thing that they have a hunch should probably be faster than it is.

I hope my experience there isn't too common, but I actually get the feeling it's more common than you'd like to think!
This is a crowd I'd actually love to promote D to! But the tools they need aren't all there yet...


It's essential that the programs correctly reflect that maths or physics.  It's
merely _desirable_ that the programs run as fast as possible, or be well
designed from a maintenance point of view, or any of the other things that
matter to trained software developers.  (In my day job I have to continually
force myself to _not_ refactor or optimize my code, even though I'd get a lot of
pleasure out of doing so, because it's working adequately and my work priority
is to get results out of it.)

That in turn leads to a hiring situation where the preference is to have
mathematicians or physicists who can program, rather than programmers who can
learn the maths.  It doesn't help that because of the way academic funding is
made available, the pay scales mean that it's not really possible to attract
top-level developers (unless they have some kind of keen moral desire to work on
academic research); in addition, you usually have to hire them as PhD students
or postdocs or so on (I've also seen master's students roped in to this end),
which obviously constrains the range of people that you can hire and the range
of skills that will be available, and also the degree of commitment these people
can put into long-term vision and maintenance of the codebase.

There's also a training problem -- in my experience, most physics undergraduates
are given a crash course in C++ in their first year and not much in the way of
real computer science or development training.  In my case as a maths
undergraduate the first opportunity to learn programming was in the 3rd year of
my degree course, and it was a crash course in a very narrow subset of C
dedicated towards numerical programming.  And if (like me) you then go on into
research, you largely have to self-teach, which can lead to some very
idiosyncratic approaches.

Yeah, this is an interesting point. These friends of mine all write C code, not even C++.
Why is that?
I guess it's promoted, because they're supposed to be into the whole 'HPC' thing, but C is really not a good language for doing maths!

I see stuff like this:
float ***cubicMatrix = (float***)malloc(sizeof(float**)*depth);
for(int z=0; z<depth; z++)
{
  cubicMatrix[z] = (float**)malloc(sizeof(float*)*height);
  for(int y=0; y<height; y++)
  {
    cubicMatrix[z][y] = (float*)malloc(sizeof(float)*width);
  }
}

Seriously, float***. Each 1d row is an individual allocation!
And then somewhere later on they want to iterate a column rather than a row, and get confused about the pointer arithmetic (well, maybe not precisely that, but you get the idea).
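
For contrast, here's a minimal sketch of the usual alternative: one contiguous allocation, with the 3D index computed by hand. All the names (Grid3, grid3_alloc, grid3_at, grid3_column_sum) are made up for illustration and aren't from anybody's real code, and the layout (x varying fastest) is an assumption. It skips overflow checking on the size product, so treat it as a sketch rather than something to paste into production.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type: a depth x height x width volume in one contiguous block. */
typedef struct {
    size_t depth, height, width;
    float *data;
} Grid3;

/* One allocation for the whole volume, zero-initialised.
   Returns 0 on success, -1 on failure. No overflow check on the product. */
static int grid3_alloc(Grid3 *g, size_t depth, size_t height, size_t width)
{
    g->depth = depth;
    g->height = height;
    g->width = width;
    g->data = calloc(depth * height * width, sizeof *g->data);
    return g->data != NULL ? 0 : -1;
}

static void grid3_free(Grid3 *g)
{
    free(g->data);
    g->data = NULL;
}

/* Pointer to element (z, y, x); x varies fastest in memory. */
static float *grid3_at(const Grid3 *g, size_t z, size_t y, size_t x)
{
    assert(z < g->depth && y < g->height && x < g->width);
    return &g->data[(z * g->height + y) * g->width + x];
}

/* Walk a column (fixed z and x, varying y): a stride of `width` elements. */
static float grid3_column_sum(const Grid3 *g, size_t z, size_t x)
{
    float sum = 0.0f;
    for (size_t y = 0; y < g->height; y++)
        sum += *grid3_at(g, z, y, x);
    return sum;
}
```

With this layout, iterating a column is just a fixed stride through one block, there is a single malloc/free pair to get right, and the bounds assert catches the out-by-one cases in debug builds.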


I hope that this will change, because programming is now an absolutely essential
part of just about every avenue of scientific research.  But as it stands, it's
a serious problem.