D 2 target architectures
January 17, 2011
Hi

I heard about D recently from a thread in the Gentoo Forums. My question is: if I'm mostly interested in shared-memory high-performance computing, do the radical new D 2 concurrency features help me at all? What I have in mind is automatic map/reduce-style programming. I'm going to buy new hardware providing the AVX instruction set, possibly a two-socket system, which should improve loop vectorization a lot. The way I imagine it working is via the "native" foreach and the Phobos 2 library features. Is this possible? Does Phobos 2 also use the three-level cache hierarchy in Sandy Bridge efficiently? I was considering D because I don't want to mess with low-level assembly; I'd rather use a high-level modern language. C++ is also getting there; in GCC 4.6 the loop vectorization is already good.

I considered this competitor too: http://www.scala-lang.org/node/8579. But they still have a lot of work to do. D 2 might get there first and be faster, since there's no slow VM. Do you think the great minds among the Phobos developers will easily beat Scala's upcoming parallel features? Scala hasn't even implemented non-null references or pure yet. You're probably using Google's SoC money to build Phobos over the next two summers, and D has many more volunteers from Amazon and Facebook contributing code for free.
January 18, 2011
> I heard about D recently from a thread in the Gentoo Forums. My question
> is: if I'm mostly interested in shared-memory high-performance
> computing, do the radical new D 2 concurrency features help me at all?
> What I have in mind is automatic map/reduce-style programming.

Sounds like http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html will be your friend.
It's currently in the review stage; see the thread "David Simcha's std.parallelism".
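
To give a flavor of the intended usage (a minimal sketch following the documentation at that link; the module is still under review, so names could change), a parallel foreach and a parallel reduce look roughly like this:

import std.math : log;
import std.parallelism : taskPool;
import std.stdio : writeln;

void main()
{
    // Parallel foreach: the loop bodies are distributed across a pool of
    // worker threads (one per core by default), here in blocks of 100 elements.
    auto logs = new double[1_000_000];
    foreach (i, ref elem; taskPool.parallel(logs, 100))
        elem = log(i + 1.0);

    // Parallel reduce: each worker sums its chunk, then the partial sums
    // are combined.
    auto total = taskPool.reduce!"a + b"(logs);
    writeln(total);
}

So the map/reduce style is expressed with ordinary foreach plus library calls, and the thread pool is managed for you.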


> I'm going to buy new hardware providing the AVX instruction set, possibly a two-socket system, which should improve loop vectorization a lot. The way I imagine it working is via the "native" foreach and the Phobos 2 library features. Is this possible? Does Phobos 2 also use the three-level cache hierarchy in Sandy Bridge efficiently? I was considering D because I don't want to mess with low-level assembly; I'd rather use a high-level modern language. C++ is also getting there; in GCC 4.6 the loop vectorization is already good.

Well, array (or "vector") operations are currently optimized to leverage SSE, but that's not a compiler optimization; it's hand-tuned code.
Apart from that there are GDC (using the GCC backend) and LDC (using LLVM as the backend), which can potentially optimize a lot.
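
For reference, those array operations use D's slice syntax; here's a small sketch (the saxpy name and signature are just for illustration, and whether the runtime actually uses SSE for a given element type is an implementation detail):

void saxpy(float alpha, const(float)[] x, float[] y)
{
    assert(x.length == y.length);
    // Array operation: y[i] += x[i] * alpha for every i, written per-slice
    // rather than per-element. druntime's hand-tuned loops may use SSE here.
    y[] += x[] * alpha;
}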
January 18, 2011
Trass3r Wrote:

> > I heard about D recently from a thread in the Gentoo Forums. My question is: if I'm mostly interested in shared-memory high-performance computing, do the radical new D 2 concurrency features help me at all? What I have in mind is automatic map/reduce-style programming.
> 
> Sounds like http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html will
> be your friend.
> It's currently in the review stage, see thread "David Simcha's
> std.parallelism".

Thanks! The documentation didn't help much with the low-level details; it isn't clear to me how it uses SSE or AVX. But I'll give it a go.

> > I'm going to buy new hardware providing the AVX instruction set, possibly a two-socket system, which should improve loop vectorization a lot. The way I imagine it working is via the "native" foreach and the Phobos 2 library features. Is this possible? Does Phobos 2 also use the three-level cache hierarchy in Sandy Bridge efficiently? I was considering D because I don't want to mess with low-level assembly; I'd rather use a high-level modern language. C++ is also getting there; in GCC 4.6 the loop vectorization is already good.
> 
> Well, array (or "vector") operations are currently optimized to leverage
> SSE, but that's not a compiler optimization; it's hand-tuned code.
> Apart from that there are GDC (using the GCC backend) and LDC (using LLVM as
> the backend), which can potentially optimize a lot.

Why doesn't the "official" D compiler generate fast code? Is the main priority reliability or standards conformance? In any case, I'll test how well GDC works with GCC 4.5 and 4.6.
January 18, 2011
> Thanks! The documentation didn't help much with the low-level details; it
> isn't clear to me how it uses SSE or AVX. But I'll give it a go.

I think std.parallelism will be more about thread-level parallelism than about SIMD.
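
If you want both, one plausible pattern (just a sketch, not something std.parallelism promises) is to split the outer loop across threads and leave the inner loop to array operations, which the runtime may vectorize:

import std.parallelism : parallel;

// Outer loop: thread-level parallelism over rows via std.parallelism.
// Inner work: a per-row array operation, which druntime may implement
// with SSE for this element type (an implementation detail).
void scaleRows(double[][] rows, double factor)
{
    foreach (row; parallel(rows))
        row[] *= factor;
}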

> Why doesn't the "official" D compiler generate fast code? Is the main priority
> reliability or standards conformance? In any case, I'll test how well GDC works with GCC 4.5 and 4.6.

dmd's backend isn't that advanced, but dmd is the reference implementation of the language, and its frontend is reused in gdc and ldc.
January 18, 2011
On 01/17/2011 10:24 PM, new2d wrote:
>
> Why doesn't the "official" D compiler generate fast code? Is the main priority reliability or standards conformance? In any case, I'll test how well GDC works with GCC 4.5 and 4.6.

My perception of DMD is that it may be adequately described as "Walter's playground". One of his main priorities (and points of pride) is efficiency of compilation, which is rather antithetical to reliability, as is the fact that he's maintaining it nearly on his own.

Concerning standards, they exist more completely in Walter's head and in the heads of others in the community than they do on paper or pixels anywhere that I'm aware of.