April 09, 2008
CPUs with 4 to 8 or more cores will be common soon, so pure functions are useful, but "simpler" forms of parallel processing are useful too. OpenMP syntax is not easy, while the syntax of Intel Ct is very short and, to me, looks nice enough (it's a complex set of libraries for C++).

More info: http://techresearch.intel.com/articles/Tera-Scale/1514.htm

It contains a few functions that allow things like:
sumReduce([1, 2, 3, 4]) = [10]
sumReduce([[1, 2], [3, 4, 5]]) = [3, 12]
sumReduce([(1 -> 1), (2 -> 1), (1 -> 2)]) = [(1 -> 3), (2 -> 1)]
Pack([a, b, c, d, e, f], [0, 1, 1, 0, 1, 0]) = [b, c, e]
ShiftRight([a, b, c, d, e, f], [1], i) = [i, a, b, c, d, e]
RotateLeft([a, b, c, d, e, f], [2]) = [c, d, e, f, a, b]
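To make the semantics of those operations concrete, here is a minimal sequential sketch in plain C++. It is not the Ct API: the function names (sumReduce, sumReduceNested, pack, shiftRight, rotateLeft) and their signatures are my own assumptions, the flat sumReduce returns a scalar instead of a one-element collection, and nothing here runs in parallel. It only shows what results the examples above describe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sequential reference versions only; hypothetical names, not Ct's API.

// Flat reduction: [1, 2, 3, 4] gives 10.
template <typename T>
T sumReduce(const std::vector<T>& xs) {
    return std::accumulate(xs.begin(), xs.end(), T{});
}

// Nested reduction: one total per inner vector.
template <typename T>
std::vector<T> sumReduceNested(const std::vector<std::vector<T>>& xss) {
    std::vector<T> out;
    out.reserve(xss.size());
    for (const auto& xs : xss) out.push_back(sumReduce(xs));
    return out;
}

// Keep the elements whose mask entry is nonzero.
template <typename T>
std::vector<T> pack(const std::vector<T>& xs, const std::vector<int>& mask) {
    assert(xs.size() == mask.size());  // assumed precondition
    std::vector<T> out;
    for (std::size_t i = 0; i < xs.size(); ++i)
        if (mask[i] != 0) out.push_back(xs[i]);
    return out;
}

// Shift right by n places, filling the vacated front slots with `fill`.
template <typename T>
std::vector<T> shiftRight(const std::vector<T>& xs, std::size_t n, const T& fill) {
    std::vector<T> out(xs.size(), fill);
    for (std::size_t i = n; i < xs.size(); ++i) out[i] = xs[i - n];
    return out;
}

// Rotate left by n places (n is taken modulo the length).
template <typename T>
std::vector<T> rotateLeft(std::vector<T> xs, std::size_t n) {
    if (!xs.empty()) {
        const auto k = static_cast<std::ptrdiff_t>(n % xs.size());
        std::rotate(xs.begin(), xs.begin() + k, xs.end());
    }
    return xs;
}
```

Every function takes its input by value or const reference and returns a fresh vector, so the element-wise loops are independent of each other; that independence is what a library or compiler would need in order to split the work across cores or SIMD lanes.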

Note that they allow much more than the + - / * operations between arrays described in the D specs: they work on many kinds of collections, including associative arrays.
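The associative case can be sketched the same way. Below, a keyed sumReduce adds up the values that share a key, mirroring the (1 -> 1), (2 -> 1), (1 -> 2) example above. Again this is only an assumption about the semantics in plain C++: sumReduceByKey and checkPostExample are hypothetical names, std::map stands in for an associative array, and the loop is sequential.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical keyed reduction (not Ct's API): sum the values per key.
template <typename K, typename V>
std::map<K, V> sumReduceByKey(const std::vector<std::pair<K, V>>& pairs) {
    std::map<K, V> totals;
    // operator[] value-initializes a missing entry (0 for arithmetic types).
    for (const auto& kv : pairs) totals[kv.first] += kv.second;
    return totals;
}

// Usage: the keyed example from the post.
inline void checkPostExample() {
    const std::vector<std::pair<int, int>> in{{1, 1}, {2, 1}, {1, 2}};
    const std::map<int, int> expected{{1, 3}, {2, 1}};
    assert(sumReduceByKey(in) == expected);
}
```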

Such things can be written in user-level code, but they require better inlining and use of SIMD instructions by the compiler; and to be used well, they would benefit from some syntactic sugar too (Intel Ct already has almost enough sugar).
Such things are already built into the syntax of languages like Sisal, ParallelPascal, and the future Fortress. They don't solve all parallel processing problems, but they do allow solving some numerically intensive ones.
Soon many different forms of parallel processing will become essential for any high-performance language.

April 10, 2008
Interestingly, Ct copies all its data to a special protected area so it can avoid having any aliases to the data. It also garbage collects (reference counts) this memory, and uses pass-by-value to further avoid aliasing problems.
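As a rough illustration of that idea (and only that: I don't know how Ct implements it internally), here is a small C++ sketch of a container that copies its input into a private, immutable, reference-counted buffer and whose operations return new values. IsolatedVec and its members are hypothetical names, and std::shared_ptr stands in for whatever reference counting Ct actually uses.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sketch, not Ct's implementation.
class IsolatedVec {
public:
    // Copy-in: the caller's vector is duplicated, so no alias to it remains.
    explicit IsolatedVec(const std::vector<int>& src)
        : data_(std::make_shared<const std::vector<int>>(src)) {}

    // Operations never mutate in place; they build and return a new value.
    IsolatedVec addScalar(int k) const {
        assert(data_ != nullptr);  // fails only on a moved-from object
        std::vector<int> out(*data_);
        for (int& x : out) x += k;
        return IsolatedVec(out);
    }

    // Copy-out: callers get their own copy, never a reference to the buffer.
    std::vector<int> copyOut() const {
        assert(data_ != nullptr);
        return *data_;
    }

    // Number of IsolatedVec handles sharing this buffer.
    long owners() const { return data_.use_count(); }

private:
    std::shared_ptr<const std::vector<int>> data_;  // immutable, ref-counted
};
```

Copying an IsolatedVec only bumps the reference count, which is safe precisely because the buffer is immutable; the memory is released when the last handle goes away.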

As far as that goes, D's foundation of gc and invariant types should serve this quite well.