August 13, 2012
Andrei Alexandrescu wrote:
> * Efficiency - D generates native code for floating point operations and has control over data layout and allocation. Speed of generated code is dependent on the compiler, and the reference compiler (dmd) does a poorer job at it than the GNU-based compiler (gdc).

I'd like to add to this. Right now I'm reworking some libraries to include SIMD support using DMD on Linux 64-bit. A simple benchmark between DMD and GCC of 2 million SIMD vector additions/subtractions actually runs faster with my DMD D code than the GCC C code. Only by ~0.8 ms, and that could be due to a difference between D's std.datetime.StopWatch() and C's time.h/clock(), but it's consistently faster nonetheless, which is impressive.

That said, it's also much easier to accidentally slow that figure down significantly in DMD, whereas GCC almost always optimizes well.
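For reference, here's a minimal sketch of the shape of the benchmark -- not my actual library code, just an illustration assuming core.simd's float4 (DMD on x86_64) and std.datetime.StopWatch:

    import core.simd;
    import std.datetime : StopWatch;
    import std.stdio;

    void main()
    {
        enum N = 2_000_000;
        float4 a = [1.0f, 2.0f, 3.0f, 4.0f];
        float4 b = [0.5f, 1.5f, 2.5f, 3.5f];
        float4 acc = 0;

        StopWatch sw;
        sw.start();
        foreach (i; 0 .. N)
        {
            acc += a;   // one SIMD addition per iteration
            acc -= b;   // one SIMD subtraction per iteration
        }
        sw.stop();

        // Print the result so the optimizer can't discard the loop.
        writefln("%s ms, acc = %s", sw.peek().msecs, acc.array);
    }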


Also, and I'm not sure this isn't just me, but I ran a DMD (v2.057, I think) vector test (no SIMD) against Mono C# a few months back where DMD showed only a ~10 ms improvement over C# (~79ms vs ~88ms). Now a similar test compiled with DMD 2.060 runs at ~22ms vs C#'s 80ms, so I believe there have been some definite optimization improvements in the DMD compiler over the last few versions.
August 13, 2012
On 12/08/12 18:22, dsimcha wrote:
> For people with more advanced CS/programming knowledge, though, this is an
> advantage of D.  I find Matlab and R incredibly frustrating to use for anything
> but very standard matrix/statistics computations on data that's already
> structured the way I like it.  This is mostly because the standard CS concepts
> you mention are at best awkward and at worst impossible to express and, being
> aware of them, I naturally want to take advantage of them.

The main use-case and advantage of both R and MATLAB/Octave seems to me to be the plotting functionality -- I've seen some exceptionally beautiful stuff done with R in particular, although I've not personally explored its capabilities too far.

The annoyance of R in particular is the impenetrable thicket of dependencies that can arise among contributed packages; it feels very much like some are thrown over the wall and then built on without much concern for organization. :-(
August 13, 2012
On Monday, 13 August 2012 at 01:52:28 UTC, Joseph Rushton Wakeling wrote:
> The main use-case and advantage of both R and MATLAB/Octave seems to me to be the plotting functionality -- I've seen some exceptionally beautiful stuff done with R in particular, although I've not personally explored its capabilities too far.
>
> The annoyance of R in particular is the impenetrable thicket of dependencies that can arise among contributed packages; it feels very much like some are thrown over the wall and then built on without much concern for organization. :-(

I've addressed that, too :).

https://github.com/dsimcha/Plot2kill

Obviously this is a one-man project without nearly the same number of features that R and Matlab have, but like Dstats and SciD, it has probably the 20% of functionality that handles 80% of use cases.  I've used it for the figures in scientific articles that I've submitted for publication and in my Ph.D. proposal and dissertation.

Unlike SciD and Dstats, Plot2kill doesn't highlight D's modeling capabilities that much, but it does get the job done for simple 2D plots.
August 13, 2012
On 12/08/12 01:31, Walter Bright wrote:
> On 8/11/2012 3:01 PM, F i L wrote:
>> Walter Bright wrote:
>>> I'd rather have a 100 easy to find bugs than 1 unnoticed one that
>>> went out in
>>> the field.
>>
>> That's just the thing, bugs are arguably easier to hunt down when
>> things default
>> to a consistent, usable value.
>
> Many, many programming bugs trace back to assumptions that floating
> point numbers act like ints. There's just no way to avoid knowing and
> understanding the differences.

Exactly. I have come to believe that there are very few algorithms originally designed for integers, which also work correctly for floating point.

Integer code nearly always assumes things like x + 1 != x, x == x, and (x + y) - y == x.


for (y = x; y < x + 10; y = y + 1) { .... }

How many times does it loop?
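It depends entirely on the magnitude of x. A sketch of the possibilities, assuming strict IEEE single-precision arithmetic (D is allowed to evaluate intermediates at higher precision, which can change the answers -- which is rather the point):

    import std.stdio;

    // Counts the iterations of the loop above for a given start value,
    // bailing out if it looks like it will never terminate.
    void count(float x)
    {
        int n = 0;
        for (float y = x; y < x + 10; y = y + 1)
        {
            if (++n > 1000)
            {
                writefln("x = %s: never terminates (y + 1 rounds back to y)", x);
                return;
            }
        }
        writefln("x = %s: %s iterations", x, n);
    }

    void main()
    {
        count(0.0f);           // 10, as integer intuition predicts
        count(1_000_000.0f);   // still 10: x, x + 10, and every y are exact
        count(16_777_216.0f);  // 2^24: y + 1 rounds back to y, never ends
        count(1.0e9f);         // x + 10 rounds to x, so it loops 0 times

        // The other identities fail too: with a value large relative to a,
        // (a + big) - big != a.
        float a = 1.0f, big = 1.0e8f;
        writefln("(a + big) - big == a ? %s", (a + big) - big == a);
    }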


August 13, 2012
On 13/08/12 11:11, Don Clugston wrote:
> Exactly. I have come to believe that there are very few algorithms originally
> designed for integers, which also work correctly for floating point.

  ////////
    import std.stdio;

    void main()
    {
        real x = 1.0/9.0;

        writefln("x = %.128g", x);
        writefln("9x = %.128g", 9.0*x);
    }
  ////////

... well, that doesn't work, does it?  Looks like there's some sort of cheat in place to make sure that the successive division and multiplication revert to the original number.

> Integer code nearly always assumes things like, x + 1 != x, x == x,
> (x + y) - y == x.

There's always good old "if(x==0)" :-)
August 13, 2012
Don Clugston:

> I have come to believe that there are very few algorithms originally designed for integers, which also work correctly for floating point.

And what about JavaScript programs that use integers? (All JavaScript numbers are doubles, so integer algorithms there already run on floating point.)

Bye,
bearophile
August 13, 2012
On 8/13/2012 5:38 AM, Joseph Rushton Wakeling wrote:
> Looks like some sort of cheat in place to
> make sure that the successive division and multiplication will revert to the
> original number.

That's called "rounding". But rounding always implies some small error, and those small errors can accumulate into a very large one.
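A minimal illustration of that accumulation, with nothing exotic going on, just IEEE doubles:

    import std.stdio;

    void main()
    {
        // 0.1 has no exact binary representation, so every addition
        // rounds; a million tiny rounding errors add up.
        double sum = 0.0;
        foreach (i; 0 .. 1_000_000)
            sum += 0.1;

        writefln("%.17g", sum);   // ~100000.00000133288, not 100000
    }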
August 13, 2012
On 8/12/2012 6:38 PM, F i L wrote:
> Also, and I'm not sure this isn't just me, but I ran a DMD (v2.057, I think)
> vector test (no SIMD) against Mono C# a few months back where DMD showed only a
> ~10 ms improvement over C# (~79ms vs ~88ms). Now a similar test compiled with
> DMD 2.060 runs at ~22ms vs C#'s 80ms, so I believe there have been some definite
> optimization improvements in the DMD compiler over the last few versions.

There's a fair amount of low-hanging optimization fruit that D makes possible but that dmd does not yet take advantage of. I hope to get to this.

One is that I suspect D can generate much better SIMD code than C/C++ can without compiler extensions.

Another is that D allows values to be moved without needing a copy-construct/destruct operation.
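As a small sketch of that last point (a hypothetical struct, nothing more): rvalues in D are moved with a plain bit copy, so the postblit never runs for them.

    import std.stdio;

    struct S
    {
        int[256] payload;
        this(this) { writeln("postblit: expensive copy"); }
    }

    S make()
    {
        S s;
        return s;   // an rvalue: moved out, so no postblit runs
    }

    void main()
    {
        S a = make();   // prints nothing: the temporary is simply moved
        S b = a;        // copying a named lvalue: the postblit runs
    }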
August 13, 2012
On 13/08/12 20:04, Walter Bright wrote:
> That's called "rounding". But rounding always implies some, small, error that
> can accumulate into being a very large error.

Well, yes.  I was just remarking on the choice of rounding and the motivation behind it.

After all, you _could_ round it instead as,

    x = 1.0/9.0 == 0.11111111111111 ... 111  [finite number of decimal places]

but then

    9*x == 0.999999999999 ... 9999   [i.e. doesn't multiply back to 1.0].

... and this is probably more likely to result in undesirable error than the other rounding scheme.  (I think the calculator app on Windows used to have this behaviour some years back.)
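For what it's worth, binary doubles land on the "nice" side here, assuming IEEE round-to-nearest-even: 1.0/9.0 rounds down by 4/9 of an ulp, the exact product 9*x is then 1 - 2^-54, an exact tie, and round-to-even resolves the tie upward to exactly 1.0.

    import std.stdio;

    void main()
    {
        double x = 1.0 / 9.0;   // rounds down: the discarded tail is 4/9 ulp
        double y = 9.0 * x;     // exact product is 1 - 2^-54, a tie that
                                // round-to-nearest-even resolves to 1.0

        writefln("9 * (1/9) = %.20g", y);
        writefln("equal to 1.0? %s", y == 1.0);
    }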
August 14, 2012
On Monday, 13 August 2012 at 10:11:06 UTC, Don Clugston wrote:

>  ... I have come to believe that there are very few algorithms originally designed for integers, which also work correctly for floating point.
>
> Integer code nearly always assumes things like, x + 1 != x, x == x,
> (x + y) - y == x.
>
>
> for (y = x; y < x + 10; y = y + 1) { .... }
>
> How many times does it loop?

Don,

I would appreciate your thoughts on the issue of re-implementing numeric codes like BLAS and LAPACK in pure D to benefit from the many nice features listed in this discussion.  Is it feasible? Worthwhile?

Thanks,

TJB