July 24, 2012
On 07/24/12 16:34, Andrei Alexandrescu wrote:
> I was talking to Walter on how to define a good study of D's compilation speed. We figured that we clearly need a good baseline, otherwise numbers have little meaning.

I agree.

> One idea would be to take a real, non-trivial application, written in both D and another compiled language. We then can measure build times for both applications, and also measure the relative speeds of the generated executables.

Well I kind of did exactly that.

I was planning to start a blog ("you know, the blog you should really, really start but can't find time for") with such a comparison. I started the post a few months ago but couldn't finish it, so it's still lying there half finished. But since the subject has come up on the NG, it would be stupid not to talk about it.

I intended to add relevant numbers and go from deterministic, measurable facts to more subjective remarks (was it fun? is it more maintainable?) but I really just did a bit of the first part :(

Anyway, for people interested in my "findings", here is the half-finished post: http://goo.gl/16Yrb

This could serve as a basis of do's and don'ts for a more relevant comparison, as Andrei proposed. For instance, it could be interesting to compare several C++ and D compilers to measure how much performance varies (the standard deviation) within each language.
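
As a rough sketch of that idea (not taken from the half-finished post), the spread within a language could be computed by grouping timings by language and taking the sample standard deviation across compilers. The function name `spread_by_language`, the compiler labels, and the numbers in the demo are placeholders for illustration, not measurements.

```python
import statistics
from collections import defaultdict


def spread_by_language(results):
    """results: iterable of (language, compiler, seconds) tuples.

    Returns {language: (mean, sample standard deviation)}.
    A language with a single compiler gets a deviation of 0.0.
    """
    by_lang = defaultdict(list)
    for language, _compiler, seconds in results:
        by_lang[language].append(seconds)

    summary = {}
    for language, times in by_lang.items():
        mean = statistics.mean(times)
        dev = statistics.stdev(times) if len(times) > 1 else 0.0
        summary[language] = (mean, dev)
    return summary


if __name__ == "__main__":
    # Placeholder numbers, NOT real measurements.
    demo = [
        ("D", "compiler-a", 10.0),
        ("D", "compiler-b", 14.0),
        ("C++", "compiler-c", 9.0),
        ("C++", "compiler-d", 11.0),
    ]
    for language, (mean, dev) in spread_by_language(demo).items():
        print(f"{language}: mean {mean:.2f} s, stdev {dev:.2f} s")
```

With only two or three compilers per language the deviation is a very coarse number, so it is probably best read as an order of magnitude rather than a precise figure.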

Also, I think the D code could have been more idiomatic and optimized further: it was just a quick test (yet quite time consuming).

Both projects are open source: one is endorsed by the company I'm working for (https://github.com/mikrosimage/sequenceparser), the other is just a personal project written for the purpose of the comparison (https://github.com/gchatelet/d_sequence_parser).

By the way, this reminds me of the 'Computer Language Benchmarks Game' (http://shootout.alioth.debian.org/). I know D is not welcome aboard, but couldn't we try to run the game ourselves so as to have some more data?

--
Guillaume
July 24, 2012
On Tue, 24 Jul 2012 18:53:25 +0200
Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:

> On 7/24/12, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> > snip
> 
> I've got a codebase where it takes DMD 15 seconds to output an error message to stdout. The error message is 3000 lines long. (and people thought C++ errors were bad!). It's all thanks to this bug: http://d.puremagic.com/issues/show_bug.cgi?id=8082
> 
> The codebase isn't public yet so I can't help you with comparisons. Non-release full builds take 16 seconds for a template-heavy ~12k-line codebase (not counting lines of external dependencies). I use a lot of static foreach loops btw.
> 
> Personally I think full builds are very fast compared to C++, although the transition from a small codebase which takes less than a second to compile to a bigger codebase which takes over a dozen seconds to compile is an unpleasant experience. I'd love to see DMD speed up its compile-time features like templates, mixins, static foreach, etc.

Yea. Programs using Goldie ( semitwist.com/goldie ) take a long time to
compile (by D standards, not by C++ standards). I tried to benchmark
it a while back, and was never really confident in the results I was
getting or my understanding of the DMD source, so I never brought it up
before. But it *seemed* to be template matching that was the big
bottleneck (ie, IIUC, determining which template to instantiate,
and I think the function was actually called "match" or something like
that). Goldie does make use of a *lot* of that sort of thing.

July 24, 2012
On Tuesday, July 24, 2012 15:49:38 Nick Sabalausky wrote:
> Yea. Programs using Goldie ( semitwist.com/goldie ) take a long time to
> compile (by D standards, not by C++ standards). I tried to benchmark
> it a while back, and was never really confident in the results I was
> getting or my understanding of the DMD source, so I never brought it up
> before. But it *seemed* to be template matching that was the big
> bottleneck (ie, IIUC, determining which template to instantiate,
> and I think the function was actually called "match" or something like
> that). Goldie does make use of a *lot* of that sort of thing.

I don't have any hard evidence for it, but I've always gotten the impression that it was templates, mixins, and CTFE which really slowed down compilation. Certainly, they increase the memory consumption of the compiler by quite a bit. My guess would be that if we were looking to improve the compiler's performance, that's where we'd need to focus. But we'd have to actually profile the compiler on a variety of projects to be sure of that (which is at least partially related to what Andrei is suggesting).
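
Short of a real profiler run, a coarse first step might be plain wall-clock timing of whole builds across several projects. The sketch below is only that: it times an arbitrary command, says nothing about where inside the compiler the time goes, and the helper name `time_command` is made up for illustration. The demo uses a stand-in command so it runs anywhere; the real build invocation for each project would go in its place.

```python
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd (a list of arguments) `runs` times.

    Returns the wall-clock duration of each run in seconds.
    Raises CalledProcessError if the command fails.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in command; replace with the project's actual build command.
    cmd = [sys.executable, "-c", "pass"]
    times = time_command(cmd)
    print(f"best of {len(times)}: {min(times):.3f} s")
```

Taking the minimum over a few runs is one common way to reduce noise from disk caches and other processes; it is a choice, not the only reasonable one.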

- Jonathan M Davis
July 24, 2012
On Tuesday, 24 July 2012 at 22:19:07 UTC, Jonathan M Davis wrote:
> On Tuesday, July 24, 2012 15:49:38 Nick Sabalausky wrote:
>> Yea. Programs using Goldie ( semitwist.com/goldie ) take a long time to
>> compile (by D standards, not by C++ standards). I tried to benchmark
>> it a while back, and was never really confident in the results I was
>> getting or my understanding of the DMD source, so I never brought it up
>> before. But it *seemed* to be template matching that was the big
>> bottleneck (ie, IIUC, determining which template to instantiate,
>> and I think the function was actually called "match" or something like
>> that). Goldie does make use of a *lot* of that sort of thing.
>
> I don't have any hard evidence for it, but I've always gotten the impression
> that it was templates, mixins, and CTFE which really slowed down compilation.
> Certainly, they increase the memory consumption of the compiler by quite a
> bit. My guess would be that if we were looking to improve the compiler's
> performance, that's where we'd need to focus. But we'd have to actually profile
> the compiler on a variety of projects to be sure of that (which is at least
> partially related to what Andrei is suggesting).
>
> - Jonathan M Davis

There's also the nasty O(n^2) optimiser issue.

http://d.puremagic.com/issues/show_bug.cgi?id=7157
July 24, 2012
On 24/07/12 15:34, Andrei Alexandrescu wrote:
> One idea would be to take a real, non-trivial application, written in both D and
> another compiled language. We then can measure build times for both
> applications, and also measure the relative speeds of the generated executables.

Suggest that this gets done with all 3 of the main D compilers, not just DMD. I'd like to see the tradeoff between compilation speed and executable speed that one gets between them.

I do have some pretty much equivalent simulation code written in both D and C++.  For a rough comparison:

   Language   Compiler   Compile time (s)   Runtime (s)
   D          GDC        1.5                25.3
   D          DMD        0.4                52.1
   C++        g++        2.3                21.8
   C++        Clang++    1.8                27.6

DMD used is a fairly recent pull from GitHub; GDC is the 4.6.3 package found in Ubuntu 12.04.  I don't have a working LDC2 compiler on my system. :-(
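
To make the tradeoff easier to read, the figures in the table can be expressed relative to one compiler. This is just a small sketch over the four rows above; the helper name `relative_to` and the choice of g++ as the baseline are arbitrary.

```python
# (language, compiler, compile time in s, runtime in s), copied from the table.
RESULTS = [
    ("D", "GDC", 1.5, 25.3),
    ("D", "DMD", 0.4, 52.1),
    ("C++", "g++", 2.3, 21.8),
    ("C++", "Clang++", 1.8, 27.6),
]


def relative_to(results, baseline):
    """Return {compiler: (compile-time ratio, runtime ratio)} against baseline."""
    base = next(row for row in results if row[1] == baseline)
    return {
        compiler: (compile_s / base[2], run_s / base[3])
        for _language, compiler, compile_s, run_s in results
    }


if __name__ == "__main__":
    for compiler, (c_ratio, r_ratio) in relative_to(RESULTS, "g++").items():
        print(f"{compiler:8s} compile x{c_ratio:.2f}  runtime x{r_ratio:.2f}")
```

Read this way, DMD compiles in under a fifth of g++'s time, while its executable takes more than twice as long to run.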

The C++ has a template-based policy class design, while the D code uses template mixins to similar effect.  The D code can be found here:
https://github.com/WebDrake/Dregs

While I'm happy to also share the C++ code, I confess I'm shy to do so given that it probably represents a travesty of the beautiful ideas Andrei developed on policy class design ... :-)

Best wishes,

    -- Joe
July 25, 2012
On 7/24/2012 8:06 AM, Andrei Alexandrescu wrote:
> Nevertheless, I think there is value in the study. We're looking at a real
> nontrivial application that wasn't written for a study, but for actual use, and
> that implements the same design and same functionality in both languages.

The translation is also just that, a line-by-line translation that started by copying the .c source files to .d.

It's probably as good as you're going to get in comparing compile speed.


July 25, 2012
On 7/24/2012 3:18 PM, Jonathan M Davis wrote:
> But we'd have to actually profile
> the compiler on a variety of projects to be sure of that (which is at least
> partially related to what Andrei is suggesting).

I wouldn't be a bit surprised to find that there are some O(n*n) or worse algorithms embedded in the compiler that can be triggered by some types of code.

Profiling is the way to root them out.
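
Alongside profiling, one cheap way to spot such behaviour from the outside might be to time the compiler on inputs of growing size and estimate the growth exponent from a log-log fit. The sketch below assumes the (size, time) pairs have already been measured; the function name `scaling_exponent` is hypothetical and the demo data is synthetic.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size).

    A slope near 1 suggests roughly linear behaviour,
    a slope near 2 suggests roughly quadratic behaviour.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Synthetic timings that grow exactly quadratically, for illustration only.
    sizes = [1000, 2000, 4000, 8000]
    times = [1e-6 * s * s for s in sizes]
    print(f"estimated exponent: {scaling_exponent(sizes, times):.2f}")
```

This only flags that something superlinear is going on; finding which function is responsible still needs the profiler.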
July 25, 2012
On 7/24/2012 7:58 AM, Paulo Pinto wrote:
>> "Roman D. Boiko"  wrote in message news:hpibxcqsmlpmgyngjzwp@forum.dlang.org...
>> On Tuesday, 24 July 2012 at 14:34:58 UTC, Andrei Alexandrescu wrote:
>>> the D source is in D1 and should be adjusted to compile with D2),
>>
>> That would provide performance (compilation and run-time) for D1 only (with D2
>> compiler). Performance of a typical D2 app would likely be different.
>
> Still, it is a good starting point.

The reality is, no matter which benchmark is chosen, it will be attacked as being biased. There is no such thing as a perfect apples-to-apples comparison between languages, and there'll be no shortage of criticism of any shortcomings, valid and invalid.

That doesn't mean we shouldn't do it.

Heck, I've even been accused of "sabotaging" the Digital Mars C++ compiler in order to make D look good!


July 25, 2012
On 7/24/2012 11:02 AM, Guillaume Chatelet wrote:
> By the way, it reminds me of the 'Computer Language Benchmarks Game'
> (http://shootout.alioth.debian.org/). I know D is not welcome aboard but
> couldn't we try to run the game ourselves so as to have some more data?

Small programs are completely inadequate for getting any reasonable measure of compiler speed. Even worse, they can be terribly wrong.

(Back in the olden days, when men were men and the sun revolved about the earth, everyone raved about Borland's compilation speed. In tests I ran myself, I found that it was fast, right up until you hit a certain size of source code, maybe about 5000 lines. Then, it fell off a cliff, and compile speed was terrible. But hey, it looked great in those tiny benchmarks.)

The people who care about compile speed are compiling gigantic programs, and smallish ones can and do exhibit a very different performance profile.

DMDScript is a medium sized program, not a gigantic one, but it's the best we've got for comparison.


July 25, 2012
On 7/24/12 8:20 PM, Walter Bright wrote:
> On 7/24/2012 11:02 AM, Guillaume Chatelet wrote:
>> By the way, it reminds me of the 'Computer Language Benchmarks Game'
>> (http://shootout.alioth.debian.org/). I know D is not welcome aboard but
>> couldn't we try to run the game ourselves so as to have some more data?
>
> Small programs are completely inadequate for getting any reasonable
> measure of compiler speed. Even worse, they can be terribly wrong.

Nevertheless there's value in the shootout. Yes, if someone is up for it that would be great. I also think if we have the setup ready we could convince the site maintainer to integrate D into the suite.

Andrei