January 25, 2013
On Thu, Jan 24, 2013 at 03:18:01PM -0800, Walter Bright wrote:
> On 1/24/2013 1:13 PM, H. S. Teoh wrote:
> >On Thu, Jan 24, 2013 at 12:15:07PM -0800, Walter Bright wrote:
> >>On 1/24/2013 8:36 AM, H. S. Teoh wrote:
> >>>Nevertheless, I also have made the same observation that code produced by gdc consistently outperforms code produced by dmd. Usually by about 20-30%, sometimes as much as 50-60%, IME. That's a pretty big discrepancy for me, esp. when I'm doing compute-intensive geometric computations.
> >>
> >>Do you mean floating point code? 32 or 64 bit?
> >
> >Floating-point, 64-bit, tested on dmd -O vs. gdc -O3.
> 
> Next, are you using floats, doubles, or reals?

Both reals and floats. Well, let's get some real measurements. Here's a quick run-through of various test programs I have lying around:

Test program #1 (iterating 2-variable function over grid), uses reals:
- Test case with n=400:
	Using DMD:	~8 seconds (consistently)
	Using GDC:	~6 seconds (consistently)
	* So the DMD version is 33% slower than the GDC version.
	  (That is, 8/6*100 = 133%, so 33% slower.)

- Test case with n=600:
	Using DMD:	~27 seconds (consistently)
	Using GDC:	~19 seconds (consistently)
	* So the DMD version is 42% slower than the GDC version.


Test program #2 (terrain generation simulator), uses floats:
(The running time of this one depends on the RNG, so I fixed the seed
value in order to make a fair comparison.)
- Test case with seed=380170304, n=20 with water & wind simulation:
	Using DMD:	~10 seconds (consistently)
	Using GDC:	~7 seconds (consistently)
	* So the DMD version is 42% slower than the GDC version.

- Test case with seed=380170304, n=25 with water & wind simulation:
	Using DMD:	~14 seconds (consistently)
	Using GDC:	~9 seconds (consistently)
	* So the DMD version is 55% slower than the GDC version.


Test program #3 (enumeration of coordinates of n-dimensional polytopes),
uses reals:
- All permutations and changes of sign of <1,2,3,4,5,6,7>:
	Using DMD:	~4 seconds (consistently)
	Using GDC:	~3 seconds (consistently)
	* So the DMD version is 33% slower than the GDC version.

- All permutations and changes of sign of <1,2,3,4,5,6,7,7>:
	Using DMD:	~41 seconds (consistently)
	Using GDC:	~27 seconds (consistently)
	* So the DMD version is 51% slower than the GDC version.

- Even permutations and all changes of sign of <1,2,3,4,5,6,7,8>:
	Using DMD:	~40 seconds (consistently)
	Using GDC:	~27 seconds (consistently)
	* So the DMD version is 48% slower than the GDC version.
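
The percentages above all come from the same arithmetic as the 8/6 example. A minimal sketch in D follows; the helper name `slowdownPercent` is made up for illustration, and the inputs are the rounded timings quoted above rather than fresh measurements:

```d
import std.stdio : writefln;

/// Relative slowdown of `slow` versus `fast`, in percent.
/// For 8 s vs. 6 s this is (8/6 - 1) * 100, i.e. roughly 33.
double slowdownPercent(double slow, double fast)
{
    return (slow / fast - 1.0) * 100.0;
}

void main()
{
    // Rounded timings in seconds, taken from test program #1 above.
    writefln("n=400: %.1f%% slower", slowdownPercent(8, 6));
    writefln("n=600: %.1f%% slower", slowdownPercent(27, 19));
}
```

Since the timings are only given to the nearest second, the resulting percentages are rough estimates.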


All test programs were compiled with dmd -O for the DMD version, and gdc -O3 for the GDC version. The source code is unchanged between the two compilers, and there are no version()'s that depend on a particular compiler. The measurements stated above are averages of about 3-4 runs.

As you can see, the performance difference between the two is pretty clear. I'm pretty sure this isn't only because of floating point operations, because the above test programs all use a lot of inner loops, and GDC does some pretty sophisticated loop unrolling and other such optimizations.


T

-- 
Two wrongs don't make a right; but three rights do make a left...
January 25, 2013
On Friday, 25 January 2013 at 00:25:46 UTC, Joseph Rushton Wakeling wrote:
> If I remove the writef statements, leaving just the number-crunching part, it runs in about 4s with gdmd, 7s with ldmd2 and 14s (!) with dmd.

From my experience, writef and friends are substantially slower than printf. I wouldn't recommend using them for output-intensive applications. And of course, the best option is to avoid any format string parsing altogether, using only fwrite calls.
January 25, 2013
On Friday, 25 January 2013 at 01:41:12 UTC, H. S. Teoh wrote:
> [...]
> All test programs were compiled with dmd -O for the DMD version, and gdc
> -O3 for the GDC version. The source code is unchanged between the two
> compilers, and there are no version()'s that depend on a particular
> compiler. The measurements stated above are averages of about 3-4 runs.
> [...]

Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release -inline -O" is more comparable.
January 25, 2013
On 25 January 2013 10:27, John Colvin <john.loughran.colvin@gmail.com> wrote:

> [...]
> Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release -inline -O" is more comparable.
>


But then you'd have to do gdc -O3 -frelease. :-)

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


January 25, 2013
On Friday, 25 January 2013 at 13:38:03 UTC, Iain Buclaw wrote:
> On 25 January 2013 10:27, John Colvin <john.loughran.colvin@gmail.com>wrote:
>
>> [...]
>> Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release -inline -O" is
>> more comparable.
>>
>
>
> But then you'd have to do gdc -O3 -frelease. :-)

Ah yes, of course :)
January 25, 2013
On Fri, Jan 25, 2013 at 04:09:25PM +0100, John Colvin wrote:
> On Friday, 25 January 2013 at 13:38:03 UTC, Iain Buclaw wrote:
> >On 25 January 2013 10:27, John Colvin <john.loughran.colvin@gmail.com>wrote:
> >
[...]
> >>Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release -inline -O" is more comparable.
> >>
> >
> >
> >But then you'd have to do gdc -O3 -frelease. :-)
> 
> Ah yes, of course :)

Hmm. I didn't realize that dmd has a separate switch for function inlining. Well, here are the updated numbers:


> >>On Friday, 25 January 2013 at 01:41:12 UTC, H. S. Teoh wrote:
> >>>Both reals and floats. Well, let's get some real measurements. Here's a quick run-through of various test programs I have lying around:
> >>>
> >>>Test program #1 (iterating 2-variable function over grid), uses reals:
> >>>- Test case with n=400:
> >>>        Using DMD:      ~8 seconds (consistently)
> >>>        Using GDC:      ~6 seconds (consistently)
> >>>        * So the DMD version is 33% slower than the GDC version.
> >>>          (That is, 8/6*100 = 133%, so 33% slower.)

Updated: DMD version with -inline takes ~7 seconds consistently, so we have 7/6*100 = 116%, so 16% slower.


> >>>- Test case with n=600:
> >>>        Using DMD:      ~27 seconds (consistently)
> >>>        Using GDC:      ~19 seconds (consistently)
> >>>        * So the DMD version is 42% slower than the GDC version.

Updated: DMD version with -inline takes ~24 seconds consistently, so 26% slower.


> >>>Test program #2 (terrain generation simulator), uses floats:
> >>>(The running time of this one depends on the RNG, so I fixed the seed
> >>>value in order to make a fair comparison.)
> >>>- Test case with seed=380170304, n=20 with water & wind simulation:
> >>>        Using DMD:      ~10 seconds (consistently)
> >>>        Using GDC:      ~7 seconds (consistently)
> >>>        * So the DMD version is 42% slower than the GDC version.

Updated: DMD version with -inline takes ~8 seconds consistently, so 14% slower.


> >>>- Test case with seed=380170304, n=25 with water & wind simulation:
> >>>        Using DMD:      ~14 seconds (consistently)
> >>>        Using GDC:      ~9 seconds (consistently)
> >>>        * So the DMD version is 55% slower than the GDC version.

Updated: DMD version with -inline takes ~11 seconds consistently, so 22% slower.


> >>>Test program #3 (enumeration of coordinates of n-dimensional polytopes),
> >>>uses reals:
> >>>- All permutations and changes of sign of <1,2,3,4,5,6,7>:
> >>>        Using DMD:      ~4 seconds (consistently)
> >>>        Using GDC:      ~3 seconds (consistently)
> >>>        * So the DMD version is 33% slower than the GDC version.

Updated: DMD version with -inline still takes ~4 seconds, so no significant change here.


> >>>- All permutations and changes of sign of <1,2,3,4,5,6,7,7>:
> >>>        Using DMD:      ~41 seconds (consistently)
> >>>        Using GDC:      ~27 seconds (consistently)
> >>>        * So the DMD version is 51% slower than the GDC version.

Updated: DMD version with -inline takes about 36 seconds on average, so about 33% slower.


> >>>- Even permutations and all changes of sign of <1,2,3,4,5,6,7,8>:
> >>>        Using DMD:      ~40 seconds (consistently)
> >>>        Using GDC:      ~27 seconds (consistently)
> >>>        * So the DMD version is 48% slower than the GDC version.

Updated: DMD version with -inline takes about 38 seconds, so 41% slower.

Conclusions:
- The performance gap is smaller than previously thought, but it's still
  present.
- I will be using -inline with dmd aggressively.
- What other dmd options am I missing that will bring dmd on par with
  gdc -O3 (if there are any)?


T

-- 
Written on the window of a clothing store: No shirt, no shoes, no service.
January 25, 2013
On Friday, 25 January 2013 at 16:09:00 UTC, H. S. Teoh wrote:
> [...]
> Conclusions:
> - The performance gap is smaller than previously thought, but it's still
>   present.
> - I will be using -inline with dmd aggressively.
> - What other dmd options am I missing that will bring dmd on par with
>   gdc -O3 (if there are any)?
>
>
> T

I have sometimes found that using -release and -noboundscheck made a bigger difference to dmd than to gdc. The corresponding gdc options are -frelease and -fno-bounds-check.

Comparing performance without -release isn't that meaningful.
January 25, 2013
On Thursday, 24 January 2013 at 10:17:50 UTC, Walter Bright wrote:
> On 1/23/2013 6:36 PM, Rob T wrote:
>> BTW the D version of my sqlite3 lib is at least 1/3 smaller than the C++
>> version, and not only is it smaller, but it is far more flexible due to the use
>> of templates (I just could not make much use out of C++ templates). A reduction
>> like that is very significant. For large projects, it's a drastic reduction in
>> development costs and perhaps more so in long term maintenance costs.
>
> Interesting. I found the same percentage reduction in translating C++ code to D.

I wonder what the main reasons are for the reduction? I did make my D version of the sqlite3 lib slightly better by removing some redundancies, but that had only a ~100 line effect on the size difference. I know that the basic design is pretty much the same, so there's no radical design change that would account for the difference.

It could be that I did a better job in subtle ways when converting over from C++ because of the experience gained from the original work. For example, I think the error detection and reporting I have in the D version is much simpler, and likely accounts for some of the size difference. The question, though, is whether I could have implemented the same changes in the C++ version just as easily. I'm not so sure about that, because when I program in D it "feels" better in terms of being much less tedious to work with, so there must be more going on than just a few design choices. I also find that I get into these "aha" moments, where I realize that I don't have to do much of anything extra to make something new work. It's hard to explain without real examples, but I know I run into these when working with D more so than when working with C++.

An interesting test would be to translate a D program into a C++ one, to see if the C++ version will shrink due to subtle improvements, but I think that would be very difficult to do if there are templates involved. You just cannot make heavy use out of templates in C++ like you can in D.

Have you ever translated from D to C++?

--rt
January 25, 2013
On 1/25/2013 9:45 AM, Rob T wrote:
> I wonder what the main reasons are for the reduction?

Some reasons:

1. elimination of .h files
2. array & string handling was so much more straightforward
3. elimination of need for many constructors and code to initialize things
4. easier cleanup with scope statement
5. templates are much more concise
6. a lot of boilerplate member functions are simply unnecessary in D
7. static if eliminates a lot of template source bloat
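
A few of these points (2, 4 and 7) can be illustrated with a small sketch. This is a made-up example, not code from either project discussed here; the names `describe` and `work` are hypothetical:

```d
import std.conv : to;
import std.stdio : writeln;

// Point 7: `static if` picks a branch at compile time inside one template,
// so no separate specializations are needed for each kind of type.
string describe(T)(T value)
{
    static if (is(T == string))
        return "string of length " ~ value.length.to!string;
    else static if (is(T : long))
        return "integer";
    else
        return "something else";
}

// Point 4: `scope(exit)` puts the cleanup next to the acquisition,
// and it runs on every exit path, including exceptions.
void work(ref string[] events)
{
    events ~= "acquire";
    scope(exit) events ~= "release";
    events ~= "use";
}

void main()
{
    // Point 2: arrays and strings are built in, with slicing and `~`.
    string greeting = "hello" ~ ", world";
    writeln(greeting[0 .. 5]);   // a slice of the original, not a copy

    string[] events;
    work(events);
    writeln(events);             // "release" is appended last

    writeln(describe("abc"));
    writeln(describe(42));
    writeln(describe(3.14));
}
```

How much each feature contributes to the overall size reduction will of course depend on the code base.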

> Have you ever translated from D to C++?

Haven't tried that!

January 25, 2013
On Fri, Jan 25, 2013 at 05:50:21PM +0100, John Colvin wrote:
> On Friday, 25 January 2013 at 16:09:00 UTC, H. S. Teoh wrote:
[...]
> >Conclusions:
> >- The performance gap is smaller than previously thought, but it's
> >  still present.
> >- I will be using -inline with dmd aggressively.
> >- What other dmd options am I missing that will bring dmd on par
> >  with gdc -O3 (if there are any)?
> >
> >
> >T
> 
> I have sometimes found that using -release and -noboundscheck made a bigger difference to dmd than to gdc. The corresponding gdc options are -frelease and -fno-bounds-check
> 
> Comparing performance without -release isn't that meaningful.

Alright. So to make the comparison fair(er), I recompiled test program
#1 (iterating 2-variable function on grid) with:

	dmd -O -inline -m64 -release -noboundscheck
	gdc -O3 -m64 -frelease -fno-bounds-check

Here are the new results for test program #1, using n=600:

	With DMD: 15 seconds (average of 4 runs)
	With GDC: 11 seconds (average of 4 runs)

There's still a 36% performance difference.

I did the same thing for test program #2 (terrain generation simulation), using seed=380170304, with wind & water simulation, and n=30 (I increased the iteration count to make measurement noise less prominent). Here are the new results:

	With DMD: 11 seconds (average of 4 runs)
	With GDC: 9 seconds (average of 4 runs)

So a gap of 22% is still present.
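
For anyone wanting to repeat this kind of "average of 4 runs" measurement from inside a program, here is a rough sketch. It assumes a recent Phobos that provides `std.datetime.stopwatch`, and the workload is a stand-in, since the actual test programs are not shown in this thread; `averageTime` is a hypothetical helper name:

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

/// Average wall-clock time of `runs` calls to `fn`.
Duration averageTime(void delegate() fn, int runs)
{
    auto sw = StopWatch(AutoStart.no);
    foreach (_; 0 .. runs)
    {
        sw.start();   // the stopwatch accumulates across start/stop pairs
        fn();
        sw.stop();
    }
    return sw.peek() / runs;
}

void main()
{
    // Stand-in workload: a simple floating-point loop.
    double sink = 0;
    void workload()
    {
        foreach (i; 0 .. 1_000_000)
            sink += i * 0.5;
    }

    auto avg = averageTime(&workload, 4);
    writefln("average over 4 runs: %s (sink=%s)", avg, sink);
}
```

Building the same source with each compiler and the flags listed above, then comparing the reported averages, gives numbers comparable in spirit to the ones here, though timing the whole executable externally works just as well.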

I'm running into a DMD bug for test program #3 (linker error when compiling with -release -O -noboundscheck -inline), so I don't have the test results for that yet. I'll try to figure out what's causing the linker error and post the results later.

In the meantime, it's clear that GDC is still showing significant performance improvement over DMD.  There is a _consistent_ 20-30% difference in performance in all of the tests so far. So I think at this point it's fair to say that GDC's back end produces superior code in terms of performance.  (I will note, though, that GDC produces larger executables than DMD, sometimes much larger, so space-wise, there is some price to pay.)


T

-- 
Chance favours the prepared mind. -- Louis Pasteur