December 15, 2008 Re: Basic benchmark | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Mon, Dec 15, 2008 at 2:13 PM, Walter Bright <newshound1@digitalmars.com> wrote:
> Jason House wrote:
>>
>> I have already hit long division related speed issues in my D code. Sometimes simple things can dominate a benchmark, but those same simple things can dominate user code too!
>
> I completely agree, and I'm in the process of fixing the long division. My point was it has nothing to do with the code generator, and that drawing conclusions from a benchmark result can be tricky.
>
That was fast! http://www.dsource.org/projects/phobos/changeset/884
--bb
| |||
December 15, 2008 Re: LDC Windows exception handling | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christian Kamm | Christian Kamm wrote:
>> Speaking of LDC, any chance that the exception handling on Win32 gets
>> fixed in the near future?
>
> No, unfortunately.
>
> It's a problem with LLVM only supporting Dwarf2 exception handling. I'm
> pretty sure it'd work if we used ELF for the object files and GCC for
> linking, but Windows people tell me this is hardly acceptable.
>
> We won't get 'real' exceptions working on Windows until someone adds SEH
> support to LLVM.
>
> Volunteers?
>
It's in progress for GCC, so maybe that will help to get it into LLVM.
| |||
December 15, 2008 Re: LDC Windows exception handling | ||||
|---|---|---|---|---|
| ||||
Posted in reply to dsimcha | dsimcha wrote:
> == Quote from Christian Kamm (kamm-incasoftware@removethis.de)'s article
>>> Speaking of LDC, any chance that the exception handling on Win32 gets
>>> fixed in the near future?
>> No, unfortunately.
>> It's a problem with LLVM only supporting Dwarf2 exception handling. I'm
>> pretty sure it'd work if we used ELF for the object files and GCC for
>> linking, but Windows people tell me this is hardly acceptable.
>
> I think this solution is much better than nothing. I assume it would at least
> work ok on standalone-type projects.
>
Yeah... Also my thoughts...
Additionally, maybe there are third-party object file converters, and the work for "Windows people" could be done using them as a workaround?
BR
Marcin Kuszczak
(aarti_pl)
| |||
December 16, 2008 Re: Basic benchmark | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Jarrett Billingsley | Jarrett Billingsley wrote:
> On Sat, Dec 13, 2008 at 11:16 AM, Tomas Lindquist Olsen
> <tomas@famolsen.dk> wrote:
>> I tried this out with Tango + DMD 1.033, Tango + LDC r847 and GCC 4.3.2, my
>> timings are as follows, best of three:
>>
>> $ dmd bench.d -O -release -inline
>> long arith: 55630 ms
>> nested loop: 5090 ms
>>
>>
>> $ ldc bench.d -O3 -release -inline
>> long arith: 13870 ms
>> nested loop: 120 ms
>>
>>
>> $ gcc bench.c -O3 -s -fomit-frame-pointer
>> long arith: 13600 ms
>> nested loop: 170 ms
>>
>>
>> My cpu is: Athlon64 X2 3800+
>>
>
> Go LDC!
>
> I hope bearophile will eventually understand that DMD is not good at
> optimizing code, and so comparing its output to GCC's is ultimately
> meaningless.
I must have missed the memo. How is dmd not good at optimizing code? Without knowing many details about it, my understanding is that dmd performs common optimizations reasonably well and that this particular problem has to do with the long division routine.
Andrei
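Whether a slow 64-bit division routine dominates a result can be checked by timing division on its own. Below is a minimal sketch in C, assuming nothing about the actual bench.c from this thread: the function names, the constants and the iteration count are all made up for illustration. On a 32-bit x86 target a 64-bit divide is typically carried out by a runtime helper rather than a single instruction, which is why the library routine can matter more than the code generator here.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical workload: sums the quotients of `iterations` 64-bit divisions.
   The divisor is read through a volatile so the compiler cannot replace the
   division with a multiply-by-constant or fold the loop away. */
int64_t divide_loop(int64_t start, int64_t divisor, int iterations)
{
    assert(divisor != 0);
    volatile int64_t d = divisor;
    int64_t acc = 0;
    for (int i = 0; i < iterations; ++i)
        acc += (start + i) / d;
    return acc;
}

/* Runs the workload once and returns the CPU time spent in milliseconds.
   The checksum is handed back so the call cannot be optimized out. */
double time_division_ms(int iterations, int64_t *result_out)
{
    clock_t t0 = clock();
    *result_out = divide_loop(1000000007LL, 7, iterations);
    clock_t t1 = clock();
    return 1000.0 * (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling `time_division_ms(10000000, &r)` from `main` and printing the returned value gives a figure that is only comparable across compilers when the hardware and the optimization flags are kept the same, and "best of three" style repetition is still advisable.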
| |||
December 16, 2008 Re: Basic benchmark | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Tue, Dec 16, 2008 at 11:09 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Jarrett Billingsley wrote:
>>
>> On Sat, Dec 13, 2008 at 11:16 AM, Tomas Lindquist Olsen <tomas@famolsen.dk> wrote:
>>>
>>> I tried this out with Tango + DMD 1.033, Tango + LDC r847 and GCC 4.3.2, my timings are as follows, best of three:
>>>
>>> $ dmd bench.d -O -release -inline
>>> long arith: 55630 ms
>>> nested loop: 5090 ms
>>>
>>>
>>> $ ldc bench.d -O3 -release -inline
>>> long arith: 13870 ms
>>> nested loop: 120 ms
>>>
>>>
>>> $ gcc bench.c -O3 -s -fomit-frame-pointer
>>> long arith: 13600 ms
>>> nested loop: 170 ms
>>>
>>>
>>> My cpu is: Athlon64 X2 3800+
>>>
>>
>> Go LDC!
>>
>> I hope bearophile will eventually understand that DMD is not good at optimizing code, and so comparing its output to GCC's is ultimately meaningless.
>
> I must have missed the memo. How is dmd not good at optimizing code? Without knowing many details about it, my understanding is that dmd performs common optimization reasonably well and that this particular problem has to do with the long division routine.
It's pretty well proven that for floating point code, DMD tends to generate code about 50% slower than GCC.
--bb
| |||
December 16, 2008 Re: Basic benchmark | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Bill Baxter | On Tue, 16 Dec 2008 05:28:16 +0300, Bill Baxter <wbaxter@gmail.com> wrote:
> On Tue, Dec 16, 2008 at 11:09 AM, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>> Jarrett Billingsley wrote:
>>>
>>> On Sat, Dec 13, 2008 at 11:16 AM, Tomas Lindquist Olsen
>>> <tomas@famolsen.dk> wrote:
>>>>
>>>> I tried this out with Tango + DMD 1.033, Tango + LDC r847 and GCC 4.3.2,
>>>> my
>>>> timings are as follows, best of three:
>>>>
>>>> $ dmd bench.d -O -release -inline
>>>> long arith: 55630 ms
>>>> nested loop: 5090 ms
>>>>
>>>>
>>>> $ ldc bench.d -O3 -release -inline
>>>> long arith: 13870 ms
>>>> nested loop: 120 ms
>>>>
>>>>
>>>> $ gcc bench.c -O3 -s -fomit-frame-pointer
>>>> long arith: 13600 ms
>>>> nested loop: 170 ms
>>>>
>>>>
>>>> My cpu is: Athlon64 X2 3800+
>>>>
>>>
>>> Go LDC!
>>>
>>> I hope bearophile will eventually understand that DMD is not good at
>>> optimizing code, and so comparing its output to GCC's is ultimately
>>> meaningless.
>>
>> I must have missed the memo. How is dmd not good at optimizing code? Without
>> knowing many details about it, my understanding is that dmd performs common
>> optimization reasonably well and that this particular problem has to do with
>> the long division routine.
>
> It's pretty well proven that for floating point code, DMD tends to
> generate code about 50% slower than GCC.
>
> --bb
But other than that it is pretty good.
And man, it is so fast!
| |||
December 16, 2008 Re: Basic benchmark | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Denis Koroskin | On Tue, Dec 16, 2008 at 12:00 PM, Denis Koroskin <2korden@gmail.com> wrote:
> On Tue, 16 Dec 2008 05:28:16 +0300, Bill Baxter <wbaxter@gmail.com> wrote:
>
>> On Tue, Dec 16, 2008 at 11:09 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>>
>>> Jarrett Billingsley wrote:
>>>>
>>>> On Sat, Dec 13, 2008 at 11:16 AM, Tomas Lindquist Olsen <tomas@famolsen.dk> wrote:
>>>>>
>>>>> I tried this out with Tango + DMD 1.033, Tango + LDC r847 and GCC 4.3.2, my timings are as follows, best of three:
>>>>>
>>>>> $ dmd bench.d -O -release -inline
>>>>> long arith: 55630 ms
>>>>> nested loop: 5090 ms
>>>>>
>>>>>
>>>>> $ ldc bench.d -O3 -release -inline
>>>>> long arith: 13870 ms
>>>>> nested loop: 120 ms
>>>>>
>>>>>
>>>>> $ gcc bench.c -O3 -s -fomit-frame-pointer
>>>>> long arith: 13600 ms
>>>>> nested loop: 170 ms
>>>>>
>>>>>
>>>>> My cpu is: Athlon64 X2 3800+
>>>>>
>>>>
>>>> Go LDC!
>>>>
>>>> I hope bearophile will eventually understand that DMD is not good at optimizing code, and so comparing its output to GCC's is ultimately meaningless.
>>>
>>> I must have missed the memo. How is dmd not good at optimizing code? Without knowing many details about it, my understanding is that dmd performs common optimization reasonably well and that this particular problem has to do with the long division routine.
>>
>> It's pretty well proven that for floating point code, DMD tends to generate code about 50% slower than GCC.
>>
>> --bb
>
> But other than that it is pretty good.
Yep, it's more than 100x faster than a straightforward Python port of similar code, for instance. (I did some benchmarking using a D port of the Laplace solver here http://www.scipy.org/PerformancePython -- I think bearophile did these comparisons again himself more recently, too). There I saw DMD about 50% slower than g++. But I've seen figures in the neighborhood of 50% come up a few times since then in other float-intensive benchmarks, like the raytracer that someone ported from C++.
So it is certainly fast. But one of the draws of D is precisely that, that it is fast. If you're after code that runs as fast as possible, 50% slower than the competition is plenty of justification to go look elsewhere for your high-performance language. A 50% hit may not really be relevant at the end of the day, but I know I used to avoid g++ like the plague because even its output isn't that fast compared to MSVC++ or Intel's compiler, even though the difference is maybe only 10% or so. I was working on interactive fluid simulation, so I wanted every bit of speed I could get out of the processor. With interactive stuff, a 10% difference really can matter, I think.
> And man, it is so fast!
You mean compile times?
--bb
| |||
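To put the "50% slower" figure next to the timings quoted earlier in this thread, the ratios can be worked out directly. This is only a sketch of the arithmetic, assuming "N% slower" is read as taking (1 + N/100) times as long as the baseline; the helper names are made up for illustration, and the numbers are the ones Tomas posted, not new measurements.

```c
#include <assert.h>
#include <stdio.h>

/* How much longer `elapsed_ms` is than `baseline_ms`, in percent.
   Positive means slower than the baseline, negative means faster. */
double percent_slower(double elapsed_ms, double baseline_ms)
{
    assert(baseline_ms > 0.0);
    return (elapsed_ms / baseline_ms - 1.0) * 100.0;
}

/* Prints one comparison line against the GCC baseline. */
void report(const char *label, double elapsed_ms, double gcc_ms)
{
    printf("%-16s %6.2fx the GCC time (%+.0f%%)\n",
           label, elapsed_ms / gcc_ms, percent_slower(elapsed_ms, gcc_ms));
}

/* Timings quoted in the thread (best of three, Athlon64 X2 3800+). */
void report_thread_timings(void)
{
    report("dmd long arith",  55630.0, 13600.0);
    report("dmd nested loop",  5090.0,   170.0);
    report("ldc long arith",  13870.0, 13600.0);
    report("ldc nested loop",   120.0,   170.0);
}
```

Under that reading, the DMD rows in this particular benchmark come out far beyond a 50% gap, which fits the point made earlier in the thread that a single routine (long division) can dominate a result, while the LDC rows land close to GCC on one test and ahead of it on the other.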
December 16, 2008 Re: Basic benchmark | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Bill Baxter | On Tue, 16 Dec 2008 06:23:14 +0300, Bill Baxter <wbaxter@gmail.com> wrote:
> On Tue, Dec 16, 2008 at 12:00 PM, Denis Koroskin <2korden@gmail.com> wrote:
>> On Tue, 16 Dec 2008 05:28:16 +0300, Bill Baxter <wbaxter@gmail.com> wrote:
>>
>>> On Tue, Dec 16, 2008 at 11:09 AM, Andrei Alexandrescu
>>> <SeeWebsiteForEmail@erdani.org> wrote:
>>>>
>>>> Jarrett Billingsley wrote:
>>>>>
>>>>> On Sat, Dec 13, 2008 at 11:16 AM, Tomas Lindquist Olsen
>>>>> <tomas@famolsen.dk> wrote:
>>>>>>
>>>>>> I tried this out with Tango + DMD 1.033, Tango + LDC r847 and GCC
>>>>>> 4.3.2,
>>>>>> my
>>>>>> timings are as follows, best of three:
>>>>>>
>>>>>> $ dmd bench.d -O -release -inline
>>>>>> long arith: 55630 ms
>>>>>> nested loop: 5090 ms
>>>>>>
>>>>>>
>>>>>> $ ldc bench.d -O3 -release -inline
>>>>>> long arith: 13870 ms
>>>>>> nested loop: 120 ms
>>>>>>
>>>>>>
>>>>>> $ gcc bench.c -O3 -s -fomit-frame-pointer
>>>>>> long arith: 13600 ms
>>>>>> nested loop: 170 ms
>>>>>>
>>>>>>
>>>>>> My cpu is: Athlon64 X2 3800+
>>>>>>
>>>>>
>>>>> Go LDC!
>>>>>
>>>>> I hope bearophile will eventually understand that DMD is not good at
>>>>> optimizing code, and so comparing its output to GCC's is ultimately
>>>>> meaningless.
>>>>
>>>> I must have missed the memo. How is dmd not good at optimizing code?
>>>> Without
>>>> knowing many details about it, my understanding is that dmd performs
>>>> common
>>>> optimization reasonably well and that this particular problem has to do
>>>> with
>>>> the long division routine.
>>>
>>> It's pretty well proven that for floating point code, DMD tends to
>>> generate code about 50% slower than GCC.
>>>
>>> --bb
>>
>> But other than that it is pretty good.
>
> Yep, it's more than 100x faster than a straightforward Python port of
> similar code, for instance. (I did some benchmarking using a D port
> of the Laplace solver here http://www.scipy.org/PerformancePython --
> I think bearophile did these comparisons again himself more recently,
> too). There I saw DMD about 50% slower than g++. But I've seen
> figures in the neighborhood of 50% come up a few times since then in
> other float-intensive benchmarks, like the raytracer that someone
> ported from c++.
>
> So it is certainly fast. But one of the draws of D is precisely that,
> that it is fast. If you're after code that runs as fast as possible,
> 50% slower than the competition is plenty of justification to go look
> elsewhere for your high-performance language. A 50% hit may not
> really be relevant at the end of the day, but I know I used to avoid
> g++ like the plague because even its output isn't that fast compared
> to MSVC++ or Intel's compiler, even though the difference is maybe
> only 10% or so. I was working on interactive fluid simulation, so I
> wanted every bit of speed I could get out of the processor. With
> interactive stuff, a 10% difference really can matter, I think.
>
>> And man, it is so fast!
>
> You mean compile times?
>
> --bb
Yeah.
| |||
December 16, 2008 Re: LDC Windows exception handling | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Aarti_pl | Aarti_pl wrote:
> dsimcha wrote:
>> == Quote from Christian Kamm (kamm-incasoftware@removethis.de)'s article
>>>> Speaking of LDC, any chance that the exception handling on Win32 gets
>>>> fixed in the near future?
>>> No, unfortunately.
>>> It's a problem with LLVM only supporting Dwarf2 exception handling. I'm
>>> pretty sure it'd work if we used ELF for the object files and GCC for
>>> linking, but Windows people tell me this is hardly acceptable.
>>
>> I think this solution is much better than nothing. I assume it would at least
>> work ok on standalone-type projects.
>>
>
> Yeah... Also my thoughts...
>
> Additionally maybe there are 3rd party object files converters, and "Windows people" work could be done using them as workaround?
>
> BR
> Marcin Kuszczak
> (aarti_pl)
I found such a converter (GPL-licensed): http://agner.org/optimize/#objconv
Can anyone comment on whether such a workaround would solve the initial problem (at least temporarily)?
If the answer is yes, can we then expect exception handling to be added for LDC on Windows? :-)
BR
Marcin Kuszczak
(aarti_pl)
| |||
December 16, 2008 Re: Basic benchmark | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Bill Baxter | Bill Baxter wrote:
> Anyway, all that said, it's not clear that we really do have that
> mythical "uber backend" available right now.
>
> According to my conversations on the clang mailing list, the current
> target is for LLVM to be able to fully support a C++ compiler by 2010.
> I'm not quite sure what all that involves, but apparently it includes
> things like making exceptions work on Windows.
I wonder if there's any chance of getting a LLVM D compiler working before the LLVM C++ compiler works? <g>
| |||
Copyright © 1999-2021 by the D Language Foundation