August 18, 2015
On 8/18/2015 2:25 PM, Vladimir Panteleev wrote:
> I would like to add that fixing the regression does not make it go away. Even
> though it's fixed in git, and even after the fix ships with a new DMD release,
> there is still a D version out there that has the bug, and that will never
> change until the end of time.

Not necessarily. The reason we split off a new branch of dmd with each release is so that we can patch it if necessary.


> Fixing regressions is not enough. We need to try harder to prevent them from
> ending up in DMD releases at all.

I agree, but stopping development isn't much of a solution.

August 18, 2015
On 8/18/2015 2:26 PM, rsw0x wrote:
> if you want to make D fast - Fix the interface between the compiler and the
> runtime (including the inability for compilers to inline simple things like
> allocations which makes allocations have massive overheads.) Then, fix the GC.
> Make the GC both shared and immutable aware, then moving the GC to a thread
> local "island"-style GC would be fairly easy.

The fundamental issue of island GCs is what to do with casting of data from one island to another.


> Maybe you should take a look at what Go has recently done with their GC to get
> an idea of what D's competition has been up to.
> https://talks.golang.org/2015/go-gc.pdf

"you"? There's a whole community here, we're all in this together. Pull requests are welcome.
August 18, 2015
> >On Tuesday, 18 August 2015 at 10:45:49 UTC, Walter Bright wrote:
> >>Martin ran some benchmarks recently that showed that ddmd compiled with dmd was about 30% slower than when compiled with gdc/ldc. This seems to be fairly typical.
[...]

This matches my experience of dmd vs. gdc as well. No surprise there.


> >>I'm interested in ways to reduce that gap.
[...]

Replace the backend with GDC or LLVM? :-P


T

-- 
Prosperity breeds contempt, and poverty breeds consent. -- Suck.com
August 18, 2015
On Tuesday, 18 August 2015 at 21:26:43 UTC, rsw0x wrote:
> On Tuesday, 18 August 2015 at 21:18:34 UTC, rsw0x wrote:
>> On Tuesday, 18 August 2015 at 10:45:49 UTC, Walter Bright wrote:
>>> Martin ran some benchmarks recently that showed that ddmd compiled with dmd was about 30% slower than when compiled with gdc/ldc. This seems to be fairly typical.
>>>
>>> I'm interested in ways to reduce that gap.
>>
>> retire dmd?
>> this is ridiculous.
>
> To further expand upon this,
> if you want to make D fast - Fix the interface between the compiler and the runtime (including the inability for compilers to inline simple things like allocations which makes allocations have massive overheads.) Then, fix the GC. Make the GC both shared and immutable aware, then moving the GC to a thread local "island"-style GC would be fairly easy. D's GC is probably the slowest GC of any major language available, and the entire thing is wrapped in mutexes.
>

I've been working on that for a while. It is definitely the right direction for D IMO, but it is far from being "fairly easy".

> D has far, far bigger performance problems than dmd's backend.
>
> Maybe you should take a look at what Go has recently done with their GC to get an idea of what D's competition has been up to. https://talks.golang.org/2015/go-gc.pdf


August 18, 2015
On 8/18/2015 1:33 PM, Jacob Carlborg wrote:
> There's profile guided optimization, which LLVM supports.

dmd does have that to some extent. If you run with -profile, the profiler will emit a trace.def file. This is a script which can be fed to the linker which controls the layout of functions in the executable. The layout is organized so that strongly connected functions reside in the same page, minimizing swapping and maximizing cache hits.
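As a rough illustration of the idea (not dmd's actual algorithm, and not the trace.def format), profile-driven layout can be sketched as a greedy clustering: take caller/callee call counts from a profile and merge the hottest pairs first so they end up adjacent in the final order. The function name `layout_functions` and the sample counts below are made up for the example.

```python
def layout_functions(call_counts):
    """Greedy layout sketch.

    call_counts maps (caller, callee) -> number of calls seen in a profile.
    Returns a function order in which hot caller/callee pairs sit together.
    This is an illustrative heuristic, not what dmd or any linker does.
    """
    # Every function starts in its own single-element cluster.
    cluster_of = {}
    for caller, callee in call_counts:
        for fn in (caller, callee):
            cluster_of.setdefault(fn, [fn])

    # Merge clusters along the hottest edges first.
    for (caller, callee), _count in sorted(
        call_counts.items(), key=lambda kv: -kv[1]
    ):
        a, b = cluster_of[caller], cluster_of[callee]
        if a is b:
            continue  # already placed together
        a.extend(b)
        for fn in b:
            cluster_of[fn] = a

    # Emit each remaining cluster once, in first-seen order.
    seen, order = set(), []
    for cluster in cluster_of.values():
        if id(cluster) not in seen:
            seen.add(id(cluster))
            order.extend(cluster)
    return order


# Hypothetical profile data, invented for this example.
sample_counts = {
    ("main", "parse"): 10,
    ("parse", "lex"): 900,
    ("main", "report"): 1,
    ("lex", "peek"): 500,
}
order = layout_functions(sample_counts)
```

With these sample counts, `parse`, `lex` and `peek` end up next to each other, and the rarely called `report` is placed last.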

Unfortunately, nobody makes use of it, which makes me reluctant to expend further effort on PGO.

  http://www.digitalmars.com/ctg/trace.html

I wonder how many people actually use the LLVM profile guided optimizations. I suspect very, very few.

August 18, 2015
On Tuesday, 18 August 2015 at 21:36:39 UTC, Walter Bright wrote:
> On 8/18/2015 2:26 PM, rsw0x wrote:
>> if you want to make D fast - Fix the interface between the compiler and the
>> runtime (including the inability for compilers to inline simple things like
>> allocations which makes allocations have massive overheads.) Then, fix the GC.
>> Make the GC both shared and immutable aware, then moving the GC to a thread
>> local "island"-style GC would be fairly easy.
>
> The fundamental issue of island GCs is what to do with casting of data from one island to another.

If you want D to have a GC, you have to design the language around having a GC. Right now, D could be likened to using C++ with Boehm. Something needs to be done with shared to fix this problem, but everything I could suggest would probably be deemed entirely too big of a change (e.g., making casting to/from shared undefined, and putting methods in the GC API to explicitly move memory between heaps).

>
>
>> Maybe you should take a look at what Go has recently done with their GC to get
>> an idea of what D's competition has been up to.
>> https://talks.golang.org/2015/go-gc.pdf
>
> "you"? There's a whole community here, we're all in this together. Pull requests are welcome.

How many people here do you think know the intricacies of dmd as well as you do? At most a handful, and I'm certainly not one of them.
August 18, 2015
On Tuesday, 18 August 2015 at 21:41:26 UTC, deadalnix wrote:
> On Tuesday, 18 August 2015 at 21:26:43 UTC, rsw0x wrote:
>> On Tuesday, 18 August 2015 at 21:18:34 UTC, rsw0x wrote:
>>> On Tuesday, 18 August 2015 at 10:45:49 UTC, Walter Bright wrote:
>>>> Martin ran some benchmarks recently that showed that ddmd compiled with dmd was about 30% slower than when compiled with gdc/ldc. This seems to be fairly typical.
>>>>
>>>> I'm interested in ways to reduce that gap.
>>>
>>> retire dmd?
>>> this is ridiculous.
>>
>> To further expand upon this,
>> if you want to make D fast - Fix the interface between the compiler and the runtime (including the inability for compilers to inline simple things like allocations which makes allocations have massive overheads.) Then, fix the GC. Make the GC both shared and immutable aware, then moving the GC to a thread local "island"-style GC would be fairly easy. D's GC is probably the slowest GC of any major language available, and the entire thing is wrapped in mutexes.
>>
>
> I've been working on that for a while. It is definitely the right direction for D IMO, but it is far from being "fairly easy".
>
>> D has far, far bigger performance problems than dmd's backend.
>>
>> Maybe you should take a look at what Go has recently done with their GC to get an idea of what D's competition has been up to. https://talks.golang.org/2015/go-gc.pdf

I used 'fairly easy' in the 'the implementation is left to the reader' sort of way ;)

But yes, we discussed this on Twitter and Walter confirmed what I thought would be a huge issue with this. Shared needs to be changed for a GC like that to be implemented.
D's current GC could see improvements, but it will never ever catch up to the GC of any other major language without changes to the language itself.
August 18, 2015
On Tuesday, 18 August 2015 at 21:31:17 UTC, Walter Bright wrote:
> On 8/18/2015 1:24 PM, Vladimir Panteleev wrote:
>> The specific bugs in question have
>> been fixed, but that doesn't change the general problem.
>
> The reason we have regression tests is to make sure things that are fixed stay fixed. Codegen bugs also always had the highest priority.

It doesn't matter. Regression tests protect against the same bugs reappearing, not new bugs. I'm talking about the general pattern: optimization PR? Regression a few months later.

> Being paralyzed by fear of introducing new bugs is not a way forward with any project.

When the risk outweighs the gain, what's the point of moving forward?

> (Switching to ddmd, and eventually putting the back end in D, will also help with this. DMC++ is always built with any changes and tested to exactly duplicate itself, and that filters out a lot of problems. Unfortunately, DMC++ is a 32 bit program and doesn't exercise the 64 bit code gen. Again, ddmd will fix that.)

I don't see how switching to D is going to magically reduce the number of regressions.
August 18, 2015
On Tuesday, 18 August 2015 at 21:45:42 UTC, rsw0x wrote:
> If you want D to have a GC, you have to design the language around having a GC. Right now, D could be likened to using C++ with Boehm.

The irony is that most GC-related complaints are the exact opposite - that the language depends too much on the GC.
August 18, 2015
On 8/18/2015 2:33 PM, deadalnix wrote:
> There is none. There are a ton of 0.5% ones that add up to the 30% difference.

I regard a simple pattern that nets 0.5% as quite a worthwhile win. That's only 60 of those to make up the difference.

If you've got any that you know of that would net 0.5% for dmd, lay it on me!
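
The "60 of those" figure is the simple additive count (30 / 0.5). As a quick sanity check, here is a small sketch that also computes the count if each 0.5% win compounds on top of the previous ones. The helper name `wins_needed` is invented for this example, and whether real optimizations add or compound is an assumption either way.

```python
import math


def wins_needed(gap_percent, gain_percent, compounding=False):
    """How many wins of gain_percent each are needed to cover gap_percent.

    Additive: the gains simply sum.
    Compounding: each gain multiplies the result of the previous ones.
    """
    if compounding:
        return math.ceil(
            math.log(1 + gap_percent / 100) / math.log(1 + gain_percent / 100)
        )
    return math.ceil(gap_percent / gain_percent)


additive = wins_needed(30, 0.5)                       # 60
compounded = wins_needed(30, 0.5, compounding=True)   # 53
```

Either way the count is in the same ballpark: a few dozen small wins.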


> If I were to bet on what would impact DMD perfs the most, I'd go for SRAO, and
> an inliner in the middle end that works bottom up:
>   - Explore the call graph top-down, optimizing functions along the way
>   - Backtrack bottom-up and check for inlining opportunities.
>   - Rerun optimizations on the function inlining was done in.

That's how the inliner already works. The data flow analysis optimizer also runs repeatedly as each optimization exposes more possibilities. The register allocator also runs repeatedly.
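
For readers unfamiliar with the scheme in the quoted list, here is a toy sketch of a bottom-up inlining pass over a call graph with a cost threshold. It is an illustration only and not dmd's inliner: the function name `bottom_up_inline`, the cost numbers and the threshold are all made up, and "cost" here is just an abstract size estimate.

```python
def bottom_up_inline(call_graph, cost, threshold):
    """Toy bottom-up inliner.

    call_graph: dict mapping each function to the list of functions it calls.
    cost: dict mapping every function to an abstract size estimate.
    threshold: callees at or below this size are considered inlinable.
    Returns (inlined, size): which callees were inlined into each function,
    and the resulting size estimate of each function.
    """
    # Walk the call graph from the roots down, recording functions in
    # post-order so that callees are handled before their callers.
    visited, order = set(), []

    def visit(fn):
        if fn in visited:
            return  # also guards against recursion cycles
        visited.add(fn)
        for callee in call_graph.get(fn, []):
            visit(callee)
        order.append(fn)

    for fn in call_graph:
        visit(fn)

    inlined = {fn: [] for fn in order}
    size = dict(cost)
    for fn in order:
        for callee in call_graph.get(fn, []):
            # The callee's size already includes anything inlined into it.
            if callee != fn and size[callee] <= threshold:
                inlined[fn].append(callee)
                size[fn] += size[callee]
        # A real compiler would rerun its optimizations on fn at this point.
    return inlined, size


# Hypothetical call graph and costs, invented for this example.
graph = {"main": ["parse", "emit"], "parse": ["peek"], "emit": [], "peek": []}
costs = {"main": 50, "parse": 20, "emit": 40, "peek": 3}
inlined, size = bottom_up_inline(graph, costs, threshold=25)
```

In this example `peek` is inlined into `parse` first, and the grown `parse` (size 23) still fits under the threshold when `main` is considered, while `emit` does not.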

(I am unfamiliar with the term SRAO. Google comes up with nothing for it.)


> It requires a fair amount of tweaking and probably needs a way for the backends to
> provide a cost heuristic for various functions,

A cost heuristic is already used for the inliner (it's in the front end, in 'inline.d'). A cost heuristic is also used for the register allocator.