February 26, 2016
On 25/02/2016 03:48, Walter Bright wrote:
> On 2/24/2016 6:05 PM, Adam D. Ruppe wrote:
>> I've also heard from big users who want the performance more than
>> compile time
>> and hit difficulty in build scaling..
>
> I know that performance trumps all for many users. But we can have both
> - dmd and ldc/gdc.
>
> My point is that compile speed is a valuable and distinguishing feature
> of D. It's one that I have to constantly maintain, or it bit rots away.
> It's one that people regularly dismiss as unimportant. Sometimes it
> seems I'm the only one working on the compiler who cares about it.
>
> For comparison, C++ compiles like a pig, I've read that Rust compiles
> like a pig, and Go makes a lot of hay for compiling fast.

I think you are being very gentle with C++. It can be hell: when you work on multiple platforms with different compilers and build systems, it takes a lot of effort and time to keep compilation times at a decent level.
I recently had to optimize our build configuration at my day job after adding some Boost modules; our build time instantly doubled.
Every one of these optimizations carries a significant cost somewhere else:
- PIMPL: increases code complexity, decreases performance
- Precompiled headers: not standard; mingw is limited to a 130 MB generated file
- Unity builds: can be hard to add to many build systems if the unity files are auto-generated, and the compiler can run out of memory and crash (mingw will be the first to go)
- Cleaning up our includes: how do you do that without tools?
- Multi-threaded compilation: not standard, and sometimes it has to be configured per machine
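To illustrate the first trade-off, here is a minimal PIMPL sketch in C++ (the Widget class is made up): the public header only forward-declares the implementation, so client code stops recompiling when the internals change, at the cost of an extra allocation and an indirection on every access.

```cpp
#include <memory>
#include <string>

// --- widget.h: the public header only forward-declares Impl, so client
// code no longer recompiles when the implementation details change.
class Widget {
public:
    Widget();
    ~Widget();
    std::string name() const;
private:
    struct Impl;                  // forward declaration only
    std::unique_ptr<Impl> pimpl;  // extra allocation + indirection
};

// --- widget.cpp: the heavy includes and the real data live here.
struct Widget::Impl {
    std::string name = "widget";
};

Widget::Widget() : pimpl(std::make_unique<Impl>()) {}
Widget::~Widget() = default;  // defined where Impl is complete
std::string Widget::name() const { return pimpl->name; }
```

Note the destructor must be defined in the .cpp file, after Impl is complete, or unique_ptr cannot delete it.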

So thank you for having created a fast compiler, even if I can only dream of being able to use it professionally one day.
IMO, if Go has a fast compiler, it's only because dmd showed the way.

Is dmd multi-threaded?

PS: I don't understand why import modules aren't in C++ already; the clang team has been working on them for years.
February 25, 2016
On 2/25/2016 3:06 PM, David Nadlinger wrote:
> On Thursday, 25 February 2016 at 22:03:47 UTC, Walter Bright wrote:
>> DMD did slow down because it was now being compiled by DMD instead of g++.
> You can compile it using LDC just fine now. ;)

I think we should ask Martin to set that up for the release builds.

>> Also, dmd was doing multithreaded file I/O, but that was removed because speed
>> didn't matter <grrrr>.
> Did we ever have any numbers showing that this in particular produced a tangible
> performance benefit (even a single barnacle)?

On a machine with local disk and running nothing else, no speedup. With a slow filesystem, like an external, network, or cloud (!) drive, yes. I would also expect it to speed up when the machine is running a lot of other stuff.
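A rough sketch of the idea in C++ (hypothetical code, not dmd's actual implementation): each file gets its own thread, so the latencies of a slow filesystem overlap instead of accumulating.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Read every file on its own thread. On a slow (network/cloud) filesystem
// the I/O latencies overlap; on a fast local disk the threads mostly just
// add overhead, which is why this only pays off in some environments.
std::vector<std::string> readAll(const std::vector<std::string>& paths) {
    std::vector<std::string> contents(paths.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < paths.size(); ++i)
        workers.emplace_back([&contents, &paths, i] {
            std::ostringstream buf;
            buf << std::ifstream(paths[i]).rdbuf();  // slurp whole file
            contents[i] = buf.str();
        });
    for (auto& t : workers)
        t.join();
    return contents;
}
```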


> LDC doesn't do so either. I think what rsw0x referred to is doing a normal
> "C-style" parallel compilation of several compilation units. I'm not sure why
> this couldn't also be done with DMD, though.

-j should work just fine with dmd.

There's a lot inside the compiler that can be parallelized - just about everything except the semantic analysis.

February 25, 2016
On 2/25/2016 3:06 PM, H. S. Teoh via Digitalmars-d wrote:
> I remember you did a bunch of stuff to the optimizer after the
> switchover to self-hosting; how much of a difference did that make? Are
> there any low-hanging fruit left that could make dmd faster?

There's a lot of low hanging fruit in dmd. In particular, far too many templates are instantiated over and over.

The data structures need to be looked at, and the excessive memory consumption also slows things down.

> On a related note, I discovered an O(n^2) algorithm in the front-end...
> it's unlikely to be an actual bottleneck in practice (basically it's
> quadratic in the number of fields in an aggregate), though you never
> know. It actually does a full n^2 iterations, and seemed like it could
> be at least pared down to n(n+1)/2, even without doing better than
> O(n^2).

Please add a comment to the source code about this and put it in a PR.
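For reference, the n(n+1)/2 halving described above is just a matter of where the inner loop starts; a hypothetical sketch (not the actual front-end code):

```cpp
#include <cstddef>

// Hypothetical symmetric check over an aggregate's n fields. The naive
// version visits all n*n ordered pairs; when the check is symmetric,
// starting the inner loop at i visits only the n(n+1)/2 unordered pairs
// (each field paired with itself included). Still O(n^2), half the work.
std::size_t countPairVisits(std::size_t n) {
    std::size_t visits = 0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i; j < n; ++j)  // was: j = 0
            ++visits;
    return visits;
}
```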

February 26, 2016
On Fri, 26 Feb 2016 00:48:15 +0100, Xavier Bigand wrote:

> Is dmd multi-threaded?

Not at present.

It should be relatively easy to parallelize I/O and parsing, at least in theory. (I think the I/O parallelism was removed with the ddmd switch?) But you'd have to identify the files you need to read in advance, so it's not quite as straightforward as it sounds.

D's metaprogramming is too complex for a 100% solution for parallelizing
semantic analysis on a module level. But you could create a partial
solution:
* After parsing, look for unconditional imports. Skip static if/else
blocks, skip template bodies, but grab everything else.
* Make a module dependency graph from that.
* Map each module to a task.
* Merge dependency cycles into single tasks. You now have a DAG.
* While there are any tasks in the graph:
  - Find all leaf tasks in the graph.
  - Run semantic analysis on them in parallel.

When you encounter a conditional or mixed in import, you can insert it into the DAG if it's not already there, but it would be simpler just to run analysis right then and there.
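The scheduling loop above can be sketched roughly as follows (C++ used for illustration; the module names and dependency graph in the test are made up, and the "analyze in parallel" step is elided: batches are just collected in dependency order):

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// module -> set of modules it imports (cycles assumed already merged away)
using Graph = std::map<std::string, std::set<std::string>>;

// Repeatedly peel off every module with no unanalyzed imports (the leaf
// tasks); each returned batch could be semantically analyzed in parallel.
std::vector<std::vector<std::string>> leafBatches(Graph g) {
    std::vector<std::vector<std::string>> batches;
    while (!g.empty()) {
        std::vector<std::string> leaves;
        for (auto& [mod, deps] : g)
            if (deps.empty()) leaves.push_back(mod);
        if (leaves.empty()) break;  // a cycle survived: not a DAG
        for (auto& leaf : leaves) {
            g.erase(leaf);
            for (auto& [mod, deps] : g)
                deps.erase(leaf);  // leaf counts as analyzed now
        }
        batches.push_back(leaves);
    }
    return batches;
}
```

For a graph where b and c both import a, and d imports b and c, this yields the batches {a}, {b, c}, {d}.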

Alternatively, you can find regular and conditional imports and try to use them all. But this requires you to hold errors until you're certain that the module is used, and you end up doing more work overall. And that could be *tons* more work. Consider:

  module a;
  enum data = import("ten_million_records.csv");
  mixin(createClassesFromData(data));

  module b;
  enum shouldUseModuleA = false;

  module c;
  import b;
  static if (shouldUseModuleA) import a;

And even if you ignored that, you'd still have to deal with mixed in imports, which can be the result of arbitrarily complex CTFE expressions.

While all of this is straightforward in theory, it probably isn't so simple in practice.
February 26, 2016
On Friday, 26 February 2016 at 00:56:22 UTC, Walter Bright wrote:
> On 2/25/2016 3:06 PM, H. S. Teoh via Digitalmars-d wrote:
>> I remember you did a bunch of stuff to the optimizer after the
>> switchover to self-hosting; how much of a difference did that make? Are
>> there any low-hanging fruit left that could make dmd faster?
>
> There's a lot of low hanging fruit in dmd. In particular, far too many templates are instantiated over and over.

LOL. That would be an understatement. IIRC, at one point, Don figured out that we were instantiating _millions_ of templates for the std.algorithm unit tests. The number of templates used in template constraints alone is likely through the roof. Imagine how many times something like isInputRange!string gets compiled in your typical program. With how template-heavy range-based code is, almost anything we can do to speed up the compiler with regard to templates is likely to pay off.

- Jonathan M Davis

February 25, 2016
On 2/18/2016 1:30 PM, Jonathan M Davis wrote:
> It's not a strawman. Walter has stated previously that he's explicitly avoided
> looking at the source code for other compilers like gcc, because he doesn't want
> anyone to be able to accuse him of stealing code, copyright infringement, etc.
> Now, that's obviously much more of a risk with gcc than llvm given their
> respective licenses, but it is a position that Walter has taken when the issue
> has come up, and it's not something that I'm making up.
>
> Now, if Walter were willing to give up on the dmd backend entirely, then
> presumably, that wouldn't be a problem anymore regardless of license issues, but
> he still has dmc, which uses the same backend, so I very much doubt that that's
> going to happen.

It's still an issue I worry about. I've been (falsely) accused of stealing code in the past, even once accused of having stolen the old Datalight C compiler from some BYU students. Once a game company stole Empire, and then had the astonishing nerve to sic their lawyers on me accusing me of stealing it from them! (Showing them my registered copyright of the source code that predated their claim by 10 years was entertaining.)

More recently this came up in the Tango/Phobos rift, as some of the long term members here will recall.

So it is not an issue to be taken too lightly. I have the scars to prove it :-/

One thing I adore about github is it provides a legal audit trail of where the code came from. While that proves nothing about whether contributions are stolen or not, it provides a date stamp (like my registered copyright did), and if stolen code does make its way into the code base, it can be precisely excised. Github takes a great load off my mind.

There are other reasons to have dmd's back end. One obvious one is we wouldn't have had a Win64 port without it. And anytime we wish to experiment with something new in code generation, it's a helluva lot easier to do that with dmd than with the monumental code bases of gcc and llvm.

One thing that has changed a lot in my attitudes is I no longer worry about people stealing my code. If someone can make good use of my stuff, have at it. Boost license FTW!

I wish LLVM would switch to the Boost license, in particular removing this clause:

"Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimers in the documentation and/or other materials provided with the distribution."

Reading it adversarially means that if I write a simple utility and include a few lines from LLVM, I have to include that license in the binary and provide a means to print it out. If I include bits of code from several places, each with its own version of that license, there's just a bunch of crap to deal with to be in compliance.

February 26, 2016
On Friday, 26 February 2016 at 06:19:27 UTC, Walter Bright wrote:
> I wish LLVM would switch to the Boost license, in particular removing this clause:
>
> "Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimers in the documentation and/or other materials provided with the distribution."
>
> Reading it adversarially means that if I write a simple utility and include a few lines from LLVM, I have to include that license in the binary and provide a means to print it out. If I include bits of code from several places, each with its own version of that license, there's just a bunch of crap to deal with to be in compliance.

That's why I tend to encourage folks to use the Boost license rather than the BSD license when it comes up (LLVM isn't BSD-licensed, but its license is very similar). While source attribution makes sense, I just don't want to deal with binary attribution in anything I write. It does make some sense when you don't want someone to be able to claim that they didn't use your code (even if you're not looking to require that they open everything up like the GPL does), but for the most part, I just don't think that that's worth it - though it is kind of cool that some commercial stuff (like the PS4) is using BSD-licensed code, and we know it, because they're forced to give attribution with their binaries.

- Jonathan M Davis
February 26, 2016
On 25 Feb 2016 11:05 pm, "Walter Bright via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:
>
> On 2/25/2016 1:50 PM, Andrei Alexandrescu wrote:
>>
>> Good to know, thanks! -- Andrei
>
>
> DMD did slow down because it was now being compiled by DMD instead of
> g++. Also, dmd was doing multithreaded file I/O, but that was removed
> because speed didn't matter <grrrr>.
>

I thought that multithreaded I/O did not change anything, or even slowed compilation down in some cases?

I recall seeing a slight slowdown when I first tested it in gdc all those years ago, so I left it disabled - probably for the best, too.


February 26, 2016
On 2/26/2016 12:20 AM, Iain Buclaw via Digitalmars-d wrote:
> I thought that multithreaded I/O did not change anything, or slowed compilation
> down in some cases?
>
> Or I recall seeing a slight slowdown when I first tested it in gdc all those
> years ago.  So left it disabled - probably for the best too.


Running one test won't really give much useful information. I also wrote:

"On a machine with local disk and running nothing else, no speedup. With a slow filesystem, like an external, network, or cloud (!) drive, yes. I would also expect it to speed up when the machine is running a lot of other stuff."
February 26, 2016
On Thursday, 25 February 2016 at 23:48:15 UTC, Xavier Bigand wrote:
> IMO, if Go has a fast compiler, it's only because dmd showed the way.

Go was designed to compile fast because Google was looking for something faster than C++ for largish projects. The authors were also involved with Unix/Plan9 and have experience with creating languages and compilers for building operating systems...

Anyway, compilation speed isn't the primary concern these days when you look at how people pick their platform. People tend to go for languages/compilers that are convenient, generate good code, and support many platforms, and they resort to parallel builds when the project grows.

You can build a very fast compiler for a stable language with a simple type system like C's, one that doesn't even build an AST (the AST is implicit) and does code-gen on the fly. But it turns out people prefer sticking with GCC even when other C compilers have been 10-20x faster.