Thread overview
multi threading in dmd
Oct 11, 2019
Robert Schadek
Oct 11, 2019
Jacob Carlborg
Oct 11, 2019
Sebastian Wilzbach
October 11, 2019
Compiling is IMHO getting painfully slow with growing projects.
One thing I'm working on is up to 30+ seconds, for 20k lines of somewhat heavy code.
But let's not argue whether or not I'm doing it wrong; for the sake of
the argument, let's assume compiling is slow.

One thing I see is that dub passes many files at once to dmd.
And dmd runs one thread on that input.

I think there is some opportunity to start multiple threads to do at least some of
the work in parallel.

1. Has anybody done any work on doing work in dmd with threads?

2. Am I correct that in theory dmd should be able to lex all passed files in
parallel (given enough cpu cores)?

3. Is it correct that currently one token is created at a time on request by the
parser?

4. This would currently require the classes Identifier and StringTable be made
thread safe.

5. AsyncRead in mars.d is dead code?

6. Is there any way to test all the different version statements and static if's
used for the same purpose?

7. Is there a chance to parse all the initially given files in parallel?

8. Any other ideas on how to do threading in dmd?
October 11, 2019
On 2019-10-11 14:49, Robert Schadek wrote:
> Compiling is IMHO getting painfully slow with growing projects.
> One thing I'm working on is up to 30+ seconds, for 20k lines of somewhat heavy code.
> But let's not argue whether or not I'm doing it wrong; for the sake of
> the argument, let's assume compiling is slow.
> 
> One thing I see is that dub passes many files at once to dmd.
> And dmd runs one thread on that input.
> 
> I think there is some opportunity to start multiple threads to do at least some of
> the work in parallel.
> 
> 1. Has anybody done any work on doing work in dmd with threads?
> 
> 2. Am I correct that in theory dmd should be able to lex all passed files in
> parallel (given enough cpu cores)?
> 
> 3. Is it correct that currently one token is created at a time on request by the
> parser?
> 
> 4. This would currently require the classes Identifier and StringTable be made
> thread safe.
> 
> 5. AsyncRead in mars.d is dead code?
> 
> 6. Is there any way to test all the different version statements and static if's
> used for the same purpose?
> 
> 7. Is there a chance to parse all the initially given files in parallel?
> 
> 8. Any other ideas on how to do threading in dmd?

In general, there are quite a lot of globals in DMD. There are five `__gshared` variables in the lexer alone, and more are referenced from the lexer and parser. There's also the issue of reporting diagnostics. That might need to be synchronized; otherwise parts of an error message from one file might be printed interleaved with parts from another file.
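
To make the interleaving concern concrete, here is a minimal sketch of one way to keep a multi-line diagnostic together when several threads report errors. It is not dmd's error-reporting code: `Diagnostic`, `render` and `report` are hypothetical names made up for illustration, and the sketch uses Phobos (`std.parallelism`, `std.stdio`) for brevity.

```d
import std.format : format;
import std.parallelism : parallel;
import std.stdio : stderr;

/// Hypothetical diagnostic record; dmd's real error reporting looks different.
struct Diagnostic
{
    string file;
    uint line;
    string message;
    string note; // supplemental line that belongs to the same error
}

/// Render the complete multi-line message up front, outside any lock.
string render(in Diagnostic d)
{
    return format("%s(%s): Error: %s\n%s(%s): note: %s\n",
                  d.file, d.line, d.message, d.file, d.line, d.note);
}

/// Emit one diagnostic so that its lines cannot be split by another report() call.
void report(in Diagnostic d)
{
    auto msg = render(d);
    synchronized // a single global lock guards this statement
    {
        stderr.write(msg);
    }
}

void main()
{
    auto diags = [
        Diagnostic("a.d", 3, "undefined identifier", "first use is here"),
        Diagnostic("b.d", 7, "cannot implicitly convert", "in this expression"),
        Diagnostic("c.d", 1, "module not found", "imported from here"),
    ];

    // Each element may be handled on a different worker thread.
    foreach (d; parallel(diags))
        report(d);
}
```

The order in which the three diagnostics appear is still nondeterministic; the lock only guarantees that each one is printed as a unit.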

Most of the time is spent doing semantic analysis. That will be even harder to do with multiple threads.

-- 
/Jacob Carlborg
October 11, 2019
On 11/10/2019 14.49, Robert Schadek via Dlang-internal wrote:
> Compiling is IMHO getting painfully slow with growing projects.
> One thing I'm working on is up to 30+ seconds, for 20k lines of somewhat
> heavy code.
> But let's not argue whether or not I'm doing it wrong; for the sake of
> the argument, let's assume compiling is slow.

Are you aware that the official release of dmd is built with dmd? Compiling dmd with LDC made it about 2x as fast in my last tests (this was without LTO or PGO, and with an older LLVM backend).

> One thing I see is that dub passes many files at once to dmd.

Dub has the same problem (built with dmd), but the semi-official
binaries that you can grab here are built with LDC:
https://github.com/dlang/dub/releases (and of course the ones shipped
with LDC).

> And dmd runs one thread on that input.
> 
> I think there is some opportunity to start multiple threads to do at
> least some of
> the work in parallel.

Yes, but I don't think lexing is an important part here. It's too cheap.

> 1. Has anybody done any work on doing work in dmd with threads?

https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/

> 2. Am I correct that in theory dmd should be able to lex all passed
> files in parallel (given enough cpu cores)?

Yes, but lexing is __very__ cheap. Your performance problems come from code with heavy template + CTFE usage and other expensive semantic checks. Benchmark before you optimize!

> 3. Is it correct that currently one token is created at a time on
> request by the parser?

The parser generally calls nextToken(), but it can also ask for more
e.g. with peekNext2() or peekPastParen(tk). Though note that the entire
file is loaded into one buffer
(https://github.com/dlang/dmd/blob/7c90cf18cf2ff8bea7eb9aa372b09fc4870efe9e/src/dmd/dmodule.d#L560).
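
As an illustration of that pull model, here is a toy lexer sketch in which the parser asks for one token at a time, can look ahead without consuming, and receives token text as slices of the single source buffer. `ToyLexer`, `Token`, `scan` and `peek` are hypothetical names; this is not dmd's `Lexer`, and its tokenization rules are deliberately simplistic.

```d
import std.ascii : isAlphaNum, isWhite;

/// Toy token: `text` is a slice of the source buffer, not a copy.
struct Token
{
    string text; // empty at end of input
}

/// Toy lexer: identifier/number runs and single punctuation characters.
struct ToyLexer
{
    string src;        // the whole file, loaded up front
    size_t pos;        // current scan position
    Token[] lookahead; // tokens scanned ahead but not yet consumed

    /// Scan exactly one token starting at `pos`.
    private Token scan()
    {
        while (pos < src.length && isWhite(src[pos]))
            ++pos;
        if (pos == src.length)
            return Token(null);
        const start = pos;
        if (isAlphaNum(src[pos]) || src[pos] == '_')
        {
            while (pos < src.length && (isAlphaNum(src[pos]) || src[pos] == '_'))
                ++pos;
        }
        else
            ++pos; // any other character is a one-character token
        return Token(src[start .. pos]);
    }

    /// Consume and return the next token (what the parser normally calls).
    Token nextToken()
    {
        if (lookahead.length)
        {
            auto t = lookahead[0];
            lookahead = lookahead[1 .. $];
            return t;
        }
        return scan();
    }

    /// Look `n` tokens ahead (n = 0 is the next token) without consuming.
    Token peek(size_t n)
    {
        while (lookahead.length <= n)
            lookahead ~= scan();
        return lookahead[n];
    }
}

void main()
{
    auto lx = ToyLexer("foo(bar) + 1");
    assert(lx.peek(1).text == "(");       // lookahead scans two tokens
    assert(lx.nextToken().text == "foo"); // but consumes none of them
    assert(lx.nextToken().text == "(");
    assert(lx.nextToken().text == "bar");
}
```

Nothing is tokenized until the parser asks for it, which is why "lex everything in parallel first" would be a change of structure and not just a matter of adding threads.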

> 4. This would currently require the classes Identifier and StringTable
> be made
> thread safe

Lexing doesn't touch Identifier or StringTable. It simply slices the
string from the fully allocated blob (see e.g.
https://github.com/dlang/dmd/blob/7c90cf18cf2ff8bea7eb9aa372b09fc4870efe9e/src/dmd/lexer.d#L1657)
and new allocations are malloc'ed and copied (see e.g.
https://github.com/dlang/dmd/blob/7c90cf18cf2ff8bea7eb9aa372b09fc4870efe9e/src/dmd/tokens.d#L736).

> 5. AsyncRead in mars.d is dead code?

Yes.

> 6. Is there any way to test all the different version statements and
> static if's
> used for the same purpose?

No.

> 7. Is there a chance to parse all the initially given files in parallel?

No. I think the Async changes were abandoned when it became apparent that the work/benefit ratio was poor.

> 8. Any other ideas on how to do threading in dmd?

Do not focus on lexing. Focus on CTFE + templates.
You want to do the following:
- cache (e.g. https://github.com/dlang/dmd/pull/7843)
  - even entire modules could be cached and loaded for subsequent runs
- be more lazy (i.e. DMD could be a lot more conservative)
- reduce DMD's memory consumption (there are still many low-hanging fruits)
  example:
https://github.com/dlang/dmd/pull/10396#issuecomment-531454363 or
https://github.com/dlang/dmd/pull/10427
- optimize DMD's CTFE + template code (there are still many low-hanging
fruits)
  - example: https://github.com/dlang/dmd/pull/10395,
https://github.com/dlang/dmd/pull/10394 or even things like
https://github.com/dlang/dmd/pull/10391
- focus on running semantics in parallel (hard, but should be easier
when working on independent modules)
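
Two of the ideas above, caching and running independent work in parallel, can be sketched together. This is only a sketch under stated assumptions: `analyze` is a hypothetical stand-in for expensive per-module work, `contentKey` and `analyzeAll` are made-up names, the cache lives in memory only, and it assumes modules really are independent, which is exactly the hard part in a real compiler.

```d
import std.digest : toHexString;
import std.digest.sha : sha256Of;
import std.parallelism : taskPool;

/// Cache key: hex SHA-256 of the module's source text.
string contentKey(string source)
{
    auto hex = toHexString(sha256Of(source)); // fixed-size char array
    return hex[].idup;
}

/// Hypothetical stand-in for expensive, independent per-module work.
size_t analyze(string source)
{
    size_t braces;
    foreach (c; source)
        if (c == '{')
            ++braces;
    return braces;
}

/// One result per source; only cache misses are analyzed, and those run in parallel.
size_t[] analyzeAll(string[] sources, ref size_t[string] cache)
{
    // Collect distinct misses on one thread: the cache itself is not thread safe.
    string[] misses;
    bool[string] queued;
    foreach (src; sources)
    {
        auto key = contentKey(src);
        if (key !in cache && key !in queued)
        {
            queued[key] = true;
            misses ~= src;
        }
    }

    if (misses.length)
    {
        // Independent work items; results come back in the same order as `misses`.
        auto fresh = taskPool.amap!analyze(misses);
        foreach (i, src; misses)
            cache[contentKey(src)] = fresh[i];
    }

    auto results = new size_t[sources.length];
    foreach (i, src; sources)
        results[i] = cache[contentKey(src)];
    return results;
}

void main()
{
    size_t[string] cache;
    auto first = analyzeAll(["a{}", "b{{}}", "a{}"], cache);
    assert(first.length == 3 && cache.length == 2); // duplicate analyzed once
    analyzeAll(["a{}"], cache);                     // served entirely from the cache
    assert(cache.length == 2);
}
```

Persisting the cache between compiler runs, as suggested for whole modules, would additionally need a stable on-disk format and a key that covers compiler flags and imported modules, not only the source text.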


Also, I recommend looking for the real culprits (dub comes with real overhead too) or for easy low-hanging fruits. For example, on Linux DMD could use mmapped files to speed up file reading.
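
For reference, a minimal sketch of reading a file through a memory mapping instead of copying it into a heap buffer. It uses Phobos' `std.mmfile` for brevity; as far as I know dmd itself does not depend on Phobos, so an actual change in the compiler would look different. `countNewlines` and the scratch file name are hypothetical, and the sketch assumes a non-empty file.

```d
import std.file : remove, tempDir, write;
import std.mmfile : MmFile;
import std.path : buildPath;
import std.stdio : writeln;

/// Count newline bytes by scanning the mapped file directly.
size_t countNewlines(string path)
{
    auto mm = new MmFile(path);   // read-only mapping of the whole file
    scope (exit) destroy(mm);     // unmap before returning
    auto bytes = cast(const(ubyte)[]) mm[];
    size_t n;
    foreach (b; bytes)
        if (b == '\n')
            ++n;
    return n;
}

void main()
{
    // Hypothetical scratch file, created only for this demonstration.
    auto path = buildPath(tempDir, "mmap_sketch_input.txt");
    write(path, "module a;\nint x;\n");
    scope (exit) remove(path);

    writeln(countNewlines(path));
}
```

Slices of the mapping must not outlive it, which is why the function returns a count and not the mapped bytes. Whether mapping actually beats a plain read for typical source files is something to measure, not assume.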