November 21
https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/

November 21
On Wednesday, 21 November 2018 at 08:07:52 UTC, Vladimir Panteleev wrote:
> https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/

You gave me a fright with that title for a moment. Awesome stuff though. Not sure how easy it will be to upstream, considering this needs to not wreck Windows and needs to work with LDC/GDC (at least we have inlining in the backend).
November 21
On Wednesday, 21 November 2018 at 08:32:39 UTC, Nicholas Wilson wrote:
> You gave me a fright there with the title there for a moment.

:)

> Awesome stuff though. Not sure how easy it will be to upstream considering this needs to not wreck Windows and needs to work with LDC/GDC (at least we have inlining in the backend).

All the DMD-side logic is encapsulated in one function:

https://github.com/CyberShadow/dmd/blob/dmdforker/src/dmd/mars.d#L501-L673

Its body can be versioned out on incompatible platforms/implementations.

November 21
On Wednesday, 21 November 2018 at 08:07:52 UTC, Vladimir Panteleev wrote:
> https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/

Not only an interesting read, but also interesting research!
November 21
On Wednesday, 21 November 2018 at 08:07:52 UTC, Vladimir Panteleev wrote:
> https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/

You might want to brush up on which direction C++ modules are heading. Notable talks are those given at the GNU Cauldron in both 2017 and 2018. The general run-down, as I understand it:

===
Problem to solve: Compiler asks an Oracle about module A.

Phrased this way, Compiler is a client, Oracle is a server.

Oracle could be a file, socket, remote server, anything that can be read from or written to.

Communication can be done via a standard format (such as JSON).

This means that the implementation of the Oracle, which keeps track of compilation and the build's dependencies, is now someone else's problem as far as the Compiler is concerned.
===
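That client/server split could be sketched as a tiny JSON exchange. Everything below is hypothetical (the method name, the `.bmi` path, and the reply shape are invented for illustration, not from any actual proposal), written in D with `std.json`:

```d
import std.json;

// Hypothetical request the Compiler (client) sends the Oracle (server)
// over whatever transport is available -- file, pipe, or socket --
// asking where to find the interface for module A.
JSONValue moduleRequest(string moduleName)
{
    return JSONValue(["method": "module-lookup", "module": moduleName]);
}

// Hypothetical reply: where the module's compiled interface lives, plus
// the dependencies the build tracker already knows about.
JSONValue moduleReply(string path, string[] deps)
{
    auto rep = JSONValue(["path": path]);
    rep["deps"] = JSONValue(deps);
    return rep;
}
```

The point of the design is exactly what the run-down says: the Compiler only speaks this protocol, and whoever answers it owns the dependency tracking.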

I think what you've already started would fit well into this.

Iain.
November 21
On 11/21/2018 12:07 AM, Vladimir Panteleev wrote:
> https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/ 

I implemented precompiled headers for Digital Mars C++. It took a loooong time to work all the bugs out of it. It's also a brittle system. It works by allocating memory from a memory-mapped file, which serves as the precompiled header.

November 21
On Wednesday, 21 November 2018 at 09:46:44 UTC, Walter Bright wrote:
> It works by allocating memory from a memory-mapped file, which serves as the precompiled header.

Hey, that's a great idea! Can we do this for DMD? :D

On a more serious note: do you think that with D's features (type system / metaprogramming), you could have avoided some of those bugs?

For example, one thing we can do in D which is still impossible in C++ is to automatically serialize/deserialize all fields of a struct/class (using tupleof / allMembers).
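A minimal sketch of that idiom, iterating `.tupleof` and pulling each field's name with `__traits(identifier, ...)` (the `Config` struct and `serialize` helper are made-up examples, not DMD code):

```d
// Example struct to serialize; any plain struct works the same way.
struct Config { int jobs; bool verbose; string output; }

// Walk every field of a struct at compile time and emit "name=value;"
// pairs. The foreach over .tupleof is unrolled during compilation.
string serialize(T)(T value)
{
    import std.conv : to;
    string result;
    foreach (i, field; value.tupleof)
        result ~= __traits(identifier, T.tupleof[i]) ~ "="
                ~ field.to!string ~ ";";
    return result;
}
```

Deserialization is the mirror image: the same unrolled loop assigns into `value.tupleof[i]` instead of reading from it.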

November 21
On 11/21/2018 2:16 AM, Vladimir Panteleev wrote:
> On Wednesday, 21 November 2018 at 09:46:44 UTC, Walter Bright wrote:
>> It works by allocating memory from a memory-mapped file, which serves as the precompiled header.
> 
> Hey, that's a great idea! Can we do this for DMD? :D
> 
> On a more serious note: do you think that with D's features (type system / metaprogramming), you could have avoided some of those bugs?
> 
> For example, one thing we can do in D which is still impossible in C++ is to automatically serialize/deserialize all fields of a struct/class (using tupleof / allMembers).
> 

Memory mapped files really were the key to success, because if you could reload the mmf at the same address, the pointers did not have to be patched. In the DMC++ source code, "dehydrating" a pointer meant subtracting a value from it so it was correct for the base address of the mmf, and "hydrating" a pointer was the inverse.
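The dehydrate/hydrate arithmetic can be sketched in a few lines; these helpers are illustrative stand-ins, not DMC++'s actual code:

```d
// "Dehydrate" a pointer before writing the memory-mapped file: store it
// as an offset from the mapping's base address, so the stored image is
// independent of where the file was mapped.
T* dehydrate(T)(T* p, void* base)
{
    return cast(T*)(cast(size_t)p - cast(size_t)base);
}

// "Hydrate" on reload: rebase the stored offset onto the address the
// file is currently mapped at. If the mapping lands at the same base,
// this step is free, which is why reloading at the same address was key.
T* hydrate(T)(T* p, void* base)
{
    return cast(T*)(cast(size_t)p + cast(size_t)base);
}
```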

The two bug prone problems were:

1. separating out the tangled data structures into what goes into the pch, and what does not. Obviously, nothing in the pch could point outside of it.

2. .h files are simply not compatible with this, so you've got to detect when it won't work. For example, anything like a command line switch or a macro that might cause different code to be generated in the pch had to invalidate it.

Maybe I should have done your fork idea? :-)

My experience with this drove many design decisions for D modules, for example, D modules are unaffected by where they are imported, the order they are imported, or the number of times they are imported. (Yes, I know about https://digitalmars.com/d/archives/digitalmars/D/D_needs_to_be_honest_320976.html)

Anyhow, what I've thought about doing since the beginning was make DMD multithreaded. The language is designed to support multithreaded compilation. For example, lexing, parsing, semantic analysis, optimization, and code generation can all be done concurrently.

DMD 1.0 would read imports in a separate thread. This would speed things up if you were using a slow filesystem, like NAS or a USB stick, but it was eventually disabled because there wasn't a perceptible speedup with current filesystems.

Wouldn't it be awesome to have the lexing/parsing of the imports all done in parallel? The main difficulty in getting that to work is dealing with the shared string table.
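The shared-string-table difficulty can be sketched with `std.parallelism` and a single lock; all names here are hypothetical, not DMD's actual data structures:

```d
import std.parallelism : parallel;

// One interning table shared by all parser workers. Every identifier in
// every module funnels through this lock, which is the contention point
// a parallel front end has to solve.
final class StringTable
{
    private string[string] interned;

    // Return the canonical copy of s, inserting it on first sight.
    string intern(string s)
    {
        synchronized (this)
        {
            if (auto p = s in interned)
                return *p;
            interned[s] = s;
            return s;
        }
    }
}

// Lex/parse each module's identifier stream on a worker thread.
void internAll(StringTable table, string[][] identifiersPerModule)
{
    foreach (ids; parallel(identifiersPerModule))
        foreach (id; ids)
            table.intern(id);
}
```

A real implementation would want something finer-grained than one global lock (sharded buckets, or lock-free insertion), but the sketch shows where the sharing bites.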
November 21
On Wednesday, 21 November 2018 at 08:07:52 UTC, Vladimir Panteleev wrote:
> https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/

This is #2 on HN at the moment.
November 21
On Wednesday, 21 November 2018 at 10:56:02 UTC, Walter Bright wrote:
>
> Wouldn't it be awesome to have the lexing/parsing of the imports all done in parallel? The main difficulty in getting that to work is dealing with the shared string table.

What about creating a new Fiber for each module needing lexing/parsing/semantic to be run?  Compilation of one module would then get as far as it can until it needs to defer, then calls yield() to continue compilation of the next module.  This is in the hope that when the round trip returns, the AST will be sufficiently complete for compilation to continue.
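A rough sketch of that scheme with `core.thread.Fiber` (the scheduling loop and module bodies are purely illustrative, not actual DMD code):

```d
import core.thread : Fiber;

__gshared int finished; // modules whose semantic pass has completed

// One fiber per module: run until a symbol from another module is
// needed, yield, and resume on the next round trip in the hope that the
// dependency's AST is complete by then.
void compileAll(string[] modules)
{
    Fiber[] fibers;
    foreach (m; modules)
        fibers ~= new Fiber({
            // ...lex/parse/semantic until we must defer, then:
            Fiber.yield();
            // ...resumed on a later pass; continue semantic analysis...
            finished++;
        });

    // Keep making passes over the unfinished fibers until none remain.
    bool progress = true;
    while (progress)
    {
        progress = false;
        foreach (f; fibers)
            if (f.state != Fiber.State.TERM)
            {
                f.call();
                progress = true;
            }
    }
}
```

Since all fibers run on one thread, no locking is needed; the open question in the sketch is detecting true cycles, where a fiber would yield forever.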