Building the compiler in 2 seconds with `dmd -i`
May 19, 2023
Dennis

In theory, dmd is really easy to build: it's just a collection of D files in a single repository with no external dependencies. Why have there always been such complex makefiles/scripts to build it?

Well, after replacing most of dmd's backend's C-style extern function declarations with proper imports [1], there's a simple way to build the compiler. Instead of needing to list every source file, you can leverage the -i flag which automatically compiles imports, and create a build script as short as this:

echo -n "/etc" >> SYSCONFDIR.imp
cd compiler/src
dmd -i dmd/mars.d -of=../../dmd dmd/eh.d dmd/backend/dtype.d -version=MARS -Jdmd/res -J../..

On my linux pc, these 3 lines take 2.0 seconds to complete and it uses max 680MB RAM.

The official way to build a non-release compiler is:

cd compiler/src
rdmd build.d BUILD=debug

build.d is 2400 lines, takes 2.3 seconds to complete a clean build, and uses max 613MB RAM.

The build script does separate compilation, multi-threading, includes debug info (-g), and builds other things as well. Still, my takeaways are:

  • The build script is overly complex
  • dmd -i is awesome
  • Without too much templates/CTFE, dmd compiles pretty fast (150 KLOC/second in this case), reducing the need for incremental/parallel compilation

Eventually I hope to make just dmd -i dmd/mars.d work, I'll see how close I can get to that.

[1] https://github.com/dlang/dmd/pulls?q=is%3Aclosed+is%3Apr+author%3Adkorpel+label%3ABackend+label%3ARefactoring+prototypes

May 19, 2023

On Friday, 19 May 2023 at 09:51:39 UTC, Dennis wrote:

> In theory, dmd is really easy to build: it's just a collection of D files in a single repository with no external dependencies. Why have there always been such complex makefiles/scripts to build it?
>
> Well, after replacing most of dmd's backend's C-style extern function declarations with proper imports [1], there's a simple way to build the compiler. Instead of needing to list every source file, you can leverage the -i flag which automatically compiles imports, and create a build script as short as this:
>
> echo -n "/etc" >> SYSCONFDIR.imp
> cd compiler/src
> dmd -i dmd/mars.d -of=../../dmd dmd/eh.d dmd/backend/dtype.d -version=MARS -Jdmd/res -J../..
>
> On my linux pc, these 3 lines take 2.0 seconds to complete and it uses max 680MB RAM.
>
> The official way to build a non-release compiler is:
>
> cd compiler/src
> rdmd build.d BUILD=debug
>
> build.d is 2400 lines, takes 2.3 seconds to complete a clean build, and uses max 613MB RAM.

I would suggest dropping the build.d file. Its main advantage is that you have the same script to compile the code regardless of the platform you are running on, however, we are now stuck between 2 worlds: dmd has build.d but druntime and phobos use makefiles.

Also, build.d is much more complicated and requires more time for handling than a measly makefile which anyone can understand. Every time I add a new file to the repo I waste time to read and understand how build.d works.

> The build script does separate compilation, multi-threading, includes debug info (-g), and builds other things as well. Still, my takeaways are:
>
>   • The build script is overly complex
>   • dmd -i is awesome
>   • Without too much templates/CTFE, dmd compiles pretty fast (150 KLOC/second in this case), reducing the need for incremental/parallel compilation
>
> Eventually I hope to make just dmd -i dmd/mars.d work, I'll see how close I can get to that.

Good luck!

> [1] https://github.com/dlang/dmd/pulls?q=is%3Aclosed+is%3Apr+author%3Adkorpel+label%3ABackend+label%3ARefactoring+prototypes

May 19, 2023
On Friday, 19 May 2023 at 09:51:39 UTC, Dennis wrote:
> Eventually I hope to make just `dmd -i dmd/mars.d` work, I'll see how close I can get to that.


dmd -i is heaven. I, without a doubt, place it as the most significant addition to the D ecosystem of the last six years (probably more if I extended the audit)

I've deleted so many convoluted builds in favor of it and always seen wins.
May 19, 2023
On 5/19/23 07:28, Adam D Ruppe wrote:

> dmd -i is heaven. I, without a doubt, place it as the most significant
> addition to the D ecosystem of the last six years (probably more if I
> extended the audit)
>
> I've deleted so many convoluted builds in favor of it and always seen wins.

-i is great and I've started with -i in my last project that uses Makefiles. For now, I went back to manual dependencies by listing all the modules explicitly. Otherwise, 'make' did not know about modifications to the files pulled in by -i. I think I will incorporate the -deps switch later.

Of course, this is not a point against -i but I wanted to point out that dependencies must still be taken care of some other way.

Ali

May 19, 2023
On Friday, 19 May 2023 at 16:13:02 UTC, Ali Çehreli wrote:
> Of course, this is not a point against -i but I wanted to point out that dependencies must still be taken care of some other way.

sometimes maybe, and what's interesting about -i is you can also tell it to exclude certain packages if you want. but more often than not i just do:

$ cat Makefile
all:
	dmd -i main.d

May 19, 2023
On Fri, May 19, 2023 at 02:28:38PM +0000, Adam D Ruppe via Digitalmars-d wrote:
> On Friday, 19 May 2023 at 09:51:39 UTC, Dennis wrote:
> > Eventually I hope to make just `dmd -i dmd/mars.d` work, I'll see how close I can get to that.
> 
> dmd -i is heaven. I, without a doubt, place it as the most significant addition to the D ecosystem of the last six years (probably more if I extended the audit)
[...]

On that note, ldc2 -i also works.  And it's super-convenient.


T

-- 
One Word to write them all, One Access to find them, One Excel to count them all, And thus to Windows bind them. -- Mike Champion
May 19, 2023

On Friday, 19 May 2023 at 10:00:26 UTC, RazvanN wrote:

> On Friday, 19 May 2023 at 09:51:39 UTC, Dennis wrote:
> > [...]
>
> I would suggest dropping the build.d file. Its main advantage is that you have the same script to compile the code regardless of the platform you are running on, however, we are now stuck between 2 worlds: dmd has build.d but druntime and phobos use makefiles.
>
> Also, build.d is much more complicated and requires more time for handling than a measly makefile which anyone can understand. Every time I add a new file to the repo I waste time to read and understand how build.d works.

Really? I added a completely new build process (PGO), and it wasn't that hard. Adding a new file is literally just adding the filename to a list, no?

> > [...]
>
> Good luck!
>
> > [...]

Not against removing build.d in principle though, just keep in mind you have to replace 100% of the functionality.

The makefiles were a complete mess and were a block to doing work on Windows due to the ambiguity between Digital Mars make and MSVC make.

May 19, 2023

On Friday, 19 May 2023 at 09:51:39 UTC, Dennis wrote:

> [...]
>
> Eventually I hope to make just dmd -i dmd/mars.d work, I'll see how close I can get to that.

I'm about 80% certain I have got mars.d working with -i before. I forget what I did specifically but it was basically the same stuff in the backend.

The reason why dmd -i works better than one might initially assume is that:

  1. Importing druntime is too slow.
  2. Semantic analysis is not shared between parallel compiler invocations.

Both put a lower bound on the time taken, regardless of parallelism.

2.0 seconds is still way way way too slow IMO. Iteration should ideally be instant, tests running before you blink after saving the file. That's the benefit of incremental compilation, ideally you just have it watch the source code and reuse stuff that didn't change.

May 19, 2023
On Fri, May 19, 2023 at 09:13:02AM -0700, Ali Çehreli via Digitalmars-d wrote:
> On 5/19/23 07:28, Adam D Ruppe wrote:
> 
> > dmd -i is heaven. I, without a doubt, place it as the most significant addition to the D ecosystem of the last six years (probably more if I extended the audit)
> >
> > I've deleted so many convoluted builds in favor of it and always seen wins.
> 
> -i is great and I've started with -i in my last project that uses Makefiles.  For now, I went back to manual dependencies by listing all the modules explicitly. Otherwise, 'make' did not know modifications to the -i files. I think I will incorporate the -deps switch later.
[...]

I propose a new compiler switch that does the following:
- For every file included by -i, check whether it has changed since the
  last compile.
- If none of the input files, including those pulled in by -i, have
  changed since the last compile, skip all semantic and codegen steps
  and exit with a success status.

Then you could just unconditionally run {dmd,ldc2} -i every time, and it will just be a (mostly) no-op if none of the source files have changed.
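
Until such a switch exists, the same effect can be approximated outside the compiler. The following is only a minimal sketch in shell, assuming the list of source files is already known (which is exactly the information the compiler would have and a wrapper script does not). The function name `needs_rebuild` and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of dmd or ldc2.
# Succeeds (exit status 0) when TARGET is missing or older than any SOURCE.
# usage: needs_rebuild TARGET SOURCE...
needs_rebuild() {
    target=$1
    shift
    # No previous output at all: a build is always needed.
    [ -e "$target" ] || return 0
    for src in "$@"; do
        # -nt ("newer than") compares modification times; it is not in
        # POSIX test, but bash, dash and ksh all support it.
        if [ "$src" -nt "$target" ]; then
            return 0
        fi
    done
    # Every source is at most as new as the target: nothing to do.
    return 1
}

# Usage sketch (file names are hypothetical):
#   if needs_rebuild app main.d util.d; then dmd -i main.d -of=app; fi
```

This is the Make-style timestamp comparison, so it inherits the false positives and false negatives that timestamps can produce.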

//

The big question is, of course, how will the compiler know whether a file has changed?  There are several ways of doing this.

- The age-old Make way of checking timestamps. OK if you're on the
  filesystem, but unreliable and prone to false positives and false
  negatives. False positives are not horrible -- the worst is you just
  recompile something that didn't change, no problem, you get the same
  binary. But false negatives are very bad: the compiler gives you the
  impression that it has recompiled the binary but actually did nothing.
  But in the general case this shouldn't happen (not frequently anyway),
  so a workaround could just be to recompile without the new switch when
  you suspect false negatives are happening.

- Do a quick checksum of each imported file and compare against some
  stored checksums. This could be stored in the executable in an
  optional section that gets ignored by the linker, for example. The
  compiler loads this section and compares the checksum of each source
  file against it. If the checksum is the same, the source file hasn't
  changed. If the checksum is different, or if there's no checksum
  loaded (this is the first compilation of that file), compile as usual.
  The advantage of this approach is that if your checksum algorithm is
  reliable, false positives and false negatives should (almost) never
  happen. The disadvantage is that this will be slow.
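
The checksum variant can be sketched in a few lines of shell. Note the assumptions: the checksums are kept in a separate manifest file rather than in a section of the executable as suggested above, the POSIX `cksum` utility stands in for whatever checksum the compiler would use, and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers, not part of dmd or ldc2.

# Print one "checksum size path" line per source file.
# cksum is specified by POSIX; a stronger hash could be swapped in.
manifest_of() {
    cksum "$@"
}

# Succeeds (exit status 0) when the sources differ from the stored
# manifest, or when no manifest exists yet (first compilation).
# usage: sources_changed MANIFEST SOURCE...
sources_changed() {
    manifest=$1
    shift
    [ -f "$manifest" ] || return 0
    [ "$(manifest_of "$@")" != "$(cat "$manifest")" ]
}

# Record the current checksums; meant to run after a successful build.
# usage: save_manifest MANIFEST SOURCE...
save_manifest() {
    manifest=$1
    shift
    manifest_of "$@" > "$manifest"
}
```

Because only file contents are compared, touching a file without changing it does not trigger a rebuild, while every check has to read every source file in full.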

//

The above scheme, of course, falls down if you do tricky things like having string mixins that construct import statements:

	string makeCode(string x, string y) {
		return x ~ " " ~ y ~ ";";
	}
	void main() {
		mixin(makeCode("import", "mymodule"));
	}

The compiler wouldn't know that mymodule.d is being pulled in without running semantic. So the savings will be less, possibly much less, and you might as well just recompile everything every time instead.


T

-- 
Mediocrity has been pushed to extremes.
May 19, 2023
On Fri, May 19, 2023 at 04:48:01PM +0000, max haughton via Digitalmars-d wrote: [...]
> The reason why dmd -i works better than one might initially assume is that:
> 
> 1. importing druntime is too slow.
> 2. Semantic analysis is not shared between parallel compiler
> invocations
> 
> Both of which result in a lower bound on the time taken regardless of parallelism
> 
> 2.0 seconds is still way way way too slow IMO. Iteration should ideally be instant, tests running before you blink after saving the file. That's the benefit of incremental compilation, ideally you just have it watch the source code and reuse stuff that didn't change.

The current architecture of DMDFE just doesn't lend itself to this, what with mutating the AST as the compilation goes on, and all that.  If you want to do this, you're probably looking at reimplementing the compiler from scratch to be a live compiler service, rather than a batch process. It will have to hold the program AST in memory and be able to replace and recompile parts of it upon request (e.g. when a source file changes).

Keep in mind that replacing and recompiling parts of the AST may not be as simple as it sounds, because other parts of the AST (e.g., other modules) may have references to the old AST, so all of those have to be updated as well.  The AST structures must be designed to handle this kind of operation; I highly doubt whether the current DMDFE AST structures can handle something like this at all.


T

-- 
Guns don't kill people. Bullets do.