July 25, 2012
(Maybe this should be in D.learn but it's a somewhat advanced topic)

I would really like to understand how D compiles a program or library. I looked through TDPL and it doesn't seem to say anything about how compilation works.

- Does it compile all source files in a project at once?
- Does the compiler have to re-parse all Phobos templates (in modules used by the program) whenever it starts?
- Is there any concept of an incremental build?
- Obviously, one can set up circular dependencies in which the compile-time meaning of some code in module A depends on the meaning of some code in module B, which in turn depends on the meaning of some other code in module A. Sometimes the D compiler can resolve the ultimate meaning, other times it cannot. I was pleased that the compiler successfully understood this:

// Y.d
import X;
struct StructY {
	int a = StructX().c;
	auto b() { return StructX().d(); }
}

// X.d
import Y;
struct StructX {
	int c = 3;
	auto d()
	{
		static if (StructY().a == 3 && StructY().a.sizeof == 3)
			return 3;
		else
			return "C";
	}
}

But what procedure does the compiler use to resolve the semantics of the code? Is there a specification anywhere? Does it have some limitations, such that there is code with an unambiguous meaning that a human could resolve but the compiler cannot?

- In light of the above (that the meaning of D code can be interdependent with other D code, plus the presence of mixins and all that), what are the limitations of __traits(allMembers...) and other compile-time reflection operations, and what kind of problems might a user expect to encounter?
July 25, 2012
On Wed, 25 Jul 2012 02:16:04 +0200
"David Piepgrass" <qwertie256@gmail.com> wrote:

> (Maybe this should be in D.learn but it's a somewhat advanced topic)
> 
> I would really like to understand how D compiles a program or library. I looked through TDPL and it doesn't seem to say anything about how compilation works.
> 

The compilation model is very similar to C or C++, so that's a good starting point for understanding how D's works.

Here's how it works:

Whatever file *or files* you pass to DMD on the command line, *those* are the files it will compile and generate object files for. No more, no less.

However, in the process, it will *also* parse and perform semantic
analysis on any files that are directly or indirectly imported, but it
won't actually generate any machine code or object files for them. It
finds these files via the -Ipath command line switch you pass to DMD;
the -I switch is roughly D's equivalent of Java's classpath.
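For instance, a hypothetical invocation (the paths and file names are made up for illustration):

```shell
# Compile app.d and util.d into object files (and, by default, link).
# Modules they import are located via -I, parsed, and semantically
# analyzed, but no object code is generated for them.
$dmd -I/path/to/mylib app.d util.d
```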

This does mean that, unlike what's typically done in C/C++, it's generally much faster to pass all your files into DMD at once, instead of the typical C/C++ route of making separate calls to the compiler for each source file.

After DMD generates the object files for all source files you give it, it will automatically send them to the linker (OPTLINK on Windows, or gcc/ld on Posix) to be linked into an executable. That is, *unless* you give it either -c ("compile-only, do not link") or -lib ("generate library instead of object files"). That way, you can link manually if you wish.
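As a sketch of that two-step route (file names hypothetical):

```shell
# Step 1: compile only; produces a.o and b.o but no executable
$dmd -c a.d b.d

# Step 2: link the objects yourself (dmd forwards them to the
# system linker; on Posix it invokes gcc/ld under the hood)
$dmd a.o b.o -ofmyprog
```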

So typically, you pass DMD all the .d files in your program, and it'll compile them all, and pass them to the linker to be linked into an executable. But if you don't want to automatically link, you don't have to. If you want to compile them all separately, you can do so (though it'd be very slow - probably almost as slow as C++, but not quite).

But that's just the DMD compiler itself. Instead of using DMD directly, there's a better modern trick that's generally preferred: RDMD.

If you use rdmd to compile (instead of dmd), you *just* give it
your *one* main source file (typically the one with your "main()"
function). This file must be the *last* parameter passed to rdmd:

$rdmd --build-only (any other flags) main.d

Then, RDMD will figure out *all* of the source files needed (using
the full compiler's frontend, so it never gets fooled into missing
anything), and if any of them have been changed, it will automatically
pass them *all* into DMD for you. This way, you don't have to
manually keep track of all your files and pass them all into
DMD yourself. Just give RDMD your main file and that's it, you're golden.

Side note: Another little trick with RDMD: Omit the --build-only and it will compile AND then run your program:

$cat simpleecho.d
import std.stdio;
void main(string[] args)
{
	writeln(args[1]);
}

$rdmd simpleecho.d "Anything after the .d file is passed to your app"
{automatically compiles all sources if needed}
Anything after the .d file is passed to your app

$wheee!!
command not found


> - Does it compile all source files in a project at once?

Answered this above. In short: It compiles whatever you give it (and
processes, but doesn't compile, any needed imports). Unless you use RDMD
in which case it automatically detects and compiles all your
needed sources (unless none of them have changed).

> - Does the compiler have to re-parse all Phobos templates (in modules used by the program) whenever it starts?

Yes. (Unless you never import anything from Phobos...I think.) But it's very, very fast to parse. Lightning-speed if you compare it to C++.

But it shouldn't run full semantic analysis on templates that are never actually used. (Unless they're used in a piece of dead code.)
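A tiny sketch of that (hypothetical code): the template body below contains an error, but since it's never instantiated, the module still compiles.

```d
// broken's body is parsed, but only fully analyzed on instantiation
auto broken(T)(T x)
{
	return x.noSuchMember; // error -- but only if broken!T is ever used
}

void main()
{
	// No call to broken anywhere, so this compiles cleanly.
}
```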

> - Is there any concept of an incremental build?

Yes, but there's a few "gotcha"s:

1. D compiles so damn fast that it's not nearly as much of an issue as it is with C++ (which is notoriously ultra-slow compared to...everything, hence the monumental importance of C++'s incremental builds).

2. Historically, there can be problems with templates when incrementally compiling. DMD has been known to get confused about which object file it put an instantiated template into, which can lead to occasional linker errors. These errors can be fixed by doing a full rebuild (which is WAAAY faster than it would be with C++). I don't know whether or not this has been fixed.

3. Incremental building typically involves compiling files one-at-a-time. But with D, you get a HUGE boost in compilation speed by not compiling one-at-a-time. So if you have a huge, slow-to-compile codebase (for example, 15 seconds or so), and you change a handful of files, it may actually be much *faster* to do a full rebuild (since you're not re-analysing all the imports). Of course, you could probably get around that issue by passing all the changed files (and only the changed files) into DMD at once (instead of one-at-a-time), but I don't know whether typical build tools (like make) can realistically handle that.
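As a rough sketch of those two strategies (hypothetical src/ layout and file names):

```shell
# Full rebuild: one dmd call over everything (often the fastest option)
$dmd -ofmyprog src/*.d

# Hypothetical incremental variant: recompile only the changed files,
# but still in a single call, then relink the objects
$dmd -c src/foo.d src/bar.d
$dmd src/*.o -ofmyprog
```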


> - Obviously, one can set up circular dependencies in which the compile-time meaning of some code in module A depends on the meaning of some code in module B, which in turn depends on the meaning of some other code in module A. Sometimes the D compiler can resolve the ultimate meaning, other times it cannot. I was pleased that the compiler successfully understood this:
> 
> // Y.d
> import X;
> struct StructY {
> 	int a = StructX().c;
> 	auto b() { return StructX().d(); }
> }
> 
> // X.d
> import Y;
> struct StructX {
> 	int c = 3;
> 	auto d()
> 	{
> 		static if (StructY().a == 3 && StructY().a.sizeof == 3)
> 			return 3;
> 		else
> 			return "C";
> 	}
> }
> 
> But what procedure does the compiler use to resolve the semantics of the code? Is there a specification anywhere? Does it have some limitations, such that there is code with an unambiguous meaning that a human could resolve but the compiler cannot?
> 

It keeps diving deeper and deeper to find anything it can "start" with. Once it finds that, it'll just build everything back up in whatever order is necessary.

If it *truly is* a circular definition, and there isn't any place it can actually start with, then it issues an error.

(If there's any cases where it doesn't work this way, they should be
filed as bugs in the compiler.)

> - In light of the above (that the meaning of D code can be interdependent with other D code, plus the presence of mixins and all that), what are the limitations of __traits(allMembers...) and other compile-time reflection operations, and what kind of problems might a user expect to encounter?

Shouldn't really be an issue. Such things won't get evaluated until the types/identifiers involved are *fully* analyzed (or at least to the extent that they need to be analyzed). So the results of things like __traits(allMembers...) should *never* change during compilation, or when changing the order of files or imports (unless there's some compiler bug). Any situation that *would* result in any such ambiguity will get flagged as an error in your code.
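A small illustration (hypothetical code): by the time __traits(allMembers) runs, the struct below is fully analyzed, so a mixed-in member shows up just like a hand-written one.

```d
import std.stdio;

struct S
{
	int x;
	mixin("int y;"); // member added via a string mixin
	void f() {}
}

void main()
{
	// Prints x, y, and f (the compiler may also list any generated
	// members); the result never depends on import order.
	foreach (name; __traits(allMembers, S))
		writeln(name);
}
```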

I would, however, recommend avoiding static constructors and module
constructors whenever you reasonably can. If you have a circular
import (ie: module a imports b, which imports c, which imports
a), then that's normally OK, *UNLESS* they all have static
and/or module constructors. If they do, then the startup code D builds
into your application won't know which needs to run first (and it
doesn't analyze the actual code, it just assumes there *could* be
an order-of-execution dependency), so you'll get a circular dependency
error when you run your program. And the safest, easiest way to get rid
of those errors is to eliminate one or more static/module constructors.
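A minimal sketch of the failure mode (hypothetical modules a and b):

```d
// a.d
module a;
import b;
static this() { /* runs at program startup */ }

// b.d
module b;
import a;
static this() { /* runs at program startup */ }

// Both modules compile and link fine, but at startup the D runtime
// throws a cyclic-dependency exception: it can't prove which module
// constructor must run first.
```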

July 25, 2012
On Tuesday, July 24, 2012 22:00:56 Nick Sabalausky wrote:
> But with D, you get a HUGE boost in compilation speed by
> not compiling one-at-a-time. So if you have a huge, slow-to-compile
> codebase (for example, 15 seconds or so),

I find it shocking that anyone would consider 15 seconds slow to compile for a large program. Yes, D's builds are lightning fast in general, and 15 seconds is probably a longer build, but calling 15 seconds "slow-to-compile" just about blows my mind. 15 seconds for a large program is _fast_. If anyone complains about a large program taking 15 seconds to build, then they're just plain spoiled or naive. I've dealt with _Java_ apps which took in the realm of 10 minutes to compile, let alone C++ apps which take _hours_ to compile. 15 seconds is a godsend.

- Jonathan M Davis
July 25, 2012
On Tue, 24 Jul 2012 20:35:27 -0700
Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Tuesday, July 24, 2012 22:00:56 Nick Sabalausky wrote:
> > But with D, you get a HUGE boost in compilation speed by
> > not compiling one-at-a-time. So if you have a huge, slow-to-compile
> > codebase (for example, 15 seconds or so),
> 
> I find it shocking that anyone would consider 15 seconds slow to compile for a large program. Yes, D's builds are lightning fast in general, and 15 seconds is probably a longer build, but calling 15 seconds "slow-to-compile" just about blows my mind. 15 seconds for a large program is _fast_. If anyone complains about a large program taking 15 seconds to build, then they're just plain spoiled or naive. I've dealt with _Java_ apps which took in the realm of 10 minutes to compile, let alone C++ apps which take _hours_ to compile. 15 seconds is a godsend.
> 

I just meant that I haven't heard of much D stuff that took much longer than that, so it's somewhat on the long end as far as D stuff goes. But I may be off-base. 'Course it depends a lot on the computer, too. I probably worded it weird.


July 25, 2012
On 2012-07-25 04:00, Nick Sabalausky wrote:

> But that's just the DMD compiler itself. Instead of using DMD
> directly, there's a better modern trick that's generally preferred:
> RDMD.
>
> If you use rdmd to compile (instead of dmd), you *just* give it
> your *one* main source file (typically the one with your "main()"
> function). This file must be the *last* parameter passed to rdmd:
>
> $rdmd --build-only (any other flags) main.d
>
> Then, RDMD will figure out *all* of the source files needed (using
> the full compiler's frontend, so it never gets fooled into missing
> anything), and if any of them have been changed, it will automatically
> pass them *all* into DMD for you. This way, you don't have to
> manually keep track of all your files and pass them all into
> DMD youself. Just give RDMD your main file and that's it, you're golden.

RDMD is mostly useful for executables, not so much for libraries. For libraries you would need to pass _all_ of your project files directly to DMD (or find some other tool). It's perfectly fine to have a library which consists of two files with no interaction between them. Neither RDMD nor the compiler can track that.

-- 
/Jacob Carlborg
July 25, 2012
On Tue, 2012-07-24 at 20:35 -0700, Jonathan M Davis wrote: […]
> I find it shocking that anyone would consider 15 seconds slow to compile for a large program. Yes, D's builds are lightning fast in general, and 15 seconds is probably a longer build, but calling 15 seconds "slow-to-compile" just about blows my mind. 15 seconds for a large program is _fast_. If anyone complains about a large program taking 15 seconds to build, then they're just plain spoiled or naive. I've dealt with _Java_ apps which took in the realm of 10 minutes to compile, let alone C++ apps which take _hours_ to compile. 15 seconds is a godsend.

A company I did some Python training for (they used Python for their integration and system testing, and a bit of unit testing) back in 2006 had a C++ product whose "from scratch" build time genuinely was 56 hours.

-- 
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder


July 25, 2012
On Wednesday, July 25, 2012 08:54:24 Russel Winder wrote:
> On Tue, 2012-07-24 at 20:35 -0700, Jonathan M Davis wrote: […]
> 
> > I find it shocking that anyone would consider 15 seconds slow to compile for a large program. Yes, D's builds are lightning fast in general, and 15 seconds is probably a longer build, but calling 15 seconds "slow-to-compile" just about blows my mind. 15 seconds for a large program is _fast_. If anyone complains about a large program taking 15 seconds to build, then they're just plain spoiled or naive. I've dealt with _Java_ apps which took in the realm of 10 minutes to compile, let alone C++ apps which take _hours_ to compile. 15 seconds is a godsend.
> 
> A company I did some Python training for (they used Python for their integration and system testing, and a bit of unit testing) back in 2006 had a C++ product whose "from scratch" build time genuinely was 56 hours.

I've heard of overnight builds, and I've heard of _regression tests_ running for over a week, but I've never heard of builds being over 2 days. Ouch.

Surely it was possible to have a shorter build than that. Of course, if their code was bad enough that the build was that long, it may have been rather disgusting code to clean up. But then again, maybe they genuinely had a legitimate reason for having the build take that long. I'd be very surprised though.

In any case, much as I like C++ (not as much as D, but I still like it quite a bit), its build times are undeniably horrible.

- Jonathan M Davis
July 25, 2012
On Wed, 25 Jul 2012 08:54:24 +0100
Russel Winder <russel@winder.org.uk> wrote:

> On Tue, 2012-07-24 at 20:35 -0700, Jonathan M Davis wrote: […]
> > I find it shocking that anyone would consider 15 seconds slow to compile for a large program. Yes, D's builds are lightning fast in general, and 15 seconds is probably a longer build, but calling 15 seconds "slow-to-compile" just about blows my mind. 15 seconds for a large program is _fast_. If anyone complains about a large program taking 15 seconds to build, then they're just plain spoiled or naive. I've dealt with _Java_ apps which took in the realm of 10 minutes to compile, let alone C++ apps which take _hours_ to compile. 15 seconds is a godsend.
> 
> A company I did some Python training for (they used Python for their integration and system testing, and a bit of unit testing) back in 2006 had a C++ product whose "from scratch" build time genuinely was 56 hours.
> 

Yea, my understanding is that full-build times measured in days are (or used to be, don't know if they still are) also typical of high-budget C++-based videogames.

July 25, 2012
On 7/25/12, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> I find it shocking that anyone would consider 15 seconds slow to compile for a large program.

It's not shocking if you're used to a fast edit-compile-run cycle which takes a few seconds and then starts to slow down considerably when you involve more and more templates. When I start working on a new D app it almost feels like programming in Python, the edit-compile-run cycle is really fast. But eventually the codebase grows, things slow down and I lose that "Python" feeling when it starts taking a dozen seconds to compile. It just breaks my concentration having to wait for something to finish.

Hell I can't believe how outdated the compiler technology is. I can play incredibly realistic and interactive 3D games in real-time with practically no input lag, but I have to wait a dozen seconds for a tool to convert lines of text into object code? From a syntax perspective D has moved forward but from a compilation perspective it hasn't innovated at all.
July 25, 2012
> I find it shocking that anyone would consider 15 seconds slow to compile for a
> large program. Yes, D's builds are lightning fast in general, and 15 seconds
> is probably a longer build, but calling 15 seconds "slow-to-compile" just
> about blows my mind. 15 seconds for a large program is _fast_. If anyone
> complains about a large program taking 15 seconds to build, then they're just
> plain spoiled or naive. I've dealt with _Java_ apps which took in the realm of
> 10 minutes to compile, let alone C++ apps which take _hours_ to compile. 15
> seconds is a godsend.

I agree with Andrej, 15 seconds *is* slow for an edit-compile-run cycle, although it might be understandable when editing code that uses a lot of CTFE and static foreach and reinstantiates templates with a crapton of different arguments.

I am neither spoiled nor naive to think it can be done in under 15 seconds. Fully rebuilding all my C# code takes less than 10 seconds (okay, not a big program, but several smaller programs).

Plus, it isn't just build times that concern me. In C# I'm used to having an IDE that immediately understands what I have typed, giving me error messages and keeping metadata about the program up-to-date within 2 seconds. I can edit a class definition in file A and get code completion for it in file B, 2 seconds later. I don't expect the IDE can ever do that if the compiler can't do a debug build in a similar timeframe.