On 9 Apr 2019, at 03:58, Mike Franklin via dmd-internals <dmd-internals@puremagic.com> wrote:

That's an awfully broad question.  I'm no expert in DMD, but I do dabble in the source code often.  

I'd start here: https://wiki.dlang.org/DMD_Source_Guide  The information there is old but still relevant.

This is my mental model and understanding:

source code --> lexer.d --> tokens
tokens --> parse.d --> expressions (see expression.d)
expressions --> expressionsem.d --> lowered expressions
lowered expressions --> e2ir --> intermediate representation (I'm awfully vague on this)
After that it's in the backend, and that's pretty much a black box to me.

I can add to the above that the parser is a subclass of the lexer. The parser retrieves the next token from the lexer whenever it needs one to continue parsing. Parsing is initiated from `Module.parse` in `dmodule.d` [1], which then calls `Parser.parseModule` in `parse.d`. After that, the semantic analysis phase begins with a call to `dsymbolSemantic` for each module given to the compiler. All of this is driven from `mars.d`, which contains the entry point of the compiler.
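
To make the "parser is a subclass of the lexer" idea concrete, here is a minimal sketch. It is not DMD code: `Tok`, `Lexer`, `Parser`, `parseSum` and `parseNumber` are hypothetical names made up for illustration, and the toy parser evaluates a sum directly instead of building an AST. It only shows the shape: the parser inherits the token state and pulls the next token on demand.

```d
import std.ascii : isDigit, isWhite;
import std.stdio : writeln;

// Hypothetical token kinds for this toy example (not DMD's TOK).
enum Tok { number, plus, eof }

class Lexer
{
    private string src;   // source text being scanned
    private size_t pos;   // current offset into src
    Tok token;            // kind of the current token
    long value;           // numeric value when token == Tok.number

    this(string src) { this.src = src; }

    // Scan the next token and store it in `token` (and `value`).
    void nextToken()
    {
        while (pos < src.length && isWhite(src[pos])) pos++;
        if (pos >= src.length) { token = Tok.eof; return; }
        if (src[pos] == '+') { pos++; token = Tok.plus; return; }
        if (isDigit(src[pos]))
        {
            value = 0;
            while (pos < src.length && isDigit(src[pos]))
                value = value * 10 + (src[pos++] - '0');
            token = Tok.number;
            return;
        }
        throw new Exception("unexpected character");
    }
}

// The parser *is a* lexer: it asks for the next token only when it needs one.
class Parser : Lexer
{
    this(string src) { super(src); nextToken(); }

    // sum := number ('+' number)*
    long parseSum()
    {
        long total = parseNumber();
        while (token == Tok.plus)
        {
            nextToken();              // consume '+'
            total += parseNumber();
        }
        return total;
    }

    private long parseNumber()
    {
        if (token != Tok.number) throw new Exception("number expected");
        long v = value;
        nextToken();                  // consume the number
        return v;
    }
}

void main()
{
    writeln(new Parser("1 + 2 + 39").parseSum()); // prints 42
}
```

There is never a separate "list of tokens" in this sketch; each token exists only while the parser is looking at it.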

Tracing with `printf` is your friend, which is why you see so many commented-out `printf` statements in the source code.  You should be able to print out almost anything with the `toChars()` method (e.g. `printf("%s\n", whatever.toChars());`)
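
As a self-contained illustration of that tracing pattern, here is a small sketch. `FakeExp` is a hypothetical stand-in and not a real DMD class; the only assumption carried over is that the object has a `toChars()` method returning a NUL-terminated C string.

```d
import core.stdc.stdio : printf;
import std.string : toStringz;

// Hypothetical stand-in for a compiler AST node; not a real DMD class.
class FakeExp
{
    string text;
    this(string text) { this.text = text; }

    // Returns a NUL-terminated C string, suitable for printf's %s.
    const(char)* toChars() { return toStringz(text); }
}

void main()
{
    auto e = new FakeExp("a + b");
    printf("visiting: %s\n", e.toChars());
    // printf("again: %s\n", e.toChars()); // typically left commented out once done
}
```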

For debugging I can also recommend tracing with `std.stdio.writeln` and `std.stdio.writefln`.  Although it's not allowed to use Phobos in the compiler, it’s perfectly fine to use it during development and debugging. Just make sure that all traces of Phobos are gone when making a PR. Most classes have a `toString` method which does exactly the same thing as `toChars`, but returns a D string instead of a C string. `writeln` knows about this and will call `toString` automatically. When printing an enum value, `writeln` prints the name of the enum member instead of its value, which I like very much. It’s also much easier to print D strings with `writeln` than with `printf`. This is especially useful now that we’re trying to get rid of all C strings and replace them with D strings.
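
Here is a small sketch of what that looks like in practice. `Stage` and `FakeSymbol` are hypothetical names invented for the example, not types from the compiler; the point is only how `writeln` treats `toString` and enum members.

```d
import std.stdio : writefln, writeln;

// Hypothetical enum for illustration only.
enum Stage { lexing, parsing, semantic }

// Hypothetical stand-in for a compiler class that has toString.
class FakeSymbol
{
    string ident;
    this(string ident) { this.ident = ident; }

    // Returns a D string; writeln calls this automatically.
    override string toString() { return ident; }
}

void main()
{
    auto s = new FakeSymbol("foo");
    writeln("symbol: ", s);               // uses s.toString()
    writeln("stage: ", Stage.semantic);   // prints the member name, not the number
    // Cast explicitly when the numeric value is wanted.
    writefln("stage %s has value %d", Stage.semantic, cast(int) Stage.semantic);
}
```

With `printf` the enum would have to be printed as a number, and any D string would need an explicit length or a conversion to a C string first.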

[1] https://github.com/dlang/dmd/blob/b4429221e0b0024e5f0b99e084e075f02972e19a/src/dmd/dmodule.d#L654

-- 
/Jacob Carlborg