August 10, 2019
On Friday, 9 August 2019 at 09:22:12 UTC, Dukc wrote:
> So I'm interested, are virtually-instant compiles something one has to experience to understand their benefit?

Here is a recent article on that topic: http://jsomers.net/blog/speed-matters
August 10, 2019
On Saturday, 10 August 2019 at 10:31:41 UTC, MrSmith wrote:
> On Friday, 9 August 2019 at 09:22:12 UTC, Dukc wrote:
>> So I'm interested, are virtually-instant compiles something one has to experience to understand their benefit?
>
> Here is a recent article on that topic: http://jsomers.net/blog/speed-matters

Thanks! That one needs some deep thought.
August 12, 2019
On Friday, 9 August 2019 at 13:42:27 UTC, Ethan wrote:
> On Friday, 9 August 2019 at 13:17:02 UTC, Atila Neves wrote:
>> .di files are usually auto-generated, not needed, and AFAIK not that used.
>
> They're needed. At least, the idea of them is needed.
>
> Right now, it's just a glorified implementation stripper. It gets rid of everything between {} that isn't an aggregate/enum definition, and leaves the slow and expensive mixins there to be recompiled every. single. time. they're. imported.
>
> .di files need to be redefined to represent a module after mixins have resolved.

Hmm, interesting! Is that you volunteering to work on that? :P
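For anyone wondering what that would look like, here's a minimal sketch (hypothetical module, not actual compiler output):

  // foo.d -- every importer re-runs this string mixin during its own compile
  module foo;
  enum generatedCode = "int value() { return 42; }";
  mixin(generatedCode);

  // what a mixin-resolved foo.di could contain instead: the expansion is
  // already done, so importers only see the finished declaration
  module foo;
  int value();

As Ethan describes it, today's generated .di keeps the mixin verbatim, so every importer pays for it again; a post-mixin .di would pay that cost once.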
August 12, 2019
On Saturday, 10 August 2019 at 08:17:03 UTC, Russel Winder wrote:
> On Fri, 2019-08-09 at 13:17 +0000, Atila Neves via Digitalmars-d wrote:
>> 
> […]
>>  From experience, it makes me work much slower if I don't get
>> results in less than 100ms. If I'm not mistaken, IBM did a study
>> on this that I read once but never managed to find again about
>> how much faster people worked on short feedback cycles.
>> 
> […]
>
> Most people in the world couldn't tell the difference between 100ms and 200ms.

Musicians can ;) The threshold for noticeable latency in an audio interface is ~10ms.

Imagine if it took 100ms to go from hitting a key to seeing the character on screen. That's almost how slow compile times feel to me.

> But this leads to a whole off-theme discussion about psychology, reaction times, and "won't wait" times.

Indeed.

> Also of course: https://www.xkcd.com/303/

Classic.


August 12, 2019
On Friday, 9 August 2019 at 16:45:05 UTC, Russel Winder wrote:
> On Fri, 2019-08-09 at 08:37 +0000, Atila Neves via Digitalmars-d wrote: […]
>> I don't think it is. Fast is relative, and it's death by a thousand cuts for me at the moment. And this despite the fact that I use reggae as much as I can, which means I wait less than most other D programmers on average to get a compiled binary!
>
> Is there any chance of getting Reggae into D-Apt or better still the standard Debian Sid repository along with ldc2, GtkD, and GStreamerD, so it can be a standard install for anyone using Debian or Ubuntu and so get some real traction in the D build market?

That's a good question. The thing is I basically want to rewrite reggae from scratch, and, worse than that, am trying to figure out how to best leverage the work done in this paper:

https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems-final.pdf

(Build Systems à la carte by Microsoft Research, in which they show how to compose different types of build systems in Haskell)

It's clear to me that the current way of building software is broken in the sense that we almost always do more work than needed. My vision for the future is a build system so smart that it only rebuilds mod1.d if mod0.d was modified in such a way that it actually needs to, for instance because the signature of an imported function has changed. I think that paper is a step in the right direction by abstracting away how changes are computed.
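To make that concrete, a minimal sketch (the module and function names are just placeholders):

  // mod0.d
  module mod0;
  int answer() { return 42; }

  // mod1.d -- depends only on mod0's interface, not its implementation
  module mod1;
  import mod0;
  int twice() { return answer() * 2; }

  // Editing answer's body (say, return 43) should not trigger a rebuild of
  // mod1.d; changing its signature (say, int answer(int seed)) should.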
August 12, 2019
On Mon, 2019-08-12 at 09:47 +0000, Atila Neves via Digitalmars-d wrote:
> 
[…]
> That's a good question. The thing is I basically want to rewrite reggae from scratch, and, worse than that, am trying to figure out how to best leverage the work done in this paper:
> 
> https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems-final.pdf
> 
> (Build Systems à la carte by Microsoft Research, in which they show how to compose different types of build systems in Haskell)
> 
> It's clear to me that the current way of building software is broken in the sense that we almost always do more work than needed. My vision for the future is a build system so smart that it only rebuilds mod1.d if mod0.d was modified in such a way that it actually needs to, for instance because the signature of an imported function has changed. I think that paper is a step in the right direction by abstracting away how changes are computed.

Anything Simon is involved in is always worth looking at. The focus of the document, though, is Microsoft and Haskell, so there is lots of good stuff, but it potentially misses lots of other good stuff.

Although slightly different in some ways, Gradle has had (and continues to have) a huge amount of work put into it to try to minimize dependencies being seen as causing a rebuild. Gradle is principally a JVM-oriented system, but a very big client funded work on Gradle's C++ support.

SCons (and Waf) have done quite a lot of work on build minimization (especially Parts, which is an addition over SCons). I am not sure how much of this got into Meson – I guess that partly depends on what Ninja does.

I have not used Tup, but it should have a role in any review given its claims of minimising work.

The core question, though, is: given Dub and Meson, can Reggae gain real traction in the D build arena, possibly replacing Dub as the default D project build controller? Is a rewrite of Dub more cost effective than a rewrite of Reggae?

-- 
Russel.
===========================================
Dr Russel Winder      t: +44 20 7585 2200
41 Buckmaster Road    m: +44 7770 465 077
London SW11 1EN, UK   w: www.russel.org.uk



August 12, 2019
On 8/8/2019 3:10 AM, Russel Winder wrote:
> So if C++ has this dependency problem why doesn't D, Go, or Rust?

Great question! The answer is it does, it's just that you don't notice it.

The trick, if you can call it that, is the same as how D can compile:

  int x = y;
  enum y = 3;

In other words, D can handle forward references, and even many cases of circular references. (C++, weirdly, can handle forward references in struct definitions, but nowhere else.)

D accomplishes this with 3 techniques:

1. Parsing is completely separate from semantic analysis. I.e. all code can be lexed/parsed in parallel, or in any order, without concern for dependencies.

2. Semantic analysis is lazy, i.e. it is done symbol-by-symbol on demand. In the above example, when y is encountered, the compiler goes "y is an enum, I'd better suspend the semantic analysis of x and go do the semantic analysis for y now".

3. D is able to partially semantically analyze things. This comes into play when two structs mutually refer to each other. It does this well enough that only rarely do "circular reference" errors come up that possibly could be resolved.
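For example, a minimal sketch of such a mutual reference:

  // Node refers to List before List has been analyzed, and List refers
  // back to Node; partial semantic analysis resolves both without any
  // forward declarations.
  struct Node
  {
      List* owner;
  }

  struct List
  {
      Node*[] nodes;
  }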

D processes imports by reading the files and parsing them. Only "on demand" does it do semantic analysis on them. My original plan was to parse them and write a binary file, and then the import would simply and quickly load the binary file. But it turned out the code to load the binary file and turn it into an AST wasn't much better than simply parsing the source code, so I abandoned the binary file approach.

Another way D deals with the issue is you can manually prepare a "header" file for a module, a .di file. This makes a lot of sense for modules full of code that's irrelevant to the user, like the gc implementation.
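For instance, a hand-written .di can contain nothing but declarations (the names below are made up, not the actual gc interface):

  // gcstub.di -- hypothetical hand-written interface file
  module gcstub;

  void* gcAlloc(size_t size) nothrow;
  void gcCollect() nothrow;

Importers get the signatures without ever seeing, or recompiling, the implementation.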

-------------------

Some languages deal with this issue by disallowing circular imports entirely. Then the dependency graph is a simple directed acyclic graph. This method does have its attractions, as it forces the programmer to carefully decompose the project into properly encapsulated units. On the other hand, it makes it very difficult to interface smoothly with C and C++ files, which typically each just #include all the headers in the project.

August 12, 2019
On Mon, Aug 12, 2019 at 12:33:07PM -0700, Walter Bright via Digitalmars-d wrote: [...]
> 1. Parsing is completely separate from semantic analysis. I.e. all code can be lexed/parsed in parallel, or in any order, without concern for dependencies.

This is a big part of why C++'s must-be-parsed-before-it-can-be-lexed syntax is a big hindrance to meaningful progress.  The only way such a needlessly over-complex syntax can be handled is a needlessly over-complex lexer/parser combo, which necessarily results in needlessly over-complex corner cases and other such gotchas.  Part of this nastiness is the poor choice of template syntax (overloading '<' and '>' to be delimiters in addition to their original roles of comparison operators), among several other things.
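Contrast that with D, where template instantiation uses !( ), so '<' and '>' only ever mean comparison (a trivial sketch):

  struct Pair(A, B) { A first; B second; }

  Pair!(int, double) p;   // unambiguously a template instantiation
  bool lessThan = 1 < 2;  // unambiguously a comparison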


> 2. Semantic analysis is lazy, i.e. it is done symbol-by-symbol on demand. In the above example, when y is encountered, the compiler goes "y is an enum, I'd better suspend the semantic analysis of x and go do the semantic analysis for y now".

This is an extremely powerful approach, and many may not be aware that it's a powerful cornerstone on which D's meta-programming capabilities are built.  It's a beautiful example of the principle of "laziness": don't do the work until it's actually necessary. Something too many applications today fail to observe, to their own detriment.


> 3. D is able to partially semantically analyze things. This comes into play when two structs mutually refer to each other. It does this well enough that only rarely do "circular reference" errors come up that possibly could be resolved.

I wasn't aware of this before, but it makes sense, in retrospect.


> D processes imports by reading the files and parsing them. Only "on demand" does it do semantic analysis on them. My original plan was to parse them and write a binary file, and then the import would simply and quickly load the binary file. But it turned out the code to load the binary file and turn it into an AST wasn't much better than simply parsing the source code, so I abandoned the binary file approach.
[...]

That's an interesting data point.  I've been toying with the same idea over the years, but it seems that's a dead-end approach.  In any case, from what I've gathered parsing and lexing are nowhere near the bottleneck as far as D compilation is concerned (though it might be different for a language like C++, but even there I doubt it would play much of a role in the overall compilation performance, there being far more complex problems in semantic analyses and codegen that require algorithms with non-trivial running times).  There are bigger fish to fry elsewhere in the compiler.

(Like *cough*memory usage*ahem*, that to this day makes D a laughing stock on low-memory systems. Even with -lowmem the situation today isn't much better than it was a year or two ago. I find my hands tied w.r.t. D as far as low-memory systems are concerned, and that's a very sad thing, since I'd have liked to replace many things with D. Currently I can't, because either dmd outright won't run and I have to build executables offline and upload them, or else I have to build the dmd toolchain offline and upload it to the low-memory target system. Both choices suck.)


T

-- 
Why can't you just be a nonconformist like everyone else? -- YHL
August 13, 2019
On 2019-08-12 11:47, Atila Neves wrote:

> That's a good question. The thing is I basically want to rewrite reggae from scratch, and, worse than that, am trying to figure out how to best leverage the work done in this paper:
> 
> https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems-final.pdf
> 
> (Build Systems à la carte by Microsoft Research, in which they show how to compose different types of build systems in Haskell)
> 
> It's clear to me that the current way of building software is broken in the sense that we almost always do more work than needed. My vision for the future is a build system so smart that it only rebuilds mod1.d if mod0.d was modified in such a way that it actually needs to, for instance because the signature of an imported function has changed. I think that paper is a step in the right direction by abstracting away how changes are computed.

I suggest you look into incremental compilation, if you haven't done that already. I'm not talking about recompiling a whole file and relinking. I'm talking about incremental lexing, parsing, semantic analysis and code generation. That is, recompiling only the characters that have changed in a source file and whatever depends on them.

For example: "void foo()". If "foo" is changed to "bar" then the compiler only needs to lex those three characters: "bar". Then run the rest of the compiler only on AST nodes that is dependent on the "bar" token.

The Eclipse Java compiler (JDT) has a pretty interesting concept. It allows compiling and running invalid code. I'm guessing a bit here, but I assume that if a function has a valid signature and a syntactically valid body that contains semantic errors, the compiler replaces the body of the function with code that throws a runtime error. If that function is never called at runtime there is no problem. Similar to how templates work in D.
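A minimal D sketch of that analogy (names made up): the template body below is parsed but not analyzed until instantiation, so its semantic error only surfaces if someone actually uses it.

  void broken(T)()
  {
      undefinedSymbol = 42; // semantic error, but only reported on instantiation
  }

  void main()
  {
      // broken!int();      // uncommenting this line triggers the error
  }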

-- 
/Jacob Carlborg
August 13, 2019
On 2019-08-12 21:58, H. S. Teoh wrote:

> This is a big part of why C++'s must-be-parsed-before-it-can-be-lexed
> syntax is a big hindrance to meaningful progress.  The only way such a
> needlessly over-complex syntax can be handled is a needlessly
> over-complex lexer/parser combo, which necessarily results in needlessly
> over-complex corner cases and other such gotchas.  Part of this
> nastiness is the poor choice of template syntax (overloading '<' and '>'
> to be delimiters in addition to their original roles of comparison
> operators), among several other things.

I don't know how this is implemented in a C++ compiler, but couldn't the lexer emit a more abstract token that covers both the template-delimiter and comparison-operator usages? The parser could then figure out exactly which one it is.

DMD is doing something similar, but at a later stage. For example, in the following code snippet: "int a = foo;", "foo" is parsed as an identifier expression. Then the semantic analyzer figures out whether "foo" is a function call or a variable.
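A minimal sketch of that in D:

  int foo() { return 1; }     // or: int foo = 1;

  void main()
  {
      int a = foo;            // parsed as an identifier expression either way;
                              // semantic analysis decides call vs. variable read
  }

Whether foo is a variable or a zero-argument function (called with optional parentheses), the parser sees exactly the same expression.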

-- 
/Jacob Carlborg