February 25, 2019
On 2/25/19 5:20 AM, Jacob Carlborg wrote:
> On 2019-02-25 02:04, Manu wrote:
> 
>> Why wouldn't you do it in the same pass as the .di output?
> 
> * Separation of concerns
> * Simplifying the compiler ("simplifying" is not the correct description, rather avoid making the compiler more complex)

Indeed so. There's also the network effect of tooling. Integrating within the compiler would be like the proverbial "giving someone a fish", whereas framing it as a tool that can be the first inspiring many others is akin to "teaching fishing".
February 25, 2019
On Mon, Feb 25, 2019 at 2:25 AM Jacob Carlborg via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On 2019-02-25 02:04, Manu wrote:
>
> > Why wouldn't you do it in the same pass as the .di output?
>
> * Separation of concerns
> * Simplifying the compiler ("simplifying" is not the correct
> description, rather avoid making the compiler more complex)
>
> I think the .di generation should be a separate tool as well.

Compile times already suck pretty hard, and I feel like it's a very valuable feature that DMD can emit .di files from the same compilation pass. It's already done all the work... why pay that build cost a second time for every source file?
February 25, 2019
On Mon, Feb 25, 2019 at 10:10 AM Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On 2/25/19 5:20 AM, Jacob Carlborg wrote:
> > On 2019-02-25 02:04, Manu wrote:
> >
> >> Why wouldn't you do it in the same pass as the .di output?
> >
> > * Separation of concerns

Are we planning to remove .di output?

> > * Simplifying the compiler ("simplifying" is not the correct description, rather avoid making the compiler more complex)

It seems theoretically very simple to me; whatever the .di code looks
like, I can imagine a filter for isExternCorCPP() on candidate nodes
when walking the AST. Seems like a pretty simple tweak of the existing
code... but I haven't looked at it.
I suspect it's one line in the AST-walk code, and 99% of the job is a big
ugly block that emits a C++ declaration instead of the D declaration?
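A toy sketch of the shape of that filter (Python, with made-up Decl/Linkage types standing in for dmd's AST; isExternCorCPP is the hypothetical name above, and nothing here reflects dmd's actual internals):

```python
# Hypothetical sketch: reuse the existing .di-style AST walk, but only
# emit declarations that carry C or C++ linkage. Decl/Linkage are
# illustrative stand-ins, not dmd types.
from dataclasses import dataclass, field
from enum import Enum

class Linkage(Enum):
    D = "D"
    C = "C"
    CPP = "C++"

@dataclass
class Decl:
    name: str
    type: str          # type already rendered as C++ text, for simplicity
    linkage: Linkage
    children: list = field(default_factory=list)

def is_extern_c_or_cpp(d: Decl) -> bool:
    return d.linkage in (Linkage.C, Linkage.CPP)

def emit_cpp_header(root: Decl, out: list):
    # The "one line" filter: skip leaf declarations with no C/C++ linkage.
    if not is_extern_c_or_cpp(root) and not root.children:
        return
    if is_extern_c_or_cpp(root):
        # The "big ugly block" in real life: render a C++ declaration.
        out.append(f"{root.type} {root.name}();")
    for child in root.children:
        emit_cpp_header(child, out)
```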

> Indeed so. There's also the network effect of tooling. Integrating within the compiler would be like the proverbial "giving someone a fish", whereas framing it as a tool that can be the first inspiring many others is akin to "teaching fishing".

That sounds nice, but it's bollocks; hand me dtoh as a separate tool
and I'm about 95% less likely to use it. It's easy to add a flag to
the command line of our hyper-complex build, but reworking custom
tooling into it, not so much.
I'm not a build engineer, and I have no idea how I'd wire a second
pass to each source compile if I wanted to. Tell me how to wire that
into VS? How do I wire that into Xcode? How do I express that in the
scripts that emit those project formats, and also in makefiles and ninja?
How do I express that the outputs (which are .h files) are correctly
expressed as inputs of dependent .cpp compile steps?
At best, it would take me hours or days to implement that
comprehensive solution; it might not be possible in all build
environments (Xcode is a disaster), and I will never spare the time.
Give me dtoh and you give me problems, not solutions.
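To make the ask concrete, the wiring in question looks roughly like this in bare make terms (a hypothetical sketch: the `dtoh` command and its flags are assumed, and every one of these edges would then have to be re-expressed in VS, Xcode, ninja, and the scripts that generate those project formats):

```make
# Normal compile step.
%.o: %.d
	dmd -c $< -of$@

# Hypothetical second pass: one extra dtoh invocation per module.
%.h: %.d
	dtoh $< -of$@

# Every dependent C++ compile must then list the generated header as an
# input, or parallel builds will race ahead of dtoh.
consumer.o: consumer.cpp bindings.h
	$(CXX) -c consumer.cpp -o consumer.o
```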

Certainly it *could* be a separate tool, but your argument that it's more enabling as a separate tool is the opposite of the truth. At best, it'll just waste more precious CI time and complicate our build.
February 25, 2019
On Mon, Feb 25, 2019 at 11:04:56AM -0800, Manu via Digitalmars-d wrote:
> On Mon, Feb 25, 2019 at 10:10 AM Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
[...]
> > Indeed so. There's also the network effect of tooling. Integrating within the compiler would be like the proverbial "giving someone a fish", whereas framing it as a tool that can be the first inspiring many others is akin to "teaching fishing".
> 
> That sounds nice, but it's bollocks though; give me dtoh, i'm about
> 95% less likely to use it. It's easy to add a flag to the command line
> of our hyper-complex build, but reworking custom tooling into it, not
> so much.
> I'm not a build engineer, and I have no idea how I'd wire a second
> pass to each source compile if I wanted to. Tell me how to wire that
> > into VS? How do I wire that into Xcode? How do I express that in the
> scripts that emit those project formats, and also makefiles and ninja?
> How do I express that the outputs (which are .h files) are correctly
> expressed as inputs of dependent .cpp compile steps?
[...]

<off-topic rant>
This is a perfect example of what has gone completely wrong in the world
of build systems. Too many assumptions and poor designs over an
extremely simple and straightforward dependency graph walk algorithm,
that turn something that ought to be trivial to implement into a
gargantuan task that requires a dedicated job title like "build
engineer".  It's completely insane, yet people accept it as a fact of
life. It boggles the mind.
</off-topic rant>


T

-- 
Those who don't understand Unix are condemned to reinvent it, poorly.
February 25, 2019
On 2/25/19 2:04 PM, Manu wrote:
> On Mon, Feb 25, 2019 at 10:10 AM Andrei Alexandrescu via Digitalmars-d
> <digitalmars-d@puremagic.com> wrote:
>>
>> On 2/25/19 5:20 AM, Jacob Carlborg wrote:
>>> On 2019-02-25 02:04, Manu wrote:
>>>
>>>> Why wouldn't you do it in the same pass as the .di output?
>>>
>>> * Separation of concerns
> 
> Are we planning to remove .di output?
> 
>>> * Simplifying the compiler ("simplifying" is not the correct
>>> description, rather avoid making the compiler more complex)
> 
> It seems theoretically very simple to me; whatever the .di code looks
> like, I can imagine a filter for isExternCorCPP() on candidate nodes
> when walking the AST. Seems like a pretty simple tweak of the existing
> code... but I haven't looked at it.
> I suspect 1 line in the AST walk code, and 99% of the job, a big ugly
> block that emits a C++ declaration instead of the D declaration?
> 
>> Indeed so. There's also the network effect of tooling. Integrating
>> within the compiler would be like the proverbial "giving someone a
>> fish", whereas framing it as a tool that can be the first inspiring many
>> others is akin to "teaching fishing".
> 
> That sounds nice, but it's bollocks though; give me dtoh, i'm about
> 95% less likely to use it. It's easy to add a flag to the command line
> of our hyper-complex build, but reworking custom tooling into it, not
> so much.

More like dog's ones, right? :o)

There are indeed arguments going either way. The point is that a universe of tools can be built on top of the compiler as a library, and only a minority of them can realistically be integrated into the compiler itself.

That said, I'd take such work in either form!


Andrei
February 25, 2019
On Monday, 25 February 2019 at 19:28:54 UTC, H. S. Teoh wrote:
> On Mon, Feb 25, 2019 at 11:04:56AM -0800, Manu via Digitalmars-d wrote:
>> On Mon, Feb 25, 2019 at 10:10 AM Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> [...]
>> > Indeed so. There's also the network effect of tooling. Integrating within the compiler would be like the proverbial "giving someone a fish", whereas framing it as a tool that can be the first inspiring many others is akin to "teaching fishing".
>> 
>> That sounds nice, but it's bollocks though; give me dtoh, i'm about
>> 95% less likely to use it. It's easy to add a flag to the command line
>> of our hyper-complex build, but reworking custom tooling into it, not
>> so much.
>> I'm not a build engineer, and I have no idea how I'd wire a second
>> pass to each source compile if I wanted to. Tell me how to wire that
>> into VS? How do I wire that into Xcode? How do I express that in the
>> scripts that emit those project formats, and also makefiles and ninja?
>> How do I express that the outputs (which are .h files) are correctly
>> expressed as inputs of dependent .cpp compile steps?
> [...]
>
> <off-topic rant>
> This is a perfect example of what has gone completely wrong in the world
> of build systems. Too many assumptions and poor designs over an
> extremely simple and straightforward dependency graph walk algorithm,
> that turn something that ought to be trivial to implement into a
> gargantuan task that requires a dedicated job title like "build
> engineer".  It's completely insane, yet people accept it as a fact of
> life. It boggles the mind.
> </off-topic rant>
>
>
> T

I don't think it is as simple as you make it seem, especially when you need to start adding components to the build that aren't source code. Add different operating systems on top of that. Each has different requirements, so how do you avoid making assumptions? You have to implement something in some way; you can't just not implement it. In doing so you weigh the benefits and drawbacks of particular implementations, which means no build system can be suitable for all circumstances. If you are building something simple, it is very easy to say that build systems are over-complicated. If you don't have to worry about skipping files that don't need rebuilding, then everything becomes extremely simple: my <50-line build script is all I need. It gets more complicated the moment I do want to avoid rebuilding files that don't need it. That is especially bad for something like D, where you can import anywhere, and where mixins can import too, via from!"std.stdio".

It's easy to say build-systems are overly complicated until you actually work on a big project.
February 25, 2019
On Mon, Feb 25, 2019 at 10:14:18PM +0000, Rubn via Digitalmars-d wrote:
> On Monday, 25 February 2019 at 19:28:54 UTC, H. S. Teoh wrote:
[...]
> > <off-topic rant>
> > This is a perfect example of what has gone completely wrong in the world
> > of build systems. Too many assumptions and poor designs over an
> > extremely simple and straightforward dependency graph walk algorithm,
> > that turn something that ought to be trivial to implement into a
> > gargantuan task that requires a dedicated job title like "build
> > engineer".  It's completely insane, yet people accept it as a fact of
> > life. It boggles the mind.
> > </off-topic rant>
[...]
> I don't think it is as simple as you make it seem. Especially when you need to start adding components to the build that aren't source code.

It's very simple. The build description is essentially a DAG whose nodes represent files (well, any product, really, but let's say files for a concrete example) and whose edges represent commands that transform input files into output files. All the build system has to do is a topological walk of this DAG, executing the command associated with each edge to derive the output from the input.

This is all that's needed. The rest is fluff.

The basic problem with today's build systems is that they impose arbitrary assumptions on top of this simple DAG. For example, all input nodes are arbitrarily restricted to source code files, or in some bad cases, source code of some specific language or set of languages. Then they arbitrarily limit edges to be only compiler invocations and/or linker invocations.  So the result is that if you have an input file that isn't source code, or if the output file requires invoking something other than a compiler/linker, then the build system doesn't support it and you're left out in the cold.

Worse yet, many "modern" build systems assume a fixed depth of paths in the graph, i.e., you can only compile source files into binaries, you cannot compile a subset of source files into an auxiliary utility that in turn generates new source files that are then compiled into an executable.  So automatic code generation is ruled out, preprocessing is ruled out, etc., unless you shoehorn all of that into the compiler invocation, which is a ridiculous idea.

None of these restrictions are necessary, and they only needlessly limit what you can do with your build system.

I understand that these assumptions are primarily to simplify the build description, e.g., by inferring dependencies so that you don't have to specify edges and nodes yourself (which is obviously impractical for large projects).  But these additional niceties ought to be implemented as a SEPARATE layer on top of the topological walk, and the user should not be arbitrarily prevented from directly accessing the DAG description.  The way so many build systems are designed is that either you have to do everything manually, like makefiles, which everybody hates, or the hood is welded shut and you can only do what the authors decide that you should be able to do and nothing else.


[...]
> It's easy to say build-systems are overly complicated until you actually work on a big project.

You seem to think that I'm talking out of an ivory tower.  I assure you I know what I'm talking about.  I have written actual build systems that do things like this:

- Compile a subset of source files into a utility;

- Run said utility to transform certain input data files into source
  code;

- Compile the generated source code into executables;

- Run said executables on other data files to transform the data into
  PovRay scene files;

- Run PovRay to produce images;

- Run post-processing utilities on said images to crop / reborder them;

- Run another utility to convert these images into animations;

- Install these animations into a target directory.

- Compile another set of source files into a different utility;

- Run said utility on input files to transform them to PHP input files;

- Run php-cli to generate HTML from said input files;

- Install said HTML files into a target directory.

- Run a network utility to retrieve the history of a specific log file
  and pipe it through a filter to extract a list of dates.

- Run a utility to transform said dates into a gnuplot input file for
  generating a graph;

- Run gnuplot to create the graph;

- Run postprocessing image utilities to touch up the image;

- Install the result into the target directory.

None of the above are baked-in rules. The user is fully capable of specifying whatever transformation he wants on whatever inputs he wants to produce whatever output he wants.  No straitjackets, no stupid hacks to work around stupid build system limitations. Tell it how you want your inputs to be transformed into outputs, and it handles the rest for you.

Furthermore, the build system is incremental: if I modify any of the above input files, it automatically runs the necessary commands to derive the updated output files AND NOTHING ELSE (i.e., it does not needlessly re-derive stuff that hasn't changed).  Better yet, if any of the intermediate output files are identical to the previous outputs, the build stops right there and does not needlessly recreate other outputs down the line.

The build system is also reliable: running the build in a dirty workspace produces products identical to running it in a fresh checkout.  I never have to worry about the equivalent of 'make clean; make', which is a stupid thing to have to do in 2019. I have a workspace that hasn't been "cleaned" for months, and running the build on it produces exactly the same outputs as a fresh checkout.

There's more I can say, but basically, this is the power that having direct access to the DAG can give you.  In this day and age, it's inexcusable not to be able to do this.

Any build system that cannot do all of the above is a crippled build system that I will not use, because life is far too short to waste fighting with your build system rather than getting things done.


T

-- 
English has the lovely word "defenestrate", meaning "to execute by throwing someone out a window", or more recently "to remove Windows from a computer and replace it with something useful". :-) -- John Cowan
February 26, 2019

On 26/2/19 9:25 am, H. S. Teoh wrote:
> On Mon, Feb 25, 2019 at 10:14:18PM +0000, Rubn via Digitalmars-d wrote:
>> On Monday, 25 February 2019 at 19:28:54 UTC, H. S. Teoh wrote:
> [...]
>>> <off-topic rant>
>>> This is a perfect example of what has gone completely wrong in the world
>>> of build systems. Too many assumptions and poor designs over an
>>> extremely simple and straightforward dependency graph walk algorithm,
>>> that turn something that ought to be trivial to implement into a
>>> gargantuan task that requires a dedicated job title like "build
>>> engineer".  It's completely insane, yet people accept it as a fact of
>>> life. It boggles the mind.
>>> </off-topic rant>
> [...]
>> I don't think it is as simple as you make it seem. Especially when you
>> need to start adding components that need to be build that isn't
>> source code.
> 
> It's very simple. The build description is essentially a DAG whose nodes
> represent files (well, any product, really, but let's say files for a
> concrete example), and whose edges represent commands that transform
> input files into output files. All the build system has to do is to do a
> topological walk of this DAG, and execute the commands associated with
> each edge to derive the output from the input.
> 
> This is all that's needed. The rest are all fluff.
> 
> The basic problem with today's build systems is that they impose
> arbitrary assumptions on top of this simple DAG. For example, all input
> nodes are arbitrarily restricted to source code files, or in some bad
> cases, source code of some specific language or set of languages. Then
> they arbitrarily limit edges to be only compiler invocations and/or
> linker invocations.  So the result is that if you have an input file
> that isn't source code, or if the output file requires invoking
> something other than a compiler/linker, then the build system doesn't
> support it and you're left out in the cold.
> 
> Worse yet, many "modern" build systems assume a fixed depth of paths in
> the graph, i.e., you can only compile source files into binaries, you
> cannot compile a subset of source files into an auxiliary utility that
> in turn generates new source files that are then compiled into an
> executable.  So automatic code generation is ruled out, preprocessing is
> ruled out, etc., unless you shoehorn all of that into the compiler
> invocation, which is a ridiculous idea.
> 
> None of these restrictions are necessary, and they only needlessly limit
> what you can do with your build system.
> 
> I understand that these assumptions are primarily to simplify the build
> description, e.g., by inferring dependencies so that you don't have to
> specify edges and nodes yourself (which is obviously impractical for
> large projects).  But these additional niceties ought to be implemented
> as a SEPARATE layer on top of the topological walk, and the user should
> not be arbitrarily prevented from directly accessing the DAG
> description.  The way so many build systems are designed is that either
> you have to do everything manually, like makefiles, which everybody
> hates, or the hood is welded shut and you can only do what the authors
> decide that you should be able to do and nothing else.
> 
> 
> [...]
>> It's easy to say build-systems are overly complicated until you
>> actually work on a big project.
> 
> You seem to think that I'm talking out of an ivory tower.  I assure you
> I know what I'm talking about.  I have written actual build systems that
> do things like this:
> 
> - Compile a subset of source files into a utility;
> 
> - Run said utility to transform certain input data files into source
>    code;
> 
> - Compile the generated source code into executables;
> 
> - Run said executables on other data files to transform the data into
>    PovRay scene files;
> 
> - Run PovRay to produce images;
> 
> - Run post-processing utilities on said images to crop / reborder them;
> 
> - Run another utility to convert these images into animations;
> 
> - Install these animations into a target directory.
> 
> - Compile another set of source files into a different utility;
> 
> - Run said utility on input files to transform them to PHP input files;
> 
> - Run php-cli to generate HTML from said input files;
> 
> - Install said HTML files into a target directory.
> 
> - Run a network utility to retrieve the history of a specific log file
>    and pipe it through a filter to extract a list of dates.
> 
> - Run a utility to transform said dates into a gnuplot input file for
>    generating a graph;
> 
> - Run gnuplot to create the graph;
> 
> - Run postprocessing image utilities to touch up the image;
> 
> - Install the result into the target directory.
> 
> None of the above are baked-in rules. The user is fully capable of
> specifying whatever transformation he wants on whatever inputs he wants
> to produce whatever output he wants.  No straitjackets, no stupid hacks
> to work around stupid build system limitations. Tell it how you want
> your inputs to be transformed into outputs, and it handles the rest for
> you.
> 
> Furthermore, the build system is incremental: if I modify any of the
> above input files, it automatically runs the necessary commands to
> derive the updated output files AND NOTHING ELSE (i.e., it does not
> needlessly re-derive stuff that hasn't changed).  Better yet, if any of
> the intermediate output files are identical to the previous outputs, the
> build stops right there and does not needlessly recreate other outputs
> down the line.
> 
> The build system is also reliable: running the build in a dirty
> workspace produces identical products as running the build in a fresh
> checkout.  I never have to worry about doing the equivalent of 'make
> clean; make', which is a stupid thing to have to do in 2019. I have a
> workspace that hasn't been "cleaned" for months, and running the build
> on it produces exactly the same outputs as a fresh checkout.
> 
> There's more I can say, but basically, this is the power that having
> direct access to the DAG can give you.  In this day and age, it's
> inexcusable not to be able to do this.
> 
> Any build system that cannot do all of the above is a crippled build
> system that I will not use, because life is far too short to waste
> fighting with your build system rather than getting things done.
> 
> 
> T
> 

I'd be interested in your thoughts on
https://github.com/GrahamStJack/bottom-up-build

We use it here (a commercial environment, with deliverables to defence & commercial customers); it was created as a response to poor existing build tools. It is rigid about preventing circularities, and handles code generation as part of its build cycle. It currently deals with a mixed C++/D codebase of well over half a million lines. It is agnostic to the tool chain - that's part of the configuration - we just use C++ (clang & gcc) and D (dmd & ldc). It also allows the codebase to be split among multiple repositories.

--ted
February 25, 2019
On Mon, Feb 25, 2019 at 2:55 PM H. S. Teoh via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Mon, Feb 25, 2019 at 10:14:18PM +0000, Rubn via Digitalmars-d wrote:
> > On Monday, 25 February 2019 at 19:28:54 UTC, H. S. Teoh wrote:
> [...]
> > > <off-topic rant>
> > > This is a perfect example of what has gone completely wrong in the world
> > > of build systems. Too many assumptions and poor designs over an
> > > extremely simple and straightforward dependency graph walk algorithm,
> > > that turn something that ought to be trivial to implement into a
> > > gargantuan task that requires a dedicated job title like "build
> > > engineer".  It's completely insane, yet people accept it as a fact of
> > > life. It boggles the mind.
> > > </off-topic rant>
> [...]
> > I don't think it is as simple as you make it seem. Especially when you need to start adding components that need to be build that isn't source code.
>
> It's very simple. The build description is essentially a DAG whose nodes represent files (well, any product, really, but let's say files for a concrete example), and whose edges represent commands that transform input files into output files. All the build system has to do is to do a topological walk of this DAG, and execute the commands associated with each edge to derive the output from the input.

Problem #1:
You don't know the edges of the DAG until AFTER you run the compiler
(i.e., discovering imports/#includes, etc. from the source code).
You also want to run the build with all 64 cores in your machine.

File B's build depends on file A's build output, but it can't know that until after it attempts (and fails) to build B...

How do you resolve this tension?
There's no 'simple' solution to this problem that I'm aware of. You
start to address this with higher-level structure, and that is not a
'simple DAG' anymore.

Now... whatever solution you concluded; express that in make, ninja, MSBuild, .xcodeproj...
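For what it's worth, one common partial answer to that tension is the depfile scheme gcc and ninja use: build anyway, record the dependencies the compiler discovered, and feed them into the DAG for the next run. A toy sketch of just the recording half (everything here is illustrative; a real scheduler still has to cope with the first, discovery-limited build, which is exactly the 64-core problem above):

```python
# Depfile-style sketch: edges discovered *during* compilation are
# recorded so they become part of the known DAG for the next build.
def compile_with_discovery(module, sources, depcache, compiled):
    """sources[m] is module m's text; imports are only discovered while
    compiling, mirroring the problem. depcache persists across builds,
    like the .d files gcc -MD emits for ninja/make."""
    deps = [line.split()[1] for line in sources[module].splitlines()
            if line.startswith("import ")]
    for dep in deps:  # discovered mid-build, forcing serialization here
        if dep not in compiled:
            compile_with_discovery(dep, sources, depcache, compiled)
    depcache[module] = deps  # recorded for the next, fully parallel run
    compiled.append(module)
```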
February 25, 2019
On Mon, Feb 25, 2019 at 11:29 AM H. S. Teoh via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Mon, Feb 25, 2019 at 11:04:56AM -0800, Manu via Digitalmars-d wrote:
> > On Mon, Feb 25, 2019 at 10:10 AM Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> [...]
> > > Indeed so. There's also the network effect of tooling. Integrating within the compiler would be like the proverbial "giving someone a fish", whereas framing it as a tool that can be the first inspiring many others is akin to "teaching fishing".
> >
> > That sounds nice, but it's bollocks though; give me dtoh, i'm about
> > 95% less likely to use it. It's easy to add a flag to the command line
> > of our hyper-complex build, but reworking custom tooling into it, not
> > so much.
> > I'm not a build engineer, and I have no idea how I'd wire a second
> > pass to each source compile if I wanted to. Tell me how to wire that
> > into VS? How do I wire that into Xcode? How do I express that in the
> > scripts that emit those project formats, and also makefiles and ninja?
> > How do I express that the outputs (which are .h files) are correctly
> > expressed as inputs of dependent .cpp compile steps?
> [...]
>
> <off-topic rant>
> This is a perfect example of what has gone completely wrong in the world
> of build systems. Too many assumptions and poor designs over an
> extremely simple and straightforward dependency graph walk algorithm,
> that turn something that ought to be trivial to implement into a
> gargantuan task that requires a dedicated job title like "build
> engineer".  It's completely insane, yet people accept it as a fact of
> life. It boggles the mind.
> </off-topic rant>

I couldn't agree more (that existing solutions make it harder than it
needs to be... not that it's actually easy in the first place), but
this is how it is. I can't change that, and I have work to do.
Is D an ecosystem that I use to get my work done, or is it one that I
use to do some intellectual masturbation on the weekend?
I've been tirelessly trying to make the former my reality for a long
time now... I've failed so far... I don't know what to do to correct
the trajectory.
I figure if you just try to clear the hurdles, one by one they will
eventually be cleared. But I've also become tired in the meantime.