December 12, 2018
On 11.12.2018 at 20:46, H. S. Teoh wrote:
> On Tue, Dec 11, 2018 at 11:26:45AM +0100, Sönke Ludwig via Digitalmars-d-announce wrote:
> [...]
> 
>> The main open point right now AFAICS is to make --parallel work with
>> the multiple-files-at-once build modes for machines that have enough
>> RAM. This is rather simple, but someone has to do it. But apart from
>> that, I think that the current state is relatively fine from a
>> performance point of view.
> 
> Wait, what does --parallel do if it doesn't compile multiple files at
> once?

It currently only works when building with `--build-mode=singleFile`, so compiling individual files in parallel instead of compiling chunks of files in parallel, which would be the ideal.
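
For reference, this means parallel compilation currently has to be requested along these lines:

```
dub build --parallel --build-mode=singleFile
```
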
>>> Then it requires a specific source layout, with incomplete /
>>> non-existent configuration options for alternatives.  Which makes it
>>> unusable for existing code bases.  Unacceptable.
>>
>> You can define arbitrary import/source directories and list (or
>> delist) source files individually if you want. There are restrictions
>> on the naming of the output binary, though, is that what you mean?
> 
> Is this documented? I couldn't find any info on it the last time I
> looked.

There are three directives to control the list of files: sourcePaths, sourceFiles and excludedSourceFiles (the latter two supporting wildcard expressions). Once an explicit sourcePaths directive is given, the folder that would otherwise be detected by default ("source"/"src") is skipped. They are documented in the package format specs ([1], [2]).
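
For illustration, a hypothetical dub.sdl snippet using all three (directory and file names made up):

```
sourcePaths "engine/src" "tools/src"
sourceFiles "generated/build_info.d"
excludedSourceFiles "engine/src/experimental/*.d"
```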

> 
> Also, you refer to "the output binary". Does that mean I cannot
> generate multiple executables? 'cos that's a showstopper for me.

Compiling multiple executables currently requires either multiple invocations (e.g. with different configurations or sub packages specified) or a targetType "none" package that has one dependency per executable - in that case the same configuration/architecture applies to all of them. If they are actual build dependencies, a possible approach is to invoke dub recursively inside a preBuildCommand.
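
As a sketch, such an aggregator package could look like this in dub.sdl (package names and paths invented for illustration):

```
name "all-tools"
targetType "none"
dependency "tool-a" path="./tool-a"
dependency "tool-b" path="./tool-b"
```

A plain `dub build` on this package then builds both executables in one go.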

But what I meant is that there is for example currently no way to customize the output binary base name ("targetName") and directory ("targetPath") depending on the build type.

>>> Worst of all, it does not support custom build actions, which is a
>>> requirement for many of my projects.  It does not support polyglot
>>> projects. It either does not support explicit control over exact
>>> build commands, or any such support is so poorly documented it might
>>> as well not exist.  This is not only unacceptable, it is a
>>> show-stopper.
>>
>> Do you mean modifying the compiler invocations that DUB generates or
>> adding custom commands (aka pre/post build/generate commands)?
> 
> Does dub support the following scenario?
> 
> - There's a bunch of .java files that have to be compiled with javac.
>     - But some of the .java files are generated by an external tool, that
>       must be run first, before the .java files are compiled.
> - There's a bunch of .d files in two directories.
>     - The second directory contains .d files that need to be compiled
>       into multiple executables, and they must be compiled with a local
>       (i.e., non-cross) compiler.
>     - Some of the resulting executables must be run first in order to
>       generate a few .d files in the first directory (in addition to
>       what's already there).
>     - After the .d files are generated, the first directory needs to be
>       compiled TWICE: once with a cross-compiler (LDC, targetting
>       Arm/Android), once with the local D compiler. The first compilation
>       must link with cross-compilation Android runtime libraries, and the
>       second compilation must link with local X11 libraries.
>        - (And obviously, the build products must be put in separate
>          subdirectories to prevent stomping over each other.)
> - After the .java and .d files are compiled, a series of tools must be
>    invoked to generate an .apk file, which also includes a bunch of
>    non-code files in resource subdirectories.  Then, another tool must be
>    run to align and sign the .apk file.
> 
> And here's a critical requirement: any time a file is changed (it can be
> a .java file, a .d file, or one of the resources that they depend on),
> all affected build products must be correctly updated. This must be done
> as efficiently as possible, because it's part of my code-compile-test
> cycle, and if it requires more than a few seconds or recompiling the
> entire codebase, it's a no-go.
> 
> If dub can handle this, then I'm suitably impressed, and retract most of
> my criticisms against it. ;-)

Realistically, this will currently require invoking an external tool such as make through a pre/post-build command (although it may actually be possible to hack this together using sub packages, build commands, and string import paths for the file dependencies). Most notably, there is a directive missing to specify arbitrary files as build dependencies.
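
As a rough sketch, the escape hatch looks like this in dub.sdl (the Makefile targets are of course hypothetical):

```
preBuildCommands "make -C codegen all"
postBuildCommands "make -C packaging apk"
```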

Another feature that should be there to get this to a fully self-contained state is the ability to build `targetType "none"` packages that include custom build commands. Currently this will just result in a build error.
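
That is, a hypothetical recipe like this one (the command is made up) is currently rejected instead of simply running its commands:

```
name "asset-pipeline"
targetType "none"
preBuildCommands "generate-assets --out assets/"
```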

BTW, my plan for the Android part of this was to add support for plugins (fetchable from the registry, see [3] for a draft) that handle the details in a centralized manner instead of having to put that knowledge into the build recipe of each individual project. Unfortunately I still don't really have time to do substantial work on DUB.

So there is definitely still enough valid criticism, especially as far as language-heterogeneous projects are concerned (see also [4]).

[1] https://dub.pm/package-format-sdl.html#build-settings
[2] https://dub.pm/package-format-json.html#build-settings
[3] https://github.com/dlang/dub/wiki/DEP3
[4] https://github.com/dlang/dub/wiki/DEP5
December 12, 2018
On Wednesday, 12 December 2018 at 09:38:55 UTC, Sönke Ludwig wrote:
> On 11.12.2018 at 20:46, H. S. Teoh wrote:
>> On Tue, Dec 11, 2018 at 11:26:45AM +0100, Sönke Ludwig via Digitalmars-d-announce wrote:
>> [...]
>> 
>>> The main open point right now AFAICS is to make --parallel work with
>>> the multiple-files-at-once build modes for machines that have enough
>>> RAM. This is rather simple, but someone has to do it. But apart from
>>> that, I think that the current state is relatively fine from a
>>> performance point of view.
>> 
>> Wait, what does --parallel do if it doesn't compile multiple files at
>> once?
>
> It currently only works when building with `--build-mode=singleFile`, so compiling individual files in parallel instead of compiling chunks of files in parallel, which would be the ideal.

If by "the ideal" you mean "compile the fastest", then you don't want to compile single files in parallel. I measured across multiple projects, and compiling per package (in the D sense, not the dub one) was fastest. Which is why it's the default with reggae.

December 12, 2018
On 12.12.2018 at 15:53, Atila Neves wrote:
> On Wednesday, 12 December 2018 at 09:38:55 UTC, Sönke Ludwig wrote:
>> On 11.12.2018 at 20:46, H. S. Teoh wrote:
>>> On Tue, Dec 11, 2018 at 11:26:45AM +0100, Sönke Ludwig via Digitalmars-d-announce wrote:
>>> [...]
>>>
>>>> The main open point right now AFAICS is to make --parallel work with
>>>> the multiple-files-at-once build modes for machines that have enough
>>>> RAM. This is rather simple, but someone has to do it. But apart from
>>>> that, I think that the current state is relatively fine from a
>>>> performance point of view.
>>>
>>> Wait, what does --parallel do if it doesn't compile multiple files at
>>> once?
>>
>> It currently only works when building with `--build-mode=singleFile`, so compiling individual files in parallel instead of compiling chunks of files in parallel, which would be the ideal.
> 
> If by "the ideal" you mean "compile the fastest", then you don't want to compile single files in parallel. I measured across multiple projects, and compiling per package (in the D sense, not the dub one) was fastest. Which is why it's the default with reggae.
> 

The sentence was ambiguous, but that's what I meant!
December 12, 2018
On Wednesday, 12 December 2018 at 09:38:55 UTC, Sönke Ludwig wrote:
> Most notably, there is a directive missing to specify arbitrary files as build dependencies.

I am working on a pull request:
https://github.com/andre2007/dub/commit/97161fb352dc1237411e2e7010447f8a9e817d48

The production implementation is finished; only the tests are missing.

Kind regards
André
December 12, 2018
On Wed, Dec 12, 2018 at 10:38:55AM +0100, Sönke Ludwig via Digitalmars-d-announce wrote:
> On 11.12.2018 at 20:46, H. S. Teoh wrote:
> > [...]
> > Wait, what does --parallel do if it doesn't compile multiple files
> > at once?
> 
> It currently only works when building with `--build-mode=singleFile`, so compiling individual files in parallel instead of compiling chunks of files in parallel, which would be the ideal.

Ah, I see.  But that should be relatively easy to fix, right?


[...]
> There are three directives to control the list of files: sourcePaths, sourceFiles and excludedSourceFiles (the latter two supporting wildcard expressions). Once an explicit sourcePaths directive is given, the folder that would otherwise be detected by default ("source"/"src") is skipped. They are documented in the package format specs ([1], [2]).

Thanks for the info.


> > Also, you refer to "the output binary". Does that mean I cannot generate multiple executables? 'cos that's a showstopper for me.
> 
> Compiling multiple executables currently requires either multiple invocations (e.g. with different configurations or sub packages specified) or a targetType "none" package that has one dependency per executable - in that case the same configuration/architecture applies to all of them. If they are actual build dependencies, a possible approach is to invoke dub recursively inside a preBuildCommand.

Unfortunately, that is not a practical solution for me.  Many of my projects have source files that are generated by utilities that are themselves D code that needs to be compiled (and run) as part of the build.  I suppose in theory I could separate them into subpackages, and factor out the common code shared between these utilities and the main executable(s), but that is far too much work for something that IMO ought to be very simple -- since most of the utilities are single-file drivers with a small number of imports of some shared modules. Creating entire subpackages for each of them just seems excessive, esp. during development where the set of utilities / generated files may change a lot.  Creating/deleting a subpackage every time is just too much work for little benefit.

Also, does dub correctly support the case where some .d files are generated by said utilities (which would be dub subpackages, if we hypothetically went with that setup), but the output may change depending on the contents of some input data/config files? I.e., if I change a data file and run dub again, it ought to re-run the codegen tool and then recompile the main executable that contains the changed code.  This is a pretty important use-case for me, since it's kinda the whole point of having a codegen tool.

Compiling the same set of sources for multiple archs (with each arch possibly entailing a separate list of source files) is kinda a special case for my current Android project; generally I don't really need support for this. But solid support for codegen that properly percolates changes from input data down to recompiling executables is a must-have for me.  Not being able to do this in the most efficient way possible would greatly hamper my productivity.


> But what I meant is that there is for example currently no way to
> customize the output binary base name ("targetName") and directory
> ("targetPath") depending on the build type.

But this shouldn't be difficult to support, right?  Though I don't particularly need this feature -- for the time being.


[...]
> > Does dub support the following scenario?
[...]
> Realistically, this will currently require invoking an external tool such as make through a pre/post-build command (although it may actually be possible to hack this together using sub packages, build commands, and string import paths for the file dependencies). Most notably, there is a directive missing to specify arbitrary files as build dependencies.

I see.  I think this is a basic limitation of dub's design -- it assumes a certain (common) compilation model of sources to (single) executable, and everything else is only expressible in terms of larger abstractions like subpackages.  It doesn't really match the way I work, which I guess explains my continuing frustration with using it.  I think of my build processes as a general graph of arbitrary input files being converted by arbitrary operations (not just compilation) into arbitrary output files. When I'm unable to express this in a simple way in my build spec, or when I'm forced to use tedious workarounds to express what in my mind ought to be something very simple, it distracts me from focusing on my problem domain, and results in a lot of lost time/energy and frustration.


[...]
> BTW, my plan for the Android part of this was to add support for plugins (fetchable from the registry, see [3] for a draft) that handle the details in a centralized manner instead of having to put that knowledge into the build recipe of each individual project. Unfortunately I still don't really have time to do substantial work on DUB.
[...]

This would be nice, though without support for codegen tools and general data transformation actions, and having a core design that, AFAICT, is ill-suited to this sort of use, I'm afraid I can't see myself using dub in any serious way in the foreseeable future.  The most I can see myself using dub for, unfortunately, is just to fetch code.dlang.org dependencies via an empty dummy project, like I've been doing for my vibe.d project.

Though, on the brighter side, after updating dub to the latest version yesterday I did note that there has been a noticeable performance improvement.  I didn't measure exactly how much or how it compares to alternative build tools that I use, but as far as dub-compatible builds are concerned, things seem to be improving, and that's a good thing. Even if I won't be able to reap the benefits myself.


T

-- 
All problems are easy in retrospect.
December 12, 2018
On Wednesday, December 12, 2018 1:33:39 PM MST H. S. Teoh via Digitalmars-d-announce wrote:
> On Wed, Dec 12, 2018 at 10:38:55AM +0100, Sönke Ludwig via Digitalmars-d-announce wrote:
> > On 11.12.2018 at 20:46, H. S. Teoh wrote:
> > > Does dub support the following scenario?
>
> [...]
>
> > Realistically, this will currently require invoking an external tool such as make through a pre/post-build command (although it may actually be possible to hack this together using sub packages, build commands, and string import paths for the file dependencies). Most notably, there is a directive missing to specify arbitrary files as build dependencies.
>
> I see.  I think this is a basic limitation of dub's design -- it assumes a certain (common) compilation model of sources to (single) executable, and everything else is only expressible in terms of larger abstractions like subpackages.  It doesn't really match the way I work, which I guess explains my continuing frustration with using it.  I think of my build processes as a general graph of arbitrary input files being converted by arbitrary operations (not just compilation) into arbitrary output files. When I'm unable to express this in a simple way in my build spec, or when I'm forced to use tedious workarounds to express what in my mind ought to be something very simple, it distracts me from focusing on my problem domain, and results in a lot of lost time/energy and frustration.

What you're describing sounds like it would require a lot of extra machinery in comparison to how dub is designed to work. dub solves the typical use case of building a single executable or library (which is what the vast majority of projects do), and it removes the need to specify much of anything to make that work, making it fantastic for the typical use case but causing problems for any use cases that have more complicated needs. I really don't see how doing much of anything other than building a single executable or library from a dub project is going to result in anything other than frustration with the tool, even if you can make it work. By the very nature of what you'd be trying to do, you'd be constantly trying to work around how dub is designed to work.

dub can do more thanks to subprojects and some of the extra facilities it has for running stuff before or after the build, but all of that sort of stuff has to work around dub's core design, making it generally awkward to use. To do something more complex, at some point what you really want is basically a build script (albeit maybe with some extra facilities to properly detect whether certain phases of the build can be skipped).

I would think that to be fully flexible, dub would need to abstract things a bit more, maybe effectively using a plugin system for builds so that it's possible to have a dub project that uses dub for pulling in dependencies but which can use whatever build system works best for your project (with the current dub build system being the default). But of course, even if that is made to work well, it then introduces the problem of random dub projects then needing 3rd party build systems that you may or may not have (which is one of the things that dub's current build system mostly avoids).

On some level, dub is able to do as well as it does precisely because it's able to assume a bunch of stuff about D projects which is true the vast majority of the time, and the more it allows projects that don't work that way, the worse dub is going to work as a general tool, because it increasingly opens up problems with regards to whether you have the right tools or environment to build a particular project when using it as a dependency. However, if we don't figure out how to make it more flexible, then certain classes of projects really aren't going to work well with dub. That's less of a problem if the project is not a library (and thus does not need to be a dub package so that other packages can pull it in as a dependency) and if dub provides a good way to just make libraries available as dependencies rather than requiring that the ultimate target be built with dub, but even then, it doesn't solve the problem when the target _is_ a library (e.g. what if it were wrapping a C or C++ library and needed to do a bunch of extra steps for code generation and needed multiple build steps).

So, I don't know. Ultimately, what this seems to come down to is that all of the stuff that dub does to make things simple for the common case makes it terrible for complex cases, but making it work well for complex cases would almost certainly make it _far_ worse for the common case. So, I don't know that we really want to be drastically changing how dub works, but I do think that we need to make it so that more is possible with it (even if it's more painful, because it's doing something that goes against the typical use case).

The most obvious thing that I can think of is to make it work better to use dub to pull in libraries without actually using dub as the build tool for your project. That doesn't solve using dub to actually build your project when it needs a complex build, but it makes working with it better when it really doesn't make sense to use dub for more than pulling in dependencies. Beyond that, I suspect that if we really wanted to make dub truly flexible, we'd have to look into making it more plugin-based to allow alternate build systems, but that's a _much_ larger shift in how it works. Regardless, it would require manpower that isn't currently being targeted at dub.

- Jonathan M Davis
December 12, 2018
On Wed, Dec 12, 2018 at 02:52:09PM -0700, Jonathan M Davis via Digitalmars-d-announce wrote: [...]
> I would think that to be fully flexible, dub would need to abstract things a bit more, maybe effectively using a plugin system for builds so that it's possible to have a dub project that uses dub for pulling in dependencies but which can use whatever build system works best for your project (with the current dub build system being the default). But of course, even if that is made to work well, it then introduces the problem of random dub projects then needing 3rd party build systems that you may or may not have (which is one of the things that dub's current build system mostly avoids).

And here is the crux of my rant about build systems (earlier in this thread).  There is no *technical reason* why build systems should be constricted in this way. Today's landscape of specific projects being inextricably tied to a specific build system is completely the wrong approach.

Projects should not be tied to a specific build system.  Instead, whatever build tool the author uses to build the project should export a universal description of how to build it, in a standard format that can be imported by any other build system. This description should be a fully general DAG, that specifies all inputs, all outputs (including intermediate ones), and the actions required to get from input to output.

Armed with this build description, any build system should be able to import as a dependency any project built with any other build system, and be able to successfully build said dependency without even knowing what build system was originally used to build it or what build system it is "intended" to be built with.  I should be able to import a Gradle project, a dub project, and an SCons project as dependencies, and be able to use make to build everything. And my downstream users ought to be able to build my project with tup, or any other build tool they choose, without needing to care that I used make to build my project.

Seriously, building a lousy software project is essentially traversing a DAG of inputs and actions in topological order.  The algorithms have been known since decades ago, if not longer, and there is absolutely no valid reason why we cannot import arbitrary sub-DAGs and glue it to the main DAG, and have everything work with no additional effort, regardless of where said sub-DAGs came from.  It's just a bunch of nodes and labelled edges, guys!  All the rest of the complications and build system dependencies and walled gardens are extraneous and completely unnecessary baggage imposed upon a straightforward DAG topological walk that any CS grad could write in less than a day.  It's ridiculous.


> On some level, dub is able to do as well as it does precisely because it's able to assume a bunch of stuff about D projects which is true the vast majority of the time, and the more it allows projects that don't work that way, the worse dub is going to work as a general tool, because it increasingly opens up problems with regards to whether you have the right tools or environment to build a particular project when using it as a dependency. However, if we don't figure out how to make it more flexible, then certain classes of projects really aren't going to work well with dub. That's less of a problem if the project is not a library (and thus does not need to be a dub package so that other packages can pull it in as a dependency) and if dub provides a good way to just make libraries available as dependencies rather than requiring that the ultimate target be built with dub, but even then, it doesn't solve the problem when the target _is_ a library (e.g. what if it were wrapping a C or C++ library and needed to do a bunch of extra steps for code generation and needed multiple build steps).

Well exactly, again, the monolithic approach to building software is the wrong approach, and leads to arbitrary and needless limitations of this sort.  DAG generation should be decoupled from build execution.  You can use whatever tool or fancy algorithm you want to generate the lousy DAG, but once generated, all you have to do is export it in a standard format, and then any number of build executors can read the description and run it.

Again I say: projects should not be bound to this or that build system. Instead, they should export a universal build description in a standard format.  Whoever wants to depend on said projects can simply import the build description and it will Just Work(tm). The build executor will know exactly how to build the dependency independently of whatever fancy tooling the upstream author may have used to generate the DAG.


> So, I don't know. Ultimately, what this seems to come down to is that all of the stuff that dub does to make things simple for the common case makes it terrible for complex cases, but making it work well for complex cases would almost certainly make it _far_ worse for the common case. So, I don't know that we really want to be drastically changing how dub works, but I do think that we need to make it so that more is possible with it (even if it's more painful, because it's doing something that goes against the typical use case).
[...]

Dub's very design as a monolithic build tool, like many other build tools out there, confines it to such needless limitations. Developing it further in this direction is IMO a waste of time.

It's time we came back to the essentials.  Current monolithic build systems ought to be split into two parts:

(1) Dependency detector / DAG generator.  Do whatever you need to do here: dub-style scanning of .d imports, scan directories for .d files, tup-style instrumenting of the compiler, type it out yourself, whatever. The resulting DAG is stored in a standard format in a standard location in the source tree.

(2) Build executor: read in a standard DAG and employ a standard topological walk to transform inputs into outputs.

Every project should publish the DAG in a standard format in a standard location. Then whenever you need that project as a dependency, you just import its DAG into yours, and build away. Problem solved.
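
To make that concrete, here is a minimal sketch in D of what such a build executor could look like. The node layout is invented for illustration (a real standard format would also carry checksums, environment requirements, etc.), and cycle detection and up-to-date checks are omitted for brevity:

```d
import std.process : spawnShell, wait;
import std.stdio : writeln;

// One node of the hypothetical standard build DAG: the files an action
// reads, the files it writes, and the command that transforms them.
struct Node
{
    string[] inputs;
    string[] outputs;
    string command;
}

void build(Node[] nodes)
{
    // Map each output file to the node that produces it.
    Node*[string] producer;
    foreach (ref n; nodes)
        foreach (o; n.outputs)
            producer[o] = &n;

    bool[Node*] done;

    // Depth-first topological walk: run the producers of our inputs
    // first, then run our own command.
    void run(Node* n)
    {
        if (n in done)
            return;
        done[n] = true;
        foreach (i; n.inputs)
            if (auto dep = i in producer)
                run(*dep); // build generated inputs first
        writeln("running: ", n.command);
        if (wait(spawnShell(n.command)) != 0)
            throw new Exception("build step failed: " ~ n.command);
    }

    foreach (ref n; nodes)
        run(&n);
}
```

Importing another project's DAG would then essentially be concatenating its node list onto yours (after rebasing its file paths), which is the whole point.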

Now of course, in real-life implementation, there will be many more details that need to be taken care of.  But these are the essentials: standard DAG representation, and a standard DAG import function. Once you have these two, there are no longer silly arbitrary limitations that serve no other purpose than to build walled gardens and annoy users. Everyone can use their build tool of choice, and it all Just Works(tm). You can have any project depend on any other project, and nobody has to worry about installing the 100th variation on `make` just to make the dumb thing compile.

Topological walk on a DAG is a solved problem, and there is no logical reason why it should be so danged complicated.


T

-- 
Right now I'm having amnesia and deja vu at the same time. I think I've forgotten this before.
December 13, 2018
On Wednesday, 12 December 2018 at 22:41:50 UTC, H. S. Teoh wrote:
> And here is the crux of my rant about build systems (earlier in this thread).  There is no *technical reason* why build systems should be constricted in this way. Today's landscape of specific projects being inextricably tied to a specific build system is completely the wrong approach.

You could reduce all this language-specific stuff to a way to generate a description of what needs to be built and what programs are suggested for doing it. This is quite a layer of indirection, and that means more work. "I can do less work" is a technical reason.

Ensuring that your output is widely usable is also extra work.

There is also a psychological reason: when you're trying to solve a set of problems and you are good at code, it's easy to tunnel vision into writing all the code yourself. It can even, sometimes, be easier to write that new code than to figure out how to use something that already exists (if you think you can gloss over a lot of edge cases or support a lot fewer pieces, for instance).

This is probably why Dub has its own repository instead of using Maven.

> Seriously, building a lousy software project is essentially traversing a DAG of inputs and actions in topological order.  The algorithms have been known since decades ago, if not longer, and there is absolutely no valid reason why we cannot import arbitrary sub-DAGs and glue it to the main DAG, and have everything work with no additional effort, regardless of where said sub-DAGs came from.  It's just a bunch of nodes and labelled edges, guys!  All the rest of the complications and build system dependencies and walled gardens are extraneous and completely unnecessary baggage imposed upon a straightforward DAG topological walk that any CS grad could write in less than a day.  It's ridiculous.

If any CS grad student could write it in a day, you could say that having a generic DAG isn't useful or interesting. That makes it seem pretty much useless to pull that out into a separate software project, and that's a psychological barrier.
December 16, 2018
On Wednesday, 12 December 2018 at 22:41:50 UTC, H. S. Teoh wrote:
> It's time we came back to the essentials.  Current monolithic build systems ought to be split into two parts:
>
> (1) Dependency detector / DAG generator.  Do whatever you need to do here: dub-style scanning of .d imports, scan directories for .d files, tup-style instrumenting of the compiler, type it out yourself, whatever. The resulting DAG is stored in a standard format in a standard location in the source tree.
>
> (2) Build executor: read in a standard DAG and employ a standard topological walk to transform inputs into outputs.

You're missing (0) the package manager, which is probably the biggest advantage "monolithic" build tools like dub, cargo, and npm have compared to language-agnostic ones like make.

Granted, there's no reason dub couldn't function solely as a package manager and DAG generator, while leaving the actual build execution to some other tool.

> Every project should publish the DAG in a standard format in a standard location.

You mean a Makefile? :^)

> Now of course, in real-life implementation, there will be many more details that need to be taken care of.  But these are the essentials: standard DAG representation, and a standard DAG import function.

There's something important you're glossing over here, which is that, in the general case, there's no single obvious or natural way to compose two DAGs together.

For example: suppose project A's DAG has two "output" vertices (i.e., they have no outgoing edges), one corresponding to a "debug" build and one corresponding to a "release" build. Now suppose project B would like to depend on project A. For this to happen, our hypothetical DAG import function needs to add one or more edges that connect A's DAG to B's DAG. The question is, how many edges, and which vertices should these edges connect?

If we have out-of-band knowledge about A and B--for example, if we know they're both dub packages--then this is a relatively straightforward question to answer. (Though not completely trivial; see for example how dub handles the -unittest flag.) But if A and B can be absolutely *any* kind of project, written in any language, using any build tools, and able to produce any number of "outputs," there's no way to guarantee you've wired up the DAGs correctly short of doing it by hand.

One way to overcome this problem is to restrict projects to the subset of DAGs that have only one "output" vertex--but then, of course, you have to say goodbye to convenient debug builds, test builds, cross compiling, etc. So the cure may be worse than the disease.
December 16, 2018
On Sun, 16 Dec 2018 00:17:55 +0000, Paul Backus wrote:
> On Wednesday, 12 December 2018 at 22:41:50 UTC, H. S. Teoh wrote:
>> It's time we came back to the essentials.  Current monolithic build systems ought to be split into two parts: [...]
> You're missing (0) the package manager, which is probably the biggest advantage "monolothic" build tools like dub, cargo, and npm have compared to language-agnostic ones like make.

If I were to make a new build tool and wanted package manager integration, I'd choose Maven as the backend. This would no doubt be more frustrating than just making my own, but there would hopefully be fewer bugs on the repository side.

(I might separately make my own Maven-compatible backend.)

> There's something important you're glossing over here, which is that, in the general case, there's no single obvious or natural way to compose two DAGs together.

You do it like Bazel.

In Bazel, you have a WORKSPACE file at the root of your project. It describes, among other things, what dependencies you have. This might, for instance, be a git URL and revision. All this does is expose that package's build rules to you.

Separately, you have build rules. Each build rule can express a set of dependencies on other build rules. There's no difference between depending on a rule that your own code defines and depending on one from an external dependency.
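
Concretely, it looks something like this (a sketch with invented names; git_repository and cc_binary are real Bazel rules):

```
# WORKSPACE -- pull in the external project's build rules
load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

git_repository(
    name = "libfoo",  # hypothetical dependency
    remote = "https://example.com/libfoo.git",
    commit = "0123456789abcdef0123456789abcdef01234567",  # pinned revision
)

# BUILD -- an external rule is depended on just like a local one
cc_binary(
    name = "app",
    srcs = ["main.cc"],
    deps = [
        "//mylib:util",   # local rule
        "@libfoo//:foo",  # rule from the external dependency
    ],
)
```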

It might be appropriate to have a hint on DAG nodes saying that this is the default thing that you should probably depend on if you're depending on the package. A lot of projects only produce one artifact for public consumption.