Some thoughts on dub housekeeping tasks for future design work

Thread overview:

  • Dec 11, 2022: Rikki Cattermole
  • Dec 11, 2022: Sebastiaan Koppe
  • Dec 12, 2022: rikki cattermole

December 11, 2022

Hello again!

After debugging a bug that was already fixed in ~master, a thought got stuck in my mind about how dub works internally, which is extremely poorly. The metadata is actually pretty simple in its behavior, but it is not abstracted properly for the build manager. Each piece of metadata has one of three behaviors (although it can be in more than one category): it can go up, down, or nowhere (only that build).

A lot of the metadata is already abstracted into a single struct, BuildSettings, so that's not an issue. What we want is to move to having three instances per (sub)package being built, one per direction. Doing this removes a significant amount of the busy work dub currently does in duplicated BuildSettings generation, and allows a much simpler process when building: just pass in an array of BuildSettings and it'll merge them as part of the build. Picking which BuildSettings apply to a particular build is the responsibility of the package manager, not the build manager.
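To make the idea concrete, here is a minimal sketch in Python rather than D. Everything in it is an assumption for illustration: the reduced `BuildSettings` fields, the `PackageSettings` and `settings_for` names, and in particular the reading that "up" means "towards packages that depend on this one" and "down" means "towards this package's dependencies". It is not how dub is implemented today.

```python
from dataclasses import dataclass, field


@dataclass
class BuildSettings:
    # Hypothetical, heavily reduced stand-in for dub's BuildSettings struct.
    versions: list[str] = field(default_factory=list)
    import_paths: list[str] = field(default_factory=list)
    dflags: list[str] = field(default_factory=list)


@dataclass
class PackageSettings:
    # Three instances per (sub)package, one per direction (assumed meaning):
    up: BuildSettings = field(default_factory=BuildSettings)       # goes to dependents
    down: BuildSettings = field(default_factory=BuildSettings)     # goes to dependencies
    nowhere: BuildSettings = field(default_factory=BuildSettings)  # only that build


def settings_for(own, dependencies, dependents):
    """Package-manager side: pick which BuildSettings apply to one build."""
    picked = [own.nowhere]
    picked += [dep.up for dep in dependencies]     # what dependencies push up
    picked += [user.down for user in dependents]   # what dependents push down
    return picked


def merge(settings):
    """Build-manager side: merge an array of BuildSettings, nothing more."""
    merged = BuildSettings()
    for current in settings:
        for name in ("versions", "import_paths", "dflags"):
            target = getattr(merged, name)
            for item in getattr(current, name):
                if item not in target:  # drop duplicates, keep first-seen order
                    target.append(item)
    return merged
```

In this sketch the build manager only ever sees a list and calls `merge`; all knowledge about which package contributes what stays in `settings_for`.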

Right now a lot of this logic is intermixed with the package manager itself and done very badly. It's going to be a lot of work to untangle this, and it could easily break people's builds. So before any structural change like this can be made, anything that can be split out, such as leaf modules, needs to be split out first. I've identified some housekeeping tasks that, if done, would make this process a lot easier or have quite significant benefits both now and after such work is complete.

dub:

  • getBestPackage: simplify down to one request instead of two (move the logic to dub-registry)
  • Introduce a caching mechanism for downloaded artifacts in the file system; it must be a class that can be swapped out at runtime
  • Registered non-registry package sources must be able to be compiled into a JSON blob full of package versions ext. info
  • Able to consume compressed (zip) copies of package information (one package per file in the zip) in lieu of a registry
  • Decouple and split out into their own sub-packages: compilers, packagesuppliers, cache, platform, dependencyresolver, semver packages/modules
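As a rough illustration of the cache item above, the following Python sketch shows a cache behind a small class interface, so that the implementation can be swapped at runtime. The names `ArtifactCache`, `FileSystemCache`, `NullCache` and `fetch` are all hypothetical and do not exist in dub; hashing the key into a file name is just one possible layout.

```python
import hashlib
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Callable, Optional


class ArtifactCache(ABC):
    """Hypothetical interface; callers only ever depend on this class."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]: ...

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...


class FileSystemCache(ArtifactCache):
    """Stores each artifact as one file below a root directory."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Hash the key so that any package name/version is a safe file name.
        return self.root / hashlib.sha256(key.encode("utf-8")).hexdigest()

    def get(self, key: str) -> Optional[bytes]:
        path = self._path(key)
        return path.read_bytes() if path.exists() else None

    def put(self, key: str, data: bytes) -> None:
        self._path(key).write_bytes(data)


class NullCache(ArtifactCache):
    """Caches nothing; useful for tests or for disabling caching."""

    def get(self, key: str) -> Optional[bytes]:
        return None

    def put(self, key: str, data: bytes) -> None:
        pass


def fetch(key: str, cache: ArtifactCache, download: Callable[[str], bytes]) -> bytes:
    """Return the artifact from the cache, downloading it only on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    data = download(key)
    cache.put(key, data)
    return data
```

Because `fetch` takes the cache as a parameter, dub and dub-registry could each pass in whichever implementation suits them without the calling code changing.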

dub-registry:

  • Rewrite cache to use dub's (new) cache mechanism
  • Move the repositories package to dub and rewrite as required to fit both purposes
  • Use dub's new repositories that were moved here, minus registry (must be configurable at runtime)
  • Able to produce compressed (zip) copy of all package version information (one file in zip per package)
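The zip items in both lists could look roughly like the Python sketch below: the registry side writes one JSON file per package into an archive, and the client side reads it back in lieu of talking to a registry. The function names and the JSON layout (`name` plus a `versions` list) are assumptions made for this example, not an existing dub or dub-registry format.

```python
import io
import json
import zipfile


def write_package_dump(packages: dict[str, list[dict]]) -> bytes:
    """Registry side: one JSON file in the zip per package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, versions in sorted(packages.items()):
            document = {"name": name, "versions": versions}
            archive.writestr(f"{name}.json", json.dumps(document))
    return buffer.getvalue()


def read_package_dump(blob: bytes) -> dict[str, list[dict]]:
    """Client side: load every package's version information from the zip."""
    packages = {}
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        for entry in archive.namelist():
            document = json.loads(archive.read(entry))
            packages[document["name"]] = document["versions"]
    return packages
```

A dump produced this way is a plain file, so it can be mirrored or cached like any other artifact.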

If we can do these things and split up BuildSettings in preparation for directionality (up/down/nowhere) with array support, we might have a way to make invasive structural changes on the package management side, breaking up the behavior with clearer divisions between build manager and package manager. But it's going to be slow going, and it's going to have to be bottom-up from the leaf modules.

One of the reasons we have to start bottom-up is that dub has two abstractions for metadata: recipes represent the dub files, and BuildSettings represent the build itself. Ideally, the package manager would not know about the build manager's metadata (although it would tell the build manager to load its data), and the build manager wouldn't know about the package manager's metadata, which isn't the case right now.

Thoughts?

December 11, 2022

On Sunday, 11 December 2022 at 12:43:47 UTC, Rikki Cattermole wrote:

> Hello again!
>
> [...]
>
> dub-registry:

As for the dub-registry, I can only advise against having it do more things than it already does.

In fact, I can only suggest making it considerably dumber.

Right now it partakes in dependency resolution by recursively resolving transitive dependencies. This has considerable resource requirements due to the reconstruction of relatively big JSON snippets.

This thrashes a lot of memory and puts unnecessary pressure on the GC, resulting in high memory requirements for a relatively simple registry.

The tradeoff is having the client make multiple requests. However, these can be done semi-parallel and, because they require no transformations, the responses can be streamed straight out of the database.
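A minimal sketch of that client-side approach, in Python: the client walks the dependency graph level by level and issues the requests for each level in parallel. `FAKE_REGISTRY`, `fetch_dependencies` and `resolve_transitive` are made-up names, the in-memory dict stands in for real HTTP requests, and version constraints are ignored entirely to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the registry database:
# package name -> names of its direct dependencies.
FAKE_REGISTRY = {
    "app": ["liba", "libb"],
    "liba": ["libc"],
    "libb": ["libc", "libd"],
    "libc": [],
    "libd": [],
}


def fetch_dependencies(name):
    """Stands in for one request that returns a stored record unchanged."""
    return FAKE_REGISTRY[name]


def resolve_transitive(root, fetch=fetch_dependencies, workers=4):
    """Collect all transitive dependencies, one parallel batch per level."""
    seen = {root}
    frontier = [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            # All requests for the current level run concurrently.
            results = list(pool.map(fetch, frontier))
            next_frontier = []
            for dependencies in results:
                for dependency in dependencies:
                    if dependency not in seen:
                        seen.add(dependency)
                        next_frontier.append(dependency)
            frontier = next_frontier
    return seen
```

The number of round trips grows with the depth of the graph rather than the number of packages, which is why the requests only need to be "semi" parallel.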

December 12, 2022
Dub isn't thread-safe. It would require significant structural changes to get any form of parallelism going on.

But the problem is that you have to retrieve metadata from the registry, pass that into getBestPackage and from there perform additional downloads. It is absolutely insane to have to do this when the logic is already compiled into dub-registry!
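To show what "one request" could mean here, below is a small Python sketch of the selection step running on the registry side: the client sends a package name and a version range, and gets back only the best match. The names `best_version` and `handle_best_package_request`, the simplified "minimum inclusive, upper bound exclusive" range, and the plain `major.minor.patch` parsing are all assumptions; real semver handling and the actual getBestPackage logic are more involved.

```python
from typing import Optional


def parse_version(text: str) -> tuple:
    """Parse a plain 'major.minor.patch' string (no pre-release tags)."""
    return tuple(int(part) for part in text.split("."))


def best_version(available: list[str], minimum: str, below: str) -> Optional[str]:
    """Pick the highest version with minimum <= version < below."""
    low, high = parse_version(minimum), parse_version(below)
    candidates = [v for v in available if low <= parse_version(v) < high]
    return max(candidates, key=parse_version) if candidates else None


def handle_best_package_request(database: dict, name: str, minimum: str, below: str):
    """Hypothetical registry endpoint: one request in, one answer out."""
    version = best_version(database.get(name, []), minimum, below)
    if version is None:
        return None
    return {"name": name, "version": version}
```

With something like this on the server, the client would no longer need to download the full version list just to choose one entry from it.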