cycle dependencies (page 2)

On Thursday, 31 May 2018 at 23:17:20 UTC, Steven Schveighoffer wrote:
> Hm... I just had a crazy idea: what if D had a switch that allowed it to store a dependency graph of constructors into a json file, and then when you link, there is a wrapper which consumes all these files, runs the cycle detection and sort, and then compiles a perfectly sorted moduleinfo section to be included in the binary (obviously, written in D and compiled by the compiler).

This is how languages get custom object formats and custom linkers.

A fly in the ointment is .di files. This works today and does the right thing:

    // bar.d
    import modulewithstaticctor;
    shared static this() {}
    string something = modulewithstaticctor.someValue;

    // bar.di
    // no static ctor, no imports
    string something;

    // foo.d
    import bar, std.stdio;
    void main()
    {
        writefln(something);
    }

So mixing that dependency graph with .di files is "here be dragons" territory.

If the dependency data were inserted into the object files instead (which it should be? that's what ModuleInfo is), then the compiler could potentially read this out before linking, and we could be sure that the resulting order is correct. Like, insert a weak symbol into each normal object file saying there's no dependency graph, and then the compiler can run on a set of object files to produce a new object file with the complete dependency graph as a strong symbol.

But that's more work.

June 01, 2018

Re: cycle dependencies

Posted by Steven Schveighoffer
in reply to Neia Neutuladh

On 5/31/18 8:49 PM, Neia Neutuladh wrote:
> On Thursday, 31 May 2018 at 23:17:20 UTC, Steven Schveighoffer wrote:
>> Hm... I just had a crazy idea: what if D had a switch that allowed it to store a dependency graph of constructors into a json file, and then when you link, there is a wrapper which consumes all these files, runs the cycle detection and sort, and then compiles a perfectly sorted moduleinfo section to be included in the binary (obviously, written in D and compiled by the compiler).
> 
> This is how languages get custom object formats and custom linkers.

Yep. I know that this is something that we need to be wary of. The proposed file format, however, would not be something like an object format or even a linker, just a text file that hints at how to link these together without having to do the sort.

> 
> A fly in the ointment is .di files. This works today and does the right thing:

This actually is not a problem, because the dependency tree doesn't depend on whether the imported module has static ctors. It's how this works today -- each module only knows whether the module has static ctors, and what imports it has. It doesn't go further than that.

The dependency file would keep this same mechanism, we wouldn't be doing more interpretation.
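
As a rough illustration of the idea (not the actual proposal, and not druntime's implementation), the sketch below records for each module only whether it has static ctors and what it imports, then sorts the ctor-bearing modules at "link time". The JSON schema and the names `ModuleDeps`, `loadDeps`, `ctorDeps` and `ctorOrder` are all assumptions made up for this example. It also assumes that a module which reaches only itself through ctor-less modules is not an error.

```d
import std.algorithm.sorting : sort;
import std.json : parseJSON;
import std.stdio : writeln;

/// What a module records about itself: static ctors present, and its imports.
struct ModuleDeps
{
    bool hasCtor;
    string[] imports;
}

/// Reads one JSON record per module (hypothetical schema, assumed for this sketch).
ModuleDeps[string] loadDeps(string[] records)
{
    ModuleDeps[string] graph;
    foreach (text; records)
    {
        auto j = parseJSON(text);
        ModuleDeps d;
        d.hasCtor = j["hasCtor"].boolean;
        foreach (imp; j["imports"].array)
            d.imports ~= imp.str;
        graph[j["module"].str] = d;
    }
    return graph;
}

/// Ctor-bearing modules reachable from `start`, looking through ctor-less
/// modules but not past another module that has ctors.
string[] ctorDeps(ModuleDeps[string] graph, string start)
{
    string[] found;
    bool[string] seen;
    seen[start] = true;
    string[] work = graph[start].imports.dup;
    while (work.length)
    {
        auto name = work[$ - 1];
        work = work[0 .. $ - 1];
        if (name in seen)
            continue;
        seen[name] = true;
        auto m = name in graph;
        if (m is null)
            continue;              // no record for this module (e.g. only a .di)
        if (m.hasCtor)
            found ~= name;         // stop: that module tracks its own deps
        else
            work ~= m.imports;     // look through ctor-less modules
    }
    return found;
}

/// Ctor-bearing modules, dependencies first. Throws on a ctor cycle.
string[] ctorOrder(ModuleDeps[string] graph)
{
    enum State { unvisited, active, done }
    State[string] state;
    string[] order;

    void visit(string name)
    {
        final switch (state.get(name, State.unvisited))
        {
        case State.done: return;
        case State.active: throw new Exception("ctor cycle involving " ~ name);
        case State.unvisited: break;
        }
        state[name] = State.active;
        foreach (dep; ctorDeps(graph, name))
            visit(dep);
        state[name] = State.done;
        order ~= name;
    }

    // Keys are sorted only to make the result deterministic.
    foreach (name; graph.keys.sort())
        if (graph[name].hasCtor)
            visit(name);
    return order;
}

void main()
{
    auto graph = loadDeps([
        `{"module": "app",    "hasCtor": true,  "imports": ["util", "config"]}`,
        `{"module": "util",   "hasCtor": false, "imports": ["config"]}`,
        `{"module": "config", "hasCtor": true,  "imports": []}`,
    ]);
    writeln(ctorOrder(graph));
}
```

Note that the sort never asks whether an imported module *without* a record has ctors; it only uses what each recorded module says about itself, which is the same information ModuleInfo carries today.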

> If the dependency data were inserted into the object files instead (which it should be? that's what ModuleInfo is), then the compiler could potentially read this out before linking, and we could be sure that the resulting order is correct. Like, insert a weak symbol into each normal object file saying there's no dependency graph, and then the compiler can run on a set of object files to produce a new object file with the complete dependency graph as a strong symbol.

It's an interesting idea, the only benefit really being to avoid having separate files for the data. I don't know enough about object files to be able to speak coherently about it.

I have a much better feeling, however, about having a more readable/understandable file format than object files. It's definitely a lot easier to deal with something like a JSON file than an object file.

> But that's more work.

There's nothing in this proposal that is less than a lot of work :) So putting a bit more on top is OK, if it makes things more useable.

This isn't something to flesh out buried in a sub-thread anyway. I'm going to put together a more complete description and proposal. Not a DIP yet, because I want to gauge interest first.

-Steve

On Friday, 1 June 2018 at 12:50:31 UTC, Steven Schveighoffer wrote:
>> A fly in the ointment is .di files. This works today and does the right thing:
>
> This actually is not a problem, because the dependency tree doesn't depend on whether the imported module has static ctors. It's how this works today -- each module only knows whether the module has static ctors, and what imports it has. It doesn't go further than that.

Thanks to .di files, when I compile my executable, I don't always know which modules import which other modules. That was half the point of my example.

Are you talking about giving the initialization order for a subset of modules? Because that wouldn't give much of a speed increase. Or do you need all of your dependencies to have their own dependency order, and the compiler can concatenate them?

As an alternative, you could produce the initialization order by running the program and seeing what order it uses.

On 6/1/18 11:42 AM, Neia Neutuladh wrote:
> On Friday, 1 June 2018 at 12:50:31 UTC, Steven Schveighoffer wrote:
>>> A fly in the ointment is .di files. This works today and does the right thing:
>>
>> This actually is not a problem, because the dependency tree doesn't depend on whether the imported module has static ctors. It's how this works today -- each module only knows whether the module has static ctors, and what imports it has. It doesn't go further than that.
>
> Thanks to .di files, when I compile my executable, I don't always know which modules import which other modules. That was half the point of my example.

The .di file is just an interface, it doesn't know what's actually compiled in the binary.

To put it another way, the compiler only generates a ModuleInfo (or dependency modules) for .d files. .di files are simply a public API for the .d files.

> Are you talking about giving the initialization order for a subset of modules? Because that wouldn't give much of a speed increase. Or do you need all of your dependencies to have their own dependency order, and the compiler can concatenate them?

Speed is one aspect, but the more important aspect is to do more fine-grained dependency tracking. That means tracking at the function level rather than simply assuming all things in the module depend on all things in the dependent modules, which would allow, for instance, module cycles where there is no real cycle between the data initialized.

That's not going to work well with today's mechanisms because it would be much more expensive both in runtime and memory usage, and even further highlight the drawback that we are solving the same static problem on every run of a program.

> As an alternative, you could produce the initialization order by running the program and seeing what order it uses.

Yes, but this still involves pushing all the dependency data into the binary. All we should care about at runtime is the order to execute ctors. We could even eliminate the needless storage of all the dependency modules that is there now.

-Steve
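
To make the "module cycle without a real cycle" case concrete, here is a minimal sketch. It assumes two hypothetical modules `a` and `b` that import each other and both have static ctors, where `b`'s ctor reads data that `a`'s ctor sets up but `a`'s ctor uses nothing from `b`. The graphs are written out by hand; nothing here reflects how a compiler would actually derive them.

```d
import std.stdio : writeln;

/// Depth-first search for a cycle in a graph given as
/// name -> list of names it depends on.
bool hasCycle(string[][string] deps)
{
    bool[string] done, active;

    bool visit(string n)
    {
        if (n in done)
            return false;
        if (n in active)
            return true;           // reached a node that is still being visited
        active[n] = true;
        if (auto p = n in deps)
            foreach (d; *p)
                if (visit(d))
                    return true;
        active.remove(n);
        done[n] = true;
        return false;
    }

    foreach (n; deps.byKey)
        if (visit(n))
            return true;
    return false;
}

void main()
{
    // Module granularity: a imports b and b imports a.
    string[][string] byModule = ["a": ["b"], "b": ["a"]];

    // Finer granularity (hypothetical): only b's ctor needs a's ctor.
    string[][string] byCtor;
    byCtor["a.ctor"] = null;       // no dependencies
    byCtor["b.ctor"] = ["a.ctor"];

    writeln(hasCycle(byModule));   // true
    writeln(hasCycle(byCtor));     // false
}
```

At module granularity the program would be rejected; at the finer granularity there is a valid order (`a`'s ctor, then `b`'s).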

On Friday, 1 June 2018 at 17:59:21 UTC, Steven Schveighoffer wrote:
> The .di file is just an interface, it doesn't know what's actually compiled in the binary.
>
> To put it another way, the compiler only generates a ModuleInfo (or dependency modules) for .d files. .di files are simply a public API for the .d files.

Yes, this is my point. (Communication is much harder than I thought.)

When you encounter a .di file, you can't rely on an automated tool to tell you what modules need to be initialized or figure out an order for them. You have to do it manually, and if you mess it up, you get undefined behavior.

This is why I called it "here be dragons" -- it's fraught.

Unless your goal was to omit the depth-first search in the common case while preserving the rest of the current logic. I'm curious how much time that would save.

On Saturday, 2 June 2018 at 17:17:02 UTC, Neia Neutuladh wrote:
> On Friday, 1 June 2018 at 17:59:21 UTC, Steven Schveighoffer wrote:
>> The .di file is just an interface, it doesn't know what's actually compiled in the binary.
>>
>> To put it another way, the compiler only generates a ModuleInfo (or dependency modules) for .d files. .di files are simply a public API for the .d files.
>
> Yes, this is my point. (Communication is much harder than I thought.)
>
> When you encounter a .di file, you can't rely on an automated tool to tell you what modules need to be initialized or figure out an order for them. You have to do it manually, and if you mess it up, you get undefined behavior.

I believe Steve's point was that with the suggested json file describing dependencies, that would be a part of the public interface, and so would be distributed alongside .di files. Of course, someone could forget to distribute those, and in such a case the runtime cycle test could be extended to do the conservative test if the compiler hasn't registered all modules as having json info.

--
Simen
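
The fallback described here could be sketched roughly as follows. `ModuleRecord`, `Strategy` and `chooseStrategy` are hypothetical names invented for this example; the thread does not specify how the runtime would learn which modules came with dependency info.

```d
import std.algorithm.searching : all;
import std.stdio : writeln;

/// Hypothetical registration record: one per module linked into the binary.
struct ModuleRecord
{
    string name;
    bool hasDepInfo; // true if a dependency file was supplied for this module
}

enum Strategy { precomputedOrder, conservativeRuntimeCheck }

/// Trust the precomputed order only if every module came with dependency
/// info; otherwise fall back to the conservative runtime cycle test.
Strategy chooseStrategy(const ModuleRecord[] mods)
{
    return mods.all!(m => m.hasDepInfo)
        ? Strategy.precomputedOrder
        : Strategy.conservativeRuntimeCheck;
}

void main()
{
    auto complete = [ModuleRecord("app", true), ModuleRecord("lib", true)];
    auto partial  = [ModuleRecord("app", true), ModuleRecord("lib", false)];
    writeln(chooseStrategy(complete));
    writeln(chooseStrategy(partial));
}
```

A single module without info is enough to force the conservative path, since the precomputed order can no longer be assumed complete.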
