December 23, 2016
On Thu, 22 Dec 2016 10:17:33 +0000, Joakim wrote:
> Opening a file or 10 is extremely cheap compared to all the other costs
> of the compiler.  Purely from a technical cost perspective,
> I'm not sure even scoped imports were worth it, as my simple
> investigation below suggests.

The compiler doesn't merely have to open the file. It has to run at least partial semantic analysis on it in order to locate symbols. When templates are involved, this can become quite expensive.

With static and selective imports, this step can be avoided entirely: the importing file itself says which module each declaration comes from, so the compiler can avoid pulling in anything beyond what's strictly necessary.
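
For reference, a minimal sketch of the two import forms as they work in D today. The function `total` is made up for illustration; only the std.traits and std.range.primitives names are real.

```d
// Static import: binds only the fully qualified module name.
static import std.traits;
// Selective import: binds just the listed symbol.
import std.range.primitives : isInputRange;

/// Sums an input range of integral elements (hypothetical helper).
/// Every name in the constraint says which module it comes from:
/// `isInputRange` via the selective import, the rest via qualification.
long total(R)(R r)
    if (isInputRange!R && std.traits.isIntegral!(std.traits.ForeachType!R))
{
    long sum = 0;
    foreach (x; r)
        sum += x;
    return sum;
}
```

With this, `total([1, 2, 3])` is accepted and `total(["a"])` is rejected by the constraint.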

With declaration-scoped imports, you avoid the same step. However, that relies on the maintainer's discipline to keep each declaration's imports down to the minimal set.

For instance, I have a declaration:

  import myapp.users, std.socket;
  bool isUserOnline(User user, Socket userSocket);

I decide that this needs compilation time optimizations. The current way:

  static import myapp.users, std.socket;
  bool isUserOnline(myapp.users.User user, std.socket.Socket socket);

And Andrei's way:

  bool isUserOnline(User user, Socket userSocket)
  import myapp.users, std.socket;

I refactor things so that this check finds the user socket on its own. The current way:

  static import myapp.users, std.socket;
  bool isUserOnline(myapp.users.User user);

Eh, I forgot I don't need std.socket anymore, but this costs a few microseconds of compiler time to add it to the current symbol table. It has to allocate a lazily expanded module stub. Shouldn't be a huge deal.

Andrei's way:

  bool isUserOnline(User user)
  import myapp.users, std.socket;

Again, I forgot to update the imports, but this time the compiler has to read std.socket from disk, parse it, run semantic on it, and import all its top-level symbols into the scope's symbol table, because there's nothing here that says the 'User' type is in myapp.users instead of std.socket.

---

In point of fact, selective and static imports should be *faster* than Andrei's way. Consider:

  static import myapp.users, std.socket;
  bool isUserOnline(myapp.users.User user, std.socket.Socket socket);

This has to locate a declaration named `User` in myapp.users, and it has to locate a declaration named `Socket` in std.socket.

But let's look at Andrei's way:

  bool isUserOnline(User user, Socket userSocket)
  import myapp.users, std.socket;

Here, the compiler has to search *both* myapp.users and std.socket for a declaration named `User`, then it has to search both for `Socket`. (Even if it finds `User` in the first, it still needs to search the second in case both define that symbol.)

You go from O(distinct number of types referenced) lookups to O(types * imports).

Granted, you'll usually have between one and three types, between one and three imports, so the point is a bit less salient.
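
A toy model of that counting argument, to make the numbers concrete. This is not how dmd resolves symbols; `ToyModule` and both functions are invented for the sketch, and a "probe" just means one module searched for one name.

```d
/// Hypothetical stand-in for a module: a name plus a set of declared symbols.
struct ToyModule
{
    string name;
    bool[string] symbols;
}

/// Qualified reference (static import): the module is known, so one probe.
bool resolveQualified(const ToyModule m, string symbol, ref size_t probes)
{
    ++probes;
    return (symbol in m.symbols) !is null;
}

/// Unqualified reference: every import is probed, because a second match
/// would make the reference ambiguous. Returns the number of matches.
size_t resolveUnqualified(const ToyModule[] imports, string symbol,
                          ref size_t probes)
{
    size_t matches;
    foreach (ref m; imports)
    {
        ++probes;
        if (symbol in m.symbols)
            ++matches;
    }
    return matches;
}
```

With two imports and two referenced types, the qualified form does 2 probes and the unqualified form does 4.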
December 23, 2016
I'm looking at this part:

> The manual conversion of std.array to the "static import" form is shown here. It leads to the expected lengthening of the symbols used in declarations, which appears to eliminate one disadvantage by introducing another.

You get longer declarations when you're reading and writing code. This shouldn't impact the docs. I'm not disputing that there is a disadvantage, just pointing out that its scope is relatively small.

All your references are entirely unambiguous. If I'm reading the source and I'm unfamiliar with phobos, I don't have to wonder if that template constraint comes from std.string or std.traits.

It's two benefits for one disadvantage.

You can also have renamed imports to reduce the amount of typing:

  static import trait = std.traits;
  trait.ForeachType!Range[] array(Range)(Range r)
  if (trait.isIterable!Range &&
      trait.isNarrowString!Range &&
      !trait.isInfinite!Range) {
  }

> The manual conversion of std.array to the "selective import" form is shown here. Conversion was successful but because it collapses all imports at the top, it does not make it much easier to identify e.g. what dependencies would be pulled if a given artifact in std.array were used.

It's still trivial for the compiler to detect which modules a particular declaration depends on.

If I'm in vim, I can put my cursor on the declaration, hit '*', navigate to the top of the document, and find the next match. It's like five keypresses. With VS Code or the like, I can similarly go to the top of the file and find the first match without much trouble.

And this process tells me which module defines the symbol. Either the first hit is at the top of the file in the import section, and I know which file to look in, or it's not, and I know it's defined locally (and there's a good chance I'm at the declaration).

With your proposal, when I'm lucky, I know the declaration must be either in the current file or in the locally imported file. (When I'm unlucky, the file has some top-level imports or multiple local imports.) And since the utility of this is mostly for large modules, I'm still going to pull out grep. It *does* help when I need to import the same thing elsewhere, at least.

> Again the manual process was highly nontrivial.

This is also true for your proposal.

CyberShadow also offered to whip up a tool to automate the conversion to selective imports. Presumably they would be able to do something similar with your proposal.

So this is kind of irrelevant.
December 23, 2016
On 12/22/16 10:31 PM, Chris Wright wrote:
> It's two benefits for one disadvantage.

One won't make decisions by taking the difference of pros and cons, right? -- Andrei
December 23, 2016
On 12/22/16 9:53 PM, Chris Wright wrote:
>   bool isUserOnline(User user, Socket userSocket)
>   import myapp.users, std.socket;

Has changed to

with (import myapp.users, std.socket)
bool isUserOnline(User user, Socket userSocket);

Andrei
December 23, 2016
On 12/22/16 9:53 PM, Chris Wright wrote:
> In point of fact, selective and static imports should be *faster* than
> Andrei's way. Consider:
>
>   static import myapp.users, std.socket;
>   bool isUserOnline(myapp.users.User user, std.socket.Socket socket);
>
> This has to locate a declaration named `User` in myapp.users, and it has
> to locate a declaration named `Socket` in std.socket.
>
> But let's look at Andrei's way:
>
>   bool isUserOnline(User user, Socket userSocket)
>   import myapp.users, std.socket;

with (static import myapp.users, std.socket)
bool isUserOnline(myapp.users.User user, std.socket.Socket socket);


Andrei
December 23, 2016
On Fri, 23 Dec 2016 07:48:55 -0500, Andrei Alexandrescu wrote:

> On 12/22/16 10:31 PM, Chris Wright wrote:
>> It's two benefits for one disadvantage.
> 
> One won't make decisions by taking the difference of pros and cons, right? -- Andrei

You'd use the weighted difference, naturally.
December 23, 2016
Major update adding an experiment that shows the cost of top-level imports.

https://github.com/dlang/DIPs/pull/51

https://github.com/dlang/DIPs/blob/a3ef4e25cfb9f884fee29edb5553a3a2b840f679/DIPs/DIP1005.md

Relevant added text:

Another matter we investigated is how reducing top-level imports influences build times and the size of the object files produced. We do not have an experimental implementation of this DIP, so measuring impact directly was not possible. We did the converse experiment---adding top-level imports.

In a separate branch of the standard library code, for each module we added all nested imports back to the top level. Some hand-editing was needed after that because of clashes in symbol names. Also, some imports needed to be removed because of circular dependencies and related limitations in the language's design and implementation. The resulting setup can be seen in [PR4992](https://github.com/dlang/phobos/pull/4992). Then for each module in the standard library we compiled one file that consists of exactly one `import` declaration, monitoring compile time and object file size. Appendix B displays build times and size of object files produced by this experiment.

|Aggregate|Time (top-level)|Object size (top-level)|Time (nested)|Object size (nested)|
|---|---|---|---|---|
|Median|320ms|13788|32ms|4296|
|Average|287.6ms|13437.2|64.6ms|5734.1|

As expected, the experiment shows that both build times and object file sizes were improved by moving imports away from the top level. We estimate that eliminating the 10.5x slack dependency fan-in will bring import costs down to negligible and also bring object file size down.
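
A rough sketch of how such a per-module measurement could be scripted. This is not the harness used for the DIP; the file names `probe.d` and `probe.o`, the helper names, and the choice of `dmd -c` are all assumptions.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.exception : enforce;
import std.file : getSize, write;
import std.process : execute;

/// Source text of a file consisting of exactly one import declaration.
string importOnlySource(string moduleName)
{
    return "import " ~ moduleName ~ ";\n";
}

/// One data point: wall-clock compile time and object file size in bytes.
struct Measurement
{
    long msecs;
    ulong objectBytes;
}

/// Compiles the one-line file and reports time and object size.
/// Assumes `compiler` is on the PATH and accepts dmd-style flags.
Measurement measureImport(string moduleName, string compiler = "dmd")
{
    write("probe.d", importOnlySource(moduleName));
    auto sw = StopWatch(AutoStart.yes);
    auto result = execute([compiler, "-c", "-ofprobe.o", "probe.d"]);
    sw.stop();
    enforce(result.status == 0, result.output);
    return Measurement(sw.peek.total!"msecs", getSize("probe.o"));
}
```

Calling `measureImport("std.array")` on each branch would give one row of raw data; the absolute numbers depend heavily on machine and compiler version.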


Andrei

December 23, 2016
On Fri, 23 Dec 2016 11:25:41 -0500, Andrei Alexandrescu wrote:
> Dependency-Carrying Declarations allow scalable template libraries.

> template libraries

*This* is the part I hadn't noticed before, and it explains the confusion I had.

If you *only* apply this to templates, it works. The current situation is a result of where template constraints are located. If there were a `static throw` equivalent for indicating that parameters were invalid, or if template constraints were in an `in` block like contracts, then this wouldn't even have come up. If phobos went for object oriented code instead of templates (as an example, not a recommendation), then this wouldn't be an issue.

However, at that point, it would be utterly useless to me. I'm looking at my entire dub package cache, plus the ten-ish most recently updated dub packages:

* units-d uses allSatisfy. Once.
* vibe has two structs that would benefit, except they're inside a unittest. I've never compiled dub's unittests.

Template constraints have little adoption outside phobos. When they *are* used, they tend to use language facilities instead of templates to express the condition. And when a template is used, it tends to be defined in the same module where it's used.
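
To illustrate the distinction, a small sketch with made-up function names. The first constraint needs a template from std.traits; the second uses only a built-in trait and needs no import. The two are not exactly equivalent: `__traits(isIntegral, T)` also accepts `bool` and character types, which `std.traits.isIntegral` rejects.

```d
import std.traits : isIntegral;

/// Constraint expressed with a library template.
T twiceTrait(T)(T x)
    if (isIntegral!T)
{
    return cast(T)(x + x);
}

/// Constraint expressed with a language facility only.
T twiceLang(T)(T x)
    if (__traits(isIntegral, T))
{
    return cast(T)(x + x);
}
```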

I grant that everyone uses phobos, and phobos uses template constraints a lot. If it's *just* a problem inside phobos, though, there's another obvious solution: split up modules that tend to house a lot of template constraints. Split up the modules that use a wide variety of template constraints.

Then I can decide whether the convenience of not hunting for narrower imports is worth the extra quarter second of compilation.

----

Now, if you want to apply this to things that are *not* templates, then you could get a lot further. However, you would end up with code that compiles when it wouldn't today. Code that compiles because the portions you use would compile if they were on their own, while other bits wouldn't. That's the status quo for templates, even no-arg templates, but a change from what we currently do everywhere else.

And *that* is what would make it equivalent to use static or selective imports. It would also increase the utility from my perspective from "why the hell are we even doing this?" to "that's kinda nice".
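
A small sketch of that status quo for templates (the names are invented). The zero-parameter template below is parsed but never semantically analysed unless something instantiates it, so the module compiles even though its body refers to a function that doesn't exist. The same body in a plain function would be a compile error.

```d
/// Zero-parameter template: its body is only analysed on instantiation.
void brokenIfUsed()()
{
    noSuchFunction(); // undefined identifier, but only if instantiated
}

/// Ordinary function: always analysed, so it has to be valid.
int answer()
{
    return 42;
}
```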
December 23, 2016
On 12/23/2016 05:33 PM, Chris Wright wrote:
> On Fri, 23 Dec 2016 11:25:41 -0500, Andrei Alexandrescu wrote:
>> Dependency-Carrying Declarations allow scalable template libraries.
>
>> template libraries
>
> *This* is the part I hadn't noticed before, and it explains the confusion
> I had.
>
> If you *only* apply this to templates, it works. The current situation is
> a result of where template constraints are located. If there were a
> `static throw` equivalent for indicating that parameters were invalid, or
> if template constraints were in an `in` block like contracts, then this
> wouldn't even have come up. If phobos went for object oriented code
> instead of templates (as an example, not a recommendation), then this
> wouldn't be an issue.

I acknowledge that if the language were defined a different way this issue wouldn't have come up. But this is a truism - one can say that about any issue.

> However, at that point, it would be utterly useless to me. I'm looking at
> my entire dub package cache, plus the ten-ish most recently updated dub
> packages:
>
> * units-d uses allSatisfy. Once.
> * vibe has two structs that would benefit, except they're inside a
> unittest. I've never compiled dub's unittests.

Fair enough. I reckon a number of traditional ways of designing software would not be helped radically by DIP1005.

> Template constraints have little adoption outside phobos.

That will change. We definitely need to do all we can to support and improve language support for template constraints.

> When they *are*
> used, they tend to use language facilities instead of templates to
> express the condition. And when a template is used, it tends to be
> defined in the same module where it's used.

That may be true for some code today but not for future code. std.traits gets larger and better with more interesting introspection capabilities. I envision introspection as a core differentiating feature of D that will put it ahead of all other languages.

> I grant that everyone uses phobos, and phobos uses template constraints a
> lot. If it's *just* a problem inside phobos, though, there's another
> obvious solution: split up modules that tend to house a lot of template
> constraints. Split up the modules that use a wide variety of template
> constraints.

This point is discussed carefully in DIP1005. Please let me know if anything needs to be added.

> Now, if you want to apply this to things that are *not* templates, then
> you could get a lot further. However, you would end up with code that
> compiles when it wouldn't today. Code that compiles because the portions
> you use would compile if they were on their own, while other bits
> wouldn't. That's the status quo for templates, even no-arg templates, but
> a change from what we currently do everywhere else.
>
> And *that* is what would make it equivalent to use static or selective
> imports. It would also increase the utility from my perspective from "why
> the hell are we even doing this?" to "that's kinda nice".

Lazier compilation is indeed a projected benefit of this DIP. I did not want to dilute the thrust of the proposal with a remote promise.


Andrei

December 24, 2016
tldr: the performance impact of this DIP would be too small to easily measure and only impacts 26 declarations referencing 14 templates in the standard library anyway.

On Fri, 23 Dec 2016 18:55:25 -0500, Andrei Alexandrescu wrote:
>> I grant that everyone uses phobos, and phobos uses template constraints a lot. If it's *just* a problem inside phobos, though, there's another obvious solution: split up modules that tend to house a lot of template constraints. Split up the modules that use a wide variety of template constraints.
> 
> This point is discussed carefully in DIP1005. Please let me know if anything needs to be added.

An estimate for the actual impact on phobos, since that's your primary driver for the change -- both under the status quo and if we try to split modules.

The comparison to mach.d is a strawman. When I thought this might be a problem within phobos, I thought we'd probably split std.traits and maybe std.meta up, probably into 2-5 modules each. Not 150 lines per module; more like 1500 to 4000 lines per module.

Then I looked at the code.

Phobos has 26 templates that would use this special syntax, referencing 14 distinct templates and CTFE functions in three modules.

If you kept the same ratios as are found in mach.d, you'd have one file for every template used as a constraint outside its own module, one for everything else, and as many files again with nothing in them.

The modules that have to be parsed are std.traits, std.meta, and std.range.primitives. I wrote a test program that imported them without using them. It compiled in too little time to accurately measure -- `time` reports 0.01s wall time.

You want to add new language syntax to save me ten milliseconds during compilation.

You hope that template constraints become more common. If they do, this change relies on:

* projects defining their own constraints (not just using std.traits)
* those constraints being defined in modules other than where they're used
* those constraints being defined in very large modules
* my code not depending on anything in those constraint-defining modules
* my code depending on something in a file with a template with one of those constraints
* this whole situation being common enough to give me a death by a thousand papercuts

If any project aside from the standard library has a 7k line module defining things mainly used in template constraints, something is seriously weird. But on the plus side, it would only cost me 10 milliseconds. Now, if *dozens* of projects did that, well, I'd be running to the relative simplicity of Java APIs long before I worried about compilation speed.

> Lazier compilation is indeed a projected benefit of this DIP. I did not want to dilute the thrust of the proposal with a remote promise.

Lazier compilation would *obviate* this DIP. Lazy compilation of selective and static imports would not require any parser changes and would make a lot of code faster (at the cost of allowing some things that don't compile to start compiling, as does your proposal). You can't get any performance advantages outside templates without implementing lazy imports.



Appendix A: templates that would receive 'with(import foo)' in Phobos.

I used a relatively simple regex to look for this. If someone put more than one line between a template and its constraints, or comments with certain formatting, I may have missed it. However, that would violate the phobos style guide.
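
The exact regex isn't reproduced here. As a hypothetical sketch of the general approach, something like the following picks out template declarations from source lines; the pattern and the helper name `templateName` are guesses, and the further filtering on where each constraint is defined is not shown.

```d
import std.regex : matchFirst, regex;

/// Returns the template's name if `line` starts a template declaration,
/// otherwise null. Handles an optional `private`/`package` and `mixin`.
string templateName(string line)
{
    auto re = regex(
        `^\s*(?:(?:private|package)\s+)?(?:mixin\s+)?template\s+(\w+)\s*\(`);
    auto m = matchFirst(line, re);
    return m.empty ? null : m[1];
}
```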

This doesn't include templates that use constraints defined in their own modules, because that module must already be parsed and no efficiency gains could be realized.

It also omits a few cases where the module has a strong reason to import the dependency aside from template constraints. I believe this was only two constraints defined in std.digest, used in one or two other modules.

algorithm/comparison.d:98:template among(values...)
experimental/typecons.d:82:private template implementsInterface(Source, Targets...)
experimental/typecons.d:94:    template implementsInterface()
experimental/typecons.d:126:private template implementsInterface(Source, Targets...)
experimental/typecons.d:184:template wrap(Targets...)
experimental/typecons.d:237:        template wrap(Source)
range/package.d:2069:template Take(R)
range/package.d:3501:template Cycle(R)
range/interfaces.d:277:template MostDerivedInputRange(R)
range/interfaces.d:336:template InputRangeObject(R)
numeric.d:678:template FPTemporary(F)
conv.d:3894:template octal(alias decimalInteger)
utf.d:1136:package template codeUnitLimit(S)
typecons.d:1779:private mixin template RebindableCommon(T, U, alias This)
typecons.d:1838:template Rebindable(T)
typecons.d:4239:template wrap(Targets...)
typecons.d:4252:    template wrap(Source)
typecons.d:4412:template wrap(Targets...)
typecons.d:4429:template unwrap(Target)
typecons.d:4461:template unwrap(Target)
format.d:657:template FormatSpec(Char)
algorithm/iteration.d:1055:template filter(alias predicate) if (is(typeof(unaryFun!predicate)))
internal/math/biguintcore.d:81:template maxBigDigits(T) if (isIntegral!T)
meta.d:248:package template OldAlias(T) if (!isAggregateType!T || is(Unqual!T == T))
meta.d:254:package template OldAlias(T) if (isAggregateType!T && !is(Unqual!T == T))
utf.d:3542:template byUTF(C) if (isSomeChar!C)


Appendix B: templates that would need to be extracted out in phobos, if parsing their modules cost a non-negligible amount of time.

std.meta:
    allSatisfy
    anySatisfy
    ApplyLeft

std.range.primitives:
    hasSlicing
    isInputRange
    isInfinite

std.traits:
    isAggregateType
    isAssociativeArray
    isDynamicArray
    isFloatingPoint
    isImplicitlyConvertible
    isIntegral
    isMutable
    isSomeChar