December 23, 2016
On 12/23/16 9:23 PM, Chris Wright wrote:
> The comparison to mach.d is a strawman.

mach.d is given as an example of the approach of breaking code into fine-grained modules. No comparison is made or implied.

> Then I looked at the code.
>
> Phobos has 26 templates that would use this special syntax, referencing
> 14 distinct templates and CTFE functions in three modules.

Could you please give more detail about the method you used? What special syntax are you referring to - would that be the "with (import ...)" syntax? If that's the case there's a bunch of stuff missing. Consider e.g.:

int cmp(alias pred = "a < b", R1, R2)(R1 r1, R2 r2)
if (isInputRange!R1 && isInputRange!R2 && !(isSomeString!R1 && isSomeString!R2));

int cmp(alias pred = "a < b", R1, R2)(R1 r1, R2 r2)
if (isSomeString!R1 && isSomeString!R2);

These and many like them would use the "with import" construct, wouldn't they? They are missing from your list.
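
To make the point concrete, here is a minimal sketch of why such declarations need module-level imports today. The function `compareByLength` is a hypothetical helper invented for illustration (it is not in Phobos), and the `with (import ...)` form in the comment is only an approximation of the proposed syntax as discussed in this thread; it is not valid D today.

```d
// Module-level imports: required today because the constraint is part of
// the declaration, not of the body.
import std.range.primitives : isInputRange;
import std.traits : isSomeString;

// Under the proposal, the two imports above would instead be attached to
// the declaration, roughly like this (comment only, not valid D today):
//     with (import std.range.primitives, std.traits)
//     int compareByLength(R1, R2)(R1 r1, R2 r2) if (...)

// Hypothetical helper, not part of Phobos.
int compareByLength(R1, R2)(R1 r1, R2 r2)
if (isInputRange!R1 && isInputRange!R2 && !(isSomeString!R1 && isSomeString!R2))
{
    // Local import: only needed once the body is instantiated.
    import std.range.primitives : walkLength;

    immutable a = r1.walkLength;
    immutable b = r2.walkLength;
    return a < b ? -1 : (a > b ? 1 : 0);
}

void main()
{
    assert(compareByLength([1, 2], [1, 2, 3]) == -1);
    // Two strings are rejected by the constraint, as in the first cmp overload.
    static assert(!__traits(compiles, compareByLength("ab", "abc")));
}
```

The body's dependency can already be scoped locally; the constraint's dependencies cannot, which is the gap the "with import" construct is meant to fill.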

> The templates that have to be parsed are std.traits, std.meta, and
> std.range.primitives.

That's where most introspection primitives for the standard library are situated indeed.

> I wrote a test program that imported them without
> using them. It compiled in too little time to accurately measure --
> `time` reports 0.01s wall time.
>
> You want to add new language syntax to save me ten milliseconds during
> compilation.

DIP1005 enumerates several benefits of the proposed feature. Speed is not the most important and is not presented as such, but it takes the most space because the others are more difficult to evaluate without an experimental implementation.

> You hope that template constraints become more common.

Well, to the extent that is the way correct code is written, that's more than just hope.

> If they do, this
> change relies on:
>
> * projects defining their own constraints (not just using std.traits)
> * those constraints being defined in modules other than where they're used
> * those constraints being defined in very large modules
> * my code not depending on anything in those constraint-defining modules
> * my code depending on something in a file with a template with one of
> those constraints
> * this whole situation being common enough to give me a death by a
> thousand papercuts

It's not only constraints, it's everything in declarations that relies on entities (e.g. types) defined in other modules. And again the feature's benefits go well beyond making general projects faster to build.

A simple way to look at things is - local imports are The Right Thing, for many reasons beyond compilation speed. There needs to be a way to do the right thing for the declaration part as well.

(I'm glad we're getting to the point of diminishing returns - I recall an argument made a while ago was that, on the contrary, our compilation model cannot possibly scale for large projects.)

> Appendix A: templates that would receive 'with(import foo)' in Phobos.

Could you please provide more detail on how you measured this and what assumptions you made? Most templates in Phobos would receive a with clause, so there's a disconnect somewhere.


Andrei

December 24, 2016
On Saturday, 24 December 2016 at 02:23:53 UTC, Chris Wright wrote:
> Appendix A: templates that would receive 'with(import foo)' in Phobos.
>
> I used a relatively simple regex to look for this. If someone put more than one line between a template and its constraints, or comments with certain formatting, I may have missed it. However, that would violate the Phobos style guide.
>
> This doesn't include templates that use constraints defined in their own modules, because that module must already be parsed and no efficiency gains could be realized.
>
> It also omits a few cases where the module has a strong reason to import the dependency aside from template constraints. I believe this was only two constraints defined in std.digest, used in one or two other modules.
>
> algorithm/comparison.d:98:template among(values...)
> experimental/typecons.d:82:private template implementsInterface(Source, Targets...)
> experimental/typecons.d:94:    template implementsInterface()
> experimental/typecons.d:126:private template implementsInterface(Source, Targets...)
> experimental/typecons.d:184:template wrap(Targets...)
> experimental/typecons.d:237:        template wrap(Source)
> range/package.d:2069:template Take(R)
> range/package.d:3501:template Cycle(R)
> range/interfaces.d:277:template MostDerivedInputRange(R)
> range/interfaces.d:336:template InputRangeObject(R)
> numeric.d:678:template FPTemporary(F)
> conv.d:3894:template octal(alias decimalInteger)
> utf.d:1136:package template codeUnitLimit(S)
> typecons.d:1779:private mixin template RebindableCommon(T, U, alias This)
> typecons.d:1838:template Rebindable(T)
> typecons.d:4239:template wrap(Targets...)
> typecons.d:4252:    template wrap(Source)
> typecons.d:4412:template wrap(Targets...)
> typecons.d:4429:template unwrap(Target)
> typecons.d:4461:template unwrap(Target)
> format.d:657:template FormatSpec(Char)
> algorithm/iteration.d:1055:template filter(alias predicate) if (is(typeof(unaryFun!predicate)))
> internal/math/biguintcore.d:81:template maxBigDigits(T) if (isIntegral!T)
> meta.d:248:package template OldAlias(T) if (!isAggregateType!T || is(Unqual!T == T))
> meta.d:254:package template OldAlias(T) if (isAggregateType!T && !is(Unqual!T == T))
> utf.d:3542:template byUTF(C) if (isSomeChar!C)
>
>
> Appendix B: templates that would need to be extracted out in phobos, if parsing their modules cost a non-negligible amount of time.
>
> std.meta:
>     allSatisfy
>     anySatisfy
>     ApplyLeft
>
> std.range.primitives:
>     hasSlicing
>     isInputRange
>     isInfinite
>
> std.traits:
>     isAggregateType
>     isAssociativeArray
>     isDynamicArray
>     isFloatingPoint
>     isImplicitlyConvertible
>     isIntegral
>     isMutable
>     isSomeChar

There are a lot of templates in Phobos that never use the template keyword. The proposal doesn't apply only to constraints; it applies to the whole declaration.

December 24, 2016
On 12/23/2016 09:23 PM, Chris Wright wrote:
[abbreviated below]

Upon more investigation, I see a large discrepancy between the findings of DIP1005 and yours. So there are a number of claims here, which I'll summarize:

> tldr: the performance impact of this DIP would be too small to easily
> measure and only impacts 26 declarations referencing 14 templates in the
> standard library anyway.

> Phobos has 26 templates that would use this special syntax, referencing
> 14 distinct templates and CTFE functions in three modules.

> The templates that have to be parsed are std.traits, std.meta, and
> std.range.primitives. I wrote a test program that imported them without
> using them. It compiled in too little time to accurately measure --
> `time` reports 0.01s wall time.
>
> You want to add new language syntax to save me ten milliseconds during
> compilation.

The findings of DIP1005 are the following:

* Importing a single std module also imports on average 10.5 other modules.

* Importing a single std module costs on average 64.6 ms.

* (Not stated in the DIP) A majority of std templates would acquire inline imports.

According to the DIP, one may estimate that the proposed feature would reduce additional imports to 0 and cut the average time to import a single module by a factor of 10, to under 10 ms.

By your estimates:

* 26 templates in std need inline imports.

* Importing a single std module today would import only 1-3 other modules most of the time (one or more of std.traits, std.meta, and std.range.primitives).

* These additional imports cost in aggregate under 10ms, bringing the average cost of importing a module itself to 54.6 ms.

* It follows that the average module takes 5.46 times longer to import on its own than std.traits, std.meta, and std.range.primitives combined (which have a total of 11263 lines, 5x more than the average Phobos module).
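
The arithmetic behind the last two bullets can be checked at compile time. This is only a sketch of the calculation using the figures quoted above (64.6 ms average, 10 ms for the three modules); the constant names are made up for illustration, and integers in tenths of a millisecond are used to sidestep floating-point rounding.

```d
// All times are in tenths of a millisecond.
enum avgImportTenthsMs    = 646; // 64.6 ms: average cost of importing one std module
enum threeModulesTenthsMs = 100; // 10 ms: std.traits + std.meta + std.range.primitives

// Cost left for the module itself if only those three imports were avoidable.
enum remainderTenthsMs = avgImportTenthsMs - threeModulesTenthsMs;
static assert(remainderTenthsMs == 546); // i.e. 54.6 ms

// Ratio of that remainder to the three modules together, scaled by 100.
enum ratioTimes100 = remainderTenthsMs * 100 / threeModulesTenthsMs;
static assert(ratioTimes100 == 546); // i.e. 5.46x

void main() {}
```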

I don't see how your claims can be simultaneously true with the findings of DIP1005. The scripts that compute those numbers are available with the DIP. Were you able to reproduce them?

(I confirm that importing std.traits, std.meta, and std.range.primitives together takes 10ms.)
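
For reference, a test of this kind can be as small as the following sketch. The file name and the exact command line are assumptions, since neither was posted; `-o-` is the dmd switch that suppresses object file output.

```d
// test.d: imports the three modules without using anything from them.
// One possible way to time it:
//     time dmd -o- test.d
import std.traits;
import std.meta;
import std.range.primitives;

void main() {}
```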


Thanks,

Andrei

December 24, 2016
On Saturday, 24 December 2016 at 09:34:03 UTC, Andrei Alexandrescu wrote:
> (I confirm that importing std.traits, std.meta, and std.range.primitives together takes 10ms.)

While compiling std.traits, 6 files are opened and read into memory,
taking roughly 300 microseconds, which is not even 0.3% of the time spent.
Lexing them takes about another 300 microseconds.
So together that makes up 0.6% of the time spent.

The real problem here is _NOT_ opening and lexing files.

It is rather eager parsing and sema.

If that were made more lazy, we could import half of the world without noticeable impact.

(Although in std.traits especially, that would not make much of a difference, since every template in there depends on nearly every other template in there.)


December 24, 2016
On Saturday, 24 December 2016 at 10:54:08 UTC, Stefan Koch wrote:
> 300 microseconds, which is not even 0.3% of the time spent.
> Lexing them takes about another 300 microseconds.
> So together that makes up 0.6% of the time spent.

Of course, in the above, 3% and 6% are the right numbers.
(And still conservative, since they were obtained using a -profile build of dmd.)
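
As a quick check of the corrected percentages, assuming the total is the roughly 10 ms quoted earlier (an assumption; the total for this profile run is not stated), the figures work out as follows. The constant names are illustrative only.

```d
enum totalMicros = 10_000; // assumption: ~10 ms total, as quoted above
enum readMicros  = 300;    // opening and reading the 6 files
enum lexMicros   = 300;    // lexing them

// Reading alone: 3% of the total, not 0.3%.
static assert(readMicros * 100 / totalMicros == 3);
// Reading plus lexing: 6% of the total, not 0.6%.
static assert((readMicros + lexMicros) * 100 / totalMicros == 6);

void main() {}
```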

December 24, 2016
On 12/24/2016 05:54 AM, Stefan Koch wrote:
> If that were made more lazy, we could import half of the world without
> noticeable impact.

That is what 1005 will bring. -- Andrei

December 24, 2016
On Saturday, 24 December 2016 at 14:08:48 UTC, Andrei Alexandrescu wrote:
> On 12/24/2016 05:54 AM, Stefan Koch wrote:
>> If that were made more lazy, we could import half of the world without
>> noticeable impact.
>
> That is what 1005 will bring. -- Andrei

A compiler enhancement can do this _without_ a language change.

December 24, 2016
On 12/24/16 9:20 AM, Stefan Koch wrote:
> On Saturday, 24 December 2016 at 14:08:48 UTC, Andrei Alexandrescu wrote:
>> On 12/24/2016 05:54 AM, Stefan Koch wrote:
>>> If that were made more lazy, we could import half of the world without
>>> noticeable impact.
>>
>> That is what 1005 will bring. -- Andrei
>
> A compiler enhancement can do this _without_ a language change.

The language addition has additional benefits as described by DIP1005. -- Andrei

December 24, 2016
On Saturday, 24 December 2016 at 15:44:18 UTC, Andrei Alexandrescu wrote:
> On 12/24/16 9:20 AM, Stefan Koch wrote:
>> On Saturday, 24 December 2016 at 14:08:48 UTC, Andrei Alexandrescu wrote:
>>> On 12/24/2016 05:54 AM, Stefan Koch wrote:
>>>> If that were made more lazy, we could import half of the world without
>>>> noticeable impact.
>>>
>>> That is what 1005 will bring. -- Andrei
>>
>> A compiler enhancement can do this _without_ a language change.
>
> The language addition has additional benefits as described by DIP1005. -- Andrei

I just read over the DIP, and it is a giant wall of text.
I cannot really make heads or tails of it.
Maybe you could write the advantages you hint at in short bullet-point form?

December 24, 2016
On Sat, 24 Dec 2016 04:34:03 -0500, Andrei Alexandrescu wrote:
> Upon more investigation, I see a large discrepancy between the findings of DIP1005 and yours.

There's no discrepancy.

In part, you are misinterpreting most of what I said.

In part, you are assuming that imports on non-template declarations will be handled lazily, even though that is not part of this DIP and is likewise possible with static and selective imports.

In part, you are using lines of code as a proxy for compile time.

In part, you dispute that this only affects template constraints, but:
* An import used only in the body of a template can be made a local
import today.
* An import used in the declaration of a templated type or function can
be addressed by using explicit template syntax, offering a place to
insert your imports.
* An import used anywhere else must still be processed, even assuming
this DIP is implemented.
* If, in a future DIP, we make it so that `with(import)` is handled
lazily, we can also make it so that static and selective imports are
handled lazily.
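
The second bullet can be sketched as follows. This is a minimal illustration, not code from Phobos: `sumPair` is a made-up name, and the example instantiates the template explicitly with `!()` to stay on the safe side rather than relying on argument deduction.

```d
// Long-form ("explicit") template syntax: the template body provides a scope
// for the import, so the enclosing module needs no top-level import of
// std.typecons even though Tuple appears in the function's signature.
template sumPair()
{
    import std.typecons : Tuple;

    int sumPair(Tuple!(int, int) p)
    {
        return p[0] + p[1];
    }
}

void main()
{
    import std.typecons : tuple;

    // Explicit instantiation of the zero-parameter template.
    assert(sumPair!()(tuple(1, 2)) == 3);
}
```

The import is only processed when `sumPair` is instantiated, which is the same effect the first bullet gets from a local import inside a function body.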

> The findings of DIP1005 are the following:
> 
> * Importing a single std module also imports on average 10.5 other modules.

Seems reasonable. Between 2 and 3.5 direct dependencies, by my count, and you're counting transitive dependencies.

We're concerned with the effects of DIP1005, though, which only affects template constraints.

> * Importing a single std module costs on average 64.6 ms.

55-ish for your hardware, you reported elsewhere. 47-ish for mine.

> * (Not stated in the DIP) A majority of std templates would acquire
> inline imports.

Again, that wouldn't impact compile times because these aren't template constraints.

You can make a separate DIP to make imports lazy. That can impact static, selective, and `with` imports equally well. But it's not part of what we're discussing today.

> According to the DIP, one may estimate that the proposed feature would reduce additional imports to 0 and cut the average time to import a single module by a factor of 10, to under 10 ms.

"The proposed feature" must be lazy semantic analysis, especially of imports. That isn't part of DIP1005.

You won't get to zero additional imports. You might get to zero *extraneous* imports -- that is, only the set of imports required to create a custom *.di file containing only the parts of the module that your application uses.

> By your estimates:
> 
> * 26 templates in std need inline imports.

I said that 26 templates *could possibly benefit from* your new style of imports. There's a difference between possibly benefitting from a change and needing that change.

> * Importing a single std module today would import only 1-3 other modules most of the time (one or more of std.traits, std.meta, and std.range.primitives).

No, that's not what I said at all. I said that the only modules you would sometimes *stop* processing because of DIP1005 are std.traits, std.meta, and std.range.primitives. That's because those modules contain templates used in other modules as template constraints.

In order to get any additional improvements, you need lazy imports, which can also apply to static or selective imports without any syntax changes.
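
For readers less familiar with the two existing forms mentioned here, a short sketch of what they look like today. Whether the compiler could process them lazily is the open question in this thread; the code only shows that both already tell the compiler exactly which names come from which module.

```d
// Static import: symbols are reachable only through full qualification.
static import std.traits;
// Selective import: only the listed name is brought into scope.
import std.range.primitives : isInputRange;

static assert(std.traits.isIntegral!int);
static assert(isInputRange!(int[]));

// The unqualified name is not visible, because std.traits was imported statically.
static assert(!__traits(compiles, isIntegral!int));

void main() {}
```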

> * These additional imports cost in aggregate under 10ms, bringing the average cost of importing a module itself to 54.6 ms.

~10ms is the upper bound of the added cost if you import just one module in std that has a template constraint you don't use.

The way you state it implies that every module brings in std.traits, std.meta, and std.range.primitives unnecessarily, instead of 26 templates across at most 26 modules importing them for a reason.

> * It follows that the average module takes 5.46 times longer to import on its own than std.traits, std.meta, and std.range.primitives combined (which have a total of 11263 lines, 5x more than the average Phobos module).

More like 4.7 on my hardware, but yeah. 11k lines that have to be parsed and 0 lines that require semantic analysis. Not terribly surprising.

> I don't see how your claims can be simultaneously true with the findings of DIP1005.

You found that the average cost of importing a std module is 54ms or thereabouts. std.traits, std.meta, and std.range.primitives are well below average. No conflict there. They aren't even the cheapest modules in the standard library.

The modules in question are mostly unittests. The compiler doesn't run semantic analysis on unittests in a module that wasn't included on the command line. (Even if you pass -unittest. Try it out -- you can even have a unittest that says `static assert(false);` and it does nothing.)

The parts of the modules that are not unittests are templates. The compiler doesn't run semantic analysis on templates until you use them.

So it should be pretty obvious why these modules are so cheap to import and not use.
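
The template half of that argument is easy to demonstrate in a single file. The sketch below uses a made-up template, `neverUsed`, whose body cannot compile for any argument; the module still compiles because the body is only parsed until something instantiates it. (The unittest half needs two files and is not shown.)

```d
// Hypothetical template whose body fails for every T...
void neverUsed(T)(T value)
{
    static assert(false, "only evaluated on instantiation");
}

// ...yet this module compiles. The failure appears only when an
// instantiation is attempted, which __traits(compiles) does speculatively.
static assert(!__traits(compiles, neverUsed(1)));

void main() {}
```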

> The scripts that compute those numbers are available with the DIP. Were you able to reproduce them?

The times it reported on my hardware:

Min: 5ms
Max: 300ms
Median: 21ms
Average: 47ms

The minimum isn't terribly useful because it gets to the point of testing the process scheduler and IO more than the compiler. If we want numbers that we can trust on the low end, we'll need to put timing information into the compiler, maybe control for IO by using a ramfs, that sort of thing.

You also reproduced my test, so this isn't a quirk of my installation.