Slow code, slow
February 23, 2018
Now that I got your attention:

	https://issues.dlang.org/show_bug.cgi?id=18511

tl;dr: A trivial piece of code, written as ostensibly "idiomatic D" with std.algorithm and std.range templates, compiles *an order of magnitude* slower than the equivalent hand-written loop.  The way the compiler compiles templates needs some serious improvement.

(And this is why our current fast-fast-fast slogan annoys me so much. One can argue that it's misleading advertising, given that what's considered "idiomatic D", using features like templates and generic code that's highly-touted as D's strong points, compiles a whole order of magnitude slower than C-style D.  Makes me cringe every time I hear "fast code, fast". Our old slogan is a much more accurate description of the current state of things.)
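
The bug report linked above has the actual test case. Purely as an illustration of the two styles being compared, here is a minimal sketch; it is not the code from the report, and both function names are invented for this example. The loop version needs no imports at all, while the range version makes the compiler instantiate the iota, filter, map and sum templates from Phobos.

```d
// These imports are only needed by the range-based version below.
import std.algorithm : filter, map, sum;
import std.range : iota;

// "C-style D": a plain loop, no Phobos templates involved.
// (Hypothetical example function, not taken from the bug report.)
long sumEvenSquaresLoop(int n)
{
    long total = 0;
    foreach (i; 0 .. n)
        if (i % 2 == 0)
            total += cast(long) i * i;
    return total;
}

// "Idiomatic D": the same computation as a range pipeline.
// Each stage is a template that has to be instantiated at compile time.
long sumEvenSquaresRanges(int n)
{
    return iota(n)
        .filter!(i => i % 2 == 0)
        .map!(i => cast(long) i * i)
        .sum;
}

void main()
{
    // Even numbers below 10: 0 + 4 + 16 + 36 + 64 = 120
    assert(sumEvenSquaresLoop(10) == 120);
    assert(sumEvenSquaresRanges(10) == 120);
}
```

Putting each function in its own file and compiling the two files separately is one way to compare compile times for the two styles.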


T

-- 
Don't throw out the baby with the bathwater. Use your hands...
February 23, 2018
On Friday, 23 February 2018 at 20:15:12 UTC, H. S. Teoh wrote:
> Now that I got your attention:
>
> 	https://issues.dlang.org/show_bug.cgi?id=18511
>
> tl;dr: A trivial piece of code, written as ostensibly "idiomatic D" with std.algorithm and std.range templates, compiles *an order of magnitude* slower than the equivalent hand-written loop.  The way the compiler compiles templates needs some serious improvement.
>
> (And this is why our current fast-fast-fast slogan annoys me so much. One can argue that it's misleading advertising, given that what's considered "idiomatic D", using features like templates and generic code that's highly-touted as D's strong points, compiles a whole order of magnitude slower than C-style D.  Makes me cringe every time I hear "fast code, fast". Our old slogan is a much more accurate description of the current state of things.)
>
>
> T

It's not that big of a slowdown. In the "fast" version you don't import any modules, so they never have to be parsed. That's pretty much all of Phobos you don't have to parse in that example. That's just the initial cost, too; in a big project this won't make a difference. You created a tiny example, irrelevant at larger scale, that takes 0.3 seconds longer to compile. It's an order of magnitude slower because your fast example is literally parsing only 5 lines of code, instead of the hundreds of lines parsed in your slow example.


February 23, 2018
On Friday, 23 February 2018 at 20:35:44 UTC, Rubn wrote:
> It's not that big of a slowdown. In the "fast" version you don't import any modules, so they never have to be parsed. That's pretty much all of Phobos you don't have to parse in that example. That's just the initial cost, too; in a big project this won't make a difference. You created a tiny example, irrelevant at larger scale, that takes 0.3 seconds longer to compile. It's an order of magnitude slower because your fast example is literally parsing only 5 lines of code, instead of the hundreds of lines parsed in your slow example.

I disagree.

It actually matters a lot for big projects with lots of templates, especially nested templates. Gets a whole lot worse when it's templates within mixin templates with templates.

It's not just a "0.3" second difference, but can be half a minute or even more.
February 23, 2018
On Friday, 23 February 2018 at 20:41:17 UTC, bauss wrote:
> On Friday, 23 February 2018 at 20:35:44 UTC, Rubn wrote:
>> It's not that big of a slowdown. In the "fast" version you don't import any modules, so they never have to be parsed. That's pretty much all of Phobos you don't have to parse in that example. That's just the initial cost, too; in a big project this won't make a difference. You created a tiny example, irrelevant at larger scale, that takes 0.3 seconds longer to compile. It's an order of magnitude slower because your fast example is literally parsing only 5 lines of code, instead of the hundreds of lines parsed in your slow example.
>
> I disagree.
>
> It actually matters a lot for big projects with lots of templates, especially nested templates. Gets a whole lot worse when it's templates within mixin templates with templates.
>
> It's not just a "0.3" second difference, but can be half a minute or even more.

As with anything, since you can now basically run code at compile time, you are going to have to optimize your code. If you make a million template instances, well, a compiler isn't going to magically be able to make that fast. This slowdown for this specific example isn't caused by templates; it's caused by having to parse all the extra lines of code from Phobos. I didn't say there aren't problems with templates, but this example accurately depicts nothing.
February 23, 2018
On Fri, Feb 23, 2018 at 08:41:17PM +0000, bauss via Digitalmars-d wrote: [...]
> It actually matters a lot for big projects with lots of templates, especially nested templates. Gets a whole lot worse when it's templates within mixin templates with templates.

The situation has actually improved somewhat after Rainer's symbol backreferencing PR was merged late last year. Before that, deeply nested templates were spending most of their time generating, scanning, and writing out 20MB-long symbols. :-D

Now that superlong symbols are no longer the bottleneck, though, other issues with the implementation of templates are coming to the surface. Like this one, where it takes *3 seconds* to compile a program containing a *single* (trivial) regex:

	https://issues.dlang.org/show_bug.cgi?id=18378


> It's not just a "0.3" second difference, but can be half a minute or even more.

In the old days, when yours truly submitted a naïve implementation of cartesianProduct to Phobos, compiling Phobos unittests would cause the autotester to freeze for a long time and then die with an OOM, because using cartesianProduct with multiple arguments caused an exponential number of templates to get instantiated. :-D

Over the years there have also been a number of PRs that try to mitigate the problem somewhat by, e.g., replacing a linearly-recursive template (usually tail-recursive -- but the compiler currently does not take advantage of that) with a divide-and-conquer scheme instead. A lot of stuff that iterates over AliasSeq suffers from this problem, actually. AIUI, due to the way templates are currently implemented, a linearly-recursive template causes quadratic slowdown in compilation time.  Clearly, the quality of implementation needs improvement here.
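
To make the linear versus divide-and-conquer distinction concrete, here is a minimal sketch. The template names TotalSizeLinear and TotalSizeHalved are made up for illustration and are not from Phobos. In the linear version, a list of n types leads to instantiations with argument lists of length n, n-1, ..., 1, i.e. about n*n/2 arguments in total. If the compiler's cost per instantiation grows with the length of its argument list (an assumption here, consistent with the quadratic behaviour described above), that alone gives quadratic compile time. The halved version recurses only about log2(n) levels deep, and the argument lists at each level add up to n, so the total is about n*log2(n).

```d
// Linear recursion: peel off one type, recurse on the rest.
// (Hypothetical helper, for illustration only.)
template TotalSizeLinear(T...)
{
    static if (T.length == 0)
        enum size_t TotalSizeLinear = 0;
    else
        enum size_t TotalSizeLinear =
            T[0].sizeof + TotalSizeLinear!(T[1 .. $]);
}

// Divide and conquer: split the list in half and recurse on both halves.
// (Hypothetical helper, for illustration only.)
template TotalSizeHalved(T...)
{
    static if (T.length == 0)
        enum size_t TotalSizeHalved = 0;
    else static if (T.length == 1)
        enum size_t TotalSizeHalved = T[0].sizeof;
    else
        enum size_t TotalSizeHalved =
            TotalSizeHalved!(T[0 .. $ / 2]) + TotalSizeHalved!(T[$ / 2 .. $]);
}

// Both compute the same value: 1 + 2 + 4 + 8 = 15.
static assert(TotalSizeLinear!(byte, short, int, long) == 15);
static assert(TotalSizeHalved!(byte, short, int, long) == 15);

void main() {}
```

The results are identical; only the shape of the instantiation tree differs, which is why this kind of rewrite can be applied without changing the code that uses the template.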


T

-- 
Once bitten, twice cry...
February 23, 2018
On Fri, Feb 23, 2018 at 08:51:20PM +0000, Rubn via Digitalmars-d wrote: [...]
> This slowdown for this specific example isn't caused by templates; it's caused by having to parse all the extra lines of code from Phobos. I didn't say there aren't problems with templates, but this example accurately depicts nothing.

I say again, do you have measurements to back up your statement?

Parsing is actually very fast with the DMD front end.  I can't believe that it will take half a second to parse a Phobos module -- the compiler's parser is not that stupid.  I have a 1600+ line module that compiles in about 0.4 seconds (that's lexing + parsing + semantic + codegen), but that time more than doubles when you just change a loop into a range-based algorithm.  Clearly, parsing is not the bottleneck here.


T

-- 
Unix is my IDE. -- Justin Whear
February 23, 2018
On 2/23/18 3:15 PM, H. S. Teoh wrote:
> Now that I got your attention:
> 
> 	https://issues.dlang.org/show_bug.cgi?id=18511
> 
> tl;dr: A trivial piece of code, written as ostensibly "idiomatic D" with
> std.algorithm and std.range templates, compiles *an order of magnitude*
> slower than the equivalent hand-written loop.  The way the compiler
> compiles templates needs some serious improvement.
> 
> (And this is why our current fast-fast-fast slogan annoys me so much.
> One can argue that it's misleading advertising, given that what's
> considered "idiomatic D", using features like templates and generic code
> that's highly-touted as D's strong points, compiles a whole order of
> magnitude slower than C-style D.  Makes me cringe every time I hear
> "fast code, fast". Our old slogan is a much more accurate description of
> the current state of things.)

cc Dmitry

Thanks for a solid bug report. The right response here is to live into our "fast code, fast" principle. It might be the case that the slowdown is actually the negative side of an acceleration :o) - before Dmitry's recent work, the sheer act of importing std.regex would be slow. Dmitry, do you think you could use some precompiled tables to mitigate this? Is your caching compiler going to help the matter?

Andrei
February 24, 2018
On Friday, 23 February 2018 at 21:10:25 UTC, H. S. Teoh wrote:
> On Fri, Feb 23, 2018 at 08:51:20PM +0000, Rubn via Digitalmars-d wrote: [...]
>> This slowdown for this specific example isn't caused by templates; it's caused by having to parse all the extra lines of code from Phobos. I didn't say there aren't problems with templates, but this example accurately depicts nothing.
>
> I say again, do you have measurements to back up your statement?
>
> Parsing is actually very fast with the DMD front end.  I can't believe that it will take half a second to parse a Phobos module -- the compiler's parser is not that stupid.  I have a 1600+ line module that compiles in about 0.4 seconds (that's lexing + parsing + semantic + codegen), but that time more than doubles when you just change a loop into a range-based algorithm.  Clearly, parsing is not the bottleneck here.
>
>
> T


I did measure it: adding another instantiation of the templates with a different type adds only a fraction of that time, not another 0.3 seconds.

I don't know what your so-called 1600+ line module is doing. Just because it's 1600 lines doesn't mean there won't be the same slowdown if those lines don't use any part of Phobos and you then add a few lines that do, which will incur this slowdown.
February 24, 2018
On Friday, 23 February 2018 at 20:15:12 UTC, H. S. Teoh wrote:
> Now that I got your attention:
>
> 	https://issues.dlang.org/show_bug.cgi?id=18511

Your bug report is about a slowdown in *compilation* time. I wondered if the longer compilation time was due to better (faster) generated code. But this is not the case either:

$ ./dotbench
initialized arrays of type double
dot_fast: 279 ms
value = 0
dot_slow: 5413 ms
value = 0
dotProduct: 217 ms
value = 0
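
The post does not include the benchmark source. A harness producing output of this shape might look roughly like the sketch below. The bodies of dot_fast and dot_slow, the report helper, the array size, the fill values and the repetition count are all assumptions. Only std.numeric.dotProduct is an existing Phobos function. Timings will depend heavily on the compiler and optimization flags, so none are shown here.

```d
import std.algorithm : map, sum;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.numeric : dotProduct;
import std.range : zip;
import std.stdio : writefln;

// Assumed body: plain indexed loop.
double dot_fast(const(double)[] a, const(double)[] b)
{
    assert(a.length == b.length);
    double total = 0.0;
    foreach (i; 0 .. a.length)
        total += a[i] * b[i];
    return total;
}

// Assumed body: std.range / std.algorithm pipeline.
double dot_slow(const(double)[] a, const(double)[] b)
{
    return zip(a, b).map!(t => t[0] * t[1]).sum;
}

// Hypothetical helper: calls `fun` `reps` times, then prints the elapsed
// wall-clock time in milliseconds and the last result.
void report(alias fun)(string name, double[] a, double[] b, size_t reps)
{
    double value = 0.0;
    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. reps)
        value = fun(a, b);
    sw.stop();
    writefln("%s: %s ms", name, sw.peek.total!"msecs");
    writefln("value = %s", value);
}

void main()
{
    // Assumed sizes and fill values; the original post does not state them.
    auto a = new double[](10_000);
    auto b = new double[](10_000);
    a[] = 1.5;
    b[] = 2.0;

    report!dot_fast("dot_fast", a, b, 1_000);
    report!dot_slow("dot_slow", a, b, 1_000);
    report!dotProduct("dotProduct", a, b, 1_000);
}
```

With these fill values every product is 3.0, so all three functions should agree on the result, which makes it easy to check that the variants compute the same thing before comparing their timings.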



February 24, 2018
On Saturday, 24 February 2018 at 00:21:06 UTC, Andrei Alexandrescu wrote:
> On 2/23/18 3:15 PM, H. S. Teoh wrote:
>> 
>> tl;dr: A trivial piece of code, written as ostensibly "idiomatic D" with [...]  Makes me cringe every time I hear
>> "fast code, fast". Our old slogan is a much more accurate description of
>> the current state of things.)
>
> cc Dmitry
>
> Thanks for a solid bug report. The right response here is to live into our "fast code, fast" principle. It might be the case that the slowdown is actually the negative side of an acceleration :o) - before Dmitry's recent work, the sheer act of importing std.regex would be slow. Dmitry, do you think you could use some precompiled tables to mitigate this?

First things first: somebody needs to profile the compiler while it compiles this snippet.

My guess is that instantiating templates + generating long symbols is the problem.

The template system obviously needs some (re)work. I think at the time nobody thought templates would be that abundant in D code.

Nowadays it’s easily more templates than normal functions.

> Is your caching compiler going to help the matter?

In some distant bright future, where it may finally be applied to instantiating templates and caching codegen; but even then I’m not 100% positive.

Finally, I repeat: we have not yet identified the problem. What takes time in the compiler needs to be figured out by dissecting the time taken, via a profiler and experimentation.


—
Dmitry Olshansky

