Speeding up DCD in big projects
Jul 21, 2020
WebFreak001
July 21, 2020
So it turns out the `resolveImportLocation` function, which is called on the first semantic pass, currently takes about 50ms per call on Windows with only about 60 import paths. In a big project like serve-d (as a source code project) this adds up to over 7 minutes (around 2 minutes in release mode).

I optimized the import path check code, and now you might get an 83x speedup for your first completion in DCD debug mode and a 26x speedup in DCD release mode. In serve-d (as a source code project) the first completion now takes only 5 seconds.

I think optimizing the startup time is pretty essential in big projects too, and this was a very easy improvement that I found immediately using the debugger. The example times were measured using DCD ~master with my dsymbol PR on and off, inside the source code of serve-d, trying to auto-complete at extension.d:860.

This spot was very easy to find and improve; there might be more low-hanging fruit to get the first parse and completion even quicker than 5s.

The PR: https://github.com/dlang-community/dsymbol/pull/151
July 21, 2020
Essential to this were `nothrow` overloads of isFile/isDir in std.file, which I now manually implemented. Having functions for those in phobos would be awesome and clearly would offer huge performance improvements.
July 21, 2020
On Tuesday, 21 July 2020 at 12:51:50 UTC, WebFreak001 wrote:
> So it turns out the `resolveImportLocation` function, which is called on the first semantic pass, currently takes about 50ms per call on Windows with only about 60 import paths. In a big project like serve-d (as a source code project) this adds up to over 7 minutes (around 2 minutes in release mode).
>
> [...]


Nice. DCD improvements bubble down to many dev tools.
July 21, 2020
On Tuesday, 21 July 2020 at 12:53:10 UTC, WebFreak001 wrote:
> Essential to this were `nothrow` overloads of isFile/isDir in std.file, which I now manually implemented. Having functions for those in phobos would be awesome and clearly would offer huge performance improvements.

Was the change to nothrow the only alteration?  I ask because I thought that a move to nothrow only wins big if it enables inlining within a hot loop.

Perhaps your manual implementation wins in other ways?

July 21, 2020
On Tue, Jul 21, 2020 at 04:46:23PM +0000, Bruce Carneal via Digitalmars-d wrote:
> On Tuesday, 21 July 2020 at 12:53:10 UTC, WebFreak001 wrote:
> > Essential to this were `nothrow` overloads of isFile/isDir in std.file, which I now manually implemented. Having functions for those in phobos would be awesome and clearly would offer huge performance improvements.
> 
> Was the change to nothrow the only alteration?  I ask because I thought that a move to nothrow only wins big if it enables inlining within a hot loop.
> 
> Perhaps your manual implementation wins in other ways?
[...]

I'm skeptical of "huge performance improvements" with nothrow in anything that involves disk I/O.  An I/O roundtrip far outweighs whatever meager savings you may have won with nothrow.  I have a hard time conceiving of a situation where nothrow would confer significant performance improvements to isFile/isDir.  IMO you'd get much better savings by reducing the number of I/O roundtrips instead.


T

-- 
Today's society is one of specialization: as you grow, you learn more and more about less and less. Eventually, you know everything about nothing.
July 21, 2020
On Tuesday, 21 July 2020 at 16:46:23 UTC, Bruce Carneal wrote:
> On Tuesday, 21 July 2020 at 12:53:10 UTC, WebFreak001 wrote:
>> Essential to this were `nothrow` overloads of isFile/isDir in std.file, which I now manually implemented. Having functions for those in phobos would be awesome and clearly would offer huge performance improvements.
>
> Was the change to nothrow the only alteration?  I ask because I thought that a move to nothrow only wins big if it enables inlining within a hot loop.
>
> Perhaps your manual implementation wins in other ways?

Before, the code was like this:

try
    return isFile(file);
catch (FileException)
    return false;


Now it's like this:

uint attr;
bool exists = getAttrs(file, &attr);
return exists && attrIsFile(attr);


So the useless throwing was removed, which was the main alteration. I also optimized the logic of the caller (removed useless array pushing, added early returns and reduced FS calls), but the effect of that was probably relatively small.

I only roughly checked in the debugger how fast the loop was: before, it took about 40-60ms, and after just the nothrow change it took <1ms. So the nothrow change definitely accounts for most of the speedup, considering this was pretty much hot code (called for every import, recursing into all imported D files, and making about 3 FS calls for every import path, of which I had 60).
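
For readers who want to try the same idea without writing OS-specific code, here is a minimal sketch using only `std.file.exists` and `std.file.isFile`. It is not the code from the PR (which uses a hand-written attribute lookup); `existsAndIsFile` and `findModuleFile` are hypothetical names, and the `.d`/`.di` candidates are an assumption made for illustration. The point is only that a missing file is answered with `false` instead of a thrown and caught exception, and that the caller returns on the first hit.

```d
import std.file : exists, isFile;
import std.path : buildPath;

/// Hypothetical helper: true only if `path` exists and is a regular file.
/// A missing path is reported by `exists` returning false, so the common
/// "not found" case does not throw.
bool existsAndIsFile(string path) nothrow
{
    try
    {
        return exists(path) && isFile(path);
    }
    catch (Exception)
    {
        // Rare case, e.g. the file vanished between the two calls.
        return false;
    }
}

/// Hypothetical resolver: returns the first candidate file found in the
/// import paths, or null if there is none. It returns early on the first
/// hit and builds no intermediate array of candidates.
string findModuleFile(string modulePath, string[] importPaths)
{
    foreach (dir; importPaths)
    {
        string base = buildPath(dir, modulePath);
        if (existsAndIsFile(base ~ ".d"))
            return base ~ ".d";
        if (existsAndIsFile(base ~ ".di"))
            return base ~ ".di";
    }
    return null;
}
```

Note that this variant makes two FS calls for a file that does exist; fetching the attributes once and testing them, as in the snippet above, avoids that.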
July 21, 2020
On 7/21/20 1:02 PM, H. S. Teoh wrote:
> On Tue, Jul 21, 2020 at 04:46:23PM +0000, Bruce Carneal via Digitalmars-d wrote:
>> On Tuesday, 21 July 2020 at 12:53:10 UTC, WebFreak001 wrote:
>>> Essential to this were `nothrow` overloads of isFile/isDir in
>>> std.file, which I now manually implemented. Having functions for
>>> those in phobos would be awesome and clearly would offer huge
>>> performance improvements.
>>
>> Was the change to nothrow the only alteration?  I ask because I
>> thought that a move to nothrow only wins big if it enables inlining
>> within a hot loop.
>>
>> Perhaps your manual implementation wins in other ways?
> [...]
> 
> I'm skeptical of "huge performance improvements" with nothrow in
> anything that involves disk I/O.  An I/O roundtrip far outweighs
> whatever meager savings you may have won with nothrow.  I have a hard
> time conceiving of a situation where nothrow would confer significant
> performance improvements to isFile/isDir.  IMO you'd get much better
> savings by reducing the number of I/O roundtrips instead.

If you look at the change, it's not the nothrow optimization that Walter always talks about (where code that is marked nothrow performs slightly better than code that isn't marked nothrow but doesn't end up throwing), it's that the COMMON CASE was that an exception is thrown and caught, and then bool is returned. Instead, just return the bool.

Also, just because it involves file info, doesn't mean it's doing disk i/o for every call.

-Steve
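
A rough way to see the difference described here is to time both shapes on a path that does not exist. This is only a sketch: `viaCatch`, `viaExists`, `compareCosts` and the path are hypothetical, and the numbers will depend on the OS, the build mode and whether a debugger is attached.

```d
import core.time : Duration;
import std.datetime.stopwatch : benchmark;
import std.file : exists, FileException, isFile;

// Hypothetical path that is assumed not to exist.
enum missingPath = "no/such/dir/missing.d";

/// Old shape: the missing file is reported by throwing and catching.
bool viaCatch()
{
    try
    {
        return isFile(missingPath);
    }
    catch (FileException)
    {
        return false;
    }
}

/// New shape: the missing file is reported by a plain bool.
bool viaExists()
{
    return exists(missingPath) && isFile(missingPath);
}

/// Runs both variants `iterations` times; index 0 is the throwing
/// variant, index 1 the bool-returning one.
Duration[2] compareCosts(uint iterations)
{
    return benchmark!(viaCatch, viaExists)(iterations);
}
```

Calling `compareCosts(10_000)` and printing both durations with `writefln("%s vs %s", r[0], r[1])` gives a comparison for a given machine.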
July 21, 2020
On Tue, Jul 21, 2020 at 01:12:04PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> On 7/21/20 1:02 PM, H. S. Teoh wrote:
[...]
> > I'm skeptical of "huge performance improvements" with nothrow in anything that involves disk I/O.
[...]
> If you look at the change, it's not the nothrow optimization that Walter always talks about (where code that is marked nothrow performs slightly better than code that isn't marked nothrow but doesn't end up throwing), it's that the COMMON CASE was that an exception is thrown and caught, and then bool is returned. Instead, just return the bool.
[...]

Ahh I see.  That makes sense then.

Which also begs the question, *why* does it even throw in the first place.  The non-existence of a file is a normally-expected outcome of isFile/isDir, throwing in that case seems excessively heavy-handed. It's probably a case of bad API design.


T

-- 
Live for a century, learn for a century, and you'll still die a fool.
July 21, 2020
On 7/21/20 1:32 PM, H. S. Teoh wrote:
> On Tue, Jul 21, 2020 at 01:12:04PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
>> On 7/21/20 1:02 PM, H. S. Teoh wrote:
> [...]
>>> I'm skeptical of "huge performance improvements" with nothrow in
>>> anything that involves disk I/O.
> [...]
>> If you look at the change, it's not the nothrow optimization that
>> Walter always talks about (where code that is marked nothrow performs
>> slightly better than code that isn't marked nothrow but doesn't end up
>> throwing), it's that the COMMON CASE was that an exception is thrown
>> and caught, and then bool is returned. Instead, just return the bool.
> [...]
> 
> Ahh I see.  That makes sense then.
> 
> Which also begs the question, *why* does it even throw in the first
> place.  The non-existence of a file is a normally-expected outcome of
> isFile/isDir, throwing in that case seems excessively heavy-handed.
> It's probably a case of bad API design.
> 

Agree. Returning false seems appropriate as something that doesn't exist is clearly not a file or directory.

-Steve
July 21, 2020
On Tuesday, 21 July 2020 at 17:32:39 UTC, H. S. Teoh wrote:
> On Tue, Jul 21, 2020 at 01:12:04PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> [...]
>
> Ahh I see.  That makes sense then.
>
> Which also begs the question, *why* does it even throw in the first place.  The non-existence of a file is a normally-expected outcome of isFile/isDir, throwing in that case seems excessively heavy-handed. It's probably a case of bad API design.

I would say it's a work in progress.
