January 19, 2019 Speeding up importing Phobos files | ||||
---|---|---|---|---|
| ||||
Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.

So looking up fewer files would make it faster.

Here's the idea: Place all Phobos source files into a single zip file, call it phobos.zip (duh). Then,

dmd myfile.d phobos.zip

and the compiler will look in phobos.zip to resolve, say, std/stdio.d. If phobos.zip is opened as a memory mapped file, whenever std/stdio.d is read, the file will be "faulted" into memory rather than doing a file lookup / read. We're speculating that this should be significantly faster, besides being very convenient for the user to treat Phobos as a single file rather than a blizzard. (phobos.lib could also be in the same file!)

It doesn't have to be just phobos, this can be a general facility. People can distribute their D libraries as a zip file that never needs unzipping.

We already have https://dlang.org/phobos/std_zip.html to do the dirty work. We can experiment to see if compressed zips are faster than uncompressed ones.

This can be a fun challenge! Anyone up for it?

P.S. dmd's ability to directly manipulate object library files, rather than going through lib or ar, has been a nice success.
|
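Editor's sketch of the proposal above, in Python rather than D and not taken from dmd: a module name is mapped to a member path inside one archive, and the source is read from that archive instead of being searched for on disk. The helper names (`module_to_member`, `resolve_import`, `build_demo_archive`) and the archive contents are made up for illustration, and the archive is held in an in-memory buffer here rather than memory mapped.

```python
import io
import zipfile


def module_to_member(module_name):
    """Map a D module name such as 'std.stdio' to a path inside the archive."""
    return module_name.replace(".", "/") + ".d"


def build_demo_archive():
    """Create a tiny stand-in for phobos.zip in memory (contents are made up)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        archive.writestr("std/stdio.d", "module std.stdio;\n")
        archive.writestr("std/range/package.d", "module std.range;\n")
    buffer.seek(0)
    return buffer


def resolve_import(archive, module_name):
    """Return the module's source text, or None if the archive lacks it."""
    plain_member = module_to_member(module_name)
    package_member = module_name.replace(".", "/") + "/package.d"
    for candidate in (plain_member, package_member):
        try:
            # ZipFile.read raises KeyError when the member does not exist.
            return archive.read(candidate).decode("utf-8")
        except KeyError:
            continue
    return None


if __name__ == "__main__":
    with zipfile.ZipFile(build_demo_archive()) as demo:
        print(resolve_import(demo, "std.stdio"))
        print(resolve_import(demo, "std.missing"))
```

The archive's central directory is parsed once when it is opened, so each later import costs a dictionary lookup plus a read from the single open file. Whether that beats per-file lookups in practice is exactly what the thread proposes to measure.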
January 19, 2019 Re: Speeding up importing Phobos files | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.
>
> [...]
If we are going there, we might as well use a proper database as the compiler cache format,
similar to pre-compiled headers.
I'd be interested to see how this bears out.
I am going to devote some time this weekend to it,
but using sqlite rather than zip.
|
January 19, 2019 Re: Speeding up importing Phobos files | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.
>
> So looking up fewer files would make it faster.
>
> Here's the idea: Place all Phobos source files into a single zip file, call it phobos.zip (duh). Then,
>
> dmd myfile.d phobos.zip
>
> and the compiler will look in phobos.zip to resolve, say, std/stdio.d. If phobos.zip is opened as a memory mapped file, whenever std/stdio.d is read, the file will be "faulted" into memory rather than doing a file lookup / read. We're speculating that this should be significantly faster, besides being very convenient for the user to treat Phobos as a single file rather than a blizzard. (phobos.lib could also be in the same file!)
>
> It doesn't have to be just phobos, this can be a general facility. People can distribute their D libraries as a zip file that never needs unzipping.
>
> We already have https://dlang.org/phobos/std_zip.html to do the dirty work. We can experiment to see if compressed zips are faster than uncompressed ones.
>
> This can be a fun challenge! Anyone up for it?
>
> P.S. dmd's ability to directly manipulate object library files, rather than going through lib or ar, has been a nice success.
C'mon, everyone has a SSD, OS tends to cache previously opened files. What's the problem ?
Better speedup compilation speed.
|
January 19, 2019 Re: Speeding up importing Phobos files | ||||
---|---|---|---|---|
| ||||
Posted in reply to Temtaime | On 1/19/2019 1:00 AM, Temtaime wrote:
> C'mon, everyone has a SSD, OS tends to cache previously opened files. What's the problem ?
> Better speedup compilation speed.
You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.
|
January 19, 2019 Re: Speeding up importing Phobos files | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 19 January 2019 at 09:08:00 UTC, Walter Bright wrote:
> On 1/19/2019 1:00 AM, Temtaime wrote:
>> C'mon, everyone has a SSD, OS tends to cache previously opened files. What's the problem ?
>> Better speedup compilation speed.
>
> You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.
If you've benchmarked this, could you please post your benchmark source so people can reproduce it? Probably be good to gather data from more than one PC. Maybe make a minisurvey for the results.
|
January 19, 2019 Re: Speeding up importing Phobos files | ||||
---|---|---|---|---|
| ||||
Posted in reply to FeepingCreature | On 1/19/2019 1:12 AM, FeepingCreature wrote:
> If you've benchmarked this, could you please post your benchmark source so people can reproduce it?
I benchmarked it while developing Warp (the C preprocessor replacement I did for Facebook). I was able to speed up searches for .h files substantially by remembering previous lookups in a hash table. The speedup persisted across Windows and Linux.
https://github.com/facebookarchive/warp
> Probably be good to gather data from more than one PC. Maybe make a minisurvey for the results.
Sounds like a good idea. Please take charge of this!
|
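Editor's sketch of "remembering previous lookups in a hash table", in Python and not derived from Warp's source: the first search for a name probes each directory on the search path, and every later request for the same name is answered from a dictionary. The class name `ImportPathCache` and the `disk_probes` counter are hypothetical, added only to make the saving visible.

```python
import os
import tempfile


class ImportPathCache:
    """Remember where each name was found so repeat lookups skip the disk."""

    def __init__(self, search_dirs):
        self.search_dirs = list(search_dirs)
        self.found = {}       # relative name -> full path, or None if absent
        self.disk_probes = 0  # how many times the file system was asked

    def lookup(self, relative_name):
        if relative_name in self.found:
            return self.found[relative_name]
        result = None
        for directory in self.search_dirs:
            candidate = os.path.join(directory, relative_name)
            self.disk_probes += 1
            if os.path.isfile(candidate):
                result = candidate
                break
        # Misses are cached too, so a failed search is not repeated.
        self.found[relative_name] = result
        return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        open(os.path.join(second, "stdio.h"), "w").close()
        cache = ImportPathCache([first, second])
        cache.lookup("stdio.h")
        cache.lookup("stdio.h")
        print("disk probes:", cache.disk_probes)
```

Caching misses as well as hits matters on long search paths, where a name that exists only in the last directory (or nowhere) would otherwise cost a probe per directory on every request.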
January 19, 2019 Re: Speeding up importing Phobos files | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> We already have https://dlang.org/phobos/std_zip.html to do the dirty work. We can experiment to see if compressed zips are faster than uncompressed ones.
BTW, Firefox uses the fast compression option, indicated by general purpose flags 2; std.zip uses the default compression option.
|
January 19, 2019 Re: Speeding up importing Phobos files | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.
>
> So looking up fewer files would make it faster.
It sounds rather strange that file lookup is a problem on modern operating systems, which cache the files themselves and, even more reliably, the metadata (the VFS directory cache, for example).
Calls to something like glob(3) could load the whole Phobos directory tree into memory in milliseconds.
|
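Editor's sketch of the "read the whole tree up front" suggestion above, in Python. It uses `os.walk` rather than glob(3), and the function name `index_tree` is made up: the directory tree is walked once and every `.d` file is recorded in a dictionary, so later import resolution needs no further directory searches.

```python
import os
import tempfile


def index_tree(root):
    """Walk root once and map each relative .d path to its full path."""
    index = {}
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            if name.endswith(".d"):
                full = os.path.join(directory, name)
                # Use forward slashes so keys look the same on every platform.
                relative = os.path.relpath(full, root).replace(os.sep, "/")
                index[relative] = full
    return index


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "std"))
        open(os.path.join(root, "std", "stdio.d"), "w").close()
        open(os.path.join(root, "std", "notes.txt"), "w").close()
        print(sorted(index_tree(root)))
```

This trades one full traversal at startup for cheap lookups afterwards. It pays off only if the traversal costs less than the individual searches it replaces, which is the open question in this exchange.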
January 19, 2019 Re: Speeding up importing Phobos files | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Sat, Jan 19, 2019 at 08:59:37AM +0000, Stefan Koch via Digitalmars-d wrote:
> On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> > Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.
> >
> > [...]
>
> If we are going there we might as well use a proper database as the
> compiler cache format.
> similar to pre-compiled headers.
[...]
I'd like to see us go in this direction. It could lead to other new things, like the compiler inferring attributes for all functions (not just template / auto functions) and storing the inferred attributes in the precompiled cache. It could even store additional derived information not representable in the source that could be used for program-wide optimization, etc.
T
--
Let's call it an accidental feature. -- Larry Wall
|
January 19, 2019 Re: Speeding up importing Phobos files | ||||
---|---|---|---|---|
| ||||
Posted in reply to FeepingCreature | On 1/19/19 4:12 AM, FeepingCreature wrote:
> On Saturday, 19 January 2019 at 09:08:00 UTC, Walter Bright wrote:
>> On 1/19/2019 1:00 AM, Temtaime wrote:
>>> C'mon, everyone has a SSD, OS tends to cache previously opened files. What's the problem ?
>>> Better speedup compilation speed.
>>
>> You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.
>
> If you've benchmarked this, could you please post your benchmark source so people can reproduce it? Probably be good to gather data from more than one PC. Maybe make a minisurvey for the results.
I've done a bunch of measurements while I was working on https://github.com/dlang/DIPs/blob/master/DIPs/DIP1005.md, on a modern machine with SSD and Linux (which aggressively caches file contents). I don't think I still have the code, but it shouldn't be difficult to sit down and produce some.
The overall conclusion of those experiments was that if you want to improve compilation speed, you need to minimize the number of files opened; once opened, whether it was 1 KB or 100 KB made virtually no difference. One thing I didn't measure was whether opening the file accounted for most of the overhead, or whether closing also had a large share.
|
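Editor's sketch of the kind of measurement described above, in Python. It is not the original benchmark, which the post says is no longer available. It reads the same number of bytes either from many small files or from one combined file and times both. The file counts, sizes, and names are arbitrary assumptions, and because the files have just been written they are almost certainly in the OS cache, so this measures only the warm-cache case.

```python
import os
import tempfile
import time


def write_files(directory, prefix, count, size_each):
    """Create `count` files of `size_each` bytes and return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"{prefix}{i}.d")
        with open(path, "wb") as handle:
            handle.write(b"x" * size_each)
        paths.append(path)
    return paths


def timed_read(paths):
    """Open, read, and close every path; return (seconds, total bytes read)."""
    start = time.perf_counter()
    total = 0
    for path in paths:
        with open(path, "rb") as handle:
            total += len(handle.read())
    return time.perf_counter() - start, total


def compare(count=200, size_each=1024):
    """Time many small files against one file holding the same byte count."""
    with tempfile.TemporaryDirectory() as directory:
        many = write_files(directory, "small", count, size_each)
        one = write_files(directory, "combined", 1, count * size_each)
        return {"many": timed_read(many), "one": timed_read(one)}


if __name__ == "__main__":
    for label, (seconds, size) in compare().items():
        print(f"{label}: {size} bytes in {seconds * 1e3:.3f} ms")
```

Timings from a single run are noisy. Anyone collecting numbers for the proposed survey would want to repeat each case several times, and separate the time spent in open and close from the time spent in read, as the post suggests.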