Speeding up importing Phobos files
Jan 19, 2019
Walter Bright
January 19, 2019
Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.

So looking up fewer files would make it faster.

Here's the idea: Place all Phobos source files into a single zip file, call it phobos.zip (duh). Then,

    dmd myfile.d phobos.zip

and the compiler will look in phobos.zip to resolve, say, std/stdio.d. If phobos.zip is opened as a memory mapped file, whenever std/stdio.d is read, the file will be "faulted" into memory rather than doing a file lookup / read. We're speculating that this should be significantly faster, besides being very convenient for the user to treat Phobos as a single file rather than a blizzard. (phobos.lib could also be in the same file!)
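
To make the memory-mapping idea concrete, here is a minimal sketch in Python rather than D. It is not how dmd works or would work; the class name `MappedZipSource` is hypothetical, and it assumes the archive stores its entries uncompressed. The central directory is parsed once into a dictionary, and each module is then returned as a slice of the mapped file, so the OS only pages in the parts that are actually touched.

```python
import mmap
import struct
import zipfile


class MappedZipSource:
    """Hypothetical helper: resolve names like "std/stdio.d" in a mapped zip."""

    def __init__(self, path):
        # Parse the central directory once; it becomes the in-memory lookup table.
        with zipfile.ZipFile(path) as zf:
            self._index = {info.filename: info for info in zf.infolist()}
        self._file = open(path, "rb")
        # Map the whole archive read-only; pages are loaded only when touched.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def read(self, module_path):
        """Return the module's bytes, or None if the archive does not contain it."""
        info = self._index.get(module_path)
        if info is None:
            return None
        if info.compress_type != zipfile.ZIP_STORED:
            raise ValueError("this sketch only handles uncompressed entries")
        # A local file header has 30 fixed bytes; the name and extra-field
        # lengths sit at offsets 26 and 28, and the data follows both fields.
        start = info.header_offset
        name_len, extra_len = struct.unpack_from("<HH", self._map, start + 26)
        data_start = start + 30 + name_len + extra_len
        return self._map[data_start:data_start + info.file_size]

    def close(self):
        self._map.close()
        self._file.close()
```

With an archive built from the Phobos tree, `MappedZipSource("phobos.zip").read("std/stdio.d")` would return the source bytes without any per-file open or directory search. Whether that is actually faster is exactly what would need measuring.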

It doesn't have to be just phobos, this can be a general facility. People can distribute their D libraries as a zip file that never needs unzipping.

We already have https://dlang.org/phobos/std_zip.html to do the dirty work. We can experiment to see if compressed zips are faster than uncompressed ones.
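
The compressed-versus-uncompressed experiment could start from something like the sketch below. It is written in Python instead of std.zip, the helper names `build_zip` and `time_full_read` are made up for illustration, and because the archives live in memory it only measures parsing and decompression cost, not disk or filesystem effects.

```python
import io
import time
import zipfile


def build_zip(sources, compression):
    """Pack a {name: text} mapping into an in-memory zip and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression) as zf:
        for name, text in sources.items():
            zf.writestr(name, text)
    return buf.getvalue()


def time_full_read(archive_bytes, repeats=20):
    """Return the best time (seconds) to open the archive and read every member."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
            for name in zf.namelist():
                zf.read(name)
        best = min(best, time.perf_counter() - start)
    return best
```

Calling `build_zip` once with `zipfile.ZIP_STORED` and once with `zipfile.ZIP_DEFLATED` on the same sources, then passing each result to `time_full_read`, gives two numbers to compare. A fair test would also read the archives from disk with a cold cache.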

This can be a fun challenge! Anyone up for it?

P.S. dmd's ability to directly manipulate object library files, rather than going through lib or ar, has been a nice success.
January 19, 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.
>
> [...]

If we are going there, we might as well use a proper database as the compiler cache format,
similar to pre-compiled headers.

I'd be interested to see how this bears out.
I am going to devote some time this weekend to it.

But using sqlite rather than zip.

January 19, 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.
>
> So looking up fewer files would make it faster.
>
> Here's the idea: Place all Phobos source files into a single zip file, call it phobos.zip (duh). Then,
>
>     dmd myfile.d phobos.zip
>
> and the compiler will look in phobos.zip to resolve, say, std/stdio.d. If phobos.zip is opened as a memory mapped file, whenever std/stdio.d is read, the file will be "faulted" into memory rather than doing a file lookup / read. We're speculating that this should be significantly faster, besides being very convenient for the user to treat Phobos as a single file rather than a blizzard. (phobos.lib could also be in the same file!)
>
> It doesn't have to be just phobos, this can be a general facility. People can distribute their D libraries as a zip file that never needs unzipping.
>
> We already have https://dlang.org/phobos/std_zip.html to do the dirty work. We can experiment to see if compressed zips are faster than uncompressed ones.
>
> This can be a fun challenge! Anyone up for it?
>
> P.S. dmd's ability to directly manipulate object library files, rather than going through lib or ar, has been a nice success.

C'mon, everyone has an SSD, and the OS tends to cache previously opened files. What's the problem?
Better to speed up compilation speed.
January 19, 2019
On 1/19/2019 1:00 AM, Temtaime wrote:
> C'mon, everyone has an SSD, and the OS tends to cache previously opened files. What's the problem?
> Better to speed up compilation speed.

You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.
January 19, 2019
On Saturday, 19 January 2019 at 09:08:00 UTC, Walter Bright wrote:
> On 1/19/2019 1:00 AM, Temtaime wrote:
>> C'mon, everyone has an SSD, and the OS tends to cache previously opened files. What's the problem?
>> Better to speed up compilation speed.
>
> You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.

If you've benchmarked this, could you please post your benchmark source so people can reproduce it? Probably be good to gather data from more than one PC. Maybe make a minisurvey for the results.
January 19, 2019
On 1/19/2019 1:12 AM, FeepingCreature wrote:
> If you've benchmarked this, could you please post your benchmark source so people can reproduce it?

I benchmarked it while developing Warp (the C preprocessor replacement I did for Facebook). I was able to speed up searches for .h files substantially by remembering previous lookups in a hash table. The speedup persisted across Windows and Linux.
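
For readers who want to see the shape of that trick, here is a rough sketch in Python. It is not Warp's code; the class name `ImportPathCache` and the `probes` counter are invented for illustration. Each name is searched along the path list once, and the answer (including "not found") is remembered in a dictionary so repeated lookups never touch the filesystem again.

```python
import os


class ImportPathCache:
    """Search a list of directories for a file, remembering every answer."""

    def __init__(self, search_dirs):
        self._search_dirs = list(search_dirs)
        self._cache = {}   # relative name -> resolved path, or None if absent
        self.probes = 0    # number of filesystem checks actually performed

    def find(self, relative_name):
        """Return the first matching path along the search list, or None."""
        if relative_name in self._cache:
            return self._cache[relative_name]
        found = None
        for directory in self._search_dirs:
            candidate = os.path.join(directory, relative_name)
            self.probes += 1
            if os.path.isfile(candidate):
                found = candidate
                break
        # Misses are cached too, so a missing header is searched for only once.
        self._cache[relative_name] = found
        return found
```

The `probes` counter makes it easy to confirm that a second `find` for the same name costs no further filesystem checks.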

https://github.com/facebookarchive/warp


> Probably be good to gather data from more than one PC. Maybe make a minisurvey for the results.

Sounds like a good idea. Please take charge of this!
January 19, 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> We already have https://dlang.org/phobos/std_zip.html to do the dirty work. We can experiment to see if compressed zips are faster than uncompressed ones.

BTW, Firefox uses the fast compression option (indicated by general purpose flags 2), while std.zip uses the default compression option.
January 19, 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.
>
> So looking up fewer files would make it faster.

It sounds rather strange that file lookup would be a problem on modern operating systems, which cache the files themselves and, even more reliably, their metadata (the VFS directory cache, for example).

Calls to something like glob(3) could load the whole Phobos directory tree into memory in milliseconds.
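
A minimal sketch of that approach, in Python and using `os.walk` in place of glob(3): walk the tree once, keep a dictionary from relative path to full path, and answer every later lookup from memory. The function name `index_tree` is made up, and I have not timed it against a real Phobos checkout.

```python
import os


def index_tree(root):
    """Walk root once; map each relative path ('/' separators) to its full path."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            index[rel] = full
    return index
```

After `index = index_tree(phobos_root)`, resolving `std/stdio.d` is a single dictionary lookup, `index.get("std/stdio.d")`, instead of a search along the import path.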
January 19, 2019
On Sat, Jan 19, 2019 at 08:59:37AM +0000, Stefan Koch via Digitalmars-d wrote:
> On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> > Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.
> > 
> > [...]
> 
> If we are going there we might as well use a proper database as the
> compiler cache format.
> similar to pre-compiled headers.
[...]

I'd like to see us go in this direction. It could lead to other new things, like the compiler inferring attributes for all functions (not just template / auto functions) and storing the inferred attributes in the precompiled cache. It could even store additional derived information not representable in the source that could be used for program-wide optimization, etc.


T

-- 
Let's call it an accidental feature. -- Larry Wall
January 19, 2019
On 1/19/19 4:12 AM, FeepingCreature wrote:
> On Saturday, 19 January 2019 at 09:08:00 UTC, Walter Bright wrote:
>> On 1/19/2019 1:00 AM, Temtaime wrote:
>>> C'mon, everyone has an SSD, and the OS tends to cache previously opened files. What's the problem?
>>> Better to speed up compilation speed.
>>
>> You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.
> 
> If you've benchmarked this, could you please post your benchmark source so people can reproduce it? Probably be good to gather data from more than one PC. Maybe make a minisurvey for the results.

I've done a bunch of measurements while I was working on https://github.com/dlang/DIPs/blob/master/DIPs/DIP1005.md, on a modern machine with SSD and Linux (which aggressively caches file contents). I don't think I still have the code, but it shouldn't be difficult to sit down and produce some. The overall conclusion of those experiments was that if you want to improve compilation speed, you need to minimize the number of files opened; once opened, whether it was 1 KB or 100 KB made virtually no difference.

One thing I didn't measure was whether opening the file accounted for most of the overhead, or whether closing also had a large share.
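
Since the original measurement code is gone, here is a rough Python sketch of how such a benchmark might look. It is not the code behind the numbers above, and the helper names `write_files` and `time_reads` are made up. It times open, read and close separately, which would also speak to the open-versus-close question.

```python
import os
import time


def write_files(directory, count, size):
    """Create `count` files of `size` bytes each and return their paths."""
    payload = b"x" * size
    paths = []
    for i in range(count):
        path = os.path.join(directory, "f%05d.d" % i)
        with open(path, "wb") as f:
            f.write(payload)
        paths.append(path)
    return paths


def time_reads(paths):
    """Return per-phase totals in seconds and the number of bytes read."""
    totals = {"open": 0.0, "read": 0.0, "close": 0.0}
    nbytes = 0
    flags = os.O_RDONLY | getattr(os, "O_BINARY", 0)  # O_BINARY exists on Windows only
    for path in paths:
        t0 = time.perf_counter()
        fd = os.open(path, flags)
        t1 = time.perf_counter()
        while True:
            chunk = os.read(fd, 1 << 16)
            if not chunk:
                break
            nbytes += len(chunk)
        t2 = time.perf_counter()
        os.close(fd)
        t3 = time.perf_counter()
        totals["open"] += t1 - t0
        totals["read"] += t2 - t1
        totals["close"] += t3 - t2
    return totals, nbytes
```

One way to use it: compare many small files against a few large ones holding the same total number of bytes, for example 1000 files of 1 KB versus 10 files of 100 KB, and run each case with both a warm and a cold file cache.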