June 07, 2019
On Friday, 7 June 2019 at 15:46:03 UTC, Gregor Mückl wrote:
> How would compilation even work with multiple modules per file? Wouldn't the compiler  have to parse all .d files in the whole search path then? That would be the opposite of faster compile times.

It doesn't have to search, because you pass the modules to the compiler explicitly, just as we do now in the general case.

The search path and filename conventions are - today - just conventions; there's no requirement that the filename and module name match, which means you don't actually know which module a file contains until it has been parsed.

(which is quick and simple btw)
June 07, 2019
> What *would* be really nice is if dmd could read .zip archives. Then all you need to use a library is to download the .zip into your source tree and run `dmd -i`.

Or use LZ4 instead: no dependency on zlib, and smaller code. Decompression is about 50 LOC, plus maybe 200 LOC for processing the "directory entries" inside the file. It's also roughly 10x faster than zip to decompress, since it works without the hash tables that LZW-style schemes need.
The downside: you can't just unzip the files with standard tools.
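
For scale, here is a sketch of what that ~50 LOC amounts to: a minimal decompressor for the raw LZ4 block format (no frame header, no checksums, and only basic bounds handling - an illustration, not a hardened implementation):

```d
// Decompress one raw LZ4 block into a caller-sized buffer.
ubyte[] lz4DecompressBlock(const(ubyte)[] src, size_t expectedSize)
{
    auto dst = new ubyte[](expectedSize);
    size_t si, di;
    while (si < src.length)
    {
        immutable token = src[si++];
        // Literal run length, with 255-byte extensions.
        size_t litLen = token >> 4;
        if (litLen == 15)
        {
            ubyte b;
            do { b = src[si++]; litLen += b; } while (b == 255);
        }
        dst[di .. di + litLen] = src[si .. si + litLen];
        si += litLen;
        di += litLen;
        if (si >= src.length)
            break; // the last sequence carries literals only
        // Match: 2-byte little-endian offset back into the output.
        immutable offset = src[si] | (src[si + 1] << 8);
        si += 2;
        size_t matchLen = (token & 0x0F) + 4; // minimum match is 4
        if ((token & 0x0F) == 15)
        {
            ubyte b;
            do { b = src[si++]; matchLen += b; } while (b == 255);
        }
        // Copy byte-by-byte: the match may overlap its own output.
        foreach (_; 0 .. matchLen)
        {
            dst[di] = dst[di - offset];
            ++di;
        }
    }
    return dst[0 .. di];
}
```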
June 07, 2019
On Friday, 7 June 2019 at 15:34:22 UTC, Mike Franklin wrote:
> On Friday, 7 June 2019 at 14:45:34 UTC, H. S. Teoh wrote:
>
>> The flip side of this coin is that having one file per module helps with code organization in typical use cases. As opposed to the chaotic situation in C/C++ where you can have one .h file with arbitrary numbers of .c/.cc files that implement the function prototypes, with no obvious mapping between them, or multiple .h files with a single amorphous messy .c/.cc file that implements everything. Or where there are arbitrary numbers of modules nested inside any number of .c/.cc files with no clear separation between them, and where filenames bear no relation with contents, making code maintenance challenging at best, outright nightmarish at worst.
>
> What I'm proposing is that a library's organization can be one file per module while it is being developed, but once it is published for consumption by others all packages/modules are concatenated into one file so it's faster for users to import.  Or you could even concatenate some packages/modules, but not others depending on how users typically interact with the library.
>
> There may be use cases (like mine with the memory-mapped IO registers) where users may want to actually develop with more than one module per file.  Or when one wants to share multi-package/module cut-and-pastable code examples. But those are special cases.
>
> What's nice is that it changes nothing for users today.  It just removes an arbitrary limitation to help improve import speed while enabling some more flexibility for those who may need it.
>
> Mike

Reading files is really cheap; evaluating templates and running CTFE isn't. That's why importing Phobos modules is slow - not because of the file I/O, but because of all the CTFE those imports trigger. This means we could, e.g.:

- improve CTFE performance
- cache templates over multiple compilations (there's a DMD PR for this)
- make imports lazy
- reduce all the big CTFE bottlenecks in Phobos (e.g. std.uni)
- convert more Phobos code into templates with local imports to reduce the baseline import overhead (see the sketch after this list)

Ordered from hardest to easiest.
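
A minimal sketch of that last idea, for a hypothetical library module `mylib`: since a template body is only analyzed when the template is instantiated, moving an import inside it defers that import's cost to the callers that actually use the symbol.

```d
module mylib;

// Eager: every importer of mylib would pay for std.uni up front.
// import std.uni;

auto normalized(S)(S s)
{
    // Lazy: std.uni is only pulled in when normalized!S is instantiated.
    import std.uni : toLower;
    return toLower(s);
}
```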

June 07, 2019
On 6/7/19 5:47 AM, Mike Franklin wrote:
> 
> Why not remove the arbitrary limitation that a module or package must be tied to a file or directory respectively?

We don't really have that limitation. The compiler gets the package/module name from the `module ...;` statement (if any) at the beginning of a *.d file. It's only when the `module` statement is absent that the package/module name is inferred from the filepath. (There *might* also be such a requirement when importing *but not compiling* a given module, but I'm not sure on that.)
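
For instance (with hypothetical file names), a file whose name bears no relation to its module compiles fine when passed to the compiler explicitly:

```d
// File: weird_name.d -- compile with: dmd -c weird_name.d
// The compiler takes the package/module identity from this declaration,
// not from the filename.
module mylib.core;

void hello() {}
```

Only resolving `import mylib.core;` through the search path relies on the name/path convention.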

Beyond that, any other requirement for packages/modules to match the filesystem is purely a convention relied upon by certain buildsystems, like rdmd (and now, `dmd -i`), and otherwise has nothing to do with the compiler.

TBH, I've always kinda wanted to just do away with the ability to have modules/packages that DON'T match the filesystem. I never saw any sensible use-cases for supporting such weird mismatches that couldn't already be accomplished via -I (not to be confused with the new -i) or version/static if.

HOWEVER, that said, you do bring up an interesting point I'd never thought of: If concatenating a bunch of modules/packages into one file would improve compile time, then that would certainly be a feature worth considering.
June 07, 2019
On 6/7/19 8:50 AM, KnightMare wrote:
> a zip archive lets you unpack the files in their original form.
> unpacking lets you see the source code.
> 

That's an interesting approach to the issue: Just allow a package to be either a directory tree OR a tar/tarball archive of a directory tree. Then, the language itself wouldn't need any provisions at all to support multiple packages in one file, and the compiler could still read an entire multi-module package by accessing only one file.
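
A rough sketch of the reading side, using zip for concreteness since Phobos already ships std.zip (the compiler has no such support today, and `readZippedPackage` is a hypothetical helper):

```d
import std.file : read;
import std.zip : ZipArchive;

// Map each .d source inside a zipped package to its contents, so a
// front end could load a whole multi-module package from one file.
string[string] readZippedPackage(string zipPath)
{
    auto archive = new ZipArchive(read(zipPath));
    string[string] sources;
    foreach (name, member; archive.directory)
    {
        if (name.length > 2 && name[$ - 2 .. $] == ".d")
            sources[name] = cast(string) archive.expand(member);
    }
    return sources;
}
```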

June 07, 2019
On Fri, Jun 07, 2019 at 03:48:53PM -0400, Nick Sabalausky (Abscissa) via Digitalmars-d wrote:
[...]
> TBH, I've always kinda wanted to just do away with the ability to have modules/packages that DON'T match the filesystem. I never saw any sensible use-cases for supporting such weird mismatches that couldn't already be accomplished via -I (not to be confused with the new -i) or version/static if.
> 
> HOWEVER, that said, you do bring up an interesting point I'd never thought of: If concatenating a bunch of modules/packages into one file would improve compile time, then that would certainly be a feature worth considering.

I honestly doubt it would improve compilation time that much. Reading files from the filesystem is pretty fast, compared to the rest of the stuff the compiler has to do afterwards.

If anything, it would *slow down* the process if you really only needed to import one module but it was concatenated with a whole bunch of others in a single file, thereby requiring the compiler to parse the whole thing just to find it. There's also the problem that if multiple -I's were given, and more than one of those paths contained concatenated modules, the compiler would potentially have to parse *everything* in *every import path* just to be sure it found the one import you asked for.

I don't know, it just seems like there are too many disadvantages to justify doing this.


T

-- 
Philosophy: how to make a career out of daydreaming.
June 07, 2019
>
> I honestly doubt it would improve compilation time that much. Reading files from the filesystem is pretty fast, compared to the rest of the stuff the compiler has to do afterwards.
>

The compiler needs to probe for existing files in the directory tree, testing patterns like "datetime.d*" against each folder given via -I.

I tried compiling (on Windows) the small program from today's issue https://issues.dlang.org/show_bug.cgi?id=19947 under Procmon.exe (a SysInternals tool that shows what a process is doing with the network, filesystem, registry, and processes/threads). The filesystem results alone:
3497 filesystem requests (create/open, query, close, read/write; libs and DLLs counted too)
768 (!) requests that simply came back "not found" (DLLs and libs counted too)

Second try with the "-c" option (compile only, no linking):
2693 filesystem requests
727 of them "not found"

So for a clean benchmark one would need to compare compiling some mid-sized program (without any dub packages, since they add more search paths, maybe factorially many) on an SSD versus a RAM disk. IMO we could win 1-2 seconds of compile time.

June 07, 2019
On Fri, Jun 07, 2019 at 09:09:48PM +0000, KnightMare via Digitalmars-d wrote: [...]
> I tried compiling (on Windows) the small program from today's issue
> https://issues.dlang.org/show_bug.cgi?id=19947 under Procmon.exe (a
> SysInternals tool that shows what a process is doing with the network,
> filesystem, registry, and processes/threads). The filesystem results alone:
> 3497 filesystem requests (create/open, query, close, read/write; libs
> and DLLs counted too)
> 768 (!) requests that simply came back "not found" (DLLs and libs counted too)
[...]

This is a known issue that has been discussed before.  The proposed solution was to cache the contents of each directory in the import path (probably lazily, so that we don't incur up-front costs) so that the compiler can subsequently find a module pathname with just a single hash lookup.
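
A rough sketch of what such a cache could look like (an assumed shape, not DMD's actual code):

```d
import std.file : dirEntries, SpanMode;
import std.path : baseName;

struct ImportPathCache
{
    // import path -> (filename -> full path)
    string[string][string] cache;

    // Returns the full path of filename under dir, or null if absent.
    string lookup(string dir, string filename)
    {
        auto listing = dir in cache;
        if (listing is null)
        {
            // First access: scan the directory once (lazy population),
            // so unused import paths never incur the cost.
            string[string] entries;
            foreach (e; dirEntries(dir, SpanMode.shallow))
                entries[baseName(e.name)] = e.name;
            cache[dir] = entries;
            listing = dir in cache;
        }
        auto hit = filename in *listing;
        return hit ? *hit : null;
    }
}
```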

I don't know if anyone set about implementing this, though.


T

-- 
Error: Keyboard not attached. Press F1 to continue. -- Yoon Ha Lee, CONLANG
June 07, 2019
On Friday, 7 June 2019 at 19:56:27 UTC, Nick Sabalausky (Abscissa) wrote:
> On 6/7/19 8:50 AM, KnightMare wrote:
>> a zip archive lets you unpack the files in their original form.
>> unpacking lets you see the source code.
>> 
>
> That's an interesting approach to the issue: Just allow a package to be either a directory tree OR a tar/tarball archive of a directory tree. Then, the language itself wouldn't need any provisions at all to support multiple packages in one file, and the compiler could still read an entire multi-module package by accessing only one file.

Isn't that what Java does? A jar file is nothing more than a zip file.
June 08, 2019
On Friday, 7 June 2019 at 16:38:56 UTC, Seb wrote:

> Reading files is really cheap, evaluating templates and running CTFE isn't. That's why importing Phobos modules is slow - not because of the files it imports, but because of all the CTFE these imports trigger.

Yes, that makes much more sense to me. But if that's the case, why all the concern from Walter and Andrei, expressed in this thread and in the conversations linked below?

https://forum.dlang.org/post/q7dpmg$29oq$1@digitalmars.com
https://github.com/dlang/druntime/pull/2634#issuecomment-499494019
https://github.com/dlang/druntime/pull/2222#issuecomment-398390889

Mike