January 22, 2019
On Monday, 21 January 2019 at 21:38:21 UTC, Steven Schveighoffer wrote:
> One note -- I don't think modules like std.datetime were split up for the sake of the compiler parsing speed

Yeah, I think the std.datetime split was about the size of its unittest builds, but std.range, as I recall at least, was split specifically to get quicker builds: common cases can import std.range.primitives, which doesn't need to bring in as much code. The local import pattern also helps with this goal - lazy imports that only pull in what you need, when you need it, so the compiler doesn't have to do as much work.
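For illustration, the local import pattern looks like this (a minimal sketch; the function and the imported module are just examples):

---
// Local (scoped) import: the dependency is confined to this
// function, so (especially in templates) the compiler only
// processes it when the function is actually needed.
string describe(int x)
{
    import std.format : format;
    return format("x = %s", x);
}
---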

Maybe I am wrong about that, but still, two files that import each other aren't really two modules. Phobos HAS been making a LOT of progress toward untangling that import mess and addressing specific compile-time problems. A few years ago, any Phobos import would cost you about half a second. It is down to a quarter second for the hello world example. Which is IMO still quite poor, but a lot better than it was.

But is this caused by finding the files?

$ cd dmd2/src/phobos
$ time find .
real    0m0.003s

I find that very hard to believe.

And let us remember, the old D1 Phobos wasn't this slow:

$ cat hi.d
import std.stdio;

void main() {
        writefln("Hello!");
}

$ time dmd-1.0 hi.d

real    0m0.042s
user    0m0.032s
sys     0m0.005s

$ time dmd hi.d

real    0m0.434s
user    0m0.383s
sys     0m0.044s



Using the old D compilers reminds me what quick compiles REALLY are. Sigh.
January 22, 2019
BTW

dmd2/src/phobos$ time cat `find . | grep -E '\.d$'` > catted.d

real    0m0.015s
user    0m0.006s
sys     0m0.009s

$ wc catted.d
  319707  1173911 10889167 catted.d


If the filesystem were at fault, shouldn't that I/O-heavy operation take a significant portion of the dmd runtime?

Yes, I know the kernel is caching these things and deferring writes and so on. But it does that for dmd too! Blaming the filesystem doesn't pass the prima facie test, at least on Linux. Maybe Windows is different, I will try that tomorrow, but I remain exceedingly skeptical.
January 22, 2019
On Tue, 22 Jan 2019 00:46:32 +0000, Arun Chandrasekaran wrote:
> If you still think the file read is the culprit, why does recompilation take the same amount of time as the first compilation (albeit with the kernel file cache warm)?

And another quick way to test this would be to use import() and a hard-coded switch statement instead of file I/O. Just get rid of all disk access and see how fast the compile gets. I'm betting you'll save 10% at most.
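Something like this, say (a rough sketch of the experiment, not dmd's actual lookup code; the file names are just examples, and import() needs the sources reachable via a -J search path):

---
// Serve module sources from memory instead of the filesystem.
// import("...") embeds a file's contents into the binary at
// compile time, so no disk access happens at lookup time.
string moduleSource(string name)
{
    switch (name)
    {
        case "stdio.d": return import("stdio.d");
        case "range.d": return import("range.d");
        default: assert(0, "unknown module: " ~ name);
    }
}
---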
June 07, 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.

The topic of import speed when an implementation is spread over multiple files came up again on a PR I'm working on, so I wanted to share an idea I had.

Why not remove the arbitrary limitation that a module or package must be tied to a file or directory, respectively?  That is, within one file you could have something like this:

---
package thePackage
{
    module thePackage.module1
    {
        ...
    }

    module thePackage.module2
    {
        ...
    }
}

package thePackage2
{
    ...
}
---

Then when one distributes their library, they could concatenate all the files into one, potentially reducing the overhead of querying the filesystem.  After loading the file, the entire library would essentially be cached in memory.
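Consumer code presumably wouldn't change at all; the compiler would simply resolve the module from inside the concatenated file:

---
// This import would work the same whether thePackage.module1
// lives in its own file or inside a concatenated library file.
import thePackage.module1;
---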

It would also enable a number of other useful patterns.  For example, I currently use the following pattern for creating register maps for memory-mapped IO:

---
final abstract class MyRegisterBank
{
    final abstract class MyRegister
    {
       //static properties for bit-fields
    }
}
// See https://github.com/JinShil/stm32f42_discovery_demo/blob/master/source/stm32f42/spi.d for the real thing
---

I could avoid doing silly things like that and use modules in a single file instead, which would be a more appropriate model for that use case.  I currently don't use modules simply because I'd end up with hundreds of files to manage for a single MCU.
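For instance, with the hypothetical syntax above, a register bank could become modules in a single file (the module names here are just illustrative):

---
package stm32f42
{
    module stm32f42.spi
    {
        // static properties for SPI register bit-fields
    }

    module stm32f42.gpio
    {
        // static properties for GPIO register bit-fields
    }
}
---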

Another trivial use case would be the ability to test multiple packages and modules on run.dlang.io without the need for any additional infrastructure.

I'm sure the creativity of this community would find other interesting use cases.

It seems like an arbitrary limitation to have the package/module system hard-coded to the underlying storage technology; removing that limitation may help with import speed while also enabling a few interesting use cases.

Mike


June 07, 2019
A zip archive lets you unpack the files in their original form, so unpacking gives you back readable source code.

Your version - joining the sources into one file - is more complicated:
// file pack/a.d
module pack.one;
//...

// file pack/b.d
module pack.one;
//...

// your package
package pack
{
    module one {
        // both files merged here?
    }

    @file( "pack/b.d" ); // ???
    module one {
        // or kept separate?
    }
    // how do you restore the individual files? you'd need some special comment or attribute
}
June 07, 2019
On Friday, 7 June 2019 at 12:50:27 UTC, KnightMare wrote:
> zip-archive allows you to unpack the file in its original form.
> unpacking allows to see source code.
>

You can unzip without any special tools - OSes can usually handle zip files natively. So, yes, stage 1 - using a ZIP as the one file - is fine.

Stage 2: pack the sources into one file as an AST.
Pros:
- no unpacking stage from mapped files; all string spans can be stored as string-table indices, which allows packing 2x-3x smaller (see the sketch below)
- no parsing/verifying stage; you already have a checked AST.
Cons:
- you need special tools (or commands to the compiler) to unpack the source files.
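For what it's worth, a minimal sketch of the string-table idea (the names and layout are my own illustration, not a proposed on-disk format):

---
// Each distinct string is stored once; spans elsewhere in the
// packed file are replaced by an index into this table.
struct StringTable
{
    string[] entries;       // distinct strings, stored once
    size_t[string] indexOf; // string -> index lookup

    size_t intern(string s)
    {
        if (auto p = s in indexOf)
            return *p;
        indexOf[s] = entries.length;
        entries ~= s;
        return entries.length - 1;
    }
}

unittest
{
    StringTable tab;
    assert(tab.intern("foo") == 0);
    assert(tab.intern("bar") == 1);
    assert(tab.intern("foo") == 0); // duplicates cost nothing extra
}
---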
June 07, 2019
On Fri, Jun 07, 2019 at 09:47:34AM +0000, Mike Franklin via Digitalmars-d wrote: [...]
> It would also help with a number of other helpful patterns.  For example, I currently use the following pattern for creating register maps of memory-mapped IO:
> 
> ---
> final abstract class MyRegisterBank
> {
>     final abstract class MyRegister
>     {
>        //static properties for bit-fields
>     }
> }
> // See https://github.com/JinShil/stm32f42_discovery_demo/blob/master/source/stm32f42/spi.d
> for the real thing
> ---

Why final abstract class? If all you have are static properties, you could use structs instead.
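Something like this, for instance (a minimal sketch; @disable this() blocks instantiation much like final abstract does):

---
// Struct alternative to the final abstract class pattern.
struct MyRegisterBank
{
    @disable this(); // prevent instantiation

    struct MyRegister
    {
        @disable this();
        // static properties for bit-fields go here
    }
}
---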

Though of course, it still doesn't really address your fundamental concerns here.


[...]
> It seems like an arbitrary limitation to have the package/module system hard-coded to the underlying storage technology, and removing that limitation may help with the import speed while also enabling a few interesting use cases.
[...]

The flip side of this coin is that having one file per module helps with code organization in typical use cases. As opposed to the chaotic situation in C/C++ where you can have one .h file with arbitrary numbers of .c/.cc files that implement the function prototypes, with no obvious mapping between them, or multiple .h files with a single amorphous messy .c/.cc file that implements everything. Or where there are arbitrary numbers of modules nested inside any number of .c/.cc files with no clear separation between them, and where filenames bear no relation with contents, making code maintenance challenging at best, outright nightmarish at worst.


T

-- 
Real men don't take backups. They put their source on a public FTP-server and let the world mirror it. -- Linus Torvalds
June 07, 2019
On Friday, 7 June 2019 at 14:45:34 UTC, H. S. Teoh wrote:

> The flip side of this coin is that having one file per module helps with code organization in typical use cases. As opposed to the chaotic situation in C/C++ where you can have one .h file with arbitrary numbers of .c/.cc files that implement the function prototypes, with no obvious mapping between them, or multiple .h files with a single amorphous messy .c/.cc file that implements everything. Or where there are arbitrary numbers of modules nested inside any number of .c/.cc files with no clear separation between them, and where filenames bear no relation with contents, making code maintenance challenging at best, outright nightmarish at worst.

What I'm proposing is that a library's organization can be one file per module while it is being developed, but once it is published for consumption by others all packages/modules are concatenated into one file so it's faster for users to import.  Or you could even concatenate some packages/modules, but not others depending on how users typically interact with the library.

There may be use cases (like mine with the memory-mapped IO registers) where users may want to actually develop with more than one module per file.  Or when one wants to share multi-package/module cut-and-pastable code examples. But those are special cases.

What's nice is that it changes nothing for users today.  It just removes an arbitrary limitation to help improve import speed while enabling some more flexibility for those who may need it.

Mike

June 07, 2019
On Friday, 7 June 2019 at 15:34:22 UTC, Mike Franklin wrote:
> What I'm proposing is that a library's organization can be one file per module while it is being developed, but once it is published for consumption by others all packages/modules are concatenated into one file so it's faster for users to import.  Or you could even concatenate some packages/modules, but not others depending on how users typically interact with the library.
>
> There may be use cases (like mine with the memory-mapped IO registers) where users may want to actually develop with more than one module per file.  Or when one wants to share multi-package/module cut-and-pastable code examples. But those are special cases.
>
> What's nice is that it changes nothing for users today.  It just removes an arbitrary limitation to help improve import speed while enabling some more flexibility for those who may need it.
>
> Mike

How would compilation even work with multiple modules per file? Wouldn't the compiler have to parse all .d files in the whole search path then? That would be the opposite of faster compile times.
June 07, 2019
On Fri, Jun 07, 2019 at 03:34:22PM +0000, Mike Franklin via Digitalmars-d wrote: [...]
> What I'm proposing is that a library's organization can be one file per module while it is being developed, but once it is published for consumption by others all packages/modules are concatenated into one file so it's faster for users to import.  Or you could even concatenate some packages/modules, but not others depending on how users typically interact with the library.
[...]

What *would* be really nice is if dmd could read .zip archives. Then all you need to use a library is to download the .zip into your source tree and run `dmd -i`.
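Hypothetically, the workflow would be just this (the URL is made up, and dmd can't actually read zips today):

$ cd myproject/source
$ wget https://example.com/somelib.zip
$ dmd -i app.d    # imports would resolve from inside the zip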


T

-- 
I'm still trying to find a pun for "punishment"...