June 22, 2013 proposal: lazy compilation model for compiling binaries | ||||
---|---|---|---|---|
| ||||
| A)
Currently, D suffers from a high degree of interdependency between modules; when one wants to use a single symbol (say std.traits.isInputRange), we pull out all of std.traits, which in turn pulls out all of std.array,std.string, etc. This results in slow compile times (relatively to the case where we didn't have to pull all this), and fat binaries: see example in point "D)" below.

This has been discussed many times before, and some people have suggested breaking modules into submodules such as: std.range.traits, etc to mitigate this a little, however this requires people to change 'import std.range' to 'import std.range.traits' to benefit from it, and also in many cases this will be ineffective.

B)
I'd like to propose something different that can potentially dramatically reduce compile time/binary size, while not requiring users to scar their source code as above.

*in short:*
perform semantic analysis for a function/template/struct/class on demand, if that symbol is encountered starting from main().

*in more details:*
suppose we compile a binary (dmd -ofmain foo1.d foo2.d main.d)
input files are lexed, parsed (code should be syntactically valid)
semantic analysis is performed, but doesn't go inside at function/template/struct/class declaration
main() symbol is located in symbol table
start lazy semantic analysis from the main() function and using a breadth first search (BFS) propagation strategy: a symbol (function/template/struct/class) 's body/return type/template constraints is only semantically analyzed when that symbol is encountered along the BFS path.

this strategy could be enabled by a switch -lazy_compilation in dmd. The only time it would differ from existing compilation model would be when some unused code triggers compile error:
eg:
----
void foo(){int x=y;}
void main(){}
----
dmd main.d //error: y is undefined
dmd -lazy_compilation main.d //OK: foo is never mentioned starting from main(), so accept.
This would be very useful to speed up the edit/compile/debug cycle.

Example2:
----
auto foo(){return "import std.stdio;";}
mixin(foo);
void fun2(){import b;}
void main(){writeln("ok");}
----
lazy semantic analysis will analyze main, foo but not fun2, which is not used. foo is analyzed because it is used in a module-level mixin declaration.

C)
*caveats:*
this works when compiling *binaries*, as we know which symbols end up in the final binary
for compiling libraries (-shared/-static), it works if we have a way to specify which symbols are meant to be exported (eg https://www.gnu.org/software/gnulib/manual/html_node/Exported-Symbols-of-Shared-Libraries.html). Is there, currently?

We could specify a list of symbols to export to dmd via a command line flag. This could be:
dmd -exported_symbols=filename.d main.d bar.d
with filename.d containing all exported symbols, eg:
----
module exported_symbols;
public import foo.d; //imports all symbols from foo
public import bar:baz;//imports just bar.baz
void fun(){}//imports fun
----

D)
Example showing problem with current situation:
----
module main;
version(A)
  import std.range;
else{
  //copy paste here body of 'isInputRange' from std.range
}
void fun(){ auto a=isInputRange!string;}
----
dmd -c main.d:
nm main.o|wc -l: 8
file size of main.o: 1.1K
cpu time (10 runs): 0.119 s

dmd -c -version=A main.d:
nm main.o|wc -l: 324 => 40X
file size of main.o: 72K => 70X
cpu time (10 runs): 2.7 s => 23X

Q: Why do we care about compilation speed, etc, since dmd is already fast?
A1: Many cases where it matters, eg for the REPL I'm working on, that requires compiling on the fly and needs interactive speed.
A2: for large projects, where compilation can become slow
|
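The breadth-first propagation described in point B) can be illustrated with a small model. This is only a sketch under loud assumptions: `SymbolGraph` and `analysisOrder` are hypothetical names invented here, the reference edges are supplied by hand rather than discovered by a compiler, and nothing below reflects how dmd is actually structured. It mirrors Example2, treating `foo` as a second root because the module-level mixin needs it.

```d
import std.stdio : writeln;

/// Hypothetical model of a symbol table: each declared symbol maps to the
/// names its body references. A real compiler would discover these edges
/// while analyzing each body instead of knowing them up front.
alias SymbolGraph = string[][string];

/// Visits symbols breadth-first from `roots` and returns them in the order
/// they would be analyzed. Symbols that are never referenced are never visited.
string[] analysisOrder(SymbolGraph graph, string[] roots)
{
    bool[string] seen;
    string[] queue;
    foreach (root; roots)
    {
        if (root !in seen)
        {
            seen[root] = true;
            queue ~= root;
        }
    }

    string[] order;
    while (queue.length > 0)
    {
        string current = queue[0];
        queue = queue[1 .. $];
        order ~= current; // a real compiler would run semantic analysis here

        // Names without an entry (e.g. library symbols) contribute no edges.
        if (auto references = current in graph)
        {
            foreach (next; *references)
            {
                if (next !in seen)
                {
                    seen[next] = true;
                    queue ~= next;
                }
            }
        }
    }
    return order;
}

void main()
{
    // Mirrors Example2: main calls writeln, the module-level mixin calls foo,
    // and nothing references fun2.
    SymbolGraph graph;
    graph["main"] = ["writeln"];
    graph["foo"] = null;
    graph["fun2"] = ["b"];

    // foo is a root because the module-level mixin needs it.
    writeln(analysisOrder(graph, ["main", "foo"]));
}
```

Since `fun2` is never reached from either root, neither it nor its `import b` appears in the result, which is the behavior the proposal relies on.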
June 22, 2013 Re: proposal: lazy compilation model for compiling binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timothee Cour | Timothee Cour:
> C)
> *caveats:*
> this works when compiling *binaries*, as we know which symbols end up in the final binary
> for compiling libraries
> (-shared/-static), it works if we have a way to specify which
> symbols are meant to be exported (eg
> https://www.gnu.org/software/gnulib/manual/html_node/Exported-Symbols-of-Shared-Libraries.html).
> Is there, currently?
For D perhaps there are better/nicer ways to do this.
Bye,
bearophile
|
June 22, 2013 Re: proposal: lazy compilation model for compiling binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | D has an "export" keyword that I always expected to do exactly this, until I found out it is actually platform-dependent and useless. |
June 24, 2013 Re: proposal: lazy compilation model for compiling binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dicebot | On 06/22/2013 11:20 AM, Dicebot wrote:
> D has "export" keyword that I always expected to do exactly this until
> have found out it is actually platform-dependent and useless.
It's buggy and useful.
http://d.puremagic.com/issues/show_bug.cgi?id=9816
We should try to strive for -fvisibility=hidden on UNIX because it allows to optimize non-exported symbols and because we need explicit exports for anyhow.
|
June 24, 2013 Re: proposal: lazy compilation model for compiling binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timothee Cour | On 06/22/2013 06:45 AM, Timothee Cour wrote:
> Example2:
> ----
> auto foo(){return "import std.stdio;";}
> mixin(foo);
> void fun2(){import b;}
> void main(){writeln("ok");}
> ----
> lazy semantic analysis will analyze main, foo but not fun2, which is not
> used. foo is analyzed because it is used in a module-level mixin
> declaration.
>
Overall it's a good idea. There are already some attempts to shift to lazy semantic analysis, mainly to solve any remaining forward reference issues.
Also for non-optimized builds parsing takes a huge part of the compilation time so that would remain, I don't have detailed numbers though.
|
June 24, 2013 Re: proposal: lazy compilation model for compiling binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | On 06/24/2013 02:23 AM, Martin Nowak wrote:
> exports for anyhow.
for Windows that is
|
June 24, 2013 Re: proposal: lazy compilation model for compiling binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | On Sun, Jun 23, 2013 at 5:36 PM, Martin Nowak <code@dawg.eu> wrote:
> On 06/22/2013 06:45 AM, Timothee Cour wrote:
>> Example2:
>> ----
>> auto foo(){return "import std.stdio;";}
>> mixin(foo);
>> void fun2(){import b;}
>> void main(){writeln("ok");}
>> ----
>> lazy semantic analysis will analyze main, foo but not fun2, which is not used. foo is analyzed because it is used in a module-level mixin declaration.
>
> Overall it's a good idea. There are already some attempts to shift to lazy semantic analysis, mainly to solve any remaining forward reference issues.
> Also for non-optimized builds parsing takes a huge part of the compilation time so that would remain, I don't have detailed numbers though.

why 'that would remain' ? in the proposed lazy compilation model, optimization level is irrelevant. The only thing that matters is whether we have to export all symbols or only specified ones.

I agree we should require marking those explicitly with 'export' on all platforms, not just windows.
But in doing so we must allow to define those exported symbols outside of where they're defined, otherwise it will make code ugly (eg, what if we want to export std.process.kill in a user shared library and std.process.kill isn't marked as export)

Here's a possibility
  module define_exported_symbols;
  import std.process;
  export std.process.kill; //export all std.process.kill overloads (just 1 function in this case)
  export std.process; //export all functions in std.process
  export std; //export all functions in std

But I think the best is to keep the current export semantics (but make it work on all platforms not just windows) and provide library code to help with exporting entire modules/packages:
  module std.sharedlib; //helper functions for dlls on all platforms
  void export_module(alias module_)(module_ mymodule){ }
  void export_symbols(R) (R symbols) if(isInputRange!R){//export a range of symbols
  }
  /+ usage:
  export_module(std.process); //exports all functions in std.process
  export_symbols(enumerateFunctions(std.process)); //exports all functions in std.process; allows to be more flexible by exporting only a subset of those
  +/
|
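The `enumerateFunctions` helper mentioned in the usage comment is not defined anywhere in the thread. As a rough sketch of what such a helper might look like, the following uses the compile-time reflection traits `__traits(allMembers, ...)` and `__traits(getMember, ...)`. It is an assumption-laden illustration: `ProcessApi` is a hypothetical struct standing in for a module such as std.process, the helper only collects names, and nothing here actually exports a symbol from a shared library.

```d
import std.stdio : writeln;

/// Hypothetical stand-in for a module's API surface.
struct ProcessApi
{
    static void kill(int pid) {}
    static int wait(int pid) { return 0; }
    enum maxChildren = 8; // not a function, so it is skipped
}

/// Collects the names of all function members of `Container`.
string[] enumerateFunctions(Container)()
{
    string[] names;
    foreach (name; __traits(allMembers, Container))
    {
        // `name` is a compile-time constant because this loop is unrolled.
        static if (is(typeof(__traits(getMember, Container, name)) == function))
            names ~= name;
    }
    return names;
}

void main()
{
    // Evaluated at compile time (CTFE), since it initializes an enum.
    enum exported = enumerateFunctions!ProcessApi();
    static assert(exported.length == 2);
    writeln(exported);
}
```

Because the list is an ordinary array, a caller could filter it before handing it to something like the proposed `export_symbols`, which is the flexibility the post asks for.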
June 24, 2013 Re: proposal: lazy compilation model for compiling binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timothee Cour | It should be possible to "export" (or rather "share") types, mixins, templates, generic unit tests, etc. (Shared compile-time constructs would just be "copied" to a shared library, as they can't be compiled.)

All public compilable constructs should be automatically exported. A shared keyword added to a function declaration can mark it as "exportable", e.g.,
  module A;
  shared foo(){ ... };
  shared mixin template bar() { ... };
  shared template Foo(T) { .... };
  shared interface Bar { .... };
  shared myunittest(F1, F2, ...) { ... };
  shared mycontract(F) { .... };
  etc...

All shared constructs are added to the export table and available for use. Generic unit tests and contracts allow one to "collect" common unit tests and contracts and apply them to arbitrary functions and classes. Including compile-time constructs in a library allows one to group a set of functionality, both run-time and compile-time, in one location.

As far as lazy evaluation goes, I think only symbols reachable from main should be included, unless otherwise specified. E.g., suppose we have a scriptable application that uses some statically shared library. It may be that some custom function lookup is used. One needs a way to ensure that the compiler will include symbols that might not be reachable at compile time. In this case one should simply have to mark a module as reachable so as to include all shared symbols... or let's say just a group of symbols:
  import A {foo, bar, FOO*, !BAR*, ... }
where the brackets are used to tell the compiler to include all the symbols (with regex capabilities). ! can be used to force exclusion; technically it shouldn't be needed but it could be useful in some cases.
|
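The include/exclude list suggested above can be mimicked in library code to see how the selection would behave. This is a sketch only: `selectSymbols` and `matchesAny` are hypothetical names, the symbol names are made up, and it swaps in glob matching via `std.path.globMatch` in place of the regex matching the post mentions, since patterns like `FOO*` read as globs.

```d
import std.path : CaseSensitive, globMatch;
import std.stdio : writeln;

/// True if `name` matches at least one glob pattern.
bool matchesAny(string name, string[] patterns)
{
    foreach (pattern; patterns)
    {
        if (globMatch!(CaseSensitive.yes)(name, pattern))
            return true;
    }
    return false;
}

/// Keeps symbols matched by some include pattern and by no exclude pattern.
/// Entries of `spec` that start with '!' are treated as exclusions.
string[] selectSymbols(string[] symbols, string[] spec)
{
    string[] includes, excludes;
    foreach (entry; spec)
    {
        if (entry.length > 1 && entry[0] == '!')
            excludes ~= entry[1 .. $];
        else
            includes ~= entry;
    }

    string[] selected;
    foreach (symbol; symbols)
    {
        if (matchesAny(symbol, includes) && !matchesAny(symbol, excludes))
            selected ~= symbol;
    }
    return selected;
}

void main()
{
    auto symbols = ["foo", "bar", "baz", "FOO_init", "BAR_debug"];

    // The list from the post: BAR_debug is not included to begin with.
    writeln(selectSymbols(symbols, ["foo", "bar", "FOO*", "!BAR*"]));

    // With a catch-all include, the exclusion is what removes BAR_debug.
    writeln(selectSymbols(symbols, ["*", "!BAR*"]));
}
```

The second call shows the case where `!` earns its keep: once a broad pattern is used, exclusion is the only way to carve symbols back out.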
June 24, 2013 Re: proposal: lazy compilation model for compiling binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | On Monday, 24 June 2013 at 01:20:46 UTC, Martin Nowak wrote:
> On 06/24/2013 02:23 AM, Martin Nowak wrote:
>> exports for anyhow.
> for Windows that is
And AIX, unless they have adopted the more common UNIX model in the meantime.
|
June 24, 2013 Re: proposal: lazy compilation model for compiling binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timothee Cour | This is now a bit confusing to me. I just made up my mind to go
with D instead of Go, because Go is too simplistic in my opinion.
Furthermore, calling C from D is a lot easier than from Go. And
now this ... I have too little understanding of D to see what the
impact of this build time issue is. Does this mean build times
come close to what they are in C++ or is this issue only about
builds not being as fast as the D people are used to ..?
Thanks, Oliver
On Saturday, 22 June 2013 at 04:45:31 UTC, Timothee Cour wrote:
> A)
> Currently, D suffers from a high degree of interdependency between modules;
> when one wants to use a single symbol (say std.traits.isInputRange), we
> pull out all of std.traits, which in turn pulls out all of
> std.array,std.string, etc. This results in slow compile times (relatively
> to the case where we didn't have to pull all this), and fat binaries: see
> example in point "D)" below.
>
> This has been discussed many times before, and some people have suggested
> breaking modules into submodules such as: std.range.traits, etc to mitigate
> this a little, however this requires people to change 'import std.range'
> to 'import std.range.traits' to benefit from it, and also in many cases
> this will be ineffective.
>
> B)
> I'd like to propose something different that can potentially dramatically
> reduce compile time/binary size, while not requiring users to scar their
> source code as above.
> ....
|