June 12, 2012 AST files instead of DI interface files for faster compilation and easier distribution | ||||
---|---|---|---|---|
| ||||
There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas.

As far as I understand, di interface files try to achieve these conflicting goals:

1) speed up compilation by avoiding having to reparse large files over and over.
2) hide implementation details for proprietary reasons
3) still maintain source code in some form to allow inlining and CTFE
4) be human readable

- Goals 2) and 3) are clearly contradictory, so that calls for a command line switch (e.g. -hidesource), off by default, which when set will remove any implementation details (where possible, i.e. for non-template and non-auto-return functions) but in return also prevents any chance of inlining/CTFE for the corresponding exported API. That choice will be left to the user.

- Regarding point 1), it won't be unusual for a D interface file to be almost as large (and slow to parse) as the original source file, even with the upcoming di file improvements (dmd/pull/945), as D encourages the use of templates/auto-return throughout (a large part of phobos would be left quasi-unchanged). In fact, the fast compile time of D _does_ suffer when there is heavy use of templates, or when scaling up.

So to make interface files really useful in terms of speeding up compilation, why not directly store the AST (could be text-based like JSON but preferably a portable binary format for speed, call it a ".dib" file), possibly with some amount of analysis already done (e.g. version(windows) could be pre-handled). This would be analogous to precompiled header files (http://en.wikipedia.org/wiki/Precompiled_header), which don't exist in D AFAIK. This could be done by extending the currently incomplete JSON file generation in dmd to include the AST of the implementation of each function we want to export (such as templates or stuff to inline).

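To make the -hidesource idea concrete, here is a minimal sketch in Python (used only for brevity; nothing here is dmd code). It assumes a toy AST layout, and the names `MODULE_AST`, `export_interface`, `write_dib` and `read_dib` are hypothetical. JSON stands in for the proposed binary ".dib" format. Bodies are dropped only for functions that are neither templates nor auto-return, as described above.

```python
import json

# Toy AST for one module: a list of function declarations (hypothetical layout).
MODULE_AST = {
    "module": "myfun",
    "functions": [
        {"name": "area", "signature": "double area(double r)",
         "is_template": False, "auto_return": False,
         "body": "return 3.14159 * r * r;"},
        {"name": "twice", "signature": "T twice(T)(T x)",
         "is_template": True, "auto_return": False,
         "body": "return x + x;"},
        {"name": "makePair", "signature": "auto makePair(int a, int b)",
         "is_template": False, "auto_return": True,
         "body": "return tuple(a, b);"},
    ],
}


def export_interface(module_ast, hide_source=False):
    """Return the AST that would be stored in the interface file.

    With hide_source set, a body is dropped only where the declaration is
    still usable without it: functions that are neither templates nor
    auto-return.
    """
    exported = {"module": module_ast["module"], "functions": []}
    for fn in module_ast["functions"]:
        entry = dict(fn)  # shallow copy so the input stays untouched
        needs_body = fn["is_template"] or fn["auto_return"]
        if hide_source and not needs_body:
            entry["body"] = None  # no inlining/CTFE possible for this one
        exported["functions"].append(entry)
    return exported


def write_dib(module_ast, hide_source=False):
    """Serialize the exported AST; JSON stands in for a binary format."""
    return json.dumps(export_interface(module_ast, hide_source), sort_keys=True)


def read_dib(text):
    """Load a stored AST back, instead of reparsing the source."""
    return json.loads(text)


if __name__ == "__main__":
    restored = read_dib(write_dib(MODULE_AST, hide_source=True))
    for fn in restored["functions"]:
        state = "body kept" if fn["body"] is not None else "body hidden"
        print(fn["name"], state)
```

The trade-off from the post is visible in the data: once `area` loses its body, a consumer of the file can still type-check calls to it but has nothing left to inline or evaluate at compile time.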
During compilation of a module, "import myfun;" would look for 1) myfun.dib (binary or JSON precompiled interface file), 2) myfun.di (if still needed), 3) myfun.d.

We could even go a step further, borrowing some ideas from the "framework" feature found in OSX to distribute components: a single D framework would combine the AST (~ precompiled .dib headers) of a set of D modules and a set of libraries. The user would then use a framework as follows:

dmd -L-framework mylib -L-Lpath/to/mylib main.d

or simply:

dmd main.d

if main.d contains pragma(framework,"mylib") and framework mylib is in the search path.

As in OSX's frameworks, framework mylib is used both during compilation (resolving import statements in main.d) and linking. Upon encountering an "import myfun;" declaration, the compiler would search the linked-in frameworks for a symbol or file representing the corresponding AST of module myfun, and if not found, use the default import mechanism. That would both speed up compilation and make distribution and versioning of libraries a breeze: a single framework to download and to link against (this is different from what rdmd does). On OSX, frameworks appear as a single file in Finder but are actually directories; here we could have either a single file or a directory as well.

Finally, regarding point 4), a simple command line switch (e.g. dmd --pretty-print myfun.di) would pretty-print the AST to stdout, omitting the implementation of templates and auto functions for brevity, so the output looks like a simple di file (and some options could filter out AST nodes for IDE use, etc).

Thanks for your comments!
|
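The lookup order proposed for "import myfun;" can be sketched as follows. This is Python for brevity and not a description of dmd's actual import logic. The names `resolve_import`, `SEARCH_EXTENSIONS` and `demo` are hypothetical, and trying every extension in one directory before moving to the next directory is an assumption the post does not settle.

```python
import os
import tempfile

# Preference order from the proposal: precompiled AST, then .di, then source.
SEARCH_EXTENSIONS = (".dib", ".di", ".d")


def resolve_import(module_name, search_paths):
    """Return the first file found for module_name, or None if absent."""
    # Dotted module names map to directories, e.g. std.stdio -> std/stdio
    relative = os.path.join(*module_name.split("."))
    for directory in search_paths:
        for ext in SEARCH_EXTENSIONS:
            candidate = os.path.join(directory, relative + ext)
            if os.path.isfile(candidate):
                return candidate
    return None


def demo():
    """Create a scratch import path and resolve a few module names."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("myfun.d", "myfun.di", "other.d"):
            open(os.path.join(root, name), "w").close()
        found = [os.path.basename(resolve_import(m, [root]))
                 for m in ("myfun", "other")]
        missing = resolve_import("nothere", [root])
    return found, missing


if __name__ == "__main__":
    print(demo())
```

With no myfun.dib present, the lookup falls back to myfun.di, and a module that only ships source resolves to its .d file, so existing projects would keep working unchanged.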
June 12, 2012 Re: AST files instead of DI interface files for faster compilation and easier distribution | ||||
---|---|---|---|---|
| ||||
Posted in reply to timotheecour | Currently .di-files are compiler independent. If this should hold for dib-files, too, we'll need a standard ast structure, won't we? |
June 12, 2012 Re: AST files instead of DI interface files for faster compilation and easier distribution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Tobias Pankrath | On 12-06-2012 12:23, Tobias Pankrath wrote:
> Currently .di-files are compiler independent. If this should hold for
> dib-files, too, we'll need a standard ast structure, won't we?
>
Which is a Good Thing (TM). It would /require/ formalization of the language once and for all.
--
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
|
June 12, 2012 Re: AST files instead of DI interface files for faster compilation and easier distribution | ||||
---|---|---|---|---|
| ||||
Posted in reply to timotheecour | On 12/06/12 11:07, timotheecour wrote:
> There's a current pull request to improve di file generation
> (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to
> suggest further ideas.
> As far as I understand, di interface files try to achieve these
> conflicting goals:
>
> 1) speed up compilation by avoiding having to reparse large files over
> and over.
> 2) hide implementation details for proprietary reasons
> 3) still maintain source code in some form to allow inlining and CTFE
> 4) be human readable

Is that actually true? My recollection is that the original motivation was only goal (2), but I was fairly new to D at the time (2005).

Here's the original post where it was implemented:
http://www.digitalmars.com/d/archives/digitalmars/D/29883.html
and it got partially merged into DMD 0.141 (Dec 4 2005), first usable in DMD 0.142.

Personally I believe that .di files are *totally* the wrong approach for goal (1). I don't think goal (1) and (2) have anything in common at all with each other, except that C tried to achieve both of them using header files. It's an OK solution for (1) in C, it's a failure in C++, and a complete failure in D.

IMHO: If we want goal (1), we should try to achieve goal (1), and stop pretending it's in any way related to goal (2).
|
June 12, 2012 Re: AST files instead of DI interface files for faster compilation and easier distribution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Alex Rønne Petersen | On 06/12/2012 12:47 PM, Alex Rønne Petersen wrote:
> On 12-06-2012 12:23, Tobias Pankrath wrote:
>> Currently .di-files are compiler independent. If this should hold for
>> dib-files, too, we'll need a standard ast structure, won't we?
>>
>
> Which is a Good Thing (TM). It would /require/ formalization of the
> language once and for all.
>
I do not see how this conclusion could be reached.
|
June 12, 2012 Re: AST files instead of DI interface files for faster compilation and easier distribution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Don Clugston | On Tuesday, 12 June 2012 at 11:09:04 UTC, Don Clugston wrote:
> On 12/06/12 11:07, timotheecour wrote:
>> There's a current pull request to improve di file generation
>> (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to
>> suggest further ideas.
>> As far as I understand, di interface files try to achieve these
>> conflicting goals:
>>
>> 1) speed up compilation by avoiding having to reparse large files over
>> and over.
>> 2) hide implementation details for proprietary reasons
>> 3) still maintain source code in some form to allow inlining
>> and CTFE
>> 4) be human readable
>
> Is that actually true? My recollection is that the original motivation was only goal (2), but I was fairly new to D at the time (2005).
>
> Here's the original post where it was implemented:
> http://www.digitalmars.com/d/archives/digitalmars/D/29883.html
> and it got partially merged into DMD 0.141 (Dec 4 2005), first usable in DMD 0.142
>
> Personally I believe that .di files are *totally* the wrong approach for goal (1). I don't think goal (1) and (2) have anything in common at all with each other, except that C tried to achieve both of them using header files. It's an OK solution for (1) in C, it's a failure in C++, and a complete failure in D.
>
> IMHO: If we want goal (1), we should try to achieve goal (1), and stop pretending it's in any way related to goal (2).
I absolutely agree with the above and would also add that goal (4) is an anti-feature. In order to get a human-readable version of the API, the programmer should use *documentation*. D claims that one of its goals is to make it a breeze to provide documentation by bundling a standard tool - DDoc. There's no need to duplicate this just to provide another format when DDoc itself is supposed to be format agnostic.
This has been a solved problem since the 80's (e.g. Pascal units). Per Adam's post, the issue is tied to DMD's use of OMF/optlink, which we all would like to get rid of anyway. Once we're in proper COFF land, couldn't we just store the required metadata (binary AST?) in special sections in the object files themselves?
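As a rough illustration of keeping interface metadata in a dedicated section next to the machine code, here is a toy container in Python. It is not COFF, OMF or any real object format: the record layout, the section name ".d_ast" and the functions `pack_sections`, `unpack_sections` and `load_interface` are all assumptions made for this sketch.

```python
import json
import struct

# Each section record: name length (u16), payload length (u32), name, payload.
HEADER = struct.Struct("<HI")


def pack_sections(sections):
    """Concatenate named byte payloads into one blob."""
    out = bytearray()
    for name, payload in sections.items():
        encoded = name.encode("utf-8")
        out += HEADER.pack(len(encoded), len(payload))
        out += encoded
        out += payload
    return bytes(out)


def unpack_sections(blob):
    """Split a blob produced by pack_sections back into its sections."""
    sections = {}
    offset = 0
    while offset < len(blob):
        name_len, payload_len = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        name = blob[offset:offset + name_len].decode("utf-8")
        offset += name_len
        sections[name] = blob[offset:offset + payload_len]
        offset += payload_len
    return sections


def load_interface(blob, section_name=".d_ast"):
    """What an importing compiler would do: read only the metadata section."""
    raw = unpack_sections(blob).get(section_name)
    return None if raw is None else json.loads(raw.decode("utf-8"))


if __name__ == "__main__":
    ast = {"module": "myfun", "exports": ["area"]}
    obj = pack_sections({
        ".text": b"\x90\x90\xc3",  # placeholder bytes standing in for code
        ".d_ast": json.dumps(ast).encode("utf-8"),
    })
    print(load_interface(obj))
```

A linker would only care about the code section and a compiler resolving an import would only care about the metadata section, which is what lets a single file serve both purposes.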
Another related question - AFAIK the LLVM folks did/are doing work to make their implementation less platform-dependent. Could we leverage this in ldc to store LLVM bitcode as D libs which still retain enough info for the compiler to replace header files?
|
June 12, 2012 Re: AST files instead of DI interface files for faster compilation and easier distribution | ||||
---|---|---|---|---|
| ||||
Posted in reply to foobar | On 12.06.2012 16:09, foobar wrote:
> On Tuesday, 12 June 2012 at 11:09:04 UTC, Don Clugston wrote:
>> On 12/06/12 11:07, timotheecour wrote:
>>> There's a current pull request to improve di file generation
>>> (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to
>>> suggest further ideas.
>>> As far as I understand, di interface files try to achieve these
>>> conflicting goals:
>>>
>>> 1) speed up compilation by avoiding having to reparse large files over
>>> and over.
>>> 2) hide implementation details for proprietary reasons
>>> 3) still maintain source code in some form to allow inlining
>>> and CTFE
>>> 4) be human readable
>>
>> Is that actually true? My recollection is that the original motivation
>> was only goal (2), but I was fairly new to D at the time (2005).
>>
>> Here's the original post where it was implemented:
>> http://www.digitalmars.com/d/archives/digitalmars/D/29883.html
>> and it got partially merged into DMD 0.141 (Dec 4 2005), first usable
>> in DMD 0.142
>>
>> Personally I believe that .di files are *totally* the wrong approach
>> for goal (1). I don't think goal (1) and (2) have anything in common
>> at all with each other, except that C tried to achieve both of them
>> using header files. It's an OK solution for (1) in C, it's a failure
>> in C++, and a complete failure in D.
>>
>> IMHO: If we want goal (1), we should try to achieve goal (1), and stop
>> pretending it's in any way related to goal (2).
>
> I absolutely agree with the above and would also add that goal (4) is an
> anti-feature. In order to get a human readable version of the API the
> programmer should use *documentation*. D claims that one of its goals is
> to make it a breeze to provide documentation by bundling a standard tool
> - DDoc. There's no need to duplicate this just to provide another format
> when DDoc itself is supposed to be format agnostic.
>
Absolutely. DDoc being built-in didn't sound right to me at first, BUT it essentially allows us to say that APIs are covered in the DDoc generated files. Not header files etc.

> This is a solved problem since the 80's (E.g. Pascal units).

Right, seeing yet another newbie hit it every day is a clear indication of a simple fact: people would like to think & work in modules rather than seeing the guts of old and crappy OBJ file technology. Linking with C != using C tools everywhere.

> Per Adam's
> post, the issue is tied to DMD's use of OMF/optlink which we all would
> like to get rid of anyway. Once we're in proper COFF land, couldn't we
> just store the required metadata (binary AST?) in special sections in
> the object files themselves?
>
Seconded. At least the lexed form could be very compact; I recall early compressors tried doing the Huffman thing on source code tokens with a certain success.

> Another related question - AFAIK the LLVM folks did/are doing work to
> make their implementation less platform-dependent. Could we leverage this
> in ldc to store LLVM bit code as D libs which still retain enough info
> for the compiler to replace header files?
>
--
Dmitry Olshansky
|
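For readers unfamiliar with "the Huffman thing" on tokens, here is a minimal sketch in Python of Huffman-coding an already-lexed token stream. It is not taken from any actual compressor or from dmd. Splitting on whitespace stands in for a real lexer, and the names `huffman_code`, `encode` and `decode` are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(tokens):
    """Build a prefix code (token -> bit string) from token frequencies."""
    freq = Counter(tokens)
    if not freq:
        return {}
    if len(freq) == 1:  # degenerate case: a single distinct token
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {token: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {tok: ""}) for i, (tok, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {t: "0" + c for t, c in left.items()}
        merged.update({t: "1" + c for t, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def encode(tokens, code):
    """Concatenate the bit strings of all tokens."""
    return "".join(code[t] for t in tokens)


def decode(bits, code):
    """Recover the token list; works because the code is prefix-free."""
    reverse = {c: t for t, c in code.items()}
    tokens, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            tokens.append(reverse[current])
            current = ""
    return tokens


if __name__ == "__main__":
    source = "int x = a + b ; int y = x + x ;"
    tokens = source.split()  # stand-in for a real lexer
    code = huffman_code(tokens)
    bits = encode(tokens, code)
    print(len(bits), "bits versus", 8 * len(source), "bits of raw text")
```

A stored file would also have to carry the code table (or a fixed table agreed on in advance), so the saving only pays off on inputs much larger than this example.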
June 12, 2012 Re: AST files instead of DI interface files for faster compilation and easier distribution | ||||
---|---|---|---|---|
| ||||
Posted in reply to foobar | On 2012-06-12 14:09, foobar wrote:
> This is a solved problem since the 80's (E.g. Pascal units). Per Adam's
> post, the issue is tied to DMD's use of OMF/optlink which we all would
> like to get rid of anyway. Once we're in proper COFF land, couldn't we
> just store the required metadata (binary AST?) in special sections in
> the object files themselves?

Can't the same be done with OMF? I'm not saying I want to keep OMF.

--
/Jacob Carlborg
|
June 12, 2012 Re: AST files instead of DI interface files for faster compilation and easier distribution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Tobias Pankrath | On 12/06/2012 12:23, Tobias Pankrath wrote:
> Currently .di-files are compiler independent. If this should hold for
> dib-files, too, we'll need a standard ast structure, won't we?
>
We need it anyway at some point. AST macros are another example.
It would also greatly simplify compiler writing if the D interpreter could be provided as a lib (and so run on top of a dib file).
I want to mention that LLVM IR + metadata can do a really good job here. In addition, the LLVM people are working on a JIT backend, if you know what I mean ;)
|
Copyright © 1999-2021 by the D Language Foundation