June 12, 2012
AST files instead of DI interface files for faster compilation and easier distribution
There's a current pull request to improve di file generation 
(https://github.com/D-Programming-Language/dmd/pull/945); I'd 
like to suggest further ideas.
As far as I understand, di interface files try to achieve these 
conflicting goals:

1) speed up compilation by avoiding having to reparse large files 
over and over.
2) hide implementation details for proprietary reasons
3) still maintain source code in some form to allow inlining and 
CTFE
4) be human readable

-Goals 2) and 3) are clearly contradictory, which calls for a 
command line switch (eg -hidesource), off by default. When set, it 
would remove implementation details where possible (ie for 
non-template and non-auto-return functions), at the cost of 
preventing inlining/CTFE for the corresponding exported API. That 
choice would be left to the user.

-Regarding point 1), it would not be unusual for a D interface 
file to be almost as large (and as slow to parse) as the original 
source file, even with the upcoming di file improvements 
(dmd/pull/945), as D encourages the use of templates/auto-return 
throughout (a large part of phobos would be left 
quasi-unchanged). In fact, the fast compile time of D _does_ 
suffer when templates are used heavily, or as projects scale up.

So to make interface files really useful in terms of speeding up 
compilation, why not directly store the AST (could be text-based 
like JSON but preferably a portable binary format for speed, call 
it a ".dib" file), possibly with some amount of analysis already 
done (eg: version(Windows) could be pre-handled)? This would be 
analogous to precompiled header files 
(http://en.wikipedia.org/wiki/Precompiled_header), which don't 
exist in D AFAIK. This could be done by extending the currently 
incomplete json file generation by dmd to include the AST of the 
implementation of each function we want to export (such as 
templates or stuff to inline). During compilation of a module, 
"import myfun;" would look for 1) myfun.dib (binary or json 
precompiled interface file), 2) myfun.di (if still needed), 3) 
myfun.d.
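
To make the lookup order concrete, here is a minimal sketch in Python (not D, and not how dmd actually resolves imports). The name resolve_import and the choice to try all three extensions in one import directory before moving on to the next directory are assumptions for illustration only.

```python
from pathlib import Path

# Priority order from the post: precompiled AST first, then the textual
# interface file, then the full source.
EXTENSIONS = (".dib", ".di", ".d")

def resolve_import(module_name, import_paths):
    """Return the first existing file for module_name, or None if nothing matches."""
    parts = module_name.split(".")  # "pkg.myfun" -> ["pkg", "myfun"]
    for directory in import_paths:
        for ext in EXTENSIONS:
            candidate = Path(directory, *parts[:-1], parts[-1] + ext)
            if candidate.is_file():
                return candidate
    return None
```

With only myfun.d on disk this returns the source file; as soon as a myfun.dib sits next to it, the precompiled file wins.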



We could even go a step further, borrowing some ideas from the 
"framework" feature found in OSX to distribute components: a 
single D framework would combine the AST (~ precompiled .dib 
headers) of a set of D modules and a set of libraries.
The user would then use a framework as follows:

    dmd -L-framework mylib -L-Lpath/to/mylib main.d

or simply:

    dmd main.d

if main.d contains pragma(framework,"mylib") and framework mylib 
is in the search path.

As in OSX's frameworks, framework mylib is used both during 
compilation (resolving import statements in main.d) and linking. 
Upon encountering an "import myfun;" declaration, the compiler 
would search the linked in frameworks for a symbol or file 
representing the corresponding AST of module myfun, and if not 
found, use the default import mechanism.
That will both speed up compilation times and make distribution 
of libraries and versioning a breeze: single framework to 
download and to link against (this is different from what rdmd 
does). On OSX, frameworks appear as a single file in Finder but 
are actually directories; here we could have either a single file 
or a directory as well.
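
A rough sketch of that double role, in Python for brevity. Everything here is hypothetical: the Framework class, its fields and the two helper functions only illustrate "look in the frameworks first, fall back to the normal import, and hand the bundled libraries to the linker".

```python
from dataclasses import dataclass, field

@dataclass
class Framework:
    """Hypothetical bundle: precompiled module ASTs plus the libraries to link."""
    name: str
    module_asts: dict = field(default_factory=dict)  # module name -> AST blob
    libraries: list = field(default_factory=list)    # library files for the linker

def find_module(module_name, frameworks, default_import):
    """Search the linked-in frameworks first, then use the default import mechanism."""
    for fw in frameworks:
        if module_name in fw.module_asts:
            return fw.name, fw.module_asts[module_name]
    return None, default_import(module_name)

def link_inputs(frameworks):
    """Each framework used for compilation also contributes its libraries at link time."""
    return [lib for fw in frameworks for lib in fw.libraries]
```

The point of the design is that one object answers both questions ("where is the AST of myfun?" and "what do I link against?"), so the two can't get out of sync.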

Finally, regarding point 4), a simple command line switch (eg dmd 
--pretty-print myfun.di) will pretty-print to stdout the AST, and 
omit the implementation of templates and auto functions for 
brevity, so they appear as simple di files (but some options 
could filter out AST nodes for IDE use, etc).
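
As an illustration of what such a pretty-printer would do, here is a small Python sketch over a made-up AST layout (nested dicts with "kind", "name", "signature", "body" and "members" keys). None of this reflects dmd's real AST or its json output; it only shows bodies being dropped while declarations are kept.

```python
def pretty_print(decl, indent=0):
    """Render a declaration tree as signature lines, leaving out every body."""
    pad = "    " * indent
    lines = []
    if decl["kind"] == "module":
        lines.append(f"{pad}module {decl['name']};")
        for member in decl["members"]:
            lines.extend(pretty_print(member, indent))
    elif decl["kind"] == "function":
        # The 'body' entry is deliberately ignored, even for templates.
        lines.append(f"{pad}{decl['signature']};")
    elif decl["kind"] in ("struct", "class"):
        lines.append(f"{pad}{decl['kind']} {decl['name']}")
        lines.append(f"{pad}{{")
        for member in decl["members"]:
            lines.extend(pretty_print(member, indent + 1))
        lines.append(f"{pad}}}")
    return lines
```

An IDE-oriented filter would be the same walk with a different rule for which node kinds to keep.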

Thanks for your comments!
June 12, 2012
Re: AST files instead of DI interface files for faster compilation and easier distribution
Currently .di-files are compiler independent. If this should hold 
for dib-files, too, we'll need a standard ast structure, won't we?
June 12, 2012
Re: AST files instead of DI interface files for faster compilation and easier distribution
On 12-06-2012 12:23, Tobias Pankrath wrote:
> Currently .di-files are compiler independent. If this should hold for
> dib-files, too, we'll need a standard ast structure, won't we?
>

Which is a Good Thing (TM). It would /require/ formalization of the 
language once and for all.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
June 12, 2012
Re: AST files instead of DI interface files for faster compilation and easier distribution
On 12/06/12 11:07, timotheecour wrote:
> There's a current pull request to improve di file generation
> (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to
> suggest further ideas.
> As far as I understand, di interface files try to achieve these
> conflicting goals:
>
> 1) speed up compilation by avoiding having to reparse large files over
> and over.
> 2) hide implementation details for proprietary reasons
> 3) still maintain source code in some form to allow inlining and CTFE
> 4) be human readable

Is that actually true? My recollection is that the original motivation 
was only goal (2), but I was fairly new to D at the time (2005).

Here's the original post where it was implemented:
http://www.digitalmars.com/d/archives/digitalmars/D/29883.html
and it got partially merged into DMD 0.141 (Dec 4 2005), first usable in 
DMD 0.142

Personally I believe that .di files are *totally* the wrong approach for 
goal (1). I don't think goal (1) and (2) have anything in common at all 
with each other, except that C tried to achieve both of them using 
header files. It's an OK solution for (1) in C, it's a failure in C++, 
and a complete failure in D.

IMHO: If we want goal (1), we should try to achieve goal (1), and stop 
pretending it's in any way related to goal (2).
June 12, 2012
Re: AST files instead of DI interface files for faster compilation and easier distribution
On 06/12/2012 12:47 PM, Alex Rønne Petersen wrote:
> On 12-06-2012 12:23, Tobias Pankrath wrote:
>> Currently .di-files are compiler independent. If this should hold for
>> dib-files, too, we'll need a standard ast structure, won't we?
>>
>
> Which is a Good Thing (TM). It would /require/ formalization of the
> language once and for all.
>

I do not see how this conclusion could be reached.
June 12, 2012
Re: AST files instead of DI interface files for faster compilation and easier distribution
On Tuesday, 12 June 2012 at 11:09:04 UTC, Don Clugston wrote:
> On 12/06/12 11:07, timotheecour wrote:
>> There's a current pull request to improve di file generation
>> (https://github.com/D-Programming-Language/dmd/pull/945); I'd 
>> like to
>> suggest further ideas.
>> As far as I understand, di interface files try to achieve these
>> conflicting goals:
>>
>> 1) speed up compilation by avoiding having to reparse large 
>> files over
>> and over.
>> 2) hide implementation details for proprietary reasons
>> 3) still maintain source code in some form to allow inlining
>> and CTFE
>> 4) be human readable
>
> Is that actually true? My recollection is that the original 
> motivation was only goal (2), but I was fairly new to D at the 
> time (2005).
>
> Here's the original post where it was implemented:
> http://www.digitalmars.com/d/archives/digitalmars/D/29883.html
> and it got partially merged into DMD 0.141 (Dec 4 2005), first 
> usable in DMD0.142
>
> Personally I believe that .di files are *totally* the wrong 
> approach for goal (1). I don't think goal (1) and (2) have 
> anything in common at all with each other, except that C tried 
> to achieve both of them using header files. It's an OK solution 
> for (1) in C, it's a failure in C++, and a complete failure in 
> D.
>
> IMHO: If we want goal (1), we should try to achieve goal (1), 
> and stop pretending it's in any way related to goal (2).

I absolutely agree with the above and would also add that goal 
(4) is an anti-feature. In order to get a human readable version 
of the API the programmer should use *documentation*. D claims 
that one of its goals is to make it a breeze to provide 
documentation by bundling a standard tool - DDoc. There's no need 
to duplicate this just to provide another format when DDoc itself 
is supposed to be format agnostic.

This is a solved problem since the 80's (E.g. Pascal units). Per 
Adam's post, the issue is tied to DMD's use of OMF/optlink which 
we all would like to get rid of anyway. Once we're in proper COFF 
land, couldn't we just store the required metadata (binary AST?) 
in special sections in the object files themselves?

Another related question - AFAIK the LLVM folks did/are doing 
work to make their implementation less platform-dependent. Could 
we leverage this in ldc to store LLVM bitcode as D libs which 
still retain enough info for the compiler to replace header files?
June 12, 2012
Re: AST files instead of DI interface files for faster compilation and easier distribution
On 12.06.2012 16:09, foobar wrote:
> On Tuesday, 12 June 2012 at 11:09:04 UTC, Don Clugston wrote:
>> On 12/06/12 11:07, timotheecour wrote:
>>> There's a current pull request to improve di file generation
>>> (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to
>>> suggest further ideas.
>>> As far as I understand, di interface files try to achieve these
>>> conflicting goals:
>>>
>>> 1) speed up compilation by avoiding having to reparse large files over
>>> and over.
>>> 2) hide implementation details for proprietary reasons
>>> 3) still maintain source code in some form to allow inlining
>>> and CTFE
>>> 4) be human readable
>>
>> Is that actually true? My recollection is that the original motivation
>> was only goal (2), but I was fairly new to D at the time (2005).
>>
>> Here's the original post where it was implemented:
>> http://www.digitalmars.com/d/archives/digitalmars/D/29883.html
>> and it got partially merged into DMD 0.141 (Dec 4 2005), first usable
>> in DMD0.142
>>
>> Personally I believe that .di files are *totally* the wrong approach
>> for goal (1). I don't think goal (1) and (2) have anything in common
>> at all with each other, except that C tried to achieve both of them
>> using header files. It's an OK solution for (1) in C, it's a failure
>> in C++, and a complete failure in D.
>>
>> IMHO: If we want goal (1), we should try to achieve goal (1), and stop
>> pretending it's in any way related to goal (2).
>
> I absolutely agree with the above and would also add that goal (4) is an
> anti-feature. In order to get a human readable version of the API the
> programmer should use *documentation*. D claims that one of its goals is
> to make it a breeze to provide documentation by bundling a standard tool
> - DDoc. There's no need to duplicate this just to provide another format
> when DDoc itself is supposed to be format agnostic.
>
Absolutely. DDoc being built-in didn't sound right to me at first, BUT 
it essentially allows us to say that APIs are covered in the 
DDoc-generated files, not in header files etc.

> This is a solved problem since the 80's (E.g. Pascal units).

Right, seeing yet another newbie hit it every day is a clear indication 
of a simple fact: people would like to think & work in modules rather 
than see the guts of old and crappy OBJ file technology. Linking with C 
!= using C tools everywhere.

> Per Adam's
> post, the issue is tied to DMD's use of OMF/optlink which we all would
> like to get rid of anyway. Once we're in proper COFF land, couldn't we
> just store the required metadata (binary AST?) in special sections in
> the object files themselves?
>
Seconded. At least the lexed form could be very compact; I recall early 
compressors doing the Huffman thing on source code tokens with 
some success.
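
To give a feel for "the Huffman thing" on tokens, here is a small Python sketch. It is textbook Huffman coding applied to a token list, not a reconstruction of any particular compressor, and it ignores the cost of storing the code table itself.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code_lengths(tokens):
    """Map each distinct token to the length in bits of its Huffman code."""
    freq = Counter(tokens)
    if not freq:
        return {}
    if len(freq) == 1:
        # A lone symbol still needs one bit per occurrence.
        return {token: 1 for token in freq}
    tie = count()  # unique tie-breaker so the heap never compares the dicts
    heap = [(n, next(tie), {token: 0}) for token, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every token in them one level deeper.
        merged = {token: depth + 1 for token, depth in {**left, **right}.items()}
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]

def encoded_bits(tokens):
    """Total bits needed to encode the token stream with its Huffman code."""
    lengths = huffman_code_lengths(tokens)
    return sum(lengths[token] for token in tokens)

def fixed_width_bits(tokens):
    """Baseline: every token gets the same number of bits."""
    width = max(1, (len(set(tokens)) - 1).bit_length())
    return width * len(tokens)
```

Comparing encoded_bits(tokens) with fixed_width_bits(tokens) on a lexed module shows the gain; it only pays off when a few tokens (punctuation, common keywords) dominate the stream, which is usually the case for source code.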

> Another related question - AFAIK the LLVM folks did/are doing work to
> make their implementation less platform-dependent. Could we leverage this
> in ldc to store LLVM bitcode as D libs which still retain enough info
> for the compiler to replace header files?
>


-- 
Dmitry Olshansky
June 12, 2012
Re: AST files instead of DI interface files for faster compilation and easier distribution
On 2012-06-12 14:09, foobar wrote:

> This is a solved problem since the 80's (E.g. Pascal units). Per Adam's
> post, the issue is tied to DMD's use of OMF/optlink which we all would
> like to get rid of anyway. Once we're in proper COFF land, couldn't we
> just store the required metadata (binary AST?) in special sections in
> the object files themselves?

Can't the same be done with OMF? I'm not saying I want to keep OMF.

-- 
/Jacob Carlborg
June 12, 2012
Re: AST files instead of DI interface files for faster compilation and easier distribution
On 12/06/2012 12:23, Tobias Pankrath wrote:
> Currently .di-files are compiler independent. If this should hold for
> dib-files, too, we'll need a standard ast structure, won't we?
>

We need it anyway at some point. AST macros are another example.

It would also greatly simplify compiler writing if the D interpreter 
could be provided as a library (and so run on top of dib files).

I want to mention that LLVM IR + metadata can do a really good job here. 
In addition, LLVM people are working on a JIT backend, if you know what 
I mean ;)