March 22, 2009
Actually, dmd is so fast I never bother with these "build" utilities.  I just send it all the files and have it rebuild every time, deleting all the .o files afterward.

This is very fast, even for larger projects.  It appears (to me) that the fixed cost of invoking dmd is much greater than the per-file cost of compiling.  These toolkits always compile a, then b, then c, which takes something like 2.5 times as long as compiling a, b, and c at once.
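
For illustration (with hypothetical files a.d, b.d, and c.d), the difference is roughly between:

dmd -c a.d
dmd -c b.d
dmd -c c.d
dmd a.o b.o c.o

which pays dmd's startup cost four times, and

dmd a.d b.d c.d

which pays it once.  (Just a sketch of the two invocation patterns, not a measurement.)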

That said, if dmd could be linked into other programs, these toolkits could hook into it and pay the fixed cost only once (theoretically), while still dynamically deciding which files to compile.  This seems ideal.

-[Unknown]


davidl wrote:
> 
> 1. The compiler knows in which situations a file needs to be recompiled.
> 
> Consider a file whose generated header stays the same: its object file will still be required for linking, but none of the files that import it should require recompilation in this case. If a file's header changes, and thus its interface changes, all files that import it should be recompiled.
> The compiler can emit build commands, like rebuild does.
> 
> I would enjoy:
> 
> dmd -buildingcommand abc.d  > responsefile
> 
> dmd @responsefile
> 
> I think we need to eliminate useless recompilation as much as we can, in consideration of the growing size of D projects.
> 
> 2. Maintaining the build without compiler support has costs.
> 
March 23, 2009
Kristian Kilpi wrote:
> On Sun, 22 Mar 2009 14:14:39 +0200, Christopher Wright <dhasenan@gmail.com> wrote:
> 
>> Kristian Kilpi wrote:
>>> #includes/imports are redundant information: the source code of course describes what's used in it. So, the compiler could be aware of the whole project (and the libraries used) instead of one file at a time.
>>
>> That's not sufficient. I'm using SDL right now; if I type 'Surface s;', should that import sdl.surface or cairo.Surface? How is the compiler to tell? How should the compiler find out where to look for classes named Surface? Should it scan everything under /usr/local/include/d/? That's going to be pointlessly expensive.
> 
> Such things should of course be told to the compiler somehow, by using the project configuration or by other means. (It's only a matter of definition.)
> 
> For example, if my project contains the Surface class, then 'Surface s;' should of course refer to it. If some library (used by the project) also has a Surface class, then one should use some other way to refer to it (e.g. sdl.Surface).

Then suppose I want to use a library type with the same name as one of my own types.

You can come up with a convention that does the right thing 90% of the time, but produces strange errors on occasion.
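
Concretely (with made-up module names; I'm not claiming these match any real SDL or Cairo binding), the situation in D looks like this:

import sdl.surface;    // defines a Surface class
import cairo.surface;  // also defines a Surface class

void main()
{
    Surface s;              // error: ambiguous, the compiler refuses to guess
    sdl.surface.Surface t;  // the fully qualified name resolves it
}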

> But my point was that the compilers today do not have knowledge about the projects as a whole. That makes this kind of 'scanning' too expensive (in the current compiler implementations). But if the compilers were built differently, that wouldn't have to be true.

If you want a system that accepts plugins, you will never have access to the entire project. If you are writing a library, you will never have access to the entire project. So a compiler has to address those needs, too.

> If I were to create/design a compiler (which I am not ;) ), it would be something like this:
> 
> Every file is cached (why read and parse files over and over again if it's not necessary?). These cache files would contain all the information (parse trees, interfaces, etc.) needed during the compilation (of the whole project). They would also contain the compilation results (i.e. assembly). So, these cache/database files would logically replace the old object files.
> 
> That is, there would be database for the whole project. When something gets changed, the compiler knows what effect it has and what's required to do.

All this is helpful for developers. It's not helpful if you are merely compiling everything once, but then, the overhead would only be experienced on occasion.
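
As a rough illustration (entirely hypothetical, just to make the quoted design concrete), one record in such a per-project database might hold something like:

struct CacheEntry
{
    char[]   moduleName;     // e.g. "gfx.surface"
    ulong    sourceHash;     // hash of the source text, to detect edits
    char[][] imports;        // modules this file depends on
    ulong    interfaceHash;  // hash of the extracted interface (the .di equivalent)
    ubyte[]  objectCode;     // the compiled result, standing in for the .o file
}

If a file's source changes but its interfaceHash does not, only its own objectCode needs regenerating; modules that import it can be left alone.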

> And finally, I would also change the format of libraries. A library would be one file only. No more header/.di files; one compact file containing all the needed information (in a binary formatted database that can be read very quickly).

Why binary? If your program can operate efficiently with a textual representation, it's easier to test, easier to debug, and less susceptible to changes in internal structures.

Additionally, a database in a binary format will require special tools to examine. You can't just pop it open in a text editor to see what functions are defined.
March 23, 2009
Andrei Alexandrescu wrote:
> grauzone wrote:
>> My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 contains an outdated, buggy version. Where can I find the up-to-date source code?
> 
> Hold off on that for now.
> 
>> Another question, rdmd just calls dmd, right? How does it scan for dependencies, or is this step actually done by dmd itself?
> 
> rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler.

Is this just an "interesting idea", or are you actually considering implementing it?

Anyway, maybe you could pressure Walter to fix that dmd bug that stops dsss from being efficient. I can't advertise this enough.

> Andrei
March 23, 2009
grauzone wrote:
> Andrei Alexandrescu wrote:
>> grauzone wrote:
>>> My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 contains an outdated, buggy version. Where can I find the up-to-date source code?
>>
>> Hold off on that for now.
>>
>>> Another question, rdmd just calls dmd, right? How does it scan for dependencies, or is this step actually done by dmd itself?
>>
>> rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler.
> 
> Is this just an "interesting idea", or are you actually considering implementing it?

I would if there was a compelling case made in favor of it.

Andrei
March 24, 2009
"Christopher Wright" <dhasenan@gmail.com> wrote in message news:gq6lms$1815$1@digitalmars.com...
>> And finally, I would also change the format of libraries. A library would be one file only. No more header/.di files; one compact file containing all the needed information (in a binary formatted database that can be read very quickly).
>
> Why binary? If your program can operate efficiently with a textual representation, it's easier to test, easier to debug, and less susceptible to changes in internal structures.
>
> Additionally, a database in a binary format will require special tools to examine. You can't just pop it open in a text editor to see what functions are defined.

"If your program can operate efficiently with a textual representation..."

I think that's the key right there. Most of the time, parsing a sensibly-designed text format is going to be a bit slower than reading in an equivalent sensibly-designed (as opposed to over-engineered [pet-peeve]ex: GOLD Parser Builder's .cgt format[/pet-peeve]) binary format. First off, there's simply more raw data to be read off the disk and processed; then you've got the actual tokenizing/syntax-parsing itself; and then anything that isn't supposed to be interpreted as a string (like ints and bools) needs to get converted to its proper internal representation. And then for saving, you go through all the same, but in reverse. (Also, mixed human/computer editing of a text file can sometimes be problematic.)

With a sensibly-designed binary format (and a sensible systems language like D, as opposed to C# or Java), all you really need to do is load a few chunks into memory and apply some structs over top of them. Toss in some trivial version checks and maybe some endian fixups and you're done. Very little processing and memory is needed.
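
To sketch what I mean (a completely made-up library header layout, and assuming the file was written with the same endianness as the machine reading it):

import std.file, std.stdio;

struct LibHeader
{
    char[4] magic;             // e.g. "DLIB"
    uint    formatVersion;     // checked before trusting anything else
    uint    symbolCount;       // number of entries in the symbol table
    uint    symbolTableOffset; // where in the file that table starts
}

void main()
{
    // slurp the file and overlay the header struct on the first bytes
    auto bytes = cast(ubyte[]) std.file.read("example.dlib");
    assert(bytes.length >= LibHeader.sizeof);
    auto hdr = cast(LibHeader*) bytes.ptr;
    assert(hdr.magic[] == "DLIB" && hdr.formatVersion == 1);
    writefln("%s symbols, table at offset %s", hdr.symbolCount, hdr.symbolTableOffset);
}

No tokenizing and no string-to-number conversions; you just read the bytes and look at them through a struct.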

I can certainly appreciate the other benefits of text formats, though, and certainly agree that there are cases where the performance of using a text format would be perfectly acceptable.

But it can add up. And I often wonder how much faster and more memory-efficient things like linux and the web could have been if they weren't so big on sticking damn near everything into "convenient" text formats.


March 24, 2009
Nick Sabalausky:
> I often wonder how much faster and more memory-efficient things like linux and the web could have been if they weren't so big on sticking damn near everything into "convenient" text formats.

Maybe not much, because today textual files can be compressed and decompressed on the fly. CPUs are now fast enough that even with compression the I/O is usually the bottleneck anyway.

Bye,
bearophile
March 24, 2009
"bearophile" <bearophileHUGS@lycos.com> wrote in message news:gqbe2k$13al$1@digitalmars.com...
> Nick Sabalausky:
>> I often wonder how much faster and more
>> memory-efficient things like linux and the web could have been if they
>> weren't so big on sticking damn near everything into "convenient" text
>> formats.
>
> Maybe not much, because today textual files can be compressed and decompressed on the fly. CPUs are now fast enough that even with compression the I/O is usually the bottleneck anyway.
>

I've become more and more wary of this "CPUs are now fast enough..." phrase that keeps getting tossed around these days. The problem is, that argument gets used SO much that on the fastest computer I've ever owned I've actually experienced *basic text-entry boxes* (with no real bells or whistles or anything) that had *seconds* of delay. That never once happened to me on my "slow" Apple 2.

The unfortunate truth is that the speed and memory of modern systems are constantly getting used to rationalize shoddy bloatware practices and we wind up with systems that are even *slower* than they were back on less-powerful hardware. It's pathetic, and drives me absolutely nuts.


March 24, 2009
Nick Sabalausky:
> That never once happened to me on my "slow" Apple 2.<

See here too :-) http://hubpages.com/hub/_86_Mac_Plus_Vs_07_AMD_DualCore_You_Wont_Believe_Who_Wins

Yet, what I have written is often true :-)
Binary data can't be compressed as well as textual data, and lzop is I/O bound in most situations:
http://www.lzop.org/

Bye,
bearophile
March 24, 2009
"bearophile" <bearophileHUGS@lycos.com> wrote in message news:gqbgma$189l$1@digitalmars.com...
> Nick Sabalausky:
>> That never once happened to me on my "slow" Apple 2.<
>
> See here too :-) http://hubpages.com/hub/_86_Mac_Plus_Vs_07_AMD_DualCore_You_Wont_Believe_Who_Wins
>

Excellent article :)

> Yet, what I have written is often true :-)
> Binary data can't be compressed as well as textual data,

Doesn't really matter, since binary data (assuming a format that isn't over-engineered) is already smaller than the same data in text form. Text compresses well *because* it contains so much more excess redundant data than binary data does. I could stick 10GB of zeros to the end of a 1MB binary file and suddenly it would compress far better than any typical text file.

> and lzop is I/O bound in most situations: http://www.lzop.org/
>

I'm not really sure why you're bringing up compression...? Do you mean that the actual disk access time of a text format can be brought down to the time of an equivalent binary format by storing the text file in a compressed form?


March 24, 2009
Nick Sabalausky wrote:
> But it can add up. And I often wonder how much faster and more memory-efficient things like linux and the web could have been if they weren't so big on sticking damn near everything into "convenient" text formats.

Most programs only need to load up text on startup. So the cost of parsing the config file is linear in the number of times you start the application, and linear in the size of the config file.

If there were a binary database format in place of libraries, I would be fine with it, as long as there were a convenient way to get the textual version.