Potential of a compiler that creates the executable at once
February 10, 2022

A couple of months ago, I found out about a language called Vox, which uses a design I haven't seen in any other compiler: instead of creating object files and then linking them together, it always creates the executable at once. This means that every time we change something in our code, we have to recompile the whole thing. Naturally, you will say that this is a huge problem because we will have to wait a long time every time we make a small change to our project, but here is the thing... With this design, compilation times can become really, really fast (of course, the design of the compiler matters too)!

At some point about 3 months ago, the creator of the language said that, at that point, Vox could compile 1.2M LoC/s, which is really, really fast, and that is a size that 99% of projects will never reach, so your project will always compile in less than a second no matter what! What is even more impressive is that Vox is single-threaded, so when parsing the files for symbols and errors, we could get a much bigger performance boost if we had multithreading support!

Of course, not creating object files and then linking them means that we don't have to create a lot of object files and then link them all into a big executable, but can instead start creating this executable right away and add everything to it as we go. You can understand how this can save a lot of time! And CPUs are so fast these days that we can compile millions of lines of code in less than a second using multi-thread support, so even the very rare huge projects will compile very fast.
What's even more impressive is that Vox is not even the fastest compiler out there. TCC is even faster (about 4-5 times)! I have personally tried to see how fast TCC can compile using my CPU, which is a Ryzen 5 2400G. I was able to compile 4M LoC in 700ms! Yeah, the speeds are crazy! And my CPU is an average one; if you were to build a PC now, you would get something that is at least 20% faster with at least 2 more threads!

However, this is not the best test. It was only one-line functions that all contained the same assembly code, without any preprocessing or libraries linked, so I don't know if that played any role, but it was 8 files using 8 threads and the speed is just unreal! And TCC DOES create object files and then links them. How much faster could it be if it used the same design Vox uses (and how much slower would Vox be if it used the design regular compilers use)?

Of course, TCC doesn't produce optimized code, but still, even when compared with GCC's "-O0", it generates code 4-7 times faster than GCC. So if TCC could optimize code as much as GCC and used the design Vox uses, I can see it being able to compile around 1-1.5M LoC/s!

I am personally really interested in this design and inspired by it to make my own compiler. It also solves a lot of problems that we would have to take into account with the other, classic method. One thing that I thought of is the ability to also export your project as a library (mostly shared/dynamic), so in case you have something really huge like 10+M LoC (Linux kernel, I'm talking to you!), you could split it into "sub-projects" that are built as libraries and then link them all together.

Another idea would be to check the type of the files that are passed to the compiler, and if they are source files, not create object files for them, as they would not be kept anyway. So the following would apply:

my_lang -c test3.lang // Compile mode! Outputs the object file "test3.o".

my_lang test1.lang test2.lang test3.o -o=TEST // Create an executable. "test1.lang" and "test2.lang" are source files, so we won't create object files for them but will instead compile them straight into the binary. "test3.o" is an object file, so we will "copy-paste" its symbols into the final binary file.

This is probably the best of both worlds!

So I thought about sharing this and seeing what your thoughts are! How fast could DMD be using this design? Or even better, what if we created a new backend for DMD that is faster than the current one? D could be very competitive!

February 10, 2022

On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:

>

This is probably the best of both worlds!

It's a very bad idea; in fact, it's so bad that I wouldn't call it a "design":

  • Since everything is recompiled all the time regardless, there is no incentive for "modularity" in the language design. Nor is there any incentive to keep the compiler's internals clean. Soon everything in the compiler operates on an enormous mutable graph internally, encouraging many, many bugs.
  • You'll likely run into memory management problems too, as you cannot free memory when everything is connected to everything else. Even if you are willing to use a GC, the GC cannot help you much as your live set simply keeps growing.
  • Every compiler bugfix tends to add code to the compiler, so it'll get slower over time.
  • The same is true for memory consumption; it'll get worse over time.
  • Every optimization you add to the compiler must not destroy your lovely compile times. So everything in the compiler is speed-critical and has to be optimized. Almost anything you do ends up being on the critical path.
  • This affects not only optimizations (which can depend on algorithms that are O(n^3), by the way) but also all sorts of linting phases. And static analysis gets more important over time too.

In summary: People expect optimizers and static analysis to get better too and demand more of their tools. Your "design" doesn't allow for this. And in an IDE setting you might be able to skip all the expensive optimization steps, but not the static analyser steps.

February 10, 2022

On Thursday, 10 February 2022 at 10:38:05 UTC, Araq wrote:

>

[...]

Thank you for your reply! I suppose you are right and I'm glad I asked people with more experience than me. It would be fun to hear more negative thoughts to see all the things that I'm missing.

February 10, 2022

On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:

>

At some point about 3 months ago, the creator of the language said that, at that point, Vox could compile 1.2M LoC/s, which is really, really fast, and that is a size that 99% of projects will never reach, so your project will always compile in less than a second no matter what! What is even more impressive is that Vox is single-threaded, so when parsing the files for symbols and errors, we could get a much bigger performance boost if we had multithreading support!

You see, there's a large misconception here.

Typically slow compile times aren't due to the LoC a project has, but rather what happens during the compilation.

Ex. template instantiation, functions executed at ctfe, preprocessing, optimization etc.

I've seen projects with only a couple thousand lines of code compile slower than projects with hundreds of thousands of lines of code.

Generally, most compilers can read large source files and parse their tokens etc. really fast; it's usually what happens afterwards that is the bottleneck.

Say you have a project that is compiling very slowly. Usually you won't start out by cutting the number of lines you have, because that's often not easy or even possible; rather, you profile where the compiler is spending most of its time and then attempt to resolve it, e.g. perhaps you're running unnecessary nested loops at compile time, and so on.
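
To make that concrete, here is a tiny made-up example (the file name and iteration count are picked arbitrarily): only a handful of lines, but the compiler has to run the whole loop itself through CTFE, so it can easily take longer to compile than a much larger file of ordinary run-time code.

// ctfe_heavy.d -- a handful of lines that compile slowly because of CTFE, not line count
enum ulong slowSum = () {
    ulong s = 0;
    foreach (i; 0 .. 2_000_000)   // the compiler interprets every iteration at compile time
        s += i;
    return s;
}();

void main() {}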

February 10, 2022
This is actually the reason why dmd will create a single object file when given multiple source files on the command line. It's also why dmd can create a library directly.

I've toyed with the idea of generating an executable directly many times.
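
For anyone not familiar with those modes, here is a rough sketch of the invocations being described (the file names are made up; check the dmd documentation for the exact flag behavior):

dmd app.d utils.d io.d        // one invocation: a single object file for all three modules, then the link
dmd -lib app.d utils.d io.d   // build a static library directly, without a separate archiving step
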
February 10, 2022

On Thursday, 10 February 2022 at 11:54:59 UTC, bauss wrote:

>

On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:

>

[...]

You see, there's a large misconception here.

Typically slow compile times aren't due to the LoC a project has, but rather what happens during the compilation.

Ex. template instantiation, functions executed at ctfe, preprocessing, optimization etc.

If you generate an executable directly (without going through compilation to object files and then linking) then you can save some compile time on these tasks, no? For instance, you can maintain some sort of global cache so that repeated instantiations of the same template (in different compilation units) are detected during compilation; this then saves you time on compiling something that you have already compiled before. I assume that such repeated instantiations are very common when there is heavy usage of the standard library.

The same goes for identical CTFEs and any other compilation step that can potentially repeat in different compilation units.

Assuming link-time optimization, the end result (the executable) should be the same, but the compile times will be different.
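
As a small, made-up illustration of that duplication: compiled separately, each of the modules below emits its own copy of to!string for int and the linker later throws away the extras, whereas a compiler that sees the whole program at once could instantiate and compile it a single time.

// a.d -- first compilation unit
module a;
import std.conv : to;
string describeA(int x) { return "a = " ~ x.to!string; }  // instantiates to!string for int

// b.d -- second compilation unit, compiled separately today
module b;
import std.conv : to;
string describeB(int x) { return "b = " ~ x.to!string; }  // the very same instantiation, compiled again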

February 10, 2022

On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:

>

[...]

I think it would be interesting to combine a compiler and a linker into a single executable. Not necessarily for speed reasons, but for better diagnostics and the possibility of type checking external symbols. Linker errors can sometimes be hard to understand in the presence of inlining and optimizations. The linker will report references to symbols not present in your code or present in completely different places.

For example:

extern(D) int some_func(int x);

pragma(inline, true)
private int foo(int x){
    return some_func(x);
}

pragma(inline, true)
private int bar(int x){
    return foo(x);
}

pragma(inline, true)
private int baz(int x){
    return bar(x);
}

pragma(inline, true)
private int qux(int x){
    return baz(x);
}

int main(){
    return qux(2);
}

When you go to compile it:

Undefined symbols for architecture arm64:
  "__D7example9some_funcFiZi", referenced from:
      __D7example3fooFiZi in example.o
      __D7example3barFiZi in example.o
      __D7example3bazFiZi in example.o
      __D7example3quxFiZi in example.o
      __Dmain in example.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Error: /usr/bin/cc failed with status: 1

The linker sees references to the extern function in places where I never wrote that in my source code. In a nontrivial project this can be quite confusing if you’re not used to this quirk of the linking process.

If the compiler is invoking the linker for you anyway, why can’t it read the object files and libraries and tell you exactly what is missing and where in your code you reference it?

February 10, 2022

On Thursday, 10 February 2022 at 22:06:30 UTC, Dave P. wrote:

>

On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:

>

[...]

I think it would be interesting to combine a compiler and a linker into a single executable. Not necessarily for speed reasons, but for better diagnostics and the possibility of type checking external symbols. Linker errors can sometimes be hard to understand in the presence of inlining and optimizations. The linker will report references to symbols not present in your code or present in completely different places.

[...]

This goes away if you do a debug build, which most people (all the professionals I'm aware of) do.

And why should the compiler do something the linker is going to do anyway? It would have to wait until after linking anyway because you might want a symbol to be defined somewhere else.

February 10, 2022

On Thursday, 10 February 2022 at 22:11:13 UTC, max haughton wrote:

>

On Thursday, 10 February 2022 at 22:06:30 UTC, Dave P. wrote:

>

On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:

>

[...]

I think it would be interesting to combine a compiler and a linker into a single executable. Not necessarily for speed reasons, but for better diagnostics and the possibility of type checking external symbols. Linker errors can sometimes be hard to understand in the presence of inlining and optimizations. The linker will report references to symbols not present in your code or present in completely different places.

[...]

This goes away if you do a debug build, which most people (all the professionals I'm aware of) do.

That is a debug build:

ldc2 example.d -O0

Undefined symbols for architecture arm64:
  "__D7example9some_funcFiZi", referenced from:
      __D7example3fooFiZi in example.o
      __D7example3barFiZi in example.o
      __D7example3bazFiZi in example.o
      __D7example3quxFiZi in example.o
      __Dmain in example.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Error: /usr/bin/cc failed with status: 1

I’m used to it at this point, but for people new to the C-style model of separate compilation it is extremely confusing. It’s made worse by the name mangling required to get C-linkers to link code from more modern languages.
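
For what it's worth, D can print (and demangle) that name itself; here is a small sketch, assuming the module is called example to match the earlier snippet (the macOS linker output shows one extra leading underscore on top of what the compiler emits):

module example;
import std.stdio : writeln;
import core.demangle : demangle;

extern(D) int some_func(int x);   // declared but never called, so there is nothing for the linker to miss

void main()
{
    writeln(some_func.mangleof);            // _D7example9some_funcFiZi
    writeln(demangle(some_func.mangleof));  // int example.some_func(int)
}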

>

And why should the compiler do something the linker is going to do anyway? It would have to wait until after linking anyway because you might want a symbol to be defined somewhere else.

You would still give the compiler libraries if you wanted them defined elsewhere, and in my idea the compiler would also be the linker, so there is no "the linker is going to do anyway".

February 10, 2022
On 2/10/2022 2:06 PM, Dave P. wrote:
> Undefined symbols for architecture arm64:
>    "__D7example9some_funcFiZi", referenced from:
>        __D7example3fooFiZi in example.o
>        __D7example3barFiZi in example.o
>        __D7example3bazFiZi in example.o
>        __D7example3quxFiZi in example.o
>        __Dmain in example.o
> ld: symbol(s) not found for architecture arm64

Things I have never been able to explain, even to long-time professional programmers:

1. what "undefined symbol" means

2. what "multiply defined symbol" means

3. how linkers resolve symbols

Our own runtime library illustrates this bafflement. In druntime, there are these "hooks" where one can replace the default function that deals with assertion errors.

Such hooks are entirely unnecessary.

To override a symbol in a library, just write your own function with the same name and link it in before the library.

I have never been able to explain these to people. I wonder if it is because it is so simple that people think "that can't be right". With the hook thing, they'll ask me to re-explain it several times, then they'll say "are you sure?", and they still don't believe it.
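
To spell out that last point with a minimal, made-up sketch (file and symbol names are invented): the default handler lives in its own module inside a static library, and because the linker only pulls in a library member while a symbol is still unresolved, an application that defines the same extern(C) name and comes earlier in the link line simply wins.

// libcore.d -- library code that calls the handler through its linker-level name
module libcore;
extern(C) void onError(const(char)* msg);          // resolved at link time
void doWork() { onError("something went wrong"); }

// libhandler.d -- the default handler, kept in its own object file inside the library
module libhandler;
import core.stdc.stdio : fprintf, stderr;
extern(C) void onError(const(char)* msg) { fprintf(stderr, "default: %s\n", msg); }

// app.d -- defines the same symbol; since app.o is linked before the library,
// the linker never needs libhandler's object and this version is used instead
module app;
import core.stdc.stdio : printf;
import libcore : doWork;
extern(C) void onError(const(char)* msg) { printf("custom: %s\n", msg); }
void main() { doWork(); }

Roughly: dmd -lib libcore.d libhandler.d -of=libmy.a builds the library, dmd app.d libmy.a builds the program with the custom handler, and removing app.d's onError makes the linker pull in the library's default instead.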