LLVM IR influence on compiler debugging
June 29, 2012
This is a very easy to read article about the design of LLVM:
http://www.drdobbs.com/architecture-and-design/the-design-of-llvm/240001128

It explains what the IR is:

>The most important aspect of its design is the LLVM Intermediate Representation (IR), which is the form it uses to represent code in the compiler. LLVM IR [...] is itself defined as a first class language with well-defined semantics.<

>In particular, LLVM IR is both well specified and the only interface to the optimizer. This property means that all you need to know to write a front end for LLVM is what LLVM IR is, how it works, and the invariants it expects. Since LLVM IR has a first-class textual form, it is both possible and reasonable to build a front end that outputs LLVM IR as text, then uses UNIX pipes to send it through the optimizer sequence and code generator of your choice. It might be surprising, but this is actually a pretty novel property to LLVM and one of the major reasons for its success in a broad range of different applications. Even the widely successful and relatively well-architected GCC compiler does not have this property: its GIMPLE mid-level representation is not a self-contained representation.<

That IR has a great effect on making it simpler to debug the compiler, I think this is important (and I think it partially explains why Clang was created so quickly):

>Compilers are very complicated, and quality is important, therefore testing is critical. For example, after fixing a bug that caused a crash in an optimizer, a regression test should be added to make sure it doesn't happen again. The traditional approach to testing this is to write a .c file (for example) that is run through the compiler, and to have a test harness that verifies that the compiler doesn't crash. This is the approach used by the GCC test suite, for example. The problem with this approach is that the compiler consists of many different subsystems and even many different passes in the optimizer, all of which have the opportunity to change what the input code looks like by the time it gets to the previously buggy code in question. If something changes in the front end or an earlier optimizer, a test case can easily fail to test what it is supposed to be testing. By using the textual form of LLVM IR with the modular optimizer, the LLVM test suite has highly focused regression tests that can load LLVM IR from disk, run it through exactly one optimization pass, and verify the expected behavior. Beyond crashing, a more complicated behavioral test wants to verify that an optimization is actually performed. [...] While this might seem like a really trivial example, this is very difficult to test by writing .c files: front ends often do constant folding as they parse, so it is very difficult and fragile to write code that makes its way downstream to a constant folding optimization pass. Because we can load LLVM IR as text and send it through the specific optimization pass we're interested in, then dump out the result as another text file, it is really straightforward to test exactly what we want, both for regression and feature tests.<
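
The testing idea in the quote above can be sketched in a few lines. This is only an illustration under stated assumptions: the three-address textual IR below is a made-up toy syntax (not LLVM IR), and `constant_fold` and `is_int` are hypothetical names, not part of LLVM or of any D compiler. The point it shows is that a pass which takes text in and gives text out can be tested entirely on its own.

```python
import re

# Toy textual IR, one instruction per line (hypothetical syntax, NOT LLVM IR):
#   %name = add <operand>, <operand>
#   ret <operand>
ADD = re.compile(r"^\s*(%\w+) = add (\S+), (\S+)\s*$")


def is_int(token: str) -> bool:
    """True if the operand is an integer literal rather than a register."""
    return re.fullmatch(r"-?\d+", token) is not None


def constant_fold(ir_text: str) -> str:
    """Fold `add` instructions whose operands are both integer literals.

    Text in, text out: the pass can be run without a front end or any
    other pass rewriting the input first.
    """
    known = {}  # register name -> folded constant (kept as a string)
    out = []
    for line in ir_text.strip().splitlines():
        # Replace registers that earlier folds already turned into constants.
        line = re.sub(r"%\w+", lambda m: known.get(m.group(0), m.group(0)), line)
        m = ADD.match(line)
        if m and is_int(m.group(2)) and is_int(m.group(3)):
            known[m.group(1)] = str(int(m.group(2)) + int(m.group(3)))
            continue  # the instruction has been folded away
        out.append(line.strip())
    return "\n".join(out)


def test_constant_fold_regression():
    # Feature test: the optimization is actually performed.
    assert constant_fold("%a = add 4, 5\n%b = add %a, 1\nret %b") == "ret 10"
    # Negative test: an unknown operand must be left untouched.
    unfoldable = "%c = add %x, 1\nret %c"
    assert constant_fold(unfoldable) == unfoldable


test_constant_fold_regression()
```

Because the test input is the IR itself, no front end gets a chance to fold `4 + 5` before the pass under test sees it, which is exactly the fragility the article attributes to testing through `.c` files.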

Bye,
bearophile
June 29, 2012
I implemented a compiler back end with LLVM some time ago. The IR helped a lot in both spotting errors in IR codegen and diagnosing issues with target codegen (e.g. because of some misconfiguration). You always have the high-level IR available as text, and the unoptimized target assembler is usually pretty similar to the IR code, which makes the IR a great guide for deciphering the assembler.

Also, being able to output a module as IR code and modify it to try certain things is really useful sometimes.
June 29, 2012
On 29/06/12 08:04, bearophile wrote:
> This is a very easy to read article about the design of LLVM:
> http://www.drdobbs.com/architecture-and-design/the-design-of-llvm/240001128
>
> That IR has a great effect on making it simpler to debug the compiler, I
> think this is important (and I think it partially explains why Clang was
> created so quickly):

It's a good design, especially for optimisation tests. Although I can't see an immediate application of this for D. DMD's backend is nearly bug-free. (By which I mean, it has 100X fewer bugs than the front-end).
July 06, 2012
On 29.06.2012 11:27, Don Clugston wrote:
> It's a good design, especially for optimisation tests. Although I can't
> see an immediate application of this for D.

LDC (https://github.com/ldc-developers/ldc/) uses LLVM.

Kai
July 06, 2012
On Fri, 29 Jun 2012 02:27:19 -0700, Don Clugston <dac@nospam.com> wrote:

> On 29/06/12 08:04, bearophile wrote:
>> This is a very easy to read article about the design of LLVM:
>> http://www.drdobbs.com/architecture-and-design/the-design-of-llvm/240001128
>>
>> That IR has a great effect on making it simpler to debug the compiler, I
>> think this is important (and I think it partially explains why Clang was
>> created so quickly):
>
> It's a good design, especially for optimisation tests. Although I can't see an immediate application of this for D. DMD's backend is nearly bug-free. (By which I mean, it has 100X fewer bugs than the front-end).

Sure, but LLVM is just as bug free and spanks the current DMD backend in perf tests. Just because something is well tested and understood doesn't automatically make it superior. Also worth consideration is that moving to LLVM would neatly solve an incredible number of sticky points with the current backend, not the least of which is its license. And let's not even talk about the automatic multi-arch support we'd get.

My guess is that, unless something changes significantly, DMD will remain a niche tool; useful as a reference/research compiler, but for actual work people will use LDC or GDC.

At the moment, the ONLY reasons I use DMD are to test my changes to the compiler and that LLVM doesn't yet support SEH. As soon as LDC supports SEH, and it will (I hear 3.2 will), I will move all my work to LDC. So what if it's a version or two behind, it has superior code generation and better Windows support (COFF/x64 anybody?).

-- 
Adam Wilson
IRC: LightBender
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
July 07, 2012
Adam Wilson:

> moving to LLVM would neatly solve an incredible number of sticky points with the current backend,

I remember some small limits in the LLVM back-end, like not being able to use zero bits to implement fixed-size zero-length arrays. And something regarding gotos in inline asm. I don't know if those little limits are now removed.


> My guess is that, unless something changes significantly, DMD will remain a niche tool; useful as a reference/research compiler, but for actual work people will use LDC or GDC.

The D reference compiler can't be DMD forever.


> At the moment, the ONLY reasons I use DMD are to test my changes to the compiler and that LLVM doesn't yet support SEH. As soon as LDC supports SEH, and it will (I hear 3.2 will),

Is LDC2 going to work on 32-bit Windows too?

Bye,
bearophile
July 07, 2012
On Saturday, July 07, 2012 02:10:49 bearophile wrote:
> > My guess is that, unless something changes significantly, DMD will remain a niche tool; useful as a reference/research compiler, but for actual work people will use LDC or GDC.
> 
> The D reference compiler can't be DMD forever.

Why not? Having multiple compilers is great, but I seriously doubt that Walter is going to work on any other compiler (I don't believe that he _can_ legally work on any other - except maybe if he writes a new one himself - because he'd get into licensing issues with dmc), and unless you're talking about years (decades?) from now, I very much doubt that the reference compiler is going to be a compiler that Walter Bright can't work on.

I see no problem with dmd being the reference compiler and continuing to be so. And if other compilers get used more because their backends are faster, that's fine too.

- Jonathan M Davis
July 07, 2012
On Fri, 06 Jul 2012 17:59:36 -0700, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Saturday, July 07, 2012 02:10:49 bearophile wrote:
>> > My guess is that, unless something changes significantly, DMD
>> > will remain a niche tool; useful as a reference/research
>> > compiler, but for actual work people will use LDC or GDC.
>>
>> The D reference compiler can't be DMD forever.
>
> Why not? Having multiple compilers is great, but I seriously doubt that Walter
> is going to work on any other compiler (I don't believe that he _can_ legally
> work on any other - except maybe if he writes a new one himself - because he'd
> get into licensing issues with dmc), and unless you're talking about years
> (decades?) from now, I very much doubt that the reference compiler is going to
> be a compiler that Walter Bright can't work on.
>
> I see no problem with dmd being the reference compiler and continuing to be
> so. And if other compilers get used more because their backends are faster,
> that's fine too.
>
> - Jonathan M Davis

Walter can't use LLVM? Why not? He wouldn't have to work on LLVM and the glue code is considered front-end. I admit I am not terribly well informed of the legal issues here. But it seems to me that bolting the DMDFE onto a different back-end can't be a problem because the agreement only covers the DMCBE, and the DMDFE is 100% Walter owned, he can do with it what he pleases and all Symantec can do is pout.

-- 
Adam Wilson
IRC: LightBender
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
July 07, 2012
On Friday, July 06, 2012 18:07:54 Adam Wilson wrote:
> Walter can't use LLVM? Why not? He wouldn't have to work on LLVM and the glue code is considered front-end. I admit I am not terribly well informed of the legal issues here. But it seems to me that bolting the DMDFE onto a different back-end can't be a problem because the agreement only covers the DMCBE, and the DMDFE is 100% Walter owned, he can do with it what he pleases and all Symantec can do is pout.

Walter refuses to look at the code for any other compiler. He has been well served in the past by being able to say that he has never looked at the code of another compiler when the lawyers come knocking. So, as I understand it, anything that would require him to even _look_ at the backend's code, let alone work on it, would make it so he won't do it. And I very much doubt that he'd want to work on a compiler where he can't work on the backend (plus, I would assume that you'd have to look at the backend to work on the glue code, so he'd be restricted entirely to the frontend-specific portions of the compiler).

- Jonathan M Davis
July 07, 2012
On 7/6/2012 4:50 PM, Adam Wilson wrote:
> My guess is that, unless something changes significantly, DMD will remain a
> niche tool; useful as a reference/research compiler, but for actual work people
> will use LDC or GDC.

A more diverse ecosystem that supports D is only for the better.
