June 29, 2012
LLVM IR influence on compiler debugging
This is a very easy-to-read article about the design of LLVM:
http://www.drdobbs.com/architecture-and-design/the-design-of-llvm/240001128

It explains what the IR is:

>The most important aspect of its design is the LLVM Intermediate 
>Representation (IR), which is the form it uses to represent code 
>in the compiler. LLVM IR [...] is itself defined as a first 
>class language with well-defined semantics.<

>In particular, LLVM IR is both well specified and the only 
>interface to the optimizer. This property means that all you 
>need to know to write a front end for LLVM is what LLVM IR is, 
>how it works, and the invariants it expects. Since LLVM IR has a 
>first-class textual form, it is both possible and reasonable to 
>build a front end that outputs LLVM IR as text, then uses UNIX 
>pipes to send it through the optimizer sequence and code 
>generator of your choice. It might be surprising, but this is 
>actually a pretty novel property to LLVM and one of the major 
>reasons for its success in a broad range of different 
>applications. Even the widely successful and relatively 
>well-architected GCC compiler does not have this property: its 
>GIMPLE mid-level representation is not a self-contained 
>representation.<
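
To make the "front end that outputs LLVM IR as text" idea concrete, here is a minimal sketch of such a front end. It is not from the article: the nested-tuple expression format and the names `emit_expr` and `compile_to_ir` are made up for illustration, and it handles only 32-bit integer `add` and `mul`.

```python
import sys

def emit_expr(node, lines, counter):
    """Append IR for a nested (op, lhs, rhs) tuple; return the operand text."""
    if isinstance(node, int):
        return str(node)                      # integer literals are used directly
    op, lhs, rhs = node                       # op is "add" or "mul" (assumed input format)
    a = emit_expr(lhs, lines, counter)
    b = emit_expr(rhs, lines, counter)
    name = f"%t{counter[0]}"                  # fresh named temporary, one definition each
    counter[0] += 1
    lines.append(f"  {name} = {op} i32 {a}, {b}")
    return name

def compile_to_ir(expr, func_name="compute"):
    """Return the text of an LLVM IR function that evaluates expr as an i32."""
    lines = []
    result = emit_expr(expr, lines, [0])
    body = "\n".join(lines + [f"  ret i32 {result}"])
    return f"define i32 @{func_name}() {{\n{body}\n}}\n"

if __name__ == "__main__":
    # (2 * 3) + 4, written to stdout so it can be piped to other tools
    sys.stdout.write(compile_to_ir(("add", ("mul", 2, 3), 4)))
```

Since the output is plain text on stdout, it can be redirected to a file or piped into LLVM's command-line tools such as `opt` and `llc`; the exact flags depend on the LLVM version.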

That IR also makes the compiler much simpler to debug. I think 
this is important (and I think it partially explains why Clang 
was created so quickly):

>Compilers are very complicated, and quality is important, 
>therefore testing is critical. For example, after fixing a bug 
>that caused a crash in an optimizer, a regression test should be 
>added to make sure it doesn't happen again. The traditional 
>approach to testing this is to write a .c file (for example) 
>that is run through the compiler, and to have a test harness 
>that verifies that the compiler doesn't crash. This is the 
>approach used by the GCC test suite, for example. The problem 
>with this approach is that the compiler consists of many 
>different subsystems and even many different passes in the 
>optimizer, all of which have the opportunity to change what the 
>input code looks like by the time it gets to the previously 
>buggy code in question. If something changes in the front end or 
>an earlier optimizer, a test case can easily fail to test what 
>it is supposed to be testing. By using the textual form of LLVM 
>IR with the modular optimizer, the LLVM test suite has highly 
>focused regression tests that can load LLVM IR from disk, run it 
>through exactly one optimization pass, and verify the expected 
>behavior. Beyond crashing, a more complicated behavioral test 
>wants to verify that an optimization is actually performed. 
>[...] While this might seem like a really trivial example, this 
>is very difficult to test by writing .c files: front ends often 
>do constant folding as they parse, so it is very difficult and 
>fragile to write code that makes its way downstream to a 
>constant folding optimization pass. Because we can load LLVM IR 
>as text and send it through the specific optimization pass we're 
>interested in, then dump out the result as another text file, it 
>is really straightforward to test exactly what we want, both for 
>regression and feature tests.<
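
The testing idea can be sketched with a toy example. This is not LLVM and not its test suite: the one-instruction-per-line text format and the `constant_fold` function are invented here, just to show what "load IR as text, run exactly one pass, check the text that comes out" looks like.

```python
import re

# Toy instruction form: "%name = add|mul <int>, <int>" (hypothetical, not real LLVM IR).
FOLDABLE = re.compile(r"^(%\w+) = (add|mul) (-?\d+), (-?\d+)$")

def constant_fold(ir_text):
    """Single pass: fold instructions whose operands are all literals."""
    known = {}   # value name -> folded constant, as text
    out = []
    for raw in ir_text.strip().splitlines():
        # Replace uses of values that were already folded to constants.
        line = re.sub(r"%\w+", lambda m: known.get(m.group(0), m.group(0)), raw.strip())
        match = FOLDABLE.match(line)
        if match:
            name, op, a, b = match.groups()
            value = int(a) + int(b) if op == "add" else int(a) * int(b)
            known[name] = str(value)   # the instruction itself disappears
        else:
            out.append(line)           # not foldable, keep it
    return "\n".join(out)

# Focused feature test: the input goes straight into the pass as text, so no
# front end gets a chance to fold the constants before the pass sees them.
folded = constant_fold("""
%t0 = mul 2, 3
%t1 = add %t0, 4
ret %t1
""")
assert folded == "ret 10"
```

The point of the design is that the test input is exactly what the pass receives, so a change elsewhere in the compiler can't silently stop the test from exercising the code it was written for.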

Bye,
bearophile
June 29, 2012
Re: LLVM IR influence on compiler debugging
I implemented a compiler back end with LLVM some time ago. The IR helped 
a lot, both in spotting errors in IR codegen and in tracking down issues 
with target codegen (e.g. because of some misconfiguration). You always 
have the high-level IR available as text, and the unoptimized target 
assembler is usually pretty similar to the IR code, so it provides a great 
guide for deciphering the assembler.

Also, the fact that you can output and modify a module as IR code to try 
certain things is really useful sometimes.
June 29, 2012
Re: LLVM IR influence on compiler debugging
On 29/06/12 08:04, bearophile wrote:
> This is a very easy to read article about the design of LLVM:
> http://www.drdobbs.com/architecture-and-design/the-design-of-llvm/240001128
>
> That IR has a great effect on making it simpler to debug the compiler, I
> think this is important (and I think it partially explains why Clang was
> created so quickly):

It's a good design, especially for optimisation tests. Although I can't 
see an immediate application of this for D. DMD's backend is nearly 
bug-free. (By which I mean, it has 100X fewer bugs than the front-end).
July 06, 2012
Re: LLVM IR influence on compiler debugging
On 29.06.2012 11:27, Don Clugston wrote:
> It's a good design, especially for optimisation tests. Although I can't
> see an immediate application of this for D.

LDC (https://github.com/ldc-developers/ldc/) uses LLVM.

Kai
July 06, 2012
Re: LLVM IR influence on compiler debugging
On Fri, 29 Jun 2012 02:27:19 -0700, Don Clugston <dac@nospam.com> wrote:

> On 29/06/12 08:04, bearophile wrote:
>> This is a very easy to read article about the design of LLVM:
>> http://www.drdobbs.com/architecture-and-design/the-design-of-llvm/240001128
>>
>> That IR has a great effect on making it simpler to debug the compiler, I
>> think this is important (and I think it partially explains why Clang was
>> created so quickly):
>
> It's a good design, especially for optimisation tests. Although I can't  
> see an immediate application of this for D. DMD's backend is nearly  
> bug-free. (By which I mean, it has 100X fewer bugs than the front-end).

Sure, but LLVM is just as bug free and spanks the current DMD backend in  
perf tests. Just because something is well tested and understood doesn't  
automatically make it superior. Also worth consideration is that moving to  
LLVM would neatly solve an incredible number of sticky points with the  
current backend, not the least of which is its license. And let's not even  
talk about the automatic multi-arch support we'd get.

My guess is that, unless something changes significantly, DMD will remain  
a niche tool; useful as a reference/research compiler, but for actual work  
people will use LDC or GDC.

At the moment, the ONLY reasons I use DMD are to test my changes to the  
compiler and that LLVM doesn't yet support SEH. As soon as LDC supports  
SEH, and it will (I hear 3.2 will), I will move all my work to LDC. So  
what if it's a version or two behind, it has superior code generation and  
better Windows support (COFF/x64 anybody?).

-- 
Adam Wilson
IRC: LightBender
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
July 07, 2012
Re: LLVM IR influence on compiler debugging
Adam Wilson:

> moving to LLVM would neatly solve an incredible number of 
> sticky points with the current backend,

I remember some small limitations in the LLVM back-end, like not 
being able to use zero bits to implement fixed-size zero-length 
arrays. And something regarding gotos in inline asm. I don't know 
if those little limitations have since been removed.


> My guess is that, unless something changes significantly, DMD 
> will remain a niche tool; useful as a reference/research 
> compiler, but for actual work people will use LDC or GDC.

The D reference compiler can't be DMD forever.


> At the moment, the ONLY reasons I use DMD are to test my 
> changes to the compiler and that LLVM doesn't yet support SEH. 
> As soon as LDC supports SEH, and it will (I hear 3.2 will),

Is LDC2 going to work on 32-bit Windows too?

Bye,
bearophile
July 07, 2012
Re: LLVM IR influence on compiler debugging
On Saturday, July 07, 2012 02:10:49 bearophile wrote:
> > My guess is that, unless something changes significantly, DMD
> > will remain a niche tool; useful as a reference/research
> > compiler, but for actual work people will use LDC or GDC.
> 
> The D reference compiler can't be DMD forever.

Why not? Having multiple compilers is great, but I seriously doubt that Walter 
is going to work on any other compiler (I don't believe that he _can_ legally 
work on any other - except maybe if he writes a new one himself - because he'd 
get into licensing issues with dmc), and unless you're talking about years 
(decades?) from now, I very much doubt that the reference compiler is going to 
be a compiler that Walter Bright can't work on.

I see no problem with dmd being the reference compiler and continuing to be 
so. And if other compilers get used more because their backends are faster, 
that's fine too.

- Jonathan M Davis
July 07, 2012
Re: LLVM IR influence on compiler debugging
On Fri, 06 Jul 2012 17:59:36 -0700, Jonathan M Davis <jmdavisProg@gmx.com>  
wrote:

> On Saturday, July 07, 2012 02:10:49 bearophile wrote:
>> > My guess is that, unless something changes significantly, DMD
>> > will remain a niche tool; useful as a reference/research
>> > compiler, but for actual work people will use LDC or GDC.
>>
>> The D reference compiler can't be DMD forever.
>
> Why not? Having multiple compilers is great, but I seriously doubt that  
> Walter
> is going to work on any other compiler (I don't believe that he _can_  
> legally
> work on any other - except maybe if he writes a new one himself -  
> because he'd
> get into licensing issues with dmc), and unless you're talking about  
> years
> (decades?) from now, I very much doubt that the reference compiler is  
> going to
> be a compiler that Walter Bright can't work on.
>
> I see no problem with dmd being the reference compiler and continuing to  
> be
> so. And if other compilers get used more because their backends are  
> faster,
> that's fine too.
>
> - Jonathan M Davis

Walter can't use LLVM? Why not? He wouldn't have to work on LLVM and the  
glue code is considered front-end. I admit I am not terribly well informed  
of the legal issues here. But it seems to me that bolting the DMDFE onto a  
different back-end can't be a problem, because the agreement only covers  
the DMCBE and the DMDFE is 100% Walter-owned; he can do with it what he  
pleases and all Symantec can do is pout.

-- 
Adam Wilson
IRC: LightBender
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
July 07, 2012
Re: LLVM IR influence on compiler debugging
On Friday, July 06, 2012 18:07:54 Adam Wilson wrote:
> Walter can't use LLVM? Why not? He wouldn't have to work on LLVM and the
> glue code is considered front-end. I admit I am not terribly well informed
> of the legal issues here. But it seems to me that bolting the DMDFE onto a
> different back-end can't be a problem because the agreement only covers
> the DMCBE, and the DMDFE is 100% Walter owned, he can do with it what he
> pleases and all Symantec can do is pout.

Walter refuses to look at the code for any other compiler. He has been well 
served in the past by being able to say that he has never looked at the code 
of another compiler when the lawyers come knocking. So, as I understand it, 
he won't do anything that would require him to even _look_ at the backend's 
code, let alone work on it. And I very much doubt that 
he'd want to work on a compiler where he can't work on the backend (plus, I 
would assume that you'd have to look at the backend to work on the glue code, 
so he'd be restricted entirely to the frontend-specific portions of the 
compiler).

- Jonathan M Davis
July 07, 2012
Re: LLVM IR influence on compiler debugging
On 7/6/2012 4:50 PM, Adam Wilson wrote:
> My guess is that, unless something changes significantly, DMD will remain a
> niche tool; useful as a reference/research compiler, but for actual work people
> will use LDC or GDC.

A more diverse ecosystem that supports D is only for the better.