February 19, 2003 C Intermediate Language | ||||
---|---|---|---|---|
| ||||
http://manju.cs.berkeley.edu/cil/index.html

I was amused that CIL is written in OCaml. OCaml just continues to amaze. The CIL license is loose, so this tool might have uses for D. I can envision a D front end written in OCaml that is one-quarter its present size and twice as robust. The CIL tool has processed the ENTIRE linux kernel successfully, quirks and all. -M.

---------------------------------------------------------------

CIL (C Intermediate Language) is a high-level representation along with a set of tools that permit easy analysis and source-to-source transformation of C programs.

CIL is both lower-level than abstract-syntax trees, by clarifying ambiguous constructs and removing redundant ones, and also higher-level than typical intermediate languages designed for compilation, by maintaining types and a close relationship with the source program. The main advantage of CIL is that it compiles all valid C programs into a few core constructs with very clean semantics. CIL also has a syntax-directed type system that makes it easy to analyze and manipulate C programs. Furthermore, the CIL front-end is able to process not only ANSI-C programs but also those using Microsoft C or GNU C extensions. If you do not use CIL and instead want to use just a C parser and analyze programs expressed as abstract-syntax trees, then your analysis will have to handle a lot of ugly corners of the language (let alone the fact that parsing C itself is not a trivial task). See Section 15 for some examples of such extreme programs that CIL simplifies for you.

In essence, CIL is a highly-structured, 'clean' subset of C. CIL features a reduced number of syntactic and conceptual forms. For example, all looping constructs are reduced to a single form, all function bodies are given explicit return statements, syntactic sugar like "->" is eliminated and function arguments with array types become pointers. (For an extensive list of how CIL simplifies C programs, see Section 3.)
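To make the simplifications listed above concrete, here is a small hand-written sketch in C. It is not actual CIL output; the names `sum_original`, `sum_lowered`, and `struct acc` are made up for illustration, and the second function only approximates the style described (a single loop form, an explicit return, no "->", and a pointer in place of the array-typed argument).

```c
#include <stddef.h>

struct acc { int total; };

/* Ordinary C: for loop, "->", array-typed parameter, implicit return. */
void sum_original(struct acc *out, int values[], size_t n)
{
    out->total = 0;
    for (size_t i = 0; i < n; i++)
        out->total += values[i];
}

/* Hand-lowered approximation of the simplified form (NOT real CIL output). */
void sum_lowered(struct acc *out, int *values, size_t n)
{
    size_t i;                      /* local declared once, at function scope */

    (*out).total = 0;              /* "->" rewritten as dereference + field  */
    i = 0;
    while (1) {                    /* the single looping form                */
        if (!(i < n)) {
            break;
        }
        (*out).total = (*out).total + *(values + i);
        i = i + 1;
    }
    return;                        /* explicit return, even for void         */
}
```

Both functions compute the same sum; an analysis written against the second style only needs to understand one kind of loop and one way of reaching a struct field.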
These simplifications reduce the number of cases that must be considered when manipulating a C program. CIL also separates type declarations from code and flattens scopes within function bodies. This structures the program in a manner more amenable to rapid analysis and transformation. CIL computes the types of all program expressions, and makes all type promotions and casts explicit. CIL supports all GCC and MSVC extensions except for nested functions and complex numbers. Finally, CIL organizes C's imperative features into expressions, instructions and statements based on the presence and absence of side-effects and control-flow. Every statement can be annotated with successor and predecessor information. Thus CIL provides an integrated program representation that can be used with routines that require an AST (e.g., type-based analyses and pretty-printers), as well as with routines that require a CFG (e.g., dataflow analyses).

CIL comes accompanied by a number of Perl scripts that perform generally useful operations on code:

A driver which behaves as either the gcc or Microsoft VC compiler and can invoke the preprocessor followed by the CIL application. The advantage of this script is that you can easily use CIL and the analyses written for CIL with existing make files.

A whole-program merger that you can use as a replacement for your compiler. It learns all the files you compile when you make a project and merges all of the preprocessed source files into a single one. This makes it easy to do whole-program analysis.

A patcher that makes it easy to create modified copies of the system include files. The CIL driver can then be told to use these patched copies instead of the standard ones.

CIL has been tested very extensively. It is able to process the SPECINT95 benchmarks, the Linux kernel, GIMP and other open-source projects. All of these programs are compiled to the simple CIL and then passed to gcc and they still run!
We consider the compilation of Linux a major feat, especially since Linux contains many of the ugly GCC extensions (see Section 15.2). This adds up to about 1,000,000 lines of code that we tested it on. It is also able to process the few Microsoft NT device drivers that we have had access to. CIL was tested against GCC's c-torture testsuite and (except for the tests involving complex numbers and inner functions, which CIL does not currently implement) CIL passes most of the tests. Specifically, CIL fails 23 tests out of the 904 c-torture tests that it should pass. GCC itself fails 19 tests. A total of 1400 regression test cases are run automatically on each change to the CIL sources.

CIL is relatively independent of the underlying machine and compiler. When you build it, CIL will configure itself according to the underlying compiler. However, CIL has only been tested on Intel x86 using the gcc compiler on Linux and cygwin and using the MS Visual C compiler. (See below for specific versions of these compilers that we have used CIL for.)

The largest application we have used CIL for is CCured, a compiler that compiles C code into type-safe code by analyzing your pointer usage and inserting runtime checks in the places that cannot be guaranteed statically to be type safe. [Note: the Cyclone folks think they did CCured one better; see their PDF intro which mentions CCured.] |
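The "inserting runtime checks" idea in the CCured description above can be pictured with a rough sketch in C. This is only an illustration under assumptions: `struct checked_ptr` and `checked_read` are hypothetical names, not CCured's actual representation or API, and a real tool would more likely abort on a failed check than return a status code.

```c
#include <stddef.h>

/* Hypothetical pointer that carries its bounds along with the address. */
struct checked_ptr {
    int   *base;   /* start of the underlying array         */
    size_t len;    /* number of elements reachable via base */
};

/*
 * Runtime-checked read: stores the element in *out and returns 1 when
 * index is within bounds; returns 0 and leaves *out untouched otherwise.
 */
int checked_read(struct checked_ptr p, size_t index, int *out)
{
    if (p.base == NULL || index >= p.len) {
        return 0;              /* the check a plain p[index] would skip */
    }
    *out = p.base[index];
    return 1;
}
```

Where static analysis can show an access is always in bounds, a check like this could be dropped entirely; the description above says checks are only inserted where safety cannot be guaranteed statically.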
February 19, 2003 Re: C Intermediate Language | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mark Evans | Pretty neat! Seems like a much easier route than implementing all the back-end pieces of a compiler yet again.

Dan

"Mark Evans" <Mark_member@pathlink.com> wrote in message news:b2vekk$2940$1@digitaldaemon.com...
> http://manju.cs.berkeley.edu/cil/index.html
>
> I was amused that CIL is written in OCaml. OCaml just continues to amaze. The CIL license is loose, so this tool might have uses for D. I can envision a D front end written in OCaml that is one-quarter its present size and twice as robust. The CIL tool has processed the ENTIRE linux kernel successfully, quirks and all. -M. |
Copyright © 1999-2021 by the D Language Foundation