CIL: Infrastructure for C Program Analysis and Transformation

George C. Necula, Scott McPeak, S. P. Rahul, Westley Weimer

http://www.cs.berkeley.edu/~necula/cil

ETAPS – CC ’02, Friday, April 12

What is CIL?
  • Distills C language
    • into a few key forms
    • with precise semantics
  • Parser + IR + Program Merger for C
  • Maintains types, close ties to source
  • Highly structured, clean subset of C
  • Handles ANSI/GCC/MSVC
Why CIL?
  • Analyses and Transformations
  • Easy to use
    • impersonates compiler & linker
    • $ make project CC=cil
  • Easy to work with
    • converts away tricky syntax
    • leaves just the heart of the language
    • separates concepts
C Feature Separation
  • CIL separates language components
    • pure expressions
    • statements with side-effects
    • control-flow
    • embedded CFG
  • Keeps all programmer names
    • temps serialize side-effects
    • simplified scoping
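
The separation described above can be illustrated by hand. The sketch below is not CIL output; the names `next`, `tmp0`, and `tmp1` are made up for this example. It shows one statement that mixes expressions with side effects, and a rewritten version in which temporaries serialize the side effects so that the final expression is pure.

```c
#include <assert.h>

/* Hypothetical helper with a side effect: advances a counter. */
int next(int *state) { *state += 10; return *state; }

/* Original style: two side effects buried inside one expression. */
int mixed(const int *arr, int *i, int *state) {
    return arr[(*i)++] * 2 + next(state);
}

/* Separated style: each side effect is its own statement,
   and the expression that remains is pure. */
int separated(const int *arr, int *i, int *state) {
    int tmp0 = *i;           /* value of i before the increment */
    *i = tmp0 + 1;           /* the ++ as an explicit statement */
    int tmp1 = next(state);  /* the call as an explicit statement */
    return arr[tmp0] * 2 + tmp1;
}

/* Both versions must agree on the result and on the final state. */
void check_same(const int *arr, int start, int seed) {
    int i1 = start, s1 = seed, i2 = start, s2 = seed;
    assert(mixed(arr, &i1, &s1) == separated(arr, &i2, &s2));
    assert(i1 == i2 && s1 == s2);
}
```

The two side effects here touch independent state, so the order chosen by the temporaries cannot change the result.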
Example: C Lvalues
  • An expression referring to a region of storage
  • Example: rec[1].fld[2]
  • May involve 1, 2, or 3 memory accesses
    • 1 if rec and fld are both arrays
    • 2 if exactly one of them is a pointer
    • 3 if rec and fld are both pointers
  • Syntax (AST) alone is insufficient to tell these cases apart
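
To make the three cases concrete, here is a small sketch in which the same syntax, `x[1].fld[2]`, is applied to four different declarations. All type and variable names are invented for this illustration, and the access counts in the comments describe the abstract loads needed, not what an optimizing compiler will emit.

```c
#include <assert.h>

struct arr_rec { int fld[3]; };   /* fld is an array   */
struct ptr_rec { int *fld; };     /* fld is a pointer  */

int backing[3];                   /* storage that the pointer fields point to */
struct arr_rec aa[2];             /* array of records,  array field   */
struct ptr_rec ap[2];             /* array of records,  pointer field */
struct arr_rec *pa = aa;          /* pointer to records, array field   */
struct ptr_rec *pp = ap;          /* pointer to records, pointer field */

void init_example(void) {
    aa[1].fld[2] = 7;
    backing[2] = 9;
    ap[1].fld = backing;
}

/* 1 access: the address is a constant offset from aa. */
int read_aa(void) { return aa[1].fld[2]; }
/* 2 accesses: load ap[1].fld, then load the int. */
int read_ap(void) { return ap[1].fld[2]; }
/* 2 accesses: load pa, then load the int. */
int read_pa(void) { return pa[1].fld[2]; }
/* 3 accesses: load pp, load the fld pointer, then load the int. */
int read_pp(void) { return pp[1].fld[2]; }

/* Identical source syntax, different storage reached. */
void check_reads(void) {
    init_example();
    assert(read_aa() == 7 && read_pa() == 7);
    assert(read_ap() == 9 && read_pp() == 9);
}
```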
CIL Lvalues
  • An expression referring to a region of storage

lval   ::= <base × offset>
base   ::= Var(varinfo)
         | Mem(exp)
offset ::= None
         | Field(f × offset)
         | Index(exp × offset)

CIL Lvalues
  • Example: rec[1].fld[2] becomes either (when rec and fld are both arrays):

<Var(rec), Index(1, Field(fld, Index(2, None)))>

or (when rec and fld are both pointers):

<Mem(2 + Lvalue(<Mem(1 + Lvalue(<Var(rec), None>)),
                  Field(fld, None)>)),
 None>

  • Full static and operational semantics
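
The lvalue grammar can be modeled as a small tree, which makes the memory-access count fall out of the structure. The following is a minimal sketch in C under stated assumptions: CIL itself is not written in C, and every type and function name below is invented for this illustration. Only constants, lvalue reads, and addition are modeled as expressions.

```c
#include <assert.h>
#include <stddef.h>

struct exp;

/* offset ::= None | Field(f, offset) | Index(exp, offset) */
struct offset {
    enum { OFF_NONE, OFF_FIELD, OFF_INDEX } kind;
    const char *field;            /* OFF_FIELD only */
    const struct exp *index;      /* OFF_INDEX only */
    const struct offset *rest;    /* OFF_FIELD and OFF_INDEX */
};

/* lval ::= <base, offset>, base ::= Var(name) | Mem(exp) */
struct lval {
    enum { BASE_VAR, BASE_MEM } base;
    const char *var;              /* BASE_VAR only */
    const struct exp *addr;       /* BASE_MEM only */
    const struct offset *off;
};

struct exp {
    enum { EXP_CONST, EXP_LVAL, EXP_ADD } kind;
    int value;                    /* EXP_CONST */
    const struct lval *lv;        /* EXP_LVAL: read of an lvalue */
    const struct exp *lhs, *rhs;  /* EXP_ADD */
};

int lval_reads(const struct lval *lv);

/* Memory reads needed to evaluate an expression. */
int exp_reads(const struct exp *e) {
    switch (e->kind) {
    case EXP_LVAL: return lval_reads(e->lv);
    case EXP_ADD:  return exp_reads(e->lhs) + exp_reads(e->rhs);
    default:       return 0;
    }
}

/* One access for the lvalue itself, plus whatever its base
   address and index expressions have to read. */
int lval_reads(const struct lval *lv) {
    assert(lv != NULL);
    int n = 1;
    if (lv->base == BASE_MEM) n += exp_reads(lv->addr);
    for (const struct offset *o = lv->off; o && o->kind != OFF_NONE; o = o->rest)
        if (o->kind == OFF_INDEX) n += exp_reads(o->index);
    return n;
}

static const struct exp ONE = { .kind = EXP_CONST, .value = 1 };
static const struct exp TWO = { .kind = EXP_CONST, .value = 2 };
static const struct offset NONE = { .kind = OFF_NONE };

/* <Var(rec), Index(1, Field(fld, Index(2, None)))> */
int reads_when_both_arrays(void) {
    static const struct offset idx2 = { .kind = OFF_INDEX, .index = &TWO, .rest = &NONE };
    static const struct offset fld  = { .kind = OFF_FIELD, .field = "fld", .rest = &idx2 };
    static const struct offset idx1 = { .kind = OFF_INDEX, .index = &ONE, .rest = &fld };
    static const struct lval lv = { .base = BASE_VAR, .var = "rec", .off = &idx1 };
    return lval_reads(&lv);
}

/* <Mem(2 + Lvalue(<Mem(1 + Lvalue(<Var(rec), None>)), Field(fld, None)>)), None> */
int reads_when_both_pointers(void) {
    static const struct lval rec = { .base = BASE_VAR, .var = "rec", .off = &NONE };
    static const struct exp rec_val = { .kind = EXP_LVAL, .lv = &rec };
    static const struct exp in_addr = { .kind = EXP_ADD, .lhs = &ONE, .rhs = &rec_val };
    static const struct offset fld = { .kind = OFF_FIELD, .field = "fld", .rest = &NONE };
    static const struct lval inner = { .base = BASE_MEM, .addr = &in_addr, .off = &fld };
    static const struct exp in_val = { .kind = EXP_LVAL, .lv = &inner };
    static const struct exp out_addr = { .kind = EXP_ADD, .lhs = &TWO, .rhs = &in_val };
    static const struct lval outer = { .base = BASE_MEM, .addr = &out_addr, .off = &NONE };
    return lval_reads(&outer);
}
```

The point of the design is that each memory access corresponds to exactly one lvalue node, so an analysis never has to guess from syntax alone.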
Semantics
  • CIL gives syntax-directed semantics
  • Example judgment:

[Figure: an example judgment with its parts labeled: environment, lvalue form, meaning]

CIL Source Fidelity

Original source:

typedef struct { int fld[3]; } * Myptr;
Myptr rec;
rec[2].fld[1] = 'h';

CIL output:

struct __anonstruct1 {
  int fld[3] ;
};
typedef struct __anonstruct1 * Myptr;
Myptr rec;
(rec + 2)->fld[1] = (int)'h';

SUIF 2.2.0-4 output:

typedef int __ar_1[3];
struct type_1 {
  __ar_1 fld;
};
struct type_1 * rec;
(((((int *)(((char *)&((((struct type_1 *) (rec))))[2])+0U))))[1]) =(104);

Corner Cases
  • Your analysis will not have to handle:
    • return ({goto L; p;}) && ({L: 5;});
    • return &(--x ? : z) - & (x++, x);
  • Full handling of
    • GNU-isms, MSVC-isms
    • attributes
    • initializers
Corner Cases
  • Your analysis will not have to handle:
    • return ({goto L; p;}) && ({L: 5;});
  • CIL rewrites it into plain statements:

int tmp;
goto L;
if (p) { L: tmp = 1; }
else { tmp = 0; }
return tmp;
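
The lowered form above uses only standard C, which a short check can confirm. The sketch below copies it into a function; making `p` a parameter and the names `lowered` and `check_lowered` are choices made for this illustration only.

```c
#include <assert.h>

/* The lowered statements, wrapped in a function so they can be called. */
int lowered(int p) {
    int tmp;
    goto L;                  /* jumps straight into the "then" branch */
    if (p) { L: tmp = 1; }   /* reached via the goto; 5 is nonzero, so && yields 1 */
    else   { tmp = 0; }      /* never reached, because the goto skips the test */
    return tmp;
}

/* Control always arrives at L, so the result does not depend on p. */
void check_lowered(void) {
    assert(lowered(0) == 1);
    assert(lowered(1) == 1);
}
```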

StackGuard Transform
  • Cowan et al., USENIX ’98
  • Buffer overrun defense
    • push return address on private stack
    • pop before returning
    • only change functions with local arrays
  • 40 lines of commented code with CIL
  • Quite easy: uses visitors for tree replacement, explicit returns, etc.
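
For a rough picture of what the instrumented C might look like, here is a hand-written sketch of one function with a local array. It is not the 40-line CIL transform and not its output. It assumes GCC or Clang, since it relies on the `__builtin_return_address` builtin, and the names `shadow_stack`, `shadow_top`, and `copy_name` are invented. The slides do not say what happens at the pop; this sketch compares the saved address with the current one and aborts on a mismatch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SHADOW_DEPTH 1024
static void *shadow_stack[SHADOW_DEPTH];   /* the private stack */
static size_t shadow_top;

/* A function with a local array, instrumented by hand. */
size_t copy_name(const char *src) {
    /* Prologue: push the return address on the private stack. */
    assert(shadow_top < SHADOW_DEPTH);
    shadow_stack[shadow_top++] = __builtin_return_address(0);

    char buf[16];
    strncpy(buf, src, sizeof buf - 1);     /* bounded copy into the local array */
    buf[sizeof buf - 1] = '\0';
    size_t len = strlen(buf);

    /* Epilogue: pop before returning and compare with the current value. */
    if (shadow_stack[--shadow_top] != __builtin_return_address(0))
        abort();                           /* return address was overwritten */
    return len;
}
```

Functions without local arrays would be left unchanged, matching the last sub-bullet above.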
Other Transforms
  • Instrument and log all calls: 150 lines
  • Eliminate break, continue, switch: 110 lines
  • 1 memory access per assignment: 100 lines
  • Make each function have a single return statement: 90 lines
  • Make all stack arrays heap-allocated: 75 lines
  • Log all value/address memory writes: 45 lines
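
As an illustration of one of these transforms, the single-return rewrite can be shown on a tiny function. This is a hand-written before/after pair, not CIL output; the function names and the `result` and `done` identifiers are assumptions made for the example.

```c
#include <assert.h>

/* Before: three return statements. */
int sign_before(int x) {
    if (x < 0) return -1;
    if (x == 0) return 0;
    return 1;
}

/* After: each return becomes an assignment plus a jump to one exit. */
int sign_after(int x) {
    int result;
    if (x < 0)  { result = -1; goto done; }
    if (x == 0) { result = 0;  goto done; }
    result = 1;
done:
    return result;
}

/* The rewrite must not change behavior. */
void check_sign(int x) {
    assert(sign_before(x) == sign_after(x));
}
```

With a single exit point, a later transform such as "pop before returning" has only one place to instrument.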
Whole-Program Merger
  • C has incremental linking, compilation
    • coupled with a weak module system!
  • Example (vortex / gcc / c++2c):

/* foo.c */
struct list { int head;
              struct list * tail;
};
struct list * mylist;

/* bar.c */
struct chain { int head;
               struct chain * tail;
};
extern struct chain * mylist;

Merging a Project
  • Determine what files to merge
  • Merge the files
    • handle file-scoped identifiers
    • C uses name equivalence for types
    • but modules need structural equivalence
  • Key: Each global identifier has 1 type!
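
To show what structural equivalence means for recursive types such as `struct list` and `struct chain`, here is a minimal sketch. It is not CIL's merger algorithm; the type representation and all names are assumptions for this illustration. Struct tags are ignored, field names and field types must match, and a table of pairs already assumed equal keeps the comparison from looping on recursive types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum kind { T_INT, T_PTR, T_STRUCT };
struct field;
struct type {
    enum kind kind;
    const char *tag;              /* T_STRUCT: its name, ignored when comparing */
    const struct type *target;    /* T_PTR: pointed-to type */
    const struct field *fields;   /* T_STRUCT: array of nfields entries */
    size_t nfields;
};
struct field { const char *name; const struct type *type; };

/* Pairs of struct types currently assumed equal. */
#define MAX_ASSUMED 64
static const struct type *assumed[MAX_ASSUMED][2];
static size_t nassumed;

static bool equal_rec(const struct type *a, const struct type *b) {
    if (a == b) return true;
    if (a->kind != b->kind) return false;
    switch (a->kind) {
    case T_INT:
        return true;
    case T_PTR:
        return equal_rec(a->target, b->target);
    case T_STRUCT:
        for (size_t i = 0; i < nassumed; i++)
            if (assumed[i][0] == a && assumed[i][1] == b) return true;
        if (a->nfields != b->nfields) return false;
        assert(nassumed < MAX_ASSUMED);
        assumed[nassumed][0] = a;
        assumed[nassumed][1] = b;
        nassumed++;
        for (size_t i = 0; i < a->nfields; i++)
            if (strcmp(a->fields[i].name, b->fields[i].name) != 0 ||
                !equal_rec(a->fields[i].type, b->fields[i].type))
                return false;
        return true;
    }
    return false;
}

bool structurally_equal(const struct type *a, const struct type *b) {
    nassumed = 0;                 /* start each query with no assumptions */
    return equal_rec(a, b);
}

/* struct list { int head; struct list *tail; } versus
   struct chain { int head; struct chain *tail; } */
bool list_matches_chain(void) {
    static const struct type t_int = { .kind = T_INT };
    struct type list  = { .kind = T_STRUCT, .tag = "list" };
    struct type chain = { .kind = T_STRUCT, .tag = "chain" };
    struct type list_ptr  = { .kind = T_PTR, .target = &list };
    struct type chain_ptr = { .kind = T_PTR, .target = &chain };
    struct field list_fields[]  = { { "head", &t_int }, { "tail", &list_ptr } };
    struct field chain_fields[] = { { "head", &t_int }, { "tail", &chain_ptr } };
    list.fields = list_fields;   list.nfields = 2;
    chain.fields = chain_fields; chain.nfields = 2;
    return structurally_equal(&list, &chain);
}
```

Under name equivalence the two structs differ because their tags differ; under this structural check they match, which is what lets `mylist` keep a single type across both files.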
Other Merger Details
  • Remove duplicate declarations
    • every file includes <stdio.h>
  • Match struct pointer with no defined body in file A to defined body in file B
  • Be careful when picking representatives
How Does it Work?
  • Make project, pass all files through CIL
  • Run your transform and analysis
  • Emit simplified C
  • Compile simplified C with GCC/MSVC
  • … and it works!
Large Programs

Used in the CCured and BLAST projects

Merged Kernel Stats
  • Stock monolithic Linux 2.4.5 kernel
  • http://manju.cs.berkeley.edu/cil/vmlinux.c
  • Statistics: Before | After
    • 324 files | One 12.5MB file
    • 11.3 M-words | 1.5 M-words
    • 7.3 M-LOC (post-process) | 470 K-LOC
  • $ make CC="cil --merge" HOSTCC="cil --merge" LD="cil --merge" AR="cil --mode=AR --merge"
Conclusion
  • CIL distills C to a precise, simple subset
    • easy to analyze
    • well-defined semantics
    • close to the original source
  • Well-suited to complex analyses and source-to-source transforms
  • Parses ANSI/GCC/MSVC C
  • Rapidly merges large programs
Questions?
  • Try CIL out:
  • http://www.cs.berkeley.edu/~necula/cil
  • Complete source, documentation and test cases freely available