This course offers an in-depth exploration of compilers, their structure, and functionality in translating high-level programming languages into optimized machine code. Topics include the role of compilers in modern computing, the complexities of different programming languages, the intricacies of compiler architecture, and the importance of optimization for performance, security, and reliability. Attendees will gain insights into lexical analysis, semantic analysis, intermediate representations, and the distinctions between front-end and back-end compiler processes. Join us in navigating this vital area of computer science!
CS 671: Compilers • Prof. Kim Hazelwood • Spring 2008
What is a Compiler? • [Diagram: high-level programming languages → Compiler → machine code, with error messages as a second output] • Then what? • [Diagram: program inputs → machine code → program outputs]
What is an Interpreter? • [Diagram: source code and program inputs → Interpreter → program outputs]
What is a Just-in-Time Compiler? • [Diagram: source code → IR Generator → intermediate program; intermediate program and program inputs → Virtual Machine → program outputs]
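The difference between the interpreter and the IR-generator-plus-virtual-machine pictures can be sketched in a few lines of Python. This is only an illustration, not code from the course: it borrows Python's own `ast` module as a stand-in front end, and the names `interpret`, `generate_ir`, `run_vm` and the `PUSH`/`LOAD`/`APPLY` instruction set are hypothetical. It shows the translate-once, run-many structure only; a real just-in-time compiler would also turn hot parts of the intermediate program into machine code at run time.

```python
import ast
import operator

# Assumed toy language: integer expressions using +, - and *.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def interpret(node, env):
    """Interpreter: walk the syntax tree and compute the result directly."""
    if isinstance(node, ast.Expression):
        return interpret(node.body, env)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp):
        left = interpret(node.left, env)
        right = interpret(node.right, env)
        return OPS[type(node.op)](left, right)
    raise NotImplementedError(type(node).__name__)

def generate_ir(node, code=None):
    """IR generator: translate the tree once into a stack-machine program."""
    code = [] if code is None else code
    if isinstance(node, ast.Expression):
        generate_ir(node.body, code)
    elif isinstance(node, ast.Constant):
        code.append(("PUSH", node.value))
    elif isinstance(node, ast.Name):
        code.append(("LOAD", node.id))
    elif isinstance(node, ast.BinOp):
        generate_ir(node.left, code)
        generate_ir(node.right, code)
        code.append(("APPLY", type(node.op)))
    else:
        raise NotImplementedError(type(node).__name__)
    return code

def run_vm(code, env):
    """Virtual machine: execute the intermediate program on given inputs."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "LOAD":
            stack.append(env[arg])
        else:  # APPLY: pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

tree = ast.parse("a * 2 + b", mode="eval")
program = generate_ir(tree)  # translated once, can be run on many inputs
```

Both `interpret(tree, inputs)` and `run_vm(program, inputs)` take the program inputs as a dictionary such as `{"a": 3, "b": 4}`; the second form separates translation from execution, which is the point of the intermediate program.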
Why Study Compilers? • Fundamental tool in computer science since 1952 • Remains a vibrant research topic • Machines keep changing • Languages keep changing • Applications keep changing • When to compile keeps changing • Challenging! • Must correctly handle an infinite set of legal programs • Is itself a large program → “Where theory meets practice”
Goals of a Compiler • A compiler’s job is to • Lower the abstraction level • Eliminate overhead from language abstractions • Map source program onto hardware efficiently • Hide hardware weaknesses, utilize hardware strengths • Equal the efficiency of a good assembly programmer • Optimizing compilers should improve the code • Performance* • Code size • Security • Reliability • Power consumption
An Interface to High-Level Languages • [Diagram: high-level programming languages → Compiler → machine code] • Programmers write in high-level languages • Increases productivity • Easier to maintain/debug • More portable • HLLs also protect the programmer from low-level details • Registers and caches – the register keyword • Instruction selection • Instruction-level parallelism • The catch: HLL code is often less efficient than hand-written low-level code
High-Level Languages and Features • C (80’s) … C++ (Early 90’s) … Java (Late 90’s) • Each language had features that spawned new research • C/Fortran/COBOL • User-defined aggregate data types (arrays, structures) • Control-flow and procedures • Prompted data-flow optimizations • C++/Simula/Modula-2/Smalltalk • Object orientation (more, smaller procedures) • Prompted inlining • Java • Type safety, bounds checking, garbage collection • Prompted bounds-check removal, dynamic optimization
An Interface to Computer Architectures • [Diagram: high-level programming languages → Compiler → machine code] • Parallelism • Instruction level • multiple operations at once • want to minimize dependences • Processor level • multiple threads at once • want to minimize synchronization • Memory Hierarchies • Register allocation (only portion explicitly managed in SW) • Code and data layout (helps the hardware cache manager) • Designs driven by how well compilers can leverage new features!
How Can We Translate Effectively? • [Diagram: high-level source code → ? → low-level machine code]
Idea: Translate in Steps • Series of program representations • Intermediate representations optimized for various manipulations (checking, optimization) • More machine specific, less language specific as translation proceeds
Simplified Compiler Structure • Source code (character stream): if (b==0) a = b; → Lexical Analysis → Token stream → Parsing → Abstract syntax tree → Intermediate Code Generation → Intermediate code → Optimization → LIR → Register Allocation → Assembly code (character stream): CMP CX, 0 CMOVZ CX, DX • Front End: machine independent • Back End: machine dependent
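As an illustration of the first stage in this pipeline, here is a minimal sketch of a lexical analyzer for the slide's example statement. The token names (`ID`, `EQ`, `ASSIGN`, and so on) and the `tokenize` function are assumptions made for this example; they are not the token set used in the course.

```python
import re

# Hypothetical token classes, tried in order; "==" must come before "=".
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("EQ", r"=="),
    ("ASSIGN", r"="),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
KEYWORDS = {"if"}
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Turn a character stream into a list of (kind, text) tokens."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {text!r}")
        if kind == "ID" and text in KEYWORDS:
            kind = text.upper()  # keywords get their own token kind
        tokens.append((kind, text))
    return tokens

tokens = tokenize("if (b==0) a = b;")
```

For the statement above, the token stream starts with `("IF", "if")`, `("LPAREN", "(")`, `("ID", "b")`, `("EQ", "==")`; the parser then consumes these tokens rather than raw characters.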
Why Separate the Front and Back Ends? • Recall: An interface between HLLs and architectures • Option: X*Y compilers or X front ends + Y back ends • [Diagram, option 1: hello.c, hello.cc, hello.f, hello.ada → one compiler per source/target pair → Hello x86, Hello alpha, Hello sparc] • [Diagram, option 2: hello.c, hello.cc, hello.f, hello.ada → X front ends → IR → Y back ends → Hello x86, Hello alpha, Hello sparc]
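The arithmetic behind the X*Y versus X+Y trade-off, using the four source languages and three targets named in the diagram (the variable names are made up for this sketch):

```python
# Languages and targets taken from the slide's diagram.
languages = ["C", "C++", "Fortran", "Ada"]
targets = ["x86", "Alpha", "SPARC"]

# One monolithic compiler for every (language, target) pair.
monolithic_compilers = len(languages) * len(targets)

# One front end per language plus one back end per target, joined by a shared IR.
shared_ir_components = len(languages) + len(targets)

print(monolithic_compilers, shared_ir_components)
```

With 4 languages and 3 targets this is 12 compilers versus 7 components, and adding a fifth language costs one new front end rather than three new compilers.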
Internal Compiler Structure – Front End • Series of filter passes • Source program – Written in a HLL • Lexical analysis – Convert the character stream into “tokens” (keywords, identifiers, operators) • Parser – Forms a syntax “tree” (statements, expressions, etc.) • Semantic analysis – Type checking, etc. • Intermediate code generator – Three-address code, interface for back end • [Diagram: Source Program → Lexical Analyzer → Token Stream → Parser → Syntax Tree → Semantic Analyzer → Syntax Tree → Intermediate Code Gen → IR]
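To make "three-address code" concrete, the sketch below lowers an arithmetic expression into instructions with at most one operator each. It is a rough illustration under stated assumptions: Python's `ast` module stands in for the lexer and parser, and the function name `to_three_address` and the temporaries `t1`, `t2`, ... are hypothetical naming choices.

```python
import ast
import itertools

OP_SYMBOLS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def to_three_address(expression):
    """Return (instructions, result_name) for an arithmetic expression."""
    code = []
    temps = (f"t{i}" for i in itertools.count(1))  # fresh temporary names

    def visit(node):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp):
            left = visit(node.left)    # operands are lowered first
            right = visit(node.right)
            temp = next(temps)
            code.append(f"{temp} = {left} {OP_SYMBOLS[type(node.op)]} {right}")
            return temp
        raise NotImplementedError(type(node).__name__)

    result = visit(ast.parse(expression, mode="eval").body)
    return code, result

code, result = to_three_address("a + b * (c - 2)")
```

For `a + b * (c - 2)` this yields `t1 = c - 2`, `t2 = b * t1`, `t3 = a + t2`, with the value left in `t3`. Each instruction names at most three addresses, which is what makes this form a convenient interface to the back end.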
Internal Compiler Structure – Back End • Code optimization – “improves” the intermediate code (most time is spent here) • Consists of machine independent & dependent optimizations • Code generation • Register allocation, instruction selection • [Diagram: IR → Code Optimizer → IR → Code Generator → Target program]
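One example of a machine-independent optimization is constant folding with constant propagation. The sketch below is a simplified illustration, not the course's optimizer: the tuple format `(dest, op, left, right)` and the function name `fold_constants` are assumptions, and it only handles straight-line code in which folded names are compiler temporaries.

```python
import operator

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(code):
    """Fold and propagate constants in straight-line three-address code.

    Each instruction is (dest, op, left, right); operands are ints or names.
    Returns the remaining instructions and the names found to be constant.
    """
    known = {}       # names whose value is known at compile time
    optimized = []
    for dest, op, left, right in code:
        left = known.get(left, left)      # substitute known constants
        right = known.get(right, right)
        if isinstance(left, int) and isinstance(right, int) and op in FOLDABLE:
            known[dest] = FOLDABLE[op](left, right)  # computed now, not at run time
        else:
            known.pop(dest, None)         # dest no longer holds a known constant
            optimized.append((dest, op, left, right))
    return optimized, known

code = [
    ("t1", "*", 4, 8),
    ("t2", "+", "x", "t1"),
    ("t3", "-", "t1", 2),
    ("t4", "*", "t2", "t3"),
]
optimized, constants = fold_constants(code)
```

Here four instructions shrink to two, `t2 = x + 32` and `t4 = t2 * 30`, because `t1` and `t3` are computed at compile time. The pass never looks at the target machine, which is why it belongs with the machine-independent optimizations.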
Traditional Compiler Infrastructures • GNU GCC • Targets: everything (pretty much) • Strength: Targets everything (pretty much) • Weakness: Not as extensible as research infrastructures, poor optimization • Stanford SUIF Compiler with Harvard MachSUIF • Targets: Alpha, x86, IPF, C • Strength: high level analysis, parallelization on scientific codes • Intel Open Research Compiler (ORC) • Targets: IPF • Strength: robust with OK code quality • Weakness: Many IR levels
Modern Compiler Infrastructures • IBM Jikes RVM • Targets Linux, x86 or AIX • Strengths: Open-source, wide user base • Weaknesses: In maintenance mode • Microsoft Phoenix • Targets Windows • Strengths: Actively developed • Weaknesses: Closed source, extensive API
What will we learn in this course? • Structure of Compilers • The Front End • The Back End • Advanced Topics • Just-in-time compilation • Dynamic optimization • Power and size optimizations
Required Work • Two Non-Cumulative Exams (15% each) • February 21 and April 10 • Homework (30%) • About 4 assignments • Some are pencil/paper; some are implementation-based • Semester Project (40%) • Groups of 2 • Staged submission • proposal 30% • report 55% • presentation 15% • Late Policy • 2 ^ (days late) points off (where days late > 0)
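The late policy grows exponentially, which a short calculation makes concrete. This is a sketch that assumes whole days; the function name `late_penalty` and the dictionary keys are made up for illustration.

```python
def late_penalty(days_late):
    """Points deducted under the 2 ^ (days late) rule; nothing if on time."""
    return 2 ** days_late if days_late > 0 else 0

# Course weights from the slide, as percentages of the final grade.
weights = {"exam 1": 15, "exam 2": 15, "homework": 30, "project": 40}

penalties = [late_penalty(days) for days in range(6)]
```

One day late costs 2 points, three days cost 8, and five days cost 32. The listed weights add up to 100%.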
Course Materials • We will use the Dragon book (Aho et al., Compilers: Principles, Techniques, and Tools) • Course Website • www.cs.virginia.edu/kim/courses/cs671/ • Lecture Slides • On the course website • I will try to post them before class • Attending class is in your best interest • Other Helpful Books
Next Time… • Read Dragon Chapter 1 • We will begin discussing lexical analysis • Look out for HW1