CS5214
Design
of Optimising Compilers
(Year 2008
Mounting)
Associate Professor Wong Weng Fai
National University of Singapore
Office:
COM1-03-33
Tel: 6516-6902
Email: wongwf at comp.nus.edu.sg
Course
website:
NUS
IVLE CS5214 site
1. Introduction
The
primary goal of this course is to provide an in-depth study of code
optimization techniques used in compilers for state-of-the-art processors. The performance gap between optimized and
unoptimized code continues to widen as modern processors evolve. The evolution includes hardware features such
as pipelined functional units other forms of instruction-level parallelism, and
sophisticated memory hierarchies for exploiting data locality. These hardware features are aimed at yielding
high performance, and are critically dependent on the code being optimized
appropriately. Notably, the emerging
explicitly parallel instruction computing (EPIC) processors such as Intel IA64
and the very recent IBM Cell processors are significantly dependent on a range
of aggressive program optimizations to yield performance. Rather than burdening the programmer with
this additional complexity during code development, modern compilers offer
automatic support for code optimization.
The course
is self-contained and begins with a detailed view of the design of optimizing
compilers. The main focus of the course
will be on instruction level parallelism.
In the later part of the course, thread
level parallelism issues – particular important for today’s multicore
processors - will also be dealt with. In each case, the rationale guiding the
important decisions underlying the design will be discussed. A typical optimizing compiler contains a
sequence of restructuring and code optimization techniques, which will be
covered during the classroom lectures; issues related to interactions among
individual optimizations will also be discussed.
A
significant aspect of this course is the ability to learn industry-strength
compiler optimization technology via on-line demonstrations and a hands-on
laboratory session centered around the Trimaran (www.trimaran.org) system. Students will be encouraged to undertake
projects spanning the course of the semester involving the design,
implementation and validation of such optimizations.
The course
will also discuss the impact of the source language and the target architecture
on the effectiveness of optimizations.
The optimizations covered are also relevant for RISC processors, such as
the IBM RS/6000, Power PC, DEC Alpha, Sun Sparc, HP PA-RISC, and MIPS, and
third-generation programming languages such as Fortran and C. Extensions for optimizing object-oriented languages
such as C ++ will be sketched.
The course
is relevant to students aiming to develop skills leading to careers as systems
programmers and analysts, scientific programmers, computer designers and
architects, technical managers, mathematics and computer science teachers, or
anyone facing the prospect of building, modifying, maintaining, writing or
teaching about compliers. At the
conclusion of the course, the students should be knowledgeable about
optimization techniques used in modern compilers, from the viewpoint of the
compiler user as well as of the compiler designer.
Note: The
IVLE CS5214 site will eventually be the main site of this course – especially after
the close of class registration.
2. Prerequisites
It will be
assumed that students have undergone undergraduate level courses in computer
architecture, data structures, assembly programming, and are familiar with at
least one high-level programming language, preferably C or better still C++. A
background in compilers will be of great use but not an absolute necessity.
3. Venue and Time
Lectures
will be held every Tuesdays at 6.30-8.30pm in COM1/213. The first class will be
held on Tuesday, January 15, 2008.
The
examination will be on Tuesday April 29, 2008 at 7.30pm. The venue is to be
confirmed later.
4. Recommended
There is
no suitable textbook yet. We will rely mainly on the lecture notes. Below are
some recommended texts:
a)
Steven Muchnik. Advanced
Compiler Design and Implementation.
Morgan Kaufmann Publishers, 1997.
b)
Randy Allen, Ken Kennedy. Optimizing Compilers for Modern Architectures: A Dependence-based
Approach. Morgan Kaufmann Publishers, 2001.
c)
Michael Wolfe. High
Performance Compilers for Parallel Computing. Addison-Wesley, 1996.
In addition,
we will be reading a number of landmark papers. These will be placed at the
appropriate website to be announced later.
5. Software
The
software used throughout the course will be the Trimaran compiler-simulator
infrastructure. (http://www.trimaran.org)
A DVD containing a VMware virtual machine with Trimaran installed will be
provided.
6. Assessment
Assessment
for this course will consist of three components.
a)
Programming Assignments (30%) There
will be three or so programming exercises using Trimaran. These are to be done
on an individual basis.
b)
Paper Presentation (20%) The class
will be divided into groups. Each group will give a half hour presentation of
an assigned paper. After the presentation, each member of the group will be
quizzed individually on the content of the paper. The group size will depend on
the class size. If the class is small, a group could very well be a single
person.
c)
Final Examination (50%) There
will be an open-book final examination.
7. Brief Outline of
Course
Topic 1 - Structure of optimizing compilers
Front-end, intermediate languages, optimization phases, code
generation, linker, runtime libraries. Dictionary, abstract, syntax tree,
quadruples, expression trees, linearized expressions, graph based intermediate
representations. Execution frequencies and profiling, completion time of
schedules, initiation interval of pipelined loops, spill costs and register
pressure, amortized memory access costs, regions and compiling by regions,
traces, superblocks and hyperblocks.
Topic 2 – The Program Dependence Graph
Structured vs. unstructured control flow graphs, acyclic vs.
cyclic control flow graphs, reducible vs. irreducible control flow graphs,
dominators, postdominators. Program dependence graphs, computing control
dependence, data dependence tests, direction vectors, distance vectors
Topic 3 - A parametric EPIC processor
HPL-PD, predicated execution, data- and control speculation,
rotating registers, run-time memory disambiguation
Topic 4 - Trimaran
Trimaran and its graph-based intermediate representation,
optimization modules, adding new modules, performance monitoring
Topic 5 - Instruction scheduling
Basic blocks and list scheduling, priority and rank
functions, global scheduling including hyperblock scheduling. Speculation,
modulo-scheduling and software pipelining. Scheduling for multi-cluster VLIW.
Topic 6 - Register allocation
Live ranges, interference graphs, graph coloring, local and
global allocation, register spills, priority functions. Register-allocation for
predicated machines, allocating rotating registers, region-based global
register allocation.
Topic 7 - Global data flow analysis
Problem formulation, data flow equations, forward vs.
backward data flow analysis problems. Static single assignment (SSA) form,
predicate sensitive analysis, constant propagation, value numbering
Topic 8 – Parallelism enhancing optimizations
Theory of loop optimization. Loop distribution, fusion,
interchange, unrolling. SIMDization.
Topic 9 - Interprocedural analysis and optimization
Interprocedural data flow analysis, constant propagation,
alias analysis, inlining, cloning
Topic 10 –
Thread-level parallelism
Compiling for multicore processors, thread-level parallelism
discovery