CS5214

Design of Optimising Compilers

 

(Year 2008 Mounting)

 

Associate Professor Wong Weng Fai

Dept. of Computer Science

National University of Singapore

Office: COM1-03-33

Tel: 6516-6902

Email: wongwf at comp.nus.edu.sg

Course website:

NUS IVLE CS5214 site

 

 

1. Introduction

The primary goal of this course is to provide an in-depth study of code optimization techniques used in compilers for state-of-the-art processors.  The performance gap between optimized and unoptimized code continues to widen as modern processors evolve.  The evolution includes hardware features such as pipelined functional units other forms of instruction-level parallelism, and sophisticated memory hierarchies for exploiting data locality.  These hardware features are aimed at yielding high performance, and are critically dependent on the code being optimized appropriately.  Notably, the emerging explicitly parallel instruction computing (EPIC) processors such as Intel IA64 and the very recent IBM Cell processors are significantly dependent on a range of aggressive program optimizations to yield performance.  Rather than burdening the programmer with this additional complexity during code development, modern compilers offer automatic support for code optimization.

 

The course is self-contained and begins with a detailed view of the design of optimizing compilers.  The main focus of the course will be on instruction level parallelism. In the later part of the course, thread level parallelism issues – particular important for today’s multicore processors - will also be dealt with. In each case, the rationale guiding the important decisions underlying the design will be discussed.  A typical optimizing compiler contains a sequence of restructuring and code optimization techniques, which will be covered during the classroom lectures; issues related to interactions among individual optimizations will also be discussed.

 

A significant aspect of this course is the ability to learn industry-strength compiler optimization technology via on-line demonstrations and a hands-on laboratory session centered around the Trimaran (www.trimaran.org) system.  Students will be encouraged to undertake projects spanning the course of the semester involving the design, implementation and validation of such optimizations.

 

The course will also discuss the impact of the source language and the target architecture on the effectiveness of optimizations.  The optimizations covered are also relevant for RISC processors, such as the IBM RS/6000, Power PC, DEC Alpha, Sun Sparc, HP PA-RISC, and MIPS, and third-generation programming languages such as Fortran and C.  Extensions for optimizing object-oriented languages such as C ++ will be sketched.

 

The course is relevant to students aiming to develop skills leading to careers as systems programmers and analysts, scientific programmers, computer designers and architects, technical managers, mathematics and computer science teachers, or anyone facing the prospect of building, modifying, maintaining, writing or teaching about compliers.  At the conclusion of the course, the students should be knowledgeable about optimization techniques used in modern compilers, from the viewpoint of the compiler user as well as of the compiler designer.

 

Note: The IVLE CS5214 site will eventually be the main site of this course – especially after the close of class registration.

 

2. Prerequisites

It will be assumed that students have undergone undergraduate level courses in computer architecture, data structures, assembly programming, and are familiar with at least one high-level programming language, preferably C or better still C++. A background in compilers will be of great use but not an absolute necessity.

 

3. Venue and Time

Lectures will be held every Tuesdays at 6.30-8.30pm in COM1/213. The first class will be held on Tuesday, January 15, 2008.

 

The examination will be on Tuesday April 29, 2008 at 7.30pm. The venue is to be confirmed later.

 

4. Recommended Readings

There is no suitable textbook yet. We will rely mainly on the lecture notes. Below are some recommended texts:

 

a)    Steven Muchnik. Advanced Compiler Design and Implementation.  Morgan Kaufmann Publishers, 1997.

 

b)    Randy Allen, Ken Kennedy. Optimizing Compilers for Modern Architectures: A Dependence-based Approach. Morgan Kaufmann Publishers, 2001.

 

c)    Michael Wolfe. High Performance Compilers for Parallel Computing. Addison-Wesley, 1996.

 

In addition, we will be reading a number of landmark papers. These will be placed at the appropriate website to be announced later.

 

5. Software

The software used throughout the course will be the Trimaran compiler-simulator infrastructure. (http://www.trimaran.org) A DVD containing a VMware virtual machine with Trimaran installed will be provided.

 

6. Assessment

Assessment for this course will consist of three components.

a)    Programming Assignments (30%) There will be three or so programming exercises using Trimaran. These are to be done on an individual basis.

 

b)    Paper Presentation (20%) The class will be divided into groups. Each group will give a half hour presentation of an assigned paper. After the presentation, each member of the group will be quizzed individually on the content of the paper. The group size will depend on the class size. If the class is small, a group could very well be a single person.

 

c)    Final Examination (50%) There will be an open-book final examination.

 

7. Brief Outline of Course

 

Topic 1 - Structure of optimizing compilers

Front-end, intermediate languages, optimization phases, code generation, linker, runtime libraries. Dictionary, abstract, syntax tree, quadruples, expression trees, linearized expressions, graph based intermediate representations. Execution frequencies and profiling, completion time of schedules, initiation interval of pipelined loops, spill costs and register pressure, amortized memory access costs, regions and compiling by regions, traces, superblocks and hyperblocks.

 

Topic 2 – The Program Dependence Graph

Structured vs. unstructured control flow graphs, acyclic vs. cyclic control flow graphs, reducible vs. irreducible control flow graphs, dominators, postdominators. Program dependence graphs, computing control dependence, data dependence tests, direction vectors, distance vectors

 

Topic 3 - A parametric EPIC processor

HPL-PD, predicated execution, data- and control speculation, rotating registers, run-time memory disambiguation

 

Topic 4 - Trimaran

Trimaran and its graph-based intermediate representation, optimization modules, adding new modules, performance monitoring

 

Topic 5 - Instruction scheduling

Basic blocks and list scheduling, priority and rank functions, global scheduling including hyperblock scheduling. Speculation, modulo-scheduling and software pipelining. Scheduling for multi-cluster VLIW.

 

Topic 6 - Register allocation

Live ranges, interference graphs, graph coloring, local and global allocation, register spills, priority functions. Register-allocation for predicated machines, allocating rotating registers, region-based global register allocation.

 

Topic 7 - Global data flow analysis

Problem formulation, data flow equations, forward vs. backward data flow analysis problems. Static single assignment (SSA) form, predicate sensitive analysis, constant propagation, value numbering

 

Topic 8 – Parallelism enhancing optimizations

Theory of loop optimization. Loop distribution, fusion, interchange, unrolling. SIMDization.

 

Topic 9 - Interprocedural analysis and optimization

Interprocedural data flow analysis, constant propagation, alias analysis, inlining, cloning

 

Topic 10 – Thread-level parallelism

Compiling for multicore processors, thread-level parallelism discovery