High-level programming languages like Python make programming a breeze, but how do they work? There's a big gap between Python and machine instructions for modern computers. Learn how to translate Python programs all the way to Intel x86 assembly language.
Most compiler courses teach one phase of the compiler at a time, such as parsing, semantic analysis, and register allocation. The problem with that approach is it is difficult to understand how the whole compiler fits together and why each phase is designed the way it is. Instead, each week we implement a successively larger subset of the Python language. The very first subset is a tiny language of arithmetic statements, and by the time we are done the language includes objects, inheritance, and first-class functions.
Prerequisites: Fluency in at least one programming language (Java, C, C++, Python, etc.). Students will do a lot of programming in Python, but prior knowledge of Python is not required. The course will start with a crash course on Python and Python is one of the easiest languages to learn. Prior knowledge of an assembly language helps, but is not required.
Distance Learning: This course is also offered through the CAETE program.
Time: MWF 11:00am-11:50am
Location: ECCS 1B12
Office hours: Thursday 1:00pm-2:00pm
Textbook: None.
Course notes: pdf (Will be updated periodically.)
Recommended books:
- Modern Compiler Implementation in Java 2nd edition by Andrew W. Appel (The book is also available in C and ML.)
- Compilers: Principles, Techniques, and Tools 2nd edition by Aho, Lam, Sethi, and Ullman
- Python in a Nutshell by Alex Martelli
Grading: homework 10%, quizzes 30%, final project 10%, midterm exam 20%, final exam 30%.
Workload: up to 12 hours of out-of-class work per week.
Policies regarding honor code, religious observances, disabilities, and classroom behavior can be found here.
Quizzes for distance students (CAETE): The questions for quiz will be emailed to you on the morning of the posted date of the quiz and you will email the answers back to Jeremy by midnight of the same day. The quizzes are closed book and should be completed alone, just use your brain and only use the computer to type in your answers. For example, do not use the Python interpreter on your computer to help you answer the quiz.
Assignments:
- Integers and variables, due 9/4.
- Parsing with PLY (Python Lex-Yacc), due 9/11.
- Register allocation, due 9/25.
- Polymorphism and heap allocation, due 10/9.
- Functions, due 10/23.
- Objects and classes, due 11/6.
- Final project proposal due 11/13, instructions.
- Final project, first draft of write-up due 11/30.
- Final project, slides due 12/4.
Presentation schedule for the week of 12/7:- Monday, 11:00-11:20, Thomas Nelson and co.
- Monday, 11:20-11:40, Zainab Aljaroudi and co.
- Monday, 11:40-11:55, Daniel Stutzman
- Wednesday, 11:00-11:20, Carlos Tafoya and co.
- Wednesday, 11:20-11:40, Sri Teja Basava and co.
- Wednesday, 11:40-11:55, Christopher Schwaab
- Friday, 11:00-11:20, David Hover and co.
- Friday, 11:20-11:40, Arlen Cox and co.
- Friday, 11:40-11:55, Krista Hasling
- Friday, 11:55-12:15, Paul Niewoonder and co.
- Final project, final write-up and compiler due 12/11.
Instructions for turning in assignments: Assignments are due by 6am of the due date. Upload your compiler using the web-form at the URL given in the email to the google group.
Quizzes (given in class):
Exams:
- Midterm: in class on Oct. 19, solution
- Final: Dec. 12, 4:30pm-7:00pm
Grades are posted here.
- Please send Jeremy a secret code word so that you can identify your grades.
- Assignments are graded on a 4 point scale (0-4).
- Quizes are graded on a 10 point scale.
- Exams are graded on a 100 point scale.
- benchmark programs
- Garbage collector
- Uniprocessor Garbage Collection Techniques by Wilson.
- Chapter 13 of Modern Compiler Implementation in Java by Appel and Palsberg.
- Don't stop the BiBOP by Dybvig, Eby, and Bruggeman.
- Procedure inlining
- Fast and effective procedure inlining by Waddell and Dybvig.
- Type specialization
- Assignments 4 and 5 (a) from ECEN 4553 Fall 2008
- Chapter 17, Dataflow Analysis, of Modern Compiler Implementation in Java by Appel and Palsberg.
- Customization: optimizing compiler technology for SELF by Chambers and Ungar.
- Iterative type analysis and extended message splitting; optimizing dynamically-typed object-oriented programs. By Chambers and Ungar.
- A unified approach to global program optimization. By Kildall.
- Instruction Selection, Instruction Scheduling
- Chapter 9 and 20 of Modern Compiler Implementation in Java by Appel and Palsberg.
- Static Single-Assignment Form (SSA), Dead-Code Elimination, Constant Propagation,
- Assignment 5(b) from ECEN 4553 Fall 2008
- Chapter 19 of Modern Compiler Implementation in Java by Appel and Palsberg.
- Detecting equality of variables in programs. By Alpern, Wegman, and Zadeck.
- Single-pass generation of static single-assignment form for structured languages. By Brandis and Mossenbock.
- Register allocation
- Live Range Splitting in a Graph Coloring Register Allocator by Cooper and Simpson.
- The Priority-Based Coloring Approach to Register Allocation by Chow and Hennessy.
- Code scheduling and register allocation in large basic blocks by Goodman and Hsu.
- Paper on live range splitting in register allocation.
- Runtime C functions: runtime.h, runtime.c, hashtable.h, hashtable.c, hashtable_itr.h, hashtable_itr.c, hashtable_utility.h, hashtable_utility.c, hashtable_private.h.
- Video about pair programming.
- Simple diagram of x86 architecture pdf
- Script to automate regression testing (needs to be updated for Fall 2009): run_tests.py and colors.py
- Class email list: compiler-construction-fall-2009@googlegroups.com, google group web page
- Examples from Python crash course.
- Printing floating point numbers in C, mimicking Python
- Python Lex-Yacc (PLY)
- IA32 Assembly for Compiler Writers
- Debugger for Python: PyDev is a plug-in for Eclipse
- Intel Manuals
- Nice slides about x86 architecture and assembly language
- Introduction to Linux Intel Assembly Language
- Linux Assembly Hello World Tutorial
- GNU Assembler Manual
- Insight Debugger
- Debugging with DDD
- Hash table implementation in C.
- Paper: Garbage Collection in an Uncooperative Environment by Hans-J. Boehm and Mark Weiser pdf
- The Python Bytecode Interpreter: ceval.c, opcode.h
- Python AST definitions: ast.py
Final project ideas and resources:
Resources: