Compiler Construction CS 606 Sohail Aslam Lecture 1
2 Course Organization General course information Homework and project information
3 General Information InstructorSohail Aslam Lectures45 TextCompilers – Principles, Techniques and Tools by Aho, Sethi and Ullman
4 Work Distribution Theory homeworks = 20% exams = 40% Practice build a compiler = 40%
5 ProjectProject Implementation language: subset of java Generated code: Intel x86 assembly Implementation language: C++ Six programming assignments
6 Why Take this Course Reason #1: understand compilers and languages understand the code structure understand language semantics understand relation between source code and generated machine code become a better programmer
7 Why Take this Course Reason #2: nice balance of theory and practice Theory mathematical models: regular expressions, automata, grammars, graphs algorithms that use these models
8 Why Take this Course Reason #2: nice balance of theory and practice Practice Apply theoretical notions to build a real compiler
9 Why Take this Course Reason #3: programming experience write a large program which manipulates complex data structures learn more about C++ and Intel x86
10 What are Compilers Translate information from one representation to another Usually information = program
11 ExamplesExamples Typical Compilers: VC, VC++, GCC, JavaC FORTRAN, Pascal, VB(?) Translators Word to PDF PDF to Postscript
12 In This Course We will study typical compilation: from programs written in high- level languages to low-level object code and machine code
13 Typical Compilation High-level source code Compiler Low-level machine code
14 Source Code int expr( int n ) { int d; d = 4*n*n*(n+1)*(n+1); return d; }
15 Source Code Optimized for human readability Matches human notions of grammar Uses named constructs such as variables and procedures
16 Assembly Code. globl _expr _expr: pushl %ebp movl %esp,%ebp subl $24,%esp movl 8(%ebp),%eax movl %eax,%edx leal 0(,%edx,4),%eax movl %eax,%edx imull 8(%ebp),%edx movl 8(%ebp),%eax incl %eax imull %eax,%edx movl 8(%ebp),%eax incl %eax imull %eax,%edx movl %edx,-4(%ebp) movl -4(%ebp),%edx movl %edx,%eax jmp L2.align 4 L2: leave ret
17 Assembly Code Optimized for hardware Consists of machine instructions Uses registers and unnamed memory locations Much harder to understand by humans
18 How to Translate Correctness: the generated machine code must execute precisely the same computation as the source code
19 How to Translate Is there a unique translation? No! Is there an algorithm for an “ideal translation”? No!
20 How to Translate Translation is a complex process source language and generated code are very different Need to structure the translation