Published by Ruth Rodgers. Modified over 9 years ago.
1
1 CST 320 COMPILER METHODS
2
2 Week 1 Introduction Go over syllabus Grammar Review Compiler Overview Preprocessor Symbol Table Preprocessor Directives Adding a lexical analyzer
3
3 Instructor Sherry Yang sherry.yang@oit.edu or csetyang@gmail.com Wilsonville Room 213 Office Hours: Mon/Thurs 4-6 or by appointment Class webpage: http://www.oit.edu/faculty/sherry.yang/CST320
4
4 Instructor Background Professor of Software Engineering Technology Department of Computer Systems Engineering Technology Ph.D. in Computer Science Senior Software Engineer Application Software Engineer Klamath Falls
5
5 Getting to Know Each Other Pair up with one other person. Find out a little more about the person. Name Year in program Something interesting about the person Any previous compiler experience Introduce the person to the class.
6
6 Course Description This course is designed to introduce the basic concepts of compiler design and operation. Topics include lexical and syntactical analysis, parsing, translation, semantic processing and code generation. In addition, students will implement a small compiler. We might use other tools (Spirit, Pargen, etc.)
7
7 Evaluation Methods 2 Tests 40%, Homework & Labs 35%, Project 15%, Class Participation (including in-class exercises) 10%
8
8 Grading Your grade will be calculated as follows:* 90%+ = A, 80%+ = B, 70%+ = C, 60%+ = D, 59% or below = F. * Class participation will be considered in evaluating "borderline" grades. † You must turn in ALL of the labs and complete the project to pass the course with a C or better. An Incomplete will be given if you fail to turn in all labs and the project.
9
9 Textbook Text: Cooper, Keith D. & Linda Torczon, Engineering A Compiler, 2nd edition, Morgan Kaufmann, 2012. References: Parsons, Introduction to Compiler Construction Aho, Sethi, and Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1986. Fischer and LeBlanc, Crafting a Compiler with C, Benjamin Cummings, 1991.
10
10 Student Responsibilities Lecture and Lab Attendance: Students are expected to attend all class sessions. If you know you will be absent on a certain day, please inform the instructor in advance so arrangements can be made to provide you with the materials covered. Please make every effort to attend all class sessions. There will be no make up in-class exercises. Lab sessions will be used as help sessions and to check off lab assignments.
11
11 Student Responsibilities Tests: All tests are open book, open notes. No electronic devices are allowed. There will be no make-up tests unless there is an emergency. If you miss a test for any reason, you can do an additional project to make it up. In case of emergency, please contact the Student Affairs office. They will inform all of your instructors.
12
12 Student Responsibilities Academic Dishonesty: No plagiarism or cheating is allowed in this class. Please refer to your student handbook regarding policies on academic dishonesty. A copy of the policy is posted on the class webpage. It is okay to get help on your assignments. Please acknowledge all sources of help, including them in the program documentation as appropriate.
13
13 Student Responsibilities Homework & Labs: All labs are due via email by midnight on the due date. You must follow the assignment submission guidelines below. All labs must be checked off by the instructor. There will be a check-off list posted for each lab.
14
14 Lab Submission Guidelines All labs are due via email by midnight on the due date. The instructor will send out an email upon receiving your lab. If you do not receive an email within 24 hours of submitting the lab, it is YOUR responsibility to contact the instructor by email or phone. If you do not contact the instructor within 48 hours after the due date, the lab is considered late. There will be a 20% penalty per week for late labs. All labs, project and late labs must be turned in by Wednesday of Finals week to be graded.
15
15 Lab Submission Guidelines 1. Zip up all files required to build the lab. 2. Include a “Readme” file as appropriate. 3. The archive should also include any other deliverables as called out in the assignment write-up. 4. Attach the archive to an email with the subject line: CST320 Lab #x - first name last name. Email the archive to csetyang@gmail.com
16
16 Any student with a disability who anticipates a need for accommodation in this course is encouraged to talk with the instructor about his/her needs as soon as possible.
17
17 Grammar Review Three main concepts Language Machine Grammar Regular vs. Context-Free Languages Notation for describing languages Regular Expression Context-Free grammar Recognizers Finite automata Pushdown Automata
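As a reminder of what a recognizer looks like in code, here is a minimal sketch of a finite automaton written in C++. The language (binary strings with an even number of 1s) and the names State, step, and acceptsEvenOnes are chosen only for illustration; they do not come from the slides or the exercises.

```cpp
#include <string>

// States of a two-state DFA over the alphabet {0, 1}.
enum class State { Even, Odd };

// Transition function: reading '1' flips the parity, reading '0' keeps the state.
State step(State s, char c) {
    if (c == '1') return s == State::Even ? State::Odd : State::Even;
    return s;
}

// Returns true if the DFA accepts the input, i.e. it contains an even number of 1s.
// Any character outside {0, 1} causes the input to be rejected.
bool acceptsEvenOnes(const std::string& input) {
    State s = State::Even;  // start state, also the only accepting state
    for (char c : input) {
        if (c != '0' && c != '1') return false;
        s = step(s, c);
    }
    return s == State::Even;
}
```

Calling acceptsEvenOnes("0110") walks the states Even, Even, Odd, Even, Even and accepts; the same loop-plus-transition-function shape carries over to any DFA.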
18
18 In-Class Exercise#1 Given ∑={0, 1} L1 = { wv | w, v ∈ ∑* and v = 00}. Define a regular expression to describe L1.
19
19 In-Class Exercise#1 Given ∑={0, 1} L2 = {w| w ∈ ∑* and w contains 3 consecutive 0’s}. Define a deterministic finite automaton (DFA) to recognize this language.
20
20 In-Class Exercise#1 Given ∑={0, 1} Lp = {ww^R | w ∈ ∑*}, where w^R is the reverse of w. Define a context-free grammar for Lp.
21
21 In-Class Exercise#1 Given ∑={0, 1} Lp = {ww^R | w ∈ ∑*}, where w^R is the reverse of w. Define a context-free grammar for Lp. Is Lp regular?
22
22 In-Class Exercise#1 Find the regular expressions for the following automata. Is this a deterministic finite automaton?
23
23 In-Class Exercise#1 Remove lambda productions from the following grammar: S -> ABc, A -> aaA, A -> λ, B -> Bb, B -> λ
24
24 Conventional Translator source program -> preprocessor -> modified source program -> compiler -> target assembly program -> assembler -> relocatable machine code -> loader / linker (+ library, relocatable object files) -> absolute machine code
25
25 Compilers Source Program -> Lexical Analyzer (scanner) -> Tokens -> Parser -> Parse Tree -> Semantic Analysis -> Intermediate Representation -> Optimizer -> Code Generator -> Target code; Symbol Table alongside the phases. Lexical Analyzer: uses Regular Expressions to define tokens; is a Finite Automaton; structure of tokens is Regular. Parser: uses a Context-Free Grammar to define program structures; is a Pushdown Automaton; structure of program is Context-Free.
26
26 Why study compilers? Ties lots of things you know together: Theory (finite automata, grammars) Data structures Modularization Utilization of software tools You might build a parser. The theory of computation/formal language still applies today. As long as we still program with 1-D text. Helps you to be a better programmer
27
27 One-dimensional Text int x; cin >> x; if(x>5) cout << "Hello"; else cout << "BOO"; int x;cin >> x;if(x>5) cout << "Hello"; else … The formatting has no impact on the meaning of the program.
28
28 What is a translator? Takes input (SOURCE) and produces output (TARGET): SOURCE -> TARGET (or ERROR)
29
29 Conventional Translator skeletal source program -> preprocessor -> source program -> compiler -> target assembly program -> assembler -> relocatable machine code -> loader / linker (+ library, relocatable object files) -> absolute machine code
30
30 Translator for Java Java source code -> Java compiler -> Java bytecode; then either Java bytecode -> Java interpreter, or Java bytecode -> Bytecode compiler -> absolute machine code
31
31 Types of Translators Compilers Conventional (textual source code) Imperative, ALGOL-like languages Other paradigms Interpreters Macro processors Text formatters Silicon compilers
32
32 Types of Translators (cont.) Visual programming language Interface Database User interface Operating System
34
34 Structure of Compilers skeletal source program -> preprocessor -> Modified Source Program -> Lexical Analyzer (scanner) -> Tokens -> Syntax Analysis (Parser) -> Syntactic Structure -> Semantic Analysis -> Intermediate Representation -> Optimizer -> Code Generator -> Target machine code; Symbol Table alongside the phases
35
35 Symbol Table What is a symbol? Variable name Function name Type name Constant Class name Method name …. Any ID that you use in a program
36
36 Symbol Table Information about a symbol Name Type (int, double, char, string, etc.) Use (variable name, constant name, type name, function name, etc.) Value (e.g. value of a constant) Scope
37
37 Symbol table operations Insert a symbol into the symbol table Flag as error if symbol already exists in some cases Search for a symbol in the symbol table Delete a symbol from the symbol table
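The operations above can be sketched as a small scoped symbol table. This is only one possible design, assuming a stack of hash maps with one map per scope; the names SymbolInfo, SymbolTable, enterScope, and exitScope are hypothetical and are not taken from the labs.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Attributes stored for each symbol (a reduced, illustrative set).
struct SymbolInfo {
    std::string type;  // e.g. "int", "double", "string"
    std::string use;   // e.g. "variable", "constant", "function"
};

class SymbolTable {
public:
    SymbolTable() { enterScope(); }  // start with a global scope

    void enterScope() { scopes_.emplace_back(); }
    void exitScope() { if (scopes_.size() > 1) scopes_.pop_back(); }

    // Insert into the innermost scope. Returns false (an error to report)
    // if the name is already declared in that same scope.
    bool insert(const std::string& name, const SymbolInfo& info) {
        return scopes_.back().emplace(name, info).second;
    }

    // Search from the innermost scope outward. Returns nullptr if not found.
    const SymbolInfo* lookup(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return &found->second;
        }
        return nullptr;
    }

    // Delete a symbol from the innermost scope. Returns true if it existed.
    bool remove(const std::string& name) {
        return scopes_.back().erase(name) > 0;
    }

private:
    std::vector<std::unordered_map<std::string, SymbolInfo>> scopes_;
};
```

With this layout, declaring int x twice in one block makes the second insert return false, while a double x inside a nested block is accepted and shadows the outer x until exitScope is called.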
38
38 Symbol table examples w/ preprocessor #define MAX 50 #define SOMESYMBOL #undef SOMESYMBOL #define MIN 10 #define MAX 100
39
39 Code example
#define MAX 5
void main()
{
    int x;
    int y;
    x = MAX;
#define MAX 10
    y = MAX;
}
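A compilable variant of the same idea, as a small sketch: each use of MAX is replaced with whatever value the symbol holds at that point in the text. The function names firstUse and secondUse are made up for illustration, and the #undef is added because standard C and C++ compilers reject or warn about redefining a macro with a different value.

```cpp
#define MAX 5
int firstUse() { return MAX; }   // the preprocessor substitutes 5 here

#undef MAX                       // drop the old definition before redefining
#define MAX 10
int secondUse() { return MAX; }  // the preprocessor substitutes 10 here

#undef MAX                       // leave no macro behind for later code
```

firstUse() returns 5 and secondUse() returns 10: the preprocessor's symbol table is updated as directives are read, top to bottom, and scope braces play no role.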
40
40 Symbol table example w/ parser (lab 2)
void main()
{
    int x;
    string str1;
    int x;
    x = 3;
    y = 10;
    str1 = 30;
    {
        double x;
        x = 4.301;
    }
}
41
41 Preprocessor Remove all comments. If a language is not case sensitive, the preprocessor may change the program text to all uppercase or all lowercase. Process preprocessor directives. C/C++ directives: #include, #define (unlike C#’s #define, C/C++ can define a constant value), #if / #else / #endif, #undef, #ifdef, #ifndef. skeletal source program -> preprocessor -> source program
42
42 #include a.h: #include "b.h" #define MIN 10 … int x; if (x < MIN) … x = MAX; b.h: #define MAX 5
43
43 #ifdef #ifndef A_H #define A_H … #endif
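To see what the guard buys you, the sketch below pastes the same guarded header text twice into one file, which is what happens when a header is included twice. The contents (Point and sum) are hypothetical stand-ins for whatever the real header declares.

```cpp
// --- first copy of the header text ---
#ifndef A_H
#define A_H
struct Point { int x; int y; };
inline int sum(const Point& p) { return p.x + p.y; }
#endif

// --- second copy, as if the header were included again ---
#ifndef A_H   // A_H is already defined, so everything up to #endif is skipped
#define A_H
struct Point { int x; int y; };  // would be a redefinition error without the guard
inline int sum(const Point& p) { return p.x + p.y; }
#endif
```

The file compiles because the second copy is discarded by the preprocessor before the compiler proper ever sees it.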
44
44 #ifdef
#if DLEVEL == 0
#define STACK 0
#elif DLEVEL == 1
#define STACK 100
#elif DLEVEL > 5
display( debugptr );
#else
#define STACK 200
#endif
45
45 Standalone Preprocessor
input.cpp -> preprocessor -> temp.cpp (produces a modified source file)
input.cpp:
#define MAX 50 //this is a comment
void main()
{
    int x; //more comments
    x = MAX;
#define MIN 10
    int y;
    x = y - MIN; //blah
}
temp.cpp:
void main()
{
    int x;
    x = 50;
    int y;
    x = y - 10;
}
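A minimal sketch of such a standalone preprocessor follows. It is not the lab solution: it assumes only // comments, object-like #define NAME VALUE, and #undef NAME, and it assumes "//" never appears inside a string literal. The names MacroTable, substitute, and preprocess are illustrative.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <unordered_map>

using MacroTable = std::unordered_map<std::string, std::string>;

// Replace every identifier that names a macro with the macro's value.
std::string substitute(const std::string& line, const MacroTable& macros) {
    std::string out;
    std::size_t i = 0;
    while (i < line.size()) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        if (std::isalnum(c) || c == '_') {
            // Consume a whole run of letters, digits, and underscores.
            std::size_t start = i;
            while (i < line.size() &&
                   (std::isalnum(static_cast<unsigned char>(line[i])) || line[i] == '_')) {
                ++i;
            }
            std::string word = line.substr(start, i - start);
            auto it = macros.find(word);
            out += (it != macros.end()) ? it->second : word;
        } else {
            out += line[i++];
        }
    }
    return out;
}

// Strip comments, record #define / #undef, and expand symbols in other lines.
std::string preprocess(const std::string& source) {
    MacroTable macros;
    std::istringstream in(source);
    std::string out, line;
    while (std::getline(in, line)) {
        std::size_t comment = line.find("//");
        if (comment != std::string::npos) line.erase(comment);
        line.erase(line.find_last_not_of(" \t") + 1);  // trim trailing blanks

        std::istringstream words(line);
        std::string directive, name, value;
        words >> directive >> name;
        if (directive == "#define") {
            std::getline(words >> std::ws, value);  // rest of the line; may be empty
            macros[name] = value;
        } else if (directive == "#undef") {
            macros.erase(name);
        } else if (!line.empty()) {
            out += substitute(line, macros) + '\n';
        }
    }
    return out;
}
```

Feeding it the input.cpp text above yields the temp.cpp text: the directive lines and comments disappear, and MAX and MIN are replaced by 50 and 10. The macro table here is the preprocessor's own small symbol table.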
46
46 Standalone Lexical Analyzer
Input to the lexical analyzer:
void main()
{
    int x;
    x = 50;
    int y;
    x = y - 10;
}
Produces a list of tokens:
void keyword
main ID
( symbol
) symbol
{ symbol
int keyword
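A rough sketch of a lexical analyzer that produces value/type pairs like the ones listed above. The keyword set is a small assumed subset, every non-alphanumeric character is treated as a one-character symbol, and the names Token and tokenize are hypothetical rather than part of the lab handout.

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

struct Token {
    std::string value;
    std::string type;  // "keyword", "ID", "number", or "symbol"
};

std::vector<Token> tokenize(const std::string& src) {
    // Assumed keyword subset, for illustration only.
    static const std::set<std::string> keywords = {"void", "int", "if", "else"};
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }  // whitespace only separates tokens
        std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // Identifier or keyword: letters, digits, underscores.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            std::string word = src.substr(start, i - start);
            tokens.push_back({word, keywords.count(word) ? "keyword" : "ID"});
        } else if (std::isdigit(c)) {
            // Integer literal: a run of digits.
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            tokens.push_back({src.substr(start, i - start), "number"});
        } else {
            // Anything else is a single-character symbol.
            tokens.push_back({std::string(1, src[i++]), "symbol"});
        }
    }
    return tokens;
}
```

For the input void main() { int x; x = 50; } this produces 13 tokens, starting with void/keyword, main/ID, (/symbol. Note that under these rules a lexeme such as 45ab comes out as two tokens (number 45, then ID ab) rather than one illegal token; a real scanner has to choose a policy.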
47
47 Preprocessor & Lexical Analyzer together
Input:
#define MAX 50 //this is a comment
void main()
{
    int x; //more comments
    x = MAX;
#define MIN 10
    int y;
    x = y - MIN; //blah
}
Produces a list of tokens:
void keyword
main ID
( symbol
) symbol
{ symbol
int keyword
48
48 Output from Lab1
List of tokens: void keyword, main ID, ( symbol, ) symbol, { symbol, int keyword
Printout of tokens:
void keyword
main ID
( symbol
) symbol
{ symbol
int keyword
...
49
49 Preprocessor Preprocessor symbols Defined by #define #define MYHEADER_H #define LARGEST 10 Defined in the compilation process Command Line (/D) Preprocessor Definitions
50
50 In-Class Exercise #2
Show result of preprocessor. What’s left in the file? What’s changed in the file?
#include
//comment
#define LARGEST 100
void main()
{
    int x, y;
    x = 10;
    y = LARGEST;
#ifdef MYSYMBOL
    cout << "X=" << x;
#endif
#if TEST == 1
    cout << "1" << endl;
#elif TEST == 2
    cout << "2" << endl;
#else
    cout << "Blah" << endl;
#endif
    cout << "The end" << endl;
}
54
54 #include
#include "myfile.h" : assumes that myfile.h is in the current directory
#include "c:\\somedirectory\myfile.h" : absolute path
#include <array> : will look for array in the include folder in the program files folder
#include <sys/types.h> : the types.h file will be in the sys subdirectory of include
57
57 Structure of Compilers Source Program -> Lexical Analyzer (scanner) -> Tokens. Source program: int x; cin >> x; if(x>5) cout << "Hello"; else cout << "BOO"; Tokens: int x ; cin >> x ; if ( x > 5 ) cout << "Hello" ; else cout << "BOO" ; What about white spaces? Do they matter?
58
58 Tokenize First or as needed? int x; cin >> x; if(x>5) cout << "Hello"; else cout << "BOO"; Tokens: int datatype, x ID, ; symbol, cin >> Tokens = meaningful units in a program, stored as value/type pairs.
59
59 Tokenize First or as needed? Array<Array<int> > someArray; (closes with two separate > tokens) Array<Array<int>> someArray; (closes with a single >> token if tokenized first)
60
60 Structure of Compilers Source Program -> Lexical Analyzer (scanner) -> Tokens -> Syntax Analysis (Parser) -> Syntactic Structure (Parse Tree)
61
61 Parse Tree (Parser) [Figure: parse tree over the tokens int x ; cin >>, with datatype and ID leaves under a Data Declaration node, under the root Program]
62
62 Who is responsible for errors? int x$y; int 32xy; 45b 45ab x = x @ y; Lexical Errors / Token Errors?
63
63 Who is responsible for errors? X = ; Y = x +; Z = [; Syntax errors
64
64 Who is responsible for errors? 45ab One wrong token? Two tokens (45 & ab)? Are whitespaces needed? Either way is okay. Lexical analyzer can catch the illegal token (45ab) Parser can catch the syntax error. Most likely 45 followed by ab will not be syntactically correct.
65
65 Structure of Compilers Source Program -> Lexical Analyzer (scanner) -> Tokens -> Syntax Analysis (Parser) -> Syntactic Structure -> Semantic Analysis; Symbol Table alongside the phases. Example: int x; cin >> x; if(x>5) x = "SHERRY"; else cout << "BOO";
66
66 Structure of Compilers Source Program -> Lexical Analyzer (scanner) -> Tokens -> Syntax Analysis (Parser) -> Syntactic Structure -> Semantic Analysis -> Intermediate Representation -> Optimizer -> Code Generator -> Target machine code; Symbol Table alongside the phases
68
68 Translation Steps: Recognize when input is available. Break input into individual components. Merge individual pieces into meaningful structures. Process structures. Produce output.
69
69 Translation (Compilers) Steps: Break input into individual components.(lexical analysis) Merge individual pieces into meaningful structures. (parsing) Process structures. (semantic analysis) Produce output. (code generation)
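The steps above can be sketched end to end on a toy language. The example below assumes a source language of integer expressions with +, * and parentheses, and a made-up stack-machine target (PUSH n, ADD, MUL). Semantic analysis is omitted because this toy language has only one type. All names (lex, Compiler, compileExpression) are illustrative.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Lexical analysis: break the input into numbers and one-character symbols.
std::vector<std::string> lex(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isdigit(c)) {
            std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            tokens.push_back(src.substr(start, i - start));
        } else {
            tokens.push_back(std::string(1, src[i++]));
        }
    }
    return tokens;
}

// Parsing and code generation by recursive descent over the grammar
//   expr   -> term ('+' term)*
//   term   -> factor ('*' factor)*
//   factor -> NUMBER | '(' expr ')'
class Compiler {
public:
    explicit Compiler(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

    std::vector<std::string> compile() {
        expr();
        if (pos_ != tokens_.size()) throw std::runtime_error("unexpected token: " + tokens_[pos_]);
        return code_;
    }

private:
    std::string peek() const { return pos_ < tokens_.size() ? tokens_[pos_] : ""; }

    void expr() {
        term();
        while (peek() == "+") { ++pos_; term(); code_.push_back("ADD"); }
    }
    void term() {
        factor();
        while (peek() == "*") { ++pos_; factor(); code_.push_back("MUL"); }
    }
    void factor() {
        std::string t = peek();
        if (t == "(") {
            ++pos_;
            expr();
            if (peek() != ")") throw std::runtime_error("expected ')'");
            ++pos_;
        } else if (!t.empty() && std::isdigit(static_cast<unsigned char>(t[0]))) {
            code_.push_back("PUSH " + t);  // emit code as soon as the structure is known
            ++pos_;
        } else {
            throw std::runtime_error("syntax error");
        }
    }

    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
    std::vector<std::string> code_;
};

std::vector<std::string> compileExpression(const std::string& src) {
    return Compiler(lex(src)).compile();
}
```

compileExpression("2 + 3 * 4") yields PUSH 2, PUSH 3, PUSH 4, MUL, ADD. The output order falls out of the grammar, which is a small instance of syntax-directed translation.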
70
70 Compilers Two major tasks: Analysis of source Synthesis of target Syntax-directed translation Compilation process driven by syntactic structure of the source being translated
71
71 Interpreters Executes source program without explicitly translating to target code. Control and memory management reside in interpreter, not user program. Allow: Modification of program as it executes. Dynamic typing of variables Portability Huge overhead (time & space)
72
72 Structure of Interpreters Source Program + Data -> Interpreter -> Program Output
73
73 Misc. Compiler Discussions History of Modern Compilers Front and Back ends One pass vs. Multiple passes Compiler Construction Tools Compiler-Compilers, Compiler-generators, Translator-writing Systems Scanner generator Parser generator Syntax-directed engines Automatic code generator Dataflow engines