1 CST 320 COMPILER METHODS. 2 Week 1 Introduction Go over syllabus Grammar Review Compiler Overview Preprocessor Symbol Table Preprocessor Directives.


1 1 CST 320 COMPILER METHODS

2 2 Week 1 Introduction Go over syllabus Grammar Review Compiler Overview Preprocessor Symbol Table Preprocessor Directives Adding a lexical analyzer

3 3 Instructor Sherry Yang sherry.yang@oit.edu or csetyang@gmail.com Wilsonville Room 213 Office Hours: Mon/Thurs 4-6 or by appointment Class webpage: http://www.oit.edu/faculty/sherry.yang/CST320

4 4 Instructor Background Professor of Software Engineering Technology Department of Computer Systems Engineering Technology Ph.D. in Computer Science Senior Software Engineer Application Software Engineer Klamath Falls

5 5 Getting to Know Each Other Pair up with one other person. Find out a little more about the person. Name Year in program Something interesting about the person Any previous compiler experience Introduce the person to the class.

6 6 Course Description This course is designed to introduce the basic concepts of compiler design and operation. Topics include lexical and syntactical analysis, parsing, translation, semantic processing and code generation. In addition, students will implement a small compiler. We might use other tools (Spirit, Pargen, etc.)

7 7 Evaluation Methods 2 Tests: 40% Homework & Labs: 35% Project: 15% Class Participation (including in-class exercises): 10%

8 8 Grading Your grade will be calculated as follows:* 90%+ = A 80%+ = B 70%+ = C 60%+ = D 59%- = F * Class participation will be considered in evaluating "borderline" grades. † You must turn in ALL of the labs and complete the project to pass the course with a C or better. An Incomplete will be given if you fail to turn in all labs and the project.

9 9 Textbook Text: Cooper, Keith D. & Linda Torczon, Engineering A Compiler, 2nd edition, Morgan Kaufmann, 2012. References: Parsons, Introduction to Compiler Construction Aho, Sethi, and Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1986. Fischer and LeBlanc, Crafting a Compiler with C, Benjamin Cummings, 1991.

10 10 Student Responsibilities Lecture and Lab Attendance: Students are expected to attend all class sessions. If you know you will be absent on a certain day, please inform the instructor in advance so arrangements can be made to provide you with the materials covered. Please make every effort to attend all class sessions. There will be no make up in-class exercises. Lab sessions will be used as help sessions and to check off lab assignments.

11 11 Student Responsibilities Tests: All tests are open book, open notes. No electronic devices are allowed. There will be no make up tests unless there is an emergency. If you miss a test for any reason, you can do an additional project to make it up. In case of emergency, please contact the Student Affairs office. They will inform all of your instructors.

12 12 Student Responsibilities Academic Dishonesty: No plagiarism or cheating is allowed in this class. Please refer to your student handbook regarding policies on academic dishonesty. A copy of the policy is posted on the class webpage. It is okay to get help on your assignments. Please acknowledge all sources of help, including them in the program documentation as appropriate.

13 13 Student Responsibilities Homework & Labs: All labs are due via email by midnight on the due date. You must follow the assignment submission guidelines below. All labs must be checked off by the instructor. There will be a check-off list posted for each lab.

14 14 Lab Submission Guidelines All labs are due via email by midnight on the due date. The instructor will send out an email upon receiving your lab. If you do not receive an email within 24 hours of submitting the lab, it is YOUR responsibility to contact the instructor by email or phone. If you do not contact the instructor within 48 hours after the due date, the lab is considered late. There will be a 20% penalty per week for late labs. All labs, project and late labs must be turned in by Wednesday of Finals week to be graded.

15 15 Lab Submission Guidelines 1. Zip up all files required to build the lab. 2. Include a “Readme” file as appropriate. 3. The archive should also include any other deliverables as called out in the assignment write-up. 4. The archive will be attached to an email with subject line: CST320 Lab #x – first name last name. Email the archive to csetyang@gmail.com

16 16 Any student with a disability who anticipates a need for accommodation in this course is encouraged to talk with the instructor about his/her needs as soon as possible.

17 17 Grammar Review Three main concepts Language Machine Grammar Regular vs. Context-Free Languages Notation for describing languages Regular Expression Context-Free grammar Recognizers Finite automata Pushdown Automata

18 18 In-Class Exercise#1 Given ∑={0, 1} L1 = { wv | w, v ∈ ∑* and v = 00}. Define a regular expression to describe L1.
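One possible answer to this exercise, written as a small C++ check. L1 is the set of binary strings that end in 00, so the regular expression (0|1)*00 describes it. The function name inL1 is made up for illustration; std::regex_match is used because it only succeeds when the whole string matches.

```cpp
#include <regex>
#include <string>

// True when s is in L1 = { wv | w, v in {0,1}* and v = 00 },
// i.e. s is any string over {0, 1} that ends in 00.
bool inL1(const std::string& s) {
    static const std::regex pattern("(0|1)*00");
    return std::regex_match(s, pattern);  // whole-string match, not a search
}
```

For instance, inL1("10100") holds while inL1("001") does not.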

19 19 In-Class Exercise#1 Given ∑={0, 1} L2 = {w| w ∈ ∑* and w contains 3 consecutive 0’s}. Define a deterministic finite automaton (DFA) to recognize this language.
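A minimal sketch of one DFA for L2, coded directly as a state variable. It assumes four states numbered 0 to 3, where the state counts the current run of 0's and state 3 is the only accepting state. The function name acceptsThreeZeros is hypothetical.

```cpp
#include <string>

// DFA for "contains three consecutive 0's".
// State k (k = 0, 1, 2) means the input so far ends in exactly k consecutive 0's
// and has not yet contained "000". State 3 is accepting and is never left.
bool acceptsThreeZeros(const std::string& input) {
    int state = 0;                       // start state
    for (char c : input) {
        if (c != '0' && c != '1') {
            return false;                // symbol outside the alphabet {0, 1}
        }
        if (state == 3) {
            continue;                    // accepting state loops on both symbols
        }
        state = (c == '0') ? state + 1   // one more consecutive 0
                           : 0;          // a 1 resets the run
    }
    return state == 3;
}
```

Every state has exactly one transition for each input symbol, which is what makes the machine deterministic.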

20 20 In-Class Exercise#1 Given ∑={0, 1} Lp = {ww^R | w ∈ ∑*}, where w^R is the reverse of w. Define a context-free grammar for Lp.

21 21 In-Class Exercise#1 Given ∑={0, 1} Lp = {ww^R | w ∈ ∑*}, where w^R is the reverse of w. Define a context-free grammar for Lp. Is Lp regular?
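One grammar that generates Lp is S -> 0S0 | 1S1 | λ. The sketch below is a recursive recognizer that mirrors those three productions one for one; the names derivesS and inLp are made up for illustration. The recursion has to remember the first half of the string in order to match it against the second half, which a finite automaton cannot do with a fixed number of states, so Lp is context-free but not regular.

```cpp
#include <cstddef>
#include <string>

// Does the substring s[lo, hi) derive from S -> 0 S 0 | 1 S 1 | lambda ?
bool derivesS(const std::string& s, std::size_t lo, std::size_t hi) {
    if (lo == hi) {
        return true;                        // S -> lambda
    }
    if (hi - lo < 2) {
        return false;                       // a single leftover symbol has no production
    }
    char first = s[lo];
    char last = s[hi - 1];
    if (first != last || (first != '0' && first != '1')) {
        return false;                       // outer symbols must be a matching 0..0 or 1..1
    }
    return derivesS(s, lo + 1, hi - 1);     // S -> 0 S 0  or  S -> 1 S 1
}

// True when s is in Lp = { w w^R | w in {0,1}* }.
bool inLp(const std::string& s) {
    return derivesS(s, 0, s.size());
}
```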

22 22 In-Class Exercise#1 Find the regular expressions for the following automata. Is this a deterministic finite automaton?

23 23 In-Class Exercise#1 Remove lambda productions from the following grammar: S -> ABc A -> aaA A -> λ B -> B b B -> λ

24 Conventional Translator source program preprocessor Modified source program library, relocatable object files compiler assembler target assembly program loader / linker relocatable machine code absolute machine code

25 Compilers Lexical Analyzer (scanner) Source Program Parser Tokens Semantic Analysis Parse Tree Optimizer Code Generator Intermediate Representation Target code Symbol Table Uses Regular Expressions to define tokens Is a Finite Automaton Structure of tokens is Regular Uses Context-Free Grammar to define program structures Is a Pushdown Automaton Structure of program is Context-Free

26 26 Why study compilers? Ties lots of things you know together: Theory (finite automata, grammars) Data structures Modularization Utilization of software tools You might build a parser. The theory of computation/formal language still applies today. As long as we still program with 1-D text. Helps you to be a better programmer

27 27 One-dimensional Text int x; cin >> x; if(x>5) cout << “Hello”; else cout << “BOO”; int x;cin >> x;if(x>5) cout << “Hello”; else … The formatting has no impact on the meaning of program

28 28 What is a translator? Takes input (SOURCE) and produces output (TARGET) Diagram labels: SOURCE, TARGET, ERROR

29 29 Conventional Translator skeletal source program preprocessor source program library, relocatable object files compiler assembler target assembly program loader / linker relocatable machine code absolute machine code

30 30 Translator for Java Java source code Java compiler Java bytecode absolute machine code Java interpreter Bytecode compiler Java bytecode

31 31 Types of Translators Compilers Conventional (textual source code) Imperative, ALGOL-like languages Other paradigms Interpreters Macro processors Text formatters Silicon compilers

32 32 Types of Translators (cont.) Visual programming language Interface Database User interface Operating System

33 33 Conventional Translator skeletal source program preprocessor source program library, relocatable object files compiler assembler target assembly program loader / linker relocatable machine code absolute machine code

34 34 Structure of Compilers Lexical Analyzer (scanner) Modified Source Program Syntax Analysis (Parser) Tokens Semantic Analysis Syntactic Structure Optimizer Code Generator Intermediate Representation Target machine code Symbol Table skeletal source program preprocessor

35 35 Symbol Table What is a symbol? Variable name Function name Type name Constant Class name Method name …. Any ID that you use in a program

36 36 Symbol Table Information about a symbol Name Type (int, double, char, string, etc.) Use (variable name, constant name, type name, function name, etc.) Value (i.e. value of constant) Scope

37 37 Symbol table operations Insert a symbol into the symbol table Flag as error if symbol already exists in some cases Search for a symbol in the symbol table Delete a symbol from the symbol table

38 38 Symbol table examples w/ preprocessor #define MAX 50 #define SOMESYMBOL #undef SOMESYMBOL #define MIN 10 #define MAX 100
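The insert, search, and delete operations can be sketched with a hash map keyed by symbol name. This is only an illustration under assumptions: the SymbolInfo fields follow the "information about a symbol" slide (scope is left out), and the class and member names are invented here, not taken from the labs.

```cpp
#include <string>
#include <unordered_map>

// Attributes stored per symbol (scope omitted to keep the sketch short).
struct SymbolInfo {
    std::string type;   // e.g. "int"
    std::string use;    // e.g. "constant", "variable", "function"
    std::string value;  // e.g. value of a constant; may be empty
};

class SymbolTable {
public:
    // Insert a symbol; returns false (caller may flag an error) if it already exists.
    bool insert(const std::string& name, const SymbolInfo& info) {
        return table_.emplace(name, info).second;
    }

    // Search for a symbol; returns nullptr when it is not in the table.
    const SymbolInfo* search(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? nullptr : &it->second;
    }

    // Delete a symbol; returns false if there was nothing to delete.
    bool remove(const std::string& name) {
        return table_.erase(name) > 0;
    }

private:
    std::unordered_map<std::string, SymbolInfo> table_;
};
```

Run against the directives on the slide, #define MAX 50 and #define MIN 10 are inserts, #undef SOMESYMBOL is a delete, and the final #define MAX 100 is the case where insert reports that the symbol already exists.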

39 39 Code example #define MAX 5 void main() { int x; int y; x = MAX; #define MAX 10 y = MAX; }
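The slide's code can be tried with a real C++ preprocessor in roughly the following form. Two changes are assumptions made for this sketch: an #undef is added before the second #define, because redefining a macro with a different value is not valid standard C++ (many compilers only warn), and the body is wrapped in a hypothetical function readMaxTwice that returns both values.

```cpp
#include <utility>

#define MAX 5

// Returns (x, y) as assigned in the slide's example.
std::pair<int, int> readMaxTwice() {
    int x;
    int y;
    x = MAX;   // the preprocessor substitutes 5 here
#undef MAX     // added for this sketch so the redefinition is well-formed
#define MAX 10
    y = MAX;   // the preprocessor substitutes 10 here
    return {x, y};
}
```

The substitution is purely textual and positional: each use of MAX gets whatever definition is in effect at that point in the file, so x ends up 5 and y ends up 10.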

40 40 Symbol table example w/ parser (lab 2) void main() { int x; string str1; int x; x = 3; y = 10; str1 = 30; { double x; x = 4.301; }
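A sketch of how a symbol table might handle the nested block in this example, assuming one map per scope kept on a stack. All names here (ScopedSymbolTable, enterScope, declare, lookupType) are illustrative and are not the Lab 2 interface.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

class ScopedSymbolTable {
public:
    ScopedSymbolTable() { enterScope(); }           // outermost scope

    void enterScope() { scopes_.emplace_back(); }   // call on '{'
    void exitScope() {                              // call on '}'
        if (scopes_.size() > 1) scopes_.pop_back();
    }

    // Declare a name in the innermost scope.
    // Returns false when the name is already declared in that same scope.
    bool declare(const std::string& name, const std::string& type) {
        return scopes_.back().emplace(name, type).second;
    }

    // Look a name up from the innermost scope outward.
    // Returns an empty string when the name is undeclared.
    std::string lookupType(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return found->second;
        }
        return "";
    }

private:
    std::vector<std::unordered_map<std::string, std::string>> scopes_;
};
```

With this structure the second int x is rejected as a duplicate, y = 10 fails the lookup because y was never declared, str1 = 30 can be checked against the stored type string, and the inner double x shadows the outer int x only until the block closes.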

41 41 Preprocessor Remove all comments If a language is not case sensitive, preprocessor may change the program text to all uppercase or all lowercase. Process preprocessor directives. C/C++ directives: #include #define (unlike C#’s #define, C/C++ can define a constant value) #if / #else / #endif #undef #ifdef #ifndef skeletal source program preprocessor source program

42 42 #include a.h: #include "b.h" #define MIN 10 … int x; if (x < MIN) … x = MAX; b.h: #define MAX 5

43 43 #ifdef #ifndef A_H #define A_H … #endif

44 44 #ifdef #if DLEVEL == 0 #define STACK 0 #elif DLEVEL == 1 #define STACK 100 #elif DLEVEL > 5 display( debugptr ); #else #define STACK 200 #endif
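A compilable variant of this conditional, as a sketch. DLEVEL is defined in the file here so the example stands alone; normally it would come from the build, for example -DDLEVEL=1 with GCC or Clang. The DLEVEL > 5 branch is left out because display and debugptr are not defined anywhere in the slides, and stackSize is a made-up function that exposes the selected value.

```cpp
#define DLEVEL 1   // assumption for this sketch; usually set on the command line

#if DLEVEL == 0
#define STACK 0
#elif DLEVEL == 1
#define STACK 100
#else
#define STACK 200
#endif

// Only one #define STACK survives preprocessing, chosen by the value of DLEVEL.
int stackSize() {
    return STACK;
}
```

The compiler proper never sees the branches that were not selected; they are removed before lexical analysis begins.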

45 45 Standalone Preprocessor Produces a modified source file. input.cpp: #define MAX 50 //this is a comment void main() { int x; //more comments x = MAX; #define MIN 10 int y; x = y - MIN; //blah } temp.cpp (preprocessor output): void main() { int x; x = 50; int y; x = y - 10; }
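The input.cpp to temp.cpp transformation above can be approximated by a small line-oriented pass. This is a sketch under simplifying assumptions, not the Lab 1 solution: it handles only // comments and #define NAME VALUE, it does not protect string literals, and the function names substitute and preprocess are invented here.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Replace each whole identifier in `line` that names a defined symbol.
std::string substitute(const std::string& line,
                       const std::map<std::string, std::string>& defines) {
    std::string out;
    std::size_t i = 0;
    while (i < line.size()) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        if (std::isalnum(c) || c == '_') {
            // Collect the whole run so that MAX inside MAXIMUM is not replaced.
            std::size_t start = i;
            while (i < line.size() &&
                   (std::isalnum(static_cast<unsigned char>(line[i])) || line[i] == '_')) {
                ++i;
            }
            std::string word = line.substr(start, i - start);
            auto it = defines.find(word);
            out += (it != defines.end()) ? it->second : word;
        } else {
            out += line[i++];
        }
    }
    return out;
}

// Strip // comments, record #define directives, and expand defined symbols.
std::string preprocess(const std::string& source) {
    std::map<std::string, std::string> defines;  // the preprocessor's symbol table
    std::istringstream in(source);
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        std::size_t comment = line.find("//");
        if (comment != std::string::npos) line.erase(comment);
        while (!line.empty() && std::isspace(static_cast<unsigned char>(line.back()))) {
            line.pop_back();                     // drop trailing whitespace
        }
        std::istringstream words(line);
        std::string first;
        words >> first;
        if (first.empty()) continue;             // blank or comment-only line
        if (first == "#define") {
            std::string name, value;
            words >> name;
            std::getline(words >> std::ws, value);  // value stays empty if absent
            defines[name] = value;
            continue;                            // directives are not copied to the output
        }
        out << substitute(line, defines) << '\n';
    }
    return out.str();
}
```

Because definitions are recorded as the lines are read, a symbol is only expanded in text that follows its #define, matching the positional behavior of the C/C++ preprocessor.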

46 46 Standalone Lexical Analyzer Lexical Analyzer void main() { int x; x = 50; int y; x = y - 10; } Produces a list of tokens void keyword main ID ( symbol ) symbol { symbol int keyword

47 47 Preprocessor & Lexical Analyzer both Produces a list of tokens void keyword main ID ( symbol ) symbol { symbol int keyword #define MAX 50 //this is a comment void main() { int x; //more comments x = MAX; #define MIN 10 int y; x = y - MIN; //blah }
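A minimal scanner that produces the kind of value/type list shown on this slide. It is a sketch under assumptions: the keyword set contains only void and int, every non-alphanumeric character is treated as a one-character symbol, and the Token alias and tokenize function are names chosen for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Token = std::pair<std::string, std::string>;  // (value, type)

std::vector<Token> tokenize(const std::string& src) {
    static const std::set<std::string> keywords = {"void", "int"};  // deliberately tiny
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;                                     // whitespace only separates tokens
        } else if (std::isalpha(c) || c == '_') {
            std::size_t start = i;                   // identifier or keyword
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            std::string word = src.substr(start, i - start);
            tokens.emplace_back(word, keywords.count(word) ? "keyword" : "ID");
        } else if (std::isdigit(c)) {
            std::size_t start = i;                   // integer literal
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) {
                ++i;
            }
            tokens.emplace_back(src.substr(start, i - start), "number");
        } else {
            tokens.emplace_back(std::string(1, src[i]), "symbol");
            ++i;
        }
    }
    return tokens;
}
```

Calling tokenize on the text "void main() {" yields void keyword, main ID, ( symbol, ) symbol, { symbol, in that order.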

48 48 Output from Lab1 List of tokens void keyword main ID ( symbol ) symbol { symbol int keyword Print out of tokens: void keyword main ID ( symbol ) symbol { symbol int keyword …..

49 49 Preprocessor Preprocessor symbols Defined by #define #define MYHEADER_H #define LARGEST 10 Defined in the compilation process Command Line (/D) Preprocessor Definitions

50 50 1. #include 2. //comment 3. #define LARGEST 100 4. void main() 5. { int x, y; 6. x = 10; 7. y = LARGEST; 8. #ifdef MYSYMBOL 9. cout << "X=" << x; 10. #endif 11. #if TEST == 1 12. cout << "1" << endl; 13. #elif TEST == 2 14. cout << "2" << endl; 15. #else 16. cout << "Blah" << endl; 17. #endif 18. cout << “The end” << endl; } In-Class Exercise #2 Show result of preprocessor What’s left in the file? What’s changed in the file?


53 53 1. #include 2. //comment 3. #define LARGEST 100 4. void main() 5. { int x, y; 6. x = 10; 7. y = LARGEST; 8. #ifdef MYSYMBOL 9. cout << "X=" << x; 10. #endif 11. #if TEST == 1 12. cout << "1" << endl; 13. #elif TEST == 2 14. cout << "2" << endl; 15. #else 16. cout << "Blah" << endl; 17. #endif 18. cout << “The end” << endl; }

54 54 #include #include "myfile.h" Assumes that myfile.h is in the current directory #include "c:\\somedirectory\myfile.h" Absolute path #include <array> Will look for array in the include folder in the program files folder #include <sys/types.h> types.h file will be in the sys subdirectory of include


56 56 Standalone Lexical Analyzer Lexical Analyzer void main() { int x; x = 50; int y; x = y - 10; } Produces a list of tokens void keyword main ID ( symbol ) symbol { symbol int keyword

57 57 Structure of Compilers Lexical Analyzer (scanner) Source Program Tokens int x; cin >> x; if(x>5) cout << “Hello”; else cout << “BOO”; int x ; cin >> x ; if ( x > 5 ) cout << “Hello” ; else cout << “BOO” ; What about white spaces? Do they matter?

58 58 Tokenize First or as needed? int x; cin >> x; if(x>5) cout << “Hello”; else cout << “BOO”; int datatype x ID ; symbol cin >> Tokens = Meaningful units in a program Value/Type pairs

59 59 Tokenize First or as needed? Array<Array<int>> someArray; Array<Array<int> > someArray; Is >> one token or two?

60 60 Structure of Compilers Lexical Analyzer (scanner) Source Program Syntax Analysis (Parser) Tokens Syntactic Structure Parse Tree

61 61 Parse Tree (Parser) [Figure: parse tree for the input int x; cin >> … with nodes Program, Data Declaration, datatype (int), ID (x), and ;]

62 62 Who is responsible for errors? int x$y; int 32xy; 45b 45ab x = x @ y; Lexical Errors / Token Errors?

63 63 Who is responsible for errors? X = ; Y = x +; Z = [; Syntax errors

64 64 Who is responsible for errors? 45ab One wrong token? Two tokens (45 & ab)? Are whitespaces needed? Either way is okay. Lexical analyzer can catch the illegal token (45ab) Parser can catch the syntax error. Most likely 45 followed by ab will not be syntactically correct.
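The two choices for 45ab can be put side by side in code. This sketch assumes the lexeme contains only letters and digits and starts with a digit; the function scanDigitLed and its strict flag are hypothetical names used to contrast the two designs.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Token = std::pair<std::string, std::string>;  // (value, type)

// strict == true : the scanner reports the whole lexeme as one illegal token.
// strict == false: the scanner splits at the digit/letter boundary and leaves
//                  it to the parser to reject "number followed by ID".
std::vector<Token> scanDigitLed(const std::string& lexeme, bool strict) {
    std::size_t digits = 0;
    while (digits < lexeme.size() &&
           std::isdigit(static_cast<unsigned char>(lexeme[digits]))) {
        ++digits;
    }
    std::vector<Token> tokens;
    if (digits == lexeme.size()) {
        tokens.emplace_back(lexeme, "number");               // e.g. "45"
    } else if (strict) {
        tokens.emplace_back(lexeme, "error");                // e.g. "45ab" as one bad token
    } else {
        tokens.emplace_back(lexeme.substr(0, digits), "number");
        tokens.emplace_back(lexeme.substr(digits), "ID");    // e.g. "45" then "ab"
    }
    return tokens;
}
```

The strict version gives the user an error message that points at the exact lexeme; the lenient version keeps the scanner simpler but defers the report to the parser.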

65 65 Structure of Compilers Lexical Analyzer (scanner) Source Program Syntax Analysis (Parser) Tokens Semantic Analysis Syntactic Structure Symbol Table int x; cin >> x; if(x>5) x = “SHERRY”; else cout << “BOO”;

66 66 Structure of Compilers Lexical Analyzer (scanner) Source Program Syntax Analysis (Parser) Tokens Semantic Analysis Syntactic Structure Optimizer Code Generator Intermediate Representation Target machine code Symbol Table

67 67 Structure of Compilers Lexical Analyzer (scanner) Source Program Syntax Analysis (Parser) Tokens Semantic Analysis Syntactic Structure Optimizer Code Generator Intermediate Representation Target machine code Symbol Table

68 68 Translation Steps: Recognize when input is available. Break input into individual components. Merge individual pieces into meaningful structures. Process structures. Produce output.

69 69 Translation (Compilers) Steps: Break input into individual components.(lexical analysis) Merge individual pieces into meaningful structures. (parsing) Process structures. (semantic analysis) Produce output. (code generation)

70 70 Compilers Two major tasks: Analysis of source Synthesis of target Syntax-directed translation Compilation process driven by syntactic structure of the source being translated

71 71 Interpreters Executes source program without explicitly translating to target code. Control and memory management reside in interpreter, not user program. Allow: Modification of program as it executes. Dynamic typing of variables Portability Huge overhead (time & space)

72 72 Structure of Interpreters Interpreter Source Program Data Program Output

73 73 Misc. Compiler Discussions History of Modern Compilers Front and Back ends One pass vs. Multiple passes Compiler Construction Tools Compiler-Compilers, Compiler-generators, Translator-writing Systems Scanner generator Parser generator Syntax-directed engines Automatic code generator Dataflow engines

