CS 404 Introduction to Compiler Design, Lecture 1. Ahmed Ezzat.


2 Lecture 1
Administration
Introduction to Compilers
– What is a compiler
– Compiler phases
Lexical Analysis
– Tokens, lexemes
– Regular expressions

3 An Introduction to Compilers
A compiler is a program translator
Source language -> Target language
Compiler examples: cc, gcc, javac

4 Phases of a Compiler
Front end == analysis
– Lexical analysis, Syntax analysis, Semantic analysis
– Language dependent, machine independent
Back end == synthesis
– Intermediate code generation, Optimization, Target code generation
– Language independent, machine dependent

5 Examples of Compiler Phases
Lexical analysis
– Scanning: transforms characters into “tokens”
Syntax analysis
– Parsing: transforms token streams into “parse trees”
– Structural analysis

6 Examples of Compiler Phases (2)
Semantic analysis: checks whether the input program “makes sense”
– Example: type checking
Intermediate code generation
– Example: three-address code
Code optimization
Target code generation
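
To make the three-address code example concrete, here is a minimal sketch that lowers an arithmetic expression into instructions with at most one operator each. It is an illustration only: it borrows Python's own `ast` parser as a stand-in for the front end, and the function name `three_address_code` and the temporary names `t1`, `t2`, ... are assumptions, not part of the lecture.

```python
import ast
import itertools


def three_address_code(expression):
    """Lower an arithmetic expression to a list of three-address instructions."""
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}
    temps = (f"t{i}" for i in itertools.count(1))  # fresh temporaries t1, t2, ...
    code = []

    def emit(node):
        # Leaves are used directly as operands.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        # Each binary operator becomes one instruction: temp = left op right.
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            left = emit(node.left)
            right = emit(node.right)
            temp = next(temps)
            code.append(f"{temp} = {left} {ops[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    result = emit(ast.parse(expression, mode="eval").body)
    return code, result


code, result = three_address_code("a + b * c - 2")
print("\n".join(code))
# t1 = b * c
# t2 = a + t1
# t3 = t2 - 2
```

Every instruction has one operator and at most three addresses (two operands and a result), which is what makes this form convenient for the optimization and target code generation phases.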

7 Compiler Issues
Symbol table: a data structure containing a record for each identifier, with attributes of the identifier
Error handling: detection, reporting, recovery
Compiler passes: one pass versus multiple passes
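
The symbol table described above can be sketched as a dictionary keyed by identifier name. This is a minimal illustration with a single flat scope; the class name `SymbolTable`, its methods, and the attribute names `type` and `line` are hypothetical choices, and real compilers typically add nested scopes.

```python
class SymbolTable:
    """One record per identifier, each record holding that identifier's attributes."""

    def __init__(self):
        self._records = {}

    def declare(self, name, **attributes):
        # Redeclaration is reported as an error (detection only, no recovery).
        if name in self._records:
            raise KeyError(f"duplicate declaration of {name!r}")
        self._records[name] = dict(attributes)

    def lookup(self, name):
        # Returns the attribute record, or None for an undeclared identifier.
        return self._records.get(name)


table = SymbolTable()
table.declare("count2", type="int", line=3)
table.declare("my_id", type="float", line=5)
print(table.lookup("count2"))  # {'type': 'int', 'line': 3}
```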

8 Working Together With Compilers
Pre-processors: macros, file handling
Assembler: from assembly code to machine code
Loaders: place instructions and data in memory
Linkers: link several target programs together

9 Lexical Analysis
Source language -> token streams
Token: e.g., identifier, constant, keyword
– A class of character sequences
– Members satisfy a certain pattern (or rule)
– The data structure returned by the lexical analyzer
Lexeme: e.g., my_id, count2
– A string that matches a token's pattern
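
The token/lexeme distinction can be seen in a small scanner sketch built on Python's standard `re` module. The token classes, the keyword list, and the function name `tokenize` are assumptions chosen for illustration; each returned pair is (token class, lexeme).

```python
import re

# Hypothetical token classes; order matters, so keywords are tried before identifiers.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:if|else|while|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("CONSTANT", r"[0-9]+"),
    ("OPERATOR", r"[-+*/=<>]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn a character string into a list of (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace is matched but not reported
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("while count2 < 10"))
# [('KEYWORD', 'while'), ('IDENTIFIER', 'count2'), ('OPERATOR', '<'), ('CONSTANT', '10')]
```

Here `count2` is the lexeme and `IDENTIFIER` is its token class; many different lexemes map to the same token.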

10 Describing Patterns: Regular Expressions
Patterns or rules to identify lexemes
Precise specification of sets of strings
There exists a computational model to evaluate them (finite automata)
There exist tools to process them (LEX)

11 Regular Expression Notations
Symbols: e.g., a, b, c, 1, 2
Alphabet: finite set of symbols, Σ (sigma)
– e.g., Σ = {a,b}
String: a sequence of symbols
– e.g., hello, ε (epsilon, empty string)
Language: a set of strings over an alphabet
– e.g., {a, ab, ba}
– e.g., the set of all valid C programs

12 Regular Expression Definition
Every symbol of Σ ∪ {ε} is a regular expression
If r1 and r2 are regular expressions, so are
– Concatenation: r1r2
– Alternation: r1 | r2
– Repetition: r1*
Nothing else is a regular expression

13 Regular Expression Extended
a+ : one or more a’s
a? : zero or one a
a{n}: a repeats exactly n times
a{n,}: a repeats at least n times
a{n,m}: a repeats at least n but no more than m times
and more
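
The extended operators add convenience, not power: each can be rewritten using only concatenation, alternation, and repetition. The sketch below checks a few such rewrites with Python's `re` module, whose syntax happens to support these operators. The table `EQUIVALENTS` and the helper `same_language_up_to` are hypothetical names, and the check only compares strings of a's up to a fixed length, so it is a sanity test rather than a proof.

```python
import re

# Extended form -> equivalent pattern using only concatenation, | and *.
# "(?:a|)" is Python's way of writing (a | ε).
EQUIVALENTS = {
    r"a+": r"aa*",
    r"a?": r"(?:a|)",
    r"a{3}": r"aaa",
    r"a{2,}": r"aaa*",
    r"a{1,3}": r"a(?:a|)(?:a|)",
}


def same_language_up_to(extended, basic, max_len=6):
    """Compare both patterns on the strings ε, a, aa, ... up to max_len a's."""
    for n in range(max_len + 1):
        text = "a" * n
        in_extended = re.fullmatch(extended, text) is not None
        in_basic = re.fullmatch(basic, text) is not None
        if in_extended != in_basic:
            return False
    return True


for extended, basic in EQUIVALENTS.items():
    print(extended, "==", basic, same_language_up_to(extended, basic))
```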

14 What Regular Expressions Cannot Do
Arithmetic expressions (with arbitrarily nested parentheses)
The set of strings over {(,)} with matched parentheses
Strings over {a,b} in which some number of a’s is followed by an equal number of b’s
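
What these languages share is the need to count without bound, which a fixed, finite set of states cannot do. As a contrast, the following sketch recognizes matched parentheses with a single integer counter; the function name `balanced` is an assumption for illustration.

```python
def balanced(text):
    """Return True if text over {(, )} has matched parentheses."""
    depth = 0  # unbounded counter: the one thing a finite automaton lacks
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ")" with no "(" before it
                return False
        else:
            raise ValueError(f"symbol {ch!r} is not in the alphabet")
    return depth == 0


print(balanced("(()())"), balanced("(()"))  # True False
```

Because `depth` can grow as large as the input requires, no fixed number of states can track it. This is why nesting is handled by the parser rather than by the lexical analyzer.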

15 Regular Definitions
Give names to regular expressions and use them as shorthand
Must avoid recursive definitions
Example
– digit -> 0 | 1 | … | 9
– int -> digit+
– letter -> A | B | … | Z
– id -> letter (letter | digit)*
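
Regular definitions map naturally onto named pattern strings that are substituted into later definitions. The sketch below uses Python's `re` module and assumes digit ranges over 0 to 9 and letter over the uppercase letters A to Z; the variable names `integer` and `identifier` stand in for int and id, which would otherwise clash with Python built-ins.

```python
import re

# Each definition may use only names defined before it, so there is no recursion.
digit = r"[0-9]"
integer = rf"(?:{digit})+"
letter = r"[A-Z]"
identifier = rf"{letter}(?:{letter}|{digit})*"

for text in ["404", "X25", "2X", ""]:
    print(
        repr(text),
        "int" if re.fullmatch(integer, text) else "-",
        "id" if re.fullmatch(identifier, text) else "-",
    )
```

Expanding the names by substitution always terminates because no definition refers to itself, which is the point of the ban on recursive definitions.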

16 Finite Automata
Evaluate regular expressions
Recognize certain languages and reject others
Two kinds of FA:
– Non-deterministic FA (NFA)
– Deterministic FA (DFA)

17 An NFA Consists of
An input alphabet, e.g., Σ = {a,b}
A set of states, e.g., S = {s0, s1, s2}
A set of transitions from states to states, labeled by elements of Σ or ε
A start state, e.g., s0
A set of final states, e.g., F = {s1, s2}

18 FA and Language
An FA accepts string x if and only if there is some path in the transition graph from the start state to a final state, such that the edge labels along this path spell x
The set of strings an FA accepts is said to be the language defined by this FA.
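
The acceptance condition can be sketched by tracking every state the NFA could be in after each input symbol. The states, start state, and final states below follow the slide's examples (s0, s1, s2 with F = {s1, s2}), but the transitions themselves are made up for illustration; this particular machine accepts the language a | b*.

```python
EPSILON = ""  # label used here for ε-transitions

# Hypothetical transition relation: (state, label) -> set of next states.
TRANSITIONS = {
    ("s0", "a"): {"s1"},
    ("s0", EPSILON): {"s2"},
    ("s2", "b"): {"s2"},
}
START = "s0"
FINAL = {"s1", "s2"}


def epsilon_closure(states):
    """All states reachable from `states` using only ε-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in TRANSITIONS.get((state, EPSILON), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def accepts(text):
    """True if some path from START to a final state spells `text`."""
    current = epsilon_closure({START})
    for symbol in text:
        moved = set()
        for state in current:
            moved |= TRANSITIONS.get((state, symbol), set())
        current = epsilon_closure(moved)
    return bool(current & FINAL)


print(accepts("a"), accepts("bbb"), accepts("ab"))  # True True False
```

Keeping a set of current states follows all candidate paths at once, so no backtracking is needed.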

19 Deterministic Finite Automata
A DFA is a special case of NFA
No state has an ε-transition
For each state s and input symbol a, there is at most one edge labeled a leaving s
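
Because a DFA has at most one edge per state and symbol, its transition table can map each (state, symbol) pair to a single state, and running it needs only one current state. The machine below is a hypothetical example accepting a followed by zero or more b's; the names `DFA`, `DFA_START`, `DFA_FINAL`, and `dfa_accepts` are assumptions.

```python
# Hypothetical DFA over {a, b}: (state, symbol) -> the single next state.
DFA = {
    ("q0", "a"): "q1",
    ("q1", "b"): "q1",
}
DFA_START = "q0"
DFA_FINAL = {"q1"}


def dfa_accepts(text):
    """Follow the unique edge for each symbol; a missing edge means rejection."""
    state = DFA_START
    for symbol in text:
        state = DFA.get((state, symbol))
        if state is None:  # "at most one edge" allows there to be none
            return False
    return state in DFA_FINAL


print(dfa_accepts("abb"), dfa_accepts("ba"))  # True False
```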

20 NFA, DFA and Regular Expressions
A DFA is an NFA
Each NFA can be converted into a DFA
One can construct an NFA from a regular expression
FAs are used by the lexical analyzer to recognize tokens
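
One standard way to convert an NFA into a DFA is the subset construction, in which each DFA state is a set of NFA states. The slides do not say which method the course uses, so treat the following as an illustrative sketch: the NFA is a made-up machine for a | b*, and all names are assumptions.

```python
from collections import deque

EPSILON = ""
ALPHABET = ("a", "b")
# Hypothetical NFA for a | b*: (state, label) -> set of next states.
NFA = {("s0", "a"): {"s1"}, ("s0", EPSILON): {"s2"}, ("s2", "b"): {"s2"}}
NFA_START = "s0"
NFA_FINAL = {"s1", "s2"}


def epsilon_closure(states):
    """States reachable via ε-transitions only, as a hashable frozenset."""
    closure, stack = set(states), list(states)
    while stack:
        for nxt in NFA.get((stack.pop(), EPSILON), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)


def subset_construction():
    """Build a DFA whose states are sets of NFA states."""
    start = epsilon_closure({NFA_START})
    dfa, seen, queue = {}, {start}, deque([start])
    while queue:
        subset = queue.popleft()
        for symbol in ALPHABET:
            moved = set()
            for state in subset:
                moved |= NFA.get((state, symbol), set())
            target = epsilon_closure(moved)
            if not target:
                continue  # no edge, which the "at most one edge" rule permits
            dfa[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    final = {subset for subset in seen if subset & NFA_FINAL}
    return start, dfa, final


def run_dfa(text):
    """Run the constructed DFA on text."""
    start, dfa, final = subset_construction()
    state = start
    for symbol in text:
        state = dfa.get((state, symbol))
        if state is None:
            return False
    return state in final


print(run_dfa("bbb"), run_dfa("ab"))  # True False
```

For this NFA the construction yields three DFA states: {s0, s2}, {s1}, and {s2}. In the worst case the number of subsets grows exponentially with the number of NFA states.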

