CSC 4181 Compiler Construction Context-Free Grammars

Slides:



Advertisements
Similar presentations
Lecture # 8 Chapter # 4: Syntax Analysis. Practice Context Free Grammars a) CFG generating alternating sequence of 0’s and 1’s b) CFG in which no consecutive.
Advertisements

Session 14 (DM62) / 15 (DM63) Recursive Descendent Parsing.
COP4020 Programming Languages
1 Chapter 3 Context-Free Grammars and Parsing. 2 Parsing: Syntax Analysis decides which part of the incoming token stream should be grouped together.
EECS 6083 Intro to Parsing Context Free Grammars
Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth.
1 Syntax and Semantics The Purpose of Syntax Problem of Describing Syntax Formal Methods of Describing Syntax Derivations and Parse Trees Sebesta Chapter.
CPSC 388 – Compiler Design and Construction Parsers – Context Free Grammars.
Compiler Principle and Technology Prof. Dongming LU Mar. 7th, 2014.
Syntax and Semantics Structure of programming languages.
Parsing Chapter 4 Parsing2 Outline Top-down v.s. Bottom-up Top-down parsing Recursive-descent parsing LL(1) parsing LL(1) parsing algorithm First.
Parsing Jaruloj Chongstitvatana Department of Mathematics and Computer Science Chulalongkorn University.
Classification of grammars Definition: A grammar G is said to be 1)Right-linear if each production in P is of the form A  xB or A  x where A and B are.
Context-Free Grammars
Context-Free Grammars and Parsing
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
Grammars CPSC 5135.
PART I: overview material
Lecture # 9 Chap 4: Ambiguous Grammar. 2 Chomsky Hierarchy: Language Classification A grammar G is said to be – Regular if it is right linear where each.
Syntax and Semantics Structure of programming languages.
11 Chapter 4 Grammars and Parsing Grammar Grammars, or more precisely, context-free grammars, are the formalism for describing the structure of.
CFG1 CSC 4181Compiler Construction Context-Free Grammars Using grammars in parsers.
Context Free Grammars CFGs –Add recursion to regular expressions Nested constructions –Notation expression  identifier | number | - expression | ( expression.
A Programming Languages Syntax Analysis (1)
Chapter 3 Context-Free Grammars and Parsing. The Parsing Process sequence of tokens syntax tree parser Duties of parser: Determine correct syntax Build.
Syntax Analysis - Parsing Compiler Design Lecture (01/28/98) Computer Science Rensselaer Polytechnic.
Grammars Hopcroft, Motawi, Ullman, Chap 5. Grammars Describes underlying rules (syntax) of programming languages Compilers (parsers) are based on such.
Unit-3 Parsing Theory (Syntax Analyzer) PREPARED BY: PROF. HARISH I RATHOD COMPUTER ENGINEERING DEPARTMENT GUJARAT POWER ENGINEERING & RESEARCH INSTITUTE.
Chapter 3 Context-Free Grammars Dr. Frank Lee. 3.1 CFG Definition The next phase of compilation after lexical analysis is syntax analysis. This phase.
Syntax Analyzer (Parser)
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
Overview of Previous Lesson(s) Over View 3 Model of a Compiler Front End.
1 February 23, February 23, 2016February 23, 2016February 23, 2016 Azusa, CA Sheldon X. Liang Ph. D. Computer Science at Azusa Pacific University.
Compiler Construction Lecture Five: Parsing - Part Two CSC 2103: Compiler Construction Lecture Five: Parsing - Part Two Joyce Nakatumba-Nabende 1.
Structure of programming languages
Chapter 3 – Describing Syntax CSCE 343. Syntax vs. Semantics Syntax: The form or structure of the expressions, statements, and program units. Semantics:
Syntax(1). 2 Syntax  The syntax of a programming language is a precise description of all its grammatically correct programs.  Levels of syntax Lexical.
Introduction to Parsing
WELCOME TO A JOURNEY TO CS419 Dr. Hussien Sharaf Dr. Mohammad Nassef Department of Computer Science, Faculty of Computers and Information, Cairo University.
Chapter 3: Describing Syntax and Semantics
Chapter 3 – Describing Syntax
CS510 Compiler Lecture 4.
Parsing and Parser Parsing methods: top-down & bottom-up
Chapter 3 Context-Free Grammar and Parsing
Introduction to Parsing (adapted from CS 164 at Berkeley)
Programming Languages
Chapter 3 – Describing Syntax
Syntax (1).
Parsing.
Context-Free Grammars
Compiler Construction
Syntax versus Semantics
Compiler Construction (CS-636)
CSE 3302 Programming Languages
Compiler Design 4. Language Grammars
Context-Free Grammars
CSC 4181 Compiler Construction Parsing
Context-Free Grammars
Programming Language Syntax 2
Parsers and control structures
Lecture 7: Introduction to Parsing (Syntax Analysis)
CSC 4181Compiler Construction Context-Free Grammars
R.Rajkumar Asst.Professor CSE
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Theory of Computation Lecture #
Context-Free Grammars
Context-Free Grammars
Programming Languages 2nd edition Tucker and Noonan
COMPILER CONSTRUCTION
Presentation transcript:

CSC 4181 Compiler Construction Context-Free Grammars Using grammars in parsers CFG

Parsing Process Call the scanner to get tokens Build a parse tree from the stream of tokens A parse tree shows the syntactic structure of the source program. Add information about identifiers in the symbol table Report error, when found, and recover from thee error CFG

Context-Free Grammar a tuple (V, T, P, S) where V is a finite set of nonterminals, containing S, T is a finite set of terminals, P is a set of production rules in the form of aβ where  is in V and β is in (V UT )*, and S is the start symbol. Any string in (V U T)* is called a sentential form. CFG

Examples E  E O E E  (E) E  id O  + O  - O  * O  / S  SS CFG

Backus-Naur Form (BNF) Nonterminals are in < >. Terminals are any other symbols. ::= means . | means or. Examples: <E> ::= <E><O><E>| (<E>) | ID <O> ::= + | - | * | / <S> ::= <S><S> | (<S>)<S> | () |  CFG

Derivation A sequence of replacement of a substring in a sentential form. Definition Let G = (V, T, P, S ) be a CFG, , ,  be strings in (V U T)* and A is in V. A G  if A   is in P. *G denotes a derivation in zero step or more. CFG

Examples S  SS | (S)S | () | S  SS  (S)SS (S)S(S)S  (S)S(())S  ((())())(()) E  E O E | (E) | id O  + | - | * | / E  E O E  (E) O E  (E O E) O E  ((E O E) O E) O E  ((id O E)) O E) O E  ((id + E)) O E) O E  ((id + id)) O E) O E  ((id + id)) * id) + id CFG

Leftmost Derivation Rightmost Derivation Each step of the derivation is a replacement of the leftmost nonterminals in a sentential form. E  E O E  (E) O E  (E O E) O E  (id O E) O E  (id + E) O E  (id + id) O E  (id + id) * E  (id + id) * id Each step of the derivation is a replacement of the rightmost nonterminals in a sentential form. E  E O E  E O id  E * id  (E) * id  (E O E) * id  (E O id) * id  (E + id) * id  (id + id) * id CFG

Right/Left Recursive A grammar is a left recursive if its production rules can generate a derivation of the form A * A X. Examples: E  E O id | (E) | id E  F + id | (E) | id F  E * id | id E  F + id  E * id + id A grammar is a right recursive if its production rules can generate a derivation of the form A * X A. Examples: E  id O E | (E) | id E  id + F | (E) | id F  id * E | id E  id + F  id + id * E CFG

Parse Tree A labeled tree in which the interior nodes are labeled by nonterminals leaf nodes are labeled by terminals the children of an interior node represent a replacement of the associated nonterminal in a derivation corresponding to a derivation E + id E F * CFG

Parse Trees and Derivations + E 1 E 2 E 5 E 3 * E 4 id E  E + E (1)  id + E (2)  id + E * E (3)  id + id * E (4)  id + id * id (5)  E + E * E (2)  E + E * id (3)  E + id * id (4) Preorder numbering + E 1 E 5 E 3 E 2 * E 4 id Reverse or postorder numbering CFG

Grammar: Example List of statements: No statement One statement: s; More than one statement: s; s; s; A statement can be a block of statements. {s; s; s;} Is the following correct? { {s; {s; s;} s; {}} s; } <St> ::=  | s; | s; <St> | { <St> } <St> <St> { <St> } <St> { { <St> } <St> } <St> { { s; <St> } <St>} <St> { { s; { <St> } <St> } <St>} <St> { { s; { s; <St> } <St> } <St>} <St> { { s; { s; s; } <St> } <St>} <St> { { s; { s; s; } s; <St> } <St>} <St> { { s; { s; s; } s; { <St> } <St> } <St>} <St> { { s; { s; s; } s; { } <St> } <St>} <St> { { s; { s; s; } s; { } } <St>} <St> { { s; { s; s;} s; { } } s; } <St> { { s; { s; s;} s; { } } s; } CFG

Abstract Syntax Tree Representation of actual source tokens Interior nodes represent operators. Leaf nodes represent operands. CFG

Abstract Syntax Tree for Expression + E * id3 id2 id1 + id1 id3 * id2 CFG

Abstract Syntax Tree for If Statement ) exp true ( if else elsePart return if true st return CFG

Ambiguous Grammar A grammar is ambiguous if it can generate two different parse trees for one string. Ambiguous grammars can cause inconsistency in parsing. CFG

Example: Ambiguous Grammar E  E + E E  E - E E  E * E E  E / E E  id + E * id3 id2 id1 * E id2 + id1 id3 CFG

Ambiguity in Expressions Which operation is to be done first? solved by precedence An operator with higher precedence is done before one with lower precedence. An operator with higher precedence is placed in a rule (logically) further from the start symbol. solved by associativity If an operator is right-associative (or left-associative), an operand in between 2 operators is associated to the operator to the right (left). Right-associated : W + (X + (Y + Z)) Left-associated : ((W + X) + Y) + Z CFG

Precedence E  E + E E  E - E E  E * E E  E / E E  id E  F F  F * F F  F / F F  id + E * id3 id2 id1 E + * F id3 id2 id1 CFG

Precedence (cont’d) E  E + E | E - E | F F  F * F | F / F | X X  ( E ) | id (id1 + id2) * id3 * id4 * F X + E id4 id3 ) ( id1 id2 CFG

Associativity Left-associative operators E  E + F | E - F | F / F X + E id3 ) ( * id1 id4 id2 Left-associative operators E  E + F | E - F | F F  F * X | F / X | X X  ( E ) | id (id1 + id2) * id3 / id4 = (((id1 + id2) * id3) / id4) CFG

Ambiguity in Dangling Else St  IfSt | ... IfSt  if ( E ) St | if ( E ) St else St E  0 | 1 | … { if (0) { if (1) St } else St } { if (0) { if (1) St else St } } IfSt ) if ( else E St 1 IfSt ) if ( else E St 1 CFG

Disambiguating Rules for Dangling Else St  MatchedSt | UnmatchedSt UnmatchedSt  if (E) St | if (E) MatchedSt else UnmatchedSt MatchedSt  if (E) MatchedSt else MatchedSt| ... E  0 | 1 if (0) if (1) St else St = if (0) if (1) St else St St UnmatchedSt if ( E ) St MatchedSt if ( E ) MatchedSt else MatchedSt CFG

Extended Backus-Naur Form (EBNF) Kleene’s Star/ Kleene’s Closure Seq ::= St {; St} Seq ::= {St ;} St Optional Part IfSt ::= if ( E ) St [else St] E ::= F [+ E ] | F [- E] CFG

Syntax Diagrams Graphical representation of EBNF rules Examples nonterminals: terminals: sequences and choices: Examples X ::= (E) | id Seq ::= {St ;} St E ::= F [+ E ] IfSt id ( ) E id St ; F + E CFG