CSE 413 Programming Languages & Implementation Hal Perkins Autumn 2012 Context-Free Grammars and Parsing 1.

Slides:



Advertisements
Similar presentations
CPSC Compiler Tutorial 4 Midterm Review. Deterministic Finite Automata (DFA) Q: finite set of states Σ: finite set of “letters” (input alphabet)
Advertisements

Context-Free Grammars Lecture 7
Prof. Bodik CS 164 Lecture 81 Grammars and ambiguity CS164 3:30-5:00 TT 10 Evans.
ISBN Chapter 4 Lexical and Syntax Analysis The Parsing Problem Recursive-Descent Parsing.
1 Foundations of Software Design Lecture 23: Finite Automata and Context-Free Grammars Marti Hearst Fall 2002.
Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)
COP4020 Programming Languages
1 Chapter 3 Context-Free Grammars and Parsing. 2 Parsing: Syntax Analysis decides which part of the incoming token stream should be grouped together.
CPSC Compiler Tutorial 3 Parser. Parsing The syntax of most programming languages can be specified by a Context-free Grammar (CGF) Parsing: Given.
EECS 6083 Intro to Parsing Context Free Grammars
8/19/2015© Hal Perkins & UW CSEC-1 CSE P 501 – Compilers Parsing & Context-Free Grammars Hal Perkins Winter 2008.
Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth.
1 Introduction to Parsing Lecture 5. 2 Outline Regular languages revisited Parser overview Context-free grammars (CFG’s) Derivations.
Chapter 9 Syntax Analysis Winter 2007 SEG2101 Chapter 9.
BİL 744 Derleyici Gerçekleştirimi (Compiler Design)1 Syntax Analyzer Syntax Analyzer creates the syntactic structure of the given source program. This.
Introduction to Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Introduction to Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Grammars CPSC 5135.
PART I: overview material
CSE 3302 Programming Languages Chengkai Li, Weimin He Spring 2008 Syntax (cont.) Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages,
Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Parsing Introduction Syntactic Analysis I. Parsing Introduction 2 The Role of the Parser The Syntactic Analyzer, or Parser, is the heart of the front.
Bernd Fischer RW713: Compiler and Software Language Engineering.
Introduction to Parsing
CPS 506 Comparative Programming Languages Syntax Specification.
Compiler Design Introduction 1. 2 Course Outline Introduction to Compiling Lexical Analysis Syntax Analysis –Context Free Grammars –Top-Down Parsing –Bottom-Up.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Overview of Previous Lesson(s) Over View  In our compiler model, the parser obtains a string of tokens from the lexical analyzer & verifies that the.
Syntax Analysis - Parsing Compiler Design Lecture (01/28/98) Computer Science Rensselaer Polytechnic.
Unit-3 Parsing Theory (Syntax Analyzer) PREPARED BY: PROF. HARISH I RATHOD COMPUTER ENGINEERING DEPARTMENT GUJARAT POWER ENGINEERING & RESEARCH INSTITUTE.
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
11 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 7 School of Innovation, Design and Engineering Mälardalen University 2012.
Top-Down Parsing.
Syntax Analyzer (Parser)
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
1 Pertemuan 7 & 8 Syntax Analysis (Parsing) Matakuliah: T0174 / Teknik Kompilasi Tahun: 2005 Versi: 1/6.
1 Introduction to Parsing. 2 Outline l Regular languages revisited l Parser overview Context-free grammars (CFG ’ s) l Derivations.
Compiler Construction Lecture Five: Parsing - Part Two CSC 2103: Compiler Construction Lecture Five: Parsing - Part Two Joyce Nakatumba-Nabende 1.
Syntax Analysis Or Parsing. A.K.A. Syntax Analysis –Recognize sentences in a language. –Discover the structure of a document/program. –Construct (implicitly.
CSE P501 – Compilers Parsing Context Free Grammars (CFG) Ambiguous Grammars Next Spring 2014Jim Hogg - UW - CSE - P501C-1.
Syntax and Semantics Structure of programming languages.
Introduction to Parsing
Chapter 3 – Describing Syntax
lec02-parserCFG May 8, 2018 Syntax Analyzer
Parsing & Context-Free Grammars
Programming Languages Translator
CS510 Compiler Lecture 4.
Introduction to Parsing (adapted from CS 164 at Berkeley)
Syntax Specification and Analysis
Lecture 3: Introduction to Syntax (Cont’)
Parsing & Context-Free Grammars Hal Perkins Autumn 2011
Compiler Design 4. Language Grammars
(Slides copied liberally from Ruth Anderson, Hal Perkins and others)
Lecture 7: Introduction to Parsing (Syntax Analysis)
R.Rajkumar Asst.Professor CSE
LL and Recursive-Descent Parsing Hal Perkins Autumn 2011
Introduction to Parsing
Introduction to Parsing
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
LL and Recursive-Descent Parsing
BNF 9-Apr-19.
Parsing & Context-Free Grammars Hal Perkins Summer 2004
LL and Recursive-Descent Parsing Hal Perkins Autumn 2009
LL and Recursive-Descent Parsing Hal Perkins Winter 2008
lec02-parserCFG May 27, 2019 Syntax Analyzer
Parsing & Context-Free Grammars Hal Perkins Autumn 2005
Programming Languages 2nd edition Tucker and Noonan
COMPILER CONSTRUCTION
Presentation transcript:

CSE 413 Programming Languages & Implementation Hal Perkins Autumn 2012 Context-Free Grammars and Parsing 1

The Plan Parsing overview Context free grammars Grammar problems - ambiguity 2

Parsing The syntax of most programming languages can be specified by a context-free grammar (CGF) Parsing: Given a grammar G and a sentence w in L(G), traverse the derivation (parse tree) for w in some standard order and do something useful at each node –The tree might not be produced explicitly, but the control flow of a parser corresponds to a traversal 3

Old Example a = 1 ; if ( a + 1 ) b = 2 ; 4 program ::= statement | program statement statement ::= assignStmt | ifStmt assignStmt ::= id = expr ; ifStmt ::= if ( expr ) statement expr ::= id | int | expr + expr id ::= a | b | c | i | j | k | n | x | y | z int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 program statement ifStmt assignStmt statement expr assignStmt expr intid id expr int id expr int G w

“Standard Order” For practical reasons we want the parser to be deterministic (no backtracking), and we want to examine the source program from left to right. –(i.e., parse the program in linear time in the order it appears in the source file) 5

Common Orderings Top-down –Start with the root –Traverse the parse tree depth-first, left-to-right (leftmost derivation) –LL(k) Bottom-up –Start at leaves and build up to the root Effectively a rightmost derivation in reverse(!) –LR(k) and subsets (LALR(k), SLR(k), etc.) 6

“Something Useful” At each point (node) in the traversal, perform some semantic action –Construct nodes of full parse tree (rare) –Construct abstract syntax tree (common) –Construct linear, lower-level representation (more common in later parts of a modern compiler) –Generate target code or interpret on the fly (1-pass compilers & interpreters; not common in production compilers – but what we will do for our project) 7

Context-Free Grammars (review) Formally, a grammar G is a tuple where: –N a finite set of non-terminal symbols –Σ a finite set of terminal symbols –P a finite set of productions A subset of N × (N  Σ )* –S the start symbol, a distinguished element of N If not specified otherwise, this is usually assumed to be the non-terminal on the left of the first production 8

Standard Notations a, b, c elements of Σ w, x, y, z elements of Σ* A, B, C elements of N X, Y, Z elements of N Σ , ,  elements of (N Σ )* A  or A ::=  if in P 9

Derivation Relations (1)  A  =>    iff A ::=  in P –derives A =>* w if there is a chain of productions starting with A that generates w –transitive closure 10

Derivation Relations (2) w A  => lm w   iff A ::=  in P –derives leftmost  A w => rm   w iff A ::=  in P –derives rightmost Parsers normally work with only leftmost or rightmost derivations – not random orderings 11

Languages For A in N, L(A) = { w | A =>* w } –i.e., set of strings (words, terminal symbols) generated by nonterminal A If S is the start symbol of grammar G, we define L(G ) = L(S ) 12

Reduced Grammars Grammar G is reduced iff for every production A ::=  in G there is some derivation S =>* x A z => x  z =>* xyz –i.e., no production is useless Convention: we will use only reduced grammars 13

Example Top down, Leftmost derivation for: a = 1 + b ; program ::= statement | program statement statement ::= assignStmt | ifStmt assignStmt ::= id = expr ; ifStmt ::= if ( expr ) stmt expr ::= id | int | expr + expr id ::= a | b | c | i | j | k | n | x | y | z int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 14

Example Grammar S ::= aABe A ::= Abc | b B ::= d Top down, leftmost derivation of: abbcde 15

Ambiguity Grammar G is unambiguous iff every w in L(G ) has a unique leftmost (or rightmost) derivation –Fact: either unique leftmost or unique rightmost implies the other A grammar without this property is ambiguous –Other grammars that generate the same language may be unambiguous We need unambiguous grammars for parsing 16

Example: Ambiguous Grammar for Arithmetic Expressions expr ::= expr + expr | expr - expr | expr * expr | expr / expr | int int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 Exercise: show that this is ambiguous –How? Show two different leftmost or rightmost derivations for the same string –Equivalently: show two different parse trees for the same string 17

Example (cont) Give a leftmost derivation of 2+3*4 and show the parse tree 18

Example (cont) Give a different leftmost derivation of 2+3*4 and show the parse tree 19

Another example Give two different derivations of

What’s going on here? This grammar has no notion of precedence or associatively Standard solution –Create a non-terminal for each level of precedence –Isolate the corresponding part of the grammar –Force the parser to recognize higher precedence subexpressions first 21

Classic Expression Grammar expr ::= expr + term | expr – term | term term ::= term * factor | term / factor | factor factor ::= int | ( expr ) int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 22

Check: Derive * 4 23

Check: Derive Note interaction between left- vs right-recursive rules and resulting associativity 24

Check: Derive 5 + (6 + 7) 25

Another Classic Example Grammar for conditional statements stmt ::= if ( cond ) stmt | if ( cond ) stmt else stmt | assign Exercise: show that this is ambiguous –How? 26

One Derivation if ( cond ) if ( cond ) stmt else stmt stmt ::= if ( cond ) stmt | if ( cond ) stmt else stmt | assign 27

Another Derivation if ( cond ) if ( cond ) stmt else stmt stmt ::= if ( cond ) stmt | if ( cond ) stmt else stmt | assign 28

Solving if Ambiguity Fix the grammar to separate if statements with else from if statements with no else –Done in original Java reference grammar –Adds lots of non-terminals Need productions for things like “while statement that contains an unmatched if” and “while statement with only matched ifs”, etc. etc. etc. Use some ad-hoc rule in parser –“else matches closest unpaired if” 29

Parser Tools and Operators Most parser tools can cope with ambiguous grammars –Makes life simpler if used with discipline Typically one can specify operator precedence & associativity –Allows simpler, ambiguous grammar with fewer nonterminals as basis for generated parser, without creating problems 30

Parser Tools and Ambiguous Grammars Possible rules for resolving other problems –Earlier productions in the grammar preferred to later ones –Longest match used if there is a choice Parser tools normally allow for this –But be sure that what the tool does is really what you want 31

Or… If the parser is hand-written, either fudge the grammar or the parser, or cheat where it helps. to be continued… 32