CSE 3302 Programming Languages Chengkai Li, Weimin He Spring 2008 Syntax (cont.) Lecture 4 – Syntax (Cont.), Spring 20081 CSE3302 Programming Languages,

Slides:



Advertisements
Similar presentations
lec02-parserCFG March 27, 2017 Syntax Analyzer
Advertisements

CPSC Compiler Tutorial 4 Midterm Review. Deterministic Finite Automata (DFA) Q: finite set of states Σ: finite set of “letters” (input alphabet)
Session 14 (DM62) / 15 (DM63) Recursive Descendent Parsing.
CSE 3302 Programming Languages Chengkai Li, Weimin He Spring 2008 Syntax Lecture 2 - Syntax, Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai.
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Prof. Bodik CS 164 Lecture 81 Grammars and ambiguity CS164 3:30-5:00 TT 10 Evans.
ISBN Chapter 4 Lexical and Syntax Analysis The Parsing Problem Recursive-Descent Parsing.
Slide 1 Chapter 2-b Syntax, Semantics. Slide 2 Syntax, Semantics - Definition The syntax of a programming language is the form of its expressions, statements.
CS 310 – Fall 2006 Pacific University CS310 Parsing with Context Free Grammars Today’s reference: Compilers: Principles, Techniques, and Tools by: Aho,
Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)
Yu-Chen Kuo1 Chapter 2 A Simple One-Pass Compiler.
Parsing II : Top-down Parsing Lecture 7 CS 4318/5331 Apan Qasem Texas State University Spring 2015 *some slides adopted from Cooper and Torczon.
Chapter 2 Chang Chi-Chung rev.1. A Simple Syntax-Directed Translator This chapter contains introductory material to Chapters 3 to 8  To create.
CPSC Compiler Tutorial 3 Parser. Parsing The syntax of most programming languages can be specified by a Context-free Grammar (CGF) Parsing: Given.
(2.1) Grammars  Definitions  Grammars  Backus-Naur Form  Derivation – terminology – trees  Grammars and ambiguity  Simple example  Grammar hierarchies.
CSE 413 Programming Languages & Implementation Hal Perkins Autumn 2012 Context-Free Grammars and Parsing 1.
8/19/2015© Hal Perkins & UW CSEC-1 CSE P 501 – Compilers Parsing & Context-Free Grammars Hal Perkins Winter 2008.
Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth.
Parsing IV Bottom-up Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Compiler Principle and Technology Prof. Dongming LU Mar. 7th, 2014.
Syntax and Semantics Structure of programming languages.
Review: –How do we define a grammar (what are the components in a grammar)? –What is a context free grammar? –What is the language defined by a grammar?
BİL 744 Derleyici Gerçekleştirimi (Compiler Design)1 Syntax Analyzer Syntax Analyzer creates the syntactic structure of the given source program. This.
10/13/2015IT 3271 Tow kinds of predictive parsers: Bottom-Up: The syntax tree is built up from the leaves Example: LR(1) parser Top-Down The syntax tree.
Syntax and Backus Naur Form
Context-Free Grammars and Parsing
Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
3-1 Chapter 3: Describing Syntax and Semantics Introduction Terminology Formal Methods of Describing Syntax Attribute Grammars – Static Semantics Describing.
Lexical and Syntax Analysis
1 Syntax In Text: Chapter 3. 2 Chapter 3: Syntax and Semantics Outline Syntax: Recognizer vs. generator BNF EBNF.
Syntax and Semantics Structure of programming languages.
COP4020 Programming Languages Syntax Prof. Robert van Engelen (modified by Prof. Em. Chris Lacher)
1 COMP313A Programming Languages Syntax Analysis (2)
Parsing Introduction Syntactic Analysis I. Parsing Introduction 2 The Role of the Parser The Syntactic Analyzer, or Parser, is the heart of the front.
Bernd Fischer RW713: Compiler and Software Language Engineering.
CSE 425: Syntax II Context Free Grammars and BNF In context free grammars (CFGs), structures are independent of the other structures surrounding them Backus-Naur.
CPS 506 Comparative Programming Languages Syntax Specification.
COP4020 Programming Languages Parsing Prof. Xin Yuan.
Syntax The Structure of a Language. Lexical Structure The structure of the tokens of a programming language The scanner takes a sequence of characters.
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
Top-Down Parsing.
Syntax Analyzer (Parser)
1 Pertemuan 7 & 8 Syntax Analysis (Parsing) Matakuliah: T0174 / Teknik Kompilasi Tahun: 2005 Versi: 1/6.
1 Introduction to Parsing. 2 Outline l Regular languages revisited l Parser overview Context-free grammars (CFG ’ s) l Derivations.
Lesson 4 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
1 February 23, February 23, 2016February 23, 2016February 23, 2016 Azusa, CA Sheldon X. Liang Ph. D. Computer Science at Azusa Pacific University.
Syntax(1). 2 Syntax  The syntax of a programming language is a precise description of all its grammatically correct programs.  Levels of syntax Lexical.
Syntax and Semantics Structure of programming languages.
CSE 3302 Programming Languages
Chapter 3 – Describing Syntax
lec02-parserCFG May 8, 2018 Syntax Analyzer
Parsing & Context-Free Grammars
Programming Languages Translator
CS510 Compiler Lecture 4.
Parsing IV Bottom-up Parsing
4 (c) parsing.
CSE 3302 Programming Languages
Lecture 3: Introduction to Syntax (Cont’)
Parsing & Context-Free Grammars Hal Perkins Autumn 2011
(Slides copied liberally from Ruth Anderson, Hal Perkins and others)
CSE 3302 Programming Languages
Lecture 7: Introduction to Parsing (Syntax Analysis)
R.Rajkumar Asst.Professor CSE
Parsing IV Bottom-up Parsing
BNF 9-Apr-19.
Parsing & Context-Free Grammars Hal Perkins Summer 2004
lec02-parserCFG May 27, 2019 Syntax Analyzer
Parsing & Context-Free Grammars Hal Perkins Autumn 2005
COMPILER CONSTRUCTION
Presentation transcript:

CSE 3302 Programming Languages Chengkai Li, Weimin He Spring 2008 Syntax (cont.) Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

What is Parsing? Given a grammar and a token string: – determine if the grammar can generate the token string? – i.e., is the string a legal program in the language? In other words, to construct a parse tree for the token string. Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

What’s significant about parse tree? A parse tree gives a unique syntactic structure Leftmost, rightmost derivation There is only one leftmost derivation for a parse tree, and symmetrically only one rightmost derivation for a parse tree. Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Example expr  expr + expr | expr  expr | ( expr ) | number number  number digit | digit digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr number digit 3 * expr number digit 4 expr number digit 5 expr + parse tree leftmost derivation expr numberdigit 3 numberdigit 4 numberdigit 5 expr + * Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

What’s significant about parse tree? A parse tree has a unique meaning, thus provides basis for semantic analysis. (Syntax-directed semantics: semantics are attached to syntactic structure.) expr expr1 expr2 + expr.val = expr1.val + expr2.val Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Relationship among language, grammar, parser Chomsky Hierarchy A language can be described by multiple grammars, while a grammar defines one language. A grammar can be parsed by multiple parsers, while a parser accepts one grammar, thus one language. Should design a language that allows simple grammar and efficient parser For a language, we should construct a grammar that allows fast parsing Given a grammar, we should build an efficient parser Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Ambiguity Ambiguous grammar: There can be multiple parse trees for the same sentence (program) – In other words, multiple leftmost derivations. Why is it bad? – Multiple meanings Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Ambiguity Was this ambiguous? How about this? number  number digit | digit digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr  expr - expr | number Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Deal with Ambiguity Unambiguous Grammar – Rewrite the grammar to avoid ambiguity. Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Example of Ambiguity: Precedence expr  expr + expr | expr  expr | ( expr ) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr 3 * parse tree 1 Two different parse tress for expression 3+4*5 expr 5 * parse tree expr 4 Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Eliminating Ambiguity for Precedence Establish “precedence cascade”: using different structured names for different constructs, adding grammar rules. – Higher precedence : lower in cascade expr  expr + expr | term term  term  term | ( expr ) | number expr  expr + expr | expr  expr | ( expr ) | number Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Example of Ambiguity: Associativity expr  expr - expr | ( expr ) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr parse tree 1 Two different parse tress for expression expr 1 - parse tree expr 2 expr.val = 4 expr.val = 2 - is right-associative, which is against common practice in integer arithmetic - is left-associative, which is what we usually assume Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Associativity Left-Associative: + - * / Right-Associative: = What is meant by a=b=c=1? Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Eliminating Ambiguity for Associativity left-associativity: left-recursion expr  expr - term | term term  (expr) | number expr  expr - expr | ( expr ) | number right-associativity: right-recursion expr  expr = expr | a | b | c expr  term = expr | term term  a | b | c Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Putting Together expr  expr - expr | expr / expr | ( expr ) | number number  number digit | digit digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr  expr - term | term term  term / factor | factor factor  ( expr ) | number number  number digit | digit digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 We want to make - left-associative and / has precedence over - Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Example of Ambiguity: Dangling-Else stmt  if( expr ) stmt | if( expr ) stmt else stmt | other-stmt Two different parse trees for “if( expr ) if( expr ) stmt else stmt” stmt (expr) if stmt (expr) parse tree 1parse tree 2 stmt else stmt (expr) if stmt (expr)stmt else Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Eliminating Dangling-Else stmt  matched_stmt | unmatched_stmt matched_stmt  if( expr ) matched_stmt else matched_stmt | other-stmt unmatched_stmt  if( expr ) stmt | if( expr ) matched_stmt else un matched_stmt Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

EBNF Repetition { } Option [ ] expr  expr - term | term if-stmt  if ( expr ) stmt | if ( expr ) stmt else stmt if-stmt  if ( expr ) stmt [ else stmt ] number  digit | number digitnumber  { digit } signed-number  sign number | number signed-number  [ sign ] number expr  term  term  Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Syntax Diagrams Written from EBNF, not BNF If-statement (more examples on page 101) if else statement expression statement () if-statement Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Parsing Techniques Intuitive analysis and conclusion No formal theorems and rigorous proofs More details: compilers, automata theory Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Parsing Parsing: – Determine if a grammar can generate a given token string. – That is, to construct a parse tree for the token string. Two ways of constructing the parse tree – Top-down (from root towards leaves) Can be constructed more easily by hand – Bottom-up (from leaves towards root) Can handle a larger class of grammars Parser generators tend to use bottom-up methods Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Top-Down Parser Recursive-descent parser: – A special kind of top-down parser: single left-to-right scan, with one lookahead symbol. – Backtracking (trial-and-error) may happen Predictive parser: – The lookahead symbol determines which production to apply, without backtracking Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Recursive-Descent Parser Types in Pascal type  simple | array [ simple ] of type simple  char | integer Input: array [ integer ] of char Parse tree type simple array type [] of integer char simple Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Challenge 1: Top-Down Parser Cannot Handle Left-Recursion expr  expr - term | term term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 Input: 3 – 4 – 5 Parse tree expr term - exprterm - expr term - Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Eliminating Left-Recursion void expr(void) { term(); while (token == ‘-’) { match(‘-’); term(); } expr  expr - term | term term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr  term { - term } term  0 | 1 | … | 9 ( EBNF ) expr  term expr’ expr’  term expr’ | ε term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Challenge 2: Backtracking is Inefficient Backtracking: trial-and-error type  simple | array [ simple ] of type simple  char | integer Input: array [ integer ] of char Parse tree simple char type simple array type [] of X integer (Types in Pascal ) Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Challenge 2: Backtracking is Inefficient We cannot avoid backtracking if the grammar has multiple productions to apply, given a lookahead symbol. Solution: – Change the grammar so that there is only one applicable production that is unambiguously determined by lookahead symbol. subscription  term | term.. term term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Avoiding Backtracking by Left Factoring expr  term | term.. term term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr  term rest rest  term | ε term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 A   A   A’ A’  Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

Left Factoring Using EBNF expr  expr | term expr  term expr] if-statement  if ( expr ) statement | if ( expr ) statement else statement if-statement  if ( expr ) statement [ else statement ] Lecture 4 – Syntax (Cont.), Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008