Session 14 (DM62) / 15 (DM63) Recursive Descent Parsing.


2 Learning objectives
- Be able to explain the differences between regular and context-free languages (recursive rules).
- Be able to understand context-free grammars described in, for example, BNF.
- Be able to explain how context-free languages can be parsed by recursive descent (syntax trees).
- Be able to build a recursive-descent parser from a simple context-free grammar (BNF).

3 The Translation Process
- A compiler consists of a number of logical layers and components.

4 Parsing
- Parsing (syntax analysis) is the task of determining whether a program is syntactically correct.
- In doing so, the parser determines the syntactic structure of the program, usually in the form of a parse tree or syntax tree.
- This structure guides the rest of the translation process.
- The syntax is defined by the grammar rules of a context-free grammar.
- Grammar rules are defined in a manner similar to regular expressions. The major difference is that grammar rules are recursive. There is no * operation; repetition is expressed through recursion.
- There are two general categories of parsing algorithm:
  - Top-down parsing
  - Bottom-up parsing

5 Context-free Grammars
- A context-free grammar is a specification for the syntactic structure of a programming language.
- As a running example, we will use simple integer arithmetic expressions:
    exp -> exp op exp | ( exp ) | number
    op -> + | - | *
  where number is a token defined by a regular expression.
- The vertical bar | means choice.
- Concatenation is also used as a standard operation.
- Note the recursive nature of the definition of exp.
- Note also that the rules use tokens defined by regular expressions as symbols. That is, the rules are defined over an alphabet that consists of tokens.
- We also need a symbol ε for the empty string of tokens.

6 Programming Language
- Context-free grammar rules determine a programming language: the set of legal strings of tokens.
- For example, (34-3)*42 corresponds to the legal string of seven tokens defined by exp:
    ( number - number ) * number
- On the other hand, the string (34-3*42 corresponds to the illegal string of six tokens:
    ( number - number * number
- Grammar rules are sometimes called productions because they "produce" strings in the language.
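The token counts on this slide can be checked with a small scanner. The following is a minimal Java sketch, not code from the course: the class name TokenSketch and the method tokenize are made up for illustration, and every run of digits is assumed to form one number token.

```java
import java.util.ArrayList;
import java.util.List;

class TokenSketch {
    // Turns a character string into the string of tokens that the grammar is defined over.
    // Assumption: a number is a run of decimal digits; whitespace is skipped.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                // Consume the whole run of digits as a single token.
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add("number");
            } else if ("()+-*".indexOf(c) >= 0) {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("(34-3)*42")); // seven tokens
        System.out.println(tokenize("(34-3*42"));  // six tokens
    }
}
```

The scanner accepts both strings, because each is a valid sequence of tokens. Deciding that the second one is illegal is the job of the parser, not the scanner.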

7 Backus-Naur Form (BNF)
- Grammar rules using this form are said to be in Backus-Naur form (BNF).
- A BNF for Pascal will begin with grammar rules such as:
    program -> program_heading ; program_block .
    program_heading -> program ...
    program_block -> statements …
    statements -> statements ; statement | statement
    statement -> if_statement | assign_statement | ..
    assign_statement -> identifier := exp ;
- program is called the start symbol.
- program, program_heading, program_block, statements, statement and assign_statement are called nonterminals.
- The tokens program, identifier and := are examples of terminals.

8 Derivation
- A derivation is a sequence of replacements of structure names by choices on the right-hand sides of grammar rules.
- As an example, consider a derivation for the arithmetic expression (34 - 3) * 42.
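One possible leftmost derivation of ( number - number ) * number can be replayed mechanically. The Java sketch below is an illustration under assumptions: the names DerivationSketch, step and derive are hypothetical, and the sequence of choices passed in main is just one derivation that works, not necessarily the one shown in the lecture.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

class DerivationSketch {
    // The running example grammar: each nonterminal maps to its right-hand-side choices.
    static final Map<String, List<String>> RULES = Map.of(
        "exp", List.of("exp op exp", "( exp )", "number"),
        "op", List.of("+", "-", "*"));

    // Replaces the leftmost nonterminal in form by rhs, after checking that rhs is a legal choice.
    static List<String> step(List<String> form, String rhs) {
        for (int i = 0; i < form.size(); i++) {
            List<String> choices = RULES.get(form.get(i));
            if (choices != null) {
                if (!choices.contains(rhs)) {
                    throw new IllegalArgumentException(form.get(i) + " cannot be replaced by " + rhs);
                }
                List<String> next = new ArrayList<>(form.subList(0, i));
                next.addAll(Arrays.asList(rhs.split(" ")));
                next.addAll(form.subList(i + 1, form.size()));
                return next;
            }
        }
        throw new IllegalStateException("No nonterminal left to replace");
    }

    // Starts from the start symbol exp and applies the given choices one by one.
    static List<String> derive(String... choices) {
        List<String> form = List.of("exp");
        List<String> lines = new ArrayList<>();
        lines.add(String.join(" ", form));
        for (String rhs : choices) {
            form = step(form, rhs);
            lines.add("=> " + String.join(" ", form));
        }
        return lines;
    }

    public static void main(String[] args) {
        derive("exp op exp", "( exp )", "exp op exp", "number", "-", "number", "*", "number")
            .forEach(System.out::println);
    }
}
```

Each printed line is one sentential form. The last line contains only terminals, which is what makes the token string a member of the language.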

9 Parse Tree
- A parse tree corresponding to a derivation is a labeled tree in which:
  - the interior nodes are labeled by non-terminals,
  - the leaf nodes are labeled by terminals,
  - and the children of each internal node represent the replacement of the associated non-terminal.

10 Abstract Syntax Tree
- A parse tree contains more information than is absolutely necessary for a compiler to produce object code.
- Abstract syntax trees can be thought of as a tree representation of a shorthand notation called abstract syntax.

11 Ambiguous Grammars
- Consider the simple integer arithmetic grammar
    exp -> exp op exp | ( exp ) | number
    op -> + | - | *
  and consider the string 34-3*42. This string has two different parse trees.
- Exercise: Draw two different parse trees for the expression 34-3*42.

12 Ambiguous Grammars
- Consider the simple integer arithmetic grammar
    exp -> exp op exp | ( exp ) | number
    op -> + | - | *
  and consider the string 34-3*42. This string has two different parse trees.
- A grammar that generates a string with two distinct parse trees is called an ambiguous grammar (a serious problem). Which one is correct?
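Why the choice of tree matters becomes visible when both trees for 34-3*42 are evaluated. The following Java sketch builds the two trees by hand as abstract syntax trees; the names AmbiguitySketch, Num, BinOp, minusAtRoot and timesAtRoot are invented for this illustration.

```java
class AmbiguitySketch {
    interface Node {
        int eval();
    }

    // Leaf node: a number token with its value.
    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public int eval() { return value; }
        public String toString() { return Integer.toString(value); }
    }

    // Interior node: an operator applied to two subtrees.
    static final class BinOp implements Node {
        final char op;
        final Node left;
        final Node right;
        BinOp(char op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        public int eval() {
            int l = left.eval();
            int r = right.eval();
            switch (op) {
                case '+': return l + r;
                case '-': return l - r;
                case '*': return l * r;
                default: throw new IllegalStateException("Unknown operator: " + op);
            }
        }
        public String toString() { return "(" + left + " " + op + " " + right + ")"; }
    }

    // Tree 1: the subtraction is the root, so the multiplication is done first.
    static Node minusAtRoot() {
        return new BinOp('-', new Num(34), new BinOp('*', new Num(3), new Num(42)));
    }

    // Tree 2: the multiplication is the root, so the subtraction is done first.
    static Node timesAtRoot() {
        return new BinOp('*', new BinOp('-', new Num(34), new Num(3)), new Num(42));
    }

    public static void main(String[] args) {
        System.out.println(minusAtRoot() + " = " + minusAtRoot().eval()); // -92
        System.out.println(timesAtRoot() + " = " + timesAtRoot().eval()); // 1302
    }
}
```

Both trees have the same leaves in the same order, yet they evaluate to -92 and 1302. Only the first agrees with the usual precedence of * over -.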

13 Ambiguous Grammars
- Two basic methods are used to deal with ambiguities:
  - Disambiguating rule: State a rule that specifies, in each ambiguous case, which of the parse trees is the correct one. This will correct the ambiguity without changing the grammar, but the syntax is then no longer given by the BNF rules alone.
  - Changing the grammar: We can change the grammar into a different grammar that is unambiguous. This will often complicate the grammar.

14 Ambiguous Grammars
- To remove the ambiguity in the integer arithmetic grammar, we could simply state a disambiguating rule that establishes the relative precedences of the three operations +, - and *, and that subtraction is considered to be left associative.
- To remove the ambiguity without a disambiguating rule (preferable), we must:
  - Group the operators into groups of equal precedence
  - Make subtraction (or all operators) left associative
Exercise: Draw the syntax tree for 34-3*42 using this grammar. Is there more than one? Is operator precedence ok?

15 Extended Backus-Naur Form
- Repetitive and optional constructs are common in programming languages, and thus in BNF grammar rules. Therefore the BNF notation is sometimes extended to include:
  - Repetition
      BNF (left recursive):
        A -> A a | b
      EBNF:
        A -> b { a }
  - Optional
      BNF:
        statement -> if-stmt | other
        if-stmt -> if ( exp ) statement | if ( exp ) statement else statement
        exp -> 0 | 1
      EBNF:
        statement -> if-stmt | other
        if-stmt -> if ( exp ) statement [ else statement ]
        exp -> 0 | 1
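The two EBNF constructs map directly onto control flow: braces become a loop and brackets become a conditional. The following Java sketch recognizes both example grammars from this slide over a list of token strings; the class EbnfSketch and its helper methods are assumptions made for illustration.

```java
import java.util.List;

class EbnfSketch {
    private final List<String> tokens;
    private int pos = 0;

    EbnfSketch(List<String> tokens) { this.tokens = tokens; }

    // Current token, or a marker that matches nothing once the input is used up.
    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<end>"; }

    private void expect(String token) {
        if (!peek().equals(token)) {
            throw new IllegalStateException("Expected " + token + " but found " + peek());
        }
        pos++;
    }

    // A -> b { a } : the braces become a while loop.
    void parseA() {
        expect("b");
        while (peek().equals("a")) {
            pos++;
        }
    }

    // statement -> if-stmt | other
    void parseStatement() {
        if (peek().equals("if")) {
            parseIfStmt();
        } else {
            expect("other");
        }
    }

    // if-stmt -> if ( exp ) statement [ else statement ] : the brackets become an if.
    void parseIfStmt() {
        expect("if");
        expect("(");
        parseExp();
        expect(")");
        parseStatement();
        if (peek().equals("else")) {
            pos++;
            parseStatement();
        }
    }

    // exp -> 0 | 1
    void parseExp() {
        if (peek().equals("0") || peek().equals("1")) {
            pos++;
        } else {
            throw new IllegalStateException("Expected 0 or 1 but found " + peek());
        }
    }

    static boolean acceptsA(String... input) {
        EbnfSketch p = new EbnfSketch(List.of(input));
        try { p.parseA(); return p.pos == p.tokens.size(); }
        catch (IllegalStateException e) { return false; }
    }

    static boolean acceptsStatement(String... input) {
        EbnfSketch p = new EbnfSketch(List.of(input));
        try { p.parseStatement(); return p.pos == p.tokens.size(); }
        catch (IllegalStateException e) { return false; }
    }

    public static void main(String[] args) {
        System.out.println(acceptsA("b", "a", "a"));
        System.out.println(acceptsStatement("if", "(", "1", ")", "other", "else", "other"));
    }
}
```

Because parseIfStmt consumes an else as soon as it sees one, a nested if binds each else to the nearest unmatched if. That is a design choice of this sketch and goes beyond what the grammar itself states.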

16 Syntax diagram
- Graphical representations for BNF or EBNF rules are called syntax diagrams. They consist of:
  - oval boxes, which indicate terminals
  - rectangles, which indicate non-terminals
  - arrowed lines, which represent sequencing and choices
As an example, consider the grammar rule
    factor -> ( exp ) | number

17 Exercises
- Draw the syntax diagram for:
    if-statement -> if ( exp ) statement | if ( exp ) statement else statement
    exp -> true | false
- Write down the derivation and syntax tree for the following expression: 3-(4+5*6)

18 Context-Free Grammar for TINY
Exercise: Draw syntax diagrams that define this part of the TINY grammar:

19 Top-Down Parsing
- A top-down parsing algorithm parses an input string of tokens by tracing out the steps in a leftmost derivation.
- Top-down parsers come in two forms:
  - Predictive parsers attempt to predict the next construction in the input string using one or more lookahead tokens.
  - Backtracking parsers try different possibilities for a parse of the input (slow).
- There are two kinds of top-down parsing algorithm:
  - Recursive-descent parsing (suitable for handwritten parsers)
  - LL(1) parsing (less commonly used in practice)

20 Recursive-Descent
- The idea of recursive-descent parsing is simple:
  - We view the grammar rule for a non-terminal A as a definition for a method that will recognize an A.
  - The right-hand side of the grammar rule specifies the code structure:
    - A choice corresponds to alternatives (if-statements or a case-statement).
    - Non-terminals correspond to calls to other methods.
- Recursive-descent parsing is important in connection with XML. XML parsers of the DOM type use recursive descent.
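The idea of one method per non-terminal can be sketched for arithmetic expressions. The rule exp -> exp op exp is left recursive and cannot be turned into a method directly, so this Java sketch assumes a layered EBNF grammar instead: exp -> term { (+|-) term }, term -> factor { * factor }, factor -> ( exp ) | number. That grammar is a common textbook choice and an assumption here, as are the class name ExprParser and its method names. For brevity the sketch works on characters rather than on tokens and computes the value instead of building a tree.

```java
class ExprParser {
    private final String input;
    private int pos = 0;

    ExprParser(String input) { this.input = input.replace(" ", ""); }

    // Current character, or '\0' once the input is used up.
    private char peek() { return pos < input.length() ? input.charAt(pos) : '\0'; }

    // exp -> term { (+|-) term } ; the loop makes + and - left associative.
    int parseExp() {
        int value = parseTerm();
        while (peek() == '+' || peek() == '-') {
            char op = input.charAt(pos++);
            int right = parseTerm();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // term -> factor { * factor } ; a deeper rule, so * binds tighter than + and -.
    int parseTerm() {
        int value = parseFactor();
        while (peek() == '*') {
            pos++;
            value *= parseFactor();
        }
        return value;
    }

    // factor -> ( exp ) | number ; the choice is made by looking at one character.
    int parseFactor() {
        if (peek() == '(') {
            pos++;
            int value = parseExp();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ) at position " + pos);
            }
            pos++;
            return value;
        }
        if (!Character.isDigit(peek())) {
            throw new IllegalArgumentException("Expected number or ( at position " + pos);
        }
        int start = pos;
        while (Character.isDigit(peek())) {
            pos++;
        }
        return Integer.parseInt(input.substring(start, pos));
    }

    // Parses the whole text and rejects trailing input.
    static int evaluate(String text) {
        ExprParser parser = new ExprParser(text);
        int value = parser.parseExp();
        if (parser.pos != parser.input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("(34-3)*42")); // 1302
        System.out.println(evaluate("34-3*42"));   // -92
    }
}
```

The illegal string (34-3*42 is rejected inside parseFactor, because the closing parenthesis is missing when the nested call to parseExp returns.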

21 Recursive-Descent: small example
- Identifiers described in BNF (usually one would use regular expressions):
    ::= |
    ::= a|b|…|z
    ::= 0|1|…|9
C#-code

22 Recursive-Descent: small example, now in Java!!
Java-code
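The Java code from the slide is not included in this transcript. As a stand-in, here is a minimal sketch of a recursive-descent recognizer for identifiers. It assumes that an identifier is one lowercase letter followed by any number of lowercase letters or digits. The class name IdentifierParser and the helper rule rest are hypothetical and may differ from the grammar used in the lecture.

```java
class IdentifierParser {
    private final String input;
    private int pos = 0;

    IdentifierParser(String input) { this.input = input; }

    private boolean atLetter() {
        return pos < input.length() && input.charAt(pos) >= 'a' && input.charAt(pos) <= 'z';
    }

    private boolean atDigit() {
        return pos < input.length() && input.charAt(pos) >= '0' && input.charAt(pos) <= '9';
    }

    // letter ::= a|b|...|z
    private void letter() {
        if (!atLetter()) {
            throw new IllegalArgumentException("Letter expected at position " + pos);
        }
        pos++;
    }

    // digit ::= 0|1|...|9
    private void digit() {
        if (!atDigit()) {
            throw new IllegalArgumentException("Digit expected at position " + pos);
        }
        pos++;
    }

    // identifier ::= letter rest   (assumed form of the rule)
    private void identifier() {
        letter();
        rest();
    }

    // rest ::= letter rest | digit rest | empty ; the recursion replaces a loop.
    private void rest() {
        if (atLetter()) {
            letter();
            rest();
        } else if (atDigit()) {
            digit();
            rest();
        }
        // Otherwise the empty alternative applies and nothing is consumed.
    }

    // True if the whole text is an identifier under the assumed grammar.
    static boolean isIdentifier(String text) {
        IdentifierParser parser = new IdentifierParser(text);
        try {
            parser.identifier();
            return parser.pos == text.length();
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("x1"));
        System.out.println(isIdentifier("1x"));
    }
}
```

Each grammar rule has its own method, and each choice in a rule is decided by inspecting the next character, which is what makes the parser predictive.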

23 Exercises
- Write a recursive-descent parser for the grammar that defines integers:
    ::= 0|1|2|3|4|5|6|7|8|9
    ::= +|-
    ::= |
    ::= |
- Look at the Java code for the small English grammar. Rewrite the code in C#.

24 Exercises - Extra
- Modify the grammar for integers so that decimals are accepted:
    ::= 0|1|2|3|4|5|6|7|8|9
    ::= +|-
    ::= |
    ::= |
- Write a recursive-descent parser for the grammar that defines decimals:
    ::=
    ::= |
    ::= 0|1|2|3|4|5|6|7|8|9
    ::= +|-
    ::= .