Last Chapter Review Source code characters combination lexemes tokens pattern Non-Formalization Description Formalization Description Regular Expression.

Presentation transcript:

Last Chapter Review
Source code: characters combine into lexemes; lexemes are classified as tokens, and each token is described by a pattern.
Describing patterns: informal description, or formal description by regular expressions.
Strings and languages: letters combine into strings over an alphabet; a language is a set of strings.
Operations on languages (a table on the original slide): union L ∪ M, concatenation LM, exponentiation, closure L*, positive closure L+.
Computer realization: transition diagrams, nondeterministic finite automata (NFA), deterministic finite automata (DFA). NFAs and DFAs are equivalent (subset construction); finite automata are minimized by merging indistinguishable states.
Other labels on the slide: manual, state enumeration, syntax-directed, Lex.

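Chapter 3 Syntax Analysis
[Figure: position of the parser in the front end. Source program, Lexical Analyzer, Parser, Rest of Front End, Intermediate Representation. The parser asks the lexical analyzer for the next token and receives a token; it passes a parse tree to the rest of the front end. The Symbol Table is shared by these phases.]
Contents:
  Context-Free Grammars
  Top-Down Parsing and Bottom-Up Parsing
  Automatic generation of parsers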

Syntax analysis: tokens -> syntactic phrases (parse tree)
[Figure: parse tree for initial + rate * 60, whose token string is id + id * num. The leaves are identifier (initial), identifier (rate) and num (60), joined by the operators + and *. Natural-language analogy: "Excellent DLUT Student" is analyzed as adjective (excellent) plus noun (DLUT Student), that is, attribute plus object.]

[Figure: the lexical analyzer (regular expressions) turns a character string into tokens; the syntactic analyzer groups tokens into expressions, sentences, program blocks and the program, producing a parse tree.]

3.1 Context-free Grammar: Definition
Regular expressions define only simple languages: a given structure repeated a fixed number of times or an unspecified number of times, e.g. a(ba)^5 and a(ba)*.
Regular expressions cannot define languages with properly balanced parentheses or nested block structure, e.g. the set of strings of matched parentheses, or {wcw | w is a string of a's and b's}.
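
The contrast can be sketched in Python. This is only an illustration: the names matches_a_ba_star and is_balanced are hypothetical helpers, not part of the slides. The first pattern is regular, so the standard re module handles it; balanced parentheses need an unbounded counter, which is exactly what a finite automaton lacks.

```python
import re

# Regular pattern from the slide: an 'a' followed by any number of "ba" blocks.
A_BA_STAR = re.compile(r"a(ba)*")


def matches_a_ba_star(s: str) -> bool:
    """True if the whole string has the form a(ba)*."""
    return A_BA_STAR.fullmatch(s) is not None


def is_balanced(s: str) -> bool:
    """True if s consists only of properly balanced parentheses."""
    depth = 0  # unbounded counter: the part a finite automaton cannot provide
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its matching '('
                return False
        else:
            return False
    return depth == 0
```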

3.1 Context-free Grammar
A context-free grammar is a 4-tuple (V_T, V_N, S, P):
  V_T: the set of terminals
  V_N: the set of nonterminals
  S: the start symbol
  P: the set of productions, each of the form A → α
Example: ({id, +, *, -, (, )}, {expr, op}, expr, P), where P is
  expr → expr op expr
  expr → ( expr )
  expr → - expr
  expr → id
  op → +
  op → *
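
One possible way to hold the 4-tuple in a program is sketched below. The class name Grammar, its field names and the is_well_formed check are assumptions made for illustration; the example instance encodes the expr/op grammar from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # V_T
    nonterminals: frozenset   # V_N
    start: str                # S
    productions: tuple        # P: pairs (head, body), body is a tuple of symbols

    def is_well_formed(self) -> bool:
        """Check the basic constraints of the 4-tuple definition."""
        symbols = self.terminals | self.nonterminals
        return (
            self.start in self.nonterminals
            and not (self.terminals & self.nonterminals)
            and all(
                head in self.nonterminals and all(s in symbols for s in body)
                for head, body in self.productions
            )
        )


# The example grammar from the slide.
EXPR_GRAMMAR = Grammar(
    terminals=frozenset({"id", "+", "*", "-", "(", ")"}),
    nonterminals=frozenset({"expr", "op"}),
    start="expr",
    productions=(
        ("expr", ("expr", "op", "expr")),
        ("expr", ("(", "expr", ")")),
        ("expr", ("-", "expr")),
        ("expr", ("id",)),
        ("op", ("+",)),
        ("op", ("*",)),
    ),
)
```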

3.1 Context-free Grammar: Simplified Representation
The following symbols usually represent terminals:
  1) Lowercase letters early in the alphabet, e.g. a, b, c
  2) Boldface strings, e.g. id, while
  3) The digits 0, 1, ..., 9
  4) Punctuation symbols, e.g. parentheses, comma
  5) Operator symbols, e.g. +, -
The following symbols usually represent nonterminals:
  1) Uppercase letters early in the alphabet, e.g. A, B, C
  2) The letter S, which usually represents the start symbol
  3) Lowercase names, e.g. expr, stmt
In addition:
  1) Uppercase letters late in the alphabet, such as X, Y, represent grammar symbols, that is, either nonterminals or terminals
  2) Lowercase letters late in the alphabet, such as u, v, ..., represent strings of terminals
  3) Lowercase Greek letters represent strings of grammar symbols
  4) If A → α1 and A → α2, then we write A → α1 | α2

3.1 Context-free Grammar: Example
({id, +, *, -, (, )}, {expr, op}, expr, P)
  expr → expr op expr
  expr → ( expr )
  expr → - expr
  expr → id
  op → +
  op → *
Simplified representation:
  E → E A E | ( E ) | - E | id
  A → + | *

3.1 Context-free Grammar: Comparison of Context-free Grammars and Regular Expressions
Context-free grammar:
  E → E A E | ( E ) | - E | id
  A → + | *
Regular expression:
  letter → [A-Za-z]
  digit → [0-9]
  id → letter ( letter | digit )*

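3.1 Context-free Grammar: Derivations
Productions are treated as rewriting rules: each step replaces a nonterminal by the body of one of its productions.
Example:
  E → E + E | E * E | ( E ) | - E | id
  E ⇒ - E ⇒ - ( E ) ⇒ - ( E + E ) ⇒ - ( id + E ) ⇒ - ( id + id )
Notation: S ⇒* α (derives in zero or more steps), S ⇒+ w (derives in one or more steps).
Terms to define:
  Sentential form: any string α of grammar symbols with S ⇒* α
  Sentence: a sentential form that contains no nonterminals
  Context-free language: a language that can be generated by a context-free grammar
  Equivalent grammars: grammars that generate the same language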

3.1 Context-free Grammar: Leftmost and Rightmost Derivations
  E → E + E | E * E | ( E ) | - E | id
  E ⇒ - E ⇒ - ( E ) ⇒ - ( E + E )
Leftmost derivation:
  E ⇒lm - E ⇒lm - ( E ) ⇒lm - ( E + E ) ⇒lm - ( id + E ) ⇒lm - ( id + id )
Rightmost derivation (canonical derivation):
  E ⇒rm - E ⇒rm - ( E ) ⇒rm - ( E + E ) ⇒rm - ( E + id ) ⇒rm - ( id + id )
Question: how are the leftmost derivation and the rightmost derivation related?
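
The two derivation orders can be replayed with a small sketch. The helper names derive_step and derive are hypothetical, and the sketch does not check that each body is a real production of the grammar; it only shows which nonterminal each strategy rewrites.

```python
NONTERMINALS = {"E"}


def derive_step(form, body, leftmost=True):
    """Replace the leftmost (or rightmost) nonterminal in form with body."""
    positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
    if not positions:
        raise ValueError("no nonterminal left to expand")
    i = positions[0] if leftmost else positions[-1]
    return form[:i] + list(body) + form[i + 1:]


def derive(bodies, leftmost=True):
    """Apply the production bodies in order and collect every sentential form."""
    form = ["E"]
    forms = [" ".join(form)]
    for body in bodies:
        form = derive_step(form, body, leftmost)
        forms.append(" ".join(form))
    return forms


# Bodies used by both derivations of - ( id + id ) on the slide.
BODIES = [["-", "E"], ["(", "E", ")"], ["E", "+", "E"], ["id"], ["id"]]
LEFTMOST = derive(BODIES, leftmost=True)
RIGHTMOST = derive(BODIES, leftmost=False)
```

Both lists start with E and end with the same sentence; they differ only in the second-to-last sentential form.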

3.1 Context-free Grammar: Parse Tree
  E ⇒lm - E ⇒lm - ( E ) ⇒lm - ( E + E ) ⇒lm - ( id + E ) ⇒lm - ( id + id )
  E ⇒rm - E ⇒rm - ( E ) ⇒rm - ( E + E ) ⇒rm - ( E + id ) ⇒rm - ( id + id )
[Figure: parse tree for - ( id + id ).]

3.1 Context-free Grammar: Ambiguity
Two leftmost derivations of id * id + id:
  E ⇒ E * E ⇒ id * E ⇒ id * E + E ⇒ id * id + E ⇒ id * id + id
  E ⇒ E + E ⇒ E * E + E ⇒ id * E + E ⇒ id * id + E ⇒ id * id + id
[Figure: the two corresponding parse trees, one with * at the root and one with + at the root.]
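
The ambiguity can be checked by counting parse trees. The sketch below is an illustration under a simplifying assumption: it uses only the fragment E → E + E | E * E | id (no parentheses, no unary minus), and count_parse_trees is a hypothetical helper name.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of tokens under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of distinct trees deriving tokens[lo:hi] from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        for k in range(lo + 1, hi - 1):
            if tokens[k] in ("+", "*"):
                # tokens[k] is the operator applied at the root of the tree.
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))
```

Any sentence with a count greater than one, such as id * id + id, shows that the grammar is ambiguous.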

3.2 Language and Grammar: Advantages and Disadvantages of Context-free Grammars
Advantages:
  A grammar gives a precise, easy-to-understand specification of the syntax of a language
  An efficient parser can be generated automatically from a grammar
  A grammar defines the hierarchical structure of the language
  A language defined by a grammar is easier to modify
Disadvantage:
  A grammar can describe most, but not all, of the syntax of a language

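3.2 Language and Grammar: Comparison of Regular Expressions and Context-free Grammars
Regular expression: (a|b)*ab
Grammar:
  A0 → a A0 | b A0 | a A1
  A1 → b A2
  A2 → ε
[Figure: NFA with states 0, 1, 2. State 0 is the start state and loops on a and b; state 0 goes to state 1 on a; state 1 goes to state 2 on b.]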

3.2 Language and Grammar: Comparison of Regular Expressions and Context-free Grammars
NFA → context-free grammar:
  1) Determine the set of terminals
  2) For each state i, create a nonterminal Ai
  3) If state i has a transition to state j on input a, add the production Ai → a Aj
  4) If i is an accepting state, add Ai → ε
[Figure: NFA with states 0, 1, 2. State 0 is the start state and loops on a and b; state 0 goes to state 1 on a; state 1 goes to state 2 on b.]
Grammar obtained from the NFA:
  A0 → a A0 | b A0 | a A1
  A1 → b A2
  A2 → ε
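
The construction can be sketched as follows. The function name nfa_to_grammar and the transition list are assumptions: the NFA figure is not preserved in the transcript, so the transitions below are chosen to agree with the grammar shown on the slide, with state 2 taken as the accepting state.

```python
def nfa_to_grammar(transitions, accepting):
    """Build productions from NFA transitions.

    transitions: iterable of (state, symbol, next_state) triples.
    accepting: set of accepting states.
    Returns a dict mapping each nonterminal "A<i>" to a list of bodies;
    the empty tuple stands for epsilon.
    """
    productions = {}
    for i, a, j in transitions:
        # Transition i --a--> j gives the production Ai -> a Aj.
        productions.setdefault(f"A{i}", []).append((a, f"A{j}"))
        productions.setdefault(f"A{j}", [])
    for i in accepting:
        # Accepting state i gives the production Ai -> epsilon.
        productions.setdefault(f"A{i}", []).append(())
    return productions


# Assumed NFA for (a|b)*ab: start state 0, accepting state 2.
NFA = [(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2)]
GRAMMAR = nfa_to_grammar(NFA, accepting={2})
```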

3.2 Language and Grammar: Why Regular Expressions Are Used to Define Lexical Syntax
  Lexical rules are simple and do not need a context-free grammar.
  Describing tokens with regular expressions is simple and easy to understand.
  A lexical analyzer built from regular expressions is efficient.

3.2 Language and Grammar: Reasons for Separating Lexical Analysis from Syntax Analysis
  It simplifies the design
  It improves the compiler's efficiency
  It enhances the compiler's portability
  It makes the compiler front end easy to partition into modules

3.2 Language and Grammar: Verifying the Language Generated by a Grammar
  G: S → ( S ) S | ε
  L(G) = the set of strings of balanced parentheses
First, show that every sentence derivable from S is balanced, by induction on the number of steps n in a derivation.

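3.2 Language and Grammar: Verifying the Language Generated by a Grammar
  G: S → ( S ) S | ε
  L(G) = the set of strings of balanced parentheses
Inductive proof on the number of steps n in a derivation:
  Basis: the only one-step derivation is S ⇒ ε, and the empty string is balanced.
  Hypothesis: every derivation of fewer than n steps produces a balanced string.
  Inductive step: an n-step leftmost derivation has the form S ⇒ ( S ) S ⇒* ( x ) S ⇒* ( x ) y. The derivations of x and y from S take fewer than n steps, so x and y are balanced, and therefore ( x ) y is balanced.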

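3.2 Language and Grammar: Verifying the Language Generated by a Grammar
  G: S → ( S ) S | ε
  L(G) = the set of strings of balanced parentheses
Conversely, every balanced string is derivable from S. Proof by induction on the length of the string:
  Basis: the empty string is derivable, since S ⇒ ε.
  Hypothesis: every balanced string of length less than 2n is derivable from S.
  Inductive step: consider a balanced string w of length 2n (n ≥ 1). Write w = ( x ) y, where ( x ) is the shortest non-empty balanced prefix of w. Then x and y are balanced and shorter than 2n, so both are derivable from S, and S ⇒ ( S ) S ⇒* ( x ) S ⇒* ( x ) y.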
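
The two directions of the proof can be spot-checked by brute force for short strings. This is a sanity check under a length bound, not a substitute for the induction; generated, is_balanced and all_balanced are hypothetical helper names.

```python
from itertools import product


def generated(max_len):
    """All strings of length <= max_len derivable from S -> ( S ) S | epsilon."""
    by_len = {0: {""}}  # S => epsilon
    for n in range(2, max_len + 1, 2):
        strings = set()
        # S => ( S ) S =>* ( x ) y with len(x) + len(y) = n - 2
        for lx in range(0, n - 1, 2):
            ly = n - 2 - lx
            for x in by_len[lx]:
                for y in by_len[ly]:
                    strings.add("(" + x + ")" + y)
        by_len[n] = strings
    return set().union(*by_len.values())


def is_balanced(s):
    """True if the parentheses in s are properly balanced."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


def all_balanced(max_len):
    """All balanced strings over '(' and ')' of length <= max_len, by brute force."""
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product("()", repeat=n)
        if is_balanced("".join(chars))
    }
```

Comparing generated(8) with all_balanced(8) exercises both inclusions for every string of up to eight characters.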

3.2 Language and Grammar: Proper Expression Grammar
Expression productions:
  E → E + E | E * E | ( E ) | - E | id
View the expression as a hierarchy:
  id * id * (id+id) + id * id + id
[Figure: the two parse trees produced by the ambiguous grammar, one with * at the root and one with + at the root.]

3.2 Language and Grammar: Proper Expression Grammar
View the expression as a hierarchy:
  id * id * (id+id) + id * id + id     an expression is a sum of terms
  id * id * (id+id)                    a term is a product of factors
Grammar:
  expr → expr + term | term
  term → term * factor | factor
  factor → id | ( expr )

3.2 Language and Grammar: Proper Expression Grammar
  expr → expr + term | term
  term → term * factor | factor
  factor → id | ( expr )
[Figure: parse trees of id * id * id and id + id * id.]
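
The grouping imposed by the expr/term/factor hierarchy can be seen in a small tree builder. This is a sketch under an assumption: the left-recursive productions are realized as loops that build left-associative trees, which is a common hand-written substitute and not a direct transcription of the grammar. parse_expr is a hypothetical name and trees are nested tuples.

```python
def parse_expr(tokens):
    """Parse a token list into a nested-tuple tree; raise ValueError on bad input."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise ValueError(f"expected {tok!r} at position {pos}")
        pos += 1

    def expr():
        # expr -> expr + term | term, realized as a left-associative loop
        nonlocal pos
        node = term()
        while peek() == "+":
            pos += 1
            node = ("+", node, term())
        return node

    def term():
        # term -> term * factor | factor, realized the same way
        nonlocal pos
        node = factor()
        while peek() == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def factor():
        # factor -> id | ( expr )
        if peek() == "(":
            expect("(")
            node = expr()
            expect(")")
            return node
        expect("id")
        return "id"

    tree = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token at position {pos}")
    return tree
```

For id + id * id the multiplication ends up below the addition, and id * id * id groups to the left, matching the two parse trees named on the slide.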

3.2 Language and Grammar: Eliminating Ambiguity
  stmt → if expr then stmt
       | if expr then stmt else stmt
       | other
Sentential form: if expr then if expr then stmt else stmt
Two leftmost derivations:
  stmt ⇒ if expr then stmt
       ⇒ if expr then if expr then stmt else stmt
  stmt ⇒ if expr then stmt else stmt
       ⇒ if expr then if expr then stmt else stmt

3.2 Language and Grammar: Unambiguous Grammar
  stmt → matched_stmt
       | unmatched_stmt
  matched_stmt → if expr then matched_stmt else matched_stmt
               | other
  unmatched_stmt → if expr then stmt
                 | if expr then matched_stmt else unmatched_stmt
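
The grouping that the unambiguous grammar enforces, each else paired with the nearest unmatched then, can be illustrated with a greedy hand-written parser. This is an assumption-laden sketch: it does not implement the matched_stmt/unmatched_stmt productions directly, it only produces the same grouping. parse_stmt is a hypothetical name, and the tokens expr and other stand for whole expressions and non-if statements.

```python
def parse_stmt(tokens):
    """Parse if/then/else tokens, attaching each else to the nearest unmatched then."""
    pos = 0

    def take(expected):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != expected:
            raise ValueError(f"expected {expected!r} at position {pos}")
        pos += 1

    def stmt():
        if pos < len(tokens) and tokens[pos] == "if":
            take("if")
            take("expr")
            take("then")
            then_branch = stmt()
            # Greedy choice: an else right here belongs to this (innermost) if.
            if pos < len(tokens) and tokens[pos] == "else":
                take("else")
                return ("if", then_branch, stmt())
            return ("if", then_branch)
        take("other")
        return "other"

    tree = stmt()
    if pos != len(tokens):
        raise ValueError(f"unexpected token at position {pos}")
    return tree
```

On if expr then if expr then other else other, the else ends up inside the inner if, so the outer if has no else branch.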

3.2 Language and Grammar: Elimination of Left Recursion
A grammar is left recursive if it has a nonterminal A such that A ⇒+ A α for some string α.
Immediate left recursion:
  A → A α | β     (β does not begin with A; A generates strings of the form β α ... α)
Eliminating immediate left recursion:
  A → β A'
  A' → α A' | ε
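
The rewriting rule can be sketched as a small function. The name eliminate_immediate_left_recursion and the tuple encoding of production bodies are assumptions for illustration, and the sketch assumes there is no production of the form A → A.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A', A' -> a A' | epsilon.

    bodies: list of tuples of symbols; the empty tuple stands for epsilon.
    Returns a dict with productions for head and the new nonterminal head + "'".
    """
    # The alpha parts: bodies that start with the head, with the head removed.
    recursive = [body[1:] for body in bodies if body and body[0] == head]
    # The beta parts: bodies that do not start with the head.
    others = [body for body in bodies if not body or body[0] != head]
    if not recursive:
        return {head: list(bodies)}
    new = head + "'"
    return {
        head: [beta + (new,) for beta in others],
        new: [alpha + (new,) for alpha in recursive] + [()],
    }
```

For example, E → E + T | T becomes E → T E' and E' → + T E' | ε.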

3.2 Language and Grammar: Example, Arithmetic Expression Grammar
  E → E + T | T     (T + T ... + T)
  T → T * F | F     (F * F ... * F)
  F → ( E ) | id
Grammar after eliminating left recursion:
  E → T E'
  E' → + T E' | ε
  T → F T'
  T' → * F T' | ε
  F → ( E ) | id
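
Once left recursion is removed, each nonterminal can become one function of a recursive-descent recognizer. The sketch below follows the transformed grammar production by production; the function name accepts and the one-token lookahead choice are assumptions for illustration, and no parse tree is built.

```python
def accepts(tokens):
    """Return True if tokens is derivable from E in the transformed grammar."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def match(tok):
        nonlocal pos
        if peek() != tok:
            raise ValueError(f"expected {tok!r} at position {pos}")
        pos += 1

    def E():            # E -> T E'
        T()
        E_prime()

    def E_prime():      # E' -> + T E' | epsilon
        if peek() == "+":
            match("+")
            T()
            E_prime()

    def T():            # T -> F T'
        F()
        T_prime()

    def T_prime():      # T' -> * F T' | epsilon
        if peek() == "*":
            match("*")
            F()
            T_prime()

    def F():            # F -> ( E ) | id
        if peek() == "(":
            match("(")
            E()
            match(")")
        else:
            match("id")

    try:
        E()
    except ValueError:
        return False
    return pos == len(tokens)
```

With the original left-recursive productions, the function for E would call itself before consuming any input and never terminate, which is why the transformation is needed for this style of parser.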

3.2 Language and Grammar: Non-Immediate Left Recursion
  S → A a | b
  A → S d | ε
Substitute the S-productions into A → S d to turn it into immediate left recursion:
  S → A a | b
  A → A a d | b d | ε
Then eliminate the immediate left recursion:
  S → A a | b
  A → b d A' | A'
  A' → a d A' | ε

Exercise