Compiler Lecture Note, Syntax Analysis

Slides:



Advertisements
Similar presentations
Parsing V: Bottom-up Parsing
Advertisements

Compiler Construction
Bottom-up Parsing A general style of bottom-up syntax analysis, known as shift-reduce parsing. Two types of bottom-up parsing: Operator-Precedence parsing.
1 Parsing The scanner recognizes words The parser recognizes syntactic units Parser operations: Check and verify syntax based on specified syntax rules.
Predictive Parsing l Find derivation for an input string, l Build a abstract syntax tree (AST) –a representation of the parsed program l Build a symbol.
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)
1 The Parser Its job: –Check and verify syntax based on specified syntax rules –Report errors –Build IR Good news –the process can be automated.
1 Chapter 4: Top-Down Parsing. 2 Objectives of Top-Down Parsing an attempt to find a leftmost derivation for an input string. an attempt to construct.
Parsing IV Bottom-up Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
BOTTOM-UP PARSING Sathu Hareesh Babu 11CS10039 G-8.
BİL 744 Derleyici Gerçekleştirimi (Compiler Design)1 Syntax Analyzer Syntax Analyzer creates the syntactic structure of the given source program. This.
 How to check whether an input string is a sentence of a grammar and how to construct a parse tree for the string.  A Parser for grammar G is a program.
-Mandakinee Singh (11CS10026).  What is parsing? ◦ Discovering the derivation of a string: If one exists. ◦ Harder than generating strings.  Two major.
Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.
The meaning of it all! The role of finite automata and grammars in compiler design.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
1 Context free grammars  Terminals  Nonterminals  Start symbol  productions E --> E + T E --> E – T E --> T T --> T * F T --> T / F T --> F F --> (F)
1 Nonrecursive Predictive Parsing  It is possible to build a nonrecursive predictive parser  This is done by maintaining an explicit stack.
Top-down Parsing. 2 Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Top-Down Parsing.
Syntax Analyzer (Parser)
1 Pertemuan 7 & 8 Syntax Analysis (Parsing) Matakuliah: T0174 / Teknik Kompilasi Tahun: 2005 Versi: 1/6.
CSE 5317/4305 L3: Parsing #11 Parsing #1 Leonidas Fegaras.
COMP 3438 – Part II-Lecture 5 Syntax Analysis II Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
COMP 3438 – Part II-Lecture 6 Syntax Analysis III Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
Syntax Analysis By Noor Dhia Syntax analysis:- Syntax analysis or parsing is the most important phase of a compiler. The syntax analyzer considers.
Syntax Analysis By Noor Dhia Left Recursion: Example1: S → S0s1s | 01 The grammar after eliminate left recursion is: S → 01 S’ S' → 0s1sS’
WELCOME TO A JOURNEY TO CS419 Dr. Hussien Sharaf Dr. Mohammad Nassef Department of Computer Science, Faculty of Computers and Information, Cairo University.
Parsing #1 Leonidas Fegaras.
lec02-parserCFG May 8, 2018 Syntax Analyzer
Parsing — Part II (Top-down parsing, left-recursion removal)
Chapter 4 - Parsing CSCE 343.
Parsing Bottom Up CMPS 450 J. Moloney CMPS 450.
Programming Languages Translator
Bottom-up parsing Goal of parser : build a derivation
CS510 Compiler Lecture 4.
Bottom-Up Parsing.
Parsing and Parser Parsing methods: top-down & bottom-up
Unit-3 Bottom-Up-Parsing.
UNIT - 3 SYNTAX ANALYSIS - II
Parsing IV Bottom-up Parsing
Table-driven parsing Parsing performed by a finite state machine.
Parsing — Part II (Top-down parsing, left-recursion removal)
Top-down parsing cannot be performed on left recursive grammars.
Compiler Construction
CS 404 Introduction to Compiler Design
4 (c) parsing.
Parsing Techniques.
Subject Name:COMPILER DESIGN Subject Code:10CS63
Top-Down Parsing CS 671 January 29, 2008.
4d Bottom Up Parsing.
Lecture 8 Bottom Up Parsing
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Compiler Design 7. Top-Down Table-Driven Parsing
Lecture 7: Introduction to Parsing (Syntax Analysis)
Lecture 8: Top-Down Parsing
Bottom Up Parsing.
Parsing IV Bottom-up Parsing
Parsing — Part II (Top-down parsing, left-recursion removal)
Chapter 4: Lexical and Syntax Analysis Sangho Ha
Syntax Analysis - Parsing
BNF 9-Apr-19.
Compiler Construction
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
lec02-parserCFG May 27, 2019 Syntax Analyzer
Parsing CSCI 432 Computer Science Theory
Presentation transcript:

Compiler Lecture Note, Syntax Analysis 컴파일러 입문 제 6 장 구 문 분 석

목 차 I. 구문 분석 방법 II. 구문 분석기의 출력 III. Top­down 방법 IV. Bottom-up 방법 Compiler Lecture Note, Syntax Analysis 목 차 I. 구문 분석 방법 II. 구문 분석기의 출력 III. Top­down 방법 IV. Bottom-up 방법

Compiler Lecture Note, Syntax Analysis 구문 분석 방법 Text p.226 ▶ How to check whether an input string is a sentence of a grammar and how to construct a parse tree for the string. ? Parsing : ∈L(G) ▶ A Parser for grammar G is a program that takes as input a string ω and produces as output either a parse tree(or derivation tree) for ω, if ω is a sentence of G, or an error message indicating that ω is not sentence of G. A sequence of tokens parser correct sentence : parse tree incorrect sentence : error message

▶ Two basic types of parsers for context-free grammars Compiler Lecture Note, Syntax Analysis ▶ Two basic types of parsers for context-free grammars ① Top down - starting with the root and working down to the leaves. recursive descent parser, predictive parser. ② Bottom up - beginning at the leaves and working up the root. precedence parser, shift-reduce parser. ex) A → XYZ A reduce expand bottom-up X Y Z top-down "start symbol로" "sentence로"

구문 분석기의 출력 ▶ The output of a parser: ① Parse - left parse, right parse Compiler Lecture Note, Syntax Analysis 구문 분석기의 출력 Text p.231 ▶ The output of a parser: ① Parse - left parse, right parse ② Parse tree ③ Abstract syntax tree ex) G : 1. E → E + T string : a + a * a 2. E → T 3. T → T * F 4. T → F 5. F →(E) 6. F → a

- left parse : a sequence of production rule numbers applied Compiler Lecture Note, Syntax Analysis - left parse : a sequence of production rule numbers applied in leftmost derivation. E  E + T  T + T  F + T 1 2 4  a + T  a + T * F  a + F * F 6 3 4  a + a * F  a + a * a 6 6 ∴ 1 2 4 6 3 4 6 6 - right parse : reverse order of production rule numbers applied in rightmost derivation. E  E + T  E + T * F  E + T * a 1 3 6  E + F * a  E + a * a  T + a * a 4 6 2  F + a * a  a + a * a 4 6 ∴ 6 4 2 6 4 6 3 1

parse tree : derivation tree E string : a + a * a Compiler Lecture Note, Syntax Analysis parse tree : derivation tree E string : a + a * a E + T T T * F F F a a a

::= a transformed parse tree that is a more efficient representation Compiler Lecture Note, Syntax Analysis ▶ Abstract Syntax Tree(AST) ::= a transformed parse tree that is a more efficient representation of the source program. leaf node - operand(identifier or constant) internal node - operator(meaningful production rule name) ex) G: 1. E → E + T  add 2. E → T 3. T → T * F  mul 4. T → F 5. F →(E) 6. F → a string : a + a * a add / \ a mul a a Text p.235

※ 의미 있는 terminal  terminal node Compiler Lecture Note, Syntax Analysis ※ 의미 있는 terminal  terminal node 의미 있는 production rule  nonterminal node → naming : compiler designer가 지정. ex) if a > b then x = a + 1; else x = b; /* C */ Text p.237 if-st gt assign-st assign-st a b x add x b a 1

▶ Representation of a parse tree Compiler Lecture Note, Syntax Analysis ▶ Representation of a parse tree Implicit representation - the sequence of production rule numbers used in derivation Explicit representation - a linked list structure * left son/right brother LLINK INFO RLINK LLINK INFO RLINK INFO

Compiler Lecture Note, Syntax Analysis Top-Down 방법 Text p.238 ::= Beginning with the start symbol of the grammar, it attempts to produce a string of terminal symbol that is identical to a given source string. This matching process proceeds by successively applying the productions of the grammar to produce substrings from nonterminals. ::= In the terminology of trees, this is moving from the root of the tree to a set of leaves in the parse tree for a program. ▶ Top-Down parsing methods (1) Parsing with backup or backtracking. (2) Parsing with limited or partial backup. (3) Parsing with nobacktracking. - backtracking : making repeated scans of the input.

▶ General Top-Down Parsing method - called a brute-force method Compiler Lecture Note, Syntax Analysis ▶ General Top-Down Parsing method - called a brute-force method - with backtracking (  Top-Down parsing with full backup ) 1. Given a particular nonterminal that is to be expanded, the first production for this nonterminal is applied. 2. Compare the newly expanded string with the input string. In the matching process, terminal symbol is compared with an input symbol and nonterminal symbol is selected for expansion and its first production is applied. 3. If the generated string does not match the input string, an incorrect expansion occurs. In the case of such an incorrect expansion this process is backed up by undoing the most recently applied production. And the next production of this nonterminal is used as next expansion. 4. This process continues either until the generated string becomes an input string or until there are no further productions to be tried. In the latter case, the given string cannot be generated from the grammar. ex) text p. 239 [예 8]

▶ Several problems with top-down parsing method Compiler Lecture Note, Syntax Analysis ▶ Several problems with top-down parsing method (1) left recursion  A nonterminal A is left recursive if A  Aα for some α.  A grammar G is left recursive if it has a left-recursive nonterminal. ⇒ A left-recursive grammar can cause a top down parser to go into an infinite loop. ∴ eliminate the left recursion. (2) backtracking  the repeated scanning of input string.  the speed of parsing is much slower. (very time consuming) → the conditions for nobacktracking : FIRST, FOLLOW을 이용하여 formal하게 정의. +

▶ Elimination of left recursion Compiler Lecture Note, Syntax Analysis ▶ Elimination of left recursion direct left-recursion : A → Aα ∈ P indirect left-recursion : A  Aα  general form : A → Aα |  A = Aα +  = α*  introducing new nonterminal A' which generates α*. A' → αA' | ε ==> A → A' +

A → Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn Compiler Lecture Note, Syntax Analysis ex) E → E + T | T T → T  F | F F → (E) | a E  E(+T)*  T(+T)* | | E'  E' → +TE' |  ※ E → TE' E' → +TE' |   general method : A → Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn ==> A → β1A' | β2A' | ... | βnA' A' → α1A' |α 2A' | ... | αmA' |  *

Compiler Lecture Note, Syntax Analysis ▶ Left-factoring  if A →  |  are two A-productions and the input begins with a non-empty string derived from , we do not know whether to expand A to  or to  . ==> left-factoring : the process of factoring out the common prefixes of alternates.  method : A →  |  ==> A → (|) ==> A → A', A' →  |  ex) S → iCtS | iCtSeS | a C → b

::= deterministic selection of the production rule to be applied. Compiler Lecture Note, Syntax Analysis S → iCtS | iCtSeS → iCtS( | eS) ∴ S → iCtSS' | a S' →  | eS C → b ▶ No-backtracking ::= deterministic selection of the production rule to be applied.

Compiler Lecture Note, Syntax Analysis Bottom-up 방법 Text p.247 ::= Reducing a given string to the start symbol of the grammar. ::= It attempts to construct a parse tree for an input string beginning at the leaves (the bottom) and working up towards the root(the top). ex) G: S → aAcBe string : abbcde A → Ab | b B → d S A A B a b b c d e

Compiler Lecture Note, Syntax Analysis Reduce [Def 3.1] reduce : the replacement of the right side of a production with the left side. S  , A →  ∈ P rm  S  A   rm rm [Def 3.2] handle : If S  A  , then  is a handle of . rm rm [Def 3.3] handle pruning : S  r0  r1  ...  rn-1  rn =  rm rm rm rm rm  ==> rn-1 ==> rn-2 ==> ... ==> S "reduce sequence" ex) G : S → bAe ω : b a ; a e A → a;A | a * * *

Right Sentential form Reducing Production Compiler Lecture Note, Syntax Analysis - reduce sequence : 3 2 1 S  b A e  b a ; A e  b a ; a e 1 2 3 - right parse : 3 2 1 (rightmost derivation in reverse)  note Bottom-Up parsing의 main aspect는 right sentential form에서 handle을 찾아 적용할 production rule을 deterministic하게 선택하는 것이다. Right Sentential form Reducing Production ba ; ae A→ a 3 ba ; Ae A → a;A 2 bAe S → bAe 1 S

Shift-Reduce Parsing ::= a bottom-up style of parsing. Compiler Lecture Note, Syntax Analysis Shift-Reduce Parsing ::= a bottom-up style of parsing. ▶ Two problems for automatic parsing 1. How to find a handle in a right sentential form. 2. What production to choose in case there is more than one production with the same right hand side. ====> grammar의 종류에 따라 방법이 결정되지만 handle를 유지하기 위하여 stack을 사용한다.

Shift-Reduce Parser ▶ Four actions of a shift-reduce parser ω$ : input Compiler Lecture Note, Syntax Analysis ▶ Four actions of a shift-reduce parser "Stack top과 input symbol에 따라 파싱 테이블을 참조해서 action을 결정." 1. shift : the next input symbol is shifted to the top of the stack. 2. reduce : the handle is reduced to the left side of production. 3. accept : the parser announces successful completion of parsing. 4. error : the parser discovers that a syntax error has occurred and calls an error recovery routine. ω$ : input Sn Shift-Reduce Parser output . . Parsing Table $ stack

ex) G: E →E + T | T string : a + a  a T →T  F | F F → (E) | a Compiler Lecture Note, Syntax Analysis ex) G: E →E + T | T string : a + a  a T →T  F | F F → (E) | a STACK INPUT ACTION -------------- -------------- --------------------- (1) $ a + a  a $ shift a (2) $a + a  a $ reduce F → a (3) $F + a  a $ reduce T → F (4) $T + a  a $ reduce E → T (5) $E + a  a $ shift + (6) $E + a  a $ shift a (7) $E + a  a $ reduce F → a (8) $E + F  a $ reduce T → F (9) $E + T  a $ shift  (10) $E + T  a $ shift a (11) $E + T  a $ reduce F → a (12) $E + T  F $ reduce T → T * F (13) $E + T $ reduce E → E + T (14) $E $ accept

<Thinking points> Compiler Lecture Note, Syntax Analysis <Thinking points> 1. the handle will always eventually appear on top of the stack, never inside.  ∵ rightmost derivation in reverse. stack에 있는 contents와 input에 남아 있는 string이 합해져서 right sentential form을 이룬다. 따라서 항상 stack의 top부분이 reduce된다. 2. How to make a parsing table for a given grammar. → 문법의 종류에 따라 만드는 방법이 다르다. SLR(Simple LR) LALR(LookAhead LR) CLR(Canonical LR)

▶ Constructing a Parse tree Compiler Lecture Note, Syntax Analysis ▶ Constructing a Parse tree 1. shift : create a terminal node labeled the shifted symbol. 2. reduce : A → X1X2...Xn. (1) A new node labeled A is created. (2) The X1,X2,...,Xn are made direct descendants of the new node. (3) If A → ε, then the parser merely creates a node labeled A with no descendants. ex) G : 1. LIST → LIST , ELEMENT 2. LIST → ELEMENT 3. ELEMENT → a string : a , a

Step STACK INPUT ACTION PARSETREE ELEMENT LIST Compiler Lecture Note, Syntax Analysis Step STACK INPUT ACTION PARSETREE (1) (2) (3) (4) (5) (6) (7) (8) a,a$ ,a$ a$ $ $LIST , ELEMENT $LIST $LIST , $LIST , a $ELEMENT $a shift a shift , reduce 3 reduce 1 reduce 2 accept Build Node Build Tree return that tree $ ELEMENT LIST a ,

1. build node : 의미있는 terminal을 shift. Compiler Lecture Note, Syntax Analysis ▶ Constructing an AST 1. build node : 의미있는 terminal을 shift. 2. build tree : 의미있는 생성 규칙으로 reduce. ex) G: 1. LIST → LIST, ELEMENT 2. LIST → ELEMENT 3. ELEMENT → a string: a,a 의미있는 terminal : a 의미있는 생성 규칙 : 0번 ===> 0. ACCEPT → LIST ⇒ list 1. LIST → LIST, ELEMENT

a a STACK INPUT ACTION AST (1) $ a,a$ shift a Build Node Compiler Lecture Note, Syntax Analysis STACK INPUT ACTION AST (1) $ a,a$ shift a Build Node (2) $a ,a$ reduce 3 (3) $ELEMENT ,a$ reduce 2 (4) $LIST ,a$ shift , (5) $LIST , a$ shift a Build Node (6) $LIST , a $ reduce 3 (7) $LIST , ELEMENT $ reduce 1 (8) $LIST $ reduce 0 Build Tree (9) $ACCEPT $ accept return that tree list a a