7.1 Deterministic Top-Down Parsing: One Pass, No Backup

Presentation transcript:

Deterministic top-down parsing ::= deterministic selection of the production rule to apply at each step of top-down syntax analysis.
One pass, no backup:
1. The input string is scanned once, from left to right.
2. The parsing process is deterministic.
Top-down parsing with no backup ::= deterministic top-down parsing, also called LL parsing ("Left-to-right scanning and Left parse").

How to decide which production is to be applied:
sentential form : ω1 ω2 … ωi-1 X α
input string : ω1 ω2 … ωi-1 ωi ωi+1 … ωn
When X → α1 | α2 | ... | αk ∈ P, the parser looks at ωi and must pick exactly one of the X-productions.
The condition for no backtracking is stated in terms of FIRST and FOLLOW (=> LL condition).

FIRST
FIRST(α) ::= the set of terminals that begin the strings derived from α. If α ⇒* ε, then ε is also in FIRST(α).
FIRST(A) ::= { a ∈ VT ∪ {ε} | A ⇒* aγ, γ ∈ V* }.
Computation of FIRST(X), where X ∈ V:
1) if X ∈ VT, then FIRST(X) = {X}.
2) if X ∈ VN and X → aα ∈ P, then FIRST(X) = FIRST(X) ∪ {a};
   if X → ε ∈ P, then FIRST(X) = FIRST(X) ∪ {ε}.
3) if X → Y1Y2…Yk ∈ P and Y1Y2…Yi-1 ⇒* ε, then FIRST(X) = FIRST(X) ∪ ((FIRST(Y1) ∪ … ∪ FIRST(Yi)) − {ε});
   if Y1Y2…Yk ⇒* ε, then FIRST(X) = FIRST(X) ∪ {ε}.

Example: computing FIRST [1/2]
ex1) E → TE′
     E′ → +TE′ | ε
     T → FT′
     T′ → *FT′ | ε
     F → (E) | id
FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
FIRST(E′) = {+, ε}
FIRST(T′) = {*, ε}
ex2) PROGRAM → begin d semi X end
     X → d semi X
     X → s Y
     Y → semi s Y | ε
FIRST(PROGRAM) = {begin}
FIRST(X) = {d, s}
FIRST(Y) = {semi, ε}
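The three FIRST rules can be applied repeatedly until no set grows any further. The following Python sketch does this for the grammar of ex1. The dictionary encoding of the grammar, the "ε" marker string, and the names `first_of_string` and `compute_first` are assumptions made for illustration; they do not come from the slides.

```python
EPS = "ε"  # assumed marker for the empty string

# Grammar of ex1; each right-hand side is a list of symbols, [] stands for ε.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a symbol string, given the FIRST sets found so far."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})  # rule 1: a terminal's FIRST is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol was nullable, or the string is empty
    return result


def compute_first(grammar):
    """Apply rules 2) and 3) until no FIRST set changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
```

Under these assumptions, `FIRST["E"]` comes out as the set containing `(` and `id`, and `FIRST["T'"]` as the set containing `*` and ε, in agreement with ex1.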

FOLLOW
FOLLOW(A) ::= the set of terminals that can appear immediately to the right of A in some sentential form. If A can be the rightmost symbol in some sentential form, then $ is in FOLLOW(A).
FOLLOW(A) ::= { a ∈ VT ∪ {$} | S ⇒* αAaβ, α, β ∈ V* }.
※ $ is the right end marker of the input.
Computation of FOLLOW(A):
1) Initially, FOLLOW(S) = {$}.
2) if A → αBβ ∈ P and β ≠ ε, then FOLLOW(B) = FOLLOW(B) ∪ (FIRST(β) − {ε}).
3) if A → αB ∈ P, or A → αBβ ∈ P and β ⇒* ε, then FOLLOW(B) = FOLLOW(B) ∪ FOLLOW(A).

Example: computing FOLLOW [1/2]
E → TE′
E′ → +TE′ | ε
T → FT′
T′ → *FT′ | ε
F → (E) | id
Nullable = { E′, T′ }
FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
FIRST(E′) = {+, ε}
FIRST(T′) = {*, ε}
FOLLOW(E) = {), $}
FOLLOW(E′) = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T′) = {+, ), $}
FOLLOW(F) = {*, +, ), $}
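The FOLLOW rules can be iterated to a fixed point in the same way as FIRST. The Python sketch below takes the FIRST sets of this example as given input and derives FOLLOW for the expression grammar. The data layout and the function names are illustrative assumptions, not part of the original material.

```python
EPS, END = "ε", "$"  # assumed markers for the empty string and the end of input

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

# FIRST sets copied from the example rather than recomputed here.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}


def first_of_string(symbols):
    """FIRST of a symbol string; contains ε if the whole string is nullable."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # a terminal's FIRST is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                   # rule 1
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:   # FOLLOW is defined for nonterminals only
                        continue
                    beta_first = first_of_string(rhs[i + 1:])
                    new = beta_first - {EPS}  # rule 2
                    if EPS in beta_first:     # rule 3: β is empty or nullable
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
```

With these inputs the sketch yields the same FOLLOW sets as listed in the example, for instance `FOLLOW["F"]` contains `*`, `+`, `)` and `$`.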

LL condition
LL condition ::= no-backup condition ::= the condition for deterministic parsing with the top-down method.
Basic idea:
input : ω1 ω2 ... ωi-1 ωi ... ωn
derived string : ω1 ω2 ... ωi-1 X α
X → α1 | α2 | ... | αm
By looking at ωi, the parser deterministically selects, among the X-productions, the rule used to expand X.
Definition: for every A → α | β ∈ P,
1. FIRST(α) ∩ FIRST(β) = ∅
2. if α ⇒* ε (that is, ε ∈ FIRST(α)), then FOLLOW(A) ∩ FIRST(β) = ∅

Example: LL condition
A → aBc | Bc | dAa
B → bB | ε
FIRST(A) = {a, b, c, d}   FOLLOW(A) = {$, a}
FIRST(B) = {b, ε}         FOLLOW(B) = {c}
Checking the LL condition:
1) For A → aBc | Bc | dAa: FIRST(aBc) = {a}, FIRST(Bc) = {b, c}, FIRST(dAa) = {d}. These three sets are pairwise disjoint, and no alternative derives ε.
2) For B → bB | ε: FIRST(bB) ∩ FOLLOW(B) = {b} ∩ {c} = ∅.
By 1) and 2), the grammar satisfies the LL condition.
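The check carried out by hand above can be written as a small function that compares every pair of alternatives of one nonterminal. This is a sketch under stated assumptions: the FIRST set of each alternative and FOLLOW(A) are supplied by the caller, and the function name `satisfies_ll_condition` is hypothetical.

```python
from itertools import combinations

EPS = "ε"  # assumed marker for the empty string


def satisfies_ll_condition(alt_first_sets, follow_a):
    """Check the LL condition for one nonterminal A.

    alt_first_sets: list with the FIRST set of each alternative of A.
    follow_a:       FOLLOW(A).
    """
    for fa, fb in combinations(alt_first_sets, 2):
        if fa & fb:                      # condition 1: FIRST sets must be disjoint
            return False
        if EPS in fa and follow_a & fb:  # condition 2, with α nullable
            return False
        if EPS in fb and follow_a & fa:  # condition 2, with β nullable
            return False
    return True


# A -> aBc | Bc | dAa, with FOLLOW(A) = {$, a}
A_OK = satisfies_ll_condition([{"a"}, {"b", "c"}, {"d"}], {"$", "a"})
# B -> bB | ε, with FOLLOW(B) = {c}
B_OK = satisfies_ll_condition([{"b"}, {EPS}], {"c"})
```

Both `A_OK` and `B_OK` are true for the sets of the example. Comparing alternatives pairwise matters when a nonterminal has three or more alternatives, because an empty intersection of all three sets does not rule out an overlap between two of them.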

7.2 Recursive-Descent Parser
Recursive-descent parsing ::= a top-down method that uses a set of recursive procedures to recognize its input with no backtracking. Create one procedure for each nonterminal.
ex) G : S → aA | bB
        A → aA | c
        B → bB | d

procedure pS;
begin
  if nextSymbol = ta then begin getNextSymbol; pA end
  else if nextSymbol = tb then begin getNextSymbol; pB end
  else error
end;

procedure pA;
begin
  if nextSymbol = ta then begin getNextSymbol; pA end
  else if nextSymbol = tc then getNextSymbol
  else error
end;

procedure pB; ...

/* main */
begin
  getNextSymbol;
  pS;
  if nextSymbol = '$' then accept else error
end.

※ The procedure call sequence corresponds to a leftmost derivation. For ω = aac$, the calls pS, pA, pA mirror S ⇒ aA ⇒ aaA ⇒ aac.
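The procedures above translate almost line for line into a runnable program. The Python sketch below does so for G, treating each character of the input as one token. The class layout, the `ParseError` exception, and the helper `accepts` are assumptions made for illustration, and `p_B` is written by analogy with pA because its body is elided in the slide.

```python
class ParseError(Exception):
    """Raised where the pseudocode calls `error`."""


class Parser:
    def __init__(self, text):
        self.tokens = list(text) + ["$"]  # one character per token, then the end marker
        self.pos = 0

    @property
    def next_symbol(self):
        return self.tokens[self.pos]

    def get_next_symbol(self):
        self.pos += 1

    def p_S(self):  # S -> aA | bB
        if self.next_symbol == "a":
            self.get_next_symbol()
            self.p_A()
        elif self.next_symbol == "b":
            self.get_next_symbol()
            self.p_B()
        else:
            raise ParseError("expected a or b")

    def p_A(self):  # A -> aA | c
        if self.next_symbol == "a":
            self.get_next_symbol()
            self.p_A()
        elif self.next_symbol == "c":
            self.get_next_symbol()
        else:
            raise ParseError("expected a or c")

    def p_B(self):  # B -> bB | d (assumed body, by analogy with p_A)
        if self.next_symbol == "b":
            self.get_next_symbol()
            self.p_B()
        elif self.next_symbol == "d":
            self.get_next_symbol()
        else:
            raise ParseError("expected b or d")


def accepts(text):
    """Main routine: parse S, then require the end marker."""
    parser = Parser(text)
    try:
        parser.p_S()
    except ParseError:
        return False
    return parser.next_symbol == "$"
```

For example, `accepts("aac")` returns `True`, while `accepts("ab")` returns `False` because pA sees b where only a or c is allowed.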

LOOKAHEAD of a production
The main problem in constructing a recursive-descent syntax analyzer is choosing which production to apply when a procedure is first entered. To resolve this, we compute the lookahead of each production.
Definition : LOOKAHEAD(A → α) = FIRST({ω | S ⇒* μAβ ⇒ μαβ ⇒* μω ∈ VT*}).
Meaning : the set of terminals that can begin a string generated by α; if α ⇒* ε, the terminals in FOLLOW(A) are added to the set.
Computing formula: LOOKAHEAD(A → X1X2...Xn) = FIRST(X1X2...Xn) ⊕ FOLLOW(A),
where the ring sum is defined as P ⊕ Q = P if ε ∉ P, and P ⊕ Q = (P − {ε}) ∪ Q if ε ∈ P.

Example: computing LOOKAHEAD
S → aSA | ε
A → c
Nullable set = {S}
FIRST(S) = {a, ε}   FOLLOW(S) = {$, c}
FIRST(A) = {c}      FOLLOW(A) = {$, c}
LOOKAHEAD(S → aSA) = FIRST(aSA) ⊕ FOLLOW(S) = {a}
LOOKAHEAD(S → ε) = FIRST(ε) ⊕ FOLLOW(S) = {$, c}
LOOKAHEAD(A → c) = FIRST(c) ⊕ FOLLOW(A) = {c}
※ Order of computation : Nullable => FIRST => FOLLOW => LOOKAHEAD

Strong LL condition
Definition : for every A → α | β ∈ P, LOOKAHEAD(A → α) ∩ LOOKAHEAD(A → β) = ∅.
Meaning : for each distinct pair of productions with the same left-hand side, the parser can select the unique alternative that derives a string beginning with the current input symbol.
The grammar G is said to be strong LL(1) if it satisfies the strong LL condition.
ex) G : S → aSA | ε
        A → c
LOOKAHEAD(S → aSA) = {a}
LOOKAHEAD(S → ε) = FOLLOW(S) = {$, c}
LOOKAHEAD(S → aSA) ∩ LOOKAHEAD(S → ε) = ∅
∴ G is strong LL(1).
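The LOOKAHEAD computation and the strong LL test for this grammar can be reproduced mechanically. The following Python sketch takes the FIRST and FOLLOW sets of the example as given, applies the rule "take FIRST of the right-hand side and, if it contains ε, replace ε by FOLLOW of the left-hand side", and then checks that the lookaheads of alternatives are pairwise disjoint. All identifiers are illustrative assumptions.

```python
from itertools import combinations

EPS = "ε"  # assumed marker for the empty string

# Sets copied from the example grammar  S -> aSA | ε,  A -> c.
FIRST = {"S": {"a", EPS}, "A": {"c"}}
FOLLOW = {"S": {"$", "c"}, "A": {"$", "c"}}
PRODUCTIONS = [("S", ["a", "S", "A"]), ("S", []), ("A", ["c"])]


def first_of_string(symbols):
    """FIRST of a symbol string; contains ε if the whole string is nullable."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # a terminal's FIRST is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def lookahead(lhs, rhs):
    """FIRST(rhs), with ε replaced by FOLLOW(lhs) when rhs is nullable."""
    rhs_first = first_of_string(rhs)
    if EPS in rhs_first:
        return (rhs_first - {EPS}) | FOLLOW[lhs]
    return rhs_first


def is_strong_ll1(productions):
    """Lookaheads of productions sharing a left-hand side must be disjoint."""
    for (lhs1, rhs1), (lhs2, rhs2) in combinations(productions, 2):
        if lhs1 == lhs2 and lookahead(lhs1, rhs1) & lookahead(lhs2, rhs2):
            return False
    return True


LOOKAHEADS = [lookahead(lhs, rhs) for lhs, rhs in PRODUCTIONS]
STRONG_LL1 = is_strong_ll1(PRODUCTIONS)
```

Here `LOOKAHEADS` holds the three sets of the example in production order, and `STRONG_LL1` is true.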

Implementation of a recursive-descent parser
If a grammar is strong LL(1), we can construct a parser for sentences of the grammar using the following scheme.
Terminal procedure: for a ∈ VT,
procedure pa; /* getNextSymbol => scanner */
begin
  if nextSymbol = ta then getNextSymbol
  else error
end;
※ getNextSymbol : the routine that plays the role of the scanner; it builds one token from the input stream and assigns it to the variable nextSymbol.

Nonterminal procedure: for A ∈ VN,
procedure pA;
var i: integer;
begin
  case nextSymbol of
    LOOKAHEAD(A → X1X2...Xm): for i := 1 to m do pXi;
    LOOKAHEAD(A → Y1Y2...Yn): for i := 1 to n do pYi;
      :
    LOOKAHEAD(A → Z1Z2...Zr): for i := 1 to r do pZi;
    LOOKAHEAD(A → ε): ;
    otherwise: error
  end /* case */
end;

RDP optimization
Improving the efficiency and structure of a recursive-descent parser
1) Eliminating terminal procedures ::= In practice it is better not to write a procedure for each terminal. Instead, the nonterminal procedures advance the input marker themselves. In this way many redundant tests can be eliminated.
   ex) text p.285 [Example 7.9]
2) BNF → EBNF : reduces the number of productions and nonterminals.
   ① repetitive part : { }
   ② optional part : [ ]
   ③ alternation : ( | )

[Example 7.10] - text p.287
<if_st> ::= 'if' <cond> 'then' <st> ['else' <st>]

procedure pIF;
begin
  if nextSymbol = tif then begin
    getNextSymbol;
    pCOND;
    if nextSymbol = tthen then begin getNextSymbol; pST end
    else error(10)
  end
  else error(11);
  /* optional part ['else' <st>] */
  if nextSymbol = telse then begin getNextSymbol; pST end
end;

[Example 7.11] - text p.288
<id_list> ::= 'id' {',' 'id'}

procedure pID_LIST;
begin
  if nextSymbol = tid then begin
    getNextSymbol;
    while (nextSymbol = tcomma) do begin
      getNextSymbol; /* consume the comma before testing for id */
      if nextSymbol = tid then getNextSymbol
      else error(100)
    end
  end
end;
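The repetitive part { } of EBNF maps directly onto a loop, which is what removes the extra nonterminal and its recursion. The Python sketch below recognizes `<id_list>` over a list of token strings. The token spellings "id" and "," and the function name `parse_id_list` are assumptions for illustration, and unlike the pseudocode it simply returns `False` instead of calling an error routine.

```python
def parse_id_list(tokens):
    """Return True if tokens form  id { , id }  and nothing else."""
    stream = list(tokens) + ["$"]  # append the end marker
    pos = 0
    if stream[pos] != "id":
        return False
    pos += 1
    while stream[pos] == ",":      # repetitive part { ',' 'id' }
        pos += 1                   # consume the comma
        if stream[pos] != "id":
            return False
        pos += 1                   # consume the id
    return stream[pos] == "$"
```

For instance, `parse_id_list(["id", ",", "id"])` returns `True`, and `parse_id_list(["id", ","])` returns `False` because an id must follow every comma.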

7.3 Predictive Parser
Drawback of RDP: whenever the grammar changes, the parser program itself must be modified.
Predictive parser:
- The parser is split into a program and a table (driver routine + table).
- This overcomes the drawback of RDP: when the grammar changes, only the table is rebuilt.
Predictive parsing ::= a deterministic top-down parsing method using a stack. The stack contains a sequence of grammar symbols.

Model of a predictive parser[1/3] ※ The input buffer contains the string to be parsed, followed by $.

Model of a predictive parser [2/3]
Parsing proceeds according to the relationship between the current input symbol and the symbol on top of the stack.
Initial configuration :
  STACK   INPUT
  $S      ω$
Parsing table (LL) : determines the parsing action.
※ M[X,a] = r : when the stack top symbol is X and the current input symbol is a, expand X with production number r.

Model of a predictive parser [3/3]
Parsing actions (X : stack top symbol, a : current input symbol)
1. if X = a = $, then accept.
2. if X = a, then pop X and advance the input.
3. if X ∈ VN, then if M[X,a] = r (X → ABC), then replace X by ABC, else error.

Predictive parsing algorithm
Algorithm Predictive_Parser_Action;
begin
  // set ip to point to the first symbol of ω$;
  repeat
    // let X be the top stack symbol and a the symbol pointed to by ip;
    if X is a terminal or $ then
      if X = a then pop X from the stack and advance ip
      else error(1)
    else /* X is a nonterminal */
      if M[X,a] = X → Y1Y2...Yk then begin
        pop X from the stack;
        push Yk, Yk-1, ..., Y1 onto the stack, with Y1 on top;
        output the production X → Y1Y2...Yk
      end
      else error(2)
  until X = a = $ /* stack is empty */
end.

Example (text p.290) [1/2]
G : 1. S → aSb
    2. S → bA
    3. A → aA
    4. A → b
string : aabbbb
Parsing table:
              terminals
nonterminal   a    b    $
S             1    2
A             3    4
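A table-driven driver for this grammar can be sketched in a few lines of Python. The sketch assumes production 3 is A → aA, which is the form consistent with the table entry M[A,a] = 3; the dictionary encoding of the table and the name `predictive_parse` are illustrative choices. It returns the sequence of production numbers that the algorithm outputs (a left parse), or `None` on a syntax error.

```python
PRODUCTIONS = {
    1: ("S", ["a", "S", "b"]),
    2: ("S", ["b", "A"]),
    3: ("A", ["a", "A"]),  # assumed form of production 3
    4: ("A", ["b"]),
}
# Parsing table M[X, a] = r; missing entries mean error.
TABLE = {("S", "a"): 1, ("S", "b"): 2, ("A", "a"): 3, ("A", "b"): 4}
NONTERMINALS = {"S", "A"}


def predictive_parse(text, start="S"):
    """Return the list of productions applied, or None if text is rejected."""
    tokens = list(text) + ["$"]  # one character per token, then the end marker
    stack = ["$", start]         # top of the stack is the end of the list
    ip = 0
    output = []
    while True:
        top, a = stack[-1], tokens[ip]
        if top == "$" and a == "$":
            return output        # accept
        if top in NONTERMINALS:
            rule = TABLE.get((top, a))
            if rule is None:
                return None      # blank table entry
            stack.pop()
            # Push the right-hand side reversed so its first symbol ends on top.
            stack.extend(reversed(PRODUCTIONS[rule][1]))
            output.append(rule)
        elif top == a:
            stack.pop()          # matching terminal: pop and advance
            ip += 1
        else:
            return None          # terminal mismatch
```

Under these assumptions, `predictive_parse("aabbbb")` returns `[1, 1, 2, 4]`. Only the two dictionaries depend on the grammar, which is the separation of driver and table described for predictive parsers.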

Example (text p.290) [2/2]
STACK     INPUT      ACTION               OUTPUT
$S        aabbbb$    expand 1             1
$bSa      aabbbb$    pop a and advance
$bS       abbbb$     expand 1             1
$bbSa     abbbb$     pop a and advance
$bbS      bbbb$      expand 2             2
$bbAb     bbbb$      pop b and advance
$bbA      bbb$       expand 4             4
$bbb      bbb$       pop b and advance
$bb       bb$        pop b and advance
$b        b$         pop b and advance
$         $          accept
※ Next question: how to construct the predictive parsing table for a grammar.

7.4 Construction of the Predictive Parsing Table
Main idea : If A → α is a production with a in FIRST(α), then the parser expands A by α when the current input symbol is a. And if α ⇒* ε, then the parser also expands A by α when the current input symbol is in FOLLOW(A).
Parsing table (LL):
  M[X,a] = r : expand X with production r
  blank : error

Construction algorithm : for each production A → α,
1. for each terminal a ∈ FIRST(α), M[A,a] := <A → α>
2. if α ⇒* ε, then for each b ∈ FOLLOW(A), M[A,b] := <A → α>.

Example (text p.297) [1/2]
G : 1. E → TE′
    2. E′ → +TE′
    3. E′ → ε
    4. T → FT′
    5. T′ → *FT′
    6. T′ → ε
    7. F → (E)
    8. F → id
FIRST(E) = FIRST(T) = FIRST(F) = { ( , id }
FIRST(E′) = { + , ε }
FIRST(T′) = { * , ε }
FOLLOW(E) = FOLLOW(E′) = { ) , $ }
FOLLOW(T) = FOLLOW(T′) = { + , ) , $ }
FOLLOW(F) = { + , * , ) , $ }

Example (text p.297) [2/2]
Parsing table:
              terminals
nonterminal   id   +    *    (    )    $
E             1              1
E′                 2              3    3
T             4              4
T′                 6    5         6    6
F             8              7
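The construction algorithm of section 7.4 can be checked against this table with a short Python sketch. It takes the numbered productions and the FIRST and FOLLOW sets of the example as given, fills M[A,a] for every terminal in FIRST(α), and adds the FOLLOW(A) entries when α is nullable. The names and the dictionary representation are assumptions for illustration; raising an exception on a doubly defined entry is an added safeguard, not something the slides specify.

```python
EPS = "ε"  # assumed marker for the empty string

PRODUCTIONS = {
    1: ("E", ["T", "E'"]),
    2: ("E'", ["+", "T", "E'"]),
    3: ("E'", []),
    4: ("T", ["F", "T'"]),
    5: ("T'", ["*", "F", "T'"]),
    6: ("T'", []),
    7: ("F", ["(", "E", ")"]),
    8: ("F", ["id"]),
}
# FIRST and FOLLOW copied from the example.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}


def first_of_string(symbols):
    """FIRST of a symbol string; contains ε if the whole string is nullable."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # a terminal's FIRST is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table():
    """Build M as a dict mapping (nonterminal, terminal) to a production number."""
    table = {}
    for number, (lhs, rhs) in PRODUCTIONS.items():
        rhs_first = first_of_string(rhs)
        targets = rhs_first - {EPS}        # step 1
        if EPS in rhs_first:
            targets |= FOLLOW[lhs]         # step 2
        for terminal in targets:
            if (lhs, terminal) in table:   # a conflict means the grammar is not LL(1)
                raise ValueError(f"conflict at M[{lhs},{terminal}]")
            table[(lhs, terminal)] = number
    return table


TABLE = build_table()
```

With these inputs, `TABLE` contains 13 entries, for example `TABLE[("T'", "*")]` is 5 and `TABLE[("E'", "$")]` is 3; every pair that is absent corresponds to a blank (error) cell.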