Software College of Northeast Normal University
Compiler Construction Principles & Implementation Techniques

Object-Oriented Design and Implementation of a Compiler
Dr. Zheng Xiaojuan, Associate Professor
Software College of Northeast Normal University
April 2009

Phase 3: Developing the Parser

Project Requirements
- Read the token sequence produced by the lexical analyzer.
- Parse the token sequence and, for a syntactically correct program, build a parse tree that mirrors the structure of the source program.
- Report the location of any syntax error.

Part I: Compiler Theory

Functions of the Parser

Required Compiler Knowledge: Concept Map
[Diagram: developing a parser is based on a syntax definition using a context-free grammar; it is implemented with a top-down (√) or a bottom-up (√) method.]

Required Compiler Knowledge: Concept Map (top-down parsing)
[Diagram: top-down parsing is based on a syntax definition using a CFG. Check the precondition using Predict(A → α), which is built from First(α) and Follow(A). If the precondition holds (Yes), implement the parser with recursive descent or LL(1).]

1. Context-Free Grammar (CFG)
A CFG is defined as a four-tuple (V_T, V_N, S, P):
- V_T is a finite set of terminal symbols
- V_N is a finite set of non-terminal symbols
- S is the start symbol, S ∈ V_N
- P is a set of productions of the form A → X_1 X_2 … X_n, where A ∈ V_N and X_i ∈ (V_T ∪ V_N); the right-hand side may be empty.

2. Top-down parsing
Precondition for top-down parsing:
- G = (V_T, V_N, S, P)
- For any A ∈ V_N,
- for any two distinct productions A → α_1 and A → α_2 of A,
- Predict(A → α_1) ∩ Predict(A → α_2) = ∅
  (the Predict sets of any two productions of the same non-terminal are disjoint)
This condition guarantees that, given the current input symbol and the current non-terminal, a unique production can be selected for the derivation.

3. Grammar Transformation
Eliminating common prefixes (left factoring)
- Common prefix:
    A → α β_1 | … | α β_n | γ_1 | … | γ_m
- Factor out the common prefix:
    A → α A' | γ_1 | … | γ_m
    A' → β_1 | … | β_n

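The factoring rule can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the slides: productions are lists of symbol strings, the common prefix is passed in by the caller, and the names `left_factor` and `EPS` are hypothetical.

```python
EPS = "ε"  # assumed marker for an empty right-hand side

def left_factor(a, alternatives, prefix):
    """Factor `prefix` out of the alternatives of non-terminal `a`.

    A -> prefix β_1 | ... | prefix β_n | γ_1 | ... | γ_m   becomes
    A -> prefix A' | γ_1 | ... | γ_m   and   A' -> β_1 | ... | β_n
    """
    n = len(prefix)
    tails = [rhs[n:] for rhs in alternatives if rhs[:n] == prefix]   # the β_i
    others = [rhs for rhs in alternatives if rhs[:n] != prefix]      # the γ_j
    if len(tails) < 2:
        return {a: alternatives}  # no shared prefix, nothing to factor
    a_prime = a + "'"
    return {
        a: [prefix + [a_prime]] + others,
        a_prime: [tail or [EPS] for tail in tails],  # an empty tail becomes ε
    }

# A -> a b | a c | d
factored = left_factor("A", [["a", "b"], ["a", "c"], ["d"]], ["a"])
print(factored)
```

For the sample grammar this yields A → a A' | d and A' → b | c.
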
Eliminating left recursion
- Direct left recursion:
    A → A (α_1 | … | α_n) | β_1 | … | β_m
- Elimination:
    A → (β_1 | … | β_m) A'
    A' → (α_1 | … | α_n) A' | ε

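As a rough sketch of the elimination rule for direct left recursion, the function below rewrites the alternatives of one non-terminal. The list-of-symbols representation and the names `eliminate_direct_left_recursion` and `EPS` are assumptions made for this illustration.

```python
EPS = "ε"  # assumed marker for the empty right-hand side

def eliminate_direct_left_recursion(a, alternatives):
    """Rewrite A -> A α_i | β_j as A -> β_j A' and A' -> α_i A' | ε."""
    alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == a]
    betas = [rhs for rhs in alternatives if not rhs or rhs[0] != a]
    if not alphas:
        return {a: alternatives}  # A is not directly left-recursive
    a_prime = a + "'"
    return {
        a: [beta + [a_prime] for beta in betas],
        a_prime: [alpha + [a_prime] for alpha in alphas] + [[EPS]],
    }

# A -> A b a | b
rewritten = eliminate_direct_left_recursion("A", [["A", "b", "a"], ["b"]])
print(rewritten)
```

The sample input A → A b a | b becomes A → b A' and A' → b a A' | ε.
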
Eliminating left recursion (continued)
- Indirect left recursion
- Elimination: pre-conditions and algorithm
Example:
    S → A b
    A → S a | b
Order the non-terminals (1: S, 2: A) and substitute the production of S into A → S a:
    A → A b a | b
Then eliminate the resulting direct left recursion:
    A → b A'
    A' → b a A' | ε

4. Three Important Sets
- First set, for a string α of terminal and non-terminal symbols: First(α)
- Follow set, for a non-terminal symbol A: Follow(A)
- Predict set, for a production A → α: Predict(A → α)

First Set
Definition:
- First(α) = {a | α ⇒* a β, a ∈ V_T}
- if α ⇒* ε, then First(α) = First(α) ∪ {ε}
How to calculate First(α)?
- α = ε: First(α) = {ε}
- α = a, a ∈ V_T: First(α) = {a}
- α = A, A ∈ V_N: First(α) = First(A)
- α = X_1 X_2 … X_{i-1} X_i … X_n: if X_1, …, X_{i-1} can all derive ε and X_i cannot, then First(α) = ((First(X_1) ∪ … ∪ First(X_{i-1})) − {ε}) ∪ First(X_i); if every X_k can derive ε, then First(α) = First(X_1) ∪ … ∪ First(X_n)

Computing the First sets
Let S = {A | A ⇒* ε, A ∈ V_N} be the set of non-terminals that can derive ε (on this slide S is this set, not the start symbol).
For each terminal symbol a, First(a) = {a}.
For each symbol X, calculate First(X): with V_N = {A_1, …, A_n}, calculate First(A_i) as follows.
(1) Initialize First(A_i) = {} for every i.
(2) For i = 1 to n, for each production of A_i:
    - if A_i → ε: First(A_i) = First(A_i) ∪ {ε}
    - if A_i → Y_1 … Y_m and {Y_1, …, Y_m} ⊆ S:
      First(A_i) = First(A_i) ∪ First(Y_1) ∪ … ∪ First(Y_m)
    - if A_i → Y_1 … Y_m, {Y_1, …, Y_{j-1}} ⊆ S and Y_j ∉ S:
      First(A_i) = First(A_i) ∪ ((First(Y_1) ∪ … ∪ First(Y_{j-1})) − {ε}) ∪ First(Y_j)
(3) Repeat (2) until no First(A_i) changes (convergence).

Example
P:
    (1) E → T E'
    (2) E' → + T E'
    (3) E' → ε
    (4) T → F T'
    (5) T' → * F T'
    (6) T' → ε
    (7) F → ( E )
    (8) F → i
    (9) F → n
S = {E', T'}
First sets:
    E     {i, n, (}
    E'    {+, ε}
    T     {i, n, (}
    T'    {*, ε}
    F     {i, n, (}

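The fixed-point iteration can be sketched in Python for the expression grammar of this example. One deliberate simplification, stated as an assumption: instead of precomputing the set S of nullable non-terminals, the sketch treats a symbol as nullable exactly when ε is already in its First set, which converges to the same result. The names `GRAMMAR`, `first_of_string` and `compute_first` are hypothetical.

```python
EPS = "ε"

# (left-hand side, right-hand side); [] stands for the ε-production
GRAMMAR = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", []),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", []),
    ("F", ["(", "E", ")"]),
    ("F", ["i"]),
    ("F", ["n"]),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def first_of_string(symbols, first):
    """First set of a symbol string, given the current First sets."""
    result = set()
    for x in symbols:
        fx = first[x] if x in NONTERMINALS else {x}  # First(a) = {a}
        result |= fx - {EPS}
        if EPS not in fx:       # x cannot vanish, so later symbols are hidden
            return result
    result.add(EPS)             # every symbol can vanish, or the string is empty
    return result

def compute_first():
    first = {a: set() for a in NONTERMINALS}          # step (1)
    changed = True
    while changed:                                    # step (3)
        changed = False
        for lhs, rhs in GRAMMAR:                      # step (2)
            new = first_of_string(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

FIRST = compute_first()
for name in ["E", "E'", "T", "T'", "F"]:
    print(name, sorted(FIRST[name]))
```
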
Follow Set
How to calculate Follow(A), A ∈ V_N:
(1) Initialize: for every A ∈ V_N, Follow(A) = {}.
(2) Follow(S) = {#}, where S is the start symbol and # marks the end of the input.
(3) For each production A → α:
    - If there is no non-terminal symbol in α, skip it.
    - If α = β B γ with B ∈ V_N: Follow(B) = Follow(B) ∪ (First(γ) − {ε});
      if additionally ε ∈ First(γ): Follow(B) = Follow(B) ∪ Follow(A).
    - If α = β B: Follow(B) = Follow(B) ∪ Follow(A).
(4) Repeat (3) until no Follow set changes any more.

Example
P:
    (1) E → T E'
    (2) E' → + T E'
    (3) E' → ε
    (4) T → F T'
    (5) T' → * F T'
    (6) T' → ε
    (7) F → ( E )
    (8) F → i
    (9) F → n
First and Follow sets:
          First         Follow
    E     {i, n, (}     {#, )}
    E'    {+, ε}        {#, )}
    T     {i, n, (}     {+, ), #}
    T'    {*, ε}        {+, ), #}
    F     {i, n, (}     {*, +, ), #}

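The Follow computation for this example can be sketched as follows. To keep the block self-contained, the First sets are copied from the table on the slide rather than recomputed; the names `compute_follow` and `first_of_string` are assumptions of this sketch.

```python
EPS, END = "ε", "#"

GRAMMAR = [
    ("E", ["T", "E'"]), ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]), ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["i"]), ("F", ["n"]),
]
# First sets of the non-terminals, copied from the example
FIRST = {
    "E": {"i", "n", "("}, "E'": {"+", EPS},
    "T": {"i", "n", "("}, "T'": {"*", EPS},
    "F": {"i", "n", "("},
}

def first_of_string(symbols):
    result = set()
    for x in symbols:
        fx = FIRST.get(x, {x})       # a terminal's First set is itself
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)
    return result

def compute_follow(start="E"):
    follow = {a: set() for a in FIRST}               # step (1)
    follow[start].add(END)                           # step (2)
    changed = True
    while changed:                                   # step (4)
        changed = False
        for lhs, rhs in GRAMMAR:                     # step (3)
            for k, b in enumerate(rhs):
                if b not in FIRST:
                    continue                         # terminals have no Follow set
                f_gamma = first_of_string(rhs[k + 1:])
                new = f_gamma - {EPS}
                if EPS in f_gamma:                   # γ is empty or can vanish
                    new |= follow[lhs]
                if not new <= follow[b]:
                    follow[b] |= new
                    changed = True
    return follow

FOLLOW = compute_follow()
for name in ["E", "E'", "T", "T'", "F"]:
    print(name, sorted(FOLLOW[name]))
```
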
Predict Set
Definition:
- Predict(A → α) = First(α), if ε ∉ First(α)
- Predict(A → α) = (First(α) − {ε}) ∪ Follow(A), if ε ∈ First(α)

Example
First and Follow sets:
          First         Follow
    E     {i, n, (}     {#, )}
    E'    {+, ε}        {#, )}
    T     {i, n, (}     {+, ), #}
    T'    {*, ε}        {+, ), #}
    F     {i, n, (}     {*, +, ), #}
Productions and their Predict sets:
    (1) E → T E'        First(T E') = {i, n, (}
    (2) E' → + T E'     First(+ T E') = {+}
    (3) E' → ε          Follow(E') = {#, )}
    (4) T → F T'        First(F T') = {i, n, (}
    (5) T' → * F T'     First(* F T') = {*}
    (6) T' → ε          Follow(T') = {), +, #}
    (7) F → ( E )       First(( E )) = {(}
    (8) F → i           First(i) = {i}
    (9) F → n           First(n) = {n}

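A short Python sketch can recompute these Predict sets and check the top-down precondition (pairwise disjoint Predict sets per non-terminal). The First and Follow sets are copied from the example; `predict` and `is_ll1` are hypothetical helper names.

```python
EPS = "ε"

GRAMMAR = [
    ("E", ["T", "E'"]), ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]), ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["i"]), ("F", ["n"]),
]
FIRST = {"E": {"i", "n", "("}, "E'": {"+", EPS}, "T": {"i", "n", "("},
         "T'": {"*", EPS}, "F": {"i", "n", "("}}
FOLLOW = {"E": {"#", ")"}, "E'": {"#", ")"}, "T": {"+", ")", "#"},
          "T'": {"+", ")", "#"}, "F": {"*", "+", ")", "#"}}

def first_of_string(symbols):
    result = set()
    for x in symbols:
        fx = FIRST.get(x, {x})       # a terminal's First set is itself
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)
    return result

def predict(lhs, rhs):
    f = first_of_string(rhs)
    if EPS in f:                     # α can vanish: look at what follows A
        return (f - {EPS}) | FOLLOW[lhs]
    return f

PREDICT = [predict(lhs, rhs) for lhs, rhs in GRAMMAR]

def is_ll1():
    """True if Predict sets of the same non-terminal never overlap."""
    seen = {}
    for (lhs, _), p in zip(GRAMMAR, PREDICT):
        if any(p & q for q in seen.setdefault(lhs, [])):
            return False
        seen[lhs].append(p)
    return True

for number, ((lhs, rhs), p) in enumerate(zip(GRAMMAR, PREDICT), start=1):
    print(number, lhs, "->", " ".join(rhs) or EPS, sorted(p))
print("LL(1):", is_ll1())
```
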
5. Recursive-Descent Parsing
The goal of parsing
- Check whether the input string belongs to the language defined by the CFG.
Two actions
- match(a): check the current symbol against a; if it matches, read the next symbol.
- Derivation: select the production to apply.

General Process
G = (V_T, V_N, S, P)
Predefined function: void match(a: V_T)
Global variable: token: V_T
Input string: str
For each A ∈ V_N with productions A → α_1 | … | α_n, define a procedure:
    A() {
        case token of
            Predict(A → α_1): SubR(α_1); break;
            ……
            Predict(A → α_n): SubR(α_n); break;
            other: error;
    }

General Process (continued)
    void match(a: V_T) {
        if token == a
            token = readNext(str);
        else
            error();
    }
SubR(α) for α = X_1 X_2 … X_n: for each X_i in order,
- if X_i ∈ V_T, call match(X_i)
- if X_i ∈ V_N, call X_i()
SubR(α) for α = ε: { } (no action)

Example
P (with the Predict set of each production):
    (1) Z → a B d    {a}
    (2) B → d        {d}
    (3) B → c        {c}
    (4) B → b B      {b}
Input string: a b c d

    Z() {
        if token == a { match(a); B(); match(d); }
        else error();
    }
    B() {
        case token of
            d: match(d); break;
            c: match(c); break;
            b: { match(b); B(); break; }
            other: error();
    }
    main() { read(token); Z() }

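The procedures Z and B translate almost directly into runnable Python. The sketch below assumes that each input character is one token and that `#` marks the end of the input; the `Parser` class, `ParseError` and `accepts` are names introduced only for this illustration.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, text):
        self.symbols = list(text) + ["#"]   # '#' marks the end of the input
        self.pos = 0

    @property
    def token(self):
        return self.symbols[self.pos]

    def error(self, expected):
        raise ParseError(
            f"position {self.pos}: expected {expected}, found {self.token!r}")

    def match(self, a):
        if self.token != a:
            self.error(repr(a))
        self.pos += 1                       # read the next symbol

    def Z(self):                            # Z -> a B d
        if self.token == "a":
            self.match("a")
            self.B()
            self.match("d")
        else:
            self.error("'a'")

    def B(self):                            # B -> d | c | b B
        if self.token == "d":
            self.match("d")
        elif self.token == "c":
            self.match("c")
        elif self.token == "b":
            self.match("b")
            self.B()
        else:
            self.error("'b', 'c' or 'd'")

def accepts(text):
    parser = Parser(text)
    try:
        parser.Z()
        parser.match("#")                   # the whole input must be consumed
        return True
    except ParseError:
        return False

print(accepts("abcd"), accepts("ad"))
```

The final `match("#")` is an addition to the slide's `main`: it rejects inputs that have leftover symbols after a valid Z.
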
Building the Parse Tree
Data structure
- ParseTree: each node stores a symbol, the number of children, and pointers to the children
Operations
- ParseTree BuildRoot(symbol: V_T ∪ V_N);
- ParseTree BuildOneNode(symbol: V_T ∪ V_N);
- AddOneSon(father: *ParseTree, son: *ParseTree);
- SetNum(Node: *ParseTree, n: int);

Example
The procedure Z with the tree-building statements added:
    *ParseTree Z() {
        T = BuildRoot(Z);
        if token == a {
            match(a); A = BuildOneNode(a); AddOneSon(T, A);
            BB = B(); AddOneSon(T, BB);
            match(d); D = BuildOneNode(d); AddOneSon(T, D);
            SetNum(T, 3);
            return T;
        }
        else { error(); return nil; }
    }

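A possible Python counterpart for the grammar Z → a B d, B → d | c | b B is sketched below. It is not the ParseTree interface of the slides: the `Node` dataclass stands in for BuildOneNode and AddOneSon, the length of `children` replaces SetNum, and all names are assumptions.

```python
from dataclasses import dataclass, field

class ParseError(Exception):
    pass

@dataclass
class Node:
    symbol: str
    children: list = field(default_factory=list)

def parse(text):
    symbols = list(text) + ["#"]            # '#' marks the end of the input
    pos = 0

    def match(expected):
        """Consume one terminal and return its leaf node."""
        nonlocal pos
        if symbols[pos] != expected:
            raise ParseError(
                f"position {pos}: expected {expected!r}, found {symbols[pos]!r}")
        pos += 1
        return Node(expected)

    def Z():                                # Z -> a B d
        return Node("Z", [match("a"), B(), match("d")])

    def B():                                # B -> d | c | b B
        tok = symbols[pos]
        if tok in ("c", "d"):
            return Node("B", [match(tok)])
        if tok == "b":
            return Node("B", [match("b"), B()])
        raise ParseError(f"position {pos}: unexpected {tok!r}")

    root = Z()
    match("#")                              # reject trailing input
    return root

def show(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    lines = ["  " * depth + node.symbol]
    for child in node.children:
        lines.extend(show(child, depth + 1))
    return lines

tree = parse("abcd")
print("\n".join(show(tree)))
```
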
6. LL(1) Parsing Method
LL(1) parsing consists of
- an LL(1) parsing table that records the Predict set of each production;
- a general driver program (engine).

LL(1) Parsing Table
How to build the LL(1) parsing table for an LL(1) grammar?
- Let G = (V_T, V_N, S, P) be an LL(1) grammar
- V_T = {a_1, …, a_n}
- V_N = {A_1, …, A_m}
- LL(A_i, a_j) = [A_i → α], if a_j ∈ Predict(A_i → α)
- LL(A_i, a_j) = error, if a_j does not belong to the Predict set of any production of A_i
Table layout: one row per non-terminal A_1, …, A_m and one column per terminal a_1, …, a_n, plus a column for the end marker #.

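As an illustration, the table for the expression grammar used in the earlier examples can be built from its Predict sets. The dictionary representation, in which a missing key means "error", and the name `build_table` are assumptions of this sketch.

```python
GRAMMAR = [
    ("E", ["T", "E'"]), ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]), ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["i"]), ("F", ["n"]),
]
# Predict set of each production, in the same order as GRAMMAR
PREDICT = [
    {"i", "n", "("}, {"+"}, {"#", ")"},
    {"i", "n", "("}, {"*"}, {"+", ")", "#"},
    {"("}, {"i"}, {"n"},
]

def build_table(grammar, predict):
    """Map (non-terminal, lookahead) to the index of the production to use."""
    table = {}
    for index, ((lhs, _), lookaheads) in enumerate(zip(grammar, predict)):
        for a in lookaheads:
            if (lhs, a) in table:           # two productions compete: not LL(1)
                raise ValueError(f"conflict at ({lhs}, {a})")
            table[(lhs, a)] = index
    return table

TABLE = build_table(GRAMMAR, PREDICT)
print(TABLE[("E", "i")], TABLE[("E'", ")")], ("E", "+") in TABLE)
```

Storing only the non-error entries keeps the table sparse; a lookup that misses corresponds to an error cell.
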
LL(1) Parsing Mechanism
[Diagram: a stack with top symbol X, an input string ending in # with current symbol a, the LL(1) parsing table, and the driver program.]
The driver program handles three cases:
- the stack is empty
- X ∈ V_T
- X ∈ V_N

LL(1) Parsing Engine
[1] Initialize: Stack := [ ]; Push(S);
[2] Read the next input symbol: Read(a);
[3] If the current configuration is (empty stack, #), finish successfully; otherwise continue with [4];
[4] Let the current configuration be (… X, a …), where X is the top of the stack:
    - if X ∈ V_T and X = a: { Pop(1); Read(a); goto [3] }
    - if X ∈ V_T and X ≠ a: Error;
    - if X ∈ V_N:
        if LL(X, a) = X → Y_1 Y_2 … Y_n
        then { Pop(1); Push(Y_n, …, Y_1); goto [3] }
        else Error

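The driver can be sketched in Python for the expression grammar of the earlier examples. The table is derived inline from the Predict sets so that the block runs on its own; `ll1_accepts` and the one-character-per-token input are assumptions.

```python
GRAMMAR = [
    ("E", ["T", "E'"]), ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]), ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["i"]), ("F", ["n"]),
]
PREDICT = [
    {"i", "n", "("}, {"+"}, {"#", ")"},
    {"i", "n", "("}, {"*"}, {"+", ")", "#"},
    {"("}, {"i"}, {"n"},
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
# LL(X, a): index of the production to apply; a missing key means Error
TABLE = {(lhs, a): k for k, (lhs, _) in enumerate(GRAMMAR) for a in PREDICT[k]}

def ll1_accepts(tokens, start="E"):
    tokens = list(tokens) + ["#"]           # '#' marks the end of the input
    pos = 0
    stack = [start]                         # step [1]
    while stack:                            # step [3]: loop until stack is empty
        x, a = stack.pop(), tokens[pos]     # step [4]: top of stack, lookahead
        if x in NONTERMINALS:
            k = TABLE.get((x, a))
            if k is None:
                return False                # error entry
            stack.extend(reversed(GRAMMAR[k][1]))   # Push(Y_n, ..., Y_1)
        elif x == a:
            pos += 1                        # Read(a)
        else:
            return False                    # terminal mismatch
    return tokens[pos] == "#"               # empty stack and '#': success

print(ll1_accepts("i+n*(i)"), ll1_accepts("i+"))
```
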
Building the Parse Tree During LL(1) Parsing
[1] Initialize: Stack := [ ]; root = BuildOneNode(S); Push(S, root);
[2] Read the next input symbol: Read(a);
[3] If the current configuration is (empty stack, #), finish successfully; otherwise continue with [4];
[4] Let the current configuration be (… X, a …), where X is the top of the stack:
    - if X ∈ V_T and X = a: { Pop(1); Read(a); goto [3] }
    - if X ∈ V_T and X ≠ a: Error;
    - if X ∈ V_N:
        if LL(X, a) = X → Y_1 Y_2 … Y_n
        then {
            (X, ptr) = Pop(1);
            for i = n to 1 { p[i] = BuildOneNode(Y_i); Push(Y_i, p[i]); }
            AddSons(ptr, p, n);
            goto [3]
        }
        else Error

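A minimal Python sketch of this variant keeps tree nodes on the stack instead of bare symbols, so that expanding a non-terminal attaches its children directly. It reuses the expression grammar from the earlier examples; `Node`, `ll1_parse` and `leaves` are hypothetical names.

```python
from dataclasses import dataclass, field

GRAMMAR = [
    ("E", ["T", "E'"]), ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]), ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["i"]), ("F", ["n"]),
]
PREDICT = [
    {"i", "n", "("}, {"+"}, {"#", ")"},
    {"i", "n", "("}, {"*"}, {"+", ")", "#"},
    {"("}, {"i"}, {"n"},
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
TABLE = {(lhs, a): k for k, (lhs, _) in enumerate(GRAMMAR) for a in PREDICT[k]}

class ParseError(Exception):
    pass

@dataclass
class Node:
    symbol: str
    children: list = field(default_factory=list)

def ll1_parse(tokens, start="E"):
    tokens = list(tokens) + ["#"]
    pos = 0
    root = Node(start)
    stack = [root]                                   # each entry is a tree node
    while stack:
        node, a = stack.pop(), tokens[pos]
        if node.symbol in NONTERMINALS:
            k = TABLE.get((node.symbol, a))
            if k is None:
                raise ParseError(f"position {pos}: unexpected {a!r}")
            node.children = [Node(y) for y in GRAMMAR[k][1]]   # AddSons
            stack.extend(reversed(node.children))    # leftmost child on top
        elif node.symbol == a:
            pos += 1
        else:
            raise ParseError(
                f"position {pos}: expected {node.symbol!r}, found {a!r}")
    if tokens[pos] != "#":
        raise ParseError(f"position {pos}: trailing input {tokens[pos]!r}")
    return root

def leaves(node):
    """Terminal symbols of the tree, left to right."""
    if node.symbol not in NONTERMINALS:
        return [node.symbol]
    return [s for child in node.children for s in leaves(child)]

tree = ll1_parse("i+n")
print(leaves(tree))
```

Reading the leaves from left to right reproduces the input, which is a quick sanity check on the tree.
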
Part II: Implementing the Parser

Context-Free Grammar for Part of the SNL Language
Program:
    Program ::= ProgramHead DeclarePart ProgramBody .
Program head:
    ProgramHead ::= PROGRAM ProgramName
    ProgramName ::= ID
Declarations:
    DeclarePart ::= TypeDecpart VarDecpart ProcDecpart
Type declarations:
    TypeDecpart ::= ε | TypeDec
    TypeDec ::= TYPE TypeDecList
    TypeDecList ::= TypeId = TypeDef ; TypeDecMore
    TypeDecMore ::= ε | TypeDecList
    TypeId ::= ID

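Data Structure of Syntax Tree Nodes
Node fields: child[0], child[1], child[2], Sibling, Lineno, nodekind, kind (dec | stmt | exp), idnum, name, table, attr, type, ……
Principle: the output of the parser is the input of semantic analysis, so the syntax tree must contain not only the structure of the source program but also the information that semantic analysis needs. During parsing, the parser follows the grammar productions of the source language, creates a syntax tree node for the corresponding non-terminal, and fills in its fields. The result is a refined syntax tree whose shape resembles the structure of the program. Beyond that, the tree only has to express the syntactic structure of the source program clearly. Each group may define its own node data structure.
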
Data Structure of Syntax Tree Nodes (figure)

Example: an SNL Program and Its Syntax Tree
Source program:
    program p
    type t1 = integer;
    var integer v1, v2;
    procedure q(integer i);
    var integer a;
    begin
        a := i;
        write(a)
    end
    begin
        read(v1);
        if v1 < 10 then v1 := v1 + 10 else v1 := v1 - 10 fi;
        q(v1)
    end.
Syntax tree (indentation shows nesting):
    ProK
        PheadK p
        TypeK
            DecK IntegerK t1
        VarK
            DecK IntegerK v1 v2
        ProcDecK q
            DecK value param: IntegerK i
            VarK
                DecK IntegerK a
            StmLk
                StmtK Assign
                    ExpK a IdV
                    ExpK i IdV
                StmtK Write
                    ExpK a IdV
        StmLk
            StmtK Read v1
            StmtK If
                ExpK Op <
                    ExpK v1 IdV
                    ExpK Const 10
                StmtK Assign
                    ExpK v1 IdV
                    ExpK Op +
                        ExpK v1 IdV
                        ExpK Const 10
                StmtK Assign
                    ExpK v1 IdV
                    ExpK Op -
                        ExpK v1 IdV
                        ExpK Const 10
            StmtK Call
                ExpK q IdV
                ExpK v1 IdV

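Third Group Discussion Topics
After clarifying the project requirements and studying the six knowledge points on compiler theory in these slides, complete the following:
- Study the syntax rules of the source language carefully and determine whether any productions conflict. If they do, transform the grammar as described in knowledge point 3.
- Following knowledge point 4, compute the three sets for every production of the conflict-free grammar.
- Define the data structure of the syntax tree based on the characteristics of the source language's syntax rules.
- Implement the parsing following knowledge point 5 or 6 (each group chooses one of the two methods). It is sufficient to report a syntax error once.
- Discuss how to build and output the syntax tree (if the tree data structure is unclear, review it as a group or individually).
- Use UML to build the class, object, use case, interaction, activity and state diagrams of the system, and write the accompanying documentation.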