Lab 3: Using ML-Yacc
Zhong Zhuang, dyzz@mail.ustc.edu.cn

How to write a parser?
- Write a parser by hand
- Use a parser generator
  - May not be as efficient as a hand-written parser
  - General and robust
- How does it work?
  - parser specification -> parser generator -> parser
  - stream of tokens -> parser -> abstract syntax

ML-Yacc specification
- Three parts again, separated by %%

  User declarations: declare values available in the rule actions
  %%
  ML-Yacc definitions: declare terminals and nonterminals; special declarations to resolve conflicts
  %%
  Rules: parser specified by CFG rules and associated semantic actions that generate abstract syntax

- Each nonterminal may have a semantic value associated with it
- When the parser reduces with (X ::= s)
  - a semantic action is executed
  - the action uses the semantic values of the symbols in s
- When parsing completes successfully
  - the parser returns the semantic value associated with the start symbol
  - usually a syntax tree
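A minimal sketch of the three-part layout, assuming a toy grammar that sums a list of numbers. The names Sum, NUM, PLUS, EOF and the helper plus are illustrative and not part of the lab files; the semantic values are plain ints rather than a syntax tree, only to keep the example short.

```sml
(* Part 1: user declarations. Plain SML, visible inside the rule actions. *)
(* plus is a hypothetical helper used by the actions below. *)
fun plus (a : int, b : int) = a + b

%%
(* Part 2: ML-Yacc definitions. *)
(* Sum is an assumed name; it prefixes the generated signatures. *)
%name Sum
%pos int
(* Terminals: NUM carries an int as its semantic value. *)
%term NUM of int | PLUS | EOF
(* One nonterminal whose semantic value is an int. *)
%nonterm exp of int
%start exp
%eop EOF
%noshift EOF

%%
(* Part 3: rules. The code in parentheses is the semantic action, *)
(* run when the parser reduces by that rule. *)
exp : NUM            (NUM)
    | exp PLUS NUM   (plus (exp, NUM))
```

When parsing succeeds, the value returned is the one computed for the start symbol exp, here the sum of all the numbers read.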

Conflicts in ML-Yacc
- We often write ambiguous grammars
- Example

  exp ::= NUM
        | exp PLUS exp
        | exp MUL exp
        | LPAR exp RPAR

  - Tokens from lexer: NUM PLUS NUM MUL NUM
  - State of parser: E+E (MUL NUM still to be read)
  - If we shift, the result is E+(E*E)

      Shift    E+E*
      Shift    E+E*E
      Reduce   E+E
      Reduce   E

  - If we reduce, the result is (E+E)*E

      Reduce   E
      Shift    E*
      Shift    E*E
      Reduce   E

- This is a shift-reduce conflict
  - We want E+(E*E), because “*” has higher precedence than “+”
- Another shift-reduce conflict
  - Tokens from lexer: NUM PLUS NUM PLUS NUM
  - State of parser: E+E (PLUS NUM still to be read)
  - If we shift, the result is E+(E+E)

      Shift    E+E+
      Shift    E+E+E
      Reduce   E+E
      Reduce   E

  - If we reduce, the result is (E+E)+E

      Reduce   E
      Shift    E+
      Shift    E+E
      Reduce   E

Deal with shift-reduce conflicts
- In this case we need to reduce, because “+” is left associative
- Ways to deal with it
  - Let ML-Yacc complain
    - the default choice is to shift when it encounters a shift-reduce conflict
    - BAD: programmer intentions unclear; harder to debug other parts of your grammar; generally inelegant
  - Rewrite the grammar to eliminate ambiguity
    - can be complicated and less clear
  - Use Yacc precedence directives
    - %left, %right, %nonassoc
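The "rewrite the grammar" option can be sketched as follows for the expression grammar from the earlier slides. The extra nonterminals term and factor, the name Exp, and the int-valued actions are assumptions made for illustration. Each precedence level gets its own nonterminal, and left recursion encodes left associativity, so no precedence directives are needed.

```sml
%%
(* Exp is an assumed name for the generated structures. *)
%name Exp
%pos int
%term NUM of int | PLUS | MUL | LPAR | RPAR | EOF
(* term and factor are extra nonterminals introduced by the rewrite. *)
%nonterm exp of int | term of int | factor of int
%start exp
%eop EOF
%noshift EOF

%%
(* Lowest precedence: sums. Left recursion makes PLUS left associative. *)
exp    : exp PLUS term      (exp + term)
       | term               (term)

(* Higher precedence: products, nested below sums. *)
term   : term MUL factor    (term * factor)
       | factor             (factor)

(* Atoms: numbers and parenthesized expressions. *)
factor : NUM                (NUM)
       | LPAR exp RPAR      (exp)
```

The grammar is unambiguous, but it is noticeably longer than the original four-alternative rule, which is the "complicated and less clear" cost mentioned above.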

Precedence and Associativity
- The precedence of a terminal is based on the order in which associativity is specified (later declarations have higher precedence)
- The precedence of a rule is the precedence of its rightmost terminal
  - e.g. precedence of (E ::= E + E) == prec(+)
- A shift-reduce conflict is resolved as follows
  - prec(terminal) > prec(rule) ==> shift
  - prec(terminal) < prec(rule) ==> reduce
  - prec(terminal) = prec(rule) ==>
    - assoc(terminal) = left ==> reduce
    - assoc(terminal) = right ==> shift
    - assoc(terminal) = nonassoc ==> report as error
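A sketch of the directive-based fix for the ambiguous expression grammar from the earlier slides. The name Exp and the int-valued actions are assumptions for illustration; only the two %left lines are the point.

```sml
%%
(* Exp is an assumed name for the generated structures. *)
%name Exp
%pos int
%term NUM of int | PLUS | MUL | LPAR | RPAR | EOF
%nonterm exp of int
%start exp
%eop EOF
%noshift EOF

(* Lowest precedence first: MUL is declared after PLUS, so it binds tighter. *)
(* Both operators are left associative. *)
%left PLUS
%left MUL

%%
(* exp1 and exp2 name the first and second exp on the right-hand side. *)
exp : NUM             (NUM)
    | exp PLUS exp    (exp1 + exp2)
    | exp MUL exp     (exp1 * exp2)
    | LPAR exp RPAR   (exp)
```

With these declarations, the state E+E with MUL as the next token gives prec(MUL) > prec(exp PLUS exp), so the parser shifts and builds E+(E*E). With PLUS as the next token the precedences are equal and PLUS is left associative, so the parser reduces and builds (E+E)+E.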

Reduce-reduce Conflict
- This kind of conflict is more difficult to deal with
- Example

  sequence  ::= (empty)
              | maybeword
              | sequence word
  maybeword ::= (empty)
              | word

  - When we get a “word” from the lexer:
    - word -> maybeword -> sequence (rule 1)
    - empty -> sequence, then sequence word -> sequence (rule 2)
  - We have more than one way to get “sequence” from the input “word”

Reduce-reduce Conflict
- A reduce-reduce conflict means there are two or more rules that apply to the same sequence of input. This usually indicates a serious error in the grammar.
- ML-Yacc reduces by the first rule
- Generally, reduce-reduce conflicts are not allowed in your ML-Yacc file
- We need to fix our grammar

  sequence ::= (empty)
             | sequence word
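The fixed grammar could be written in ML-Yacc roughly as below. This is a sketch: the name Seq, the terminal WORD carrying a string, and the choice to collect the words into a string list are all assumptions for illustration.

```sml
%%
(* Seq is an assumed name for the generated structures. *)
%name Seq
%pos int
(* WORD is a hypothetical terminal carrying the word's text. *)
%term WORD of string | EOF
%nonterm sequence of string list
%start sequence
%eop EOF
%noshift EOF

%%
(* Only one way to derive any list of words, so no reduce-reduce conflict. *)
sequence : (* empty *)      ([])
         | sequence WORD    (sequence @ [WORD])
```

Dropping maybeword removes the second route from a single word to sequence, which is what caused the conflict.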

Summary of conflicts
- Shift-reduce conflict
  - resolved by precedence and associativity
  - shift by default
- Reduce-reduce conflict
  - reduce by first rule
  - not allowed!

Lab3
- Your job is to finish a parser for the C language
  - Input: a “.c” file
  - Output: “Success!” if the “.c” file is correct
- File description
  - c.lex
  - c.grm
  - main.sml
  - call-main.sml
  - sources.cm
  - lab3.mlb
  - test.c

Using ML-Yacc
- Read the ML-Yacc Manual
- Run
  - Once you have finished “c.grm” and “c.lex”, run on the command line (using MLton's tools):

      mlyacc c.grm
      mllex c.lex

  - This produces “c.grm.sig”, “c.grm.sml”, “c.grm.desc”, “c.lex.sml”
- Then compile Lab3
  - Start SML/NJ and run CM.make “sources.cm”;
  - or, on the command line, mlton lab3.mlb
- To run lab3
  - In SML/NJ, Main.parse “test.c”;
  - or, on the command line, lab3 test.c

“Debug” ML-Yacc File
- When you run mlyacc, you'll see error messages if your ML-Yacc file has conflicts. For example:

    mlyacc c.grm
    2 shift/reduce conflicts

- Open the file “c.grm.desc” (this file is generated by mlyacc)
  - The beginning of this file:

      2 shift/reduce conflicts

      error:  state 0: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)
      error:  state 1: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)

  - The rest are all the states:

      state 0:

          prog : . structs vdecs preds funcs

          MYSTRUCT    shift 3

          prog        goto 429
          structs     goto 2
          structdec   goto 1

          .           reduce by rule 12

  - “rule 12” means the 12th rule (counting from 0) in your ML-Yacc file

Use ML-Lex with ML-Yacc
- Most of the work in “c.lex” this time can be copied from Lab2
  - You can re-use regular expressions and lexical rules
- Difference from Lab2
  - You have to define the tokens in “c.grm”

      %term INT of int | EOF

  - Each “%term” entry in “c.grm” automatically becomes a token constructor in “c.grm.sig”

      signature C_TOKENS =
      sig
        type ('a,'b) token
        type svalue
        val EOF: 'a * 'a -> (svalue,'a) token
        val INT: (int) * 'a * 'a -> (svalue,'a) token
      end
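On the lexer side, the token constructors from C_TOKENS are what the rules in “c.lex” return. The fragment below is a sketch modeled on the calculator example in the ML-Yacc manual, not the actual lab file: the functor name CLexFun, the line counter pos, and the use of the line number for both position arguments are assumptions.

```sml
(* User declarations: the types ML-Yacc expects the lexer to provide. *)
structure Tokens = Tokens
type pos = int
type svalue = Tokens.svalue
type ('a,'b) token = ('a,'b) Tokens.token
type lexresult = (svalue,pos) token

(* Assumed position tracking: a simple line counter. *)
val pos = ref 0

(* Called at end of input; builds the EOF token declared in %term. *)
fun eof () = Tokens.EOF (!pos, !pos)

%%
%header (functor CLexFun(structure Tokens: C_TOKENS));
digit=[0-9];
ws=[\ \t];
%%
\n       => (pos := !pos + 1; continue ());
{ws}+    => (continue ());
{digit}+ => (Tokens.INT (valOf (Int.fromString yytext), !pos, !pos));
```

Each constructor takes the semantic value first (if the terminal has one) and then the left and right positions, matching the signature shown above.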

Hints
- Read the ML-Yacc Manual
- Read the language specification
- Test a lot!
