Subject Name: System Software Subject Code: 10SCS52

1 Subject Name: System Software Subject Code: 10SCS52
Faculty Names: Neema Babu, Sanchari Saha, Suganthi Department: CSE Date: 12/4/2018

2 Overview
Grammars
Recursive Rules
Shift/Reduce Parsing
What Yacc Cannot Parse
A Yacc Parser
The Definition Section
The Rules Section
Symbol Values and Actions
The Lexer
Compiling and Running a Simple Parser
Arithmetic Expressions and Ambiguity
When Not to Use Precedence Rules
Variables and Typed Tokens

3 Grammars
Yacc takes a grammar that you specify and writes a parser that recognizes valid "sentences" in that grammar. A grammar is a series of rules that the parser uses to recognize syntactically valid input. For example, here is a simple grammar:
statement -> NAME = expression
expression -> NUMBER + NUMBER | NUMBER - NUMBER

4 The vertical bar, "|", means there are two possibilities for the same symbol, i.e., an expression can be either an addition or a subtraction. The symbol to the left of the -> is known as the left-hand side of the rule, often abbreviated LHS; the symbols to the right are the right-hand side, usually abbreviated RHS. Symbols that appear on the left-hand side of some rule are non-terminal symbols, while symbols that actually appear in the input and are returned by the lexer are terminal symbols, or tokens.

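5 The usual way to represent a parsed sentence is as a tree.
For example, if we parsed the input "fred = 12 + 13" with this grammar, the tree would have statement at the root, with the children NAME (fred), =, and expression; the expression node in turn has the children NUMBER (12), +, and NUMBER (13).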

6 Recursive Rules
Rules can refer directly or indirectly to themselves; this important ability makes it possible to parse arbitrarily long input sequences. For example, we can extend our grammar to handle longer arithmetic expressions. A longer assignment beginning with "fred = " can then be parsed by applying the expression rules repeatedly, which yields a correspondingly deeper parse tree.
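The extended grammar itself is not reproduced in this transcript. A minimal sketch of one way to write it in yacc notation, assuming the recursion sits on the left of each rule (the rules on the original slide may differ):

```yacc
%token NAME NUMBER

%%

statement:  NAME '=' expression
    ;

/* An expression is a single number, or a shorter expression followed by
   "+ NUMBER" or "- NUMBER". The last two alternatives refer to expression
   itself, which is what allows arbitrarily long input. */
expression: NUMBER
    |       expression '+' NUMBER
    |       expression '-' NUMBER
    ;
```

With rules of this shape, each additional "+ NUMBER" or "- NUMBER" in the input is absorbed by one more application of a recursive alternative.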

8 Shift/Reduce Parsing
When yacc processes a parser, it creates a set of states each of which reflects a possible position in one or more partially parsed rules. As the parser reads tokens, each time it reads a token that doesn't complete a rule it pushes the token on an internal stack and switches to a new state reflecting the token it just read. This action is called a shift. When it has found all the symbols that constitute the right-hand side of a rule, it pops the right-hand side symbols off the stack, pushes the left-hand side symbol onto the stack, and switches to a new state reflecting the new symbol on the stack. This action is called a reduction, since it usually reduces the number of items on the stack.

9 Shift/Reduce Parsing
Whenever yacc reduces a rule, it executes user code associated with the rule. Now we can check how it parses the input "fred = 12 + 13" using the simple rules from the earlier slides. The parser starts by shifting tokens onto the internal stack one at a time:
fred
fred =
fred = 12
fred = 12 +
fred = 12 + 13

10 At this point it can reduce the rule "expression -> NUMBER + NUMBER", so it pops the 12, the plus, and the 13 from the stack and replaces them with expression:
fred = expression
Now it reduces the rule "statement -> NAME = expression", so it pops fred, =, and expression and replaces them with statement. It has now reached the end of the input and the stack has been reduced to the start symbol, so the input was valid according to the grammar.
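The order of the reductions described above can be observed by attaching a print statement to each rule. The following is a sketch only: the hand-written yylex(), the main() function and the trace messages are assumptions made for illustration and do not come from the slides.

```yacc
%{
#include <stdio.h>
#include <ctype.h>

int yylex(void);
int yyparse(void);
void yyerror(const char *msg);
%}

%token NAME NUMBER

%%

statement:  NAME '=' expression   { puts("reduce: statement -> NAME = expression"); }
    ;

expression: NUMBER '+' NUMBER     { puts("reduce: expression -> NUMBER + NUMBER"); }
    |       NUMBER '-' NUMBER     { puts("reduce: expression -> NUMBER - NUMBER"); }
    ;

%%

/* Hypothetical hand-written lexer: names, numbers, single characters. */
int yylex(void)
{
    int c = getchar();

    while (c == ' ' || c == '\t')       /* skip blanks */
        c = getchar();
    if (c == EOF || c == '\n')          /* token zero ends the input */
        return 0;
    if (isdigit(c)) {                   /* a run of digits is one NUMBER */
        while (isdigit(c = getchar()))
            ;
        ungetc(c, stdin);
        return NUMBER;
    }
    if (isalpha(c)) {                   /* a run of letters/digits is one NAME */
        while (isalnum(c = getchar()))
            ;
        ungetc(c, stdin);
        return NAME;
    }
    return c;                           /* '=', '+', '-' stand for themselves */
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

Given the line fred = 12 + 13 on standard input, the expression reduction should be reported first and the statement reduction second, matching the walk-through above.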

11 What Yacc Cannot Parse
Yacc cannot deal with ambiguous grammars, ones in which the same input can match more than one parse tree. It also cannot deal with grammars that need more than one token of lookahead to tell whether it has matched a rule. Consider, as an example, a grammar in which HORSE can be either a cart-animal or a work-animal.

12 This grammar is not ambiguous, but yacc can't handle it because it requires two symbols of lookahead. In particular, in the input "HORSE AND CART" it cannot tell whether HORSE is a cart-animal or a work-animal until it sees CART, and yacc cannot look that far ahead. If we changed the first rule so that the animal is followed directly by CART or PLOW, yacc would have no trouble: it can look one token ahead to see whether an input of HORSE is followed by CART, in which case the horse is a cart-animal, or by PLOW, in which case it is a work-animal.
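The grammar under discussion is missing from the transcript. A reconstruction consistent with the text might look like the sketch below; the token names GOAT and OX are assumptions added so that each animal rule has a second alternative.

```yacc
/* GOAT and OX are hypothetical tokens added for illustration. */
%token HORSE GOAT OX CART PLOW AND

%%

/* After HORSE the next token is AND in both alternatives, so one token
   of lookahead cannot decide which animal rule to reduce. Yacc is
   expected to report a reduce/reduce conflict here. */
phrase:      cart_animal AND CART
    |        work_animal AND PLOW
    ;

cart_animal: HORSE
    |        GOAT
    ;

work_animal: HORSE
    |        OX
    ;

/* Changed first rule that needs only one token of lookahead:

   phrase:   cart_animal CART
       |     work_animal PLOW
       ;
*/
```

With the changed rule, the token right after HORSE is already CART or PLOW, which is enough to choose between cart_animal and work_animal.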

13 A Yacc Parser
A yacc grammar has a three-part structure:
Definition section: handles control information for the yacc-generated parser and generally sets up the execution environment in which the parser will operate.
Rules section: contains the rules for the parser.
Subroutine section: C code copied verbatim into the generated C program.
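As a rough illustration of the three sections, a skeleton yacc file could be laid out as follows. The single rule and the yyerror() body are placeholders chosen for this sketch, not content from the slides.

```yacc
%{
/* C declarations copied to the top of the generated parser. */
#include <stdio.h>

int yylex(void);
void yyerror(const char *msg);
%}

/* Definition section: token declarations and other control information. */
%token NUMBER

%%

/* Rules section: the grammar rules, optionally with actions. */
input: NUMBER
    ;

%%

/* Subroutine section: C code copied verbatim into the generated parser. */
void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}
```

The two %% lines separate the sections; the second %% and the subroutine section may be left out when no extra C code is needed.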

14 The Definition Section
The definition section includes declarations of the tokens used in the grammar, for example: %token NAME NUMBER
We can use single-quoted characters as tokens without declaring them.

15 The Rules Section
The rules section simply consists of a list of grammar rules. A colon is used between the left- and right-hand sides of a rule, and we put a semicolon at the end of each rule. The symbol on the left-hand side of the first rule in the grammar is normally the start symbol.
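Written in yacc's own notation, the two rules quoted on slide 10 (statement -> NAME = expression and expression -> NUMBER + NUMBER), together with the subtraction alternative, would look roughly like this:

```yacc
/* Tokens returned by the lexer. */
%token NAME NUMBER

%%

/* First rule: statement is therefore the start symbol. */
statement:  NAME '=' expression
    ;

/* The literal characters '=', '+' and '-' are single-quoted tokens
   and need no %token declaration. */
expression: NUMBER '+' NUMBER
    |       NUMBER '-' NUMBER
    ;
```

The colon plays the role of the arrow, the vertical bar separates alternatives, and the semicolon closes each rule.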

16 Symbol Values and Actions
Every symbol in a yacc parser has a value. The value gives additional information about a particular instance of a symbol. If a symbol represents a number, the value would be the particular number. If it represents a literal text string, the value would probably be a pointer to a copy of the string. If it represents a variable in a program, the value would be a pointer to a symbol table entry describing the variable.

17 Whenever the parser reduces a rule, it executes user C code associated with the rule, known as the rule's action. The action appears in braces after the end of the rule, before the semicolon or vertical bar. The action code can refer to the values of the right-hand side symbols as $1, $2, ..., and can set the value of the left-hand side by setting $$.
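The example that accompanied this slide is not in the transcript. A sketch of what actions typically look like, assuming the default integer value type and a recursive expression rule:

```yacc
%{
#include <stdio.h>

int yylex(void);
void yyerror(const char *msg);
%}

%token NAME NUMBER

%%

statement:  NAME '=' expression
    |       expression            { printf("= %d\n", $1); }   /* print the result */
    ;

expression: expression '+' NUMBER { $$ = $1 + $3; }   /* $1: left value, $3: the number */
    |       expression '-' NUMBER { $$ = $1 - $3; }
    |       NUMBER                { $$ = $1; }        /* pass the number's value up */
    ;
```

Here $2 would be the value of the operator token, which is not used. A lexer, yyerror() and main() are assumed to be supplied separately.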

18 The Lexer
Whenever a lexer and a parser are used together, the parser is the higher-level routine, and it calls the lexer, yylex(), whenever it needs a token from the input. As soon as the lexer finds a token of interest to the parser, it returns to the parser, returning the token code as the value. A simple lexer is enough to provide tokens for our parser.

19 Strings of digits are numbers, whitespace is ignored, and a newline returns an end of input token (number zero) to tell the parser that there is no more to read. Whenever the lexer returns a token to the parser, if the token has an associated value, the lexer must store the value in yylval before returning.
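The lexer the slides refer to is a lex specification that is not reproduced here. As a stand-in, the sketch below writes yylex() by hand in the subroutine section of a yacc file; it is an assumption-laden illustration of the behaviour just described, not the original lexer.

```yacc
%{
#include <stdio.h>
#include <ctype.h>

int yylex(void);
int yyparse(void);
void yyerror(const char *msg);
%}

%token NUMBER

%%

statement:  expression            { printf("= %d\n", $1); }
    ;

expression: expression '+' NUMBER { $$ = $1 + $3; }
    |       expression '-' NUMBER { $$ = $1 - $3; }
    |       NUMBER                { $$ = $1; }
    ;

%%

/* Hand-written replacement for the lex-generated lexer (an assumption). */
int yylex(void)
{
    int c = getchar();

    while (c == ' ' || c == '\t')       /* whitespace is ignored */
        c = getchar();

    if (c == EOF || c == '\n')          /* token zero: nothing more to read */
        return 0;

    if (isdigit(c)) {                   /* a string of digits is a NUMBER */
        int value = 0;
        while (isdigit(c)) {
            value = value * 10 + (c - '0');
            c = getchar();
        }
        ungetc(c, stdin);
        yylval = value;                 /* store the value before returning */
        return NUMBER;
    }
    return c;                           /* any other character is its own token */
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

Given the input line 12 + 13, this parser should print = 25.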

20 Compiling and Running a Simple Parser
The following commands compile and run lex and yacc programs:
lex filename.l
yacc -d filename.y
cc lex.yy.c y.tab.c -ll
./a.out

21 Arithmetic Expressions and Ambiguity
Consider a grammar in which every arithmetic operator is written as a rule that combines two expressions into a new expression.

22 This grammar has a problem: it is extremely ambiguous.
For example, the input 2+3*4 has two possible parse trees: one groups it as (2+3)*4 and the other as 2+(3*4).
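The grammar in question is not included in the transcript. An ambiguous expression grammar of the kind being described could be sketched as follows; the exact set of operators is an assumption.

```yacc
%token NUMBER

%%

/* Every operator combines two expressions, and nothing says which
   operator binds more tightly or in which direction operators group.
   For 2+3*4 the parser could reduce 2+3 first or shift the '*' first,
   so yacc is expected to report shift/reduce conflicts. */
expression: expression '+' expression
    |       expression '-' expression
    |       expression '*' expression
    |       expression '/' expression
    |       '-' expression
    |       '(' expression ')'
    |       NUMBER
    ;
```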

23 The problem is that we haven't told yacc about the precedence and associativity of the operators. Precedence controls which operators to execute first in an expression. Associativity controls the grouping of operators at the same precedence level. Operators may group to the left, e.g., a-b-c in C means (a-b)-c, or to the right, e.g., a=b=c in C means a=(b=c).

24 There are two ways to specify precedence and associativity in a grammar: 1) implicitly and 2) explicitly.
Implicit: To specify them implicitly, rewrite the grammar using separate non-terminal symbols for each precedence level.
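The example for the implicit approach is missing from the transcript. One possible layering, with the non-terminal names mulexp and primary chosen here purely for illustration:

```yacc
%token NUMBER

%%

/* Lowest precedence level: addition and subtraction. */
expression: expression '+' mulexp
    |       expression '-' mulexp
    |       mulexp
    ;

/* Higher level: multiplication and division. */
mulexp:     mulexp '*' primary
    |       mulexp '/' primary
    |       primary
    ;

/* Highest level: parentheses, unary minus, and numbers. */
primary:    '(' expression ')'
    |       '-' primary
    |       NUMBER
    ;
```

Because the recursion is on the left in expression and mulexp, operators at the same level group to the left, and 2+3*4 can only be parsed as 2+(3*4).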

25 Explicit: We can add precedence declarations (%left and %nonassoc lines) to the definition section.
Each of these declarations defines a level of precedence. They tell yacc that "+" and "-" are left associative and at the lowest precedence level, "*" and "/" are left associative and at a higher precedence level, and UMINUS, a pseudo-token standing for unary minus, has no associativity and is at the highest precedence.
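The declarations described here are not shown in the transcript. Going by the description, this fragment of the definition section would presumably read:

```yacc
%left '+' '-'        /* lowest precedence, left associative */
%left '*' '/'        /* higher precedence, left associative */
%nonassoc UMINUS     /* highest precedence, no associativity */
```

Later declarations have higher precedence than earlier ones, which is why the order of the three lines matters.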

26 The calculator grammar with expressions and precedence
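The grammar for this slide did not survive in the transcript. The following is a sketch of a calculator along these lines, assuming integer values; the hand-written yylex(), the main() function and the divide-by-zero handling are additions made for illustration.

```yacc
%{
#include <stdio.h>
#include <ctype.h>

int yylex(void);
int yyparse(void);
void yyerror(const char *msg);
%}

%token NUMBER
%left '+' '-'
%left '*' '/'
%nonassoc UMINUS

%%

statement:  expression                  { printf("= %d\n", $1); }
    ;

expression: expression '+' expression   { $$ = $1 + $3; }
    |       expression '-' expression   { $$ = $1 - $3; }
    |       expression '*' expression   { $$ = $1 * $3; }
    |       expression '/' expression
                {
                    if ($3 == 0) {      /* illustrative error handling */
                        yyerror("divide by zero");
                        $$ = 0;
                    } else {
                        $$ = $1 / $3;   /* integer division */
                    }
                }
    |       '-' expression %prec UMINUS { $$ = -$2; }
    |       '(' expression ')'          { $$ = $2; }
    |       NUMBER                      { $$ = $1; }
    ;

%%

/* Hypothetical hand-written lexer standing in for a lex specification. */
int yylex(void)
{
    int c = getchar();

    while (c == ' ' || c == '\t')
        c = getchar();
    if (c == EOF || c == '\n')
        return 0;
    if (isdigit(c)) {
        int value = 0;
        while (isdigit(c)) {
            value = value * 10 + (c - '0');
            c = getchar();
        }
        ungetc(c, stdin);
        yylval = value;
        return NUMBER;
    }
    return c;
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

The %prec UMINUS annotation gives the unary minus rule the precedence of the UMINUS pseudo-token instead of that of binary "-". Given the input line 2+3*4, this parser should print = 14.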

27 Variables and Typed Tokens
We can extend our calculator to handle variables with single-letter names. To make the calculator more useful, we can also extend it to handle multiple expressions, one per line, and to use floating-point values.
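The slides do not show how this extension is written. A common way to do it, sketched below under that assumption, is to declare a %union of value types and attach a type to each token and non-terminal. The names vbltable, dval and vblno are hypothetical, and the lexer (which would have to return '\n' as a token and fill in yylval.dval or yylval.vblno) is assumed to be supplied separately.

```yacc
%{
#include <stdio.h>

int yylex(void);
void yyerror(const char *msg);

double vbltable[26];            /* hypothetical: one slot per letter a..z */
%}

%union {
    double dval;                /* value of a number or an expression */
    int    vblno;               /* index of a variable in vbltable */
}

%token <vblno> NAME
%token <dval>  NUMBER
%left '+' '-'
%left '*' '/'
%nonassoc UMINUS

%type <dval> expression

%%

/* One statement per line, any number of lines. */
statement_list: statement '\n'
    |           statement_list statement '\n'
    ;

statement:  NAME '=' expression         { vbltable[$1] = $3; }
    |       expression                  { printf("= %g\n", $1); }
    ;

expression: expression '+' expression   { $$ = $1 + $3; }
    |       expression '-' expression   { $$ = $1 - $3; }
    |       expression '*' expression   { $$ = $1 * $3; }
    |       expression '/' expression
                {
                    if ($3 == 0.0) {
                        yyerror("divide by zero");
                        $$ = 0.0;
                    } else {
                        $$ = $1 / $3;
                    }
                }
    |       '-' expression %prec UMINUS { $$ = -$2; }
    |       '(' expression ')'          { $$ = $2; }
    |       NUMBER                      { $$ = $1; }
    |       NAME                        { $$ = vbltable[$1]; }
    ;
```

With the types declared, yacc picks the right union member for each $n automatically, so the actions look the same as in the integer version.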

28 Questions
1) What is shift/reduce parsing? Explain the parsing for the string "fred=12+13" using the following grammar:
Statement: name = expression
Expression: number + number | number - number
2) a) What do you mean by an ambiguous grammar? How can it be overcome? Illustrate with an example.
b) Write a Yacc program to recognize the given arithmetic expression containing the +, -, /, * operators. 6M

29 3) a) What is Yacc? Explain the different sections used in writing the Yacc specification. 6.5M
b) Give a brief description of the working principle of Yacc. 6M
4) What do you mean by ambiguity in arithmetic expressions? How can it be handled in Yacc? Write a Yacc program to function as a calculator which performs addition, subtraction, division, multiplication and unary operations, and show how the program makes Yacc understand the precedence and associativity of operators.

30 5) Define the following terms with an example: 12.5M
i) Non-terminals ii) y.tab.h iii) lexical analysis iv) input() & unput()
6) Define the following terms with an example:
i) yywrap() ii) yylex() iii) YYSTYPE iv) yylval

31 THANK YOU

