1 LEX & YACC Tutorial February 28, 2008 Tom St. John.


Presentation transcript:

1 LEX & YACC Tutorial February 28, 2008 Tom St. John

2 Outline: Overview of Lex and Yacc; Structure of a Lex Specification; Structure of a Yacc Specification; Some Hints for Lab1

3 Overview
Lex (a LEXical analyzer generator) generates lexical analyzers (scanners, or lexers).
Yacc (Yet Another Compiler-Compiler) generates a parser based on an analytic grammar.
Flex is a free alternative to Lex.
Bison is a free parser generator, written for the GNU project, that serves as an alternative to Yacc.

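4 Scanner, Parser, Lex and Yacc
[Diagram] Lex spec (.l) -> lex/flex -> lex.yy.c. Yacc spec (name.y) -> yacc/bison -> name.tab.c and name.tab.h. The C compiler builds the scanner and the parser from these files. The source program feeds the scanner, which hands tokens to the parser. The diagram also shows a symbol table.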

5 Skeleton of a Lex Specification (.l file)
x.l:
    %{
    %}
    [DEFINITION SECTION]
    %%
    [RULES SECTION]
    %%
    C auxiliary subroutines
The %{ ... %} part will be embedded into lex.yy.c.
The rules section defines how to scan and what action to take for each token.
The auxiliary subroutines can be any user code.
lex.yy.c is generated after running
    > lex x.l

6 Lex Specification: Definition Section
    %{
    #include "zcalc.tab.h"
    #include "zcalc.h"
    #include <...>
    %}
zcalc.h: user-defined header file.
zcalc.tab.h: you should include this! Yacc will generate this file automatically.

7 Lex Specification: Rules Section
Format:
    pattern    { corresponding actions }
    …
    pattern    { corresponding actions }
Each pattern is a regular expression; each action is C code.
Example:
    [1-9][0-9]*    { yylval.dval = atoi(yytext); return NUMBER; }
An unsigned integer will be accepted as a token. You need to define dval and NUMBER in the .y file.
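
To make the example rule concrete, here is a minimal hand-written C sketch of what the pattern [1-9][0-9]* accepts and what its action computes. The function names match_number and number_value are hypothetical and are not part of Lex; the code Lex actually generates is table-driven and looks nothing like this.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Length of the longest prefix of s that matches [1-9][0-9]*, or 0 if none.
   Hypothetical helper: a hand-coded stand-in for the generated scanner. */
size_t match_number(const char *s)
{
    size_t n;
    if (s[0] < '1' || s[0] > '9')
        return 0;                      /* first character must be 1-9 */
    n = 1;
    while (isdigit((unsigned char)s[n]))
        n++;                           /* then any run of 0-9 */
    return n;
}

/* Stand-in for the rule's action: convert the matched text the way
   atoi(yytext) would. Returns -1 when the pattern does not match. */
int number_value(const char *s)
{
    return match_number(s) ? atoi(s) : -1;
}
```

Note that the pattern as written does not match a lone "0", because the first character must be in the range 1-9.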

8 Two Notes on Using Lex
1. Lex takes the longest match.
    Input: abc
    Rule:  [a-z]+
    Token: abc (not "a" or "ab")
2. When several rules match the same amount of input, Lex uses the first applicable rule.
    Input:  post
    Rule 1: "post"      { printf("Hello,"); }
    Rule 2: [a-zA-Z]+   { printf("World!"); }
    It will print Hello, (not "World!")
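
The two notes combine into one selection policy: prefer the longest match, and break ties in favor of the rule listed first. The following C sketch imitates that policy for the two rules on this slide. It is an illustration only, under the assumption that each rule can be hand-coded as a small matcher; pick_rule, match_post and match_word are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Rule 1: the literal keyword "post". Returns the match length or 0. */
static size_t match_post(const char *s)
{
    return strncmp(s, "post", 4) == 0 ? 4 : 0;
}

/* Rule 2: [a-zA-Z]+ . Returns the match length or 0. */
static size_t match_word(const char *s)
{
    size_t n = 0;
    while (isalpha((unsigned char)s[n]))
        n++;
    return n;
}

/* Returns 1 or 2 for the winning rule, or 0 if neither matches.
   The length of the winning match is stored in *len. */
int pick_rule(const char *s, size_t *len)
{
    size_t (*rules[])(const char *) = { match_post, match_word };
    int best = 0;
    *len = 0;
    for (int i = 0; i < 2; i++) {
        size_t n = rules[i](s);
        if (n > *len) {        /* strictly longer wins; a tie keeps the earlier rule */
            *len = n;
            best = i + 1;
        }
    }
    return best;
}
```

With this sketch, the input "post" selects rule 1 (both rules match four characters, so the earlier one wins), while "poster" selects rule 2 because its match is longer.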

9 Skeleton of a Yacc Specification (.y file)
x.y:
    %{
    %}
    [DEFINITION SECTION]
    %%
    [PRODUCTION RULES SECTION]
    %%
    C auxiliary subroutines
Definition section: declaration of the tokens recognized by the parser (and returned by the lexer).
Production rules section: how to understand the input, and what actions to take for each "sentence".
x.tab.c is generated after running
    > yacc x.y
(Bison names the output x.tab.c; traditional yacc writes y.tab.c.)

10 Yacc Specification: Definition Section (1)
zcalc.l:
    [1-9][0-9]*    { yylval.dval = atoi(yytext); return NUMBER; }
zcalc.y:
    %{
    #include "zcalc.h"
    #include <...>
    int flag = 0;
    %}
    %union {
        int dval;
        …
    }
    %token <dval> NUMBER

11 Yacc Specification: Definition Section (2)
    %left '-' '+'
    %left '*' '/' '%'
    %type <...> expression statement statement_list
    %type <...> logical_expr
%left defines an operator's precedence and associativity; operators declared on later lines bind more tightly. This solves the problem raised on slide 13.
%type defines the names of nonterminals. With these names, you will define rules in the rules section.

12 Yacc Specification: Production Rule Section (1)
Format:
    nontermname  : symbol1 symbol2 … { corresponding actions }
                 | symbol3 symbol4 … { corresponding actions }
                 | …
                 ;
    nontermname2 : …
The right-hand side of each alternative is a sequence of grammar symbols, each action is C code, and | reads as "or".

13 Yacc Specification: Production Rule Section (2)
Example:
    statement  : expression { printf(" = %g\n", $1); }
               ;
    expression : expression '+' expression { $$ = $1 + $3; }
               | expression '-' expression { $$ = $1 - $3; }
               | expression '*' expression { $$ = $1 * $3; }
               | NUMBER
               ;
$$: the final value produced by the nonterminal's action. Only for writing, not reading.
$n: the value of the n-th symbol on the right-hand side.
Avoiding ambiguous expressions: what will happen if we have the input "2+3*4"? The grammar alone allows both (2+3)*4 and 2+(3*4). That is the reason why we need to define operator precedence in the definition section.
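
To see what precedence and left associativity buy for the input "2+3*4", here is a small C evaluator. It is a sketch only: it uses precedence climbing, a hand-written technique, rather than the LALR tables Yacc generates, and the names prec, parse_expr and eval_infix are hypothetical. It assumes non-negative integer operands and no whitespace.

```c
#include <ctype.h>

/* Binding strength: higher binds tighter; 0 means "not an operator".
   This mirrors declaring %left '+' '-' before %left '*'. */
static int prec(char op)
{
    switch (op) {
    case '+': case '-': return 1;
    case '*':           return 2;
    default:            return 0;
    }
}

/* Reads a run of decimal digits and advances *p past it. */
static int parse_number(const char **p)
{
    int v = 0;
    while (isdigit((unsigned char)**p))
        v = v * 10 + (*(*p)++ - '0');
    return v;
}

/* Parses operators whose precedence is at least min_prec (min_prec >= 1). */
static int parse_expr(const char **p, int min_prec)
{
    int lhs = parse_number(p);
    while (prec(**p) >= min_prec) {
        char op = *(*p)++;
        /* prec(op) + 1 makes equal-precedence operators group to the left */
        int rhs = parse_expr(p, prec(op) + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        }
    }
    return lhs;
}

int eval_infix(const char *s)
{
    return parse_expr(&s, 1);
}
```

Here eval_infix("2+3*4") yields 14 because '*' binds tighter than '+', and eval_infix("2-3-4") yields -5 because '-' groups to the left.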

14 Hints for Lab1
Exercise 2
Q: How to recognize "prefix", "postfix" and "infix" in the lexer?
A: Step 1: Add these rules to your .l file:
    %%
    "prefix"   { return PREFIX; }
    "postfix"  { return POSTFIX; }
    "infix"    { return INFIX; }
    …
    %%
These rules should be put in the rules section. They are case-sensitive.
Step 2: Declare PREFIX, POSTFIX and INFIX as tokens in your .y file.

15 Hints for Lab1
Exercise 2
Q: How to combine the three modes together?
A: You may have the following grammar in your yacc file:
    int flag = 0;   // Default setting
    %%
    ..
    statement  : PREFIX  { flag = 0; }
               | INFIX   { flag = 1; }
               | POSTFIX { flag = 2; }
               | expression
    …
    expression : expr_pre | expr_in | expr_post ;
    expr_pre   : '+' expr_pre expr_pre { if (flag == 0) $$ = $2 + $3; }
    …
    expr_in    : expr_in '+' expr_in { if (flag == 1) $$ = $1 + $3; }
    …
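
The production expr_pre: '+' expr_pre expr_pre says that a prefix expression is an operator followed by two prefix expressions. As a rough illustration, the same shape can be written by hand as a recursive C function. This is a sketch under assumptions, not the lab solution: parse_prefix and eval_prefix are hypothetical names, operands are non-negative integers, and tokens are separated by spaces.

```c
#include <ctype.h>

/* Advances *p past any spaces. */
static void skip_spaces(const char **p)
{
    while (**p == ' ')
        (*p)++;
}

/* Mirrors expr_pre: an operator followed by two prefix expressions,
   or a plain number. */
static int parse_prefix(const char **p)
{
    skip_spaces(p);
    char c = **p;
    if (c == '+' || c == '-' || c == '*') {
        (*p)++;
        int a = parse_prefix(p);   /* corresponds to $2 in the yacc action */
        int b = parse_prefix(p);   /* corresponds to $3 in the yacc action */
        return c == '+' ? a + b : c == '-' ? a - b : a * b;
    }
    int v = 0;
    while (isdigit((unsigned char)**p))
        v = v * 10 + (*(*p)++ - '0');
    return v;
}

int eval_prefix(const char *s)
{
    return parse_prefix(&s);
}
```

For example, eval_prefix("+ 2 * 3 4") evaluates 2 + (3 * 4) and returns 14. No precedence declarations are involved, since prefix notation is unambiguous by construction.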

16 Hints for Lab1
Exercise 3
Q: What action do we use to define the octal and hexadecimal tokens?
A: You can simply use the strtol function for this:
    long strtol(const char *nptr, char **endptr, int base);
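
A minimal sketch of how strtol could be used in such an action follows. The wrapper literal_value and its ok flag are hypothetical; only strtol itself comes from the C standard library. Passing base 0 lets strtol choose the base from the prefix: "0x" or "0X" for hexadecimal, a leading "0" for octal, otherwise decimal.

```c
#include <stdlib.h>

/* Converts a decimal, octal (leading 0) or hexadecimal (leading 0x) literal.
   Sets *ok to 1 only if the whole string was consumed; returns 0 otherwise. */
long literal_value(const char *text, int *ok)
{
    char *end;
    long v = strtol(text, &end, 0);   /* base 0: the prefix selects 8, 10 or 16 */
    *ok = (end != text && *end == '\0');
    return *ok ? v : 0;
}
```

If the lexer rule already knows which kind of literal it matched, passing an explicit base of 8 or 16 works as well. In a Lex action the argument would typically be yytext.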

17 Hints for Lab1
Exercise 4-5
Q: How to build up and print the AST?
1. Define the struct for the AST and a linked-list structure that holds AST nodes.
2. In the yacc file, your statements and expressions should be of 'ast' type (no longer dval type).
If you use a union here instead of a struct, it is easier to handle the terminal nodes (names and numbers).
    typedef struct EXP {
        struct EXP* exp1;
        struct EXP* exp2;
        struct OP operator;
    } AST;

18 Hints for Lab1
Exercise
Functions for making expressions. There can be different functions depending on the type of the node (kinds of expression, number, name and so on). You can make functions like:
    makeExpression(struct EXP* exp1, struct EXP* exp2, struct OP operator)
The action for each production in your yacc file can call any function you have declared. Just as a sentence is recursively parsed, your AST is recursively built up and traversed.
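
One possible shape for such constructor functions is sketched below in C. Everything here is an assumption for illustration: the Node type, its tagged union, and the names make_number, make_expression, eval and free_tree are hypothetical and differ from the EXP/OP structs on the slides. The sketch evaluates the tree instead of printing it, but a printing traversal would recurse the same way.

```c
#include <stdlib.h>

typedef enum { NODE_NUMBER, NODE_BINARY } NodeKind;

/* A tagged union: leaf nodes hold a number, inner nodes hold an operator
   and two children. */
typedef struct Node {
    NodeKind kind;
    union {
        double number;                          /* used when kind == NODE_NUMBER */
        struct {
            char op;
            struct Node *lhs, *rhs;
        } binary;                               /* used when kind == NODE_BINARY */
    } u;
} Node;

Node *make_number(double value)
{
    Node *n = malloc(sizeof *n);
    if (n) {
        n->kind = NODE_NUMBER;
        n->u.number = value;
    }
    return n;
}

Node *make_expression(Node *lhs, Node *rhs, char op)
{
    Node *n = malloc(sizeof *n);
    if (n) {
        n->kind = NODE_BINARY;
        n->u.binary.op = op;
        n->u.binary.lhs = lhs;
        n->u.binary.rhs = rhs;
    }
    return n;
}

/* Recursive traversal: evaluates the tree bottom-up. */
double eval(const Node *n)
{
    if (n->kind == NODE_NUMBER)
        return n->u.number;
    double a = eval(n->u.binary.lhs);
    double b = eval(n->u.binary.rhs);
    switch (n->u.binary.op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    default:  return 0.0;                       /* unknown operator */
    }
}

/* Releases a tree built with the constructors above. */
void free_tree(Node *n)
{
    if (!n)
        return;
    if (n->kind == NODE_BINARY) {
        free_tree(n->u.binary.lhs);
        free_tree(n->u.binary.rhs);
    }
    free(n);
}
```

In a yacc action, a call such as $$ = make_expression($1, $3, '+'); would then build the tree as the parser reduces each production.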

19 A Case Study: The Calculator
zcalc.l:
    %{
    #include "zcalc.tab.h"
    #include "y.tab.h"
    %}
    %%
    ([0-9]+|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?) { yylval.dval = atof(yytext); return NUMBER; }
    [ \t]   ;
    [a-zA-Z][a-zA-Z0-9]* { struct symtab *sp = symlook(yytext); yylval.symp = sp; return NAME; }
    \n|.    { return yytext[0]; }   /* pass newlines and operator characters to the parser */
    %%
zcalc.y:
    %{
    #include "zcalc.h"
    %}
    %union {
        double dval;
        struct symtab *symp;
    }
    %token <symp> NAME
    %token <dval> NUMBER
    %left '+' '-'
    %type <dval> expression
    %%
    statement_list : statement '\n'
                   | statement_list statement '\n'
                   ;
    statement  : NAME '=' expression { $1->value = $3; }
               | expression { printf(" = %g\n", $1); }
               ;
    expression : expression '+' expression { $$ = $1 + $3; }
               | expression '-' expression { $$ = $1 - $3; }
               | NUMBER { $$ = $1; }
               | NAME { $$ = $1->value; }
               ;
    %%
    struct symtab *symlook(char *s)
    {
        /* This function looks up the symbol table and checks whether the
           symbol s is already there. If not, it adds s to the symbol table. */
    }
    int main() { yyparse(); return 0; }
Generate the parser and the token header with:
    yacc -d zcalc.y
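
The body of symlook is left as a comment on the slide. A minimal C sketch of one way to fill it in follows, assuming a fixed-size table and a name field in struct symtab; the slide only shows the value field, so the name field, the capacity NSYMS and the array name table are all assumptions. The parameter is declared const char * here, a small deviation from the slide's char *.

```c
#include <stdlib.h>
#include <string.h>

#define NSYMS 20                      /* assumed capacity of the table */

struct symtab {
    char  *name;                      /* assumed field; NULL marks a free slot */
    double value;                     /* the field used by the grammar actions */
};

static struct symtab table[NSYMS];

/* Returns the entry for s, adding it in the first free slot if it is not
   present yet. Returns NULL if the table is full or memory runs out. */
struct symtab *symlook(const char *s)
{
    for (struct symtab *sp = table; sp < &table[NSYMS]; sp++) {
        if (sp->name && strcmp(sp->name, s) == 0)
            return sp;                /* already in the table */
        if (!sp->name) {              /* free slot: add the symbol here */
            size_t len = strlen(s) + 1;
            sp->name = malloc(len);
            if (!sp->name)
                return NULL;
            memcpy(sp->name, s, len); /* keep a private copy; yytext is reused */
            return sp;
        }
    }
    return NULL;                      /* table full */
}
```

Copying the string matters because the lexer overwrites yytext on every token, so storing the pointer itself would leave every entry pointing at the most recent token text.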

20 References Lex and Yacc Page