Lesson 10 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.

Presentation transcript:

Lesson 10 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg

Outline: Flex, Bison, abstract syntax trees.

FLEX

Flex: A tool for automatic generation of scanners. It is an open-source version of Lex. It takes regular expressions as input and outputs a C (or C++) file for the scanner.

Flex workflow: Regexps (mylexer.l) → Flex → int yylex() … (mylexer.c) → C compiler → … (mylexer.obj)

The input file to Flex has three sections, separated by lines containing only %%:
    Definitions
    %%
    Rules
    %%
    User code

The definitions section: macro definitions. A macro is referenced elsewhere by writing its name in braces.
    Specify a letter:        letter     [A-Za-z]
    Specify a delimiter:     delimiter  [,:;.]
    Specify a digit:         digit      [0-9]
    Specify an identifier:   id         {letter}({letter}|{digit})*

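The definitions section: user code. C code placed between %{ and %} is copied to the generated file.
    %{
    #include [header name missing in the transcript]
    int a_nice_global_variable = 0;
    int my_favourite_function(void) { return 42; }
    %}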

The rules section: Rule = regexp + C code. The longest matching pattern is used. If two equally long patterns match, the first one in the file is used. Examples:
    =|>=?|<(=|>)?    { return RELOP; }
    {id}             { return ID; }
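The two disambiguation rules (longest match first, then earliest rule) can be illustrated without Flex. The following is a minimal hand-written sketch, not the code Flex generates: the token codes, the three matcher functions, and next_token() are hypothetical names, and the set of relational operators is an assumption made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes, for illustration only. */
enum { TOK_NONE = 0, TOK_IF, TOK_ID, TOK_RELOP };

/* Each matcher returns the length of the prefix of s it accepts (0 = no match). */
static size_t match_if(const char *s) {
    return (s[0] == 'i' && s[1] == 'f') ? 2 : 0;
}

static size_t match_id(const char *s) {
    size_t n = 0;
    if (!isalpha((unsigned char)s[0])) return 0;
    while (isalnum((unsigned char)s[n])) n++;
    return n;
}

static size_t match_relop(const char *s) {
    /* Assumed operator set: =  >  >=  <  <=  <> */
    if (s[0] == '=') return 1;
    if (s[0] == '>') return (s[1] == '=') ? 2 : 1;
    if (s[0] == '<') return (s[1] == '=' || s[1] == '>') ? 2 : 1;
    return 0;
}

typedef size_t (*matcher_fn)(const char *);
struct rule { matcher_fn match; int token; };

/* Rules in "file order": the keyword rule comes before the identifier rule. */
static const struct rule rules[] = {
    { match_if, TOK_IF }, { match_id, TOK_ID }, { match_relop, TOK_RELOP },
};

/* Pick the longest match; on a tie the earlier rule wins because the
   comparison is strict. Stores the match length in *len. */
int next_token(const char *s, size_t *len) {
    int best = TOK_NONE;
    size_t i;
    assert(s != NULL && len != NULL);
    *len = 0;
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i].match(s);
        if (n > *len) { *len = n; best = rules[i].token; }
    }
    return best;
}
```

With this sketch, the input "if" matches both the keyword and the identifier matcher with length 2, so the earlier keyword rule wins, while "iffy" is an identifier because the identifier match is longer.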

The regexp language of Flex:
    ?     Previous regexp is optional
    {}    Macro expansion (defined in the definitions section)
    .     Matches any character that is not end of line
    $     Matches the end of a line
    ^     Matches the beginning of a line
    []    Matches any enclosed character

The [] syntax: Similar to | but more powerful. Example:
    digit [0123456789]
is the same as
    digit 0|1|2|3|4|5|6|7|8|9
Special characters inside the brackets are - (range) and ^ (negation, when placed first):
    digit      [0-9]
    letter     [A-Za-z]
    non_digit  [^0-9]

The user code section: Only C code is valid here. It is copied unchanged to the generated C file.

The generated scanner: By default, a function called yylex() is defined. It works similarly to your GetNextToken() from lab 1, and its name can be changed with options. Some globals are defined as well (they can be changed into local variables with options):
    yyin       The file to read from
    yytext     The matched lexeme (char*)
    yyleng     The length of yytext
    yylineno   Line number of the match

The yywrap() function: Called upon end-of-file. Should be supplied by the user. Suppressed with %option noyywrap or --noyywrap.
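When the function is not suppressed, the simplest user-supplied version just reports that there is no further input. A minimal sketch, assuming the scanner reads a single file:

```c
/* Called by the generated scanner when it reaches end of input.
   Returning 1 means "no more input, stop scanning". Returning 0 would
   instead tell the scanner that yyin has been pointed at another file
   and that scanning should continue from there. */
int yywrap(void) {
    return 1;
}
```

This definition would typically go in the user code section of the Flex input file.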

Scanner states in Flex: The state affects what tokens should be recognized. Example from the language ALF:
    { fref 32 DEADC0DE }     <- Identifier
    { hex_val DEADC0DE }     <- Hex constant

Scanner states in Flex: Declare the state:
    %x READ_HEX
Use the state to make rules conditional:
    hex_val                   { BEGIN(READ_HEX); return HEX_VAL_KW; }
    [a-zA-Z_][a-zA-Z0-9_]*    { return ID; }
    <READ_HEX>[0-9a-fA-F]+    { BEGIN(INITIAL); return NUM; }
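The effect of the state can be imitated in plain C to see why DEADC0DE is an identifier in one context and a number in another. This is a hand-written sketch of the idea, not Flex output: struct scanner, scan(), the ST_ and TOK_ constants are hypothetical names, and characters outside identifiers and hex digits are simply reported as errors.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { TOK_ERROR = -1, TOK_EOF = 0, TOK_HEX_VAL_KW, TOK_ID, TOK_NUM };
enum state { ST_INITIAL, ST_READ_HEX };

struct scanner {
    const char *p;     /* current position in the input string */
    enum state st;     /* current scanner state */
};

static int is_id_start(int c) { return isalpha(c) || c == '_'; }
static int is_id_char(int c)  { return isalnum(c) || c == '_'; }

/* Return the next token and copy its lexeme (truncated to cap-1 chars). */
int scan(struct scanner *sc, char *lexeme, size_t cap) {
    size_t n = 0, k;
    int tok;
    assert(sc != NULL && lexeme != NULL && cap > 0);
    while (isspace((unsigned char)*sc->p)) sc->p++;   /* skip whitespace */
    lexeme[0] = '\0';
    if (*sc->p == '\0') return TOK_EOF;

    if (sc->st == ST_READ_HEX) {
        /* Only the hex rule is active in this (exclusive) state. */
        while (isxdigit((unsigned char)sc->p[n])) n++;
        if (n == 0) { sc->p++; return TOK_ERROR; }
        sc->st = ST_INITIAL;                          /* like BEGIN(INITIAL) */
        tok = TOK_NUM;
    } else {
        if (!is_id_start((unsigned char)sc->p[0])) { sc->p++; return TOK_ERROR; }
        while (is_id_char((unsigned char)sc->p[n])) n++;
        if (n == 7 && strncmp(sc->p, "hex_val", 7) == 0) {
            sc->st = ST_READ_HEX;                     /* like BEGIN(READ_HEX) */
            tok = TOK_HEX_VAL_KW;
        } else {
            tok = TOK_ID;
        }
    }
    k = n < cap - 1 ? n : cap - 1;
    memcpy(lexeme, sc->p, k);
    lexeme[k] = '\0';
    sc->p += n;
    return tok;
}
```

Scanning "fref DEADC0DE" from the initial state yields two identifiers, whereas "hex_val DEADC0DE" yields the keyword followed by a number, after which the scanner is back in the initial state.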

Online resources

BISON

Bison: A tool for automatic generation of parsers. It is an open-source alternative to Yacc. It takes an SDT scheme as input and outputs C (or C++) source code for an LALR parser. It is commonly used together with Flex.

Bison workflow: SDT scheme (myparser.y) → Bison → int yyparse() … (myparser.c) and token definitions (myparser.h); then myparser.c → C compiler → … (myparser.obj)

The input file to Bison has three sections, separated by lines containing only %%:
    Definitions
    %%
    SDT scheme
    %%
    User code

Definitions section: Define tokens. Define operator precedence. Define operator associativity. Define the types of grammar symbol attributes. Write C code between %{ and %}. Issue certain commands to Bison.

Token definition: Normal case:
    %token IDENTIFIER
    %token WHILE
Token, precedence, associativity, and type:
    %left RELOP
    %left MINUSOP PLUSOP
    %right NOTOP
This enables the use of ambiguous grammars!

Defining types: Just enter the type inside <> before the list of tokens:
    %left <Operator> RELOP
    %left <Operator> MULOP
    %right <Operator> NOTOP UNOP
    %token <String> ID STRING
Or the same for non-terminals:
    %type <Node> stmnt expr actuals exprs

The variable yylval: Used by the lexical analyzer to store token attributes. The default type is int. It may be given other types using %union:
    %union {
        int Operator;
        char *String;
        NODE_TYPE Node;
    }
The type (member name) is then used like this:
    %token <String> ID STRING

Code provided by the user:
    yyerror(char* msg)   Function called on syntax errors
    yylex()              Function called to get the next token

Options to Bison: Given on the command line or in the grammar file.
    --defines or %defines: Output a C header file with definitions useful to a scanner: tokens (#defines) and the type of yylval
    %error-verbose: More detailed error messages
    --name-prefix or %name-prefix: Change the default "yy" prefix on all names
    %define api.pure: Do not use globals
    --verbose or %verbose: Write detailed information to an extra output file

Translation scheme section:
    decl:   BASIC_TYPE idents ';' ;
    idents: idents ',' ident
          | ident ;
    ident:  ID ;

Semantic actions: Written in C. Executed when the production is used in a reduction. $$, $1, $2, etc. refer to the attributes of the grammar symbols and can be used as regular C variables. $$ refers to the attribute of the head, $1 to the attribute of the first symbol in the body, etc.
    E : E '+' T { $$ = $1 + $3; } ;
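What the action for E : E '+' T does at reduction time can be pictured as an operation on the parser's value stack. The following is a simplified sketch of that idea, not Bison's actual generated code; struct value_stack, push(), and reduce_add() are hypothetical names, and all attributes are assumed to be of type int.

```c
#include <assert.h>

#define STACK_MAX 64

/* Value stack: one attribute per grammar symbol currently on the parse stack. */
struct value_stack {
    int vals[STACK_MAX];
    int top;               /* number of entries in use */
};

void push(struct value_stack *s, int v) {
    assert(s->top < STACK_MAX);
    s->vals[s->top++] = v;
}

/* Reduce by E : E '+' T. The three topmost entries are $1, $2 and $3. */
void reduce_add(struct value_stack *s) {
    int *body;
    int head;
    assert(s->top >= 3);
    body = &s->vals[s->top - 3];   /* body[0] is $1, body[1] is $2, body[2] is $3 */
    head = body[0] + body[2];      /* the action: $$ = $1 + $3 */
    s->top -= 3;                   /* pop the symbols of the body */
    push(s, head);                 /* push the attribute of the head */
}
```

For example, after pushing 2, the attribute of '+', and 3, a call to reduce_add() leaves a single entry with the value 5 on the stack.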

Using ambiguous grammars in Bison: Default actions: on a reduce/reduce conflict, choose the first rule in the file; on a shift/reduce conflict, always shift. With explicit precedence and associativity: on a shift/reduce conflict, compare the precedence and associativity of the rule with that of the lookahead token.
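The shift/reduce comparison can be written down as a small decision function. This is a sketch of the resolution rule as documented for Bison, not code taken from Bison itself; the types and the name resolve() are hypothetical, and level 0 is used here to mean "no precedence declared".

```c
#include <assert.h>

enum assoc  { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONASSOC };
enum action { ACT_SHIFT, ACT_REDUCE, ACT_ERROR };

/* Precedence of a rule or of a lookahead token.
   level 0 means that no precedence was declared; higher binds tighter. */
struct prec {
    int level;
    enum assoc assoc;
};

/* Decide a shift/reduce conflict between a rule and the lookahead token. */
enum action resolve(struct prec rule, struct prec lookahead) {
    assert(rule.level >= 0 && lookahead.level >= 0);
    if (rule.level == 0 || lookahead.level == 0)
        return ACT_SHIFT;                    /* default action: always shift */
    if (rule.level > lookahead.level)
        return ACT_REDUCE;                   /* the rule binds tighter */
    if (rule.level < lookahead.level)
        return ACT_SHIFT;                    /* the lookahead binds tighter */
    switch (lookahead.assoc) {               /* equal precedence */
    case ASSOC_LEFT:  return ACT_REDUCE;
    case ASSOC_RIGHT: return ACT_SHIFT;
    default:          return ACT_ERROR;      /* non-associative operator */
    }
}
```

With '+' declared at a lower level than '*' and both left-associative, a rule for expr + expr followed by the lookahead '*' gives a shift, while a rule for expr * expr followed by '+' or '*' gives a reduce.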

The %expect declaration: To suppress shift/reduce warnings, write
    %expect n
where n is the exact number of conflicts.

Contextual precedence: The same token might have different precedence depending on context:
    expr → expr - expr
         | expr * expr
         | - expr
         | id

    Stack          Input
    … - expr       * expr …

Contextual precedence: Define a dummy token:
    %left '-'
    %left '*'
    %left UMINUS
Use the %prec modifier:
    expr → - expr %prec UMINUS

Examples of parser configurations:
    Stack                  Input     Action
    … if (cond) stmt       else …    shift
    … expr + expr          * …       shift
    … expr * expr          + …       reduce expr → expr * expr
    … expr * expr          * …       reduce expr → expr * expr

Online resources

ABSTRACT SYNTAX TREES

Abstract syntax trees “AST” or just “syntax tree” 37 E E E a + E E b5 * + * a b5

Syntax trees vs. parse trees:
Parse trees: Interior nodes are nonterminals, leaves are terminals. Rarely constructed as an explicit data structure. Represent the concrete syntax.
Syntax trees: Interior nodes are “operators”, leaves are operands. Commonly constructed as an explicit data structure. Represent the abstract syntax.

Why syntax trees? They simplify subsequent analyses. They are independent of the parsing strategy. They make it easier to add new analysis passes without having to modify the parser. They are a more compact representation than parse trees.

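Syntax tree example:
    if (a < 1)
        b = 2 + 3;
    else {
        c = d * 4;
        e(f, 5);
    }
[Figure: the syntax tree for this statement, with an if node at the root and subtrees for the condition a < 1, the assignment to b, the assignment to c, and the call e(f, 5)]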

Exercise (1): Draw an abstract syntax tree for the statement
    while (i < 100) { x = 2 * x; i = i + 1; }

Constructing a syntax tree in Bison:
    expr: expr '+' expr   { $$ = createOpNode($1, '+', $3); }
        | expr '*' expr   { $$ = createOpNode($1, '*', $3); }
        | ID              { $$ = createIdNode($1.name); }
        ;
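The slides do not show the node constructors, so the following is only one possible sketch of them. The node layout, the fixed-size name buffer, and the helper to_prefix() are assumptions made for illustration; only the names createOpNode and createIdNode and their argument order come from the grammar above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed node layout: op is 0 for an identifier leaf. */
typedef struct node {
    char op;                     /* operator character, e.g. '+' or '*' */
    char name[32];               /* identifier name (leaves only) */
    struct node *left, *right;   /* operands (operator nodes only) */
} node;

node *createIdNode(const char *name) {
    node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    strncpy(n->name, name, sizeof n->name - 1);  /* calloc keeps it terminated */
    return n;
}

node *createOpNode(node *left, char op, node *right) {
    node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->op = op;
    n->left = left;
    n->right = right;
    return n;
}

/* Append the prefix form of the tree to the string already in out. */
static void emit(const node *n, char *out, size_t cap) {
    size_t used = strlen(out);
    if (n->op == 0) {
        snprintf(out + used, cap - used, "%s", n->name);
        return;
    }
    snprintf(out + used, cap - used, "(%c ", n->op);
    emit(n->left, out, cap);
    used = strlen(out);
    snprintf(out + used, cap - used, " ");
    emit(n->right, out, cap);
    used = strlen(out);
    snprintf(out + used, cap - used, ")");
}

/* Hypothetical helper: write the tree in prefix form into out. */
void to_prefix(const node *n, char *out, size_t cap) {
    assert(n != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    emit(n, out, cap);
}
```

Building the tree that the actions would produce for a + b * c, namely createOpNode(createIdNode("a"), '+', createOpNode(createIdNode("b"), '*', createIdNode("c"))), and passing it to to_prefix() gives the string (+ a (* b c)), which shows the operator nodes in the interior and the operands at the leaves.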

Constructing a syntax tree in Bison:
    stmt  : RETURN expr ';'   { $$ = mReturn($2, $1); }
          ;
    stmts : stmts stmt        { $$ = connectStmts($1, $2); }
          |                   { $$ = NULL; }
          ;

Conclusion: Flex generates C source code for a scanner given a set of regular expressions. Bison generates C source code for a bottom-up parser given a syntax-directed translation scheme. Building syntax trees simplifies subsequent analyses of the program. Syntax trees can be built in semantic actions.

Next time: Syntax-directed definitions and translation schemes. Semantic analysis and type analysis.