FLEX Fast Lexical Analyzer EECS 6083

Introduction
Flex is a lexical analysis (scanner) generator.
Flex is provided with a user input file or standard input.
In return, Flex generates C code for a scanner function.
InputFile.lex / standard input -> Flex -> lex.yy.c (defines yylex())

Introduction
The generated code can be used in two ways:
– Compiled with the Flex library to produce a scanner executable
lex.yy.c + Flex library -> C/C++ compiler -> scanner executable

Introduction
The generated code can be used in two ways:
– Compiled with other compiler source code and the Flex library to produce an entire compiler
lex.yy.c + Flex library + other compiler source code -> C/C++ compiler -> compiler executable

Input File Format
The input file contains three major sections:
– Definitions
– Rules
– User code (optional)
Sections are separated by a line containing only %%.
* Extracted from Flex User Manual
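
The three-section layout can be sketched as a complete input file. This is only an illustration: the file name count.l, the two counters, and the main() function are assumptions made for the example, not part of the slides.

```lex
%option noyywrap
%{
/* Definitions section: C code between %{ and %} is copied to lex.yy.c. */
#include <stdio.h>
static int lines = 0;
static int chars = 0;
%}

%%
    /* Rules section: one "pattern action" pair per rule. */
\n      { ++lines; ++chars; }
.       { ++chars; }

%%
/* User code section (optional): copied verbatim to the end of lex.yy.c. */
int main(void)
{
    yylex();                      /* scan standard input until end of file */
    printf("%d lines, %d characters\n", lines, chars);
    return 0;
}
```

Assuming flex and a C compiler are installed, running flex on count.l and compiling the resulting lex.yy.c should give a small line and character counter. The noyywrap option is used here so that the example does not depend on the Flex library for yywrap().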

Definitions Section
Input format: name definition
– “name” is an identifier, typically for the token type being scanned for.
– “definition” is the regular expression that characterizes that token.
– A definition can be referenced later as {name}, which expands to (definition).
* Extracted from Flex User Manual
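
A short sketch of name definitions in use. The names DIGIT and ID follow the style of the Flex manual's examples; the printed labels and the main() function are assumptions for illustration.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

DIGIT   [0-9]
ID      [a-z][a-z0-9]*

%%
{DIGIT}+"."{DIGIT}*   { printf("float: %s\n", yytext); }
{DIGIT}+              { printf("integer: %s\n", yytext); }
{ID}                  { printf("identifier: %s\n", yytext); }
[ \t\n]+              { /* skip whitespace */ }

%%
int main(void)
{
    yylex();
    return 0;
}
```

Each {DIGIT} in the rules is replaced by ([0-9]) when Flex reads the file, so a definition only has to be changed in one place.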

Definitions Section
– Un-indented comments (from /* to */) are copied verbatim to the output file.
– Code bracketed by %{ and %} is copied verbatim to the output, minus the brackets.
– Code bracketed by %top{ and } is placed at the top of the output file; input order is preserved.
* Extracted from Flex User Manual

Rules Section
Input format: pattern action
“pattern”:
– describes what the scanner may encounter when scanning
– is created using extended regular expressions
– ends at the first non-escaped whitespace character
“action”:
– is the code that is executed when the pattern is matched
– begins after the pattern and ends either at the end of the line or with the closing brace }
Comments bracketed by /* and */ are ignored.
The brackets %{ and %} are used before the first rule to declare variables local to the scanning routine.
* Extracted from Flex User Manual
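
The pieces of a rules section can be sketched as follows. The parenthesis-checking task, the variable name depth, and the messages are hypothetical choices for this example.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

%%
%{
    /* Declared before the first rule, so local to yylex(). */
    int depth = 0;
%}
"("     { ++depth; }
")"     {
            /* A braced action may span several lines. */
            if (depth > 0)
                --depth;
            else
                printf("unmatched ')'\n");
        }
\n      { if (depth != 0) printf("%d unclosed '('\n", depth); depth = 0; }
.       { /* discard any other character */ }

%%
int main(void)
{
    yylex();
    return 0;
}
```

Because depth is local to yylex(), it is initialised each time the scanning routine is entered; here yylex() is called once and runs until end of input.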

Input Matching
The scanner matches strings in its input against the patterns the user has defined.
– In case of multiple matches, the scanner chooses the pattern matching the most text.
– If the same amount of text is matched, the scanner chooses the pattern defined first.
When a string is matched:
– the global pointer yytext is set to the location of the string
– the length of the string is saved in yyleng
Any unmatched text is copied directly to the output.
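
Both matching rules can be seen in a small sketch that separates one keyword from identifiers. The token labels are assumptions for illustration.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

%%
"if"        { printf("KEYWORD %s\n", yytext); }
[a-z]+      { printf("IDENT %s (yyleng = %d)\n", yytext, (int) yyleng); }
[ \t\n]+    { /* skip whitespace */ }

%%
int main(void)
{
    yylex();
    return 0;
}
```

On the input "if iffy", both rules match the first word with the same length, so the rule listed first wins and it is reported as a keyword. For the second word, [a-z]+ matches four characters while "if" matches only two, so the longer match wins and it is reported as an identifier with yyleng = 4. Characters that no rule matches, such as digits, fall through to the default rule and are echoed.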

Input Matching
The form of yytext can be a character pointer or a character array.
Pointer option (default):
– Advantages: faster; no overflow issues
– Disadvantages: calls to unput() destroy the stored characters in yytext; not portable
Array option:
– Advantages: stored characters can be safely manipulated (internally and externally); unput() doesn't destroy them
– Disadvantages: slower than the pointer option; cannot be used with C++ scanner classes

Actions
Many Flex functions and macros exist.
– ECHO: copies yytext to the output
– BEGIN: places the scanner in the named start condition
– REJECT: proceeds to the second-best matching rule
– yymore(): appends the next matched text to the current yytext instead of replacing it
– yyless(n): returns all but the first n characters of the current token to the input stream
– unput(c): places the character c back at the beginning of the input stream
– input(): reads the next character from the input stream
– YY_FLUSH_BUFFER: flushes the scanner's internal buffer
– yyterminate(): terminates scanning and returns 0 to the scanner's caller
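
A minimal sketch of ECHO and yyless() working together, adapted from the kind of example the Flex manual gives. The main() function is added here only to make the file complete.

```lex
%option noyywrap

%%
foobar      {
                /* Print the whole match, then keep "foo" and
                   return "bar" to the input for rescanning. */
                ECHO;
                yyless(3);
            }
[a-z]+      { ECHO; }

%%
int main(void)
{
    yylex();
    return 0;
}
```

On the input "foobar" this writes "foobarbar": the first rule echoes the full token, yyless(3) pushes back everything after the first three characters, and the second rule then matches and echoes "bar".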

Start Conditions
Start conditions allow state-specific processing to occur.
They are declared in the definitions section.
Types:
– Inclusive (%s): recognizes start condition patterns and general patterns
– Exclusive (%x): recognizes only start condition patterns
* Extracted from Flex User Manual
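
An exclusive start condition can be sketched with a filter that strips C-style comments. The condition name COMMENT is a hypothetical choice; INITIAL is Flex's built-in default start condition.

```lex
%option noyywrap
%x COMMENT

%%
"/*"            { BEGIN(COMMENT); }
<COMMENT>"*/"   { BEGIN(INITIAL); }
<COMMENT>.|\n   { /* discard text inside the comment */ }

%%
int main(void)
{
    yylex();
    return 0;
}
```

Text outside comments matches no rule and is echoed by the default rule. Since COMMENT is exclusive, the unqualified "/*" rule is inactive inside a comment; had it been declared with %s, that rule would stay active there as well.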

Values Available to User
Flex functions and variables available to the user:
– char* yytext: text of the current token, modifiable
– int yyleng: length of the current token text
– FILE* yyin: pointer to the file the scanner is reading from
– FILE* yyout: pointer to the file the scanner is outputting to
– void yyrestart( FILE *new_file ): directs the scanner to scan new_file
– YY_START: returns an int corresponding to the current start condition
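
As a sketch of how yyin is typically used, the user code below redirects the scanner to a file named on the command line. The word-counting rules and the words counter are assumptions for illustration.

```lex
%option noyywrap
%{
#include <stdio.h>
static int words = 0;
%}

%%
[^ \t\n]+   { ++words; }
[ \t\n]+    { /* skip whitespace */ }

%%
int main(int argc, char **argv)
{
    if (argc > 1) {
        yyin = fopen(argv[1], "r");   /* scan a file instead of stdin */
        if (yyin == NULL) {
            perror(argv[1]);
            return 1;
        }
    }
    yylex();
    printf("%d words\n", words);
    return 0;
}
```

If yyin is left unset, the scanner reads from standard input.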

Interfacing with YACC
Parsers produced by the parser generator YACC are designed to call a scanner such as the one Flex generates.
– Running YACC with the -d option places all token definitions in y.tab.h.
– YACC will call yylex() to get the next token.
– yylex() returns the token id and stores any associated value (such as the token text) in the global variable yylval.
* Extracted from Flex User Manual
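
The scanner side of this interface might look like the sketch below. It is not complete on its own: it assumes a separate YACC grammar that declares a token named NUMBER (a hypothetical name), uses the default int type for yylval, and was processed with the -d option to produce y.tab.h.

```lex
%option noyywrap
%{
#include <stdlib.h>
/* Assumed: y.tab.h comes from running yacc -d on a grammar that
   declares "%token NUMBER" and keeps the default int yylval. */
#include "y.tab.h"
%}

%%
[0-9]+      { yylval = atoi(yytext); return NUMBER; }
[ \t]+      { /* skip blanks */ }
\n|.        { return yytext[0]; /* single characters stand for themselves */ }

%%
```

There is no main() here because the parser generated by YACC supplies yyparse(), which calls yylex() each time it needs a token.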

Generating C++ Scanners
A C++ Flex scanner can be created in two ways:
– Compile the scanner generated from the Flex input file, together with the Flex library, using a C++ compiler
– Run Flex with the “-+” flag or with “%option c++”
In the second case, lex.yy.cc is generated, built around two scanner classes:
– FlexLexer: contains user value members (yyleng)
– yyFlexLexer: access to scanner-specific methods (yylex())
* Extracted from Flex User Manual
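
A minimal sketch of the “%option c++” route, loosely following the example in the Flex manual. The rules and output labels are assumptions for illustration, and details of the generated classes can differ between Flex versions.

```lex
%option c++ noyywrap
%{
#include <iostream>
%}

%%
[0-9]+      { std::cout << "number " << YYText() << '\n'; }
[ \t\n]+    { /* skip whitespace */ }
.           { std::cout << "other " << YYText() << '\n'; }

%%
int main()
{
    yyFlexLexer lexer;            // reads from std::cin by default
    while (lexer.yylex() != 0) {
        // no rule returns a value, so this loop ends at end of input
    }
    return 0;
}
```

Inside the actions, YYText() plays the role that yytext has in a C scanner. The generated lex.yy.cc is then compiled with a C++ compiler.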

Survey of Scanner Options
– Batch / interactive (default) mode: a batch scanner always looks ahead one character to recognize a token; an interactive scanner looks ahead only when it must
– Enable the start condition stack
– yytext in character pointer (default) / array mode
– Automatically create main(), consisting only of a call to yylex()
– Debug mode: indicates when a rule is matched
– Reentrant mode: for multithreaded scanning
* Extracted from Flex User Manual
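
Most of these options can be set from inside the input file with %option. The sketch below enables two of them; the single rule is a hypothetical placeholder.

```lex
/* Other options from the survey, one per feature:
   batch, interactive, stack, pointer, array, reentrant */
%option main debug
%{
#include <stdio.h>
%}

%%
[0-9]+      { printf("<number>"); }

%%
```

With the main option, Flex supplies a main() that simply calls yylex(), and noyywrap is implied. With the debug option, the scanner reports each accepted rule on standard error while the global yy_flex_debug is non-zero, which is its default.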

References
– Flex User Manual: l#Top
– Quick Tutorial on using Flex with Bison: own-toy-compiler/5/