241-437 Compiler Structures, Semester 1, 2011-2012: 3. Lex (Compilers: lex/3)

Presentation transcript:

Compiler Structures: 3. Lex
Objectives
- describe lex
- give many examples of lex's use

Overview
1. What is lex (and flex)?
2. Lex Program Format
3. Removing Whitespace (white.l)
4. Printing Line Numbers (linenos.l)
5. Counting (counter.l)
6. Counting IDs (ids.l)
7. Matching Rules
8. More Information

1. What is lex (and flex)?
lex is a lexical analyzer generator.
- flex is a fast version of lex, which we'll be using
lex translates regular expressions (REs) into C code.
The generated code is easy to integrate into C compilers (and other applications).

Uses for Lex
- Convert input from one form to another.
- Extract information from text files.
- Extract tokens for a syntax analyzer.

Using Lex
lex source program (lex.l) -> lex (flex) -> lex.yy.c
lex.yy.c -> C compiler -> a.out
input stream of chars -> a.out -> sequence of tokens

Running Flex
With UNIX:
> flex foo.l
> gcc -Wall -o foo lex.yy.c
> ./foo < inputfile.txt
You may need to include -ll (or -lfl for flex) in the gcc call.
- it links in the lex library
You may get "warning" messages from gcc.

How Lex Works
The lex-generated program (e.g. foo) reads characters from stdin and tries to match a character sequence against its REs.
Once it matches a sequence, it reads in more characters for the next RE match.

2. Lex Program Format
A lex program has three sections, separated by %% lines:
REs and/or C code
%%
RE/action rules
%%
C functions

A Lex Program
%{
int charCount=0, wordCount=0, lineCount=0;
%}
word [^ \t\n]*
%%
{word}  { wordCount++; charCount += yyleng; }
[\n]    { charCount++; lineCount++; }
.       { charCount++; }
%%
int main(void)
{
  yylex();
  printf("Chars %d, Words: %d, Lines: %d\n",
         charCount, wordCount, lineCount);
  return 0;
}
The three sections, in order: 1) C code and REs, 2) RE/action rules, 3) C functions.

Section 1: Defining a RE
Format:
name RE
Examples:
digit  [0-9]
letter [A-Za-z]
id     {letter}({letter}|{digit})*
word   [^ \t\n]*

Regular Expressions in Lex
x          match the char x
\.         match the char .
"string"   match contents of string of chars
.          match any char except \n
^          match beginning of a line
$          match the end of a line
[xyz]      match one char x, y, or z
[^xyz]     match any char except x, y, and z
[a-z]      match one of a to z

r*         closure (match 0 or more r's)
r+         positive closure (match 1 or more r's)
r?         optional (match 0 or 1 r)
r1 r2      match r1 then r2 (concatenation)
r1 | r2    match r1 or r2 (union)
( r )      grouping
r1 / r2    match r1 when followed by r2
{ name }   match the RE defined by name
The spaces in this table are only for readability; a lex pattern is written without unquoted spaces.

Example REs (Again)
[0-9]                                      A single digit.
[0-9]+                                     An integer.
[0-9]+(\.[0-9]+)?                          An integer or floating point number.
[+-]?[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?   Integer, floating point, or scientific notation.
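The last RE above can be tried out without lex. The following is a minimal sketch, assuming a POSIX system: it uses the regcomp()/regexec() functions from <regex.h>, which are not part of lex, and the function name is_number is made up for this illustration. The pattern is anchored with ^ and $ so that the whole string must match.

```c
#include <regex.h>    /* POSIX regular expressions; not part of lex */
#include <stddef.h>

/* Hypothetical helper: returns 1 if the whole string s is an integer,
   floating point number, or number in scientific notation. */
int is_number(const char *s)
{
    /* The slide's RE, anchored so the entire string must match. */
    static const char *pattern =
        "^[+-]?[0-9]+(\\.[0-9]+)?([eE][+-]?[0-9]+)?$";
    regex_t re;
    int matched;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;                      /* the pattern failed to compile */
    matched = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

With this sketch, is_number("42"), is_number("3.14") and is_number("-2.5e+10") return 1, while is_number("1.") and is_number(".5") return 0 because the RE requires digits on both sides of the decimal point.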

Section 2: RE/Action Rule
A rule has the form:
{name} { action }
- the name must be defined in section 1
- the action is any C code
A RE can also be written directly in place of {name}, as in linenos.l.
If the RE matches an input character sequence, then the C code is executed.

Section 3: C Functions
Added to the lexical analyzer.
Depending on the lex/flex version, you may need to add the function:
int yywrap(void)
{ return 1; }
- it returns 1 to signal that the end of the input file means that the lexer can terminate

3. Removing Whitespace (white.l)
whitespace [ \t\n]
%%
{whitespace}  ;
.             { ECHO; }
%%
int yywrap(void)
{ return 1; }

int main(void)
{
  yylex();   // the lexical analyzer
  return 0;
}
Notes: whitespace is a name and [ \t\n] is its RE; the lone ; is an empty action; ECHO is a macro that prints the matched text.

Usage
> flex white.l
> gcc -Wall -o white lex.yy.c
> ./white < white.l
/*white.l*//*AndrewDavison,May...
>
The line after the ./white command is the output: the input file with its whitespace removed.

4. Printing Linenos (linenos.l)
%{
int lineno = 1;
%}
%%
^(.*)\n  { printf("%4d\t%s", lineno, yytext);
           lineno++;
         }
%%
int yywrap(void)
{ return 1; }
continued

int main(int argc, char *argv[])
{
  if (argc > 1) {
    FILE *file = fopen(argv[1], "r");
    if (file == NULL) {
      printf("Error opening %s\n", argv[1]);
      exit(1);
    }
    yyin = file;
  }
  yylex();
  fclose(yyin);
  return 0;
}

Built-in Variables
- yytext holds the matched string.
- yyin is the input stream.
- yyleng holds the length of the matched string.
There are several other built-in variables in lex.

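Usage
> flex linenos.l
> gcc -Wall -o linenos lex.yy.c
> ./linenos textfile.txt
> ./linenos < textfile.txt
Both forms work: main() opens the file named in argv[1] if one is given, and otherwise leaves yyin reading from stdin.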

> ./linenos < linenos.l
   1
   2    /* linenos.l */
   3    /* Andrew Davison, March 2005 */
   4
   5    %{
   6    int lineno = 1;
   7    %}
   8
   9    %%
   :
   :

5. Counting (counter.l)
%{
int charCount = 0, wordCount = 0, lineCount = 0;
%}
word [^ \t\n]*
%%
{word}  { wordCount++; charCount += yyleng; }
\n      { charCount++; lineCount++; }
.       { charCount++; }
%%
int yywrap(void)
{ return 1; }
continued

int main(void)
{
  yylex();
  printf("Characters %d, Words: %d, Lines: %d\n",
         charCount, wordCount, lineCount);
  return 0;
}

Usage
> flex counter.l
> gcc -Wall -o counter lex.yy.c
> ./counter < counter.l
Characters 496, Words: 78, Lines: 29
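To see what the three rules of counter.l do, it can help to write the same counting by hand. This is only a sketch in plain C, not the code that flex generates; the struct counts type and the count_text function are names invented for this illustration, and the text is passed in as a string rather than read from yyin.

```c
#include <stddef.h>

/* Hypothetical result type: the three counters kept by counter.l. */
struct counts { int chars, words, lines; };

/* Count characters, words and lines in text the way counter.l does:
   a word is a maximal run of characters other than space, tab, newline. */
struct counts count_text(const char *text)
{
    struct counts c = {0, 0, 0};
    size_t i = 0;

    while (text[i] != '\0') {
        if (text[i] == '\n') {                        /* the \n rule */
            c.chars++;
            c.lines++;
            i++;
        } else if (text[i] == ' ' || text[i] == '\t') { /* the . rule */
            c.chars++;
            i++;
        } else {                                      /* the {word} rule */
            size_t start = i;
            while (text[i] != '\0' && text[i] != ' ' &&
                   text[i] != '\t' && text[i] != '\n')
                i++;
            c.words++;
            c.chars += (int)(i - start);              /* plays the role of yyleng */
        }
    }
    return c;
}
```

For the text "one two\nthree\n" this sketch gives 14 characters, 3 words and 2 lines. The lex version is much shorter because the scanning loop and the longest-match logic are generated automatically.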

6. Counting IDs (ids.l)
%{
int count = 0;
%}
digit  [0-9]
letter [A-Za-z]
id     {letter}({letter}|{digit})*
%%
{id}  { count++; }
.     ;   /* ignore other things */
\n    ;
%%
continued

int yywrap(void)
{ return 1; }

int main()
{
  yylex();
  printf("No. of Idents: %d\n", count);
  return 0;
}

Usage
> flex ids.l
> gcc -Wall -o ids lex.yy.c
> ./ids < test1.txt
No. of Idents: 6
> l test1.txt
this is a test bing2 *((() this5
>
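The count of 6 can be checked against a hand-written equivalent of ids.l. This is a sketch only, not flex output: count_ids is a made-up name, the input is a string instead of yyin, and isalpha()/isalnum() are assumed to run in the default "C" locale, where they agree with [A-Za-z] and [A-Za-z0-9].

```c
#include <ctype.h>

/* Hypothetical helper: count identifiers of the form
   {letter}({letter}|{digit})* in text, skipping everything else. */
int count_ids(const char *text)
{
    const unsigned char *p = (const unsigned char *)text;
    int count = 0;

    while (*p != '\0') {
        if (isalpha(*p)) {          /* start of an id: the {id} rule */
            while (isalnum(*p))     /* take the longest possible id */
                p++;
            count++;
        } else {
            p++;                    /* the . and \n rules: ignore one char */
        }
    }
    return count;
}
```

For the contents of test1.txt shown above, count_ids("this is a test bing2 *((() this5") returns 6: this, is, a, test, bing2 and this5.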

7. Matching Rules
A rule is chosen that matches the biggest amount of input.
beg    {…}
begin  {…}
Both rules can match the input string "beginning", but the second rule is chosen because it matches more.
continued

If two rules can match the same amount of input, then the first rule is used.
begin   {…}
[a-z]+  {…}
Both rules can match the input string "begin", so the first rule is chosen.
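The two matching rules (longest match wins; on a tie, the earlier rule wins) can be sketched in C as follows. This is an illustration of the selection policy only, not how flex implements it: flex compiles all the REs into a single automaton. The types, the rule tables RULES_A and RULES_B, and the function choose_rule are all hypothetical names, and only two kinds of pattern are supported (a literal string and [a-z]+).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical rule representation for this sketch. */
enum kind { LITERAL, LOWER_RUN };           /* "text" or [a-z]+ */
struct rule { enum kind kind; const char *text; };

/* The two rule pairs from the slides, in file order. */
const struct rule RULES_A[] = { {LITERAL, "beg"},   {LITERAL, "begin"} };
const struct rule RULES_B[] = { {LITERAL, "begin"}, {LOWER_RUN, NULL} };

/* How many characters at the start of input does rule r match? (0 = none) */
static size_t match_len(const struct rule *r, const char *input)
{
    size_t n = 0;

    if (r->kind == LITERAL) {
        n = strlen(r->text);
        return strncmp(r->text, input, n) == 0 ? n : 0;
    }
    while (input[n] >= 'a' && input[n] <= 'z')      /* [a-z]+ */
        n++;
    return n;
}

/* Return the index of the chosen rule, or -1 if no rule matches. */
int choose_rule(const struct rule *rules, int nrules, const char *input)
{
    int best = -1;
    size_t best_len = 0;
    int i;

    for (i = 0; i < nrules; i++) {
        size_t n = match_len(&rules[i], input);
        if (n > best_len) {     /* strictly longer, so an earlier rule keeps a tie */
            best = i;
            best_len = n;
        }
    }
    return best;
}
```

With these tables, choose_rule(RULES_A, 2, "beginning") returns 1 (begin matches 5 characters, beg only 3), and choose_rule(RULES_B, 2, "begin") returns 0 (both rules match 5 characters, so the first is kept).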

8. More Information
Lex and Yacc by Levine, Mason, and Brown, O'Reilly, 2nd edition (in our library)
On UNIX:
- man lex
- info lex
continued

A Compact Guide to Lex & Yacc by Tom Niemann
- with several calculator examples, which I'll be discussing when we get to yacc
- it's also on the course website in the "Niemann Tutorial" subdirectory of "Useful Info"
Software.coe/Compilers/