1 Using Lex

2 Introduction When you write a lex specification, you create a set of patterns which lex matches against the input. Each time one of the patterns matches, the lex program invokes C code that you provide which does something with the matched text.

3 Introduction (Cont’d) Lex itself doesn’t produce an executable program; instead it translates the lex specification into a file containing a C routine called yylex(). Your program calls yylex() to run the lexer.

4 The format of regular expressions in lex The notation is slightly different from that used in our textbook.

5 Regular Expressions Regular expressions used by Lex (see pages 28 and 29): . * [] ^ $ {} \ + ? | "…" / ()

6 Examples of Regular Expressions
[0-9]
[0-9]+
[0-9]*
-?[0-9]+
[0-9]*\.[0-9]+
([0-9]+)|([0-9]*\.[0-9]+)
-?(([0-9]+)|([0-9]*\.[0-9]+))
[eE][-+]?[0-9]+
-?(([0-9]+)|([0-9]*\.[0-9]+))([eE][-+]?[0-9]+)?

7 The Structure of a Lex Program
(Definition section)
%%
(Rules section)
%%
(User subroutines section)

8 Example 1-1: Word recognizer (ch1-02.l)

%{
/*
 * this sample demonstrates (very) simple recognition:
 * a verb/not a verb.
 */
%}
%%
[\t ]+      /* ignore white space */ ;
is |
am |
are |
were |
was |
be |
being |
been |
do |
does |
did |
will |
would |
should |
can |
could |
has |
have |
had |
go          { printf("%s: is a verb\n", yytext); }
[a-zA-Z]+   { printf("%s: is not a verb\n", yytext); }
.|\n        { ECHO; /* normal default anyway */ }
%%
int main(void)
{
    yylex();
    return 0;
}

9 The definition section Lex copies the material between "%{" and "%}" directly to the generated C file, so you may write any valid C code here.

10 Rules section Each rule is made up of two parts:
–A pattern
–An action
E.g.
[\t ]+      /* ignore white space */ ;

11 Rules section (Cont’d) E.g.
is |
am |
are |
were |
was |
be |
being |
been |
do |
does |
did |
will |
would |
should |
can |
could |
has |
have |
had |
go          { printf("%s: is a verb\n", yytext); }

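12 Rules section (Cont’d) E.g.
[a-zA-Z]+   { printf("%s: is not a verb\n", yytext); }
.|\n        { ECHO; /* normal default anyway */ }

Lex has a set of simple disambiguating rules:
1. Lex patterns only match a given input character or string once.
2. Lex executes the action for the longest possible match for the current input.
When two patterns match text of the same length, lex uses the rule that appears first in the specification, so "is" is reported as a verb while "island" falls through to [a-zA-Z]+.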

13 User subroutines section It can consist of any legal C code. Lex copies it to the C file after the end of the Lex-generated code.
%%
int main(void)
{
    yylex();
    return 0;
}

14 Example 2-1
%%
[\n\t ]     ;
-?(([0-9]+)|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?)     { printf("number\n"); }
.           ECHO;
%%
int main(void)
{
    yylex();
    return 0;
}

15 A Word Counting Program The definition section
%{
unsigned charCount = 0, wordCount = 0, lineCount = 0;
%}
word [^ \t\n]+
eol  \n

16 A Word Counting Program (Cont’d) The rules section
{word}      { wordCount++; charCount += yyleng; }
{eol}       { charCount++; lineCount++; }
.           charCount++;

17 A Word Counting Program (Cont’d) The user subroutines section
int main(int argc, char **argv)
{
    if (argc > 1) {
        FILE *file;

        file = fopen(argv[1], "r");
        if (!file) {
            fprintf(stderr, "could not open %s\n", argv[1]);
            exit(1);
        }
        yyin = file;    /* read from the named file instead of standard input */
    }
    yylex();
    /* the counters are unsigned, so they are printed with %u */
    printf("%u %u %u\n", charCount, wordCount, lineCount);
    return 0;
}

18 How to implement a scanner()? We have to stop yylex() each time it recognizes a defined token.
–Insert a "return" at the end of the rule's action, e.g.
[a-zA-Z]+   { return 2; }
See scanner_example.l

19
%{
%}
%%
[\t ]+      /* ignore white space */ ;
is |
am |
are |
were |
was |
be |
being |
been |
do |
does |
did |
will |
would |
should |
can |
could |
has |
have |
had |
go          { return 1; }
[a-zA-Z]+   { return 2; }
.|\n        { /* discard anything else; no ECHO here */ }
%%
int main(void)
{
    int i;

    while ((i = yylex()) != 0) {
        printf("return value is %d, token is %s\n", i, yytext);
    }
    printf("End of file\n");
    return 0;
}

How to implement multi-character lookahead in lex? Check lex_lookahead.l
–DO10I=1,100
–DO10I=