Other Issues (§ 3.9, Not Discussed): more advanced algorithm construction, converting a regular expression to a DFA directly.



Final Notes: R.E. to NFA Construction
An NFA constructed with the previous techniques can be simulated directly by algorithm. The run time is proportional to |N| * |x|, where |N| is the number of NFA states and |x| is the length of the input.
Alternatively, we can construct a DFA from the NFA and use the resulting Dtran to recognize input:
Automaton    Space required    Time to simulate
NFA          O(|r|)            O(|r| * |x|)
DFA          O(2^|r|)          O(|x|)
where |r| is the length of the regular expression.

Pulling Together Concepts
Designing a lexical analyzer generator:
- Reg. Expr. → NFA construction
- NFA → DFA conversion
- DFA simulation for the lexical analyzer
Recall the Lex structure: a list of Pattern / Action pairs.
- Each pattern recognizes lexemes.
- Each pattern is described by a regular expression, e.g. (abc)*ab, (a | b)*abb, etc.
- Each regular expression is turned into a recognizer.
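As a minimal sketch of the DFA simulation step, the C fragment below runs a hand-written transition table for (a | b)*abb over an input string, doing one table lookup per input character. The table layout, the state numbering 0 to 3, and the name dfa_accepts are assumptions made for illustration; they are not taken from the slides or from Lex.

```c
#include <stdbool.h>

/* Assumed Dtran for (a | b)*abb: rows are states 0..3,
   columns are the input symbols a and b. State 3 is accepting. */
static const int dtran[4][2] = {
    {1, 0},   /* state 0: nothing useful seen yet */
    {1, 2},   /* state 1: seen "a"                */
    {1, 3},   /* state 2: seen "ab"               */
    {1, 0},   /* state 3: seen "abb" (accepting)  */
};

/* Returns true iff the whole string x is in the language of (a | b)*abb.
   Any character other than 'a' or 'b' rejects immediately. */
bool dfa_accepts(const char *x)
{
    int state = 0;
    for (; *x != '\0'; x++) {
        if (*x != 'a' && *x != 'b')
            return false;
        state = dtran[state][*x - 'a'];
    }
    return state == 3;
}
```

Calling dfa_accepts("babb") follows the states 0, 0, 1, 2, 3 and accepts, while dfa_accepts("abba") ends in state 1 and rejects.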

Lex Specification → Lexical Analyzer
Let P1, P2, …, Pn be Lex patterns (regular expressions for valid tokens in the programming language).
1. Construct N(P1), N(P2), …, N(Pn). Note: the accepting state of N(Pi) is marked by Pi.
2. Construct the combined NFA: a new start state with an ε-transition to the start state of each of N(P1), N(P2), …, N(Pn).
3. Lex applies the conversion algorithm to construct an equivalent DFA.

Pictorially
(a) Lex compiler: Lex specification → Lex compiler → transition table.
(b) Schematic lexical analyzer: the FA simulator reads the input buffer, consults the transition table, and delimits each lexeme.

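Example
Three patterns: P1: a, P2: abb, P3: a*b+
NFAs (state numbers as used in the combined NFA that follows):
- N(P1): 1 -a-> 2, with state 2 accepting for P1
- N(P2): 3 -a-> 4 -b-> 5 -b-> 6, with state 6 accepting for P2
- N(P3): 7 -a-> 7, 7 -b-> 8, 8 -b-> 8, with state 8 accepting for P3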

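Example – continued (2)
Combined NFA: a new start state 0 with ε-transitions to states 1, 3 and 7, the start states of N(P1), N(P2) and N(P3).
Example 1, input a a b a:
  state sets:       {0,1,3,7} -a-> {2,4,7} -a-> {7} -b-> {8} -a-> death
  pattern matched:  -              P1           -        P3
  The last set that contains an accepting state before death is {8}, so the lexeme aab is reported as P3.
Example 2, input a b b:
  state sets:       {0,1,3,7} -a-> {2,4,7} -b-> {5,8} -b-> {6,8}
  pattern matched:  -              P1           P3         P2, P3
  Both P2 and P3 match abb; break the tie in favor of P2, the pattern listed first.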
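The two traces above can be reproduced with a small set-of-states simulation. The sketch below is one possible way to do it, assuming each state set fits in a bitmask and that the NFA transitions are the ones implied by the state sets in the example (1 -a-> 2, 3 -a-> 4 -b-> 5 -b-> 6, 7 -a-> 7, 7 -b-> 8, 8 -b-> 8). The function names are hypothetical.

```c
#include <stddef.h>

enum { NSTATES = 9 };

/* Assumed transitions of the combined NFA, as bitmasks of target states. */
static const unsigned eps_move[NSTATES] = { [0] = (1u << 1) | (1u << 3) | (1u << 7) };
static const unsigned a_move[NSTATES]   = { [1] = 1u << 2, [3] = 1u << 4, [7] = 1u << 7 };
static const unsigned b_move[NSTATES]   = { [4] = 1u << 5, [5] = 1u << 6,
                                            [7] = 1u << 8, [8] = 1u << 8 };

/* Adds every state reachable through epsilon-transitions alone. */
static unsigned eps_closure(unsigned set)
{
    unsigned prev;
    do {
        prev = set;
        for (int s = 0; s < NSTATES; s++)
            if (set & (1u << s))
                set |= eps_move[s];
    } while (set != prev);
    return set;
}

/* States reachable from `set` on one input character (empty for other characters). */
static unsigned move_on(unsigned set, char c)
{
    const unsigned *table = (c == 'a') ? a_move : (c == 'b') ? b_move : NULL;
    unsigned out = 0;
    if (table == NULL)
        return 0;
    for (int s = 0; s < NSTATES; s++)
        if (set & (1u << s))
            out |= table[s];
    return out;
}

/* Earliest-listed pattern with an accepting state in `set`, or 0 if none. */
static int pattern_of(unsigned set)
{
    if (set & (1u << 2)) return 1;   /* P1: a    */
    if (set & (1u << 6)) return 2;   /* P2: abb  */
    if (set & (1u << 8)) return 3;   /* P3: a*b+ */
    return 0;
}

/* Returns the pattern number of the longest match at the start of `input`
   (0 if nothing matches) and stores the lexeme length in *len. */
int nfa_longest_match(const char *input, size_t *len)
{
    unsigned cur = eps_closure(1u << 0);
    int best = 0;
    *len = 0;
    for (size_t i = 0; input[i] != '\0'; i++) {
        cur = eps_closure(move_on(cur, input[i]));
        if (cur == 0)
            break;                    /* "death": no state survives */
        int p = pattern_of(cur);
        if (p != 0) {                 /* remember the last accepting set seen */
            best = p;
            *len = i + 1;
        }
    }
    return best;
}
```

On the input aaba this reports pattern 3 with length 3, and on abb it reports pattern 2 with length 3, because pattern_of checks P2 before P3.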

Example – continued (3)
Alternatively, construct the DFA (keep track of the correspondence between patterns and new accepting states):
             Input Symbol
STATE        a          b          Pattern
{0,1,3,7}    {2,4,7}    {8}        none
{2,4,7}      {7}        {5,8}      P1
{8}          -          {8}        P3
{7}          {7}        {8}        none
{5,8}        -          {6,8}      P3
{6,8}        -          {8}        P2 (break tie in favor of P2)
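The DFA states for this example can be derived mechanically with the subset construction. The following is a sketch only, assuming bitmask state sets and the NFA transitions implied by the example (1 -a-> 2, 3 -a-> 4 -b-> 5 -b-> 6, 7 -a-> 7, 7 -b-> 8, 8 -b-> 8, plus ε-moves from 0 to 1, 3 and 7). The names subset_construction and MAX_DSTATES are hypothetical.

```c
enum { NSTATES = 9, MAX_DSTATES = 16 };

/* Assumed transitions of the combined NFA, as bitmasks of target states. */
static const unsigned eps_move[NSTATES] = { [0] = (1u << 1) | (1u << 3) | (1u << 7) };
static const unsigned a_move[NSTATES]   = { [1] = 1u << 2, [3] = 1u << 4, [7] = 1u << 7 };
static const unsigned b_move[NSTATES]   = { [4] = 1u << 5, [5] = 1u << 6,
                                            [7] = 1u << 8, [8] = 1u << 8 };

/* Adds every state reachable through epsilon-transitions alone. */
static unsigned eps_closure(unsigned set)
{
    unsigned prev;
    do {
        prev = set;
        for (int s = 0; s < NSTATES; s++)
            if (set & (1u << s))
                set |= eps_move[s];
    } while (set != prev);
    return set;
}

/* Union of table[s] over all NFA states s in `set`. */
static unsigned move_on(unsigned set, const unsigned *table)
{
    unsigned out = 0;
    for (int s = 0; s < NSTATES; s++)
        if (set & (1u << s))
            out |= table[s];
    return out;
}

/* Fills dstates[i] with the NFA-state set of DFA state i and dtran[i][c]
   with the successor index on a (c = 0) or b (c = 1), or -1 if there is none.
   Returns the number of DFA states, or -1 if MAX_DSTATES is too small. */
int subset_construction(unsigned dstates[MAX_DSTATES], int dtran[MAX_DSTATES][2])
{
    const unsigned *tables[2] = { a_move, b_move };
    int n = 0;
    dstates[n++] = eps_closure(1u << 0);
    for (int i = 0; i < n; i++) {          /* states at index >= i are unmarked */
        for (int c = 0; c < 2; c++) {
            unsigned u = eps_closure(move_on(dstates[i], tables[c]));
            dtran[i][c] = -1;
            if (u == 0)
                continue;                   /* no transition: "death" */
            int j = 0;
            while (j < n && dstates[j] != u)
                j++;
            if (j == n) {                   /* a state set not seen before */
                if (n == MAX_DSTATES)
                    return -1;
                dstates[n++] = u;
            }
            dtran[i][c] = j;
        }
    }
    return n;
}
```

For this NFA the function finds six DFA states, discovered in the order {0,1,3,7}, {2,4,7}, {8}, {7}, {5,8}, {6,8}. Marking each one with the earliest-listed pattern whose accepting state it contains gives the Pattern column.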

Minimizing the Number of States of a DFA
1. Construct the initial partition Π of the state set S with two groups: accepting and non-accepting states.
2. (Construct Π_new.) For each group G of Π:
   a. Partition G into subgroups such that two states s, t of G are in the same subgroup iff, for all input symbols a, s and t have transitions on a to states in the same group of Π.
   b. Replace G in Π_new by the set of all these subgroups.
3. Compare Π_new and Π. If they are equal, set Π_final := Π and proceed to step 4; otherwise set Π := Π_new and go to step 2.
4. Merge the states in each group of Π_final into a single state of the minimized DFA.
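The partition refinement above can be sketched in C as follows. This is an illustration under stated assumptions, not the implementation Lex uses. It assumes a complete DFA (every state has a transition on every symbol) with at most MAX_STATES states. The five-state input table is a hypothetical DFA for (a | b)*abb chosen for the demonstration, and all names are made up for this sketch.

```c
#include <string.h>

enum { NSYM = 2, MAX_STATES = 16 };

/* Hypothetical input: a complete 5-state DFA for (a | b)*abb.
   States A..E are numbered 0..4; columns are the inputs a, b. */
static const int example_dtran[5][NSYM] = {
    {1, 2}, {1, 3}, {1, 2}, {1, 4}, {1, 2},
};
static const int example_accepting[5] = {0, 0, 0, 0, 1};

/* Refines the accepting / non-accepting partition until it stops changing.
   On return group[s] is the group index of state s; the return value is
   the number of groups, i.e. the number of states of the minimized DFA. */
int minimize(int n, const int dtran[][NSYM], const int accepting[], int group[])
{
    int ngroups = 2;
    for (int s = 0; s < n; s++)             /* step 1: initial partition */
        group[s] = accepting[s] ? 1 : 0;

    for (;;) {
        int newgroup[MAX_STATES];
        int count = 0;
        for (int s = 0; s < n; s++) {       /* step 2: build the new partition */
            newgroup[s] = -1;
            for (int t = 0; t < s; t++) {
                /* s joins t's subgroup iff both are in the same old group and
                   every symbol leads them into the same old group. */
                int same = group[s] == group[t];
                for (int c = 0; same && c < NSYM; c++)
                    same = group[dtran[s][c]] == group[dtran[t][c]];
                if (same) {
                    newgroup[s] = newgroup[t];
                    break;
                }
            }
            if (newgroup[s] < 0)
                newgroup[s] = count++;      /* s starts a new subgroup */
        }
        memcpy(group, newgroup, (size_t)n * sizeof group[0]);
        if (count == ngroups)               /* step 3: no group was split */
            return count;
        ngroups = count;
    }
}
```

Because each pass can only split groups, an unchanged group count means the partition itself is unchanged. On the example table the function returns 4 and puts states A and C in the same group, so they are merged in step 4.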

Example (figure): a DFA with states A, B, C, D and F over the inputs a and b. In the minimized DFA the states collapse into the two groups {A, C, D} and {B, F}.

Using LEX
Lex program structure:
declarations
%%
translation rules
%%
auxiliary procedures
Name the file, e.g. test.lex. Then "lex test.lex" produces the file "lex.yy.c" (a C program).

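LEX
%{
/* definitions of all constants
   LT, LE, EQ, NE, GT, GE, IF, THEN, ELSE, ... */
%}
letter    [A-Za-z]
digit     [0-9]
id        {letter}({letter}|{digit})*
%%
if        { return(IF); }
then      { return(THEN); }
{id}      { yylval = install_id(); return(ID); }
%%
install_id() {
    /* procedure to install the lexeme in the symbol table (ST) */
}
The %{ ... %} block holds the C declarations, the named patterns below it are the Lex declarations, the part between the two %% lines holds the rules, and install_id() is an auxiliary procedure.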

Example of a Lex Program
%{
#include <stdio.h>
int num_lines = 0, num_chars = 0;
%}
%%
\n      { ++num_lines; ++num_chars; }
.       { ++num_chars; }
%%
int main(int argc, char **argv)
{
    ++argv, --argc;  /* skip over program name */
    if (argc > 0)
        yyin = fopen(argv[0], "r");
    else
        yyin = stdin;
    yylex();
    printf("# of lines = %d, # of chars = %d\n", num_lines, num_chars);
    return 0;
}

Another Example
%{
#include <stdio.h>
%}
WS      [ \t\n]*
%%
[0-9]+                  printf("NUMBER\n");
[a-zA-Z][a-zA-Z0-9]*    printf("WORD\n");
{WS}                    /* do nothing */
.                       printf("UNKNOWN\n");
%%
int main(int argc, char **argv)
{
    ++argv, --argc;  /* skip over program name */
    if (argc > 0)
        yyin = fopen(argv[0], "r");
    else
        yyin = stdin;
    yylex();
    return 0;
}

Concluding Remarks
Focused on the lexical analysis process, including:
- Regular expressions
- Finite automata
- Conversion
- Lex
- Interplay among all these aspects of lexical analysis
Looking ahead: the next step in the compilation process is parsing:
- Top-down vs. bottom-up
- Relationship to language theory