Published by Lillian Benson. Modified over 9 years ago.
Chapter 3 Scanning - Theory and Practice
Prof Chung. 10/8/2015

Outline
3.1 Overview
3.2 Regular Expressions
3.4 Finite Automata and Scanners
3.5 Using a Scanner Generator (LEX, introduced in the TA course: LEX introduction)
3.7 Practical Considerations
3.8 Translating Regular Expressions into Finite Automata
3.9 Summary
Modified from http://www.cs.ualberta.ca/~amaral/courses/680/

Overview (1)
Formal notations for specifying the precise structure of tokens are necessary.
Example: a quoted string in Pascal. Can a string be split across lines? Is a null string allowed?
Is .1 or 10. OK? The 1..10 problem.
Scanner generators produce tables or programs.
What formal notation should be used?

Overview (2)
Role of the lexical analyzer (scanner):
Produce a sequence of tokens for the parser
Strip out comments and whitespace
Associate a line number with each error message
Expand macros
[Figure: the source program feeds the lexical analyzer, which returns a token to the parser on each getNextToken request; both use the symbol table, and the parser's output goes on to semantic analysis.]

Regular Expressions (1)
Tokens are built from symbols of a finite vocabulary.
The structure of tokens is defined with regular expressions.
Set definition: the sets of strings defined by regular expressions are termed regular sets.
∅ is a regular expression denoting the empty set.
ε is a regular expression denoting the set that contains only the empty string.
A string s is a regular expression denoting a set containing only s.

Regular Expressions (2)
If A and B are regular expressions, so are:
A | B (alternation): a regular expression formed by A or B. (a)|(b) = {a, b}
AB or A·B (concatenation): a regular expression formed by A followed by B. (a)(b) = {ab}
A* (Kleene closure): a regular expression formed by zero or more repetitions of A. a* = {ε, a, aa, aaa, …}
More complex example: (a|b|c)* = {ε, a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc, …}

Regular Expressions (3)
Some notational conveniences:
P+ = PP* (at least one occurrence)
Not(A) = V - A
Not(S) = V* - S
A^k = AA…A (k copies)
A? : optional, zero or one occurrence of A
More complex examples:
Let D = (0 | 1 | 2 | 3 | 4 | ... | 9)
Let L = (A | B | ... | Z | a | b | ... | z)
comment = -- Not(EOL)* EOL      ex: --hello12_34 \n
decimal = D+ . D+               ex: 123.456
ident = L (L | D | _)*          ex: A1a5_6
comments = ((# | ε) Not(#))*    ex: #A435#3
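
Regular Expressions (4)
Are regular expressions as powerful as CFGs?
No. The set { [^i ]^i | i ≥ 1 } (i opening brackets followed by i closing brackets) cannot be defined by a regular expression.
Regular grammar: every production has the form
A → a and A → aB
or
A → a and A → Ba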

Finite Automata and Scanners (1)
A finite automaton (FA) can be used to recognize the tokens specified by a regular expression.
An FA consists of:
A finite set of states S
A set of input symbols Σ (the input symbol alphabet, i.e. the vocabulary V)
A set of transitions (or moves) from one state to another, labeled with characters in V
A special start state s0 (only one)
A set of final, or accepting, states F
FA = {S, Σ, s0, F, move}

Finite Automata and Scanners (2)
[Figure legend: the symbols used for a transition, a state, a final state, and the start state.]
An example follows on the next slide.

Finite Automata and Scanners (3)
Example: a transition diagram. This machine accepts (abc+)+.
[Figure: transition diagram for (abc+)+ with transitions labeled a, b, and c.]

Finite Automata and Scanners (4)
Another example: (0|1)*0(0|1)(0|1)
[Figure: transition diagram with states 1 to 5; the first part, labeled (0|1)*0, is followed by two transitions on 0,1.]

Finite Automata and Scanners (5)
Another example: ID = L(L|D)*(_(L|D)+)*
A data structure can be translated for many REs or FAs.
[Figure: transition diagram for ID with transitions labeled L, L | D, and _; one part is labeled L(L|D)* and the other (_(L|D)+)*.]
Question: both * subexpressions end in a final state; what is the difference? Answer: the "_".

Finite Automata and Scanners (6)
Another example: RealLit = (D+ (ε | .)) | (D* . D+)

Finite Automata and Scanners (7)
Two kinds of FA:
Deterministic: the next transition is unique.
Non-deterministic: otherwise.
[Figure: a state with two outgoing transitions that are both labeled a.] Which path should we select?

Finite Automata and Scanners (8)
A transition diagram and the corresponding transition table.
[Figure: states 1 to 4; 1 --/--> 2 --/--> 3, state 3 loops on Not(Eol), and 3 --Eol--> 4.]
State   /    Eol   a    b    …
1       2
2       3
3       3    4     3    3    3
4

Finite Automata and Scanners (9)
Any regular expression can be translated into a DFA that accepts the set of strings denoted by the regular expression.
The translation can be done:
Automatically, by a scanner generator: LEX (TA course)
Manually, by a programmer
A DFA can be coded in two forms:
1. Table-driven, commonly produced by a scanner generator
2. Explicit control, produced automatically or by hand

Finite Automata and Scanners (10)
Scanner driver interpreting a transition table (table-driven):
    /* Note: CurrentChar is already set to the current input character. */
    State = StartState;
    while (TRUE) {
        NextState = T[State][CurrentChar];
        if (NextState == ERROR)
            break;
        State = NextState;
        CurrentChar = getchar();
    }
    if (is_final_state(State))
        ; /* Return or process valid token. */
    else
        lexical_error(CurrentChar);
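
A complete version of the table-driven driver can be written for the four-state comment DFA of slide (8). This is a sketch under assumptions: it reads from a string instead of getchar(), it groups characters into three classes to keep the table small, and the names ERROR_STATE, classify, and scan_comment are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { ERROR_STATE = 0, START_STATE = 1, FINAL_STATE = 4 };
enum char_class { CC_SLASH, CC_EOL, CC_OTHER, NUM_CLASSES };

/* T[state][class]; row 0 is unused so that states keep their numbers 1..4. */
static const int T[5][NUM_CLASSES] = {
    /*               '/'          Eol          other      */
    /* 0 */ { ERROR_STATE, ERROR_STATE, ERROR_STATE },
    /* 1 */ {           2, ERROR_STATE, ERROR_STATE },
    /* 2 */ {           3, ERROR_STATE, ERROR_STATE },
    /* 3 */ {           3,           4,           3 },
    /* 4 */ { ERROR_STATE, ERROR_STATE, ERROR_STATE },
};

/* Maps a character to its column in the transition table. */
static enum char_class classify(char c) {
    if (c == '/') return CC_SLASH;
    if (c == '\n') return CC_EOL;
    return CC_OTHER;
}

/* Returns the length of the comment at the start of input, or 0 if none. */
size_t scan_comment(const char *input) {
    assert(input != NULL);
    int state = START_STATE;
    size_t pos = 0;
    while (input[pos] != '\0') {
        int next_state = T[state][classify(input[pos])];
        if (next_state == ERROR_STATE)
            break;                      /* no move: stop scanning */
        state = next_state;
        pos++;
    }
    return state == FINAL_STATE ? pos : 0;
}
```

The driver loop never mentions comments; only the table T does, which is why a scanner generator can emit the table and reuse one fixed driver.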

Finite Automata and Scanners (11)
Scanner with fixed token definition (explicit control):
    if (CurrentChar == '/') {
        CurrentChar = getchar();
        if (CurrentChar == '/') {
            /* Skip the rest of the comment line. */
            do {
                CurrentChar = getchar();
            } while (CurrentChar != '\n' && CurrentChar != EOF);
        } else {
            ungetc(CurrentChar, stdin);
            lexical_error(CurrentChar);
        }
    } else
        lexical_error(CurrentChar);
    /* Return or process valid token. */

Finite Automata and Scanners (12)
Transducer
We may perform some actions during state transitions.
A scanner can be turned into a transducer by the appropriate insertion of actions based on state transitions.

Using a Scanner Generator
Covered by the TA.

Practical Considerations (1)
Reserved Words
Usually, all keywords are reserved in order to simplify parsing.
If keywords were not reserved, in Pascal we could even write:
    begin begin; end; end; begin; end
    if else then if = else;
The problem with reserved words is that they are too numerous. COBOL has several hundred reserved words, for example ZERO, ZEROS, and ZEROES.

Practical Considerations (2)
Compiler Directives and Listing Source Lines
Compiler options (e.g. optimization, profiling) are handled by the scanner or by semantic routines.
Complex pragmas are treated like other statements.
Source inclusion (e.g. #include in C) is handled by the preprocessor or the scanner.
Conditional compilation (e.g. #if, #endif in C) is useful for creating program versions.

Practical Considerations (3)
Entry of Identifiers into the Symbol Table
Who is responsible for entering symbols into the symbol table? The scanner?
Consider this example:
    { int abc; … { int abc; } }

Practical Considerations (4)
How to handle end-of-file?
Create a special EOF token. An EOF token is useful in a CFG.
Multicharacter Lookahead
Blanks are not significant in Fortran:
    DO 10 I = 1,100    beginning of a loop
    DO 10 I = 1.100    an assignment statement (to the variable DO10I)
A Fortran scanner can determine whether the O is the last character of a DO token only after reading as far as the comma.

Practical Considerations (5)
Multicharacter Lookahead (cont'd)
In Ada and Pascal, scanning 10..100 must yield three tokens: 10, .., and 100. This needs two-character lookahead after the 10.
It is easy to build a scanner that can perform general backup.
If we reach a situation in which we are not in a final state and cannot scan any more characters, we extract characters from the right end of the buffer and queue them for rescanning, until we reach a prefix of the scanned characters flagged as a valid token.
An example follows on the next slide.

Practical Considerations (6) to (13)
An FA That Scans Integer and Real Literals and the Subrange Operator
[Figure: transition diagram with transitions labeled D and ●; the following slides step through it one input character at a time.]
Input string: 12.3e+q
Buffered token    Token flag
1                 Integer literal
12                Integer literal
12.               Invalid
12.3              Real literal
12.3e             Invalid
12.3e+            Invalid
After 12.3e+ the scanner cannot scan any more characters and is not in an accepting state, so backup is invoked: it backs up to 12.3, the longest buffered prefix flagged as a valid token, and queues e+ for rescanning.
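
The backup rule can be sketched by remembering the last position at which the buffered prefix was flagged as a valid token. In the code below the state numbers, the exponent transitions (e, an optional sign, then digits), and the names next_state, flag_of, and scan_with_backup are assumptions; the slides do not number the states of this FA.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum token_kind { TK_INVALID, TK_INT, TK_REAL, TK_DOTDOT };

/* Hypothetical state numbering; -1 means there is no transition. */
static int next_state(int state, char c) {
    int d = isdigit((unsigned char)c);
    switch (state) {
    case 0: return d ? 1 : (c == '.' ? 7 : -1);
    case 1: return d ? 1 : (c == '.' ? 2 : -1);   /* 12   */
    case 2: return d ? 3 : -1;                    /* 12.  */
    case 3: return d ? 3 : (c == 'e' ? 4 : -1);   /* 12.3 */
    case 4: return d ? 6 : ((c == '+' || c == '-') ? 5 : -1);
    case 5: return d ? 6 : -1;
    case 6: return d ? 6 : -1;
    case 7: return c == '.' ? 8 : -1;             /* ..   */
    default: return -1;
    }
}

/* Token flag of a state: TK_INVALID for non-accepting states. */
static enum token_kind flag_of(int state) {
    switch (state) {
    case 1: return TK_INT;
    case 3: case 6: return TK_REAL;
    case 8: return TK_DOTDOT;
    default: return TK_INVALID;
    }
}

/* Scans as far as possible, then backs up to the longest prefix that was
 * flagged as a valid token. Returns that prefix length (0 if none). */
size_t scan_with_backup(const char *input, enum token_kind *kind) {
    assert(input != NULL && kind != NULL);
    int state = 0;
    size_t pos = 0, last_valid = 0;
    *kind = TK_INVALID;
    while (input[pos] != '\0') {
        int next = next_state(state, input[pos]);
        if (next < 0)
            break;                      /* cannot scan any more characters */
        state = next;
        pos++;
        if (flag_of(state) != TK_INVALID) {
            last_valid = pos;           /* remember the latest valid prefix */
            *kind = flag_of(state);
        }
    }
    return last_valid;                  /* characters past this are rescanned */
}
```

On 12.3e+q the function reports a real literal of length 4, and on 10..100 an integer literal of length 2, leaving .. and 100 for the following calls.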

Outline
3.1 Overview
3.2 Regular Expressions
3.3 Finite Automata and Scanners
3.4 Using a Scanner Generator
3.5 Practical Considerations
3.6 Translating Regular Expressions into Finite Automata
    Creating Deterministic Automata
    Optimizing Finite Automata
3.7 Tracing Example

Translating Regular Expressions into Finite Automata (1)
Regular expressions are equivalent to FAs.
The main job of a scanner generator is to transform a regular expression definition into an equivalent FA:
regular expression → nondeterministic FA → deterministic FA → optimized deterministic FA (minimize the number of states)
The important step is NFA → DFA.

Translating Regular Expressions into Finite Automata (2)
We can transform any regular expression into an NFA with the following properties:
There is a unique final state
The final state has no successors
Every other state has either one or two successors
Example: a nondeterministic finite automaton (NFA) for the regular expression (a|b)*abb, with input babb.
[Figure: NFA with states 0 to 3 and transitions labeled a and b.]

Translating Regular Expressions into Finite Automata (3)
We need to review the definition of regular expressions:
Item 1: ε, the null string
Item 2: a, a character of the vocabulary
Item 3: |, the "or" operation. Example: A|B
Item 4: ●, the operation of catenation. Example: AB
Item 5: *, the operation of repetition. Example: A*
More examples follow on the next slides.

Translating Regular Expressions into Finite Automata (4)
NFA for ε (the null string).
NFA for a (a one-character string): a character of the vocabulary.
[Figure: a two-state NFA with a single transition labeled a.]

Translating Regular Expressions into Finite Automata (5)
NFA for A | B
[Figure: the NFA for A and the NFA for B placed in parallel.]

Translating Regular Expressions into Finite Automata (6)
NFA for A ● B
[Figure: the NFA for A followed by the NFA for B.]

Translating Regular Expressions into Finite Automata (7)
NFA for A*
[Figure: the NFA for A with one path for zero repetitions and one path for one or more repetitions.]

Translating Regular Expressions into Finite Automata (8) to (10)
Construct an NFA for the regular expression 01*|1, i.e. (0(1*))|1.
Step 1: build the NFA for 1*.
Step 2: connect the NFA for 0 in front of it.
Step 3: add the alternative |1 from the start state.

Translating Regular Expressions into Finite Automata (11)
What is the problem with an NFA? Answer: it may be ambiguous, which makes it difficult to program.
A nondeterministic finite automaton (NFA) for (a|b)*abb:
[Figure: states 0 to 3; state 0 is the start state, with transitions labeled a and b.]
Input: babb. After processing ba the choice is ambiguous: which transition should we select?

Translating Regular Expressions into Finite Automata (12)
A deterministic finite automaton (DFA) for b*abb:
[Figure: states 0 to 3; state 0 is the start state, with transitions labeled a and b.]
Input: babb. After processing ba there is no ambiguity: there is a unique path for the remaining bb.

Creating Deterministic Automata (1)
The transformation from an NFA N to an equivalent DFA M works by what is sometimes called the subset construction.
An example of each step follows.
Initial NFA: 01*|1, i.e. (0(1*))|1
[Figure: NFA with states 1 to 10; state 1 is the start state, with transitions labeled 0 and 1.]

Creating Deterministic Automata (2)
Step 1: compute the ε-closure of every NFA state.
[Figure: the NFA for 01*|1 with states 1 to 10; the following slides highlight one closure at a time.]
1. ε-closure(1) = {1, 2, 8}
2. ε-closure(2) = {2}
3. ε-closure(3) = {3, 4, 5, 7, 10}
4. ε-closure(4) = {4, 5, 7, 10}
5. ε-closure(5) = {5}
6. ε-closure(6) = {5, 6, 7, 10}
7. ε-closure(7) = {5, 7, 10}
8. ε-closure(8) = {8}
9. ε-closure(9) = {9, 10}
10. ε-closure(10) = {10}
These are all the closures. Closures that are subsets of another closure are then deleted, which leaves ε-closure(1), ε-closure(3), ε-closure(6), and ε-closure(9).

Creating Deterministic Automata (3)
Step 1: The initial state of M is the set of states reachable from the initial state of N by ε-transitions.
This set is usually called the λ-closure or ε-closure.
This is the algorithm behind the example above.
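
The closure computation can be sketched as a fixed-point loop over a bit mask. The ε-edges below are an assumed reconstruction of the NFA for 01*|1, chosen so that the results reproduce the closures listed on the slides (the diagram itself is not recoverable); the names eps and close_set are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_NFA_STATES = 10 };

/* eps[s] is the set of states reachable from s by one epsilon-transition.
 * Bit k stands for state k; bit 0 is unused. Assumed reconstruction. */
static const uint32_t eps[NUM_NFA_STATES + 1] = {
    [1] = (1u << 2) | (1u << 8),
    [3] = 1u << 4,
    [4] = (1u << 5) | (1u << 7),
    [6] = 1u << 7,
    [7] = (1u << 5) | (1u << 10),
    [9] = 1u << 10,
};

/* Adds to `set` every state reachable through epsilon-transitions alone. */
uint32_t close_set(uint32_t set) {
    assert((set >> (NUM_NFA_STATES + 1)) == 0);   /* only states 1..10 */
    uint32_t previous;
    do {
        previous = set;
        for (int s = 1; s <= NUM_NFA_STATES; s++) {
            if (set & (1u << s))
                set |= eps[s];
        }
    } while (set != previous);                    /* stop at the fixed point */
    return set;
}
```

Calling close_set(1u << 1) yields the mask for {1, 2, 8}, the initial state of M.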

Creating Deterministic Automata (4)
Step 2: create the successor states.
Take any state S of M and any character c, and compute S's successor under c.
S is identified with some set of N's states, {n1, n2, …}.
Find all possible successor states to {n1, n2, …} under c, obtaining a set {m1, m2, …}.
T = close({m1, m2, …})
The transition is S = {n1, n2, …} → T = close({m1, m2, …}).

Creating Deterministic Automata (7)
Step 2, as pseudocode:
    void make_deterministic(nondeterministic_fa N, deterministic_fa *M)
    {
        set_of_fa_states T;
        M->initial_state = SET_OF(N.initial_state);
        close(&M->initial_state);
        Add M->initial_state to M->states;
        while (states or transitions can be added) {
            choose S in M->states and c in Alphabet;
            T = SET_OF(y in N.states SUCH_THAT x->y under c for some x in S);
            close(&T);
            if (T not in M->states)
                add T to M->states;
            Add the transition to M->transitions: S->T under c;
        }
        M->final_states = SET_OF(S in M->states SUCH_THAT N.final_state in S);
    }

Creating Deterministic Automata (5)
Step 2: first renumber the closures to simplify the work flow.
A = ε-closure(1) = {1, 2, 8}
B = ε-closure(3) = {3, 4, 5, 7, 10}
C = ε-closure(6) = {5, 6, 7, 10}
D = ε-closure(9) = {9, 10}

Creating Deterministic Automata (6)
The resulting DFA:
A --0--> B
A --1--> D
B --1--> C
C --1--> C
A is the start state. B, C, and D are final states. D has no outgoing transitions.
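
The pseudocode can be turned into a small runnable sketch for the 01*|1 example. Everything specific here is an assumption made for illustration: the NFA edges are a reconstruction that reproduces the closures on the slides, sets are bit masks, the NFA is fixed in the code instead of being passed as a parameter, and struct dfa, find_or_add, and close_set are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

enum { NFA_STATES = 10, NFA_FINAL = 10, ALPHABET = 2, MAX_DFA = 16 };
#define BIT(s) (1u << (s))

/* Assumed reconstruction of the NFA for 01*|1 (state k is bit k). */
static const uint32_t eps[NFA_STATES + 1] = {
    [1] = BIT(2) | BIT(8), [3] = BIT(4), [4] = BIT(5) | BIT(7),
    [6] = BIT(7), [7] = BIT(5) | BIT(10), [9] = BIT(10),
};
/* on_char[s][c]: states reached from s on character '0' + c. */
static const uint32_t on_char[NFA_STATES + 1][ALPHABET] = {
    [2] = { BIT(3), 0 }, [5] = { 0, BIT(6) }, [8] = { 0, BIT(9) },
};

struct dfa {
    int num_states;
    uint32_t states[MAX_DFA];     /* each DFA state is a set of NFA states */
    int next[MAX_DFA][ALPHABET];  /* -1 means no transition */
    int is_final[MAX_DFA];
};

static uint32_t close_set(uint32_t set) {
    uint32_t previous;
    do {
        previous = set;
        for (int s = 1; s <= NFA_STATES; s++)
            if (set & BIT(s)) set |= eps[s];
    } while (set != previous);
    return set;
}

/* Returns the index of `set` in m->states, adding it if it is new. */
static int find_or_add(struct dfa *m, uint32_t set) {
    for (int i = 0; i < m->num_states; i++)
        if (m->states[i] == set) return i;
    assert(m->num_states < MAX_DFA);
    m->states[m->num_states] = set;
    m->is_final[m->num_states] = (set & BIT(NFA_FINAL)) != 0;
    return m->num_states++;
}

void make_deterministic(struct dfa *m) {
    m->num_states = 0;
    find_or_add(m, close_set(BIT(1)));            /* initial state of M */
    /* num_states grows inside the loop, so new states are processed too. */
    for (int i = 0; i < m->num_states; i++) {
        for (int c = 0; c < ALPHABET; c++) {
            uint32_t t = 0;
            for (int s = 1; s <= NFA_STATES; s++)
                if (m->states[i] & BIT(s)) t |= on_char[s][c];
            m->next[i][c] = t ? find_or_add(m, close_set(t)) : -1;
        }
    }
}
```

Run on this NFA, the sketch finds four DFA states. In discovery order they are A = {1,2,8}, B = {3,4,5,7,10}, D = {9,10}, and C = {5,6,7,10}, with the same transitions as the DFA on slide (6).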

Optimizing Finite Automata (1)
Minimize the number of states.
Every DFA has a unique smallest equivalent DFA.
Given a DFA M, we use its transition table to construct the equivalent minimal DFA.
Initially, we draw the transition table from the DFA diagram.
[Figure: the DFA with states A, B, C, D.]
State   0    1
A       B    D
B            C
C            C
D
A: start state. B, C, D: final states.

Optimizing Finite Automata (2)
Minimize the number of states.
B is equivalent to C: both are final states and both have the same row in the table (a single transition, on 1, to C), so B can be merged into C. D is also final but has no transition on 1, so it stays separate.
New transition table:
State    0        1
A        {B, C}   D
{B, C}            {B, C}
D
[Figure: the new DFA with states A, {B, C}, and D.]

Additional simplifying rules (removing parentheses)
"*" has the highest precedence and is left associative.
Concatenation has the second highest precedence and is left associative.
"|" has the lowest precedence and is left associative.
E.g., (a)|((b)*(c)) == a|b*c

Tracing Example (1)
Review: steps of a scanner generator
regular expression → nondeterministic FA → deterministic FA → optimized deterministic FA (minimize the number of states)
The important step is NFA → DFA.

Tracing Example (2)
Regular expressions (IF and IFA):
    if               {return IF;}
    [a-z][a-z0-9]*   {return ID;}
    [0-9]+           {return NUM;}
    .                {error();}

Tracing Example (3)
Translate from RE to NFA (first step: regular expression → nondeterministic FA).

Tracing Example (4)
if {return IF;}
The NFA for the symbol i is: start → 1 --i--> 2
The NFA for the symbol f is: start → 1 --f--> 2
The NFA for the regular expression if is: start → 1 --i--> 2 --f--> 3 (IF)

Tracing Example (5)
[a-z][a-z0-9]* {return ID;}
[Figure: NFA with states 1 to 4 and transitions labeled a-z and 0-9; the final state returns ID.]

Tracing Example (6)
[0-9]+ {return NUM;}
[Figure: NFA with states 1 to 5 and transitions labeled 0-9; the final state returns NUM.]

Tracing Example (9)
[Figure: the four NFAs side by side: IF (states 1 to 3), ID (states 1 to 4), NUM (states 1 to 5), and error (states 1 and 2, with a transition on any character but \n).]

Tracing Example (10)
Translate from NFA to DFA (second step: nondeterministic FA → deterministic FA).

Tracing Example (11)
Full NFA diagram
[Figure: the combined NFA with states 1 to 15; transitions labeled i, f, a-z, 0-9, and any character; final states for IF, ID, NUM, and error.]
Special case: handled in the final state.

Tracing Example (12)
ε-closures of the full NFA:
1. ε-closure(1) = {1, 4, 9, 14}
2. ε-closure(2) = {2}
3. ε-closure(3) = {3}
4. ε-closure(5) = {5, 6, 8}
5. ε-closure(7) = {7, 8}
6. ε-closure(8) = {6, 8}
7. ε-closure(10) = {10, 11, 13}
8. ε-closure(12) = {12, 13}
9. ε-closure(13) = {11, 13}
10. ε-closure(15) = {15}

Tracing Example (13) to (23)
DFA states = {1-4-9-14}. Now we need to compute the successors of 1-4-9-14:
move(1-4-9-14, a-h) = {5, 15}; ε-closure({5, 15}) = {5, 6, 8, 15}, giving state 5-6-8-15
move(1-4-9-14, i) = {2, 5, 15}; ε-closure({2, 5, 15}) = {2, 5, 6, 8, 15}, giving state 2-5-6-8-15
move(1-4-9-14, j-z) = {5, 15}; ε-closure({5, 15}) = {5, 6, 8, 15}, giving state 5-6-8-15 again
move(1-4-9-14, 0-9) = {10, 15}; ε-closure({10, 15}) = {10, 11, 13, 15}, giving state 10-11-13-15
move(1-4-9-14, other) = {15}; ε-closure({15}) = {15}, giving state 15

Tracing Example (24)
The analysis for 1-4-9-14 is complete. We mark it and pick another state in the DFA to analyze. (Practice)

Tracing Example (25)
The complete DFA:
1-4-9-14 --a-h, j-z--> 5-6-8-15 (ID)
1-4-9-14 --i--> 2-5-6-8-15 (ID)
1-4-9-14 --0-9--> 10-11-13-15 (NUM)
1-4-9-14 --other--> 15 (error)
2-5-6-8-15 --f--> 3-6-7-8 (IF)
2-5-6-8-15 --a-e, g-z, 0-9--> 6-7-8 (ID)
5-6-8-15 --a-z, 0-9--> 6-7-8
3-6-7-8 --a-z, 0-9--> 6-7-8
6-7-8 --a-z, 0-9--> 6-7-8
10-11-13-15 --0-9--> 11-12-13 (NUM)
11-12-13 --0-9--> 11-12-13
See p. 118 of Aho-Sethi-Ullman and p. 29 of Appel.

Tracing Example (26)
Minimize the DFA (third step: deterministic FA → optimized deterministic FA).

Tracing Example (27)
Transition table of the DFA with states A to H:
State   0-9   a-e   f    g-h   i    j-z   other
A       D     C     C    C     B    C     E
B       G     G     F    G     G    G     -
C       G     G     G    G     G    G     -
D       H     -     -    -     -    -     -
E       -     -     -    -     -    -     -
F       G     G     G    G     G    G     -
G       G     G     G    G     G    G     -
H       H     -     -    -     -    -     -

Tracing Example (28)
New transition table 1 (F and G merged into C, H merged into D):
State   0-9   a-e   f    g-h   i    j-z   other
A       D     C     C    C     B    C     E
B       C     C     C    C     C    C     -
C       C     C     C    C     C    C     -
D       D     -     -    -     -    -     -
E       -     -     -    -     -    -     -

Tracing Example (29)
New transition table 2 (C merged into B):
State   0-9   a-e   f    g-h   i    j-z   other
A       D     B     B    B     B    B     E
B       B     B     B    B     B    B     -
D       D     -     -    -     -    -     -
E       -     -     -    -     -    -     -

Tracing Example (30)
New DFA, with B = C = F = G and D = H:
A --a-z--> B (IF, ID)
A --0-9--> D (NUM)
A --other--> E (error)
B --a-z, 0-9--> B
D --0-9--> D
IF can be handled by look-ahead programming.
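
Because the merged state B no longer distinguishes IF from ID, the scanner has to separate them in code. The sketch below follows the new DFA with explicit control and then compares the scanned lexeme with "if". This keyword check is one common way to do it and is an assumption here, not necessarily the look-ahead technique the slide has in mind; the names next_token and enum token are hypothetical, and the default C locale is assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum token { TOK_ERROR, TOK_IF, TOK_ID, TOK_NUM };

/* Scans one token at the start of input using the merged DFA (A, B, D, E)
 * and stores its length in *length. */
enum token next_token(const char *input, size_t *length) {
    assert(input != NULL && length != NULL);
    size_t n = 0;
    if (islower((unsigned char)input[0])) {
        /* A --a-z--> B, then B loops on a-z and 0-9. */
        do {
            n++;
        } while (islower((unsigned char)input[n]) ||
                 isdigit((unsigned char)input[n]));
        *length = n;
        /* B covers both IF and ID, so the lexeme decides which one it is. */
        return (n == 2 && strncmp(input, "if", 2) == 0) ? TOK_IF : TOK_ID;
    }
    if (isdigit((unsigned char)input[0])) {
        /* A --0-9--> D, then D loops on 0-9. */
        while (isdigit((unsigned char)input[n]))
            n++;
        *length = n;
        return TOK_NUM;
    }
    /* A --other--> E: any other single character is an error token. */
    *length = input[0] != '\0' ? 1 : 0;
    return TOK_ERROR;
}
```

With this check, the input if yields IF while ifa yields ID of length 3, since the longest match is taken before the lexeme is compared.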

Chapter 3 End
Any questions?
In-class quiz (1+): What is the optimized DFA for 1+?

Quiz solution
1+ = 1 · 1* (this method can be used).
[Figure: NFA with states 1 to 4 and a transition labeled 1.]
1. ε-closure(1) = {1, 2}
2. ε-closure(2) = {2}
3. ε-closure(3) = {3, 4, 2}
4. ε-closure(4) = {4, 2}
A = ε-closure(1) = {1, 2}
B = ε-closure(3) = {3, 4, 2}
State   0    1
A            B
B            B
[Figure: DFA with start state A = {1, 2}, A --1--> B = {3, 4, 2}, and B --1--> B.]
The DFA cannot be optimized (merged) any further, because A is the start state and B is the final state.