Published by Mary Carpenter. Modified over 8 years ago.
1
Chapter 2-II: Scanning
Sung-Dong Kim
Dept. of Computer Engineering, Hansung University
2
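Finite Automata
- Describe the process of recognizing patterns in input strings; used to construct scanners
- Example: identifier = letter(letter|digit)*
- [Figure: DFA with start state 1 and accepting state 2; 1 -letter-> 2, and state 2 loops on letter and digit]
- Process of recognizing xtemp: 1 -x-> 2 -t-> 2 -e-> 2 -m-> 2 -p-> 2
(2010-1) Compiler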
3
Definition of Deterministic FA (1)
- DFA: an automaton in which the next state is uniquely determined by the current state and the current input character
- DFA M = (Σ, S, T, s₀, A), where
  - Σ: alphabet
  - S: set of states
  - T: S × Σ → S: state transition function
  - s₀ ∈ S: start state
  - A ⊆ S: set of accepting states
4
Definition of Deterministic FA (2)
- Language accepted by M, L(M): the set of strings of characters c₁c₂…cₙ, with each cᵢ ∈ Σ, that satisfy the following conditions
  - s₁ = T(s₀, c₁), s₂ = T(s₁, c₂), …, sₙ = T(sₙ₋₁, cₙ)
  - sₙ is an element of A (i.e., an accepting state)
- Diagram notation: T(s, c) = s' is drawn as an arrow labeled c from state s to state s'
5
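[Figure: DFA for identifiers with an explicit error state. start -letter-> in_id; in_id loops on letter and digit; start -other-> error, where other = ~letter; in_id -other-> error, where other = ~(letter|digit); error loops on any character]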
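The definition of L(M) can be tried out directly. The following is a minimal Python sketch of the identifier DFA with an explicit error state; the state names follow the figure, while `char_class`, `T`, and `accepts` are names chosen here for illustration, and treating every non-letter, non-digit character as "other" is an assumption.

```python
# Identifier DFA written to mirror M = (Sigma, S, T, s0, A).
# Names below are illustrative assumptions, not part of the slides.

def char_class(c):
    """Map a character to the symbol class used on the diagram's arrows."""
    if c.isalpha():
        return "letter"
    if c.isdigit():
        return "digit"
    return "other"

# Transition function T: (state, symbol class) -> next state.
T = {
    ("start", "letter"): "in_id",
    ("start", "digit"): "error",   # other = ~letter
    ("start", "other"): "error",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
    ("in_id", "other"): "error",   # other = ~(letter|digit)
    ("error", "letter"): "error",  # error loops on any character
    ("error", "digit"): "error",
    ("error", "other"): "error",
}
START = "start"
ACCEPTING = {"in_id"}

def accepts(s):
    """Return True if the DFA is in an accepting state after reading all of s."""
    state = START
    for c in s:
        state = T[(state, char_class(c))]
    return state in ACCEPTING
```

Because T is total (every state has a transition for every symbol class), the loop never needs a special case for missing transitions; the error state absorbs them.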
6
Definition of Deterministic FA (3)
- The set of strings that contain exactly one b
- The set of strings that contain at most one b
- [Figures: DFAs for the two languages, with transitions labeled b and notb]
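As a rough illustration, both languages can be recognized with the same transitions and different accepting sets. The sketch below assumes states 0 (no b seen) and 1 (one b seen); the state numbering and function names are hypothetical, and a missing transition stands in for the implicit error state.

```python
def run_dfa(transitions, start, accepting, s):
    """Simulate a DFA whose input symbols are classified as 'b' or 'notb'."""
    state = start
    for c in s:
        symbol = "b" if c == "b" else "notb"
        state = transitions.get((state, symbol))
        if state is None:  # no transition drawn: implicit error state
            return False
    return state in accepting

# State 0: no b seen yet. State 1: exactly one b seen.
# A second b has no transition, so the string is rejected.
ONE_B = {(0, "notb"): 0, (0, "b"): 1, (1, "notb"): 1}

def exactly_one_b(s):
    return run_dfa(ONE_B, 0, {1}, s)

def at_most_one_b(s):
    return run_dfa(ONE_B, 0, {0, 1}, s)
```

The only difference between the two recognizers is whether state 0 is accepting.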
7
Definition of Deterministic FA (4)
- Numeric constants
  digit = [0-9]
  nat = digit+
  signedNat = (+|-)? nat
  number = signedNat("." nat)?(E signedNat)?
- [Figure: DFA for nat, with transitions labeled digit]
8
[Figures: DFAs for signed numbers with an optional fraction, with transitions labeled +, -, digit, and "."]
9
[Figure: FA for floating-point numbers, with transitions labeled +, -, digit, ".", and E]
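One possible DFA for number = signedNat("." nat)?(E signedNat)? is sketched below in Python. The state names are invented for this sketch and are not taken from the figure; missing table entries play the role of the error state, and only an uppercase E is accepted, as in the regular definition.

```python
def symbol_class(c):
    """Classify one input character for the number DFA."""
    if c in "0123456789":
        return "digit"
    if c in "+-":
        return "sign"
    if c in ".E":
        return c
    return "other"

# (state, symbol class) -> next state; absent entries mean "error".
T = {
    ("start", "sign"): "signed",
    ("start", "digit"): "nat",
    ("signed", "digit"): "nat",
    ("nat", "digit"): "nat",
    ("nat", "."): "dot",
    ("nat", "E"): "exp",
    ("dot", "digit"): "frac",
    ("frac", "digit"): "frac",
    ("frac", "E"): "exp",
    ("exp", "sign"): "exp_signed",
    ("exp", "digit"): "exp_nat",
    ("exp_signed", "digit"): "exp_nat",
    ("exp_nat", "digit"): "exp_nat",
}
ACCEPTING = {"nat", "frac", "exp_nat"}

def is_number(s):
    """Return True if s as a whole matches the number definition."""
    state = "start"
    for c in s:
        state = T.get((state, symbol_class(c)))
        if state is None:
            return False
    return state in ACCEPTING
```

The three accepting states correspond to the three places where a number may end: after the integer part, after the fraction, and after the exponent.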
10
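Definition of Deterministic FA (5)
- Unnested comments
  [Figure: DFA for comments delimited by { and }: enter on {, loop on other (other = ~}), accept on }]
- C comments
  [Figure: DFA for comments delimited by /* and */, with transitions on /, *, and other; other = ~* inside the comment body and other = ~(*|/) after a *]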
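The C comment DFA is a good case for checking the two different meanings of "other". The sketch below assumes five states numbered 1 to 5 with state 5 accepting, plus a state 0 added here to represent the error state; the numbering and the function name are assumptions of this sketch.

```python
def is_c_comment(s):
    """Return True if s is exactly one C comment of the form /* ... */."""
    state = 1
    for c in s:
        if state == 1:            # expect the opening '/'
            state = 2 if c == "/" else 0
        elif state == 2:          # expect the opening '*'
            state = 3 if c == "*" else 0
        elif state == 3:          # comment body: other = ~*
            state = 4 if c == "*" else 3
        elif state == 4:          # just saw '*': other = ~(*|/)
            if c == "/":
                state = 5
            elif c == "*":
                state = 4
            else:
                state = 3
        else:                     # state 5: input after the closing '/'
            state = 0
        if state == 0:            # error state (added for this sketch)
            return False
    return state == 5
```

State 4 must loop on `*`, otherwise a comment ending in `**/` would be rejected.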
11
Lookahead, Backtracking, NFA (1)
- Precise algorithms: a DFA diagram alone does not fully specify one
- Problems left open
  - What happens when an error occurs
  - What action must be performed
    - on reaching an accepting state
    - when matching a character during a transition
12
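Lookahead, Backtracking, NFA (2)
- Problems in the DFA for ID
- [Figure: DFA for identifiers with an explicit error state. start -letter-> in_id; in_id loops on letter and digit; start -other-> error, where other = ~letter; in_id -other-> error, where other = ~(letter|digit); error loops on any character]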
13
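Lookahead, Backtracking, NFA (3)
- The transition from in_id to the error state is not really an error
  - A delimiter has been seen
  - We should accept and generate an identifier token
- [other] = delimiting character, treated as lookahead
  - Returned to the input string
  - Not consumed
- Principle of longest substring: keep matching letters and digits until a delimiter is found
- [Figure: start -letter-> in_id; in_id loops on letter and digit; in_id -[other]-> finish]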
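The effect of the bracketed [other] transition can be sketched in Python as a scanner that inspects the delimiter but leaves it in the input. The function name and the (lexeme, next position) return convention are assumptions made for this sketch.

```python
def scan_identifier(text, pos=0):
    """Scan one identifier starting at text[pos].

    Returns (lexeme, next_pos). If no identifier starts at pos,
    returns (None, pos) and consumes nothing.
    """
    # start state: a letter is required
    if pos >= len(text) or not text[pos].isalpha():
        return None, pos
    end = pos + 1
    # in_id state: longest substring of letters and digits
    while end < len(text) and (text[end].isalpha() or text[end].isdigit()):
        end += 1
    # text[end], if present, is the [other] lookahead character.
    # It is only inspected, not consumed: next_pos still points at it.
    return text[pos:end], end
```

For the input `xtemp=ytemp`, the call returns the lexeme `xtemp` and position 5, so the next scan starts at the `=` that ended the identifier.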
14
Lookahead, Backtracking, NFA (4)
- How do we arrive at the start state in the first place?
- Tokens beginning with different characters: :=, <=, =
- [Figure: three separate DFAs. ":" followed by "=" returns ASSIGN; "<" followed by "=" returns LE; "=" returns EQ]
15
[Figure: the three DFAs merged into one with a single start state. ":" followed by "=" returns ASSIGN; "<" followed by "=" returns LE; "=" returns EQ]
16
Lookahead, Backtracking, NFA (5)
- Tokens beginning with the same character: <=, <>, <
- [Figure: three separate DFAs. "<" followed by "=" returns LE; "<" followed by ">" returns NE; "<" returns LT]
17
[Figure: the three DFAs merged into one. After "<", the character "=" returns LE, ">" returns NE, and [other] returns LT]
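A small Python sketch of the merged DFA shows how one character of lookahead separates the three tokens. The token names come from the figure; the function name and return convention are assumptions of this sketch.

```python
def scan_relop(text, pos=0):
    """Scan a token beginning with '<' at text[pos].

    Returns (token, next_pos), or (None, pos) if text[pos] is not '<'.
    """
    if pos >= len(text) or text[pos] != "<":
        return None, pos
    # Look at the next character without committing to consume it.
    lookahead = text[pos + 1] if pos + 1 < len(text) else ""
    if lookahead == "=":
        return "LE", pos + 2
    if lookahead == ">":
        return "NE", pos + 2
    # [other]: the lookahead character stays in the input.
    return "LT", pos + 1
```

Only the LT case leaves the lookahead character unconsumed, which matches the bracketed [other] transition.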
18
Lookahead, Backtracking, NFA (6)
- ε-transition: a transition that may occur without consulting the input string
- Uses
  - Combination of DFAs: keep the original automata intact and only add a new start state
  - Explicit description of a match of the empty string
19
[Figure: the DFAs for :=, <=, and = connected by ε-transitions from a new start state; the accepting states return ASSIGN, LE, and EQ]
20
Lookahead, Backtracking, NFA (7)
- Nondeterministic finite automaton (NFA): more than one transition from a state may exist for a particular character
- NFA M = (Σ, S, T, s₀, A), where
  - Σ: alphabet
  - S: set of states
  - T: S × (Σ ∪ {ε}) → P(S): state transition function, where P(S) is the power set of S
  - s₀ ∈ S: start state
  - A ⊆ S: set of accepting states
21
Lookahead, Backtracking, NFA (8)
- Language accepted by M, L(M): the set of strings of characters c₁c₂…cₙ, with each cᵢ ∈ Σ ∪ {ε}, for which there exist states satisfying the following conditions
  - s₁ ∈ T(s₀, c₁), s₂ ∈ T(s₁, c₂), …, sₙ ∈ T(sₙ₋₁, cₙ)
  - sₙ is an element of A (i.e., an accepting state)
22
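Lookahead, Backtracking, NFA (9)
- Example 2.10
- [Figure: NFA with states 1 to 4 and transitions labeled a, b, and ε]
- Language accepted: ab+ | ab* | b*, which simplifies to (a|ε)b*
- Two sequences of transitions that accept abb
  - 1 -a-> 2 -b-> 4 -ε-> 2 -b-> 4
  - 1 -a-> 3 -ε-> 4 -ε-> 2 -b-> 4 -ε-> 2 -b-> 4
- [Figure: a second automaton with transitions labeled a and b]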
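The example can be checked mechanically. The Python sketch below does not backtrack; it uses a different technique, tracking the set of all states the NFA could be in and taking ε-closures after each step. The transition set is an assumption reconstructed from the regular expression ab+ | ab* | b* and the accepting sequences for abb, since the figure itself is not available here; `EPS`, `NFA`, `eps_closure`, and `nfa_accepts` are illustrative names.

```python
EPS = ""  # key used for ε-transitions

# T: (state, symbol) -> set of next states.
# Assumed reconstruction of the example's NFA, not copied from the figure.
NFA = {
    (1, "a"): {2, 3},
    (1, EPS): {4},
    (2, "b"): {4},
    (3, EPS): {4},
    (4, EPS): {2},
}
START = 1
ACCEPTING = {4}

def eps_closure(states):
    """All states reachable from `states` using only ε-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in NFA.get((s, EPS), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def nfa_accepts(s):
    """Return True if some sequence of transitions accepts s."""
    current = eps_closure({START})
    for c in s:
        moved = set()
        for state in current:
            moved |= NFA.get((state, c), set())
        current = eps_closure(moved)
    return bool(current & ACCEPTING)
```

Under these assumed transitions the function agrees with the simplified expression (a|ε)b* on short strings, for example accepting `abb` and `bbb` and rejecting `ba`.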
23
Implementation of FA in Code (1)
- DFA for ID
  [Figure: 1 -letter-> 2; state 2 loops on letter and digit; 2 -[other]-> 3; state 3 is accepting]
- Ad hoc method
- Better implementation
24
Ad hoc method:

  {starting in state 1}
  if the next character is a letter then
    advance the input;
    {now in state 2}
    while the next character is a letter or a digit do
      advance the input; {stay in state 2}
    end while;
    {go to state 3 without advancing the input}
    accept;
  else
    {error or other cases}
  end if;

Suitable when:
- Not too many states
- Small loops
- Small scanners
25
Better implementation:

  state := 1;
  while state = 1 or 2 do
    case state of
    1: case input character of
       letter: advance the input;
               state := 2;
       else state :=... {error or other};
       end case;
    2: case input character of
       letter, digit: advance the input;
                      state := 2; {unnecessary}
       else state := 3;
       end case;
    end case;
  end while;
  if state = 3 then accept else error;

- First case: current state
- Nested second level: input character
- Transition: assigning a new state and advancing the input
26
Implementation of FA in Code (2)
- DFA as a data structure + generic code
- Transition table
- [Figure: DFA for ID. 1 -letter-> 2; state 2 loops on letter and digit; 2 -[other]-> 3; state 3 is accepting]
27
Table-driven algorithm:

  state := 1;
  ch := next input character;
  while not Accept[state] and not error(state) do
    newstate := T[state, ch];
    if Advance[state, ch] then ch := next input character;
    state := newstate;
  end while;
  if Accept[state] then accept;

- Accept: accepting states
- Advance: transitions that advance the input
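A runnable Python sketch of the same loop for the identifier DFA is given below. The table contents are an assumption derived from the DFA for ID (states 1 to 3, with [other] not advancing the input); state 0 as the error state, the end-of-input convention, and all names are choices made for this sketch.

```python
ERROR = 0  # assumed encoding of the error state

def char_class(ch):
    """Column index of the tables; end of input ('') counts as 'other'."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

# T[state][class] -> new state
T = {
    1: {"letter": 2, "digit": ERROR, "other": ERROR},
    2: {"letter": 2, "digit": 2, "other": 3},
}
# ADVANCE[state][class]: False for the bracketed [other] transition
ADVANCE = {
    1: {"letter": True, "digit": False, "other": False},
    2: {"letter": True, "digit": True, "other": False},
}
ACCEPT = {ERROR: False, 1: False, 2: False, 3: True}

def scan(text, pos=0):
    """Generic driver: returns (lexeme, next_pos) or (None, pos) on error."""
    start = pos
    state = 1
    ch = text[pos] if pos < len(text) else ""
    while not ACCEPT[state] and state != ERROR:
        cls = char_class(ch)
        newstate = T[state][cls]
        if ADVANCE[state][cls]:
            pos += 1
            ch = text[pos] if pos < len(text) else ""
        state = newstate
    if ACCEPT[state]:
        return text[start:pos], pos
    return None, start
```

Only the tables are specific to identifiers; the `scan` loop itself would stay the same for a different DFA, which is the point of the table-driven approach.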
28
Implementation of FA in Code (3)
- Advantages
  - Reduced code size
  - The same code works for many different problems
  - Code is easier to change (maintain)
- Disadvantages
  - Tables can become very large
  - Much of the space in the arrays is wasted
29
Implementation of FA in Code (4)
- Table compression
  - Sparse-array representation
  - Slow table lookup
  - Rarely used
30
Implementation of FA in Code (5)
- NFA implementation
  - Many different sequences of transitions must be tried
  - Store up transitions that have not yet been tried
  - Backtrack to them on failure
  - Similar to algorithms that attempt to find paths in directed graphs
- A lot of backtracking is inefficient
- Alternative: NFA → DFA → coding
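The backtracking idea can be sketched in Python as a search over (state, input position) pairs, with a stack holding the alternatives not yet tried. The small NFA used here is an assumed example accepting (a|ε)b*, and the `seen` set is an addition of this sketch that prevents revisiting a configuration (and guards against ε-cycles); none of the names come from the slides.

```python
EPS = ""  # key used for ε-transitions

# Assumed example NFA for (a|ε)b*: (state, symbol) -> set of next states.
NFA = {
    (1, "a"): {2, 3},
    (1, EPS): {4},
    (2, "b"): {4},
    (3, EPS): {4},
    (4, EPS): {2},
}
START = 1
ACCEPTING = {4}

def nfa_accepts_backtracking(s):
    """Try transition sequences one at a time, backtracking on failure."""
    stack = [(START, 0)]   # configurations that have not yet been tried
    seen = set()           # configurations already explored
    while stack:
        state, pos = stack.pop()  # backtrack to a stored alternative
        if (state, pos) in seen:
            continue
        seen.add((state, pos))
        if pos == len(s) and state in ACCEPTING:
            return True
        # ε-transitions consume no input.
        for nxt in NFA.get((state, EPS), ()):
            stack.append((nxt, pos))
        # Ordinary transitions consume s[pos].
        if pos < len(s):
            for nxt in NFA.get((state, s[pos]), ()):
                stack.append((nxt, pos + 1))
    return False
```

This is a depth-first path search in the graph of configurations, which is why the approach resembles path-finding in directed graphs. Converting the NFA to a DFA first avoids the search entirely.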