Chapter 2-II Scanning Sung-Dong Kim Dept. of Computer Engineering, Hansung University.


1 Chapter 2-II Scanning Sung-Dong Kim Dept. of Computer Engineering, Hansung University

2 Finite Automata
Process of recognizing patterns in input strings → construct scanner
Example: identifier = letter(letter|digit)*
Process of recognizing xtemp: 1 -x-> 2 -t-> 2 -e-> 2 -m-> 2 -p-> 2
[Diagram: state 1 goes to state 2 on letter; state 2 loops on letter and digit]
(2010-1) Compiler2

3 Definition of Deterministic FA (1)
DFA: an automaton where the next state is uniquely given by the current state and the current input character
DFA $M = (\Sigma, S, T, s_0, A)$ where
  $\Sigma$: alphabet
  $S$: set of states
  $T : S \times \Sigma \to S$: state transition function
  $s_0 \in S$: start state
  $A \subseteq S$: set of accepting states

4 Definition of Deterministic FA (2)
Language accepted by M, L(M): the set of strings of characters $c_1 c_2 \ldots c_n$ satisfying the conditions
  $s_1 = T(s_0, c_1),\ s_2 = T(s_1, c_2),\ \ldots,\ s_n = T(s_{n-1}, c_n)$
  $s_n$ is an element of $A$ (i.e., an accepting state)
[Diagram: $T(s, c) = s'$ drawn as an arrow labeled c from state s to state s']
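
The acceptance condition can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the name `dfa_accepts`, the dictionary encoding of T, and the convention that a missing table entry stands for an implicit error transition are all assumptions made here.

```python
import string


def dfa_accepts(T, s0, A, text):
    """Return True if the DFA with transitions T, start s0, accepting set A accepts text."""
    state = s0
    for c in text:
        if (state, c) not in T:
            # Assumption: a missing entry is an implicit transition to an error state.
            return False
        state = T[(state, c)]  # s_i = T(s_{i-1}, c_i)
    return state in A  # s_n must be an accepting state


# Example table for identifier = letter(letter|digit)* with states 1 and 2.
T = {}
for c in string.ascii_letters:
    T[(1, c)] = 2
    T[(2, c)] = 2
for c in string.digits:
    T[(2, c)] = 2
```

With this table, `dfa_accepts(T, 1, {2}, "xtemp")` returns `True`, while a string that starts with a digit is rejected.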

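5 [Diagram: DFA for identifiers with explicit error transitions. start goes to in_id on letter and to error on other (other = ~letter); in_id loops on letter and digit and goes to error on other (other = ~(letter|digit)); error loops on any character]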

6 Definition of Deterministic FA (3)
The set of strings that contain exactly one b
The set of strings that contain at most one b
[Diagrams: DFAs with transitions labeled b and notb]
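
Both languages can be recognized by the same two-state machine with different accepting sets. The sketch below is one way to write that down; the state numbering (0 = no b seen, 1 = one b seen) and the function names are assumptions, since the original diagrams did not survive.

```python
def run_one_b_dfa(text, accepting):
    """Simulate a two-state DFA that counts b's up to one."""
    state = 0  # 0: no b seen yet, 1: exactly one b seen
    for c in text:
        if c == "b":
            if state == 1:
                return False  # a second b leads to the implicit error state
            state = 1
        # any other character (notb) leaves the state unchanged
    return state in accepting


def exactly_one_b(text):
    return run_one_b_dfa(text, {1})


def at_most_one_b(text):
    return run_one_b_dfa(text, {0, 1})
```

The only difference between the two recognizers is whether the start state is also accepting.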

7 Definition of Deterministic FA (4)
Numeric constants
  digit = [0-9]
  nat = digit+
  signedNat = (+|-)? nat
  number = signedNat("." nat)?(E signedNat)?
[Diagram: DFA for nat with transitions labeled digit]
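
As a quick check of these definitions, they can be transcribed almost literally into Python's `re` syntax. The constant and function names below are illustrative only; the patterns follow the four definitions on the slide.

```python
import re

DIGIT = r"[0-9]"
NAT = rf"{DIGIT}+"
SIGNED_NAT = rf"[+-]?{NAT}"
NUMBER = rf"{SIGNED_NAT}(\.{NAT})?(E{SIGNED_NAT})?"


def is_number(text):
    """Return True if the whole string matches the `number` definition."""
    return re.fullmatch(NUMBER, text) is not None
```

Note that the definition as given requires an uppercase E and at least one digit on each side of the decimal point, so `"1e5"`, `"1."`, and `".5"` are not numbers under it.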

8 [Diagram: DFAs for numeric constants built up step by step; transition labels: digit, +, -, .]

9 [Diagram: FA for floating-point numbers; transition labels: digit, +, -, ., E]

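10 Definition of Deterministic FA (5)
Unnested comments
[Diagram: DFA for comments delimited by { and }: { enters the comment; other (other = ~}) loops; } accepts]
C comments
[Diagram: DFA for comments delimited by /* and */: / then * enters the comment body; other (other = ~*) loops in the body; * leaves the body for a state that loops on *, returns to the body on other (other = ~(*|/)), and accepts on /]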
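
The C comment automaton can be sketched as a small state machine. The state numbers (1 to 5, with 0 used as an error state) and the function name are assumptions made for this illustration; the transitions follow the labels on the slide.

```python
def is_c_comment(text):
    """Return True if text is exactly one unnested C comment /* ... */."""
    state = 1
    for c in text:
        if state == 1:
            state = 2 if c == "/" else 0
        elif state == 2:
            state = 3 if c == "*" else 0
        elif state == 3:
            # comment body: stay on other = ~*, leave on *
            state = 4 if c == "*" else 3
        elif state == 4:
            if c == "/":
                state = 5  # closing */ seen
            elif c == "*":
                state = 4  # still a run of stars
            else:
                state = 3  # other = ~(*|/): back to the body
        else:
            # state 0 (error) or state 5 (already accepted) with input left over
            return False
    return state == 5
```

State 4 is what makes a run of stars before the closing slash, as in `/***/`, work correctly.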

11 Lookahead, Backtracking, NFA (1)
Precise algorithms
Problems
  What happens when an error occurs
  What action must be performed
    on reaching an accepting state
    when matching a character during a transition

12 Lookahead, Backtracking, NFA (2)
Problems in DFA for ID
[Diagram: start goes to in_id on letter and to error on other (other = ~letter); in_id loops on letter and digit and goes to error on other (other = ~(letter|digit)); error loops on any character]

13 Lookahead, Backtracking, NFA (3)
Error state
  A delimiter has been seen
  We should accept and generate an identifier token
[other] = delimiting character → lookahead
  Returned to the input string
  Not consumed
Principle of longest substring
[Diagram: start goes to in_id on letter; in_id loops on letter and digit; in_id goes to finish on [other]]

14 Lookahead, Backtracking, NFA (4)
How to arrive at the start state in the first place
Tokens beginning with a different character: :=, <=, =
[Diagram: three separate DFAs: : then = returns ASSIGN; < then = returns LE; = returns EQ]

15 [Diagram: the three DFAs joined at a single start state: : then = returns ASSIGN; < then = returns LE; = returns EQ]

16 Lookahead, Backtracking, NFA (5)
Tokens beginning with the same character
[Diagram: three separate DFAs: < then = returns LE; < then > returns NE; < returns LT]

17 [Diagram: a single DFA that reads < and then returns LE on =, returns NE on >, and returns LT on [other]]
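
A scanner for the three tokens <=, <>, and < has to look at the character after < before deciding, and must not consume it when the token turns out to be LT. The sketch below illustrates this; the function name and the `(token, next_position)` return convention are assumptions.

```python
def scan_relop(text, pos=0):
    """Return (token, next_pos) for a token starting with '<' at pos, or None."""
    if pos >= len(text) or text[pos] != "<":
        return None
    lookahead = text[pos + 1] if pos + 1 < len(text) else ""  # "" marks end of input
    if lookahead == "=":
        return "LE", pos + 2
    if lookahead == ">":
        return "NE", pos + 2
    # [other]: the lookahead character is inspected but not consumed
    return "LT", pos + 1
```

For example, `scan_relop("<a")` returns `("LT", 1)`, so the next scan starts at the `a` that served as lookahead.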

18 Lookahead, Backtracking, NFA (6)
ε-transition
  Transition that may occur without consulting the input string
  Combination of DFAs
    Keep the original automata intact
    Only add a new start state
  Explicit description of a match of the empty string
[Diagram: a transition labeled ε]

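19 [Diagram: a new start state with ε-transitions to the start states of the three original DFAs: : then = returns ASSIGN; < then = returns LE; = returns EQ]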

20 Lookahead, Backtracking, NFA (7)
Nondeterministic finite automata: more than one transition from a state may exist for a particular character
NFA $M = (\Sigma, S, T, s_0, A)$ where
  $\Sigma$: alphabet
  $S$: set of states
  $T : S \times (\Sigma \cup \{\varepsilon\}) \to \mathcal{P}(S)$: state transition function
  $s_0 \in S$: start state
  $A \subseteq S$: set of accepting states

21 Lookahead, Backtracking, NFA (8)
Language accepted by M, L(M): the set of strings of characters $c_1 c_2 \ldots c_n$ satisfying the conditions
  $s_1 \in T(s_0, c_1),\ s_2 \in T(s_1, c_2),\ \ldots,\ s_n \in T(s_{n-1}, c_n)$
  $s_n$ is an element of $A$ (i.e., an accepting state)

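22 Lookahead, Backtracking, NFA (9)
Example 2.10
[Diagram: NFA with states 1, 2, 3, 4 and transitions labeled a, b, and ε]
Two transition sequences that accept abb:
  1 -a-> 2 -b-> 4 -ε-> 2 -b-> 4
  1 -a-> 3 -ε-> 4 -ε-> 2 -b-> 4 -ε-> 2 -b-> 4
Language accepted: ab+ | ab* | b*, which is the same as (a|ε)b*
[Diagram: equivalent DFA with transitions labeled a and b]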
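
One way to check such an example is to simulate the NFA by tracking the set of all states it could be in, a technique that is not shown on the slide. The transition table below is a reconstruction chosen to be consistent with the two accepting paths for abb and with the language (a|ε)b*; the ε-transition from state 1 to state 4 in particular is an assumption, as are all names in the code.

```python
EPS = ""  # assumed marker for an ε-transition

# Reconstructed (assumed) transition table for the example NFA; accepting state is 4.
NFA = {
    (1, "a"): {2, 3},
    (1, EPS): {4},
    (2, "b"): {4},
    (3, EPS): {4},
    (4, EPS): {2},
}


def eps_closure(states, T):
    """Return all states reachable from `states` using only ε-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in T.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def nfa_accepts(T, s0, A, text):
    """Return True if some sequence of transitions on text ends in a state of A."""
    current = eps_closure({s0}, T)
    for c in text:
        moved = set()
        for s in current:
            moved |= T.get((s, c), set())
        current = eps_closure(moved, T)
    return bool(current & A)
```

Under this reconstruction, `nfa_accepts(NFA, 1, {4}, "abb")` is `True` and `nfa_accepts(NFA, 1, {4}, "ba")` is `False`, which agrees with (a|ε)b*.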

23 Implementation of FA in Code (1)
DFA for ID
  Ad hoc method
  Better implementation
[Diagram: state 1 goes to state 2 on letter; state 2 loops on letter and digit; state 2 goes to accepting state 3 on [other]]

24
  { starting in state 1 }
  if the next character is a letter then
    advance the input;
    { now in state 2 }
    while the next character is a letter or a digit do
      advance the input; { stay in state 2 }
    end while
    { go to state 3 without advancing the input }
    accept;
  else
    { error or other cases }
  end if;
- Not too many states
- Small loops
- Small scanners
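
The ad hoc pseudocode translates almost line for line into Python. In this sketch the function name, the use of a string index as the input position, and the use of `str.isalpha` and `str.isalnum` as the letter and letter-or-digit tests are assumptions.

```python
def scan_id_ad_hoc(text, pos=0):
    """Return the position just past the identifier starting at pos, or None."""
    # starting in state 1
    if pos < len(text) and text[pos].isalpha():
        pos += 1  # advance the input; now in state 2
        while pos < len(text) and text[pos].isalnum():
            pos += 1  # advance the input; stay in state 2
        # go to state 3 without advancing the input
        return pos  # accept
    return None  # error or other cases
```

The state is implicit in the position within the code, which is why this style only stays readable for small automata.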

25
  state := 1;
  while state = 1 or 2 do
    case state of
      1: case input character of
           letter: advance the input;
                   state := 2;
           else state := ... { error or other };
         end case;
      2: case input character of
           letter, digit: advance the input;
                          state := 2; { unnecessary }
           else state := 3;
         end case;
    end case;
  end while
  if state = 3 then accept else error;
- First case: current state
- Nested second level: input character
- Transition: assigning a new state and advancing the input
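
The same structure, with an explicit state variable and a two-level dispatch on state and then on input character, might look like this in Python. The names, the use of 0 as the error state, and the empty string as an end-of-input marker are assumptions of this sketch.

```python
ERROR = 0  # assumed number for the error state


def scan_id_nested_case(text, pos=0):
    """Return the position just past the identifier starting at pos, or None."""
    state = 1
    while state in (1, 2):
        ch = text[pos] if pos < len(text) else ""  # "" marks end of input
        if state == 1:  # first level: current state
            if ch.isalpha():  # second level: input character
                pos += 1  # advance the input
                state = 2
            else:
                state = ERROR
        elif state == 2:
            if ch.isalnum():
                pos += 1  # advance the input; state stays 2
            else:
                state = 3  # accept without advancing
    return pos if state == 3 else None
```

Each branch does exactly two things, assign a new state and optionally advance the input, which is what makes the table-driven generalization possible.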

26 Implementation of FA in Code (2)
DFA as a data structure + generic code
Transition table
[Diagram: state 1 goes to state 2 on letter; state 2 loops on letter and digit; state 2 goes to accepting state 3 on [other]]

27
  state := 1;
  ch := next input character;
  while not Accept[state] and not error(state) do
    newstate := T[state, ch];
    if Advance[state, ch] then ch := next input character;
    state := newstate;
  end while
  if Accept[state] then accept;
- Accept: accepting states
- Advance: transitions that advance the input
- Table-driven algorithm
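
A minimal Python version of the table-driven loop for the identifier DFA is sketched below. The table contents, the three character classes, the use of 0 as the error state, and all names are assumptions made for illustration; only the loop structure comes from the pseudocode.

```python
LETTER, DIGIT, OTHER = 0, 1, 2  # assumed column indices (character classes)
ERROR = 0  # assumed number for the error state

# Rows are states 1 and 2; columns are LETTER, DIGIT, OTHER.
T = {1: [2, ERROR, ERROR], 2: [2, 2, 3]}
ADVANCE = {1: [True, False, False], 2: [True, True, False]}
ACCEPT = {ERROR: False, 1: False, 2: False, 3: True}


def char_class(ch):
    """Map a character (or "" for end of input) to a table column."""
    if ch.isalpha():
        return LETTER
    if ch.isdigit():
        return DIGIT
    return OTHER


def scan_table_driven(text, pos=0):
    """Return the position just past the identifier starting at pos, or None."""
    state = 1
    ch = text[pos] if pos < len(text) else ""
    while not ACCEPT[state] and state != ERROR:
        col = char_class(ch)
        new_state = T[state][col]
        if ADVANCE[state][col]:
            pos += 1
            ch = text[pos] if pos < len(text) else ""
        state = new_state
    return pos if ACCEPT[state] else None
```

Recognizing a different token class would mean changing only `T`, `ADVANCE`, `ACCEPT`, and `char_class`; the loop itself stays the same.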

28 Implementation of FA in Code (3)
Advantages
  Reduced code size
  Same code will work for many different problems
  Code is easier to change (maintain)
Disadvantages
  Tables can become very large
  Much of the space in the arrays is wasted

29 Implementation of FA in Code (4)
Table compression
  Sparse-array representation
  Slow table lookup
  Rarely used

30 Implementation of FA in Code (5)
NFA implementation
  Many different sequences of transitions must be tried
  Store up transitions that have not yet been tried
  Backtrack to them on failure
  Similar to algorithms that attempt to find paths in directed graphs
  A lot of backtracking → inefficient
  NFA → DFA → coding
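
The backtracking idea can be sketched as a depth-first search over (state, input position) pairs, with a stack holding the transitions not yet tried. Everything named here is an assumption: the function name, the `""` marker for ε, and the sample table, which is a reconstruction of an NFA for (a|ε)b* with accepting state 4 rather than a table taken from the slides.

```python
EPS = ""  # assumed marker for an ε-transition

# Assumed sample NFA for (a|ε)b*; accepting state is 4.
NFA = {
    (1, "a"): {2, 3},
    (1, EPS): {4},
    (2, "b"): {4},
    (3, EPS): {4},
    (4, EPS): {2},
}


def nfa_accepts_backtracking(T, s0, A, text):
    """Try transition sequences one at a time, backtracking on failure."""
    pending = [(s0, 0)]  # configurations that have not been tried yet
    seen = set()  # guards against looping forever on ε-cycles
    while pending:
        state, pos = pending.pop()  # backtrack to the most recent untried choice
        if (state, pos) in seen:
            continue
        seen.add((state, pos))
        if pos == len(text) and state in A:
            return True
        for t in T.get((state, EPS), ()):
            pending.append((t, pos))  # ε-transition: input position unchanged
        if pos < len(text):
            for t in T.get((state, text[pos]), ()):
                pending.append((t, pos + 1))  # consume one character
    return False
```

The `seen` set is an addition beyond plain backtracking; without it, the search could revisit the same configuration many times, which is the inefficiency that motivates converting the NFA to a DFA before coding.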

