Finite-state automata Day 12 LING Computational Linguistics Harry Howard Tulane University
21-Sept-2009, LING, Prof. Howard, Tulane University

Course organization
NLTK is installed on the computers in this room!
How would you like to use the Provost's $150?
SLP §2.2 Finite-state automata Sheeptalk

The sheep language
How would you recognize this language?
baa! baaa! baaaa! baaaaa! …
Regex in Python? '^baa+!$'
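
The regex question above can be tried directly in Python. This is a minimal sketch using the standard `re` module; the helper name `is_sheeptalk` and the test strings are made up for illustration.

```python
import re

# Pattern from the slide: b, then two or more a's, then !
SHEEP = re.compile(r'^baa+!$')

def is_sheeptalk(s):
    """Return True if s is in the sheep language (hypothetical helper)."""
    return SHEEP.match(s) is not None

# Try a few strings, some in the language and some not
for s in ['baa!', 'baaaa!', 'ba!', 'baa', 'abaa!']:
    print(s, is_sheeptalk(s))
```

Note that `$` also matches just before a trailing newline, so a line read from a file with its newline still attached would be accepted; `re.fullmatch` with the pattern `baa+!` is stricter.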

State transitions
[Figure: state diagram with states q0, q1, q2, q3, q4. Arcs: q0 to q1 on b, q1 to q2 on a, q2 to q3 on a, q3 to q3 on a (loop), q3 to q4 on !.]
A directed graph of vertices/nodes connected by links/arcs.
Each node is labeled as a state, q0 through q4.
q0 is the start state; q4 is the final or accepting state.
Each transition from state to state is labeled with the character that it recognizes.

Equivalency to a tape
Start at the start state.
Check the next character of the input.
If it matches the symbol on an arc leaving the current state, then cross the arc and move to the next state.
Repeat from the check step for the following character.
If the current state is the final state and there is no more input, the input has been recognized successfully.
If the final state is never reached, the input is rejected.
[Figure: tape holding the input baaa!, with the machine in state q0 at the start of the tape.]

State-transition table

         Input
State    b    a    !
0        1    0    0
1        0    2    0
2        0    3    0
3        0    3    4
4:       0    0    0

: marks the final state.
0 = illegal or missing transition.
If in state 0 and see input b, go to state 1; if in state 0 and see input a or !, fail.

Definition (p. 28)
The procedure is known as a finite(-state) automaton, which consists of five parameters:
Q         a finite set of n states
Σ         a finite input alphabet of symbols
q0        the start state
F         the set of final states
δ(q, i)   the transition function or matrix. For a state q and an input symbol i, δ(q, i) returns a new state q'.
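
As a rough illustration, the five parameters can be written down as ordinary Python values for the sheep language. The variable names (`Q`, `SIGMA`, `Q0`, `F`, `DELTA`) and the choice to leave missing transitions out of the dictionary are assumptions of this sketch, not part of the definition.

```python
# Five parameters of the sheep-language automaton (names are illustrative)
Q = {0, 1, 2, 3, 4}            # finite set of states
SIGMA = {'b', 'a', '!'}        # finite input alphabet
Q0 = 0                         # start state
F = {4}                        # set of final states
DELTA = {                      # transition function: (q, i) -> q'
    (0, 'b'): 1,
    (1, 'a'): 2,
    (2, 'a'): 3,
    (3, 'a'): 3,               # loop: any number of further a's
    (3, '!'): 4,
}

# Sanity checks: every piece refers only to known states and symbols
assert Q0 in Q and F <= Q
assert all(q in Q and i in SIGMA and q2 in Q for (q, i), q2 in DELTA.items())

print(DELTA[(3, 'a')])         # the new state reached from state 3 on input a
```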

Algorithm (Fig. 2.12, p. 29)
function D-RECOGNIZE(tape, machine) returns accept or reject
  index ← Beginning of tape
  current-state ← Initial state of machine
  loop
    if End of input has been reached then
      if current-state is an accept state then
        return accept
      else
        return reject
    elseif transition-table[current-state, tape[index]] is empty then
      return reject
    else
      current-state ← transition-table[current-state, tape[index]]
      index ← index + 1
  end
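
One way the pseudocode might be translated into Python is sketched below. It is not the only possible solution: the function signature, the use of True/False for accept/reject, and the convention that an absent dictionary key stands for an empty table cell are all assumptions of this example.

```python
def d_recognize(tape, table, start, accept_states):
    """Return True (accept) or False (reject) for the input string tape."""
    index = 0                      # beginning of tape
    current_state = start          # initial state of machine
    while True:
        if index == len(tape):     # end of input has been reached
            return current_state in accept_states
        key = (current_state, tape[index])
        if key not in table:       # empty cell: no legal transition
            return False
        current_state = table[key]
        index += 1

# Sheep-language transitions; absent keys stand for empty cells (assumption)
SHEEP_TABLE = {(0, 'b'): 1, (1, 'a'): 2, (2, 'a'): 3, (3, 'a'): 3, (3, '!'): 4}

for s in ['baa!', 'baaaaa!', 'ba!', 'baa!a']:
    print(s, d_recognize(s, SHEEP_TABLE, 0, {4}))
```

Because the machine is deterministic, each input character triggers at most one dictionary lookup, so the loop runs once per character of the tape.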

Python dictionaries
Use the Python data object known as a dictionary (a hash table in other languages):
your_dict = {key1: value1, key2: value2, ...}
>>> your_dict[key1]
value1
Part-of-speech dictionary, see NLPP pp. 190ff:
pos = {'colorless': 'ADJ', 'ideas': 'N', 'furiously': 'ADV'}
>>> pos['furiously']
'ADV'
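
One detail worth knowing before using a dictionary as a lookup table: indexing with a key that is not in the dictionary raises `KeyError`. The sketch below shows standard ways to handle an absent key, reusing the `pos` dictionary; the word 'sleep' and the default 'UNKNOWN' are just example values.

```python
pos = {'colorless': 'ADJ', 'ideas': 'N', 'furiously': 'ADV'}

# Membership test with `in` avoids a KeyError on absent keys
print('ideas' in pos)        # key is present
print('sleep' in pos)        # key is absent

# dict.get returns None (or a default you supply) instead of raising
print(pos.get('sleep'))
print(pos.get('sleep', 'UNKNOWN'))

# Plain indexing raises KeyError for an absent key
try:
    pos['sleep']
except KeyError as err:
    print('KeyError:', err)
```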

A state-transition table
Need compound keys:
your_dict = {(key1a, key1b): value1, (key2a, key2b): value2, ...}
>>> your_dict[(key2a, key2b)]
value2
Sample first line for the sheep language:
stt = {(0,'b'): 1, (0,'a'): 0, (0,'!'): 0,
>>> stt[(0,'!')]
0
HINT: put the dictionary at the beginning of the function.
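
One possible way to fill in the rest of the table is sketched here, under the convention that 0 marks an illegal or missing transition. The arcs are read off the sheep-language graph (b, a, a, a-loop, !); the names `STATES`, `ALPHABET`, and `ARCS` are invented for this example. Since 0 is also the number of the start state, this convention only works because no arc leads back to state 0.

```python
STATES = range(5)              # states 0..4
ALPHABET = ['b', 'a', '!']

# Start with every (state, symbol) cell marked 0 = illegal or missing
stt = {(q, c): 0 for q in STATES for c in ALPHABET}

# Then overwrite the legal arcs of the sheep-language graph
ARCS = {(0, 'b'): 1, (1, 'a'): 2, (2, 'a'): 3, (3, 'a'): 3, (3, '!'): 4}
stt.update(ARCS)

print(stt[(0, '!')])           # a cell from the first row
print(stt[(3, '!')])           # the arc into the final state
print(len(stt))                # one cell per (state, symbol) pair
```

Tuples work as dictionary keys because they are immutable and hashable; a list such as `[0, 'b']` would raise `TypeError` if used as a key.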

Next time
Code up the algorithm and bring a printout of it to class on Wed.
SLP §2.2.2-end