Finite-state automata 2 Day 13 LING 681.02 Computational Linguistics Harry Howard Tulane University.


1 Finite-state automata 2 Day 13 LING 681.02 Computational Linguistics Harry Howard Tulane University

2 Course organization (23-Sept-2009, LING 681.02, Prof. Howard, Tulane University)
- http://www.tulane.edu/~ling/NLP/
- NLTK is installed on the computers in this room!
- How would you like to use the Provost's $150?

3 SLP §2.2 Finite-state automata 2.2.1 Sheeptalk

4 Find your files
>>> import sys
>>> sys.path.append("/Users/harryhow/Documents/Work/Research/Sims/NLTK")

5 Run program
>>> import fsaproc
>>> test = 'baaa!'
>>> test = 'baaa!$'
>>> fsaproc.machine(test)

6 Go over print-out

7 Key points
- D-recognize is a simple table-driven interpreter.
- The algorithm is universal for all unambiguous regular languages.
- To change the machine, you simply change the table.
- Crudely therefore… matching strings with regular expressions (à la Perl, grep, etc.) is a matter of:
  - translating the regular expression into a machine (a table), and
  - passing the table and the string to an interpreter.
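The table-driven idea above can be sketched in a few lines of Python. This is a minimal illustration, not the `fsaproc` module used in class: the names `SHEEP_TABLE`, `ACCEPT_STATES`, and `d_recognize` are hypothetical, and the table is assumed to encode the sheeptalk language /baa+!/ with states numbered 0 to 4.

```python
# Assumed transition table for sheeptalk /baa+!/.
# A missing entry means "no transition", so the string is rejected.
SHEEP_TABLE = {
    0: {"b": 1},
    1: {"a": 2},
    2: {"a": 3},
    3: {"a": 3, "!": 4},
}
ACCEPT_STATES = {4}


def d_recognize(tape, table, accept_states, start=0):
    """Return True if the deterministic machine in `table` accepts `tape`."""
    state = start
    for symbol in tape:
        # Look up the next state; None signals a missing table entry.
        state = table.get(state, {}).get(symbol)
        if state is None:
            return False
    # The whole tape is consumed: accept only in a final state.
    return state in accept_states


print(d_recognize("baaa!", SHEEP_TABLE, ACCEPT_STATES))  # True
print(d_recognize("ba!", SHEEP_TABLE, ACCEPT_STATES))    # False
```

Recognizing a different regular language would only require passing a different table and set of accept states; the interpreter itself stays the same.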

8 Recognition as search
- You can view this algorithm as a kind of state-space search.
- States are pairings of tape positions and state numbers.
- The goal state is a pairing with the end of tape position and a final accept state.

9 SLP §2.2 Finite-state automata 2.2.2 Formal languages

10 Generative Formalisms
- Formal languages are sets of strings composed of symbols from a finite set of symbols.
- Finite-state automata define formal languages (without having to enumerate all the strings in the language).
- The term generative is based on the view that you can run the machine as a generator to get strings from the language.

11 Generative Formalisms
- An FSA can be viewed from two perspectives, as:
  - an acceptor that can tell you if a string is in the language.
  - a generator that produces all and only the strings in the language.
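The generator perspective can be illustrated by running the same kind of table "in reverse". The sketch below is an assumption-laden example: `SHEEP_TABLE` and `generate` are hypothetical names, and because sheeptalk is an infinite language the enumeration is cut off at a chosen `max_len`.

```python
from collections import deque

# Assumed transition table for sheeptalk /baa+!/ (states 0-4, 4 is final).
SHEEP_TABLE = {
    0: {"b": 1},
    1: {"a": 2},
    2: {"a": 3},
    3: {"a": 3, "!": 4},
}
ACCEPT_STATES = {4}


def generate(table, accept_states, start=0, max_len=6):
    """Yield every accepted string of length <= max_len, shortest first."""
    queue = deque([(start, "")])  # pairs of (machine state, string so far)
    while queue:
        state, prefix = queue.popleft()
        if state in accept_states:
            yield prefix
        if len(prefix) < max_len:
            # Follow every outgoing arc, extending the string by its symbol.
            for symbol, nxt in sorted(table.get(state, {}).items()):
                queue.append((nxt, prefix + symbol))


print(list(generate(SHEEP_TABLE, ACCEPT_STATES, max_len=6)))
# ['baa!', 'baaa!', 'baaaa!']
```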

12 SLP §2.2 Finite-state automata 2.2.4 Determinism

13 Determinism
- A deterministic FSA has one unique thing to do at each point in processing.
  - i.e. there are no choices

14 Non-determinism

15 Non-determinism cont.
- Epsilon transitions
  - An arc has no symbol on it, represented as ε.
  - Such a transition does not examine or advance the tape during recognition.
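One way to see what "does not examine or advance the tape" means in practice is to compute which states are reachable through ε-arcs alone. The following is a minimal sketch: the empty string as the ε marker, the `(state, symbol)` dictionary layout, and the example machine (sheeptalk with an ε-arc from state 3 back to state 2) are all assumptions made for illustration.

```python
EPSILON = ""  # assumed marker for an arc that carries no symbol

# Assumed machine: sheeptalk with an epsilon arc from state 3 back to state 2.
DELTA = {
    (0, "b"): {1},
    (1, "a"): {2},
    (2, "a"): {3},
    (3, "!"): {4},
    (3, EPSILON): {2},
}


def epsilon_closure(states, delta):
    """Return all states reachable from `states` without reading any input."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for nxt in delta.get((q, EPSILON), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


print(sorted(epsilon_closure({3}, DELTA)))  # [2, 3]
print(sorted(epsilon_closure({0}, DELTA)))  # [0]
```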

16 SLP §2.2 Finite-state automata 2.2.5 Use of an NFSA to accept strings

17 Read on your own
- pp. 33-5

18 SLP §2.2 Finite-state automata 2.2.6 Recognition as search

19 Non-deterministic recognition: Search
- In an ND FSA there is at least one path through the machine for a string that is in the language defined by the machine.
- But not all paths through the machine for an accepted string lead to an accept state.
- No paths through the machine lead to an accept state for a string not in the language.

20 Non-deterministic recognition
- So success in non-deterministic recognition occurs when a path is found through the machine that ends in an accept state.
- Failure occurs when all of the possible paths for a given string lead to failure.

21 Example
Tape:   b  a  a  a  !  \
States: q0 q1 q2 q2 q3 q4

22-29 Example (figure slides; no text survives in the transcript)

30 Key points
- States in the search space are pairings of tape positions and states in the machine.
- By keeping track of as yet unexplored states, a recognizer can systematically explore all the paths through the machine given an input.

31 Ordering of states
- But how do you keep track?
- Depth-first/last in first out (LIFO)/stack
  - Unexplored states are added to the front of the agenda, and they are explored by going to the most recent.
- Breadth-first/first in first out (FIFO)/queue
  - Unexplored states are added to the back of the agenda, and they are explored in the order in which they were created (oldest first).
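The agenda-based search can be sketched as follows. This is an illustrative reconstruction rather than the textbook's ND-RECOGNIZE verbatim: the names `NFSA`, `ACCEPT`, and `nd_recognize` are hypothetical, the machine is assumed to be sheeptalk with a self-loop on `a` at state 2, and a `seen` set is added so that ε-loops cannot cause endless search. The next search state is always taken from the front of the agenda; only the end at which new states are added changes between the two strategies.

```python
from collections import deque

# Assumed non-deterministic sheeptalk machine: at state 2, reading "a"
# may either stay in state 2 or move on to state 3.
NFSA = {
    (0, "b"): {1},
    (1, "a"): {2},
    (2, "a"): {2, 3},
    (3, "!"): {4},
}
ACCEPT = {4}


def nd_recognize(tape, delta, accept_states, start=0, depth_first=True):
    """Search over (machine state, tape position) pairs using an agenda."""
    agenda = deque([(start, 0)])
    seen = {(start, 0)}
    while agenda:
        state, pos = agenda.popleft()  # always explore the front item
        if pos == len(tape) and state in accept_states:
            return True
        successors = []
        if pos < len(tape):
            # Arcs that consume the current tape symbol.
            successors += [(q, pos + 1)
                           for q in sorted(delta.get((state, tape[pos]), ()))]
        # Epsilon arcs (assumed to be keyed by the empty string).
        successors += [(q, pos) for q in sorted(delta.get((state, ""), ()))]
        for item in successors:
            if item in seen:
                continue
            seen.add(item)
            if depth_first:
                agenda.appendleft(item)  # stack behaviour (LIFO)
            else:
                agenda.append(item)      # queue behaviour (FIFO)
    return False  # every path failed


print(nd_recognize("baaa!", NFSA, ACCEPT, depth_first=True))   # True
print(nd_recognize("baaa!", NFSA, ACCEPT, depth_first=False))  # True
print(nd_recognize("ba!", NFSA, ACCEPT))                       # False
```

Both orderings give the same answer for any input; they differ only in the order in which paths are tried.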

32 SLP §2.2 Finite-state automata 2.2.7 Comparison

33 Equivalence
- Non-deterministic machines can be converted to deterministic ones with a fairly simple construction.
- That means that they have the same power:
  - non-deterministic machines are not more powerful than deterministic ones in terms of the languages they can accept.
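The slide does not name the construction; the standard one is the subset (powerset) construction, in which each deterministic state stands for a set of non-deterministic states. The sketch below assumes a machine without ε-arcs, and `determinize`, `dfa_accepts`, and the example `NFSA` are hypothetical names chosen for illustration.

```python
# Assumed non-deterministic sheeptalk machine (no epsilon arcs).
NFSA = {
    (0, "b"): {1},
    (1, "a"): {2},
    (2, "a"): {2, 3},
    (3, "!"): {4},
}
ACCEPT = {4}


def determinize(delta, start, accept_states, alphabet):
    """Subset construction: build a deterministic table over sets of states."""
    start_set = frozenset([start])
    dfa = {}
    dfa_accept = set()
    todo = [start_set]
    while todo:
        current = todo.pop()
        if current in dfa:
            continue  # this set of states was already expanded
        dfa[current] = {}
        if current & accept_states:
            dfa_accept.add(current)
        for symbol in alphabet:
            # Union of everything reachable on `symbol` from any member.
            target = frozenset(
                q for s in current for q in delta.get((s, symbol), ())
            )
            if target:
                dfa[current][symbol] = target
                todo.append(target)
    return dfa, start_set, dfa_accept


def dfa_accepts(tape, dfa, start_set, dfa_accept):
    """Run the deterministic table produced by determinize()."""
    current = start_set
    for symbol in tape:
        current = dfa.get(current, {}).get(symbol)
        if current is None:
            return False
    return current in dfa_accept


dfa, q0, finals = determinize(NFSA, 0, ACCEPT, "ab!")
print(len(dfa))                             # 5
print(dfa_accepts("baaa!", dfa, q0, finals))  # True
print(dfa_accepts("ba!", dfa, q0, finals))    # False
```

The resulting machine has one unique move per state and symbol, so it can be run without any search.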

34 Why bother?
- Non-determinism doesn't get us more formal power and it causes headaches, so why bother?
  - More natural (understandable) solutions.

35 Next time
- SLP §2.3 briefly
- SLP §3

