LING 438/538 Computational Linguistics Sandiway Fong Lecture 11: 10/3.

Presentation transcript:

LING 438/538 Computational Linguistics Sandiway Fong Lecture 11: 10/3

2 Administrivia
homework 2
  – will be returned tomorrow (by )
homework 3
  – will be out on Thursday

3 Last Tuesday
textbook
  – Chapter 2: Regular Expressions and Finite State Automata
regular expressions
  – Unix grep
  – wildcard search in Microsoft Word
implementing the FSA in Prolog
  – Method 1: two-line program fsa/2, plus transition/3 (the δ function) and final_state/1
  – Method 2: define each state, e.g. x, as a predicate, e.g. x/1, taking the input list as an argument
  – non-determinism is handled by Prolog's computation rule

4 Today's Topic
more on FSA
  – expressive power
  – limits

5 Determinism
deterministic FSA (DFSA)
  – no ambiguity about where to go at any given state
non-deterministic FSA (NDFSA)
  – no restriction on ambiguity (surprisingly, no increase in formal power)
textbook
  – D-RECOGNIZE (Figure 2.13)
  – ND-RECOGNIZE (Figure 2.21)
the two-line Prolog program:
  fsa(S,L) :- L = [C|M], transition(S,C,T), fsa(T,M).
  fsa(E,[]) :- end_state(E).
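As a rough illustration of how the two-line recognizer handles non-determinism, here is a minimal sketch in Python rather than the lecture's Prolog. The machine in `TRANSITIONS` is a hypothetical example (not from the slides) that accepts strings over {a, b} ending in "ab". Trying each possible next state in turn stands in for Prolog's backtracking.

```python
# Hypothetical NDFSA (an assumption, not from the slides):
# accepts strings over {a, b} that end in "ab".
TRANSITIONS = {
    ("q0", "a"): {"q0", "q1"},  # non-deterministic choice on a
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}
END_STATES = {"q2"}


def fsa(state, symbols):
    """Succeed if some path from `state` consumes `symbols` and stops in an end state."""
    if not symbols:
        # Counterpart of the clause fsa(E,[]) :- end_state(E).
        return state in END_STATES
    head, rest = symbols[0], symbols[1:]
    # Counterpart of the recursive clause: try every transition on `head`.
    return any(fsa(target, rest) for target in TRANSITIONS.get((state, head), ()))
```

For example, `fsa("q0", "aab")` succeeds only via the path that stays in q0 on the first a and moves to q1 on the second.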

6 NDFSA → (D)FSA
[discussed at the end of section 2.2 in the textbook]
construct a new machine
  – each state of the new machine represents the set of possible states of the original machine when stepping through the input
Note:
  – new machine is equivalent to the old one (but can have more states, up to 2ⁿ for n original states)
  – new machine is deterministic
example
  [figure: an NDFSA with states s, x, y, z over {a, b}, and the equivalent deterministic machine with states s, {x,y}, {z}, {y,z}, {y}]
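The set-of-states construction can be sketched as follows, in Python. This is one possible rendering, not the textbook's algorithm verbatim; the machine `NFA` is a hypothetical example (strings ending in "ab"), and the function name `determinize` is an assumption of this sketch.

```python
from collections import deque

# Hypothetical NDFSA (not from the slides): strings over {a, b} ending in "ab".
NFA = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}


def determinize(transitions, start, end_states, alphabet):
    """Build a deterministic machine whose states are sets of original states."""
    start_set = frozenset([start])
    dfa = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # All original states reachable from any member of `current`.
            target = frozenset(
                t for q in current for t in transitions.get((q, symbol), ())
            )
            if not target:
                continue  # no move possible: left undefined (implicit reject)
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A set-state is an end state if it contains an original end state.
    dfa_ends = {s for s in seen if s & end_states}
    return dfa, start_set, dfa_ends


DFA, DFA_START, DFA_ENDS = determinize(NFA, "q0", {"q2"}, "ab")
```

Only reachable set-states are built, so the result is often much smaller than the worst case.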

7 ε-transitions
jump from one state to another without consuming an input character
  – ε-transition (textbook) or λ-transition
  – no increase in expressive power
examples
  [figure: automata over {a, b} containing ε-transitions]
what's the equivalent without the ε-transition?
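One way to see why ε-transitions add no power is to simulate them with an ε-closure: the set of states reachable by ε-moves alone. The Python sketch below assumes ε is encoded as the empty string `EPS`, and uses a hypothetical machine (not the one in the slide's figure) that accepts a followed by zero or more b's.

```python
EPS = ""  # label used for ε-moves in this sketch (an assumption)

# Hypothetical machine: s -a-> x, x -ε-> y, y -b-> y, end state y.
MACHINE = {
    ("s", "a"): {"x"},
    ("x", EPS): {"y"},
    ("y", "b"): {"y"},
}


def epsilon_closure(states, transitions):
    """All states reachable from `states` using only ε-moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for t in transitions.get((q, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def accepts(transitions, start, end_states, string):
    """Track the set of possible states, closing under ε after every step."""
    current = epsilon_closure({start}, transitions)
    for symbol in string:
        moved = {t for q in current for t in transitions.get((q, symbol), ())}
        current = epsilon_closure(moved, transitions)
    return bool(current & end_states)
```

Because the simulation only ever tracks sets of ordinary states, the same set-of-states idea used for determinization also removes the ε-moves.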

8 Start State(s)
Finite State Automata (FSA)
  – (Q, s, f, Σ, δ)
  1. set of states (Q): {s, x, y}; must be a finite set
  2. start state (s): s
  3. end state(s) (f): y
  4. alphabet (Σ): {a, b}
  5. transition function δ:
     signature: character × state → state
     δ(a,s) = x
     δ(a,x) = x
     δ(b,x) = y
     δ(b,y) = y
[figure: state diagram with s → x on a, a loop on x for a, x → y on b, and a loop on y for b]
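The five components above translate directly into a table-driven recognizer. The sketch below, in Python, keys `DELTA` by (character, state) to match the slide's signature; the function name `d_recognize` is borrowed loosely from the textbook's D-RECOGNIZE and is not meant as a faithful copy of that figure.

```python
# The slide's machine: Q = {s, x, y}, start s, end state y, Σ = {a, b}.
DELTA = {
    ("a", "s"): "x",
    ("a", "x"): "x",
    ("b", "x"): "y",
    ("b", "y"): "y",
}
START = "s"
END_STATES = {"y"}


def d_recognize(string):
    """Follow δ one character at a time; reject as soon as δ is undefined."""
    state = START
    for ch in string:
        state = DELTA.get((ch, state))
        if state is None:
            return False
    return state in END_STATES
```

With these transitions the machine accepts exactly one or more a's followed by one or more b's, e.g. `d_recognize("aabb")` holds while `d_recognize("ba")` does not.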

9 FSA Properties
FSAs (and thus regular languages) are closed under the following operations, i.e. the result is again an FSA (a regular language):
  – concatenation
  – union
  – intersection
  – complementation
  – and other operations...
  – [see section 2.3 of textbook]

10 concatenation
concatenate two FSAs, result is an FSA
  – trick: use ε-transitions to link the automata (from each end state of the first to the start state of the second)
example
  – [figure 2.24]

11 union
disjunction (union) of two FSAs, result is an FSA
  – trick: use ε-transitions to link the automata (from a new start state to the start state of each)
example
  – [figure 2.26]
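The ε-link trick for union, and the analogous one for concatenation on the previous slide, can be sketched in Python as below. The helper names, the `EPS` encoding and the two small machines (`A_PLUS` for one or more a's, `B_PLUS` for one or more b's) are assumptions of this sketch; it also assumes the two machines use disjoint state names and that `s0` is unused.

```python
EPS = ""  # label for ε-moves (an assumption of this sketch)

# A machine is (transitions, start state, set of end states).
A_PLUS = ({("p0", "a"): {"p1"}, ("p1", "a"): {"p1"}}, "p0", {"p1"})
B_PLUS = ({("q0", "b"): {"q1"}, ("q1", "b"): {"q1"}}, "q0", {"q1"})


def union(m1, m2, new_start="s0"):
    """New start state with ε-moves to both original start states."""
    (t1, s1, e1), (t2, s2, e2) = m1, m2
    t = {**t1, **t2, (new_start, EPS): {s1, s2}}
    return t, new_start, e1 | e2


def concatenate(m1, m2):
    """ε-move from every end state of m1 to the start state of m2."""
    (t1, s1, e1), (t2, s2, e2) = m1, m2
    t = {key: set(targets) for key, targets in {**t1, **t2}.items()}
    for q in e1:
        t.setdefault((q, EPS), set()).add(s2)
    return t, s1, set(e2)


def accepts(machine, string):
    """Set-of-states simulation, closing under ε-moves after each step."""
    transitions, start, ends = machine

    def close(states):
        closure, stack = set(states), list(states)
        while stack:
            for t in transitions.get((stack.pop(), EPS), ()):
                if t not in closure:
                    closure.add(t)
                    stack.append(t)
        return closure

    current = close({start})
    for symbol in string:
        current = close({t for q in current for t in transitions.get((q, symbol), ())})
    return bool(current & ends)
```

Here `accepts(union(A_PLUS, B_PLUS), "aa")` holds, while "ab" is accepted only by `concatenate(A_PLUS, B_PLUS)`.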

12 intersection (conjunction)
intersect two FSAs, result is an FSA
  – trick: use a (modified) set-of-states construction
example
  [figure: machine 1 with states s1, x, y; machine 2 with states s2, z; the intersection machine with states {s1,s2}, {x,s2}, {y,z}]
look familiar? that's because a⁺b* ∩ a*b⁺ = a⁺b⁺
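The modified set-of-states construction pairs one state from each machine and moves both in lockstep. A minimal Python sketch follows; the transition tables `M1` and `M2` are read off the state names in the slide's example as an assumption, since the figure itself did not survive.

```python
def intersect(d1, start1, ends1, d2, start2, ends2, alphabet):
    """Product construction over two deterministic machines."""
    start = (start1, start2)
    delta, seen, todo = {}, {start}, [start]
    while todo:
        p, q = todo.pop()
        for c in alphabet:
            if (p, c) in d1 and (q, c) in d2:  # both machines must be able to move
                target = (d1[(p, c)], d2[(q, c)])
                delta[((p, q), c)] = target
                if target not in seen:
                    seen.add(target)
                    todo.append(target)
    # A pair is an end state only if both components are end states.
    ends = {(p, q) for (p, q) in seen if p in ends1 and q in ends2}
    return delta, start, ends


def accepts(delta, start, ends, string):
    state = start
    for c in string:
        if (state, c) not in delta:
            return False
        state = delta[(state, c)]
    return state in ends


# Assumed transitions: M1 for a+b* (end states x, y), M2 for a*b+ (end state z).
M1 = {("s1", "a"): "x", ("x", "a"): "x", ("x", "b"): "y", ("y", "b"): "y"}
M2 = {("s2", "a"): "s2", ("s2", "b"): "z", ("z", "b"): "z"}
BOTH = intersect(M1, "s1", {"x", "y"}, M2, "s2", {"z"}, "ab")
```

Under these assumptions the reachable pairs are (s1,s2), (x,s2) and (y,z), matching the state labels on the slide.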

13 complementation
the negation or opposite FSA
  – with respect to Σ*, the set of all possible strings over the alphabet
  – i.e. accepts everything the original FSA rejects
  – and rejects everything the original FSA accepts
  – result is still an FSA
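A common way to build the complement, sketched here in Python, is to first make the deterministic machine total by adding a dead state for every missing transition, and only then swap end and non-end states. The example reuses the a⁺b⁺ machine with states s, x, y from the earlier slide; the state name `dead` and the (state, symbol) key order are assumptions of this sketch.

```python
def complement(delta, states, start, ends, alphabet, dead="dead"):
    """Complete the DFA with a dead state, then flip which states are end states."""
    full = dict(delta)
    all_states = set(states) | {dead}
    for q in all_states:
        for c in alphabet:
            full.setdefault((q, c), dead)  # missing moves go to the dead state
    return full, start, all_states - set(ends)


def accepts(delta, start, ends, string):
    state = start
    for c in string:
        state = delta[(state, c)]  # total after completion, so always defined
    return state in ends


# a+b+ machine: s -a-> x, x -a-> x, x -b-> y, y -b-> y, end state y.
DELTA = {("s", "a"): "x", ("x", "a"): "x", ("x", "b"): "y", ("y", "b"): "y"}
NOT_AB = complement(DELTA, {"s", "x", "y"}, "s", {"y"}, "ab")
```

Skipping the completion step would be a mistake: strings such as "ba" fall off the original machine, and they must end up accepted by the complement.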

14 Limits of Finite State Technology
Language = set of strings
case 1
  – suppose the set is finite
  – e.g. L = {ba, abc, ccb, dd}
  – easy to encode as an FSA (by closure under union)
  [figure: one chain of states s1, s2, ... per string, joined by ε-transitions from a new start state s0]
case 2
  – set is infinite
  – ...
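To make case 1 concrete, the Python sketch below builds an FSA for any finite set of strings. Note that it swaps in a different technique from the slide: instead of linking one chain per string with ε-transitions, it grows a single deterministic machine in which strings share common prefixes (a trie). The function names are assumptions of this sketch.

```python
def fsa_for_finite_language(words):
    """One path of states per word, shared along common prefixes; state 0 is the start."""
    delta, ends, next_id = {}, set(), 1
    for word in words:
        state = 0
        for ch in word:
            if (state, ch) not in delta:
                delta[(state, ch)] = next_id  # allocate a fresh state
                next_id += 1
            state = delta[(state, ch)]
        ends.add(state)  # the state reached at the end of a word is an end state
    return delta, 0, ends


def accepts(machine, string):
    delta, state, ends = machine
    for ch in string:
        if (state, ch) not in delta:
            return False
        state = delta[(state, ch)]
    return state in ends


FINITE = fsa_for_finite_language(["ba", "abc", "ccb", "dd"])
```

The number of states is bounded by the total length of the words plus one, so a finite language always fits in a finite machine.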

15 Limits of Finite State Technology
Language = set of strings
case 2
  – set is infinite
  – e.g. L = a⁺b⁺ = { ab, aab, abb, aabb, aaab, abbb, … }
    "one or more a's followed by one or more b's"
    we know this set is regular
    [figure: the FSA for a⁺b⁺ with states s, x, y]
  – however, consider L = {aⁿbⁿ | n ≥ 1} = { ab, aabb, aaabbb, … }
    "same number of b's as a's …"
    this set is not regular. Why?

16 The Limits of Finite State Technology
[Formally, we can use the Pumping Lemma to prove this particular case.]
informally,
  – we can build an FSA for each of …
  – ab
  – aabb
  – aaabbb
  – …
[figure: one chain-shaped FSA each for ab, aabb and aaabbb, with the end state marked]

17 The Limits of Finite State Technology
we can merge the individual FSAs for …
  – ab
  – aabb
  – aaabbb
[figure: the merged FSA for ab, aabb and aaabbb]
such direct encoding would require an infinite number of states
  – and we're using Finite State Automata
quite different from the infinity obtained by looping
  – freely iterate (no counting)
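The "no counting" point can be illustrated with a pigeonhole argument, sketched here in Python. With only finitely many states, reading a, aa, aaa, … must eventually revisit a state, so the machine cannot tell those two counts of a's apart. The example machine is the a⁺b⁺ recognizer completed with a `dead` state; its encoding and the function names are assumptions of this sketch, and the same search works for any total deterministic machine.

```python
# Total DFA for a+b+ (states s, x, y plus a dead state), keyed by (state, symbol).
DELTA = {
    ("s", "a"): "x", ("s", "b"): "dead",
    ("x", "a"): "x", ("x", "b"): "y",
    ("y", "a"): "dead", ("y", "b"): "y",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}


def run(string, state="s"):
    """Return the state reached after reading `string`."""
    for c in string:
        state = DELTA[(state, c)]
    return state


def confused_pair(num_states=4):
    """Find i < j such that a^i and a^j lead to the same state (pigeonhole)."""
    seen = {}
    state = "s"
    for i in range(num_states + 1):
        if state in seen:
            return seen[state], i
        seen[state] = i
        state = DELTA[(state, "a")]
    raise AssertionError("impossible: more prefixes than states without a repeat")


I, J = confused_pair()
```

Since a^I and a^J reach the same state, the machine treats a^I b^I and a^J b^I identically, yet only the first has matching counts. No finite machine can therefore accept exactly the strings with n a's followed by n b's.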

18 The Limits of Finite State Technology
example
  – L = a⁺b⁺ = { ab, abb, aab, aabb, aaab, abbb, … }
  – "one or more a's followed by one or more b's"
Note:
  – can be divided into two independent halves
  – each half can be replaced by iteration
[figure: separate chain-shaped FSAs for ab, aab, abb, aabb, aaab and abbb]

19 The Limits of Finite State Technology
example (continued)
[figure: the chain-shaped FSAs from the previous slide are merged step by step, using ε-transitions from a new start state s0, until the runs of a's and b's are each replaced by a loop]