1
LING 438/538 Computational Linguistics Sandiway Fong Lecture 8: 9/29
2
Administrivia
reminder
– homework 2 due tonight
3
Last Time
regular grammars
– aka Chomsky hierarchy type-3 grammars
– are formal grammars with severe restrictions on what can appear on the RHS
– are limited in generative capacity or power
– in Prolog DCG notation:
    x --> y, [t].
    x --> [t].
  (left recursive variant) or
    x --> [t], y.
    x --> [t].
  (right recursive variant)
– can't have both left and right recursive rules in the same grammar
4
Last Time
regular grammars
examples of regular languages
– "one or more a's followed by one or more b's"
– sheeptalk {ba!, baa!, baaa!, ...}, i.e. b followed by one or more a's, then !
– can be encoded by a regular grammar
beyond regular grammars
examples
– a^n b^n = {ab, aabb, aaabbb, ...}
– ww^R, where w ∈ {a,b}+, i.e. w is any non-empty sequence of a's and b's
informal idea about the crucial difference
– "needing to keep track of history"
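The "keeping track of history" idea can be sketched in Python. This is an illustration only, not code from the lecture, and the function names are made up: the recognizer for "one or more a's followed by one or more b's" needs only a fixed number of states, while the recognizer for a^n b^n needs a counter that can grow without bound.

```python
def accepts_a_plus_b_plus(s: str) -> bool:
    """a+b+: a bounded memory (one of three phases) is enough."""
    state = "start"
    for ch in s:
        if ch == "a" and state in ("start", "as"):
            state = "as"
        elif ch == "b" and state in ("as", "bs"):
            state = "bs"
        else:
            return False
    return state == "bs"


def accepts_anbn(s: str) -> bool:
    """a^n b^n (n >= 1): must remember how many a's were seen."""
    count = 0        # unbounded counter: this is the "history"
    seen_b = False
    for ch in s:
        if ch == "a" and not seen_b:
            count += 1
        elif ch == "b" and count > 0:
            seen_b = True
            count -= 1
        else:
            return False
    return seen_b and count == 0


print(accepts_a_plus_b_plus("aab"))  # in a+b+, counts need not match
print(accepts_anbn("aab"))           # rejected: one b is missing
print(accepts_anbn("aaabbb"))
```

The first function could be drawn as a finite state machine; the second cannot, because `count` has no upper limit.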
5
Today's Topic
Finite State Automata
– plus more on what it means to be a regular language
Merge Point
– Textbook, Chapter 2: Regular Expressions and Automata
6
Today's Topic
Finite State Automata
– plus more on what it means to be a regular language
formally equivalent
– in terms of generative capacity or power
[Diagram: Regular Grammars (+ left & right recursive rules), FSA and Regular Expressions, all describing the Regular Languages]
7
Some Regular Expression Notation...
some notation first (more on regexps next time)
Regular Expressions (regexp)
– shorthand for describing sets of strings
Operators:
– string+ : set of one or more occurrences of string
    a+ = {a, aa, aaa, aaaa, aaaaa, …}
    (abc)+ = {abc, abcabc, abcabcabc, …}
– Note: parentheses are used to delimit the scope of the operator
– string* : set of zero or more occurrences of string
    a* = {ε, a, aa, aaa, aaaa, …}
    (abc)* = {ε, abc, abcabc, …}
– Note: ε is the zero-length string
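The two operators can be tried out with Python's standard `re` module. This is a small sketch, not part of the lecture; the helper name `in_language` is invented here. Python's regex engine supports more than formal regular expressions, so only `+`, `*` and parentheses are used. `re.fullmatch` is used so that the whole string, not just a prefix, must belong to the set.

```python
import re


def in_language(pattern: str, s: str) -> bool:
    """True if the entire string s is in the set described by pattern."""
    return re.fullmatch(pattern, s) is not None


print(in_language(r"a+", "aaa"))         # one or more a's
print(in_language(r"a+", ""))            # the empty string is not in a+
print(in_language(r"a*", ""))            # but it is in a*
print(in_language(r"(abc)+", "abcabc"))  # parentheses set the operator's scope
print(in_language(r"(abc)*", "abca"))    # incomplete repetition is rejected
```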
8
Some Regular Expression Notation...
some notation first
Relation between * and +
– a a* = a+
– "a concatenated with a*"
– a {ε, a, aa, aaa, aaaa, …} = {a, aa, aaa, aaaa, aaaaa, …}
Operators:
– string^n : exactly n occurrences of string
    a^4 b^3 = {aaaabbb}
Language = a set of strings
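The identity a a* = a+ can be spot-checked by brute force over all short strings. The sketch below is an illustration under assumptions: `strings_up_to` is a made-up helper, the check only covers strings up to a fixed length (so it is evidence, not a proof), and Python's `a{4}b{3}` is used to stand in for a^4 b^3.

```python
import re
from itertools import product


def strings_up_to(alphabet: str, max_len: int):
    """Yield every string over alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


# aa* and a+ should accept exactly the same strings.
same = all(
    (re.fullmatch(r"aa*", s) is None) == (re.fullmatch(r"a+", s) is None)
    for s in strings_up_to("ab", 6)
)
print(same)

# a^4 b^3 describes a language containing a single string.
a4b3 = [s for s in strings_up_to("ab", 7) if re.fullmatch(r"a{4}b{3}", s)]
print(a4b3)
```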
9
Regular Expressions
regular expressions
– formally equivalent to regular grammars and finite state automata
– How to show this? Proof by construction…
beyond regular expressions
– examples
    {a^n b^n | n > 0} is not regular
    {ww^R | w ∈ {a,b}+} is not regular, where R is reversal, e.g. (abc)^R = cba
– How to show this?
– Proof by Pumping Lemma
[Diagram: Regular Grammars, FSA, Regular Expressions]
10
Regular Expressions
Example:
– Language: L = {a+b+}
  "one or more a's followed by one or more b's"
regular language
– described by a regular expression
Note:
– infinite set of strings belonging to language L
    e.g. abbb, aaaab, aabb, *abab, *ε
Notation:
– ε is the empty string (or string with zero length)
– * means the string is not in the language
regular grammar
    s --> [a], b.
    b --> [a], b.
    b --> [b], c.
    b --> [b].
    c --> [b], c.
    c --> [b].
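To see that this grammar really generates a+b+, the six rules can be transcribed into a small Python table and interpreted directly. This is a sketch for illustration, not the Prolog the course uses: the dictionary encoding and the function name `derives` are assumptions made here. Each rule is stored as (terminal, next nonterminal), with `None` marking a rule that ends the derivation.

```python
# Right-recursive regular grammar for a+b+, one entry per DCG rule.
GRAMMAR = {
    "s": [("a", "b")],                          # s --> [a], b.
    "b": [("a", "b"), ("b", "c"), ("b", None)], # b --> [a], b.  b --> [b], c.  b --> [b].
    "c": [("b", "c"), ("b", None)],             # c --> [b], c.  c --> [b].
}


def derives(nonterminal: str, s: str) -> bool:
    """True if nonterminal can derive exactly the string s."""
    if not s:
        return False  # every rule consumes one terminal, so ε is never derived
    head, rest = s[0], s[1:]
    for terminal, nxt in GRAMMAR[nonterminal]:
        if terminal != head:
            continue
        if nxt is None:
            if rest == "":
                return True
        elif derives(nxt, rest):
            return True
    return False


for example in ["abbb", "aaaab", "aabb", "abab", ""]:
    print(repr(example), derives("s", example))
```

The first three strings are accepted and the last two are rejected, matching the starred examples on the slide.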
11
Finite State Automata (FSA)
[Diagram: FSA with states s, x, y and arcs labeled a, a, b, b]
L = {a+b+}
L = {aa*bb*}
deterministic FSA (DFSA)
– no ambiguity about where to go at any given state
non-deterministic FSA (NDFSA)
– no restriction on ambiguity (surprisingly, no increase in power)
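One way to see why non-determinism adds no power is that an NDFSA can be run deterministically by tracking the set of states it could be in. The sketch below is illustrative only: the three-state machine (states 0, 1, 2) is a hypothetical NDFSA for a+b+ invented for this example, not the machine drawn on the slide.

```python
# Hypothetical NDFSA for a+b+: on 'a' state 0 may stay or move on,
# on 'b' state 1 may stay or move to the final state 2.
NFA = {
    (0, "a"): {0, 1},
    (1, "b"): {1, 2},
}
START = 0
FINAL = {2}


def nfa_accepts(s: str) -> bool:
    """Simulate the NDFSA by following all choices at once."""
    current = {START}
    for ch in s:
        # Union of every state reachable from any current state on ch.
        current = set().union(*(NFA.get((q, ch), set()) for q in current))
    return bool(current & FINAL)


for example in ["ab", "aabbb", "abab", "a", ""]:
    print(repr(example), nfa_accepts(example))
```

Because there are only finitely many subsets of states, each subset can itself be treated as a state of a deterministic machine.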
12
Finite State Automata (FSA)
more formally
– (Q, s, f, Σ, δ)
1. set of states (Q): {s, x, y} (must be a finite set)
2. start state (s): s
3. end state(s) (f): y
4. alphabet (Σ): {a, b}
5. transition function δ:
   signature: character × state → state
   δ(a,s) = x
   δ(a,x) = x
   δ(b,x) = y
   δ(b,y) = y
[Diagram: FSA with states s, x, y and arcs labeled a, a, b, b]
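The five components map almost one-to-one onto a small Python program. This is a sketch under assumptions (the variable names and the `accepts` function are invented here; the course itself uses Prolog): the transition function is stored as a dictionary keyed by (character, state), following the signature given above, and a missing entry means the machine gets stuck and rejects.

```python
Q = {"s", "x", "y"}   # 1. finite set of states
START = "s"           # 2. start state
FINAL = {"y"}         # 3. end state(s)
SIGMA = {"a", "b"}    # 4. alphabet
DELTA = {             # 5. transition function: (character, state) -> state
    ("a", "s"): "x",
    ("a", "x"): "x",
    ("b", "x"): "y",
    ("b", "y"): "y",
}


def accepts(string: str) -> bool:
    """Run the deterministic FSA over string, one character at a time."""
    state = START
    for ch in string:
        if (ch, state) not in DELTA:
            return False          # no transition defined: reject
        state = DELTA[(ch, state)]
    return state in FINAL


for example in ["ab", "aaabb", "abab", "a", ""]:
    print(repr(example), accepts(example))
```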
13
Finite State Automata (FSA)
practical applications
– can be encoded and run efficiently on a computer
– widely used
    encode regular expressions
    compress large dictionaries
    morphological analyzers
      different word forms, e.g. want, wanted, unwanted (suffixation/prefixation)
      see chapter 3 of textbook
    speech recognizers
      Markov models = FSA + probabilities
– and many more …
14
Finite State Automata (FSA)
– T9 text entry (tegic.com)
    built in to your cellphone
    predictive text entry for mobile messaging/data entry
    reduces the number of keystrokes for inputting words on a telephone keypad (8 keys)
examples
    how: 3 vs. 6 keystrokes
    michael: 7 vs. 15 keystrokes
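The keystroke counts above can be reproduced with a short calculation. The sketch rests on assumptions not stated on the slide: the larger number is taken to be multi-tap entry (press a key repeatedly to cycle through its letters) on the standard letter layout for keys 2 to 9, and the smaller number is one press per letter, ignoring any extra presses needed to pick among candidate words. The function names are invented for this example.

```python
# Standard letter assignment for the 8 letter keys of a phone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def key_sequence(word: str) -> str:
    """Digits pressed when each letter costs a single keypress."""
    return "".join(
        key for ch in word for key, letters in KEYPAD.items() if ch in letters
    )


def multitap_keystrokes(word: str) -> int:
    """Presses needed when a key is tapped repeatedly to reach a letter."""
    return sum(
        letters.index(ch) + 1
        for ch in word
        for letters in KEYPAD.values()
        if ch in letters
    )


for word in ["how", "michael"]:
    print(word, key_sequence(word), len(key_sequence(word)), multitap_keystrokes(word))
```

Under these assumptions the totals come out as 3 vs. 6 for "how" and 7 vs. 15 for "michael".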
15
RegExp FSA
From Regular Expression to FSA
Operators
– a : single symbol a
– a^n : n occurrences of a
[Diagrams: FSA for a (one arc labeled a); FSA for a^n, shown for a^3 (three arcs labeled a in sequence)]
16
RegExp FSA
Operators
– a* : zero or more occurrences of a
– a+ : one or more occurrences of a
– a+ = aa*
[Diagrams: FSA for a*; FSA for a+, with arcs labeled a]
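The constructions for a^n, a* and a+ can be written as small functions that build a machine for each operator. This is a sketch under assumptions: the representation (start state, set of final states, transition dictionary) and the function names are invented here, and the exact shape of the machines on the slides is not recoverable from the transcript, so these are one natural choice. The machine for a+ is built as "a followed by a*", following a+ = aa*.

```python
def power(sym: str, n: int):
    """FSA for sym^n: a chain of n arcs, states 0..n, final state n."""
    return 0, {n}, {(i, sym): i + 1 for i in range(n)}


def star(sym: str):
    """FSA for sym*: one state that is both start and final, with a loop."""
    return 0, {0}, {(0, sym): 0}


def plus(sym: str):
    """FSA for sym+: one arc into a final state that loops (sym sym*)."""
    return 0, {1}, {(0, sym): 1, (1, sym): 1}


def run(fsa, s: str) -> bool:
    """Run a deterministic FSA built by the functions above."""
    state, finals, delta = fsa
    for ch in s:
        if (state, ch) not in delta:
            return False
        state = delta[(state, ch)]
    return state in finals


print(run(power("a", 3), "aaa"), run(power("a", 3), "aa"))
print(run(star("a"), ""), run(star("a"), "aaaa"))
print(run(plus("a"), ""), run(plus("a"), "a"))
```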
17
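Regular Grammar FSA
examples
– s --> [a], t.
– x --> [a], x.
– x --> [a].
[Diagrams: arc labeled a from state s to state t; arc labeled a from state x back to x; arc labeled a from state x to final state y]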
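The rule-to-arc correspondence can be sketched in Python. This is an illustration under assumptions, not the lecture's Prolog: each nonterminal becomes a state, a rule like x --> [a], x becomes an arc from x to x labeled a, and a rule like x --> [a] becomes an arc labeled a into a single final state, named "y" here to match the slide. The tuple encoding of rules and the function names are invented for the example. Since x has two arcs labeled a, the resulting machine is non-deterministic and is run by tracking a set of states.

```python
from collections import defaultdict

FINAL = "y"  # assumed name for the one final state; must not clash with a nonterminal


def grammar_to_fsa(rules):
    """rules: (lhs, terminal, rhs) triples; rhs is None for rules like x --> [a]."""
    arcs = defaultdict(set)
    for lhs, terminal, rhs in rules:
        arcs[(lhs, terminal)].add(rhs if rhs is not None else FINAL)
    return arcs


def accepts(arcs, start: str, s: str) -> bool:
    """Follow every possible arc at once; accept if the final state is reachable."""
    current = {start}
    for ch in s:
        current = {nxt for q in current for nxt in arcs.get((q, ch), ())}
    return FINAL in current


# x --> [a], x.   and   x --> [a].
arcs = grammar_to_fsa([("x", "a", "x"), ("x", "a", None)])
print(sorted(arcs[("x", "a")]))
for example in ["a", "aaa", "", "ab"]:
    print(repr(example), accepts(arcs, "x", example))
```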
18
Next Time Prolog and FSA