LING/C SC/PSYC 438/538 Lecture 11 Sandiway Fong
Administrivia Homework 3 graded
Last Time
1. Introduced Regular Languages
– can be generated by regular expressions
– or Finite State Automata (FSA)
– or regular grammars (not yet introduced)
2. Deterministic and non-deterministic FSA
3. DFSA can be easily encoded in Perl:
– hash table for the transition function
– foreach loop over a string (character by character)
– conditional to check for end state
4. NDFSA can be converted into DFSA
– example of the set of states construction
– Practice: ungraded homework exercise
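The three ingredients in item 3 can be sketched as follows. This is a minimal illustration, not necessarily the code shown in lecture: the machine (for a+b+, with state names s, x, y), the hash-of-hashes layout, and the sub name accepts are all assumptions made for the example.

```perl
use strict;
use warnings;

# Assumed example DFSA for a+b+ : s -a-> x, x -a-> x, x -b-> y, y -b-> y
# Transition function as a hash table: $delta{state}{character} = next state
my %delta = (
    s => { a => 'x' },
    x => { a => 'x', b => 'y' },
    y => { b => 'y' },
);
my %final = ( y => 1 );                    # set of end states

sub accepts {
    my ($string) = @_;
    my $state = 's';                       # start state
    foreach my $c (split //, $string) {    # character by character
        $state = $delta{$state}{$c};
        return 0 unless defined $state;    # no transition defined: reject
    }
    return $final{$state} ? 1 : 0;         # conditional: did we stop in an end state?
}

print "$_: ", (accepts($_) ? "accept" : "reject"), "\n" for qw(ab aabbb a ba);
```

Because each state/character pair maps to at most one next state, the loop never has to keep track of more than one current state; that is what makes the machine deterministic.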
Ungraded Homework Exercise
Do not submit; do the following exercise to check your understanding:
– apply the set-of-states construction technique to the two machines on the ε-transition slide (repeated below)
– self-check your answer: verify in each case that the machine produced is deterministic and accurately simulates its ε-transition counterpart
[Figure: two machines, each with arcs labeled a, ε and b]
Ungraded Homework Exercise Review
Converting a NDFSA into a DFSA
[Figure: NDFSA with states 1, 2, 3 and arcs labeled a, ε, b. Note: this machine with an ε-transition is non-deterministic]
[Figure: DFSA with states {1,3}, {2}, {3} and arcs labeled a, b. Note: this machine is deterministic]
Ungraded Homework Exercise Review
Converting a NDFSA into a DFSA
[Figure: NDFSA with states 1, 2, 3 and arcs labeled a, ε, b. Note: this machine with an ε-transition is non-deterministic]
[Figure: DFSA with states {1,2}, {2}, {3} and arcs labeled a, b, b. Note: this machine is deterministic]
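The set-of-states construction can be sketched in Perl as below. The arc layouts of the two machines are assumptions reconstructed from the state sets on the review slides (machine 1: 1 -a-> 2, 2 -b-> 3, 1 -ε-> 3; machine 2: 1 -a-> 2, 2 -b-> 3, 1 -ε-> 2); the label eps and all sub names are hypothetical.

```perl
use strict;
use warnings;

# Targets of the arcs leaving state $q with label $label (empty list if none)
sub targets {
    my ($nfa, $q, $label) = @_;
    my $arcs = $nfa->{$q} || {};
    return @{ $arcs->{$label} || [] };
}

# ε-closure: every state reachable from @states using only 'eps' arcs
sub closure {
    my ($nfa, @states) = @_;
    my %seen = map { $_ => 1 } @states;
    my @todo = @states;
    while (defined(my $q = shift @todo)) {
        for my $r (targets($nfa, $q, 'eps')) {
            push @todo, $r unless $seen{$r}++;
        }
    }
    return sort keys %seen;
}

sub set_name { return '{' . join(',', @_) . '}' }

# Build the DFSA whose states are sets of NDFSA states
sub determinize {
    my ($nfa, $start, $alphabet) = @_;
    my @start = closure($nfa, $start);
    my (%dfa, %done);
    my @todo = ([@start]);
    while (my $set = shift @todo) {
        my $from = set_name(@$set);
        next if $done{$from}++;                      # already expanded
        for my $c (@$alphabet) {
            my %next = map { $_ => 1 } map { targets($nfa, $_, $c) } @$set;
            next unless %next;                       # no arc on $c from this set
            my @to = closure($nfa, keys %next);
            $dfa{$from}{$c} = set_name(@to);
            push @todo, [@to];
        }
    }
    return (set_name(@start), \%dfa);
}

# Assumed arc layouts for the two exercise machines
my %m1 = ( 1 => { a => [2], eps => [3] }, 2 => { b => [3] } );
my %m2 = ( 1 => { a => [2], eps => [2] }, 2 => { b => [3] } );

for my $m (\%m1, \%m2) {
    my ($s0, $dfa) = determinize($m, 1, ['a', 'b']);
    print "start state $s0\n";
    for my $from (sort keys %$dfa) {
        print "  $from on $_ goes to $dfa->{$from}{$_}\n" for sort keys %{ $dfa->{$from} };
    }
}
```

Under these assumptions the start states come out as {1,3} and {1,2}, matching the slides. A set is an end state of the DFSA whenever it contains an end state of the NDFSA; the sketch leaves that bookkeeping out.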
Last Time
Regular Languages
Three formalisms
– All formally equivalent (no difference in expressive power)
– i.e. if you can encode it using a RE, you can do it using a FSA or regular grammar, and so on …
[Figure: Regular Languages at the center of the three formalisms (Regular Grammars, FSA, Regular Expressions); Perl regular expressions stuff lies outside]
Talk more about formal equivalence later today…
Perl Regular Expressions
Perl regex can include backreferences to groupings (i.e. \1, etc.)
– backreferences give Perl regexs expressive power beyond regular languages: e.g. a regex with a backreference can test for prime numbers, yet the set of prime numbers is not a regular language
L_prime = {2, 3, 5, 7, 11, 13, 17, 19, 23, …}
– non-regularity can be proved using the Pumping Lemma for regular languages (later)
Can have regular Perl code inside a regex
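One well-known way to see the extra power is the unary primality idiom sketched below. It assumes numbers are encoded in unary (n is written as n copies of 1); the sub name is_prime is hypothetical, and this is not necessarily the regex used in lecture.

```perl
use strict;
use warnings;

# /^(11+)\1+$/ matches a string of ones that splits into two or more copies
# of a block of two or more ones, i.e. a composite number in unary.
sub is_prime {
    my ($n) = @_;
    return 0 if $n < 2;                           # 0 and 1 are not prime
    return (('1' x $n) !~ /^(11+)\1+$/) ? 1 : 0;  # prime = not composite
}

print join(' ', grep { is_prime($_) } 1 .. 25), "\n";
# prints: 2 3 5 7 11 13 17 19 23
```

The backreference \1 forces every later block to be an exact copy of the first one, which requires remembering an unbounded amount of input.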
Backreferences and FSA
Deep question:
– why are backreferences impossible in FSA?
Example: Suppose you wanted a machine that accepted /(a+b+)\1/
[Figure: FSA for a+b+ with states s, x, y and arcs labeled a, a, b, b]
One idea: link two copies of the machine together
[Figure: the same machine followed by a second copy with states x2, y2 and arcs labeled a, a, b, b]
Doesn’t work! Why?
Perl implementation:
– how to modify it to get the backreference effect?
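A quick experiment makes the difference concrete. Linking two copies of the machine corresponds to the regex (a+b+)(a+b+), so the sketch below compares that with the backreference version on a few strings. The anchors ^ and $ are added here so that whole strings are tested, and the sub names are hypothetical.

```perl
use strict;
use warnings;

# Backreference: the second half must repeat the first half exactly
sub with_backref { return $_[0] =~ /^(a+b+)\1$/ ? 1 : 0 }

# Two linked copies of the a+b+ machine: the second half is only another a+b+
sub two_copies   { return $_[0] =~ /^(a+b+)(a+b+)$/ ? 1 : 0 }

for my $s (qw(abab aabaab abaabb aabbab)) {
    printf "%-7s backreference: %-3s  two copies: %s\n", $s,
        (with_backref($s) ? 'yes' : 'no'),
        (two_copies($s)   ? 'yes' : 'no');
}
```

The two-copy machine also accepts strings such as abaabb, where the counts of a and b differ between the halves: a finite set of states cannot remember how many a's and b's the first copy consumed.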
Regular Languages and FSA
Formal (constructive) set-theoretic definition of a regular language
Correspondence between REs and Regular Languages:
– concatenation (juxtaposition)
– union ( | also [ ] )
– Kleene closure ( * ) (note: x+ = xx*)
Note: backreferences are memory devices and thus are too powerful, e.g. L = {ww} and prime number testing (earlier slides)
Regular Languages and FSA Other closure properties: Not true higher up: e.g. context-free grammars as we’ll see later
Equivalence: FSA and Regexs
Textbook gives one direction only
Case by case:
a) Empty string
b) Empty set
c) Any character from the alphabet
Equivalence: FSA and Regexs Concatenation: – Link final state of FSA 1 to initial state of FSA 2 using an empty transition Note: empty transition can be eliminated using the set of states construction (see earlier slides in this lecture)
Equivalence: FSA and Regexs Kleene closure: – repetition operator: zero or more times – use empty transitions for loopback and bypass
Equivalence: FSA and Regexs Union: aka disjunction – Non-deterministically run both FSAs at the same time, accept if either one accepts
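The cases on these slides can be put together in a small Perl sketch that builds machines with empty transitions and then simulates them. It assumes every machine fragment has exactly one start and one final state (a simplification made here, in the style of the standard textbook construction); the arc-list representation and all sub names are hypothetical.

```perl
use strict;
use warnings;

my $next_state = 0;
my @arcs;                              # each arc is [from, label, to]; label undef = ε

sub new_state { return $next_state++ }
sub arc       { push @arcs, [@_] }
sub fragment  { return { start => $_[0], final => $_[1] } }

# Base cases: empty string, empty set, single character
sub empty_string { my ($s, $f) = (new_state(), new_state()); arc($s, undef, $f); return fragment($s, $f) }
sub empty_set    { return fragment(new_state(), new_state()) }   # final state unreachable
sub char         { my ($s, $f) = (new_state(), new_state()); arc($s, $_[0], $f); return fragment($s, $f) }

# Concatenation: ε from the final state of FSA 1 to the start state of FSA 2
sub concat {
    my ($m1, $m2) = @_;
    arc($m1->{final}, undef, $m2->{start});
    return fragment($m1->{start}, $m2->{final});
}

# Union: a new start state branches by ε into both machines
sub union {
    my ($m1, $m2) = @_;
    my ($s, $f) = (new_state(), new_state());
    arc($s, undef, $_->{start}) for $m1, $m2;
    arc($_->{final}, undef, $f) for $m1, $m2;
    return fragment($s, $f);
}

# Kleene closure: ε loopback (repeat) and ε bypass (zero times)
sub star {
    my ($m) = @_;
    my ($s, $f) = (new_state(), new_state());
    arc($s, undef, $m->{start});
    arc($m->{final}, undef, $f);
    arc($m->{final}, undef, $m->{start});   # loopback
    arc($s, undef, $f);                     # bypass
    return fragment($s, $f);
}

# All states reachable from @_ through ε arcs only
sub closure {
    my %set  = map { $_ => 1 } @_;
    my @todo = @_;
    while (defined(my $q = shift @todo)) {
        for my $t (grep { $_->[0] == $q && !defined($_->[1]) } @arcs) {
            push @todo, $t->[2] unless $set{ $t->[2] }++;
        }
    }
    return keys %set;
}

# Simulate by tracking the set of states the machine could be in
sub accepts {
    my ($m, $string) = @_;
    my @current = closure($m->{start});
    for my $c (split //, $string) {
        my %in = map { $_ => 1 } @current;
        my @moved = map  { $_->[2] }
                    grep { $in{ $_->[0] } && defined($_->[1]) && $_->[1] eq $c } @arcs;
        @current = closure(@moved);
    }
    return (grep { $_ == $m->{final} } @current) ? 1 : 0;
}

# Example: the machine for (ab)*|b
my $m = union(star(concat(char('a'), char('b'))), char('b'));
printf "%-6s %s\n", "'$_'", (accepts($m, $_) ? 'accept' : 'reject') for '', qw(ab abab b a abb ba);
```

The simulation keeps a set of current states instead of a single state, which is the same idea as the set-of-states construction, carried out on the fly rather than ahead of time.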
Regular Languages and FSA Other closure properties: Let’s consider building the FSA machinery for each of these guys in turn…
Regular Expressions from FSA
Textbook Exercise: find a RE for
Examples (* denotes string not in the language):
*ab
*ba
bab
λ (empty string)
bb
*baba
babab
Regular Expressions from FSA
Draw a FSA and convert it to a RE:
[Figure: FSA with arcs labeled b, a, b, b, b and an ε-transition; PowerPoint animation of the conversion]
RE = b+(ab+)* | ε
Regular Expressions from FSA
Perl implementation:
    $s = "ab ba bab bb baba babab";
    while ($s =~ /\b(b+(ab+)*)\b/g) {
        print "$1 match!\n";
    }
Output of perl test.perl:
    bab match!
    bb match!
    babab match!
Note: doesn’t include the empty string case
Note: /../g global flag for multiple matches
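To cover the empty string as well, one option is to test each candidate string as a whole and make the group optional. This is a sketch under that assumption; the sub name in_language is hypothetical, and the test strings are the ones from the exercise.

```perl
use strict;
use warnings;

# Whole-string test for b+(ab+)* | ε : the optional group covers the empty string
sub in_language {
    my ($w) = @_;
    return $w =~ /\A(?:b+(?:ab+)*)?\z/ ? 1 : 0;
}

for my $w ('', qw(ab ba bab bb baba babab)) {
    my $shown = length($w) ? $w : '(empty string)';
    print in_language($w) ? " " : "*", "$shown\n";   # * marks strings not in the language
}
```

The anchors \A and \z take the place of the word boundaries \b, which cannot be used to pick out an empty word inside a longer string.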