Presentation is loading. Please wait.

Presentation is loading. Please wait.

Text search and closure properties

Similar presentations


Presentation on theme: "Text search and closure properties"— Presentation transcript:

1 Text search and closure properties
The Chinese University of Hong Kong Fall 2010 CSCI 3130: Automata theory and formal languages Text search and closure properties Andrej Bogdanov

2 Text search

3 The program grep grep -E regexp file
Searches for the occurrence of patterns matching a regular expression cat|12 {cat, 12} union [abc] {a, b, c} shorthand for a|b|c [ab][12] {a1, a2, b1, b2} concatenation (ab)* {e, ab, abab, ...} star [ab]? {e, a, b} zero or one (cat)+ {cat, catcat, ...} one or more [ab]{2} {aa, ab, ba, bb} {n} copies

4 Searching with grep Words containing savor or savour
Words with 5 consecutive a or b grep –E `savou?r` words grep –E `[ab]{5}` words outsavor savor savored savorer savorily savoriness savoringly savorless savorous savorsome savory savour unsavored unsavoredly unsavoredness unsavorily unsavoriness unsavory grabbable grep –E `zo+zo+` words zoozoo

5 More grep commands a n f a s u f f e r l r e g u l a r e n s t a r
. any symbol 1 2 3 4 5 [a-z] anything in a range a l e n \< beginning of line n f a $ end of line s u f f e r r e g u l a r 1 If a DFA is too hard, I do an ... 2 CSCI 3130 homework makes me ... s t a r 3 If a DFA likes it, it is ... 4 $ = $10? 5 I study 3130 hard because it will make me ... grep –E `\<.ff.u..t` words

6 how do you look for... Words that start in cat and have another cat
grep –E `\<cat.*cat` words Words with at least ten vowels? grep –E `([aeiouy].*){10}` words Words without any vowels? [^R] does not contain R grep –E `\<[^AEIOUYaeiouy]*$` words Words with exactly ten vowels? grep –E `\<[^AEIOUYaeiouy]* ([aeiouy][^AEIOUYaeiouy]*){10}$` words

7 How grep (could) work regular expression NFA NFA without e DFA input
text file differences in class in grep [ab]?, a+, (cat){3} not allowed allowed input handling matches whole looks for pattern output accept/reject finds pattern

8 Implementation of grep
How do you handle expressions like: [ab]? zero or one → ()|[ab] R? → e|R (cat)+ one or more → (cat)(cat)* R+ → RR* a{3} {n} copies → aaa R{n} → RR...R n times ? [^aeiouy] not containing any

9 Closure properties

10 Example The language L of strings that end in 101 is regular
How about the language L of strings that do not end in 101? (0+1)*101

11 Example Hint: w does not end in 101 if and only if it ends in: or it has length 0, 1, or 2 So L can be described by the regular expression 000, 001, 010, 011, 100, 110 or 111 (0+1)*( ) + e + (0 + 1) + (0 + 1)(0 + 1)

12 Complement The complement L of a language L is the set of all strings that are not in L Examples (S = {0, 1}) L1 = all strings that end in 101 L1 = all strings that do not end in = all strings end in 000, …, 111 or have length 0, 1, or 2 L2 = 1* = {e, 1, 11, 111, …} L2 = all strings that contain at least one = (0 + 1)*0(0 + 1)*

13 Example The language L of strings that contain 101 is regular
How about the language L of strings that do not contain 101? (0+1)*101(0+1)* You can write a regular expression, but it is a lot of work!

14 Closure under complement
If L is a regular language, so is L. To argue this, we can use any of the equivalent definitions for regular languages: The DFA definition will be most convenient We assume L has a DFA, and show L also has a DFA regular expression NFA DFA

15 Arguing closure under complement
Suppose L is regular, then it has a DFA M Now consider the DFA M’ with the accepting and rejecting states of M reversed accepts L accepts strings not in L this is exactly L

16 NO! Food for thought Can we do the same thing with an NFA? (0+1)*10
0, 1 q0 q1 q2 (0+1)*10 1 0, 1 q0 q1 q2 (0+1)*

17 Intersection The intersection L  L’ is the set of strings that are in both L and L’ Examples: If L, L’ are regular, is L  L’ also regular? L = (0 + 1)*11 L’ = 1* L  L’ = 1*11 L = (0 + 1)*10 L’ = 1* L  L’ =

18 Closure under intersection
If L and L’ are regular languages, so is L  L’. To argue this, we can use any of the equivalent definitions for regular languages: regular expression NFA DFA Suppose L and L’ have DFAs, call them M and M’ Goal: Construct a DFA (or NFA) for L  L’

19 An example M M’ L = even number of 0s L’ = odd number of 1s
M s0 s1 1 M’ L = even number of 0s L’ = odd number of 1s r0, s1 1 r0, s0 r1, s1 r1, s0 L  L’ = even number of 0s and odd number of 1s

20 Closure under intersection
M and M’ DFA for L  L’ states Q = {r1, ..., rn} Q’ = {s1, ..., sn’} Q × Q’ = {(r1, s1), (r1, s2), ..., (r2, s1), ..., (rn, sn’)} start state ri for M sj for M’ (ri, sj) accepting states F for M F’ for M’ F × F’ = {(ri, sj): ri ∈ F, sj ∈ F’} Whenever M is in state ri and M’ is in state sj, the DFA for L  L’ will be in state (ri, sj)

21 Closure under intersection
M and M’ DFA for L  L’ ri rj a transitions in M ri, si’ rj, sj’ a si’ sj’ a in M’

22 Reversal The reversal wR of a string w is w written backwards
The reversal LR of a language L is the language obtained by reversing all its strings w = cave wR = evac L = {cat, dog} LR = {tac, god}

23 Reversal of regular languages
L = all strings that end in 101 is regular How about LR? This is the language of all strings beginning in 101 Yes, because it is represented by (0+1)*101 101(0+1)*

24 Closure under reversal
If L is a regular language, so is LR. How do we argue? regular expression NFA DFA

25 Arguing closure under reversal
Take a regular expression E for L We will show how to reverse E A regular expression can be of the following types: The special symbols  and e Alphabet symbols like a, b The union, concatenation, or star of simpler expressions

26 Proof of closure under reversal
regular expression E reversal ER Æ Æ e e a (alphabet symbol) a E1 + E2 E1R + E2R E1E2 E2RE1R E1* (E1R)*

27 ? A question LDUP = {ww: w  L} L = {cat, dog} LDUP = {catcat, dogdog}
Ex. LDUP = {catcat, dogdog} If L is regular, is LDUP also regular? ? regular expression NFA DFA

28 LDUP = LL A question Let’s try with regular expression: L = {a, b}
Let’s try with NFA: L = {a, b} LDUP = {aa, bb} LL = {aa, ab, ba, bb} LDUP = LL q0 q1 NFA for L

29 An example L = 0*1 is regular LDUP = {1, 01, 001, 0001, ...}
= {0n10n1: n ≥ 0} Let’s try to design an NFA for LDUP

30 An example LDUP = {11, 0101, 001001, 00010001, ...} = {0n10n1: n ≥ 0}
1 01 1 001 1 0001 1 LDUP = {11, 0101, , , ...} = {0n10n1: n ≥ 0}


Download ppt "Text search and closure properties"

Similar presentations


Ads by Google