
1 Regular Expressions: Definitions; Equivalence to Finite Automata
Midterm exam 10-8-14. Review for midterm 10-6-14.

2 RE’s: Introduction
• Regular expressions are an algebraic way to describe regular languages.
• If RE is a regular expression, then L(RE) denotes the language it defines.
• RE’s and their languages are defined recursively.

3 Introduction 2
• The recursive description of languages derived from RE’s involves 3 basic operations on languages: union, concatenation, and closure.
• The union of L and M is the set of all strings that are in L, in M, or in both.
• {001,10,111} ∪ {ε,001} = {ε,10,001,111}

4 Introduction 3
• Concatenation of L and M is sometimes denoted by a “dot” (L.M).
• Most often it is denoted simply by LM.
• LM is the set of all strings that can be formed by concatenating any string in L with any string in M.
• {001,10,111}.{ε,001} = {001,10,111,001001,10001,111001}
  ◦ Note: left-right order is preserved.
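The union and concatenation examples above can be reproduced with Python sets. This is a minimal sketch: the function names `union` and `concat` are illustrative, and the empty string `""` stands in for ε.

```python
def union(L, M):
    """Strings that are in L, in M, or in both."""
    return L | M

def concat(L, M):
    """Every string of L followed by every string of M, in that order."""
    return {x + y for x in L for y in M}

L = {"001", "10", "111"}
M = {"", "001"}  # "" plays the role of the empty string ε

print(sorted(union(L, M)))   # the union example from the slides
print(sorted(concat(L, M)))  # the concatenation example from the slides
print(concat(L, M) == concat(M, L))  # order matters, so this is False here
```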

5 Introduction 4
• Closure (denoted L*) is the set of strings obtained by taking any number of strings from L, possibly with repeats, and concatenating all of them.
• L* = ∪_{k≥0} L^k
  ◦ Union of all powers of L (including the zeroth power)
• For every language L, L* contains ε.
  ◦ Why? L^0 = {ε}: concatenating zero strings gives the empty string.

6 Introduction 5
• L^1 = L
• L^k (k>1) is the concatenation of k copies of L.
• If L={0,11}, then L^2 = {0,11}{0,11} = {00,011,110,1111}.
• L(∅) is the empty language (no strings).
• L(∅)* = {ε}: a rare example of a finite closure.
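Powers and closure can be sketched the same way. Because L* is usually infinite, the hypothetical helper `closure_up_to` below only collects the strings of L* up to a chosen length; the names and the length bound are assumptions made for illustration.

```python
def concat(L, M):
    return {x + y for x in L for y in M}

def power(L, k):
    """L^k: concatenation of k copies of L, with L^0 = {""} (the set {ε})."""
    result = {""}
    for _ in range(k):
        result = concat(result, L)
    return result

def closure_up_to(L, max_len):
    """All strings of L* whose length is at most max_len."""
    found = {""}          # L^0 is always part of the closure
    frontier = {""}
    while frontier:
        longer = {w for w in concat(frontier, L) if len(w) <= max_len}
        frontier = longer - found
        found |= frontier
    return found

print(sorted(power({"0", "11"}, 2)))
print(closure_up_to(set(), 5))   # closure of the empty language is {""}
```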

7 Building regular expressions
• Like all algebras, RE’s are made up of constants and variables connected by operators.
• Parentheses are used to group terms.

8 Elementary components of RE’s
• Basis 1: any symbol a is a RE, and L(a) = {a}.
  ◦ L(a) is the language containing one string of length 1.
  ◦ Generalizable to strings of any length
• Basis 2: ε is a RE, and L(ε) = {ε}.
  ◦ L(ε) consists of the empty string only.
• Basis 3: ∅ is a RE, and L(∅) = ∅.
  ◦ L(∅) has no strings.

9 Recursive Definitions of RE’s
• Induction 1: If E_1 and E_2 are RE’s, then E_1+E_2 is a RE, and L(E_1+E_2) = L(E_1) ∪ L(E_2). (Union)
• Induction 2: If E_1 and E_2 are RE’s, then E_1E_2 is a RE, and L(E_1E_2) = L(E_1)L(E_2). (Concatenation)

10 Recursive Definition of RE 2
• Induction 3: If E is a RE, then E* is a RE, and L(E*) = (L(E))*, or simply L(E)*. (Closure, or “Kleene closure”, named for the originator of the * operation)

11 Precedence of Operators
• Parentheses are used as needed to influence the grouping of operators.
• If E is a RE, then (E) is a RE defining the same language as E: L((E)) = L(E).
• Order of precedence is * (highest), then concatenation, then + (lowest).

12 Examples: RE’s and L(RE)
• L(01) = {01}.
• L(01+0) = {01, 0}.
• L(0(1+0)) = {0}{1,0} = {01, 00}.
  ◦ Note the order of precedence of operators.
• L(0*) = {ε, 0, 00, 000, …}.
• L(01*) = all strings consisting of a 0 followed by any number of 1’s
• L((01)*) = all strings consisting of zero or more occurrences of 01
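These examples can be checked with Python's `re` module, which follows the same precedence order but writes union as `|` instead of `+`. The helpers `to_python` and `matches` are hypothetical names introduced for this sketch; the translation only covers the operators used on the slides.

```python
import re

def to_python(regex):
    """Translate the slides' notation to Python's: union '+' becomes '|'."""
    return regex.replace("+", "|")

def matches(regex, s):
    """True if the whole string s belongs to the language of regex."""
    return re.fullmatch(to_python(regex), s) is not None

print(matches("01+0", "0"))      # 0 is in L(01+0)
print(matches("0(1+0)", "00"))   # 00 is in L(0(1+0))
print(matches("01*", "0111"))    # a 0 followed by any number of 1's
print(matches("(01)*", "011"))   # not a repetition of 01
```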

13 L = all strings of alternating 0’s and 1’s
• L((01)*) is the case that begins with 0 and ends with 1.
• 3 other cases: L((10)*), L(0(10)*), and L(1(01)*)
• L is the union of the 4 cases.
• L = L(RE), where RE = (01)*+(10)*+0(10)*+1(01)*

14 L = all strings of alternating 0’s and 1’s
• Concatenation method: (ε+1)(01)*(ε+0)
  ◦ The distributive law gives the 4 cases:
  ◦ (01)*(ε+0) = (01)* + (01)*0
  ◦ (ε+1)((01)* + (01)*0) = (01)* + (01)*0 + 1(01)* + 1(01)*0
  ◦ (01)* begins with 0 and ends with 1
  ◦ (01)*0 begins with 0 and ends with 0
  ◦ 1(01)* begins with 1 and ends with 1
  ◦ 1(01)*0 begins with 1 and ends with 0
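A brute-force comparison gives some confidence that the two expressions for alternating strings agree. The sketch below writes union as `|` and ε as an empty alternative, which is Python `re` syntax rather than the slides' notation; the function names and the length bound are assumptions.

```python
import re
from itertools import product

# Union-of-four-cases form and concatenation form, in Python syntax.
FOUR_CASES = re.compile(r"(01)*|(10)*|0(10)*|1(01)*")
CONCAT_FORM = re.compile(r"(|1)(01)*(|0)")

def alternating(s):
    """Direct definition: no two adjacent symbols are equal."""
    return all(a != b for a, b in zip(s, s[1:]))

def agree_up_to(max_len):
    """Compare both RE's with the direct definition on every short binary string."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            expected = alternating(s)
            if (FOUR_CASES.fullmatch(s) is not None) != expected:
                return False
            if (CONCAT_FORM.fullmatch(s) is not None) != expected:
                return False
    return True

print(agree_up_to(8))
```

A check like this cannot prove equivalence, since it only covers strings up to the chosen length, but a single disagreement would disprove it.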

15 Application of precedence
• * (highest) operates on the smallest sequence of symbols to its left that is a legal RE.
  ◦ Example: in 01*, the closure applies to 1 only.
• After grouping all *’s with their operands, group all concatenations with their operands (0 with 1* in the example).
• Finally, group unions (+) with their operands (as in 1+01*).

16 Associative laws
• Concatenation is associative.
  ◦ 0(12) = (01)2
• Union is associative.
  ◦ (a+b)+c = a+(b+c)

17 Examples:
• E = 01*+1 = (0(1*))+1: L(E) = {1} plus all strings consisting of a 0 followed by any number of 1’s
• E = (01)*+1: L(E) = {1} plus all strings that repeat 01 zero or more times
• E = 0(1*+1): L(E) = all strings consisting of a 0 followed by any number of 1’s
  ◦ Note: 1* and (1*+1) define the same language.

18 Equivalence of RE’s and FA’s
• Will show that for every RE, there is an FA that defines the same language.
  ◦ Sufficient to show for ε-NFA’s.
• Will show that for every FA, there is a RE that defines the same language.
  ◦ Sufficient to show for DFA’s.

19 DFA-to-RE
• Rename the states of the DFA to be 1,2,…,n.
  ◦ Construct RE’s from the labels of restricted sets of paths called k-paths.
• A k-path is a path between specified states that goes through no intermediate state numbered higher than k.
  ◦ Endpoints of k-paths are not restricted; they can be any pair of states or the same state (i.e. a loop).

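20 Example: k-Paths
• 0-paths from 2 to 3
  ◦ no intermediate states
  ◦ RE from labels (only one in this case) = 0
• 1-paths from 2 to 3
  ◦ direct, or around the outside through state 1
  ◦ RE for labels = 0+11
[Figure: transition diagram of a DFA with states 1, 2, 3; three arcs labeled 0 and three arcs labeled 1]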

21 Example: k-Paths
• 2-paths from 2 to 3:
  ◦ RE from labels = (10)*0+1(01)*1
  ◦ (10)* and (01)* allow for zero or more loops between states 1 and 2 before going to 3
• 3-paths from 2 to 3:
  ◦ no restrictions, since k=n
[Figure: same three-state DFA as on slide 20]

22 Formal development: DFA to RE
• Let R_{ij}^{k} be the RE for the set of labels of k-paths from state i to state j.
• Basis: k=0. R_{ij}^{0} = sum of labels on arcs from i to j; ∅ if no such arc; add ε if i=j.
• Examples:
  ◦ R_{11}^{0} = ∅ + ε = ε
  ◦ R_{12}^{0} = 0
  ◦ R_{13}^{0} = 1
  ◦ R_{21}^{0} = 1
[Figure: same three-state DFA as on slide 20]

23 Induction: relate k to k-1
• A k-path from i to j either:
  1. Never goes through state k, or
  2. Goes through k one or more times.
• R_{ij}^{k} = R_{ij}^{k-1} + R_{ik}^{k-1} (R_{kk}^{k-1})* R_{kj}^{k-1}
  ◦ R_{ij}^{k-1}: doesn’t go through k
  ◦ R_{ik}^{k-1}: goes from i to k the first time
  ◦ (R_{kk}^{k-1})*: zero or more times from k to k
  ◦ R_{kj}^{k-1}: then, from k to j

24 Illustration of Induction
[Figure: states i, j, and k, with the states numbered below k grouped together. Path labels: "Paths not going through k", "Path to k", "From k to k, several times", "From k to j"]

25 Final Step
• The RE with the same language as the DFA is the sum (union) of all R_{ij}^{n} where:
  1. n is the number of states in the DFA.
  2. i is the start state.
  3. j is one of the final states.

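26 Example of formalism
• start=2, accept=3, n=3
• General rule: R_{ij}^{k} = R_{ij}^{k-1} + R_{ik}^{k-1} (R_{kk}^{k-1})* R_{kj}^{k-1}
• R_{23}^{3} = R_{23}^{2} + R_{23}^{2} (R_{33}^{2})* R_{33}^{2} = R_{23}^{2} (R_{33}^{2})*
• R_{23}^{2} = (10)*0+1(01)*1 (see slide 21)
• R_{33}^{2} = 0(01)*(1+00) + 1(10)*(0+11) (the ε term is omitted because it does not change the closure)
• R_{23}^{3} = [(10)*0+1(01)*1] [0(01)*(1+00) + 1(10)*(0+11)]*
[Figure: same three-state DFA as on slide 20]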
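The k-path recurrence can be sketched in Python. Several things here are assumptions rather than part of the slides: the transition table `DELTA` is inferred from the R values quoted on slides 20 to 26 (the diagram itself did not survive), RE's are encoded as nested tuples, and all function names are illustrative. The `language` helper enumerates the strings of an RE up to a length bound so the result can be compared with the DFA.

```python
from itertools import product

# RE encoding (an assumption of this sketch):
# ("sym", a), ("eps",), ("empty",), ("+", e, f), (".", e, f), ("*", e)
EPS, EMPTY = ("eps",), ("empty",)

def union(e, f):
    if e == EMPTY: return f            # identity: E + ∅ = E
    if f == EMPTY: return e
    return e if e == f else ("+", e, f)

def concat(e, f):
    if EMPTY in (e, f): return EMPTY   # annihilator: ∅E = E∅ = ∅
    if e == EPS: return f
    if f == EPS: return e
    return (".", e, f)

def star(e):
    return EPS if e in (EMPTY, EPS) else ("*", e)

def kpath_re(n, delta, start, accepting):
    """Apply R_ij^k = R_ij^(k-1) + R_ik^(k-1) (R_kk^(k-1))* R_kj^(k-1) for k = 1..n."""
    states = range(1, n + 1)
    R = {(i, j): (EPS if i == j else EMPTY) for i in states for j in states}
    for (i, a), j in delta.items():    # basis: labels on direct arcs
        R[i, j] = union(R[i, j], ("sym", a))
    for k in states:
        R = {(i, j): union(R[i, j], concat(concat(R[i, k], star(R[k, k])), R[k, j]))
             for i in states for j in states}
    result = EMPTY
    for f in accepting:                # union over all accepting states
        result = union(result, R[start, f])
    return result

def language(e, n):
    """Strings of length <= n in L(e)."""
    op = e[0]
    if op == "empty": return set()
    if op == "eps": return {""}
    if op == "sym": return {e[1]} if n >= 1 else set()
    if op == "+": return language(e[1], n) | language(e[2], n)
    if op == ".":
        A, B = language(e[1], n), language(e[2], n)
        return {x + y for x in A for y in B if len(x) + len(y) <= n}
    base = language(e[1], n) - {""}    # op == "*"
    found, frontier = {""}, {""}
    while frontier:
        frontier = {x + y for x in frontier for y in base if len(x + y) <= n} - found
        found |= frontier
    return found

def dfa_language(delta, start, accepting, n):
    """Strings of length <= n accepted by the DFA, found by direct simulation."""
    accepted = set()
    for m in range(n + 1):
        for w in product("01", repeat=m):
            q = start
            for a in w:
                q = delta[q, a]
            if q in accepting:
                accepted.add("".join(w))
    return accepted

# Transition table inferred from the slides' R values (an assumption).
DELTA = {(1, "0"): 2, (1, "1"): 3, (2, "0"): 3, (2, "1"): 1, (3, "0"): 1, (3, "1"): 2}
RE = kpath_re(3, DELTA, start=2, accepting={3})
print(language(RE, 6) == dfa_language(DELTA, 2, {3}, 6))
```

The tuple encoding is used instead of regex strings so that the identity and annihilator rules for ∅ can be applied while the table is built, which keeps the intermediate expressions smaller.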

27 Useful RE identities in the evaluation of R_{ij}^{k}
• Identity for union: E + ∅ = E
• Annihilator for concatenation: ∅E = E∅ = ∅
• E is any RE.

28 Equivalence of RE’s and FA’s
• We have shown by construction that for any DFA there exists a RE that defines the same language that the DFA accepts.
• The method always works but may be time consuming, since about n^3 RE’s must be constructed for an n-state DFA.
• An alternate method: “eliminating states”

29 DFA to RE by Eliminating States
• Basic principle: after state s is eliminated, the RE’s on the remaining arcs must label them so that the automaton still defines the same language as before.
• Usually this requirement can be satisfied by considering the states q_i that are predecessors of s and the states p_j that are successors of s.

30 DFA to RE by Eliminating States 2
• Let Q_i be the RE for the labels on the arc from predecessor q_i to the eliminated state s.
• Let P_j be the RE for the labels on the arc from the eliminated state s to successor p_j.
• Let S be the RE for the labels on a loop on s.
• Let R_{ij} be the RE for the labels on the existing direct arc from q_i to p_j.
• Then the RE for the path from q_i to p_j without s is R_{ij} + Q_i S* P_j.
• Some parts may not be present.

31 DFA to RE by Eliminating States 3
• Example from exercise 3.2.1, page 107
• 4 predecessor-successor combinations involving state q_2:
  ◦ Arcs q_1 to q_2 and q_2 to q_3, both labeled 0
  ◦ Arcs q_3 to q_2 and q_2 to q_1, both labeled 1
  ◦ Arcs q_1 to q_2 labeled 0 and q_2 to q_1 labeled 1
  ◦ Arcs q_3 to q_2 labeled 1 and q_2 to q_3 labeled 0
• Recall: the RE for the path from q_i to p_j without s is R_{ij} + Q_i S* P_j, where some parts may be ∅.
[Figure: DFA with states q_1, q_2, q_3 and arcs labeled 0 and 1]

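32 DFA to RE by Eliminating States 4
• In all 4 cases the new term of the standard form reduces to Q_i P_j.
  ◦ There is no loop on q_2 (S is absent) and no direct arc between q_1 and q_3 (R_{ij} is ∅ for those pairs).
• In 2 cases p_j = q_i, so the arcs become loops and are added to the existing loop labels.
[Figure: the original three-state DFA and the two-state result, with loop 1+01 on q_1, arc 00 from q_1 to q_3, arc 11 from q_3 to q_1, and loop 0+10 on q_3]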
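The elimination of q_2 can be sketched as string building with the rule R_{ij} + Q_i S* P_j. The function name `bypass` and the use of `None` for ∅ are assumptions of this sketch, and it only handles the simple case where Q_i and P_j are single symbols (or already parenthesized), as in this example.

```python
def bypass(R, Q, S, P):
    """RE for the arc q_i -> p_j after eliminating s: R + Q S* P.

    None stands for ∅ (no such arc or loop). Q and P are assumed to be
    single symbols or parenthesized RE's, so they can be concatenated directly.
    """
    if Q is None or P is None:
        return R                      # no path through s, keep the direct arc
    loop = "" if S is None else f"({S})*"
    via_s = Q + loop + P
    return via_s if R is None else f"{R}+{via_s}"

# Arc labels from slide 31; q_2 has no loop, so S is None in every case.
print(bypass("1", "0", None, "1"))    # q_1 -> q_1: existing loop 1, plus 0 then 1
print(bypass(None, "0", None, "0"))   # q_1 -> q_3
print(bypass(None, "1", None, "1"))   # q_3 -> q_1
print(bypass("0", "1", None, "0"))    # q_3 -> q_3: existing loop 0, plus 1 then 0
```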

33 DFA to RE by Eliminating States 5
• To find the RE that is equivalent to the DFA, for each accepting state q_k continue state elimination until only the “start” state and q_k remain.
• Let L(RE_k) be the language of strings accepted by q_k.
• The RE equivalent to the DFA is the sum over k of RE_k (union of all L(RE_k)).

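34 Generic one-state and two-state automata
• For each accepting state q_k, the state-elimination process will result in a generic one-state (if q_0 = q_k) or two-state automaton.
• Generic two-state: RE_k = (R+SU*T)*SU*
• Generic one-state: RE_k = R*
• Actual values of R, S, T, and U are problem specific, and some may be ∅.
[Figure: generic two-state automaton with states 1 and 2, where R loops on state 1, S goes from 1 to 2, U loops on state 2, and T goes from 2 to 1; generic one-state automaton with state 1 and a loop labeled R]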

35 DFA to RE by Eliminating States 7
• Standard form RE_k = (R+SU*T)*SU* applied to exercise 3.2.4(e)
• R=1+01, S=00, T=11, U=0+10
• RE = [(1+01)+00(0+10)*11]*00(0+10)*
[Figure: the original three-state DFA, the two-state result with arcs 1+01, 00, 11, and 0+10, and the generic two-state automaton]
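The resulting RE can be compared against the DFA by brute force. In this sketch the transition table is read off the arc labels listed on slides 31 and 32, with q_1 taken as the start state and q_3 as the accepting state (as the R, S, T, U assignment implies); the names and the length bound are assumptions. The RE is written in Python `re` syntax, with `|` for union.

```python
import re
from itertools import product

DELTA = {
    ("q1", "0"): "q2", ("q1", "1"): "q1",
    ("q2", "0"): "q3", ("q2", "1"): "q1",
    ("q3", "0"): "q3", ("q3", "1"): "q2",
}
START, ACCEPTING = "q1", {"q3"}

# [(1+01)+00(0+10)*11]*00(0+10)* in Python syntax.
RE_FROM_ELIMINATION = re.compile(r"(?:1|01|00(?:0|10)*11)*00(?:0|10)*")

def dfa_accepts(s):
    """Run the DFA on s and report whether it ends in an accepting state."""
    state = START
    for symbol in s:
        state = DELTA[state, symbol]
    return state in ACCEPTING

def agree_up_to(max_len):
    """True if the RE and the DFA agree on every binary string up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if (RE_FROM_ELIMINATION.fullmatch(s) is not None) != dfa_accepts(s):
                return False
    return True

print(agree_up_to(10))
```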

36 Equivalence of RE’s and Automata
• To complete the proof of equivalence, we show by construction that for every RE, there is an automaton that accepts the same language that the RE defines.
• It is sufficient to construct an ε-NFA with the following restrictions:
  ◦ One accepting state
  ◦ No arcs into the “start” state
  ◦ No arcs out of the accepting state

37 Converting a RE to an ε-NFA
• Formal statement: if L(RE) is the language defined by RE, then there exists an ε-NFA, denoted by ε_RE, such that L(ε_RE) = L(RE).
• Proof is by constructive induction on the number of operators (+, concatenation, *) in the RE.
• Basis: For L(RE)={a} and {ε}, ε_RE consists of a single arc between the “start” and accepting states, labeled by a and ε, respectively.
• Same for L(RE)=∅, except there is no arc.

38 Induction hypothesis (IH)
• (IH): assume the theorem is true for subexpressions E_1 and E_2 in RE.
• Use these ε-NFA’s to build ε_RE such that L(ε_RE) = L(RE).
• Sufficient to show how these ε-NFA’s are used to build ε-NFA’s for E_1+E_2, E_1E_2, and E_1*.
• ε_RE is built by linking these intermediate ε-NFA’s as ordered by the operations in RE.
[Figure: boxes labeled "ε-NFA for E_1" and "ε-NFA for E_2"]

39 RE to ε-NFA: Induction 1 – Union
[Figure: ε-NFA for E_1 ∪ E_2, built from the ε-NFA’s for E_1 and for E_2 with four ε arcs]

40 RE to ε-NFA: Induction 2 – Concatenation
[Figure: ε-NFA for E_1E_2, built from the ε-NFA’s for E_1 and for E_2 joined by one ε arc]

41 RE to ε-NFA: Induction 3 – Closure
[Figure: ε-NFA for E_1*, built from the ε-NFA for E_1 with four ε arcs]
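The three inductive steps can be sketched in Python. The placement of the ε arcs below follows the usual textbook version of this construction, which is an assumption here because the slide diagrams did not survive; the representation (a start state, an accepting state, and a list of arcs, with `""` as the ε label) and all names are illustrative.

```python
from itertools import count

_fresh = count()  # source of new state numbers

def basis(symbol):
    """One arc start -> accept labeled symbol; "" means ε, None means no arc (∅)."""
    s, f = next(_fresh), next(_fresh)
    return s, f, ([] if symbol is None else [(s, symbol, f)])

def union(n1, n2):
    (s1, f1, a1), (s2, f2, a2) = n1, n2
    s, f = next(_fresh), next(_fresh)
    return s, f, a1 + a2 + [(s, "", s1), (s, "", s2), (f1, "", f), (f2, "", f)]

def concat(n1, n2):
    (s1, f1, a1), (s2, f2, a2) = n1, n2
    return s1, f2, a1 + a2 + [(f1, "", s2)]

def star(n1):
    s1, f1, a1 = n1
    s, f = next(_fresh), next(_fresh)
    return s, f, a1 + [(s, "", s1), (f1, "", f), (f1, "", s1), (s, "", f)]

def eclose(states, arcs):
    """All states reachable from `states` using ε arcs only."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for p, label, r in arcs:
            if p == q and label == "" and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(nfa, w):
    start, accept, arcs = nfa
    current = eclose({start}, arcs)
    for c in w:
        moved = {r for p, label, r in arcs if p in current and label == c}
        current = eclose(moved, arcs)
    return accept in current

# (01)*+1 from slide 17; every basis() call creates fresh states,
# so sub-automata must not be reused in two places.
example = union(star(concat(basis("0"), basis("1"))), basis("1"))
print([w for w in ["", "1", "01", "0101", "0", "011", "11"] if accepts(example, w)])
```

Each step keeps the restrictions from slide 36: the result has one accepting state, no arcs into its start state, and no arcs out of its accepting state.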

42 Review
• Regular expressions (RE) and finite automata (DFA, NFA, ε-NFA) are equivalent in their ability to define “regular languages”.
• Proof of equivalence involves construction.
  ◦ Some constructions are trivial (DFA->NFA and NFA->ε-NFA).
• RE <-> FA is not trivial in either direction.
• DFA->RE is the most challenging.
  ◦ 2 methods: k-paths and elimination of states

43 K-paths method: relate k to k-1
• A k-path from i to j either:
  ◦ never goes through state k, or
  ◦ goes through k one or more times.
• R_{ij}^{k} = R_{ij}^{k-1} + R_{ik}^{k-1} (R_{kk}^{k-1})* R_{kj}^{k-1}
• Construction: for i=start and j=accepting, build up R_{ij}^{k} for k=0…n, where n = number of states of the DFA.

44 DFA to RE by Eliminating States
• Reduce to the start state and accepting state k.
• Substitute arc RE’s into the generic form.
  ◦ RE_k = (R+SU*T)*SU*
• Repeat for all accepting states.
• Form the union of all L(RE_k).
[Figure: the three-state DFA with states q_1, q_2, q_3 and the two-state result with arcs 1+01, 00, 11, and 0+10]

45 Algebraic Laws of RE’s
• Commutative law of union: L+M = M+L
• Associative law of union: (L+M)+N = L+(M+N)
• Idempotence of union: L+L = L ∪ L = L
• Associative law of concatenation: (LM)N = L(MN)
• Concatenation does not commute: in general, LM is not equal to ML.

46 Algebraic Laws for RE’s 2
• Concatenation distributes over union, but two separate laws are needed because concatenation is not commutative:
  ◦ Left distributive law: L(M+N) = LM+LN
  ◦ Right distributive law: (M+N)L = ML+NL
• Identities and annihilators
  ◦ R+∅ = R
  ◦ εR = Rε = R
  ◦ ∅R = R∅ = ∅

47 Algebraic Laws for RE’s 3
• Laws on closure
  ◦ (L*)* = L*
  ◦ ∅* = ε

48 Testing algebraic laws by simple examples
• (L*)* = L* for any regular language L
  ◦ More obvious from a simple example
  ◦ (a*)* = a*
• Most useful in disproving laws
  ◦ If the test on an example fails, then the law cannot be true in general.

49 Testing Algebraic Laws - 2
• Example of testing in text p121
  ◦ L+ML ?= (L+M)L
  ◦ Try L=a, M=b
  ◦ {a}+{b}{a} ?= ({a}+{b}){a}
  ◦ {a} ∪ {ba} ?= {aa} ∪ {ba}
  ◦ Not true: the left side does not contain aa, and the right side does not contain a.
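The test above is easy to run with Python sets. This is a small sketch in which `concat` is an illustrative helper and each language is a finite set of strings; the choice N = {"c"} for the distributive-law check is an extra example, not from the slides.

```python
def concat(L, M):
    """Concatenation of two finite languages given as sets of strings."""
    return {x + y for x in L for y in M}

L, M, N = {"a"}, {"b"}, {"c"}

left = L | concat(M, L)       # L + ML
right = concat(L | M, L)      # (L+M)L
print(left, right, left == right)   # the proposed law fails on this example

# A law that does hold passes the same kind of test (which is not a proof).
print(concat(L, M | N) == concat(L, M) | concat(L, N))
```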

50 CptS 317 Fall 2014
Assignment 6, Due 10-17-14
• Exercise 3.2.1 (a) and (b), Text p107
• Exercise 3.2.4 (a), Text p108
• Show all steps
