Formal Language & Automata Theory Shyamanta M Hazarika Computer Sc. & Engineering Tezpur University http://www.tezu.ernet.in/~smh
Regular Expressions A regular expression defines a language. Regular expressions are built from union, concatenation, and closure. Union: L1 ∪ L2 = {x | x in L1 or x in L2}. Concatenation: L1L2 = {xy | x in L1 and y in L2}. (Kleene) Closure: L* = ∪ (i = 0, 1, 2, …) L^i, where L^0 = {ε} and L^i = L L^(i-1). Example: {01, 10}* = {ε, 01, 10, 0110, 0101, 1010, …}.
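The three operations can be tried out on small finite languages. The following is a minimal Python sketch, not part of the original slides: the function names are hypothetical, the empty string "" stands for ε, and since L* is infinite whenever L contains a nonempty string, the closure is cut off after L^k.

```python
def union(l1, l2):
    """L1 ∪ L2: the strings that are in either language."""
    return set(l1) | set(l2)


def concat(l1, l2):
    """L1L2: every x from L1 followed by every y from L2."""
    return {x + y for x in l1 for y in l2}


def closure_up_to(lang, k):
    """Finite approximation of L*: the union of L^0, L^1, ..., L^k."""
    power = {""}                     # L^0 = {ε}
    result = set(power)
    for _ in range(k):
        power = concat(lang, power)  # L^i = L L^(i-1)
        result |= power
    return result


# Strings of {01, 10}* built from at most two factors, shortest first.
print(sorted(closure_up_to({"01", "10"}, 2), key=lambda s: (len(s), s)))
# prints ['', '01', '10', '0101', '0110', '1001', '1010']
```

Raising k adds longer strings without ever removing shorter ones, which is the sense in which L* is the union of all the powers L^i.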
Formal Definition Definition: Regular expressions (over Σ). 1) ∅ is a regular expression denoting ∅ (the empty set). 2) ε is a regular expression denoting {ε}. 3) For each a ∈ Σ, a is a regular expression denoting {a}. 4) If E (F) is a reg. exp. denoting language L(E) (L(F)), then (E + F) is a reg. exp. denoting L(E) ∪ L(F), (EF) is a reg. exp. denoting L(E)L(F), and (E*) is a reg. exp. denoting L(E)*. We will omit parentheses where no confusion can arise.
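Because the definition is recursive, the language of an expression can be computed by recursion on its structure. The sketch below is an illustration only: the nested-tuple encoding and the function name `lang` are assumptions made here, not notation from the slides, and the result is restricted to strings of length at most n so that it stays finite.

```python
# Hypothetical encoding of regular expressions as nested tuples:
#   ("empty",)  ("eps",)  ("sym", a)  ("union", E, F)  ("cat", E, F)  ("star", E)

def lang(r, n):
    """Return the strings of L(r) whose length is at most n."""
    kind = r[0]
    if kind == "empty":                  # clause 1: the empty set
        return set()
    if kind == "eps":                    # clause 2: {ε}, with "" standing for ε
        return {""}
    if kind == "sym":                    # clause 3: {a}
        return {r[1]} if n >= 1 else set()
    if kind == "union":                  # clause 4: L(E) ∪ L(F)
        return lang(r[1], n) | lang(r[2], n)
    if kind == "cat":                    # clause 4: L(E)L(F)
        return {x + y
                for x in lang(r[1], n)
                for y in lang(r[2], n)
                if len(x + y) <= n}
    if kind == "star":                   # clause 4: L(E)*
        base = lang(r[1], n) - {""}      # ε factors add nothing new
        result, frontier = {""}, {""}
        while frontier:                  # stops: lengths are bounded by n
            frontier = {x + y
                        for x in frontier
                        for y in base
                        if len(x + y) <= n} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind}")


# 0(0 + 1)* restricted to strings of length at most 2.
R = ("cat", ("sym", "0"), ("star", ("union", ("sym", "0"), ("sym", "1"))))
print(sorted(lang(R, 2)))  # prints ['0', '00', '01']
```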
Example L(00(0 + 1)*01 + ε) = {x | x = ε, or |x| ≥ 4 and x begins with 00 and ends with 01}. (An alphabet of {0,1} is assumed here, and in many other examples as well.)
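The set description in this example can be checked by brute force on short strings. The sketch below uses Python's `re` module as a stand-in for the textbook notation, which is an assumption of this illustration: the slide's `+` is written `|` in Python, and the `+ ε` alternative is expressed by making the whole group optional. The helper names are hypothetical.

```python
import re
from itertools import product

# 00(0 + 1)*01 + ε in Python syntax; fullmatch() forces the whole string to match.
PATTERN = re.compile(r"(?:00(?:0|1)*01)?")


def described(x):
    """The set description: x is ε, or |x| >= 4 with prefix 00 and suffix 01."""
    return x == "" or (len(x) >= 4 and x.startswith("00") and x.endswith("01"))


def first_disagreement(max_len=8):
    """Return a string over {0,1} where the two descriptions differ, or None."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            if bool(PATTERN.fullmatch(x)) != described(x):
                return x
    return None


print(first_disagreement())  # prints None
```

A passing check on strings up to length 8 is evidence rather than proof. The actual argument is that every string of length at least 4 that begins with 00 and ends with 01 has the form 00w01 for some w in (0 + 1)*.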
Finite Automata and Reg. Exps. Regular expressions define the same class of languages as finite automata. (Diagram: NFA, DFA, ε-NFA, and Reg. Exp., linked by conversions between the formalisms.)
Theorem Theorem: If R is a regular expression, then L(R) = L(E) for some NFA E with ε-moves. Proof: We construct E so that it has exactly one final state and no transitions out of that state. The construction is by induction on the number of operators in R.
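Basis (zero operators) Only three cases: Case 1: R = ε. (Figure: a single state q0 that is both the start state and the final state.) Case 2: R = ∅. (Figure: start state q0 and final state qf, with no transition between them.) Case 3: R = a. (Figure: start state q0 with a single transition on a to final state qf.)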
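Inductive Step Case 1: R = R1 + R2. By the induction hypothesis, machines E1 and E2 exist for R1 and R2, respectively. Construct E as follows: (Figure: a new start state q0 has ε-moves to q1 and q2, the start states of E1 and E2; the final states f1 and f2 of E1 and E2 have ε-moves to a new final state f0.)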
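Inductive Step (Continued) Case 2: R = R1R2. (Figure: the start state is q1, the start state of E1; an ε-move leads from f1, the final state of E1, to q2, the start state of E2; the final state is f2.) Case 3: R = R1*. (Figure: a new start state q0 and a new final state f0; ε-moves lead from q0 to q1 and to f0, and from f1 back to q1 and on to f0.)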
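The basis and the three inductive cases translate almost line by line into a recursive program. The following Python sketch is one possible rendering under stated assumptions, not the slides' own code: expressions use a hypothetical nested-tuple encoding, states are integers, `None` labels an ε-move, and the `Builder` class and helper names are invented for this illustration.

```python
from collections import defaultdict
from itertools import count

EPS = None  # label used for ε-moves (an assumption of this sketch)

# Hypothetical encoding of regular expressions as nested tuples:
#   ("empty",)  ("eps",)  ("sym", a)  ("union", R1, R2)  ("cat", R1, R2)  ("star", R1)


class Builder:
    def __init__(self):
        self.delta = defaultdict(set)  # (state, label) -> set of next states
        self._ids = count()

    def new_state(self):
        return next(self._ids)

    def edge(self, p, label, q):
        self.delta[(p, label)].add(q)

    def build(self, r):
        """Return (start, final) of an ε-NFA for r; nothing leaves the final state."""
        kind = r[0]
        if kind == "eps":                       # basis: one state, start and final
            q0 = self.new_state()
            return q0, q0
        if kind == "empty":                     # basis: two states, no transition
            return self.new_state(), self.new_state()
        if kind == "sym":                       # basis: q0 --a--> qf
            q0, qf = self.new_state(), self.new_state()
            self.edge(q0, r[1], qf)
            return q0, qf
        if kind == "union":                     # inductive case R1 + R2
            q1, f1 = self.build(r[1])
            q2, f2 = self.build(r[2])
            q0, f0 = self.new_state(), self.new_state()
            for q in (q1, q2):
                self.edge(q0, EPS, q)
            for f in (f1, f2):
                self.edge(f, EPS, f0)
            return q0, f0
        if kind == "cat":                       # inductive case R1R2
            q1, f1 = self.build(r[1])
            q2, f2 = self.build(r[2])
            self.edge(f1, EPS, q2)
            return q1, f2
        if kind == "star":                      # inductive case R1*
            q1, f1 = self.build(r[1])
            q0, f0 = self.new_state(), self.new_state()
            for p, q in ((q0, q1), (q0, f0), (f1, q1), (f1, f0)):
                self.edge(p, EPS, q)
            return q0, f0
        raise ValueError(f"unknown expression kind: {kind}")


def eps_closure(delta, states):
    """All states reachable from `states` using ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        p = stack.pop()
        for q in delta.get((p, EPS), ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen


def accepts(r, w):
    """Build the ε-NFA for r and simulate it on the string w."""
    b = Builder()
    start, final = b.build(r)
    current = eps_closure(b.delta, {start})
    for a in w:
        moved = set()
        for p in current:
            moved |= b.delta.get((p, a), set())
        current = eps_closure(b.delta, moved)
    return final in current


R = ("cat", ("sym", "0"), ("star", ("union", ("sym", "0"), ("sym", "1"))))
print(accepts(R, "011"), accepts(R, "10"))  # prints True False
```

Each call to `build` adds at most two states, so the machine has at most twice as many states as R has symbols and operators. That is why the resulting automaton is correct but far from minimal.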
Example: 0(0 + 1)* (Figures: the ε-NFAs for 0, for 1, and for 0 + 1.)
Example (Continued) (Figures: the ε-NFA for 0 + 1, and the ε-NFA for (0 + 1)* built from it by the closure construction.)
Example (Continued) (Figures: the ε-NFA for (0 + 1)*, and the ε-NFA for 0(0 + 1)* obtained by concatenating the machine for 0 with it.)
Example (Continued) (Figure: the completed ε-NFA for 0(0 + 1)*.) Note: A much simpler machine exists.
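One possible simpler machine is a three-state DFA, sketched here in Python. The state names and the transition table are choices made for this illustration and do not come from the slides; the input is assumed to be a string over {0,1}.

```python
# A DFA for 0(0 + 1)*: accept exactly the strings whose first symbol is 0.
DELTA = {
    ("start", "0"): "accept",
    ("start", "1"): "dead",    # a leading 1 can never be repaired
    ("accept", "0"): "accept",
    ("accept", "1"): "accept",
    ("dead", "0"): "dead",
    ("dead", "1"): "dead",
}


def accepts(w):
    """Run the DFA on w, a string over {0,1}."""
    state = "start"
    for a in w:
        state = DELTA[(state, a)]
    return state == "accept"


print(accepts("011"), accepts("10"), accepts(""))  # prints True False False
```

Dropping the dead state and its transitions leaves a two-state NFA for the same language, compared with the many states and ε-moves produced by the inductive construction.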