UNIT - I Formal Language and Regular Expressions: Languages Definition regular expressions Regular sets identity rules. Finite Automata: DFA NFA NFA with.

UNIT - I Formal Language and Regular Expressions: Languages, definition, regular expressions, regular sets, identity rules. Finite Automata: DFA, NFA, NFA with ε-transitions (significance, acceptance of languages), NFA to DFA conversion, minimization of DFA, finite automata with output (Moore and Mealy machines), constructing finite automata for a given regular expression, conversion of finite automata to regular expressions.

What is automata theory?
– Automata theory is the study of abstract computational devices.
– Abstract devices are (simplified) models of real computations.
– Computations happen everywhere: on your laptop, on your cell phone, in nature, …
Why do we need abstract models?
A simple "computer": a battery, a switch, and a light bulb.
– input: switch
– output: light bulb
– actions: f for "flip switch"
– states: on, off
– transitions: start in off; off --f--> on; on --f--> off
The bulb is on if and only if there was an odd number of flips.
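The switch-and-bulb machine can be sketched as a two-state transition table. This is only an illustration; the function name `bulb_state` and the dictionary layout are assumptions, not part of the slides.

```python
def bulb_state(actions: str) -> str:
    """Run the two-state switch machine on a string of 'f' (flip) actions."""
    # Transition table: (current state, action) -> next state.
    transitions = {("off", "f"): "on", ("on", "f"): "off"}
    state = "off"  # the slide marks "off" as the start state
    for action in actions:
        state = transitions[(state, action)]
    return state


for flips in ["", "f", "ff", "fff"]:
    print(repr(flips), "->", bulb_state(flips))
# '' -> off
# 'f' -> on
# 'ff' -> off
# 'fff' -> on
```

The loop never counts flips; the state alone remembers whether the count so far is odd or even, which is the point of the abstract model.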

Alphabets and Languages
An alphabet is a finite non-empty set. We use the symbol ∑ (sigma) to denote an alphabet.
Examples:
– Binary: ∑ = {0,1}
– All lower case letters: ∑ = {a,b,c,..z}
– Alphanumeric: ∑ = {a-z, A-Z, 0-9}
– DNA molecule letters: ∑ = {a,c,g,t}
Strings
A string or word is a finite sequence of symbols chosen from ∑.
– The empty string is ε (or "epsilon").
– The length of a string w, denoted by |w|, is equal to the number of (non-ε) characters in the string.
  E.g., x = 010100, |x| = 6
  x = 01ε0ε1ε00ε, |x| = ?
– xy = concatenation of two strings x and y

Powers of an alphabet
Let ∑ be an alphabet.
– ∑^k = the set of all strings of length k
– ∑* = ∑^0 ∪ ∑^1 ∪ ∑^2 ∪ …
– ∑^+ = ∑^1 ∪ ∑^2 ∪ ∑^3 ∪ …
L is said to be a language over alphabet ∑ if and only if L ⊆ ∑*. This is because ∑* is the set of all strings (of all possible lengths, including 0) over the given alphabet ∑.
Examples:
1. Let L be the language of all strings consisting of n 0's followed by n 1's: L = {ε, 01, 0011, 000111, …}
2. Let L be the language of all strings with an equal number of 0's and 1's: L = {ε, 01, 10, 0011, 1100, 0101, 1010, 1001, …}
Definition: Ø denotes the empty language.
Let L = {ε}; is L = Ø?
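A small sketch of the powers of an alphabet, assuming strings are represented as Python `str` values and the empty string ε as `""`. Because ∑* is infinite, the helper `star_up_to` (a name chosen here for illustration) only enumerates strings up to a given length.

```python
from itertools import product


def power(sigma, k):
    """Return Sigma^k: all strings of length k over the alphabet sigma."""
    return {"".join(p) for p in product(sorted(sigma), repeat=k)}


def star_up_to(sigma, max_len):
    """Finite slice of Sigma*: all strings of length 0..max_len."""
    result = set()
    for k in range(max_len + 1):
        result |= power(sigma, k)
    return result


sigma = {"0", "1"}
print(sorted(power(sigma, 2)))    # ['00', '01', '10', '11']
print(len(star_up_to(sigma, 3)))  # 15  (1 + 2 + 4 + 8)
print(power(sigma, 0) == {""})    # True: Sigma^0 contains only the empty string

# A language is any subset of Sigma*; here, the n 0's followed by n 1's example.
zeros_then_ones = {"0" * n + "1" * n for n in range(3)}
print(zeros_then_ones <= star_up_to(sigma, 4))  # True
```

In this representation the language {ε} is the set `{""}`, which has one element, while Ø is `set()`, which has none.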

Formal Language
A formal language L is a set of finite-length words (or "strings") over some finite alphabet A. ε is the empty word.
Example: A = {a, b, c}, L1 = {ab, c}
Some examples of formal languages:
– the set of all words over {a, b},
– the set { a^n | n is a prime number },
– the set of syntactically correct programs in some programming language.

Several operations can be used to produce new languages from given ones. Suppose L1 and L2 are languages over some common alphabet. The concatenation L1L2 consists of all strings of the form vw where v is a string from L1 and w is a string from L2. The intersection of L1 and L2 consists of all strings which are contained in L1 and also in L2. The union of L1 and L2 consists of all strings which are contained in L1 or in L2. The complement of the language L1 consists of all strings over the alphabet which are not contained in L1. The Kleene star L1* consists of all strings which can be written in the form w1w2...wn with strings wi in L1 and n ≥ 0. Note that this includes the empty string ε because n = 0 is allowed.
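These operations map directly onto set operations when languages are finite. The sketch below assumes languages are Python sets of strings; L1 = {ab, c} is the example language from the slides, while L2 and all function names are made up for illustration. Complement and Kleene star are infinite in general, so both are truncated here.

```python
from itertools import product


def concat(l1, l2):
    """All strings vw with v in l1 and w in l2."""
    return {v + w for v in l1 for w in l2}


def star_up_to(lang, n_max):
    """Strings w1 w2 ... wn with each wi in lang and 0 <= n <= n_max."""
    result = {""}  # n = 0 contributes the empty string
    layer = {""}
    for _ in range(n_max):
        layer = concat(layer, lang)
        result |= layer
    return result


def complement_up_to(lang, sigma, max_len):
    """Strings over sigma of length <= max_len that are not in lang."""
    universe = {
        "".join(p)
        for k in range(max_len + 1)
        for p in product(sorted(sigma), repeat=k)
    }
    return universe - lang


L1 = {"ab", "c"}
L2 = {"c", "ca"}  # hypothetical second language
print(sorted(concat(L1, L2)))      # ['abc', 'abca', 'cc', 'cca']
print(L1 & L2)                     # {'c'}
print(sorted(L1 | L2))             # ['ab', 'c', 'ca']
print(sorted(star_up_to(L1, 2)))   # ['', 'ab', 'abab', 'abc', 'c', 'cab', 'cc']
print(len(complement_up_to(L1, {"a", "b", "c"}, 2)))  # 11
```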

Regular Expressions
A regular expression defines a regular language over an alphabet ∑:
– ε is a regular language: {ε}
– Any symbol from ∑ is a regular language: for ∑ = {a, b, c}: {a}, {b}, {c}
– The concatenation of two regular languages is a regular language: for ∑ = {a, b, c}: {ab}, {bc}, {ca}
– The union (or disjunction) of two regular languages is a regular language: for ∑ = {a, b, c}: {ab|bc}, {ca|bb}
– The Kleene closure (denoted by the Kleene star: *) of a regular language is a regular language: for ∑ = {a, b, c}: {a*}, {(ab|ca)*}
– Positive closure of a language L: L+ = L^1 ∪ L^2 ∪ … = LL*. When ε is not in L, this equals L* − L^0 = L* − {ε}.
– Parentheses group a sub-language to override operator precedence.
– A regular set is a set represented by a regular expression.
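The relation between positive closure and Kleene closure can be checked on small languages. This is a bounded sketch: `star` and `plus` (names assumed here) only use up to `n_max` copies of L. It shows that L+ = L* − {ε} holds when ε is not in L and fails when ε is in L.

```python
def concat(l1, l2):
    """All strings vw with v in l1 and w in l2."""
    return {v + w for v in l1 for w in l2}


def powers_up_to(lang, n_max):
    """Return the list [L^0, L^1, ..., L^n_max]."""
    powers = [{""}]  # L^0 = {epsilon}
    for _ in range(n_max):
        powers.append(concat(powers[-1], lang))
    return powers


def star(lang, n_max):
    """Bounded Kleene closure: L^0 u L^1 u ... u L^n_max."""
    return set().union(*powers_up_to(lang, n_max))


def plus(lang, n_max):
    """Bounded positive closure: L^1 u ... u L^n_max."""
    return set().union(*powers_up_to(lang, n_max)[1:])


no_eps = {"ab"}
print(plus(no_eps, 3) == star(no_eps, 3) - {""})      # True

with_eps = {"", "a"}
print(plus(with_eps, 3) == star(with_eps, 3) - {""})  # False
print(plus(with_eps, 3) == star(with_eps, 3))         # True
```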

RE Examples (here + denotes union)
– L(001) = {001}
– L(0+10*) = {0, 1, 10, 100, 1000, 10000, …}
– L(0*10*) = {1, 01, 10, 010, 0010, …}, i.e. {w | w contains exactly one 1}
– L((∑∑)*) = {w | w is a string of even length}
– L((0(0+1))*) = {ε, 00, 01, 0000, 0001, 0100, 0101, …}
– L((0+ε)(1+ε)) = {ε, 0, 1, 01}
– L(1Ø) = Ø; concatenating the empty set to any set yields the empty set.
– Rε = R
– R+Ø = R
Exercise: Write a regular expression for the set of strings over ∑ = {0,1} that contain an even number of 1's. Treat zero 1's as an even number.
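Some of these examples can be checked mechanically with Python's `re` module. The sketch assumes a translation of the slide notation into Python syntax: union `+` becomes `|`, ε becomes an empty alternative, and ∑ becomes `(0|1)`. The helper names are illustrative, and only strings up to a small length are enumerated.

```python
import re
from itertools import product


def binary_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for k in range(max_len + 1):
        for p in product("01", repeat=k):
            yield "".join(p)


def language(pattern, max_len):
    """Strings up to max_len that the whole pattern matches."""
    return {w for w in binary_strings(max_len) if re.fullmatch(pattern, w)}


print(sorted(language(r"0|10*", 3)))     # ['0', '1', '10', '100']
print(sorted(language(r"(0|)(1|)", 3)))  # ['', '0', '01', '1']

exactly_one_1 = {w for w in binary_strings(4) if w.count("1") == 1}
print(language(r"0*10*", 4) == exactly_one_1)        # True

even_length = {w for w in binary_strings(4) if len(w) % 2 == 0}
print(language(r"((0|1)(0|1))*", 4) == even_length)  # True
```

`re.fullmatch` is used rather than `re.search` because membership in L(r) means the entire string matches r, not just a substring.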

Identity Rules

What are the strings represented by:
– 10*: a 1 followed by any number of 0s (including no zeros)
– (10)*: any number of copies of 10 (including the null string)
– 0 + 01: the string 0 or the string 01
– 0(0 + 1)*: any string beginning with 0
– (0*1)*: any string not ending with a 0 (including the null string)
Find a regular expression for:
– The set of bit strings with even length: (00 + 01 + 10 + 11)*
– The set of bit strings ending with a 0 and not containing 11 (not the null string): (0 + 10)*(0 + 10) or (0 + 10)^+
– The set of bit strings containing an odd number of 0s: 1*01*(01*01*)*
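One way to gain confidence in answers like these is to compare each expression against its verbal description on all short bit strings. The sketch below does that with Python's `re` module, writing union as `|`; the predicates and the length bound of 8 are choices made for this illustration, and agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product


def bit_strings(max_len):
    """Yield every bit string of length 0..max_len."""
    for k in range(max_len + 1):
        for p in product("01", repeat=k):
            yield "".join(p)


def agrees(pattern, predicate, max_len=8):
    """True if the pattern and the predicate accept the same short strings."""
    return all(
        (re.fullmatch(pattern, w) is not None) == predicate(w)
        for w in bit_strings(max_len)
    )


CHECKS = [
    (r"0(0|1)*", lambda w: w.startswith("0")),
    (r"(0*1)*", lambda w: not w.endswith("0")),
    (r"(00|01|10|11)*", lambda w: len(w) % 2 == 0),
    (r"(0|10)*(0|10)", lambda w: w.endswith("0") and "11" not in w),
    (r"1*01*(01*01*)*", lambda w: w.count("0") % 2 == 1),
]

for pattern, predicate in CHECKS:
    print(pattern, agrees(pattern, predicate))  # each line ends with True
```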

Finite State Automata
A finite state automaton over an alphabet is:
– a directed graph
– a finite set of states defined by the nodes
– edges are labeled with elements of the alphabet, or the empty string; they define state transitions
– some nodes (or states) are marked as final
– one node is marked as the start state
(Figure legend: the symbols used for a transition, a state, a final state, and the start state.)

Finite-state Automata
(Figure: an FSA over ∑ = {a, b, c} with states q0 to q4 in a chain, q0 --a--> q1 --b--> q2 --c--> q3 --a--> q4; the figure labels a transition, a state, the start state, and the final state.)
Representation
– An FSA may also be represented with a state-transition table. The table for the above FSA:

  State   Input a   Input b   Input c
  0       1         Ø         Ø
  1       Ø         2         Ø
  2       Ø         Ø         3
  3       4         Ø         Ø
  4       Ø         Ø         Ø

Given an input string, an FSA will either accept or reject the input.
– If the FSA is in a final (or accepting) state after all input symbols have been consumed, then the string is accepted (or recognized).
– Otherwise (including the case in which an input symbol cannot be consumed), the string is rejected.
Example inputs for the FSA over ∑ = {a, b, c} with the chain q0 --a--> q1 --b--> q2 --c--> q3 --a--> q4:
– IS1: abca
– IS2: ccba
– IS3: abcac
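The acceptance rule can be sketched as a table lookup loop. The transitions below follow the state-transition table from the slides, with Ø entries simply left out of the dictionary; treating state 0 as the start state and state 4 as the only final state is an assumption read off the chain-shaped figure.

```python
# (state, symbol) -> next state; missing keys are the Ø entries of the table.
DELTA = {(0, "a"): 1, (1, "b"): 2, (2, "c"): 3, (3, "a"): 4}
START = 0
FINAL = {4}  # assumption: the last state of the chain is the final state


def accepts(word: str) -> bool:
    """Accept iff every symbol can be consumed and the run ends in a final state."""
    state = START
    for symbol in word:
        if (state, symbol) not in DELTA:
            return False  # the symbol cannot be consumed: reject
        state = DELTA[(state, symbol)]
    return state in FINAL


for name, word in [("IS1", "abca"), ("IS2", "ccba"), ("IS3", "abcac")]:
    print(name, word, accepts(word))
# IS1 abca True
# IS2 ccba False
# IS3 abcac False
```

IS2 is rejected at its first symbol, because no c-transition leaves state 0. IS3 reaches state 4 but cannot consume the trailing c.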

Determinism
– An FSA may be either deterministic (DFSA or DFA) or non-deterministic (NFSA or NFA). An FSA is deterministic if its behavior during recognition is fully determined by the state it is in and the symbol to be consumed.
– I.e., given an input string, only one path may be taken through the FSA. Conversely, an FSA is non-deterministic if, given an input string, more than one path may be taken through the FSA.
– One type of non-determinism is ε-transitions, i.e. transitions which consume the empty string (no symbols).
Formal Definition of FSA
A finite state automaton M = (∑, Q, δ, q0, F), where:
– ∑: alphabet
– Q: set of states
– δ: Q × ∑ → Q, a transition function
– q0: the start state
– F: final states

Non-deterministic Finite Automata
A nondeterministic finite automaton M is a five-tuple M = (Q, ∑, δ, q0, F), where:
– Q is a finite set of states of M
– ∑ is the finite input alphabet of M
– δ: Q × ∑ → power set of Q, is the state transition function mapping a state-symbol pair to a subset of Q
– q0 is the start state of M
– F ⊆ Q is the set of accepting states or final states of M
Example: an NFA that recognizes the language of strings that end in 01.
(Figure: q0 loops on 0,1; q0 --0--> q1; q1 --1--> q2.)
Note: δ(q0,0) = {q0,q1}, δ(q1,0) = {}
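An NFA can be simulated by tracking the set of states it could be in after each symbol. The sketch below encodes the ends-in-01 example; the transitions follow the figure and note from the slide, and taking q2 as the only accepting state is an assumption based on the figure's layout.

```python
# delta maps (state, symbol) to a set of states; missing keys mean the empty set.
DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
START = "q0"
FINAL = {"q2"}  # assumption: q2 is the accepting state


def accepts(word: str) -> bool:
    """Follow all possible paths at once by keeping a set of current states."""
    current = {START}
    for symbol in word:
        current = set().union(*(DELTA.get((q, symbol), set()) for q in current))
    return bool(current & FINAL)


for word in ["01", "1101", "010", ""]:
    print(repr(word), accepts(word))
# '01' True
# '1101' True
# '010' False
# '' False
```

A string is accepted when at least one of the possible paths ends in a final state, which is why the final check is a set intersection.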

Deterministic Finite Automata
A DFA is an NFA with the following restrictions:
– ε-moves are not allowed.
– For every state s ∈ S, there is one and only one transition from s for every input symbol a ∈ ∑.
(Figure: a DFA with start state 0 and states 0 to 3, with transitions labeled a and b.)
What language is accepted?

Algorithm to construct an NFA for any regular expression (Thompson Construction)
Basic building blocks:
(1) Any letter a of the alphabet is recognized by: (figure: NFA for a)
(2) The empty set Ø is recognized by: (figure: NFA for Ø)
(3) The empty string ε is recognized by: (figure: NFA for ε)

(4) Given regular expressions R and S, assume these boxes represent the finite automata for R and S: (figure: boxes for R and S)
(5) To construct an NFA for RS (concatenation): (figure)
(6) To construct an NFA for R | S (alternation): (figure)

(7) To construct an NFA for R* (closure): (figure)
Example: construct an NFA for the regular expression (ab*c) | (a(b|c*)).
(Figure: the resulting NFA, with states numbered 1 to 17, transitions labeled a, b, and c, and ε-transitions.)

NFA to DFA conversion (Subset construction method)

Convert the given RE into a DFA using subset construction: ( a | b ) * abb
(Figure: an NFA with states q0 to q4, an ε-transition, a loop on a, b, and a path labeled a, b, b ending in the final state q4.)

  Iter.  New state  Contains   ε-closure(move(sj,a))  ε-closure(move(sj,b))
  0      s0         q0, q1     q1, q2                 q1
  1      s1         q1, q2     q1, q2                 q1, q3
         s2         q1         q1, q2                 q1
  2      s3         q1, q3     q1, q2                 q1, q4
  3      s4         q1, q4     q1, q2                 q1

s4 contains q4 (final state).
NFA to DFA
(Figure: the resulting DFA with states s0 to s4 and transitions labeled a and b.)
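The subset construction for this example can be sketched as a worklist algorithm. The NFA encoding below is an assumption reconstructed from the table: q0 --ε--> q1, a loop on a and b at q1, then q1 --a--> q2 --b--> q3 --b--> q4 with q4 final. Function names are illustrative.

```python
from collections import deque
from itertools import product

EPS = ""  # marker used here for ε-transitions
NFA = {
    ("q0", EPS): {"q1"},
    ("q1", "a"): {"q1", "q2"},
    ("q1", "b"): {"q1"},
    ("q2", "b"): {"q3"},
    ("q3", "b"): {"q4"},
}
NFA_START, NFA_FINAL, ALPHABET = "q0", {"q4"}, "ab"


def eps_closure(states):
    """All states reachable from `states` using only ε-transitions."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in NFA.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def move(states, symbol):
    """States reachable from `states` by consuming one `symbol`."""
    return {r for q in states for r in NFA.get((q, symbol), set())}


def subset_construction():
    """Return (start, transitions, finals) of the equivalent DFA."""
    start = eps_closure({NFA_START})
    dfa, seen, queue = {}, {start}, deque([start])
    while queue:
        s = queue.popleft()
        for x in ALPHABET:
            t = eps_closure(move(s, x))
            dfa[(s, x)] = t
            if t not in seen:
                seen.add(t)
                queue.append(t)
    finals = {s for s in seen if s & NFA_FINAL}
    return start, dfa, finals


def dfa_accepts(word):
    start, dfa, finals = subset_construction()
    state = start
    for symbol in word:
        state = dfa[(state, symbol)]
    return state in finals


start, dfa, finals = subset_construction()
print(sorted(start))                    # ['q0', 'q1']  (state s0)
print(sorted(dfa[(start, "a")]))        # ['q1', 'q2']  (state s1)
print(len({s for s, _ in dfa}))         # 5 DFA states, s0 to s4
print(dfa_accepts("babb"), dfa_accepts("abba"))  # True False
```

Each DFA state is a `frozenset` of NFA states so that it can serve as a dictionary key; the five sets found correspond to the rows s0 to s4 of the table.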

Converting DFAs to REs
1. Combine serial links by concatenation.
2. Combine parallel links by alternation.
3. Remove self-loops by Kleene closure.
4. Select a node (other than initial or final) for removal. Replace it with a set of equivalent links whose path expressions correspond to the in and out links.
5. Repeat steps 1-4 until the graph consists of a single link between the entry and exit nodes.

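Example
(Figure: a graph with nodes 0 to 7 and edges labeled a, b, c, d is reduced step by step. Parallel links are first combined into a|b|c and b|c; serial links are then combined into path expressions such as d(a|b|c)d, ad, and b(b|c)d, leaving nodes 0, 3, 4, and 5.)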
