BCS 2143 Theory of Computer Science

Part I: Automata and Languages Topic 2: Regular Languages

Finite Automata Consider a board-game
Pieces are set up on a playing board. Dice are thrown, and a number is generated at random. Depending on the number, the pieces on the board must be rearranged in a fashion completely specified by the rules. The child has no options about changing the board. Everything is determined by the dice.

Game Each possible arrangement of the pieces on the board is called a state. The game changes from one state to another in a fashion determined by the input of a certain number. For each possible number, there is one and only one resulting state. It is possible that after a number is entered, the game is still in the same state as before (e.g., a player is in jail and needs to roll doubles to get out). The victory state is called a final state; there may be many final states. Final states are also called halting states, terminal states, or accepting states.

Computer A child has a simple computer (input device, processing unit, memory, output device) and wishes to calculate the sum of 3 plus 4. The child writes a program, which is a sequence of instructions that are fed into the machine one at a time. Each instruction is executed as soon as it is read, and then the next instruction is read. If all goes well, the machine outputs the number 7 and terminates execution. The computer is also deterministic: the resulting state is completely determined by the prior state and the input instruction.

Game vs. Computer Difference:
In the game, how much input is supplied (how many numbers are generated by rolling the dice) depends on whether anyone has won the game yet. For the computer, the number of instructions is fixed before the program runs. From now on we consider the input of both the game and the computer to be strings over an alphabet, rather than numbers generated randomly by dice or the instructions of a program.

Finite Automata (FA) A simple class of machines with limited capabilities. They are good models for computers with an extremely limited amount of memory, e.g., an automatic door: a computer with only a single bit of memory.

State Diagram [Figure: an automatic door with a front pad and a rear pad; state diagram with states CLOSED and OPEN and input signals FRONT, REAR, BOTH, NEITHER.]

State Transition Table
[Table: rows are the states CLOSED and OPEN; columns are the input signals NEITHER, FRONT, REAR, BOTH.]

Examples
Elevator controller (state: floor; input: signal received from the buttons)
Dishwashers
Electronic thermostats
Digital watches
Calculators

Definition A finite automaton is a collection of 3 things:
1. A finite set of states, one of which is designated as the initial state, called the start state, and some (maybe none) of which are designated as final states.
2. An alphabet of possible input letters.
3. A finite set of transitions that tell, for each state and for each letter of the input alphabet, which state to go to next.

State Diagram [Figure: state diagram with states q1, q2, q3 and arrows labeled 0, 1, and 0,1.] start state = q1; final state = q2; transitions = the arrows; alphabet = the labels on the arrows. When this automaton receives an input string such as 1101, it processes that string and produces an output (Accept or Reject).

Language of machine If A is the set of all strings that machine M accepts, we say that A is the language of machine M, written L(M) = A. M recognizes A (a machine recognizes exactly one language). M accepts strings (it may accept several strings). If M accepts no strings, it still recognizes one language, the empty language ∅.

Formal Definition A finite automaton is a 5-tuple (Q, Σ, δ, q0, F) where
Q is a finite set called the states,
Σ is a finite set called the alphabet,
δ: Q × Σ → Q is the transition function,
q0 ∈ Q is the start state, and
F ⊆ Q is the set of accept states (final states).

Example: Finite Automaton M1
[Figure: state diagram of M1 with states q1, q2, q3.] M1 = (Q, Σ, δ, q0, F), where Q = {q1, q2, q3}, Σ = {0, 1}, δ is described by a transition table [Table: rows q1, q2, q3; columns 0, 1], q1 is the start state, and F = {q2}.

What is the language of M1?

A = {w | w contains at least one 1 and an even number of 0s follow the last 1}. L(M1) = A, or equivalently, M1 recognizes A.
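The behavior of M1 can be sketched in a few lines of Python. The transition table below is an assumption: the table and most arrow labels did not survive in these notes, so the entries were chosen to match the stated language A. The names `DELTA`, `START`, `FINAL`, and `accepts` are illustrative only.

```python
# Assumed transition table for M1, reconstructed to match the language A
# (at least one 1, and an even number of 0s after the last 1).
DELTA = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q2",
    ("q3", "0"): "q2", ("q3", "1"): "q2",
}
START = "q1"
FINAL = {"q2"}

def accepts(w):
    """Run the automaton on w and report whether it stops in a final state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in FINAL

for w in ["1101", "100", "10", ""]:
    print(repr(w), "Accept" if accepts(w) else "Reject")
```

Each input letter causes exactly one table lookup, which is the sense in which the machine is deterministic.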

[Figure: state diagram of M2 with states q1 and q2.] M2 = (Q, Σ, δ, q0, F), where Q = ___, Σ = ___, δ is described as ___, ___ is the start state, and F = { ___ }.

What is the language of M2? L(M2) = {w | w ends in a 1}

Empty String ε [Figure: state diagram of M3 with states q1 and q2.] If the start state is also a final state, what string does the machine automatically accept? L(M3) = {w | w is the empty string ε or ends in a 0}

[Figure: state diagram of M4 with states s, q1, q2, r1, r2 and arrows labeled a and b.] M4 = (Q, Σ, δ, q0, F), where Q = ___, Σ = ___, δ is described as ___, ___ is the start state, and F = { ___ }. L(M4) = ___

[Figure: state diagram of M5 with states q0, q1, q2 and arrows labeled 0, 1, 2, and <reset>.] Σ = {<reset>, 0, 1, 2}; we treat <reset> as a single symbol. What does M5 accept?

FA with Computer Language
Certain character strings are recognizable words (DO, IF, END, …). Certain strings of words are recognizable commands. Certain sets of commands form a program that can be compiled, which means translated into machine commands. An FA is used to determine whether the input commands (instructions) are valid according to the structure rules. The FA implements the rules with its transitions.

Theory of formal languages
The word “Formal” refers to the fact that all the rules for the language are explicitly stated in terms of what strings of symbols can occur. NO liberties are tolerated. Language will be considered as symbols on paper not as expressions of ideas in the minds of humans.

Conclusion Language is a game of symbols with formal rules.
We are interested only in the form of the strings of symbols, not in their meaning.

Terminology Empty string or null string (ε)
A string with no letters (length 0). We do NOT allow the empty string to be part of the alphabet of any language. A language with no words is called the empty language or null set (∅).

The truth about ε vs. ∅ It is not true that ε is a word in the language ∅, since this language has no words at all.

Abstract languages A language can be defined in 2 ways, presented as either
an alphabet and the exhaustive list of all valid words, or
an alphabet and a set of rules defining the acceptable words.

PALINDROME language Definition of a new language PALINDROME over the alphabet :

PALINDROME if we begin listing the elements in PALINDROME, we find
If we concatenate 2 words in PALINDROME, sometimes the result is a new word that is also in PALINDROME, and sometimes it is not. (We will talk about this later.)
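A small check makes the remark about concatenation concrete. The formal definition of PALINDROME did not survive in these notes, so this sketch assumes the usual one: a word over {a, b} is in PALINDROME when it equals its own reverse, with the empty string included. The function name `is_palindrome` and the sample words are illustrative.

```python
def is_palindrome(w):
    """True when w reads the same forwards and backwards (the empty string counts)."""
    return w == w[::-1]

# Two pairs of palindromes: one concatenation stays in PALINDROME, one does not.
for x, y in [("aba", "aba"), ("aa", "aba")]:
    print(x, "+", y, "=", x + y, "->", is_palindrome(x + y))
```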

Kleene Closure / Kleene star
The closure of an alphabet Σ, written Σ*, is the language in which any string of letters from Σ is a word. For example, if Σ = {x}, then Σ* = {ε x xx xxx …}; if Σ = {0, 1}, then Σ* = {ε 0 1 00 01 10 11 000 …}.

Kleene star (cont.) We can think of the Kleene star as
an operation that makes an infinite language of strings of letters out of an alphabet. Infinite language = infinitely many words, each of finite length.

Definition of S* If S is a set of words, then by S*
we mean the set of all finite strings formed by concatenating words from S, where any word may be used as often as we like, and where the null string is also included.

Example If S = {aa b}, then
S* = {ε plus any word composed of factors of aa and b} = {ε plus all strings of a’s and b’s in which the a’s occur in even clumps} = {ε b aa bb aab baa bbb …}

Example If S = {a ab}, then
S* = {ε plus any word composed of factors of a and ab} = {ε plus all strings of a’s and b’s except those that start with b and those that contain a double b} = {ε a aa ab aaa aab aba …}

Proving that a word is in S* To prove that a certain word is in the closure language S*, we must show how it can be written as a concatenation of words from the base set S. For example, with S = {a ab}, to show that abaab is in S* we can factor it as (ab)(a)(ab); these factors are in S, therefore their concatenation is in S*. If there is only one way to factor the string, we say that the factoring is unique.
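The factoring argument can be sketched as a short recursive search. This is only an illustration; the function name `factorings` is hypothetical, and S is assumed to be a finite set of strings.

```python
def factorings(word, S):
    """Return every way to write word as a concatenation of words from S."""
    if word == "":
        return [[]]  # the null string is the concatenation of zero words
    results = []
    for factor in sorted(S):
        # Skip the null string as a factor so the recursion always makes progress.
        if factor and word.startswith(factor):
            for rest in factorings(word[len(factor):], S):
                results.append([factor] + rest)
    return results

print(factorings("abaab", {"a", "ab"}))    # one factoring: the factoring is unique
print(factorings("ba", {"a", "ab"}))       # no factoring: ba is not in S*
print(factorings("ab", {"a", "b", "ab"}))  # two factorings: not unique
```

A word is in S* exactly when the returned list is non-empty, and the factoring is unique when the list has one entry.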

Example Consider the 2 languages S = {a b ab} and T = {a b bb}
both S* and T* are languages of all strings of a’s and b’s since any string of a’s and b’s can be factored into syllables of either (a) or (b), both of which are in S and T.

Positive closure (+) If we would like to refer only to the concatenation of some (not zero) strings from a set S, we use the notation + instead of *. For example, if Σ = {x}, then Σ+ = {x xx xxx …}.

S+ If S is a set of strings that does not include ε, then S+ is the language S* without the word ε. If S is a language that does contain ε, then S+ = S*. S+ can contain ε only when S contains the word ε initially.

S* and S** Theorem 1: For any set S of strings we have S* = S**.
Proof: Every word in S** is made up of factors from S*. Every factor from S* is made up of factors from S. Therefore, every word in S** is made up of factors from S, so every word in S** is also a word in S*: S** ⊆ S* (is contained in or equal to). Conversely, every word in S* is a single factor from S*, so it is also a word in S**: S* ⊆ S**. Hence S* = S**.

Regular Expressions So far we have described languages by listing a few of their words followed by “…”. We can often guess the meaning of such a definition, but a language can be defined in a particular way that makes it hard to guess. For example:

Another new method of language definition
We shall develop some new language-defining symbolism that will be much more precise than the “…”. For example, consider the language L4 = {ε x xx xxx …}. We can define it with closure: let S = {x}. Then L4 = S*, or for shorthand we could have written L4 = {x}*.

Simple expression By using the Kleene star, we can write a simple expression instead of “…”. To distinguish the letter x of the alphabet from the x in the Kleene star expression x*, we write the expression in boldface.

Language(x*) We can also define L4 as L4 = language(x*).
Since x* is any string of x’s, L4 is then the set of all possible strings of x’s of any length (including ε).

Example Suppose we wish to describe the language L over the alphabet Σ = {a, b}
consisting of “all words of the form one ‘a’ followed by some number of ‘b’s (maybe no b’s at all)”. We may write L = language(ab*).

ab* means a(b*), not (ab)*
Parentheses are not letters in the alphabet of this language, so they can be used to indicate factoring without accidentally changing the words. As with powers in algebra, ab* means a(b*), not (ab)*.

Language(xx*) vs. Language(x+)
What does L1 = language(xx*) mean? We start each word of L1 by writing down an x, and then we follow it with some string of x’s (which may be no more x’s at all). We can use the + notation and write L1 = language(x+).

Example The language L1 defined above can also be defined by any of these expressions: xx*, x+, xx*x*, x*xx*, x+x*, x*x+, x*x*x*xx*. Remember that x* can always be ε.
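The claim that these expressions all define the same language can be spot-checked with Python's `re` module, whose syntax happens to coincide with the notation for these particular expressions. The check only covers strings of up to 8 letters, so it is evidence rather than a proof; the helper name `language_up_to` is illustrative.

```python
import re

EXPRESSIONS = ["xx*", "x+", "xx*x*", "x*xx*", "x+x*", "x*x+", "x*x*x*xx*"]

def language_up_to(pattern, max_len):
    """Strings of x's of length 0..max_len that the whole pattern matches."""
    return {"x" * n for n in range(max_len + 1)
            if re.fullmatch(pattern, "x" * n)}

languages = [language_up_to(p, 8) for p in EXPRESSIONS]
print(all(lang == languages[0] for lang in languages))  # do all seven agree?
print("" in languages[0])                               # is the null string a word?
```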

Example ab*a is the set of all string of a’s and b’s that have at least two letters, that begin and end with a’s, and that have nothing but b’s inside (if anything at all).

Example a*b* contains all the strings of a’s and b’s in which all the a’s (if any) come before all the b’s (if any). Notice that ba and aba are not in this language.

a*b* vs. (ab)* (ab)* can contain abab but a*b* can’t contain abab

Regular Operations Let A and B be languages. We define the regular operations union, concatenation, and star as follows.
Union: A ∪ B = {x | x ∈ A or x ∈ B}
Concatenation (the operator is often not written): A ∘ B = {xy | x ∈ A and y ∈ B}
Star: A* = {x1x2x3 … xk | k ≥ 0 and each xi ∈ A}
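For finite languages the three operations can be tried out directly with Python sets. This is a minimal sketch: the function names are illustrative, the sample languages A and B are made up, and because A* is usually infinite, `star` only collects words up to a chosen length.

```python
def union(A, B):
    return A | B

def concatenation(A, B):
    return {x + y for x in A for y in B}

def star(A, max_len):
    """Words of A* with length at most max_len."""
    result = {""}    # k = 0 contributes the empty string
    frontier = {""}
    while frontier:
        # Extend the newest words by one more factor from A, keeping only new ones.
        frontier = {x + y for x in frontier for y in A
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result

A = {"a", "ab"}
B = {"b"}
print(union(A, B))
print(concatenation(A, B))
print(star(B, 3))
```

Note that `star(set(), n)` returns only the empty string, matching the fact that k = 0 is always allowed.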

Union (∪) x ∪ y, where x and y are strings of characters from an alphabet,
means “either x or y”.

Example Consider the language T defined over the alphabet Σ = {a, b, c}:
all the words in T begin with an a or a c and then are followed by some number of b’s. We may write T = language((a ∪ c)b*).

Finite language L
We can define any finite language by our new expressions. For example, consider the finite language L that contains all the strings of a’s and b’s of length exactly 3. The first letter can be either a or b, and so can the 2nd and 3rd letters: L = language((a ∪ b)(a ∪ b)(a ∪ b))

Finite language (cont.)
or we can write this more briefly as L = language((a ∪ b)^3). If we write (a ∪ b)*, it means the set of all possible strings of letters from the alphabet, including the null string ε.

a(ab)*b = a(arbitrary string)b
Examples If we write a(ab)* we can describe all words that begin with the letter a. If we would like to describe all words that begin with an a and end with a b, we can define by the expression a(ab)*b = a(arbitrary string)b

Formal definition of regular expressions
The new notation we have been using is called a “regular expression” (RE). Languages that can be described by an RE are called “regular languages”. Not every language can be described by an RE. Regular languages can also be described in other ways besides REs.

Regular Expression The symbols that appear in RE are
the letters of the alphabet Σ, the symbol for the null string ε, parentheses ( ), the star operator *, and the ∪ sign.

Formal Definition of a Regular Expression
Say that R is a regular expression if R is
1. a for some a in the alphabet Σ,
2. ε,
3. ∅,
4. (R1 ∪ R2), where R1 and R2 are regular expressions,
5. (R1 ∘ R2), where R1 and R2 are regular expressions, or
6. (R1*), where R1 is a regular expression.

Regular Expressions’ rules
Rule 1: Every letter of Σ can be made into a regular expression.
Rule 2: If r1 and r2 are regular expressions, then so are (i) (r1), (ii) r1 ∘ r2 (also written r1r2), (iii) r1 ∪ r2, (iv) r1*.
Rule 3: Nothing else is a regular expression.

Why not r1+ ? We could have included the plus sign as part of the definition, but since r1+ can always be written as r1r1*, this would add nothing new.

Parentheses We use parentheses ( ) to eliminate ambiguity when we apply * or + to an expression. For example, if r1 = aa ∪ b, what is r1*? Is r1* = aa ∪ b* or (aa ∪ b)*? Both are REs, but they are very different. Answer: the latter. In this case we must add the parentheses when we substitute aa ∪ b for r1 in r1*.

Null Language ε is the symbol for the null string in a regular expression.
∅ is the symbol for the “null language”. Don’t confuse them! R = ε represents the language containing a single string, the empty string: {ε}. R = ∅ represents the language that doesn’t contain any strings.

Definitions If we let R be any regular expression, then
R ∪ ∅ = R: adding the empty language to any other language will not change it.
R ∘ ε = R: concatenating the empty string to any string will not change it.
R ∪ ε may not equal R, e.g., if R = 0, then L(R) = {0} but L(R ∪ ε) = {0, ε}.
R ∘ ∅ may not equal R, e.g., if R = 0, then L(R) = {0} but L(R ∘ ∅) = ∅.

Example Let us consider the language defined by (a ∪ b)*a(a ∪ b)*
What does it produce? Answer: the set of all words over the alphabet Σ = {a, b} that have an a somewhere. The only words that are not in this language are those that have only b’s, and the word ε.

Union of two languages The words composed only of b’s are defined by the expression b* (b* also includes the null string ε). Therefore, the language of all strings over the alphabet Σ = {a, b} is: all strings = (all strings with an a) ∪ (all strings without an a), that is, (a ∪ b)* = (a ∪ b)*a(a ∪ b)* ∪ b*

(ab)*a (ab)*a (ab)*
Example How can we describe the language of all words that have at least 2 a’s ? Ans (ab)*a (ab)*a (ab)* = (some beginning)(the first a)(some middle)(the second a)(some end) where the arbitrary parts can have as many a’s (or b’s) as they want.

Example Is there any other RE that can define the language of words with at least 2 a’s? Answer: Yes. For example: b*ab*a(a ∪ b)* = (some beginning of b’s (if any))(the first a)(some middle of b’s)(the second a)(some end)

Equivalent expressions
(ab)*a (ab)*a (ab)* = b*ab*a(ab)* Both expressions are equivalent b/c they both describe the same item. We could write language ((ab)*a (ab)*a (ab)*) = language(b*ab*a(ab)*) = all words with at least two a’s = (ab)*ab*ab* = b*a(ab)* ab*

Example If we wanted all words with exactly 2 a’s, we could use the expression b*ab*ab*. It describes such words as aab, baba, bbbabbbab, … Question: Can it make the word aab? Answer: Yes, by taking the first and second b* to be ε.

Example How about the language of words with at least one a and at least one b? The expression (a ∪ b)*a(a ∪ b)*b(a ∪ b)* can only produce words in which an a precedes a b. To produce words in which a b precedes an a, we can use (a ∪ b)*b(a ∪ b)*a(a ∪ b)*. Thus, the set of all such words is (a ∪ b)*a(a ∪ b)*b(a ∪ b)* ∪ (a ∪ b)*b(a ∪ b)*a(a ∪ b)*

(ab)*a (ab)*b (ab)* (ab)*b (ab)*a (ab)*
Example (ab)*a (ab)*b (ab)* can produce all words with at least one a and at least one b, However, it doesn’t contain the words of the forms some b’s followed by some a’s. These exceptions are all defined by bb*aa* Thus, we have all strings over  = {a,b} (ab)*a (ab)*b (ab)* (ab)*b (ab)*a (ab)* = (ab)*a (ab)*b (ab)* bb*aa*

(ab)*a (ab)*b (ab)* bb*aa*
generates all words which have both a and b in them somewhere. Words which are not included in the above expression are words of all a’s, all b’s or  a*, b* Now, we have all words which can be generated above the alphabet (ab)* = (ab)*a (ab)*b (ab)* bb*aa*  a*  b*
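Identities like these can be spot-checked by brute force. The sketch below translates the expressions into Python `re` syntax (an assumption of this example: `|` plays the role of ∪) and compares them on every word over {a, b} up to length 7. Agreement on short words is evidence, not a proof; the helper names are illustrative.

```python
import re
from itertools import product

def words(max_len, alphabet="ab"):
    """All strings over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def same_language(p, q, max_len=7):
    """Compare two patterns on every word up to max_len."""
    return all(bool(re.fullmatch(p, w)) == bool(re.fullmatch(q, w))
               for w in words(max_len))

ANY = "(a|b)*"
both = f"{ANY}a{ANY}b{ANY}|bb*aa*"   # words containing both an a and a b

print(same_language(ANY, f"{both}|a*|b*"))                   # the full decomposition
print(same_language(f"{ANY}a{ANY}a{ANY}", f"b*ab*a{ANY}"))   # at least two a's
print(same_language(ANY, both))                              # fails without a* and b*
```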

Distributive Law Let V be the language of all strings of a’s and b’s in which either the strings are all b’s or else there is an a followed by some b’s. Let V also contain the word ε. We can define V by b* ∪ ab* or, factoring out b*, by (ε ∪ a)b*.

Finite languages are regular
If L is a finite language, then L can be defined by a regular expression. For example, if L = {aa ab ba bb}, a regular expression describing L is aa ∪ ab ∪ ba ∪ bb. Another regular expression is (a ∪ b)(a ∪ b). Note: the regular expression that defines a language need not be unique.

Regular Languages A language is called a regular language if some finite automaton recognizes it. Every regular language can be defined by an RE. Not all languages are regular.

Closure Properties Theorem:
If L1 and L2 are regular languages, then L1 ∪ L2, L1L2, and L1* are also regular languages. We say that the set of regular languages is closed under union, concatenation, and Kleene closure.

Proof: If L1 and L2 are regular languages, there are REs r1 and r2 that define these languages. Then (r1 ∪ r2) is an RE that defines the language L1 ∪ L2. The language L1L2 can be defined by the RE r1r2. The language L1* can be defined by the RE (r1)*. Therefore, all 3 of these sets of words are defined by REs and so are themselves regular languages.

Theorem 1.12 The class of regular languages is closed under the union operation. In other words, if A1 and A2 are regular languages, so is A1 ∪ A2.

Formal Proof of Theorem 1.12
Let M1 recognize A1, where M1 = (Q1, Σ, δ1, q1, F1), and let M2 recognize A2, where M2 = (Q2, Σ, δ2, q2, F2).
Construct M to recognize A1 ∪ A2, where M = (Q, Σ, δ, q0, F):
Q = {(r1, r2) | r1 ∈ Q1 and r2 ∈ Q2}, the Cartesian product Q1 × Q2.
Σ is the alphabet. If M1 and M2 have different alphabets Σ1 and Σ2, then Σ = Σ1 ∪ Σ2.
δ is the transition function: for each (r1, r2) ∈ Q and each a ∈ Σ, let δ((r1, r2), a) = (δ1(r1, a), δ2(r2, a)).
q0 is the pair (q1, q2).
F is the set of pairs in which either member is an accept state of M1 or M2: F = {(r1, r2) | r1 ∈ F1 or r2 ∈ F2}.
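The construction in this proof can be sketched as follows. The sketch assumes both machines share the same alphabet and have complete transition tables; the two sample machines (`ENDS_IN_1` and `EVEN_0S`) and all function names are hypothetical.

```python
from itertools import product

def union_dfa(m1, m2):
    """Build the product machine that accepts when m1 or m2 accepts.

    Each machine is a tuple (states, alphabet, delta, start, finals),
    where delta maps (state, symbol) to the next state.
    """
    Q1, sigma, d1, s1, F1 = m1
    Q2, _, d2, s2, F2 = m2
    Q = set(product(Q1, Q2))                       # Cartesian product Q1 x Q2
    delta = {((r1, r2), a): (d1[(r1, a)], d2[(r2, a)])
             for (r1, r2) in Q for a in sigma}     # run both machines in step
    F = {(r1, r2) for (r1, r2) in Q if r1 in F1 or r2 in F2}
    return Q, sigma, delta, (s1, s2), F

def accepts(m, w):
    _, _, delta, state, finals = m
    for a in w:
        state = delta[(state, a)]
    return state in finals

# Hypothetical inputs: "ends in 1" and "has an even number of 0s".
ENDS_IN_1 = ({"n", "y"}, {"0", "1"},
             {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
             "n", {"y"})
EVEN_0S = ({"e", "o"}, {"0", "1"},
           {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"},
           "e", {"e"})

M = union_dfa(ENDS_IN_1, EVEN_0S)
for w in ["01", "00", "0", "10"]:
    print(w, accepts(M, w))
```

Replacing `or` with `and` in the definition of F would give a machine for the intersection instead.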

Theorem 1.13 The class of regular languages is closed under the concatenation operation. In other words, if A1 and A2 are regular languages, then so is A1 ∘ A2.

Proof of Theorem 1.13 Suppose we try to prove this theorem the same way as Theorem 1.12.
The machine M must accept its input if the input can be broken into two pieces, where M1 accepts the first piece and M2 accepts the second piece. Here we have a problem: M doesn’t know where to break its input (i.e., where the first part ends and the second begins).

Determinism So far, every step of a computation follows in a unique way from the preceding step. When the machine is in a given state and reads the next input symbol, we know what the next state will be – it is called deterministic computation Deterministic Finite Automata -- DFA

Nondeterminism In a nondeterministic machine, several choices may exist for the next state at any point. Nondeterminism is a generalization of the determinism, so every deterministic finite automaton is automatically a nondeterministic finite automaton. Nondeterministic Finite Automata--NFA

Example of DFA vs. NFA [Figure: a DFA with states q1, q2, q3, and an NFA with states q1, q2, q3, q4 and arrows labeled 0,1; 1; and 0,ε.]

Differences between DFA & NFA
Every state of a DFA always has exactly one exiting transition arrow for each symbol in the alphabet, while an NFA can violate this rule. In a DFA, the labels on the transition arrows are from the alphabet, while an NFA can have arrows with the label ε.

How does the NFA work? When we are at a state with multiple choices to proceed (including ε arrows), the machine splits into multiple copies of itself and follows all the possibilities in parallel. Each copy of the machine takes one of the possible ways to proceed and continues as before. If there are subsequent choices, the machine splits again. If the next input symbol doesn’t appear on any of the arrows exiting the state occupied by a copy of the machine, that copy dies. If any one of these copies is in an accept state at the end of the input, the NFA accepts the input string.

Tree of possibilities Think of a nondeterministic computation as a tree of possibilities. The root of the tree corresponds to the start of the computation. Every branch point in the tree corresponds to a point in the computation at which the machine has multiple choices. The machine accepts if at least one of the computation branches ends in an accept state.

Tree of possibilities [Figure: computation trees; labels: nondeterministic computation, start, accept or reject, reject, accept.]

Example: 010110 [Figure: the tree of possibilities for the NFA with states q1, q2, q3, q4 reading the input 010110, starting in q1.]

Properties of NFA Every NFA can be converted into an equivalent DFA.
Constructing NFAs is sometimes easier than directly constructing DFAs. An NFA may be much smaller than its DFA counterpart. An NFA’s functioning may be easier to understand. NFAs are a good introduction to nondeterminism in more powerful computational models because FA are especially easy to understand.

Example: Converting NFA into DFA
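[Figure: NFA with states q1, q2, q3, q4 and arrows labeled 0,1 and 1.] This NFA recognizes the language of strings that contain a 1 in the third position from the end. [Figure: equivalent DFA with states q000, q100, q001, q010, q011, q101, q111, q110.]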

Formal definition of NFA
A nondeterministic finite automaton is a 5-tuple (Q, Σ, δ, q0, F), where
Q is a finite set of states,
Σ is a finite alphabet,
δ: Q × Σε → P(Q) is the transition function,
q0 ∈ Q is the start state, and
F ⊆ Q is the set of accept states.
Notation: P(Q) is called the power set of Q (the collection of all subsets of Q), and Σε = Σ ∪ {ε}.

Example: Formal definition of NFA
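[Figure: NFA N1 with states q1, q2, q3, q4 and arrows labeled 0,1; 1; and 0,ε.] The formal definition of N1 is (Q, Σ, δ, q1, F), where Q = {q1, q2, q3, q4}, Σ = {0, 1}, δ is given by a transition table [Table: rows q1, q2, q3, q4; columns 0, 1, ε; for example, δ(q1, 0) = {q1} and δ(q1, 1) = {q1, q2}], q1 is the start state, and F = {q4}.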

N accepts w Let N = (Q, Σ, δ, q0, F) be an NFA and w a string over the alphabet Σ. Then we say that N accepts w if we can write w as w = y1y2…ym, where each yi is a member of Σε, and a sequence of states r0, r1, …, rm exists in Q satisfying the following 3 conditions:
1. r0 = q0,
2. ri+1 ∈ δ(ri, yi+1) for i = 0, …, m−1, and
3. rm ∈ F.
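Acceptance by an NFA can be simulated by tracking the set of states that the parallel copies of the machine occupy. The sketch below is an illustration under assumptions: the transition table for N1 is not fully legible in these notes, so `N1_DELTA` follows the arrow labels 0,1; 1; 0,ε; 1; 0,1 read from q1 to q4, and the empty Python string stands for ε. All names are hypothetical.

```python
EPS = ""  # stands for the epsilon label

def eps_closure(states, delta):
    """All states reachable from the given set along 0 or more epsilon arrows."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(delta, start, finals, w):
    current = eps_closure({start}, delta)
    for symbol in w:
        moved = set()
        for q in current:
            # A copy with no arrow for this symbol simply dies (contributes nothing).
            moved |= delta.get((q, symbol), set())
        current = eps_closure(moved, delta)
    return bool(current & finals)

# Assumed transitions for N1.
N1_DELTA = {
    ("q1", "0"): {"q1"}, ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q3"}, ("q2", EPS): {"q3"},
    ("q3", "1"): {"q4"},
    ("q4", "0"): {"q4"}, ("q4", "1"): {"q4"},
}

for w in ["010110", "0100", "11", "10"]:
    print(w, nfa_accepts(N1_DELTA, "q1", {"q4"}, w))
```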

Equivalence of NFAs and DFAs
say that two machines are equivalent if they recognize the same language. Theorem: Every nondeterministic finite automaton has an equivalent deterministic finite automaton.

Proof of the Theorem Let N = (Q, Σ, δ, q0, F) be an NFA recognizing some language A. We construct a DFA M recognizing A. Let’s first consider the easier case wherein N has no ε arrows.

Proof of the Theorem Construct M = (Q', Σ, δ', q0', F'):
1. Q' = P(Q). Every state of M is a set of states of N.
2. For R ∈ Q' and a ∈ Σ, let δ'(R, a) = {q ∈ Q | q ∈ δ(r, a) for some r ∈ R}. If R is a state of M, it is also a set of states of N. When M reads a symbol a in state R, it shows where a takes each state in R.
3. q0' = {q0}. M starts in the state corresponding to the collection containing just the start state of N.
4. F' = {R ∈ Q' | R contains an accept state of N}. M accepts if one of the possible states that N could be in at this point is an accept state.

E(R) Now we consider the ε arrows. To do so, we set up an extra bit of notation. For any state R of M, we define E(R) to be the collection of states that can be reached from R by going only along ε arrows, including the members of R themselves. Formally, for R ⊆ Q let E(R) = {q | q can be reached from R by traveling along 0 or more ε arrows}. Thus, δ'(R, a) = {q ∈ Q | q ∈ E(δ(r, a)) for some r ∈ R}.
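The construction with E(R) can be sketched in code. This version builds only the subsets reachable from the start state rather than all of P(Q), which is a simplification of the proof's construction. The three-state NFA used here is an assumption (arrows 1 -b-> 2, 1 -ε-> 3, 2 -a-> {2, 3}, 2 -b-> 3, 3 -a-> 1, with start and accept state 1); all names are illustrative.

```python
EPS = ""  # stands for the epsilon label

def E(R, delta):
    """States reachable from the set R along 0 or more epsilon arrows."""
    closure, stack = set(R), list(R)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def subset_construction(sigma, delta, start, finals):
    """Return (states, dfa_delta, dfa_start, dfa_finals), reachable subsets only."""
    dfa_start = E({start}, delta)
    states, todo, dfa_delta = {dfa_start}, [dfa_start], {}
    while todo:
        R = todo.pop()
        for a in sigma:
            moved = set()
            for r in R:
                moved |= delta.get((r, a), set())
            target = E(moved, delta)        # the new transition function on sets
            dfa_delta[(R, a)] = target
            if target not in states:
                states.add(target)
                todo.append(target)
    dfa_finals = {R for R in states if R & finals}
    return states, dfa_delta, dfa_start, dfa_finals

# Assumed NFA (see the lead-in).
NFA_DELTA = {(1, "b"): {2}, (1, EPS): {3}, (2, "a"): {2, 3},
             (2, "b"): {3}, (3, "a"): {1}}
STATES, DELTA, START, FINALS = subset_construction("ab", NFA_DELTA, 1, {1})

def dfa_accepts(w):
    state = START
    for a in w:
        state = DELTA[(state, a)]
    return state in FINALS

print(sorted(sorted(R) for R in STATES))
print(dfa_accepts("baa"), dfa_accepts("b"))
```

Under these assumptions the subsets {1} and {1, 2} are never reached, so they do not appear in the result.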

Example: NFA → DFA [Figure: NFA with states 1, 2, 3 and arrows labeled a, b, a,b, and ε.] Equivalent DFA:
Q' = P(Q) = {∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}}. Start state: all possible states that can be reached from the start state of the NFA along the ε arrows, so start state = E({1}) = {1,3}. Accept states are those containing the NFA’s accept state 1: {{1}, {1,2}, {1,3}, {1,2,3}}.

Example: NFA → DFA [Figure: the DFA on the states ∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3} with arrows labeled a and b.]
Simplify the machine by eliminating the states {1} and {1,2}, as no arrows point to them.

Closure Properties proved by NFA
Proof 2 (by NFA): Because L1 and L2 are regular languages, there must be NFAs that accept them. Let NFA1 accept L1 and NFA2 accept L2. Assume that both NFAs have a unique start state and a unique, separate final state. [Figure: the NFA which accepts L1 ∪ L2, built from NFA1 (start state q1) and NFA2 (start state q2) with ε arrows.]

Closure Properties proved by NFA
The NFA which accepts L1 L2  1 2 NFA1 NFA2 The NFA which accepts L1*   NFA1 If the start state has internal edges leading back to it, we must add a duplicate start state.

Example Let the alphabet be Σ = {a, b} and
L1 = all words of two or more letters that begin and end with the same letter
L2 = all words that contain the substring aba
r1 = a(a ∪ b)*a ∪ b(a ∪ b)*b
r2 = (a ∪ b)*aba(a ∪ b)*
[Figure: NFA1 for L1 and NFA2 for L2.]

Example: L1 ∪ L2 is defined by the RE
[a(a ∪ b)*a ∪ b(a ∪ b)*b] ∪ [(a ∪ b)*aba(a ∪ b)*]
[Figure: NFA for L1 ∪ L2, with ε arrows leading into NFA1 and NFA2.]

Example: L1L2 is defined by the RE
[a(a ∪ b)*a ∪ b(a ∪ b)*b][(a ∪ b)*aba(a ∪ b)*]
[Figure: NFA for L1L2, with an ε arrow from NFA1 (1) to NFA2 (2).]

Example: L1* is defined by the RE [a(a ∪ b)*a ∪ b(a ∪ b)*b]*
[Figure: NFA for L1*, with ε arrows.]

Equivalence with FA Theorem
A language is regular if and only if some regular expression describes it. this theorem has 2 directions, we have to prove each direction as a separate lemma.

Lemma 1: If a language is described by a regular expression, then it is regular. Proof: Say that we have an RE R describing some language A. We show how to convert R into an NFA recognizing A.

Proof Consider the 6 cases in the formal definition of RE:
1. R = a for some a in Σ. Then L(R) = {a}. [Figure: NFA with a single arrow labeled a.]
2. R = ε. Then L(R) = {ε}.
3. R = ∅. Then L(R) = ∅.
4. R = R1 ∪ R2
5. R = R1 ∘ R2
6. R = R1*
For the last 3 cases, use the constructions given in the proofs that the class of regular languages is closed under the regular operations.

Example Convert the following regular expression into an NFA (aba)*

Lemma 2: If a language is regular, then it is described by a regular expression. Proof: We need to show that if a language A is regular, a regular expression describes it. Because A is regular, it is accepted by a DFA. So, we will describe a procedure for converting DFAs into equivalent regular expressions.

GNFA Properties of generalized nondeterministic finite automata:
A GNFA is simply a nondeterministic finite automaton, so it may have several different ways to process the same input string.
Its transition arrows may have any regular expressions as labels.
It reads blocks of symbols from the input, not necessarily just one symbol at a time.
It moves along a transition arrow connecting 2 states by reading a block of symbols from the input that constitutes a string described by the RE on that arrow.

GNFA For convenience we require that a GNFA always satisfies these special conditions:
The start state has transition arrows going to every other state but no arrows coming in from any other state.
There is only a single accept state, and it has arrows coming in from every other state but no arrows going out.
The accept state is not the same as the start state.
Except for the start and accept states, one arrow goes from every state to every other state and also from each state to itself.

Example

Convert GNFA into RE First bring the machine into the special form:
add a new start state with an ε arrow to the old start state;
add a new accept state with ε arrows from the old accept states;
the old start state and accept states become just simple states;
if any arrows have multiple labels (or if there are multiple arrows going between the same 2 states in the same direction), replace each with a single arrow whose label is the union of the previous labels;
add arrows labeled ∅ between states that had no arrows.

Convert GNFA into RE How to convert GNFA into RE
A GNFA has k states. As a GNFA has a start state and an accept state (different from each other), we know that k ≥ 2. If k > 2, we construct an equivalent GNFA with k − 1 states. If k = 2, the GNFA has a single arrow that goes from the start state to the accept state. The label of this arrow is the equivalent regular expression.

Constructing equivalent GNFA when k > 2
Selecting a state, ripping it out of the machine qrip Repairing the remainder labels by adding back the lost computations so that the same language is still recognized Any state will do, provided that it is not the start or accept state

Convert(G) Let k be the number of states of G.
If k=2, then G must consist of a start state, an accept state, and a single arrow connecting them and labeled with a regular expression R. Return the expression R.

δ'(qi, qj) = (R1)(R2)*(R3) ∪ (R4)
Convert(G) If k > 2, we select any state qrip ∈ Q different from qstart and qaccept and let G' be the GNFA (Q', Σ, δ', qstart, qaccept), where Q' = Q − {qrip}, and for any qi ∈ Q' − {qaccept} and any qj ∈ Q' − {qstart} let δ'(qi, qj) = (R1)(R2)*(R3) ∪ (R4) for R1 = δ(qi, qrip), R2 = δ(qrip, qrip), R3 = δ(qrip, qj), and R4 = δ(qi, qj). Compute Convert(G') and return this value.
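The Convert procedure can be sketched with regular expressions held as Python strings. Several choices here are assumptions of the sketch, not part of the definition: `None` stands for the label ∅, the empty string stands for ε, `|` is written for ∪ so the result can be tested with Python's `re` module, and the sample GNFA (for words over {a, b} with an even number of a's) is made up. The loop is an iterative form of the recursive description.

```python
import re
from itertools import product

EMPTY = None   # the label for "no arrow" (empty language)
EPSILON = ""   # the label for the empty string

def union(r, s):
    if r is EMPTY:
        return s
    if s is EMPTY:
        return r
    return f"(?:{r}|{s})"

def concat(*parts):
    if any(p is EMPTY for p in parts):
        return EMPTY   # concatenating with the empty language gives the empty language
    return "".join(f"(?:{p})" for p in parts)

def star(r):
    return EPSILON if r is EMPTY else f"(?:{r})*"

def convert(states, delta, q_start, q_accept):
    """Rip out states one at a time until only start and accept remain."""
    states, delta = list(states), dict(delta)
    while len(states) > 2:
        q_rip = next(q for q in states if q not in (q_start, q_accept))
        states.remove(q_rip)
        new_delta = {}
        for qi in states:
            if qi == q_accept:
                continue
            for qj in states:
                if qj == q_start:
                    continue
                r1 = delta.get((qi, q_rip), EMPTY)
                r2 = delta.get((q_rip, q_rip), EMPTY)
                r3 = delta.get((q_rip, qj), EMPTY)
                r4 = delta.get((qi, qj), EMPTY)
                new_delta[(qi, qj)] = union(concat(r1, star(r2), r3), r4)
        delta = new_delta
    return delta.get((q_start, q_accept), EMPTY)

# Made-up GNFA in special form: S and A are the added start and accept states.
G_STATES = ["S", "even", "odd", "A"]
G_DELTA = {("S", "even"): EPSILON, ("even", "A"): EPSILON,
           ("even", "even"): "b", ("even", "odd"): "a",
           ("odd", "even"): "a", ("odd", "odd"): "b"}
R = convert(G_STATES, G_DELTA, "S", "A")

# Compare the resulting expression with a direct count of a's on short words.
mismatches = [w for n in range(7) for w in map("".join, product("ab", repeat=n))
              if bool(re.fullmatch(R, w)) != (w.count("a") % 2 == 0)]
print(R)
print(mismatches)
```

The expression produced this way is correct but heavily parenthesized; simplifying it is a separate step.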

Definition of GNFA A generalized nondeterministic finite automaton is a 5-tuple (Q, Σ, δ, qstart, qaccept), where
Q is the finite set of states,
Σ is the input alphabet,
δ: (Q − {qaccept}) × (Q − {qstart}) → R is the transition function, where R is the collection of all regular expressions over Σ,
qstart is the start state, and
qaccept is the accept state.

Examples [Figure: example automata with states 1, 2, 3 and arrows labeled a, b, and a,b.]

Nonregular Languages Finite automata have limitations.
Here, we are going to prove that certain languages cannot be recognized by any finite automaton. Languages that can’t be recognized by any FA are called nonregular languages. Example: B = {0^n 1^n | n ≥ 0}

Pumping Lemma for RL All regular languages have a special property.
If we can show that a language does not have this property, we are guaranteed that it is not regular.

Pumping Lemma for RL Property:
“ All strings in the language can be “pumped” if they are longer than a certain special value, called the pumping length.” Each such string contains a section that can be repeated any number of times with the resulting string remaining in the language.

Theorem: Pumping Lemma
If A is a regular language, then there is a number p (the pumping length) where, if s is any string in A of length at least p, then s may be divided into 3 pieces, s = xyz, satisfying the following conditions:
1. for each i ≥ 0, xy^i z ∈ A,
2. |y| > 0, and
3. |xy| ≤ p.

Theorem: Pumping Lemma
Recall that |s| is the length of the string s, y^i means that i copies of y are concatenated together, and y^0 equals ε. When s is divided into xyz, either x or z may be ε, but condition 2 says that y ≠ ε.

Example Let B be the language {0^n 1^n | n ≥ 0}.
Use the pumping lemma to prove that B is not regular. Proof by contradiction: assume that B is regular, and let p be the pumping length. Choose a member of B of length at least p: let s be the string 0^p 1^p. By the pumping lemma, s = xyz where for any i ≥ 0 the string xy^i z is in B.

Example Consider 3 cases:
1. The string y consists only of 0s. In this case the string xyyz has more 0s than 1s and so is not a member of B, violating condition 1. This case is a contradiction.
2. The string y consists only of 1s. This case also gives a contradiction.
3. The string y consists of both 0s and 1s. In this case the string xyyz may have the same number of 0s and 1s, but they will be out of order, with some 1s before 0s. Hence it is not a member of B, which is a contradiction.
Thus, B is not a regular language.
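The case analysis can be illustrated by brute force for small values of p. The sketch below enumerates every division s = xyz of s = 0^p 1^p that satisfies conditions 2 and 3 and confirms that none of them keeps all pumped strings in B. This is a finite check for p from 1 to 6, not a proof; the function names are hypothetical. Because condition 3 forces y to lie within the first p symbols, only case 1 (y consists only of 0s) actually occurs here.

```python
def in_B(w):
    """Membership in B = { 0^n 1^n | n >= 0 }."""
    n = len(w) // 2
    return w == "0" * n + "1" * n

def splits(s, p):
    """All divisions s = xyz with |y| > 0 and |xy| <= p."""
    for i in range(p):                   # i = |x|
        for j in range(i + 1, p + 1):    # j = |xy|
            yield s[:i], s[i:j], s[j:]

def some_split_pumps(s, p, max_i=3):
    """True if some allowed split keeps x y^i z in B for every i in 0..max_i."""
    return any(all(in_B(x + y * i + z) for i in range(max_i + 1))
               for x, y, z in splits(s, p))

for p in range(1, 7):
    s = "0" * p + "1" * p
    print(p, some_split_pumps(s, p))
```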

Finite Automata with Output
So far, we have studied machines that work as language recognizers. We also mentioned the mathematical model of a computer. As we know, computers often have the more useful function of performing calculations and conveying results. We said that the input string represents the program and the input data.

Output Reading the letters from the string is analogous to executing instructions in that it changes the state of the machine: it changes the contents of memory, the control section of the computer, and so on. We can say that changing the state of the machine produces output, namely the changes to the contents of memory, the control section, etc.

Moore Machines We could consider the output as part of the total state of the machine. Therefore, each state should carry its own output. If we consider that reaching a particular computer state means changing memory in a certain way and printing a specified character, then the output of each state is the character printed when an input letter brings the machine into that state.

Moore Machines (cont.) We shall investigate two different but equivalent models for FAs with output capabilities: one created by G. H. Mealy (1955), and the other created independently by E. F. Moore (1956). The original purpose of the inventors was to design a mathematical model for sequential circuits, which are only one component of the architecture of a whole computer.

Moore machine’s Definition
A Moore machine is a collection of 5 things:
1. A finite set of states q0, q1, q2, ..., where q0 is designated as the start state.
2. An alphabet Σ of letters for forming the input string.
3. An alphabet Γ of possible output characters.
4. A transition table that shows, for each state and each input letter, what state is reached next.
5. An output table that shows what character from Γ is printed by each state as it is entered.
The elements of Σ are called letters and the elements of Γ are called characters. Sometimes we use Γ = {0,1}.

Start state of Moore machine
We shall adopt the policy that a Moore machine always begins by printing the character dictated by the start state. This means that the number of printed characters is always one greater than the number of input letters.

Example Let us consider an example defined first by a table:
Input alphabet: Σ = {a,b}
Output alphabet: Γ = {0,1}
Names of states: q0, q1, q2, q3 (q0 = start state)
Transition table, with columns: Old state, Output by the old state, New state after input a, New state after input b. The rows are -q0, q1, q2, q3. [Table entries not preserved.]

Example [State diagram: states q0/1, q1/0, q2/0, q3/1, with edges labeled a and b.] Test: input abab gives output = 10010.

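Example Suppose we are interested in knowing exactly how many times the substring aab occurs in a long input string. The following Moore machine will count this for us: the number of 1s in its output equals the number of occurrences of aab. [State diagram: states q0/0, q1/0, q2/0, q3/1, with edges labeled a and b.]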

Example A Moore machine can be said to define the language of all input strings whose output ends in a 1. The previous machine with q0 as start state and q3 as accept state accepts all words that end in aab.
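A Moore machine can be simulated with two lookup tables, one for transitions and one for outputs. The sketch below uses the aab-counting machine as its example. The state outputs come from the slide, but the edges are an assumption reconstructed from the described behaviour, and the names TRANSITIONS, OUTPUTS and run_moore are hypothetical.

```python
# Assumed edges for the aab counter: q1 = "seen a", q2 = "seen aa",
# q3 = "just completed aab" (the only state that prints 1).
TRANSITIONS = {
    "q0": {"a": "q1", "b": "q0"},
    "q1": {"a": "q2", "b": "q0"},
    "q2": {"a": "q2", "b": "q3"},
    "q3": {"a": "q1", "b": "q0"},
}
OUTPUTS = {"q0": "0", "q1": "0", "q2": "0", "q3": "1"}


def run_moore(word, start="q0"):
    """Return the string printed by the Moore machine while reading word."""
    state = start
    printed = [OUTPUTS[state]]  # the start state prints before any input
    for letter in word:
        state = TRANSITIONS[state][letter]
        printed.append(OUTPUTS[state])
    return "".join(printed)


output = run_moore("aabaab")
print(output)             # one character longer than the input
print(output.count("1"))  # number of occurrences of aab
```

A word is accepted in the sense above exactly when the last printed character is 1, that is, when the word ends in aab.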

Mealy machines A Mealy machine is like a Moore machine, except that now we do our printing while we are traveling along the edges, not in the states themselves. If there are 2 different edges from qi to qj, one an a-edge and one a b-edge, it is possible that they will have different printing instructions for us.

Definition A Mealy machine is a collection of 4 things:
1. A finite set of states q0, q1, q2, ..., where q0 is designated as the start state.
2. An alphabet of letters Σ = {a b c …} for forming input strings.
3. An alphabet of possible output characters Γ = {x y z …}.

Definition (cont.) 4. A pictorial representation with states represented by small circles and directed edges indicating transitions between states. Each edge is labeled with a compound symbol of the form i/o, where i is an input letter and o is an output character. Every state must have exactly one outgoing edge for each possible input letter.

Example A Mealy machine: [State diagram: states q0, q1, q2, q3, with edges labeled a/0, a/1, b/0, b/1.] Test: aaabb Output = 01110
It is clearly seen that an a-edge and a b-edge from a state need not have the same output. However, each state must have exactly one a-edge and one b-edge, because there are only 2 letters in the alphabet. Notice: the output string of a Mealy machine has exactly as many characters as the input string has letters.

Comma If there are 2 edges going in the same direction between the same pair of states, we can draw only one arrow and represent the choice of label by the usual comma. For example, two edges from q4 to q7 labeled a/x and b/y are equivalent to a single edge labeled a/x, b/y.

Example A Mealy machine which prints out the 1’s complement of an input bit string. This means that we want to produce a bit string that has a 1 wherever the input string has a 0, and a 0 wherever the input has a 1. The machine has a single state q0 with one loop labeled 0/1, 1/0.
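As a minimal sketch, a Mealy machine can be stored as a table that maps each state and input letter to a pair (next state, output character). The names COMPLEMENT and run_mealy are hypothetical; the single-state table encodes the loop labeled 0/1, 1/0.

```python
# state -> input letter -> (next state, output character)
COMPLEMENT = {"q0": {"0": ("q0", "1"), "1": ("q0", "0")}}


def run_mealy(machine, word, start="q0"):
    """Return the string printed along the edges while reading word."""
    state = start
    printed = []
    for letter in word:
        state, out = machine[state][letter]  # print while crossing the edge
        printed.append(out)
    return "".join(printed)


print(run_mealy(COMPLEMENT, "001010"))  # prints 110101
```

Unlike a Moore machine, nothing is printed before the first letter is read, so the output has the same length as the input.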

Example A Mealy machine called the increment machine.
The increment machine assumes that its input is a binary number and prints out the binary number that is one larger. For example, for the input string 11 (decimal) = 1011 (binary), we would like to get the output 12 (decimal) = 1100 (binary). The machine reads the input string backwards, least significant bit first: the input string is 1011, but we feed it in the backwards direction as 1101.

increment machine The machine will have 3 states:
1. the start state;
2. the owe-carry state, which represents the overflow when two bits equal to 1 are added: we print a 0 and we carry a 1;
3. the no-carry state.
[State diagram: states start, no carry, owe carry, with edge labels 0/1, 1/0, and 0/0, 1/1.]
An overflow situation arises when the input is 1111: we then get the output 0000.
Test: 11 (decimal) = 1011 (binary); backwards (input) = 1101; output = 0011; reversed output = 1100 (binary) = 12 (decimal).
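The increment machine can be sketched as a three-state table. The edge assignments below are an assumption read off the edge labels and the described carry behaviour (the diagram itself is not preserved), and the names INCREMENT and increment are hypothetical.

```python
# state -> input bit -> (next state, output bit); edges are assumed.
INCREMENT = {
    "start":     {"0": ("no-carry", "1"), "1": ("owe-carry", "0")},
    "owe-carry": {"0": ("no-carry", "1"), "1": ("owe-carry", "0")},
    "no-carry":  {"0": ("no-carry", "0"), "1": ("no-carry", "1")},
}


def increment(bits):
    """Add one to a binary number written most significant bit first."""
    state = "start"
    printed = []
    for bit in reversed(bits):  # feed the machine the input backwards
        state, out = INCREMENT[state][bit]
        printed.append(out)
    return "".join(reversed(printed))  # reverse the output back


print(increment("1011"))  # prints 1100
print(increment("1111"))  # prints 0000 (overflow)
```

The output has the same number of bits as the input, which is why 1111 overflows to 0000 instead of producing a fifth bit.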

Mealy machine & sequential circuits
There is a connection between Mealy machines and sequential circuits that makes them a very valuable component of computer theory. Once we have an incrementer, we can build a machine that performs the addition of binary numbers. We can then use the 1’s complementing machine to build a subtracting machine, based on the following principle:

Mealy machine & sequential circuits
If a and b are strings of bits, then the subtraction a-b can be performed by (1) adding the 1’s complement of b to a, ignoring any overflow digit, and (2) incrementing the result by 1.

Subtraction example
14 - 5 (decimal) = 1110 - 0101 (binary)
= 1110 + 1’s complement (0101) + 1 (binary)
= 1110 + 1010 + 1 (binary)
= [1]1001 binary = 9 decimal (dropping the [1])
The same trick works in decimal notation if we use 9’s complements: replace each digit d in the second number by the digit (9-d), e.g. … = [1]29, which gives 29 after dropping the [1].
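The subtraction principle can be checked with a few lines of code. This is a sketch under simplifying assumptions: it uses Python integer arithmetic in place of chained Mealy machines, it expects two bit strings of equal length with a ≥ b, and the function name subtract is hypothetical.

```python
def subtract(a, b):
    """Compute a - b for equal-length bit strings, assuming a >= b."""
    width = len(a)
    # 1's complement of b: flip every bit.
    complement = "".join("1" if bit == "0" else "0" for bit in b)
    # Step (1): add the complement to a. Step (2): increment by 1.
    total = int(a, 2) + int(complement, 2) + 1
    # Drop the overflow digit by keeping only the lowest `width` bits.
    return format(total % (1 << width), "0{}b".format(width))


print(subtract("1110", "0101"))  # prints 1001, i.e. 14 - 5 = 9
```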

Moore = Mealy Definition:
“Given the Mealy machine Me and the Moore machine Mo, which prints the automatic start state character x, we will say that these two machines are equivalent if for every input string the output string from Mo is exactly x concatenated with the output from Me.”

THEOREM If Mo is a Moore machine, then there is a Mealy machine Me that is equivalent to it. Proof (constructive algorithm). Consider any particular state qi in Mo. It gives instructions to print a certain character t. Change the label of every edge that enters this state from a, b, c, … to a/t, b/t, c/t, … and erase the t from inside the state qi. For example, a state q0/t with incoming edges a, b, c becomes a state q0 with incoming edges a/t, b/t, c/t. If we repeat this procedure for every state, we turn Mo into Me.
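The construction in this proof is short enough to sketch directly: every edge receives the output character of the state it enters. The example machine below is the aab counter with assumed edges, and all names (MOORE_TRANSITIONS, MOORE_OUTPUTS, moore_to_mealy, run_moore, run_mealy) are hypothetical.

```python
# Example Moore machine (aab counter); the edges are an assumption.
MOORE_TRANSITIONS = {
    "q0": {"a": "q1", "b": "q0"},
    "q1": {"a": "q2", "b": "q0"},
    "q2": {"a": "q2", "b": "q3"},
    "q3": {"a": "q1", "b": "q0"},
}
MOORE_OUTPUTS = {"q0": "0", "q1": "0", "q2": "0", "q3": "1"}


def moore_to_mealy(transitions, outputs):
    """Relabel each edge `letter` as `letter/t`, t being the target's output."""
    return {
        state: {letter: (target, outputs[target])
                for letter, target in edges.items()}
        for state, edges in transitions.items()
    }


def run_moore(transitions, outputs, word, start="q0"):
    state = start
    printed = [outputs[state]]  # automatic start state character
    for letter in word:
        state = transitions[state][letter]
        printed.append(outputs[state])
    return "".join(printed)


def run_mealy(mealy, word, start="q0"):
    state = start
    printed = []
    for letter in word:
        state, out = mealy[state][letter]
        printed.append(out)
    return "".join(printed)


MEALY = moore_to_mealy(MOORE_TRANSITIONS, MOORE_OUTPUTS)
word = "abaabaab"
x = MOORE_OUTPUTS["q0"]
# Equivalence: the Moore output is x followed by the Mealy output.
assert run_moore(MOORE_TRANSITIONS, MOORE_OUTPUTS, word) == x + run_mealy(MEALY, word)
```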

THEOREM For every Mealy machine Me, there is a Moore machine Mo that is equivalent to it. Proof (constructive algorithm). Since the edges entering a particular state can carry different output instructions, we need to split such a state into separate copies, one for each distinct output character on its incoming edges. Each copy prints that character and keeps all of the outgoing edges of the original state.

Example [Diagram: a state q4 with edges labeled a/0, b/0, a/1, b/1 is split into two states, q4^1/0 and q4^2/1, with edges labeled a and b.]

Proof (cont.) If there is a loop edge in Me, it may become two edges in Mo: one edge that is not a loop and one that is a loop.

Example [Diagram: a state q3 with edges labeled a/0 and b/1 is split into two states, q3^1/0 and q3^2/1, with edges labeled a and b.] If we repeat this procedure for every state, it will produce Mo.
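The state-splitting construction can be sketched by naming each Moore state as a pair (Mealy state, character printed on entry). The example machine is the increment machine with assumed edges. The start character "x" and all function names are hypothetical, and only the copies reachable from the start state are built.

```python
# Example Mealy machine (increment machine); the edges are an assumption.
INCREMENT = {
    "start":     {"0": ("no-carry", "1"), "1": ("owe-carry", "0")},
    "owe-carry": {"0": ("no-carry", "1"), "1": ("owe-carry", "0")},
    "no-carry":  {"0": ("no-carry", "0"), "1": ("no-carry", "1")},
}


def mealy_to_moore(mealy, start, start_char):
    """Split each state into one copy per output character that enters it."""
    moore_start = (start, start_char)
    transitions, outputs = {}, {}
    pending = [moore_start]
    while pending:
        state = pending.pop()
        if state in transitions:
            continue
        outputs[state] = state[1]
        # A copy keeps the outgoing edges of the original state. An edge
        # letter/o into q now leads to the copy (q, o), which prints o.
        transitions[state] = dict(mealy[state[0]])
        pending.extend(transitions[state].values())
    return moore_start, transitions, outputs


def run_mealy(mealy, word, start):
    state, printed = start, []
    for letter in word:
        state, out = mealy[state][letter]
        printed.append(out)
    return "".join(printed)


def run_moore(start, transitions, outputs, word):
    state = start
    printed = [outputs[state]]
    for letter in word:
        state = transitions[state][letter]
        printed.append(outputs[state])
    return "".join(printed)


MO = mealy_to_moore(INCREMENT, "start", "x")
# Equivalence: Moore output = start character + Mealy output.
assert run_moore(*MO, "1101") == "x" + run_mealy(INCREMENT, "1101", "start")
```

In this example the no-carry state is entered with both 0 and 1, so it is split into two copies, while the other states keep a single copy each.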

Transducers as models of sequential circuits
We have probably met these machines in courses on computer logic or architecture. They are commonly used to describe the action of sequential circuits that involve flip-flops and other feedback electronic devices, for which the output of the circuit is not only a function of the specific instantaneous inputs but also a function of the previous state of the system. Automata with input and output are sometimes called transducers because of their connection to electronics.

Example [Circuit diagram: an input line, an output line, a NAND gate, a DELAY element, an OR gate, and two feedback points A and B.]
The 4 possible states are:
state  A  B
q0     0  0
q1     0  1
q2     1  0
q3     1  1
The state changes according to the rules:
New B = old A
New A = (input) NAND (old A or old B)
Output = (input) or (old B)

Sequential circuit → Me
We are about to create a transition table that also indicates the output for each state and each input, 0 or 1. If we are in state q0 (old A = 0, old B = 0) and we read 0:
new B = old A = 0
new A = (input) NAND (old A or old B) = 0 NAND 0 = 1
output = (input) or (old B) = 0 or 0 = 0
The new state is A = 1, B = 0, that is q2, with output = 0.

Sequential circuit → Me (cont.)
If we are in state q0 (old A = 0, old B = 0) and we read 1:
new B = old A = 0
new A = (input) NAND (old A or old B) = 1 NAND 0 = 1
output = (input) or (old B) = 1 or 0 = 1
The new state is A = 1, B = 0, that is q2, with output = 1.

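Result If we repeat the same procedure for every state, we get the following Mealy machine, with q0 = (A = 0, B = 0), q1 = (A = 0, B = 1), q2 = (A = 1, B = 0), q3 = (A = 1, B = 1):
            After input 0          After input 1
old state   new state  output      new state  output
q0          q2         0           q2         1
q1          q2         1           q0         1
q2          q3         0           q1         1
q3          q3         1           q1         1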
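The whole table can be derived mechanically from the three circuit rules. In the sketch below, the encoding of state qk as the two bits (A, B) of k is an assumption consistent with q0 = (0, 0) and q2 = (1, 0) in the worked steps; the names nand, step and table are hypothetical.

```python
def nand(x, y):
    """NAND of two bits."""
    return 1 - (x & y)


# Assumption: qk encodes (A, B) as the binary digits of k.
STATES = {"q0": (0, 0), "q1": (0, 1), "q2": (1, 0), "q3": (1, 1)}
NAMES = {bits: name for name, bits in STATES.items()}


def step(state, bit):
    """Apply the circuit rules once; return (new state, output)."""
    old_a, old_b = STATES[state]
    new_b = old_a                     # New B = old A
    new_a = nand(bit, old_a | old_b)  # New A = input NAND (old A or old B)
    output = bit | old_b              # Output = input or old B
    return NAMES[(new_a, new_b)], output


table = {state: {bit: step(state, bit) for bit in (0, 1)} for state in STATES}
for state, row in table.items():
    print(state, row[0], row[1])
```

Each printed row gives, for one old state, the pair (new state, output) after input 0 and after input 1.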
