Theory of computing, part 3

Slides:



Advertisements
Similar presentations
Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising.
Advertisements

4b Lexical analysis Finite Automata
Lecture 24 MAS 714 Hartmut Klauck
Theory Of Automata By Dr. MM Alam
Regular Expressions and DFAs COP 3402 (Summer 2014)
1 1 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 3 School of Innovation, Design and Engineering Mälardalen University 2012.
1 Introduction to Computability Theory Lecture3: Regular Expressions Prof. Amos Israeli.
1 Introduction to Computability Theory Lecture4: Regular Expressions Prof. Amos Israeli.
1 Introduction to Computability Theory Lecture3: Regular Expressions Prof. Amos Israeli.
CS5371 Theory of Computation
1Computer Sciences Department. Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER Reference 3Computer Sciences Department.
Theory of Computation, Feodor F. Dragan, Kent State University 1 Regular expressions: definition An algebraic equivalent to finite automata. We can build.
Theory of computing, part 3. 1Introduction 2Theoretical background Biochemistry/molecular biology 3Theoretical background computer science 4History of.
4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 3 Mälardalen University 2010.
Regular Expressions and Languages A regular expression is a notation to represent languages, i.e. a set of strings, where the set is either finite or contains.
CHAPTER 1 Regular Languages
Theory of computing, part 1. 1Introduction 2Theoretical background Biochemistry/molecular biology 3Theoretical background computer science 4History of.
Lecture # 12. Nondeterministic Finite Automaton (NFA) Definition: An NFA is a TG with a unique start state and a property of having single letter as label.
Chapter 3 Regular Expressions, Nondeterminism, and Kleene’s Theorem Copyright © 2011 The McGraw-Hill Companies, Inc. Permission required for reproduction.
Finite State Machines 1.Finite state machines with output 2.Finite state machines with no output 3.DFA 4.NDFA.
Donghyun (David) Kim Department of Mathematics and Physics North Carolina Central University 1 Chapter 1 Regular Languages Some slides are in courtesy.
Theory of computing, part 4. 1Introduction 2Theoretical background Biochemistry/molecular biology 3Theoretical background computer science 4History of.
Modeling Computation: Finite State Machines without Output
UNIT - I Formal Language and Regular Expressions: Languages Definition regular expressions Regular sets identity rules. Finite Automata: DFA NFA NFA with.
CS 154 Formal Languages and Computability February 11 Class Meeting Department of Computer Science San Jose State University Spring 2016 Instructor: Ron.
CSCI 4325 / 6339 Theory of Computation Zhixiang Chen.
1 Section 11.2 Finite Automata Can a machine(i.e., algorithm) recognize a regular language? Yes! Deterministic Finite Automata A deterministic finite automaton.
1 Chapter Pushdown Automata. 2 Section 12.2 Pushdown Automata A pushdown automaton (PDA) is a finite automaton with a stack that has stack operations.
Lecture 15: Theory of Automata:2014 Finite Automata with Output.
1/29/02CSE460 - MSU1 Nondeterminism-NFA Section 4.1 of Martin Textbook CSE460 – Computability & Formal Language Theory Comp. Science & Engineering Michigan.
Theory of Computation Automata Theory Dr. Ayman Srour.
WELCOME TO A JOURNEY TO CS419 Dr. Hussien Sharaf Dr. Mohammad Nassef Department of Computer Science, Faculty of Computers and Information, Cairo University.
Kleene’s Theorem and NFA
Finite State Machines Dr K R Bond 2009
Lexical analysis Finite Automata
Copyright © Cengage Learning. All rights reserved.
Regular Expressions.
Chapter 2 Finite Automata
Finite Automata a b A simplest computational model
Pushdown Automata.
Pushdown Automata.
Complexity and Computability Theory I
Lecture 9 Theory of AUTOMATA
[Week#03,04] (b) - Finite Automata
Chapter 2 FINITE AUTOMATA.
Jaya Krishna, M.Tech, Assistant Professor
REGULAR LANGUAGES AND REGULAR GRAMMARS
Hierarchy of languages
Non-deterministic Finite Automata (NFA)
Some slides by Elsa L Gunter, NJIT, and by Costas Busch
CS 154, Lecture 3: DFANFA, Regular Expressions.
Chapter Two: Finite Automata
Intro to Data Structures
Non Deterministic Automata
Theory of computing, part 4
NFAs and Transition Graphs
Finite Automata Reading: Chapter 2.
Finite Automata.
Finite Automata.
4b Lexical analysis Finite Automata
CSCI 2670 Introduction to Theory of Computing
4b Lexical analysis Finite Automata
Language Recognition (12.4)
CSC312 Automata Theory Transition Graphs Lecture # 9
Recap lecture 19 NFA corresponding to Closure of FA, Examples, Memory required to recognize a language, Example, Distinguishing one string from another,
CSCI 2670 Introduction to Theory of Computing
NFAs and Transition Graphs
Lecture 5 Scanning.
Presentation transcript:

Theory of computing, part 3

Course outline 1 Introduction 2 Theoretical background Biochemistry/molecular biology 3 Theoretical background computer science 4 History of the field 5 Splicing systems 6 P systems 7 Hairpins 8 Detection techniques 9 Micro technology introduction 10 Microchips and fluidics 11 Self assembly 12 Regulatory networks 13 Molecular motors 14 DNA nanowires 15 Protein computers 16 DNA computing - summery 17 Presentation of essay and discussion

Finite automata

Outline Introduction Deterministic finite automata (DFA’s) Non-deterministic finite automata (NFA’s) NFA’s to DFA’s Simplifying DFA’s Regular expressions  finite automata

A B Automatic one way door Consider the control system for a one-way swinging door: There are two states: Open and Closed It has two inputs, person detected at position A and person detected at position B If the door is closed, it should open only if a person is detected at A but not B Door should close only if no one is detected A B

Control schematic Open Closed A, no B No A or B A and B A, no B

Finite automaton A finite automaton is usually represented like this as a directed graph Two parts of a directed graph: The states (also called nodes or vertices) The edges with arrows which represent the allowed transitions One state is usually picked out as the starting point For so-called ‘accepting automata,’ some states are chosen to be final states

Strings and automata The input data are represented by a string over some alphabet and it determines how the machine progresses from state to state. Beginning in the start state, the characters of the input string cause the machine to change from one state to another. Accepting automata give only yes or no answers, depending on whether they end up in a ‘final state.’ Strings which end in a final state are accepted by the automaton.

Example The labeled graph in the figure above represents a FA over the alphabet Σ = {a, b} with start state 0 and final state 3. Final states are denoted by a double circle.

Deterministic finite automata (DFA’s) The previous graph was an example of a deterministic finite automaton – every node had two edges (a and b) coming out A DFA over a finite alphabet Σ is a finite directed graph with the property that each node emits one labeled edge for each distinct element of Σ.

More formally A DFA accepts a string w in Σ* if there is a path from the start state to some final state such that w is the concatenation of the labels on the edges of the path. Otherwise, the DFA rejects w . The set of all strings accepted by a DFA M is called the language of M and is denoted by L(M)

Example: (a+b)* Construct a DFA to recognize the regular languages represented by the regular expression (a + b)* over alphabet Σ = {a, b}. This is the set {a, b}* of all strings over {a, b}. This can be recognised by

Example: a(a+b)* Find a DFA to recognize the language represented by the regular expression a(a + b)* over the alphabet Σ = {a, b}. This is the set of all strings in Σ* which begin with a. One possible DFA is:

Example: pattern recognition Build a DFA to recognize the regular language represented by the regular expression (a + b)*abb over the alphabet Σ = {a, b}. The language is the set of strings that begin with anything, but must end with the string abb. Effectively, we’re looking for strings which have a particular pattern to them

Solution: (a+b)*abb The diagram below shows a DFA to recognize this language. If in state 1: the last character was a If in state 2 : the last two symbols were ab If in state 3: the last three were abb

State transition function

State transition function We can also represent a DFA by a state transition function, which we'll denote by T, where any state transition of the form is represented by: T(i,a) = j To describe a full DFA we need to know: what states there are, which are the start and final ones, the set of transitions between them.

Regular languages The class of regular languages is exactly the same as the class of languages accepted by DFAs! Kleene (1956) For any regular language, we can find a DFA which recognizes it!

Applications of DFA’s DFA’s are very often used for pattern matching, e.g. searching for words/structures in strings This is used often in UNIX, particularly by the grep command, which searches for combinations of strings and wildcards (*, ?) grep stands for Global (search for) Regular Expressions Parser DFA’s are also used to design and check simple circuits, verifying protocols, etc. They are of use whenever significant memory is not required

Non-deterministic finite automata DFA’s are called deterministic because following any input string, we know exactly which state its in and the path it took to get there For NFA’s, sometimes there is more than one direction we can go with the same input character Non-determinism can occur, because following a particular string, one could be in many possible states, or taken different paths to end at the same state!

NFA’s A non-deterministic finite automaton (NFA) over an alphabet Σ is a finite directed graph with each node having zero or more edges, Each edge is labelled either with a letter from Σ or with . Multiple edges may be emitted from the same node with the same label. Some letters may not have an edge associated with them. Strings following such paths are not recognised.

Non-determinism If an edge is labelled with the empty string , then we can travel the edge without consuming an input letter. Effectively we could be in either state, and so the possible paths could branch. If there are two edges with the same label, we can take either path. NFA’s recognise a string if any one of its many possible states following it is a final state Otherwise, it rejects it.

NFA’s versus DFA’s NFA for a*a : DFA for a*a : Why is the top an NFA while the bottom is a DFA?

Example Draw two NFAs to recognize the language of the regular expression ab + a*a. This NFA has a  edge, which allows us to travel to state 2 without consuming an input letter. The upper path corresponds to ab and the lower one to a*a

An equivalent NFA This NFA also recognizes the same language. Perhaps it's easier to see this by considering the equality ab + a*a = ab + aa*

NFA transition functions Since there may be non-determinism, we'll let the values of this function be sets of states. For example, if there are no edges from state k labelled with a, we'll write T(k, a) =  If there are three edges from state k, all labelled with a, going to states i, j and k, we'll write  T(k, a) = {i, j, k}

Comments on non-determinism All digital computers are deterministic; quantum computers may be another story! The usual mechanism for deterministic computers is to try one particular path and to backtrack to the last decision point if that path proves poor. Parallel computers make non-determinism almost realizable. We can let each process make a random choice at each branch point, thereby exploring many possible trees.

Some facts The class of regular languages is exactly the same as the class of languages accepted by NFAs! Rabin and Scott (1959) Just like for DFA’s! Every NFA has an equivalent DFA which recognises the same language.

From NFA’s to DFA’s We prove the equivalence of NFA’s and DFA’s by showing how, for any NFA, to construct a DFA which recognises the same language Generally the DFA will have more possible states than the NFA. If the NFA has n states, then the DFA could have as many as 2n states! Example: NFA has three states {A}, {B}, {C} the DFA could have eight: {}, {A}, {B}, {C}, {A, B}, {A, C}, {B, C}, {A, B, C} These correspond to the possible states the NFA could be in after any string

DFA construction Begin in the NFA start state, which could be a multiple state if its connected to any by  Determine the set of possible NFA states you could be in after receiving each character. Each set is a new DFA state, and is connected to the start by that character. Repeat for each new DFA state, exploring the possible results for each character until the system is closed DFA final states are any that contain a NFA final state

Example (a+b)*ab C A B b a NFA a,b A A,B A,C b a DFA b a a b The start state is A, but following an a you could be in A or B; following a b you could only be in state A A A,B A,C b a DFA b a a b

Summary Regular expressions represent the regular languages. DFA’s recognize the regular languages. NFA’s also recognize the regular languages.

Finite automata So far, we’ve introduced two kinds of automata: deterministic and non-deterministic. We’ve shown that we can find a DFA to recognise anything language that a given NFA recognises. We’ve asserted that both DFA’s and NFA’s recognise the regular languages, which themselves are represented by regular expressions. We prove this by construction, by showing how any regular expression can be made into a NFA and vice versa.

Regular expressions  finite automata Given a regular expression, we can find an automata which recognises its language. Start the algorithm with a machine that has a start state, a single final state, and an edge labelled with the given regular expression as follows:

Four step algoritm If an edge is labelled with , then erase the edge. Transform any diagram like into the diagram

Four step algoritm 3. Transform any diagram like into the diagram

Four step algoritm 4. Transform any diagram like into the diagram

Example a*+ab Construct a NFA for the regular expression, a* + ab Start with Apply rule 2 a* + ab a* ab

Example a*+ab a Apply rule 4 to a*   ab a Apply rule 3 to ab   a b

Finite automata  regular expressions 1 Create a new start state s, and draw a new edge labelled with  from s to the original start state. 2 Create a new final state f, and draw new edges labelled with  from all the original final states to f

Finite automata  regular expressions 3 For each pair of states i and j that have more than one edge from i to j, replace all the edges from i to j by a single edge labelled with the regular expression formed by the sum of the labels on each of the edges from i to j. 4 Construct a sequence of new machines by eliminating one state at a time until the only states remaining are s and the f.

Eliminating states As each state is eliminated, a new machine is constructed from the previous machine as follows: Let old(i,j) denote the label on edge i,j of the current machine. If no edge exists, label it . Assume that we wish to eliminate state k. For each pair of edges i,k (incoming edge) and k,j (outgoing edge) we create a new edge label new(i, j)

new(i,j) = old(i,j) + old(i, k) old(k, k)* old(k,j) Eliminate state k The label of this new edge is given by: new(i,j) = old(i,j) + old(i, k) old(k, k)* old(k,j) All other edges, not involving state k, remain the same: new(i, j) = old(i, j) After eliminating all states except s and f, we wind up with a two-state machine with the single edge s, f labelled with the desired regular expression new(s, f)

Example Initial DFA Steps 1 and 2 Add start and final states

Example Eliminate state 2 (No path to f) Eliminate state 0 Final regular expression

Finding simpler automata Sometimes our constructions lead to more complicated automata than we need, having more states than are really necessary Next, we look for ways of making DFA’s with a minimum number of states Myhill-Nerode theorem: ‘Every regular expression has a unique* minimum state DFA’ * up to a simple renaming of the states

Finding minimum state DFA Two steps to minimizing DFA: 1 Discover which, if any, pairs of states are indistinguishable. Two states, s and t, are equivalent if for all possible strings w, T(s,w) and T(t,w) are both either final or non-final. 2 Combine all equivalent states into a single state, modifying the transition functions appropriately.

Consider the DFA b a,b a 1 a b a 2 b b a,b a,b a States 1 and 2 are indistinguishable! Starting in either, b* is rejected and anything with a in it is accepted. b a,b a,b a

Part 1, finding indistinguishable pairs Remove all inaccessible states, where no path exists to them from start. Construct a grid of pairs of states. Begin by marking those pairs which are clearly distinguishable, where one is final and the other non-final. Next eliminate all pairs, which on the same input, lead to a distinguishable pair of states. Repeat until you have considered all pairs. The remaining pairs are indistinguishable.

Part 2, construct minimum DFA Construct a new DFA where any pairs of indistinguishable states form a single state in the new DFA. The start state will be the state containing the original start state. The final states will be those which contain original final states. The transitions will be the full set of transitions from the original states (these should all be consistent.)

Example 1 a b b b a,b a 2 4 b a a 3 What are the distinguishable pairs of states? Clearly, {0, 4} {1, 4} {2, 4} {3, 4} are all distinguishable because 4 is final but none of the others are.

Grid of pairs of state 1 2 3 4 ? ? ? ? ? ? x x x x 0 1 2 3 1 2 3 4 x ? ? ? ? ? x x x x 0 1 2 3 We eliminate these as possible indistinguishable pairs. Next consider {0, 1}. With input a, this becomes {3, 4} which is distinguishable, so {0, 1} is as well. Similarly, we can show {0, 2} and {0, 3} are also distinguishable, leading to the modified grid… 1 2 3 4 x x ? x ? ? x x x x 0 1 2 3

Remaining pairs We are left with {1, 2} given a {4, 4} given b {2, 1} These do not lead to pairs we know to be distinguishable, and are therefore indistinguishable!

Construct minimal DFA States 1, 2, and 3 are all indistinguishable, thus the minimal DFA will have three states: {0} {1, 2, 3} {4} Since originally T(0, a) = 3 and T(0, b) = 1, the new transitions are T(0, a) = T(0, b) = {1,2,3} Similarly, T({1,2,3}, a) = 4 and T({1,2,3}, b) = {1,2,3} Finally, as before, T(4, a) = 4 and T(4, b) = 4

Resulting minimal DFA b a,b a,b a 1, 2, 3 4 The resulting DFA is much simpler: b a,b a,b a 1, 2, 3 4 This recognises regular expressions of the form, (a + b) b* a (a + b)* This is the simplest DFA which will recognise this language!

conclusions We now have many equivalent ways of representing regular languages: DFA’s, NFA’s, regular expressions and regular grammars. We can also now simply(?!) move between these various representations. We’ll see next lecture that the automata representation leads to a simple way of recognising some languages which are not regular. We’ll also begin to consider more powerful language types and correspondingly more powerful computing models!

Formal definition M = (Q, Σ, δ, q0, F) Q = states a finite set Σ = alphabet a finite set δ = transition function a total function in Q  Σ  Q q0 = initial/starting state q0  Q F = final states F  Q