Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising.

Slides:



Advertisements
Similar presentations
Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising.
Advertisements

Why empty strings? A bit like zero –you may think you can do without –but it makes definitions & calculations easier Definitions: –An alphabeth is a finite.
Specifying Languages Our aim is to be able to specify languages for use in the computer. The sketch of an FSA is easy for us to understand, but difficult.
Computational Complexity
CS 3240: Languages and Computation
Non-Deterministic Finite Automata
Formal Languages: main findings so far A problem can be formalised as a formal language A formal language can be defined in various ways, e.g.: the language.
Formal Languages Languages: English, Spanish,... PASCAL, C,... Problem: How do we define a language? i.e. what sentences belong to a language? e.g.Large.
Kleene's Theorem We have defined the regular languages, using regular expressions, which are convenient to write down and use. We have also defined the.
Theory of Computing Lecture 23 MAS 714 Hartmut Klauck.
Finite-State Machines with No Output Ying Lu
Finite State Machines Finite state machines with output
4b Lexical analysis Finite Automata
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 2 Mälardalen University 2005.
1 1 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 3 School of Innovation, Design and Engineering Mälardalen University 2012.
Chapter Section Section Summary Set of Strings Finite-State Automata Language Recognition by Finite-State Machines Designing Finite-State.
1 Introduction to Computability Theory Lecture3: Regular Expressions Prof. Amos Israeli.
1 Introduction to Computability Theory Lecture12: Decidable Languages Prof. Amos Israeli.
Finite Automata Great Theoretical Ideas In Computer Science Anupam Gupta Danny Sleator CS Fall 2010 Lecture 20Oct 28, 2010Carnegie Mellon University.
1 Introduction to Computability Theory Lecture4: Regular Expressions Prof. Amos Israeli.
1 Introduction to Computability Theory Lecture3: Regular Expressions Prof. Amos Israeli.
Courtesy Costas Busch - RPI1 Non Deterministic Automata.
61 Nondeterminism and Nodeterministic Automata. 62 The computational machine models that we learned in the class are deterministic in the sense that the.
Validating Streaming XML Documents Luc Segoufin & Victor Vianu Presented by Harel Paz.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Nondeterminism.
Lecture 3 Goals: Formal definition of NFA, acceptance of a string by an NFA, computation tree associated with a string. Algorithm to convert an NFA to.
Normal forms for Context-Free Grammars
CS5371 Theory of Computation Lecture 4: Automata Theory II (DFA = NFA, Regular Language)
Introduction to Finite Automata Adapted from the slides of Stanford CS154.
Theory of Computing Lecture 22 MAS 714 Hartmut Klauck.
Finite State Machines Data Structures and Algorithms for Information Processing 1.
Finite-State Machines with No Output Longin Jan Latecki Temple University Based on Slides by Elsa L Gunter, NJIT, and by Costas Busch Costas Busch.
Finite-State Machines with No Output
REGULAR LANGUAGES.
1 Unit 1: Automata Theory and Formal Languages Readings 1, 2.2, 2.3.
PZ02B Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, PZ02B - Regular grammars Programming Language Design.
Theory of Computation, Feodor F. Dragan, Kent State University 1 Regular expressions: definition An algebraic equivalent to finite automata. We can build.
Lecture 05: Theory of Automata:08 Kleene’s Theorem and NFA.
4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 3 Mälardalen University 2010.
CHAPTER 1 Regular Languages
Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising.
CS 208: Computing Theory Assoc. Prof. Dr. Brahim Hnich Faculty of Computer Sciences Izmir University of Economics.
Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising.
Modeling Computation: Finite State Machines without Output
UNIT - I Formal Language and Regular Expressions: Languages Definition regular expressions Regular sets identity rules. Finite Automata: DFA NFA NFA with.
BİL711 Natural Language Processing1 Regular Expressions & FSAs Any regular expression can be realized as a finite state automaton (FSA) There are two kinds.
CSCI 4325 / 6339 Theory of Computation Zhixiang Chen.
Lecture #4 Thinking of designing an abstract machine acts as finite automata. Advanced Computation Theory.
  An alphabet is any finite set of symbols.  Examples: ASCII, Unicode, {0,1} ( binary alphabet ), {a,b,c}. Alphabets.
Lexical analysis Finite Automata
Non Deterministic Automata
Finite Automata a b A simplest computational model
Pushdown Automata.
Chapter 2 FINITE AUTOMATA.
Solution Prove by induction the following statement:
CSC 4170 Theory of Computation Nondeterminism Section 1.2.
Some slides by Elsa L Gunter, NJIT, and by Costas Busch
Non-Deterministic Finite Automata
Non-Deterministic Finite Automata
Intro to Data Structures
Non Deterministic Automata
Introduction to Finite Automata
Finite Automata.
4b Lexical analysis Finite Automata
Chapter Five: Nondeterministic Finite Automata
4b Lexical analysis Finite Automata
CSC 4170 Theory of Computation Nondeterminism Section 1.2.
What is it? The term "Automata" is derived from the Greek word "αὐτόματα" which means "self-acting". An automaton (Automata in plural) is an abstract self-propelled.
Presentation transcript:

Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising infinite languages? i.e. given a language description and a string, is there an algorithm which will answer yes or no correctly? We will define an abstract machine which takes a candidate string and produces the answer yes or no. The abstract machine will be the specification of the language.

Finite State Automata A finite state automaton is an abstract model of a simple machine (or computer). The machine can be in a finite number of states. It receives symbols as input, and the result of receiving a particular input in a particular state moves the machine to a specified new state. Certain states are finishing states, and if the machine is in one of those states when the input ends, it has ended successfully (or has accepted the input). Example: A a b b b a a a,b

Formal definition of FSAs Well present the general case In practice, well focus on a subset of particularly well behaved FSAs, known as deterministic FSAs (DFSAs)

FSA: Formal Definition A Finite State Automaton (FSA) is a 5-tuple (Q, I, F, T, E) where: Q = states = a finite set; I = initial states = a nonempty subset of Q; F = final states = a subset of Q; T = an alphabet; E = edges = a subset of Q (T + ) Q. FSA = labelled, directed graph =set of nodes (some final/initial) + directed arcs (arrows) between nodes + each arc has a label from the alphabet. Example: formal definition of A 1 Q = {1, 2, 3, 4} I = {1} F = {4} T = {a, b} E = { (1,a,2), (1,b,4), (2,a,3), (2,b,4), (3,a,3), (3,b,3), (4,a,2), (4,b,4) } A1A a b b b a a a,b

What does it mean to accept a string/language? If (x,a,y) is an edge, x is its start state and y is its end state. A path is a sequence of edges such that the end state of one is the start state of the next. path p 1 = (2,b,4), (4,a,2), (2,a,3) A path is successful if the start state of the first edge is an initial state, and the end state of the last is a final state. path p 2 = (1,b,4),(4,a,2),(2,b,4),(4,b,4) The label of a path is the sequence of edge labels. label(p 1 ) = baa.

What does it mean to accept a string/language? A string is accepted by a FSA if it is the label of a successful path. Let A be a FSA. The language accepted by A is the set of strings accepted by A, denoted L(A). babb = label(p 2 ) is accepted by A 1.

A string is accepted by a FSA if it is the label of a successful path. Let A be a FSA. The language accepted by A is the set of strings accepted by A, denoted L(A). babb = label(p 2 ) is accepted by A 1. The language accepted by A 1 is a b b b a a a,b

A string is accepted by a FSA if it is the label of a successful path. Let A be a FSA. The language accepted by A is the set of strings accepted by A, denoted L(A). babb = label(p 2 ) is accepted by A 1. The language accepted by A 1 is the set of strings of a's and b's which end in b, and in which no two a's are adjacent a b b b a a a,b

Some simple examples (assuming determinism) 1.Draw an FSA to accept the set of bitstrings starting with 0 2.Draw an FSA to accept the set of bitstrings ending with 0 3.Draw an FSA to accept the set of bitstrings containing a sequence 00 4.Can you draw an FSA to accept the set of bitstrings that contain an equal number of 0 and 1?

L1. Bitstrings starting with 0 q1 q2 q

L1. Bitstrings starting with 0 q1 q2 q Can you think of a smaller FSA that accepts the same language?

L2. Bitstrings ending with 0 q1 q2 q At home: Can you find a smaller FSA that accepts the same language?

L3. Bitstrings containing 00 q1 q3 0 1 q

L4. Bitstrings with equal numbers of 0 and 1 This cannot be done. FSAs are not powerful enough. Later we shall meet automata that can do it We shall also see: Some problems cannot be solved by any automaton –Course section on computability

Moving beyond DFSAs The above definition of an FSA allows nondeterminism So far, we have not exploited FSAs ability to behave in nondeterministic ways Lets see how an FSA can be nondeterministic

Non-Determinism 1.From one state there could be a number of edges with the same label. have to remember possible branching points as we trace out a path, and investigate all branches. 2.Some of the edges could be labelled with, the empty string how does this affect our algorithm? 3.May be more than one initial state where do we start? a a a b b Not Algorithmic a λ b Read ab (= aλb)

Deterministic FSA A FSA is deterministic if: (i) there are no -labelled edges; (ii) for any pair of state and symbol (q,t), there is at most one edge (q,t, p); and (iii) there is only one initial state. DFSA and NDFSA stand for deterministic and non-deterministic FSA respectively. At home: revise old definition of FSA, to become the definition of a DFSA. All three conditions must hold Note: acceptance (of a string or a language) has been defined in a declarative (i.e., non- procedural) way.

(Nondeterministic) Transition Functions The transition function of A is the function (x, t) = {y : edge (x, t, y) in A} If A = (Q,I,F,T,E), then the transition matrix for A is a matrix with one row for each state in Q, one column for each symbol in T, s.t. the entry in row x and column t is (x,t). Example from our deterministic FSA A 1 : ab 1{2}{4} 2{3}{4} 3{3}{3} 4{2}{4} where x denotes an initial state and x denotes a finish state.

Recognition Algorithm Problem: Given a DFSA, A = (Q,I,F,T,E), and a string w, determine whether w L(A). Note: denote the current state by q, and the current input symbol by t. Since A is deterministic, (q,t) will always be a singleton set or will be undefined. If it is undefined, denote it by ( Q). Algorithm: Add symbol # to end of w. q := initial state t := first symbol of w#. while (t # and q ) begin q := (q,t) t := next symbol of w# end return ((t == #) & (q F))

Equivalence of DFSA's and NDFSA's Theorem: DFSA = NDFSA Let L be a language. L is accepted by a NDFSA iff L is accepted by a DFSA Algorithm: NDFSA -> DFSA Proof omitted here There is an algorithm to create a DFSA equivalent (i.e. accepts same language) to a given NDFSA. The reverse direction is trivial.

Why study NDFSAs? It might appear that NDFSAs are useless Well soon see that they are not –λ transitions will be useful

Minimum Size of FSA's Let A = (Q,I,F,T,E). Definition: For any two strings, x, y in T*, x and y are distinguishable w.r.t. A if there is a string z T* s.t. exactly one of xz and yz are in L(A). z distinguishes x and y w.r.t. A. This means that with x and y as input, A must end in different states - A has to distinguish x and y in order to give the right results for xz and yz. This is used to prove the next result: Theorem: (proof omitted) Let L T*. If there is a set of n elements of T* s.t. any two of its elements are distinguishable w.r.t. A, then any FSA that recognises L must have at least n states.

Applying the theorem (1) L3 (above) = the set of bitstrings containing 00 Distinguishable: all of {11,10,00} {11,10}: 100 is in L, 110 is out, {11,00}: 001 is in L, 111 is out, {10,00}: 001 is in L, 101 is out. {11,10,00} has 3 elements, hence, the DFA for L3 requires at least 3 states

Applying the theorem (2) L4 again: equal numbers of 0 and 1 n=2: {01,001} need 2 states n=3: {01,001,0001} need 3 states n=4: {01,001,0001,00001} need 4 states … For any finite n, theres a set of n elements that are mutually distinguishable the FSA for L4 would need more than finitely many states, (which is not permitted)!

An early taste of the spirit of this course This theorem tells you something about the kind of automaton (in terms of its number of states) thats required given a particular kind of problem (i.e., a particular kind of language) It also tells you that certain languages cannot be accepted by any FSA We now move to a different way in which to formally define a language