Theory of computing, part 1

Von Neumann, Turing machines, finite state machines, NP-complete problems:
- maximum clique
- travelling salesman problem
- graph colouring

Course outline
1 Introduction
2 Theoretical background: biochemistry/molecular biology
3 Theoretical background: computer science
4 History of the field
5 Splicing systems
6 P systems
7 Hairpins
8 Detection techniques
9 Micro technology introduction
10 Microchips and fluidics
11 Self assembly
12 Regulatory networks
13 Molecular motors
14 DNA nanowires
15 Protein computers
16 DNA computing - summary
17 Presentation of essay and discussion

Old computers

Abacus

The difference engine (1832)
Charles Babbage: born December 26, 1791 in Teignmouth, Devonshire, UK; died 1871 in London. Known to some as the Father of Computing for his contributions to the basic design of the computer through his Analytical Engine.

ENIAC (1946)
The ENIAC machine occupied a room 30 x 50 feet. The controls are at the left, and a small part of the output device is seen at the right.

The IBM 360
The IBM 360 was a revolutionary advance in computer system architecture, enabling a family of computers covering a wide range of price and performance.

Books

Introduction

Models of computing
- Finite state machines (automata): pattern recognition; simple circuits (e.g. elevators, sliding doors)
- Automata with stack memory (pushdown automata): parsing computer languages
- Automata with limited tape memory
- Automata with infinite tape memory: called `Turing machines', the most powerful model possible, capable of solving anything that is solvable

Chomsky hierarchy of grammars
- Regular grammars
- Context-free grammars
- Context-sensitive grammars
- Unrestricted grammars

Computers can recognise languages
Computers can be made to recognize, or accept, the strings of a language. There is a correspondence between the power of the computing model and the complexity of the languages it can recognize!
- Finite automata accept only the languages generated by regular grammars.
- Pushdown automata also accept the languages generated by context-free grammars.
- Turing machines can accept the languages of all grammars.

Languages

Sets
- A set is a collection of things called its elements. If x is an element of set S, we can write this: x ∈ S
- A set can be represented by naming all its elements, for example: S = {x, y, z}
- There is no particular order or arrangement of the elements, and it doesn't matter if some appear more than once. These are all the same set: {x, y, z} = {y, x, z} = {y, y, x, z, z}

Combining sets
- A set with no elements is called the empty set, or a null set. It is denoted by ∅ = {}.
- If every element of set A is also an element of set B, then A is called a subset of B: A ⊆ B
- The union of two sets is the set with all elements which appear in either set: C = A ∪ B
- The intersection of two sets is the set with all the elements which appear in both sets: C = A ∩ B

Alphabet and strings
- A string is a sequence of symbols placed next to each other in juxtaposition.
- The symbols which make up a string are taken from a finite set called an alphabet. E.g. {a, b, c} is the alphabet for the string abbacb.
- A string with no elements is called an empty string and is denoted Λ.
- If Σ is an alphabet, the infinite set of all strings made up from Σ is denoted Σ*. E.g., if Σ = {a}, then Σ* = {Λ, a, aa, aaa, …}
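
As an illustration, here is a small Python sketch (the helper name strings_up_to is ours, not from the slides) that enumerates the finite slice of Σ* containing all strings of length at most max_len:

from itertools import product

def strings_up_to(alphabet, max_len):
    # Start with the empty string (Lambda), then add all strings of length 1..max_len.
    result = [""]
    for n in range(1, max_len + 1):
        result.extend("".join(p) for p in product(alphabet, repeat=n))
    return result

print(strings_up_to(["a"], 3))        # ['', 'a', 'aa', 'aaa']  -- a slice of {a}*
print(strings_up_to(["a", "b"], 2))   # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']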

Languages
- A language is a set of strings.
- If Σ is an alphabet, then a language over Σ is a collection of strings whose components come from Σ.
- So Σ* is the biggest possible language over Σ, and every other language over Σ is a subset of Σ*.

Examples of languages
- Four simple examples of languages over an alphabet Σ are the sets ∅, {Λ}, Σ, and Σ*.
- For example, if Σ = {a} then these four simple languages over Σ are ∅, {Λ}, {a}, and {Λ, a, aa, aaa, …}.
- Recall that {Λ} contains just the empty string, while ∅ is the empty set. Σ* is an infinite set.

Example: English
- The alphabet is Σ = {a, b, c, d, e, …, x, y, z}
- The English language is made of strings formed from Σ: e.g. fun, excitement.
- We could define the English language as the set of strings over Σ which appear in the Oxford English Dictionary (but this is clearly not a unique definition).

Concatenation
- The natural operation of concatenation of strings places two strings in juxtaposition.
- For example, the concatenation of the two strings aab and ba is the string aabba.
- We use the name cat to denote this operation: cat(aab, ba) = aabba.

Combining languages
- Languages are sets of strings, so they can be combined by the usual set operations of union, intersection, difference, and complement.
- We can also combine two languages L and M by forming the set of all concatenations of strings in L with strings in M.

Products of languages
- This new language is called the product of L and M and is denoted by L · M.
- A formal definition can be given as follows: L · M = {cat(s, t) | s ∈ L and t ∈ M}
- For example, if L = {ab, ac} and M = {a, bc, abc}, then the product L · M is the language
  L · M = {aba, abbc, ababc, aca, acbc, acabc}
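
A minimal Python sketch of the product operation (the function name is ours); it reproduces the example above:

def language_product(L, M):
    # L . M = { cat(s, t) | s in L and t in M }
    return {s + t for s in L for t in M}

L = {"ab", "ac"}
M = {"a", "bc", "abc"}
print(sorted(language_product(L, M)))
# ['aba', 'ababc', 'abbc', 'aca', 'acabc', 'acbc']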

Properties of products
- The following simple properties hold for any language L:
  L · {Λ} = {Λ} · L = L
  L · ∅ = ∅ · L = ∅
- The product is not commutative. In other words, we can find two languages L and M such that L · M ≠ M · L.
- The product is associative. In other words, if L, M, and N are languages, then L · (M · N) = (L · M) · N

Powers of languages
- If L is a language, then the product L · L is denoted by L^2.
- The language product L^n for every n ∈ {0, 1, 2, …} is defined as follows:
  L^0 = {Λ}
  L^n = L · L^(n-1) if n > 0

Example
For example, if L = {a, bb} then the first few powers of L are
L^0 = {Λ}
L^1 = L = {a, bb}
L^2 = L · L = {aa, abb, bba, bbbb}
L^3 = L · L^2 = {aaa, aabb, abba, abbbb, bbaa, bbabb, bbbba, bbbbbb}
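
The same example can be computed with a small recursive helper (the name is ours) that follows the definition L^0 = {Λ}, L^n = L · L^(n-1):

def language_power(L, n):
    # L^0 = {Lambda}; L^n = L . L^(n-1) for n > 0
    if n == 0:
        return {""}
    return {s + t for s in L for t in language_power(L, n - 1)}

L = {"a", "bb"}
print(sorted(language_power(L, 2)))   # ['aa', 'abb', 'bba', 'bbbb']
print(sorted(language_power(L, 3)))   # the 8 strings listed above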

Closure of a language
- If L is a language over Σ (i.e. L ⊆ Σ*) then the closure of L is the language denoted by L* and defined as follows:
  L* = L^0 ∪ L^1 ∪ L^2 ∪ …
- The positive closure of L is the language denoted by L+ and defined as follows:
  L+ = L^1 ∪ L^2 ∪ L^3 ∪ …

L* vs. L+
- It follows that L* = L+ ∪ {Λ}. But it's not necessarily true that L+ = L* - {Λ}.
- For example, if we let our alphabet be Σ = {a} and our language be L = {Λ, a}, then L+ = L*.
- Can you find a condition on a language L such that L+ = L* - {Λ}?
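
A quick bounded experiment (the infinite unions are approximated up to a fixed power, and the function names are ours) illustrating the example above, where Λ is in L:

def power(L, n):
    # L^n, with L^0 = {""}
    result = {""}
    for _ in range(n):
        result = {s + t for s in L for t in result}
    return result

def positive_closure_up_to(L, max_power):
    # Finite approximation of L+ : the union of L^1, L^2, ..., L^max_power
    out = set()
    for n in range(1, max_power + 1):
        out |= power(L, n)
    return out

L = {"", "a"}                        # Lambda (the empty string) is in L
Lplus = positive_closure_up_to(L, 4)
Lstar = Lplus | {""}                 # L* = L+ together with the empty string
print(Lplus == Lstar)                # True: L+ already contains the empty string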

Closure of an alphabet
The closure of Σ coincides with our definition of Σ* as the set of all strings over Σ. In other words, we have a nice representation of Σ* as follows:
Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ …

Properties of closure
Let L and M be languages over the alphabet Σ. Then:
a) {Λ}* = ∅* = {Λ}
b) L* = L* · L* = (L*)*
c) Λ ∈ L if and only if L+ = L*
d) (L* · M*)* = (L* ∪ M*)* = (L ∪ M)*
e) L · (M · L)* = (L · M)* · L

Grammars

Grammars
- A grammar is a set of rules used to define the structure of the strings in a language.
- If L is a language over an alphabet Σ, then a grammar for L consists of a set of grammar rules of the following form:
  α → β
  where α and β denote strings of symbols taken from Σ and from a set of grammar symbols (non-terminals) that is disjoint from Σ.

Productions
A grammar rule α → β is often called a production, and it can be read in any of several ways as follows:
- replace α by β
- α produces β
- α rewrites to β
- β reduces to α

Where to begin …
- Every grammar has a special grammar symbol called a start symbol, and there must be at least one production with left side consisting of only the start symbol.
- For example, if S is the start symbol for a grammar, then there must be at least one production of the form S → β.

The 4 parts of a grammar
1. An alphabet N of grammar symbols called non-terminals. (Usually upper-case letters.)
2. An alphabet T of symbols called terminals. (Identical to the alphabet of the resulting language.)
3. A specific non-terminal called the start symbol. (Usually S.)
4. A finite set of productions of the form α → β, where α and β are strings over the alphabet N ∪ T.

Example
Let Σ = {a, b, c}. Then a grammar for the language Σ* can be described by the following four productions:
S → Λ
S → aS
S → bS
S → cS
Or in shorthand: S → Λ | aS | bS | cS
(S can be replaced by either Λ, or aS, or bS, or cS.)
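
One possible way to store this grammar in Python, loosely following the four parts listed above, is a small dictionary; nothing about this representation is prescribed by the slides:

grammar = {
    "nonterminals": {"S"},
    "terminals": {"a", "b", "c"},
    "start": "S",
    # S -> Lambda | aS | bS | cS   (the empty Python string stands for Lambda)
    "productions": {"S": ["", "aS", "bS", "cS"]},
}
print(grammar["productions"]["S"])   # ['', 'aS', 'bS', 'cS']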

Sentential forms
- A string made up of grammar symbols and language strings is called a sentential form.
- If xαy is a sentential form and α → β is a production, then the replacement of α by β in xαy is called a derivation step, and we denote it by writing xαy ⇒ xβy.

Sample derivation
Using the grammar S → Λ | aS | bS | cS:
S ⇒ aS
S ⇒ aS ⇒ aaS
S ⇒ aS ⇒ aaS ⇒ aacS ⇒ aacbS
S ⇒ aS ⇒ aaS ⇒ aacS ⇒ aacbS ⇒ aacbΛ = aacb
A shorthand way of showing that a derivation exists: S ⇒* aacb
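
A tiny sketch (the function name is ours) that builds exactly this kind of derivation for any string over {a, b, c} with the grammar S → Λ | aS | bS | cS:

def derive(target, alphabet=("a", "b", "c")):
    # Apply S -> xS once per symbol x of the target, then finish with S -> Lambda.
    steps, prefix = ["S"], ""
    for ch in target:
        assert ch in alphabet, "symbol not in the alphabet"
        prefix += ch
        steps.append(prefix + "S")
    steps.append(prefix)
    return " => ".join(steps)

print(derive("aacb"))   # S => aS => aaS => aacS => aacbS => aacb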

Other shorthand
The following three symbols, with their associated meanings, are used quite often in discussing derivations:
⇒   derives in one step
⇒+  derives in one or more steps
⇒*  derives in zero or more steps

A more complex grammar
S → AB
A → Λ | aA
B → Λ | bB
We can deduce that the grammar non-terminal symbols are S, A, and B, the start symbol is S, and the language alphabet includes a and b.

Another derivation
- Let's consider the string aab.
- The statement S ⇒+ aab means that there exists a derivation of aab that takes one or more steps.
- For example, we have S ⇒ AB ⇒ aAB ⇒ aaAB ⇒ aaB ⇒ aabB ⇒ aab

Grammar specifies the language
If G is a grammar, then the language of G is the set of language strings derived from the start symbol of G. The language of G is denoted by L(G).

Grammar specifies the language
If G is a grammar with start symbol S and set of terminals T, then the language of G is the following set:
L(G) = {s | s ∈ T* and S ⇒+ s}

Finite languages
If the language is finite, then a grammar can consist of all productions of the form S → w for each string w in the language. For example, the language {a, ba} can be described by the grammar S → a | ba.

Infinite languages
- If the language is infinite, then some production or sequence of productions must be used repeatedly to construct the derivations.
- Notice that there is no bound on the length of strings in an infinite language. Therefore there is no bound on the number of derivation steps used to derive the strings.
- If the grammar has n productions, then any derivation consisting of n + 1 steps must use some production twice.

Infinite languages
- For example, the infinite language {a^n b | n ≥ 0} can be described by the grammar S → b | aS.
- To derive the string a^n b, use the production S → aS repeatedly (n times, to be exact) and then stop the derivation by using the production S → b.
- The production S → aS allows us to say: if S derives w, then it also derives aw.

Recursion
- A production is called recursive if its left side occurs on its right side. For example, the production S → aS is recursive.
- A production S → β is indirectly recursive if S derives (in two or more steps) a sentential form that contains S.

Indirect recursion
For example, suppose we have the following grammar:
S → b | aA
A → c | bS
The productions S → aA and A → bS are both indirectly recursive:
S ⇒ aA ⇒ abS
A ⇒ bS ⇒ baA

Recursion
- A grammar is recursive if it contains either a recursive production or an indirectly recursive production.
- A grammar for an infinite language must be recursive!
- However, a given language can have many grammars which produce it.

Combining grammars
- Suppose M and N are languages. We can describe them with grammars that have disjoint sets of non-terminals.
- Assign the start symbols for the grammars of M and N to be A and B, respectively:
  M: A → …
  N: B → …
- Then we have the following rules for creating new languages and grammars.

Combining grammars, union rule
The grammar for the union of the two languages, M ∪ N, starts with the two productions
S → A | B
followed by the grammars of M and N:
A → …
B → …

Combining grammars, product rule
Similarly, the grammar for the language M · N starts with the production
S → AB
followed by, as above,
A → …
B → …

Combining grammars, closure rule
Finally, the grammar for the closure of a language, M*, starts with the production
S → AS | Λ
followed by the grammar of M:
A → …

Example, union
Suppose we want to write a grammar for the following language:
L = {Λ, a, b, aa, bb, …, a^n, b^n, …}
L is the union of the two languages
M = {a^n | n ∈ N}
N = {b^n | n ∈ N}

Example, union
Thus we can write a grammar for L as follows:
S → A | B      (union rule)
A → Λ | aA     (grammar for M)
B → Λ | bB     (grammar for N)
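
The union rule lends itself to a mechanical construction. Here is a rough sketch (the dictionary representation and function name are our own choices) that produces the grammar above:

def union_grammar(gM, gN, startM, startN):
    # Union rule: new start symbol S with S -> A | B, then copy both grammars.
    # Assumes the two grammars use disjoint nonterminal sets, as on the slides.
    combined = {"S": [startM, startN]}
    combined.update(gM)
    combined.update(gN)
    return combined

gM = {"A": ["", "aA"]}   # grammar for M = {a^n}
gN = {"B": ["", "bB"]}   # grammar for N = {b^n}
print(union_grammar(gM, gN, "A", "B"))
# {'S': ['A', 'B'], 'A': ['', 'aA'], 'B': ['', 'bB']}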

Example, product
Suppose we want to write a grammar for the following language:
L = {a^m b^n | m, n ∈ N}
L is the product of the two languages
M = {a^m | m ∈ N}
N = {b^n | n ∈ N}

Example, product
Thus we can write a grammar for L as follows:
S → AB         (product rule)
A → Λ | aA     (grammar for M)
B → Λ | bB     (grammar for N)

Example, closure
Suppose we want to construct the language L of all possible strings made up from zero or more occurrences of aa or bb:
L = {aa, bb}* = M*   where M = {aa, bb}

Example, closure
So we can write a grammar for L as follows:
S → AS | Λ     (closure rule)
A → aa | bb    (grammar for {aa, bb})

An equivalent grammar
We can simplify this grammar:
- Replace the occurrence of A in S → AS by the right side of A → aa to obtain the production S → aaS.
- Replace A in S → AS by the right side of A → bb to obtain the production S → bbS.
This allows us to write the grammar in simplified form as:
S → aaS | bbS | Λ
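
As a check on the simplified grammar, membership in {aa, bb}* can be tested by stripping aa or bb from the front, mirroring the productions S → aaS | bbS | Λ (the helper name is ours):

def in_closure_of_aa_bb(s):
    # Repeatedly peel off a leading "aa" or "bb"; succeed only if nothing is left.
    while s:
        if s[:2] in ("aa", "bb"):
            s = s[2:]
        else:
            return False
    return True

print([w for w in ["", "aa", "bb", "aabb", "ab", "aab"] if in_closure_of_aa_bb(w)])
# ['', 'aa', 'bb', 'aabb']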

Some simple grammars
Language {a, ab, abb, abbb}                 Grammar: S → a | ab | abb | abbb
Language {Λ, a, aa, aaa, …}                 Grammar: S → aS | Λ
Language {b, bbb, bbbbb, …, b^(2n+1), …}    Grammar: S → bbS | b
Language {b, abc, aabcc, …, a^n b c^n, …}   Grammar: S → aSc | b
Language {ac, abc, abbc, …, a b^n c, …}     Grammar: S → aBc, B → bB | Λ

Regular languages

What is a regular language?
There are many possible ways of describing the regular languages:
- Languages that are accepted by some finite automaton
- Languages that are inductively formed by combining very simple languages
- Languages described by a regular expression
- Languages produced by a grammar with a special, very restricted form

Building a regular language
We start with a very simple basis of languages and build more complex ones by combining them in particular ways:
Basis: ∅, {Λ} and {a} are regular languages for all a ∈ Σ.
Induction: If L and M are regular languages, then the following languages are also regular:
L ∪ M, L · M and L*

Sample building blocks
For example, the basis of the definition gives us the following four regular languages over the alphabet Σ = {a, b}:
∅, {Λ}, {a}, {b}

Example 1
Regular languages over {a, b}.
Language {Λ, b}: we can write it as the union of the two regular languages {Λ} and {b}:
{Λ, b} = {Λ} ∪ {b}

Example 2
Language {a, ab}: we can write it as the product of the two regular languages {a} and {Λ, b}:
{a, ab} = {a} · {Λ, b}

Example 3
Language {Λ, b, bb, …, b^n, …}: it's just the closure of the regular language {b}:
{b}* = {Λ, b, bb, …, b^n, …}

Example 4
{a, ab, abb, …, ab^n, …} = {a} · {Λ, b, bb, …, b^n, …} = {a} · {b}*

Example 5
{Λ, a, b, aa, bb, …, a^n, …, b^m, …} = {a}* ∪ {b}*

Regular expressions
- A regular expression is basically a shorthand way of showing how a regular language is built from the basis.
- The symbols are nearly identical to those used to construct the languages, and any given expression has a language closely associated with it.
- For each regular expression E there is a regular language L(E).

Regular expressions versus languages
The symbols of the regular expressions are distinct from those of the languages.
Basis of regular expressions and their languages:
Regular expression ∅ :  L(∅) = ∅
Regular expression Λ :  L(Λ) = {Λ}
Regular expression a :  L(a) = {a}

Operators on regular expressions
There are two binary operations on regular expressions (+ and ·) and one unary operator (*):
R + S :        L(R + S) = L(R) ∪ L(S)
R · S or RS :  L(R · S) = L(R) · L(S)
R* :           L(R*) = L(R)*
These are closely associated with the union, product and closure operations on the corresponding languages.

Building regular expressions
Like the languages they represent, regular expressions can be manipulated inductively to form new regular expressions.
Basis: ∅, Λ and a are regular expressions for all a ∈ Σ.
Induction: If R and S are regular expressions, then the following expressions are also regular:
(R), R + S, R · S and R*

Regular expressions
For example, here are a few of the infinitely many regular expressions over the alphabet Σ = {a, b}:
∅, Λ, a, b, Λ + b, b*, a + (b · a), (a + b) · a, a · b*, a* + b*

Order of operations
- To avoid using too many parentheses, we assume that the operations have the following hierarchy:
  *  highest (do it first)
  ·
  +  lowest (do it last)
- For example, the regular expression a + b · a* can be written in fully parenthesized form as (a + (b · (a*)))

Implicit products
- Use juxtaposition instead of · whenever no confusion arises. For example, we can write the preceding expression as a + ba*.
- This expression is basically shorthand for the regular language {a} ∪ ({b} · ({a}*)).
- So you can see why it is useful to write an expression instead!

Example
Find the language of the regular expression a + bc*:
L(a + bc*) = L(a) ∪ L(bc*)
= L(a) ∪ (L(b) · L(c*))
= L(a) ∪ (L(b) · L(c)*)
= {a} ∪ ({b} · {c}*)
= {a} ∪ ({b} · {Λ, c, c^2, …, c^n, …})
= {a} ∪ {b, bc, bc^2, …, bc^n, …}
= {a, b, bc, bc^2, …, bc^n, …}
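
The same language can be explored with Python's re module, where | plays the role of + and juxtaposition and * work as above; the pattern below is just an illustration of the expression on this slide:

import re

candidates = ["", "a", "b", "bc", "bcc", "ab", "c"]
print([s for s in candidates if re.fullmatch(r"a|bc*", s)])
# ['a', 'b', 'bc', 'bcc']  -- the first few members of L(a + bc*)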

Regular language
Many infinite languages are easily seen to be regular. For example, the language {a, aa, aaa, …, a^n, …} is regular because it can be written as the regular language {a} · {a}*, which is represented by the regular expression aa*.

Regular language
The slightly more complicated language {Λ, a, b, ab, abb, abbb, …, ab^n, …} is also regular because it can be represented by the regular expression Λ + b + ab*.
However, not all infinite languages are regular!

Regular language
- Distinct regular expressions do not always represent distinct languages.
- For example, the regular expressions a + b and b + a are different, but they both represent the same language: L(a + b) = L(b + a) = {a, b}

Regular language
We say that regular expressions R and S are equal if L(R) = L(S), and we denote this equality by writing the following familiar relation: R = S

Regular language
For example, we know that L(a + b) = {a, b} = {b, a} = L(b + a). Therefore we can write a + b = b + a.
We also have the equality (a + b) + (a + b) = a + b.

Properties of regular expressions
Additive (+) properties:
R + T = T + R
R + ∅ = ∅ + R = R
R + R = R
(R + S) + T = R + (S + T)
These follow simply from the properties of the union of sets.

Properties of regular expressions
Product (·) properties:
R · ∅ = ∅ · R = ∅
R · Λ = Λ · R = R
(RS)T = R(ST)
Distributive properties:
R(S + T) = RS + RT
(S + T)R = SR + TR

Closure properties
∅* = Λ* = Λ
R* = R*R* = (R*)* = R + R*
RR* = R*R
R(SR)* = (RS)*R
(R + S)* = (R*S*)* = (R* + S*)* = R*(SR*)*

Example
Show that (Λ + a + b)* = a*(ba*)*:
(Λ + a + b)* = (a + b)*     (+ property)
             = a*(ba*)*     (closure property)
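
The second step, (a + b)* = a*(ba*)*, can be spot-checked on all short strings with a small bounded experiment; the helper name and the length bound are arbitrary choices of ours:

import re
from itertools import product

def language_up_to(pattern, alphabet="ab", max_len=5):
    # All strings over the alphabet of length <= max_len that the regex matches in full.
    words = [""] + ["".join(p) for n in range(1, max_len + 1)
                    for p in product(alphabet, repeat=n)]
    return {w for w in words if re.fullmatch(pattern, w)}

print(language_up_to(r"(a|b)*") == language_up_to(r"a*(ba*)*"))   # True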

Regular grammars

Regular grammar
A regular grammar is one where each production takes one of the following forms:
S → Λ
S → w
S → T
S → wT
where the capital letters are non-terminals and w is a non-empty string of terminals.

Regular grammar
- Only one non-terminal can appear on the right side of a production, and it must appear at the right end of the right side.
- Therefore the productions A → aBc and S → TU are not part of a regular grammar, but the production A → abcA is.
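
To tie the pieces together, here is a rough breadth-first generator (entirely our own sketch, reusing the dictionary representation from earlier) that lists the shortest strings produced by a regular grammar written in the forms described on this slide:

from collections import deque

def generate(productions, start="S", limit=8):
    # Breadth-first search over sentential forms; nonterminals are single uppercase letters.
    out, queue = [], deque([start])
    while queue and len(out) < limit:
        form = queue.popleft()
        i = next((k for k, ch in enumerate(form) if ch.isupper()), None)
        if i is None:                 # no nonterminal left: a terminal string of the language
            if form not in out:
                out.append(form)
            continue
        for alt in productions[form[i]]:
            queue.append(form[:i] + alt + form[i + 1:])
    return out

# An example regular grammar in the allowed forms: S -> Lambda | aS | bT,  T -> b | bT
g = {"S": ["", "aS", "bT"], "T": ["b", "bT"]}
print(generate(g))   # ['', 'a', 'bb', 'aa', 'abb', 'bbb', 'aaa', 'aabb']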