Mathematical Foundations of Computer Science Chapter 3: Regular Languages and Regular Grammars.


Languages  A language (over an alphabet Σ) is any subset of the set of all possible strings over Σ. The set of all possible strings is written as Σ*.
Example: Σ = {a, b, c}
Σ* = {λ, a, b, c, ab, ac, ba, bc, ca, aaa, …}
One language might be the set of strings of length less than or equal to 2:
L = {λ, a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc}
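
The example language above can be enumerated mechanically. The following is a minimal sketch; the function name strings_up_to is an illustrative assumption, not something from the slides, and the empty Python string "" plays the role of λ.

```python
from itertools import product

def strings_up_to(alphabet, max_len):
    """Return every string over `alphabet` of length <= max_len, shortest first."""
    result = []
    for n in range(max_len + 1):
        # product(..., repeat=0) yields one empty tuple, which becomes "" (λ)
        for letters in product(alphabet, repeat=n):
            result.append("".join(letters))
    return result

L = strings_up_to("abc", 2)
print(len(L))  # 13
print(L)
```

The count is 1 + 3 + 9 = 13, which matches the thirteen strings listed for L.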

Regular Languages  A regular language (over an alphabet Σ) is any language for which there exists a finite automaton that recognizes it.

Mathematical Models of Computation  This course studies a variety of mathematical models corresponding to notions of computation.  The finite automaton was our first example.  The finite automaton is an example of an automaton model.  There are other models as well.

Mathematical Models of Computation  Another important model is that of a grammar.  We will shortly look at regular grammars.  But first, a digression:

Regular Expressions  A regular expression is a mathematical model for describing a particular type of language.  Regular expressions are kind of like arithmetic expressions.  The regular expression is defined recursively.

Regular Expressions  Given an alphabet Σ:
∅ (the empty set), λ (the empty string), and each a ∈ Σ are all regular expressions.
If r1 and r2 are regular expressions, then so are r1 + r2, r1 · r2, r1* and (r1).
Note: we usually write r1 · r2 as r1 r2.
These are the only things that are regular expressions.

Regular Expressions  Meaning:
∅ represents the empty language
λ represents the language {λ}
a represents the language {a}
r1 + r2 represents the language L(r1) ∪ L(r2)
r1 · r2 represents L(r1) L(r2)
r1* represents (L(r1))*

Regular Expressions  Example 1: What does a*(a + b) represent?
It represents zero or more a's followed by either an a or a b.
{a, b, aa, ab, aaa, aab, aaaa, aaab, …}
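
The meaning of union, concatenation, and star can be tried out directly on finite sets of strings. This is only a sketch under one stated assumption: because star produces an infinite language, every operation is cut off at a maximum length. The helper names concat and star and the constant MAX_LEN are illustrative, not from the slides.

```python
def concat(A, B, max_len):
    """L(r1) L(r2), keeping only strings of length <= max_len."""
    return {x + y for x in A for y in B if len(x + y) <= max_len}

def star(A, max_len):
    """(L(r1))*, truncated at max_len: keep concatenating until nothing new appears."""
    result = {""}        # zero repetitions gives λ
    frontier = {""}
    while frontier:
        frontier = concat(frontier, A, max_len) - result
        result |= frontier
    return result

MAX_LEN = 4
a, b = {"a"}, {"b"}
# a*(a + b): star, then concatenation with the union a | b
language = concat(star(a, MAX_LEN), a | b, MAX_LEN)
print(sorted(language, key=lambda w: (len(w), w)))
# ['a', 'b', 'aa', 'ab', 'aaa', 'aab', 'aaaa', 'aaab']
```

Up to length 4 this reproduces the strings listed in Example 1.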

Regular Expressions  Example 2: What does (a + b)*(a + bb) represent?
It represents zero or more symbols, each of which can be an a or a b, followed by either a or bb.
{a, bb, aa, abb, ba, bbb, aaa, aabb, aba, abbb, baa, babb, bba, bbbb, …}

Regular Expressions  Example 3: What does (aa)*(bb)*b represent?
All strings over {a, b} that start with an even number of a's which are then followed by an odd number of b's.
It's important to understand the underlying meaning of a regular expression.
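
One way to test a verbal description such as the one in Example 3 is to compare it with the expression on every short string. The sketch below uses Python's re module, whose union operator is | rather than +. The helpers to_python and described are illustrative names, and the translation only handles the operators used in these examples (it does not handle λ or ∅).

```python
import re
from itertools import product

def to_python(regex):
    """Translate slide notation into Python re syntax: '+' (union) becomes '|'."""
    return regex.replace("+", "|").replace(" ", "")

pattern = re.compile(to_python("(aa)*(bb)*b"))

def described(w):
    """An even number of a's, then an odd number of b's, and nothing else."""
    n_a = len(w) - len(w.lstrip("a"))   # length of the leading run of a's
    rest = w[n_a:]
    return n_a % 2 == 0 and set(rest) <= {"b"} and len(rest) % 2 == 1

# Exhaustive comparison over all strings of length 0..7 over {a, b}
for n in range(8):
    for letters in product("ab", repeat=n):
        w = "".join(letters)
        assert bool(pattern.fullmatch(w)) == described(w), w

print(to_python("a*(a + b)"))  # a*(a|b)
```

fullmatch is used rather than search because a regular expression in the formal sense must describe the whole string, not a substring of it.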

Regular Expressions  Example 4: Find a regular expression for strings of 0's and 1's which have at least one pair of consecutive 0's.
Each such string must have a 00 somewhere in it.
It could have any string in front of it and any string after it, as long as it's there!
Any string is represented by (0 + 1)*
Answer: (0 + 1)*00(0 + 1)*

Regular Expressions  Example: Find a regular expression for strings of 0's and 1's which have no pairs of consecutive 0's.
It's a repetition of strings that are either 1's or, if a substring begins with 0, it must be followed by at least one 1.
( *)*
or equivalently, (1 + 01)*
But such strings can't end in a 0.

Regular Expressions  Example: Find a regular expression for strings of 0's and 1's which have no pairs of consecutive 0's.
( *)*
(1 + 01)*
But such strings can't end in a 0.
So we add (0 + λ) to the end to allow for this.
(1 + 01)*(0 + λ)
This is only one of many possible answers.
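
Both answers, (0 + 1)*00(0 + 1)* and (1 + 01)*(0 + λ), can be checked by brute force against the plain substring test. This is a sketch using Python's re module; the variable names are illustrative, and (0 + λ) is written as 0? because Python has no symbol for λ.

```python
import re
from itertools import product

has_00 = re.compile(r"(0|1)*00(0|1)*")   # (0 + 1)*00(0 + 1)*
no_00 = re.compile(r"(1|01)*0?")         # (1 + 01)*(0 + λ)

# Every binary string of length 0..10 must be classified correctly by both.
for n in range(11):
    for bits in product("01", repeat=n):
        w = "".join(bits)
        assert bool(has_00.fullmatch(w)) == ("00" in w), w
        assert bool(no_00.fullmatch(w)) == ("00" not in w), w

print("both expressions agree with the substring test up to length 10")
```

Since each string either contains 00 or does not, the check also confirms that the two expressions describe complementary languages on these strings.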

Regular Expressions  Why are they called regular expressions?  Because, as it turns out, the set of languages they describe is that of the regular languages.  That means that regular expressions are just another model for the same thing as finite automata.

Regular Expressions  Homework: Chapter 3, Section 1, Problems 1-11, 17, 18

Regular Expressions and Regular Languages  As we have said, regular expressions and finite automata are really different ways of expressing the same thing.  Let's see why.  Given a regular expression, how can we build an equivalent finite automaton?  (We won't bother going the other way, although it can be done.)

Regular Expressions and Regular Languages  Clearly there are simple finite automata corresponding to the simple regular expressions:
[Figure: automata for the simple regular expressions, with arcs labeled λ and a]
Note that each of these has an initial state and one accepting state.

Regular Expressions and Regular Languages  On the previous slide, we saw that the simplest regular expressions can be represented by a finite automaton with an initial state (duh!) and one isolated accepting state:

Regular Expressions and Regular Languages  We can build more complex automata for more complex regular expressions using this model:

Regular Expressions and Regular Languages  Here's how we build an nfa for r1 + r2:
[Figure: nfa for r1 + r2, with λ-transitions into and out of the automata for r1 and r2]

Regular Expressions and Regular Languages  Here's how we build an nfa for r1 r2:
[Figure: nfa for r1 r2, with λ-transitions linking the automaton for r1 to the automaton for r2]

Regular Expressions and Regular Languages  Here's how we build an nfa for (r1)*:
[Figure: nfa for (r1)*, with λ-transitions around the automaton for r1]
Note: the last state added is not in the book. For safety, I add it so that only one arc goes into the final state.

Building an nfa from a regular expression  Example: Consider the regular expression (a + bb)(a + b)*(bb)
[Figure: nfa for (a + bb)(a + b)*(bb), with arcs labeled a, b, and λ]
Note: sometimes we just get tired and take an obvious shortcut.
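
The constructions for union, concatenation, and star can be sketched in code and run on the example expression (a + bb)(a + b)*(bb). This is an illustrative sketch, not the exact automaton drawn on the slides: the tuple encoding of expressions, the class NFA, and the functions build, closure, and accepts are all assumptions made here. Each fragment gets one fresh start state and one fresh accepting state, joined to its parts by λ-transitions (represented by the label None).

```python
from collections import defaultdict
from itertools import count

class NFA:
    def __init__(self):
        self.delta = defaultdict(set)   # (state, symbol or None) -> set of states
        self._ids = count()

    def new_state(self):
        return next(self._ids)

    def add(self, src, label, dst):
        self.delta[(src, label)].add(dst)

def build(nfa, r):
    """Return (start, accept) of an nfa fragment for the expression tree r."""
    start, accept = nfa.new_state(), nfa.new_state()
    kind = r[0]
    if kind == "empty":                 # ∅: no path from start to accept
        pass
    elif kind == "lambda":              # λ
        nfa.add(start, None, accept)
    elif kind == "sym":                 # a single symbol
        nfa.add(start, r[1], accept)
    elif kind == "union":               # r1 + r2: branch into either part
        for sub in r[1:]:
            s, f = build(nfa, sub)
            nfa.add(start, None, s)
            nfa.add(f, None, accept)
    elif kind == "concat":              # r1 r2: run r1, then r2
        s1, f1 = build(nfa, r[1])
        s2, f2 = build(nfa, r[2])
        nfa.add(start, None, s1)
        nfa.add(f1, None, s2)
        nfa.add(f2, None, accept)
    elif kind == "star":                # (r1)*: skip it, or loop back and repeat
        s, f = build(nfa, r[1])
        nfa.add(start, None, s)
        nfa.add(f, None, accept)
        nfa.add(start, None, accept)
        nfa.add(accept, None, start)
    else:
        raise ValueError(f"unknown node {kind!r}")
    return start, accept

def closure(nfa, states):
    """All states reachable from `states` using only λ-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        for nxt in nfa.delta.get((stack.pop(), None), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def accepts(nfa, start, accept, w):
    current = closure(nfa, {start})
    for ch in w:
        moved = set()
        for q in current:
            moved |= nfa.delta.get((q, ch), set())
        current = closure(nfa, moved)
    return accept in current

def sym(c):
    return ("sym", c)

bb = ("concat", sym("b"), sym("b"))
regex = ("concat",
         ("concat", ("union", sym("a"), bb), ("star", ("union", sym("a"), sym("b")))),
         bb)
nfa = NFA()
start, accept = build(nfa, regex)
print(accepts(nfa, start, accept, "abb"))   # True
print(accepts(nfa, start, accept, "bb"))    # False
```

This version adds more λ-transitions and states than a hand-drawn automaton would, which is the price of building every fragment the same way.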

Building regular expression from a finite automaton  The book goes on to show that it works the other way around as well: we can find a corresponding regular expression for any finite automaton.  It's fairly easy in some cases and you can "just do it."  However, it's generally complicated and not worth the bother studying.  You are not responsible for this material

Building regular expression from a finite automaton  [Figure: finite automaton with arcs labeled a; a, b; and c]
The above automaton clearly corresponds to a*(a + b)c*

Regular Expressions and nfa's  Homework: Chapter 3, Section 2, Problems 1-5

Regular Grammars  Review: A grammar is a quadruple G = (V, T, S, P) where
V is a finite set of variables
T is a finite set of symbols, called terminals
S is in V and is called the start symbol
P is a finite set of productions, which are rules of the form α → β, where α and β are strings consisting of terminals and variables.

Regular Grammars  A grammar is said to be right-linear if every production in P is of the form
A → xB or
A → x
where A and B are variables (perhaps the same, perhaps the start symbol S) in V and x is any string of terminal symbols (including the empty string λ).

Regular Grammars  An alternate (and better) definition of a right-linear grammar says that every production in P is of the form
A → aB or
A → a or
S → λ (to allow λ to be in the language)
where A and B are variables (perhaps the same, but B can't be S) in V and a is any terminal symbol.

Regular Grammars  The reason I prefer the second definition (although I accept the first one, which happens to be used in the book) is:
It's easier to work with in proving things.
It's the much more common definition.

Regular Grammars  A grammar is said to be left-linear if every production in P is of the form
A → Bx or
A → x
where A and B are variables (perhaps the same, perhaps the start symbol S) in V and x is any string of terminal symbols (including the empty string λ).

Regular Grammars  The alternate definition of a left-linear grammar says that every production in P is of the form
A → Ba or
A → a or
S → λ
where A and B are variables (perhaps the same, but B can't be S) in V and a is any terminal symbol.

Regular Grammars  Any left-linear or right-linear grammar is called a regular grammar.

Regular Grammars  For brevity, we often write a set of productions such as
A → x1
A → x2
A → x3
as
A → x1 | x2 | x3

Regular Grammars  A derivation in grammar G is any sequence of strings over V and T, connected with ⇒, starting with S and ending with a string containing no variables, where each subsequent string is obtained by applying a production in P.
S ⇒ x1 ⇒ x2 ⇒ x3 ⇒ ... ⇒ xn, abbreviated as S ⇒* xn

Regular Grammars  S ⇒ x1 ⇒ x2 ⇒ x3 ⇒ ... ⇒ xn, abbreviated as S ⇒* xn
We say that xn is a sentence of the language generated by G, L(G).
We say that the other x's are sentential forms.

Regular Grammars  L(G) = {w | w ∈ T* and S ⇒* w}
We call L(G) the language generated by G.
L(G) is the set of all sentences over grammar G.

Example 1  S → abS | a is an example of a right-linear grammar.
Can you figure out what language it generates?
L = {w ∈ {a,b}* | w consists of alternating a's and b's, and both begins and ends with an a}
= L((ab)*a)
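
The language of Example 1 can be explored by generating derivations mechanically. The sketch below does a breadth-first search over sentential forms, always rewriting the leftmost variable. It assumes variables are single uppercase letters, terminals are lowercase letters, λ is written as the empty string, and the grammar is linear (at most one variable per sentential form), which is what justifies the length cut-off. The function name generate is illustrative.

```python
from collections import deque

def generate(grammar, start, max_len):
    """Return every sentence of length <= max_len derivable from `start`."""
    sentences, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # position of the leftmost variable, or None if the form is a sentence
        i = next((k for k, ch in enumerate(form) if ch.isupper()), None)
        if i is None:
            sentences.add(form)
            continue
        for rhs in grammar[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            terminals = sum(ch.islower() for ch in new)
            # prune forms that can no longer yield a short enough sentence
            if terminals <= max_len and len(new) <= max_len + 1 and new not in seen:
                seen.add(new)
                queue.append(new)
    return sentences

example_1 = {"S": ["abS", "a"]}
print(sorted(generate(example_1, "S", 7), key=len))
# ['a', 'aba', 'ababa', 'abababa']
```

Every generated sentence has the form (ab)*a, in line with the regular expression given for this grammar.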

Example 2  S → Aab
A → Aab | Ba
B → a
is an example of a left-linear grammar.
Can you figure out what language it generates?
L = {w ∈ {a,b}* | w is aa followed by at least one set of alternating ab's}
= L(aaab(ab)*)

Example 3  Consider the grammar
S → A
A → aB | λ
B → Ab
This grammar is NOT regular.
No "mixing and matching" left- and right-linear productions. (This grammar generates {aⁿbⁿ | n ≥ 0}, which is not a regular language.)
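
The "no mixing" rule can be checked mechanically by looking at the shape of each right-hand side. This is a sketch based on the first (string-based) definitions of right-linear and left-linear, assuming single uppercase letters for variables, lowercase letters for terminals, and the empty string for λ. The function names shape and classify are illustrative. A production such as S → A fits both forms (with x = λ), so it is tracked separately as "unit".

```python
def shape(rhs):
    """Classify one right-hand side: 'terminal', 'unit', 'right', 'left' or 'other'."""
    positions = [i for i, ch in enumerate(rhs) if ch.isupper()]
    if not positions:
        return "terminal"                 # A -> x, including A -> λ
    if len(positions) > 1:
        return "other"                    # more than one variable: not linear
    if len(rhs) == 1:
        return "unit"                     # A -> B fits both forms with x = λ
    if positions[0] == len(rhs) - 1:
        return "right"                    # A -> xB
    if positions[0] == 0:
        return "left"                     # A -> Bx
    return "other"                        # variable in the middle

def classify(grammar):
    shapes = {shape(rhs) for right_sides in grammar.values() for rhs in right_sides}
    if "other" in shapes or {"left", "right"} <= shapes:
        return "not regular"
    if "right" in shapes:
        return "right-linear"
    if "left" in shapes:
        return "left-linear"
    return "right-linear and left-linear"

example_1 = {"S": ["abS", "a"]}
example_3 = {"S": ["A"], "A": ["aB", ""], "B": ["Ab"]}
print(classify(example_1))  # right-linear
print(classify(example_3))  # not regular
```

In Example 3 every production is linear on its own; the grammar fails only because A → aB is right-linear while B → Ab is left-linear.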

Regular Grammars and nfa's  It's not hard to show that regular grammars generate and nfa's accept the same class of languages: the regular languages!
It's a long proof, where we must show that
any finite automaton has a corresponding left- or right-linear grammar,
and any regular grammar has a corresponding nfa.
We won't bother with the details.

Regular Grammars and nfa's  We get a feel for this by example.
Let S → aA
A → abS | b
[Figure: nfa with states S and A and arcs labeled a, b, b, a]
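
The step from this grammar to an nfa can be sketched in code. The idea assumed here is the usual one for right-linear grammars: each variable becomes a state, a production A → xB becomes a chain of arcs spelling x from A to B, and a production A → x becomes a chain ending in a single accepting state. The names grammar_to_nfa, closure, accepts, and FINAL are illustrative. The expression (aab)*ab used for comparison is not from the slides; it was worked out by hand from the two productions.

```python
import re
from itertools import count, product

FINAL = "final"   # hypothetical name for the single accepting state

def grammar_to_nfa(grammar):
    """Right-linear grammar -> transitions {(state, symbol or None): set of states}."""
    delta = {}
    fresh = count()
    for var, right_sides in grammar.items():
        for rhs in right_sides:
            ends_in_var = bool(rhs) and rhs[-1].isupper()
            target = rhs[-1] if ends_in_var else FINAL
            x = rhs[:-1] if ends_in_var else rhs
            if not x:                      # A -> B or A -> λ becomes a λ-move
                delta.setdefault((var, None), set()).add(target)
                continue
            state = var
            for k, ch in enumerate(x):     # intermediate states spell out x
                nxt = target if k == len(x) - 1 else ("q", next(fresh))
                delta.setdefault((state, ch), set()).add(nxt)
                state = nxt
    return delta

def closure(delta, states):
    """All states reachable using only λ-moves."""
    stack, seen = list(states), set(states)
    while stack:
        for nxt in delta.get((stack.pop(), None), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def accepts(delta, start, w):
    current = closure(delta, {start})
    for ch in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, ch), set())
        current = closure(delta, moved)
    return FINAL in current

grammar = {"S": ["aA"], "A": ["abS", "b"]}
delta = grammar_to_nfa(grammar)
expected = re.compile(r"(aab)*ab")
for n in range(9):
    for letters in product("ab", repeat=n):
        w = "".join(letters)
        assert accepts(delta, "S", w) == bool(expected.fullmatch(w)), w
print(accepts(delta, "S", "aabab"))  # True
```

For this grammar the construction yields states S and A, one intermediate state for the middle of "ab", and the accepting state.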

Regular Grammars and Regular Expressions  Example: L(aab*a)
We can easily construct a regular grammar for this expression:
S → aA
A → aB
B → bB
B → a

Regular Languages  [Figure: regular expressions, regular grammars, and finite automata all describe the regular languages]

Regular Languages  Homework: Chapter 3, Section 3, Problems