1 Regular Languages, Regular Operations September 11, 2001.


1 Regular Languages, Regular Operations September 11, 2001

2 Agenda Today: Regular languages: finite languages are regular. Regular operations on languages: Union (∪), Concatenation (∘), Kleene star (*). For next time: Read 1.3 and handout on minimization. Thursday, 9/20 (revised): HW1 collected.

3 Definition of Regular Language Recall the definition of a regular language: DEF: The language accepted by an FA M is the set of all strings which are accepted by M and is denoted by L(M). A language is regular if it is L(M) for some FA M. We would like to understand what types of languages are regular. Languages of this type are amenable to super-fast recognition of their elements. It would be nice to know, for example, which of the following are regular:

4 Language Examples Unary prime numbers: { 11, 111, 11111, 1111111, 11111111111, … } = { 1^2, 1^3, 1^5, 1^7, 1^11, 1^13, … } = { 1^p | p is a prime number }. Unary squares: { ε, 1, 1^4, 1^9, 1^16, 1^25, 1^36, … } = { 1^n | n is a perfect square }. Palindromic bit strings: { ε, 0, 1, 00, 11, 000, 010, 101, 111, … } = { x ∈ {0,1}* | x = x^R }. We will explore whether or not these are regular in the future.
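
The three example languages can be made concrete as membership tests. The sketch below is only an illustration: the function names are hypothetical, and the tests use ordinary Python arithmetic rather than finite automata, so they say nothing about whether the languages are regular.

```python
import math

def is_unary_prime(s):
    """True if s is a string of 1's whose length is a prime number."""
    n = len(s)
    if set(s) - {"1"} or n < 2:
        return False
    # Trial division up to the integer square root of n.
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def is_unary_square(s):
    """True if s is a string of 1's whose length is a perfect square (0 included)."""
    n = len(s)
    return not (set(s) - {"1"}) and math.isqrt(n) ** 2 == n

def is_palindromic_bit_string(s):
    """True if s is over {0,1} and equals its own reversal."""
    return set(s) <= {"0", "1"} and s == s[::-1]
```

Each function decides membership of one finite string; the languages themselves are the infinite sets of strings for which the function returns True.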

5 Finite Languages All the previous examples had the following property in common: infinite cardinality NOTE: The strings which made up the language were finite (as they always will be in this course); however, the collection of such strings was infinite. Before looking at infinite languages, should definitely look at finite languages.

6 Languages of Cardinality 1 Q: Is the singleton language containing one string regular? For example, is { banana } regular?

7 Languages of Cardinality 1 A: Yes. Q: What’s wrong with this example?

8 Languages of Cardinality 1 A: Nothing, really. This is an example of a nondeterministic FA. This turns out to be the most concise way to encapsulate the language { banana }. But we will deal with nondeterminism in coming lectures. So: Q: Is there a way of fixing this and making it deterministic?

9 Languages of Cardinality 1 A: Yes, just add a fail state q7; i.e., put in a state that sucks in all strings different from “banana” for all eternity, unless they happen to be the “banana” prefixes { ε, b, ba, ban, bana, banan }.
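
The fail-state fix can be sketched as a small simulation. This is a minimal sketch, assuming the states are numbered 0 through 6 along the word and 7 for the fail state; the helper names are hypothetical.

```python
def make_single_word_dfa(word="banana"):
    """Return (delta, start, accepting) for a DFA accepting exactly {word}."""
    fail = len(word) + 1  # state 7 when word == "banana"

    def delta(state, symbol):
        # Advance along the word only when the expected letter arrives;
        # every other move, including any move out of the accept state
        # or the fail state, lands in the fail state.
        if state < len(word) and symbol == word[state]:
            return state + 1
        return fail

    return delta, 0, {len(word)}

def accepts(dfa, s):
    """Run the DFA on s and report whether it ends in an accept state."""
    delta, state, accepting = dfa
    for ch in s:
        state = delta(state, ch)
    return state in accepting
```

Proper prefixes such as "ban" stop in a non-accepting state on the way to the accept state, while anything else, including "bananas", ends in the fail state and never leaves it.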

10 ABCEZ Goes Bananas Show how ABCEZ works on this example.

11 Two Strings Q: How about two strings? For example { banana, nab } ?

12 Two Strings A: Just add another route:

13 Arbitrary Finite Number of Strings Q1: How about more? For example { banana, nab, ban, babba } ? Q2: Or less (the empty set): Ø = {} ?

14 Arbitrary Finite Number of Strings A1:

15 Arbitrary Finite Number of Strings: Empty Language A2: Build a 1-state automaton whose accept states set F is empty!

16 Arbitrary Finite Number of Strings THM: All finite languages are regular. Proof: Can always construct a tree whose leaves are word endings. In our example the tree is: [figure: tree with edges labeled b, a, n]. Now make word endings into accept states, add a fail sink-state and add links to the fail state to finish the construction.
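
The tree construction in the proof can be sketched as follows. This is one possible rendering, not the slides' own code: the names `build_tree_dfa`, `accepts`, and `FAIL` are hypothetical, and the fail sink-state is represented implicitly by any missing transition.

```python
FAIL = -1  # stands for the fail sink-state

def build_tree_dfa(words):
    """Build a tree-shaped transition table for a finite set of words.

    State 0 is the root (start state). Returns (transitions, accepting),
    where transitions maps (state, symbol) to the next state.
    """
    transitions = {}
    accepting = set()
    next_state = 1
    for w in words:
        state = 0
        for ch in w:
            if (state, ch) not in transitions:
                transitions[(state, ch)] = next_state
                next_state += 1
            state = transitions[(state, ch)]
        accepting.add(state)  # word endings become accept states
    return transitions, accepting

def accepts(transitions, accepting, s):
    """Follow the tree; any missing edge leads to the fail state for good."""
    state = 0
    for ch in s:
        state = transitions.get((state, ch), FAIL)
        if state == FAIL:
            return False
    return state in accepting
```

For { banana, nab, ban, babba } this yields a root plus 12 further states, since the words share the prefixes "ba" and "ban". For the empty language the accepting set is empty, so every string is rejected.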

17 Infinite Cardinality Q: Are all regular languages finite?

18 Infinite Cardinality A: No! Many infinite languages are regular. Common Mistake 1: The strings of regular languages are finite, therefore the regular languages must be finite. Common Mistake 2: Regular languages are, by definition, accepted by finite automata, therefore regular languages are finite. Q: Give an example of an infinite but regular language.

19 Infinite Cardinality bit strings with an even number of b’s Simplest example is    many, many more Home exercise: think of a criterion for non-finiteness
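
The even-number-of-b's example can be sketched as a two-state automaton. The alphabet {a, b} is an assumption here, since the slide's figure is not available, and the function name is hypothetical.

```python
def even_number_of_bs(s):
    """Simulate a 2-state DFA over {a, b}: state 0 = even b's, state 1 = odd b's."""
    state = 0
    for ch in s:
        if ch not in "ab":
            return False  # symbol outside the assumed alphabet
        if ch == "b":
            state = 1 - state  # reading b flips the parity; reading a keeps the state
    return state == 0  # state 0 is both the start state and the only accept state
```

The machine has only two states, yet it accepts infinitely many strings, for example "a" repeated any number of times.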

20 Regular Operations You may have come across the regular operations when doing advanced searches utilizing programs such as emacs, egrep, perl, python, etc. There are three basic operations we will work with: 1. Union 2. Concatenation 3. Kleene-star And a fourth definable in terms of the previous: 4. Kleene-plus

21 Regular Operations – Summarizing Table
Operation       Symbol   UNIX version       Meaning
Union           ∪        |                  match one of the patterns
Concatenation   ∘        implicit in UNIX   match patterns in sequence
Kleene-star     *        *                  match pattern 0 or more times
Kleene-plus     +        +                  match pattern 1 or more times

22 Regular operations - Union UNIX: to search for all lines containing vowels in a text one could use the command egrep -i 'a|e|i|o|u' Here the pattern “vowel” is matched by any line containing one of a, e, i, o or u. Q: What is a string pattern?

23 String Patterns A: A good way to define a pattern is as a set of strings, i.e. a language. The language for a given pattern is the set of all strings satisfying the predicate of the pattern. EG: vowel-pattern = { the set of strings which contain at least one of: a e i o u }

24 UNIX patterns vs. Computability patterns In UNIX, a pattern is implicitly assumed to occur as a substring of the matched strings. In our course, however, a pattern needs to specify the whole string, and not just a substring.
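
The difference between the two conventions can be illustrated with Python's standard `re` module, where `search` looks for the pattern anywhere in the string and `fullmatch` requires the pattern to cover the whole string. The function names below are hypothetical, and Python's regex dialect is only a stand-in for egrep's.

```python
import re

# The vowel pattern, case-insensitive like egrep -i.
VOWEL = re.compile(r"a|e|i|o|u", re.IGNORECASE)

def matches_unix_style(line):
    """Substring semantics: the pattern may occur anywhere in the line."""
    return VOWEL.search(line) is not None

def matches_whole_string(s):
    """Course semantics: the pattern must describe the entire string."""
    return VOWEL.fullmatch(s) is not None
```

Under substring semantics "banana" matches because it contains a vowel; under whole-string semantics only the five one-letter vowel strings (in either case) match.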

25 Regular operations - Union Computability: union is exactly what we expect. If you have patterns A = {aardvark}, B = {bobcat}, C = {chimpanzee}, union the patterns together to get A ∪ B ∪ C = {aardvark, bobcat, chimpanzee}

26 Regular operations - Concatenation UNIX: to search for all consecutive double occurrences of vowels, use: egrep -i '(a|e|i|o|u)(a|e|i|o|u)' Here the pattern “vowel” has been repeated. Parentheses have been introduced to specify where exactly in the pattern the concatenation is occurring.

27 Regular operations - Concatenation Computability. Consider the previous result: L = {aardvark, bobcat, chimpanzee} Q: What language results when we concatenate L with itself, obtaining L ∘ L?

28 Regular operations - Concatenation A: L ∘ L = {aardvark, bobcat, chimpanzee} ∘ {aardvark, bobcat, chimpanzee} = {aardvarkaardvark, aardvarkbobcat, aardvarkchimpanzee, bobcataardvark, bobcatbobcat, bobcatchimpanzee, chimpanzeeaardvark, chimpanzeebobcat, chimpanzeechimpanzee} Q1: What is L ∘ {ε}? Q2: What is L ∘ Ø?

29 Algebra of Languages A1: L ∘ {ε} = L. In general, {ε} is the identity in the “algebra” of languages. I.e., if we think of concatenation as being like multiplication, {ε} acts like the number 1. A2: L ∘ Ø = Ø. Opposite to {ε}, Ø acts like the number zero, obliterating everything it is concatenated with. Note: We can carry on the analogy between numbers and languages. Addition becomes union, multiplication becomes concatenation. This forms a so-called “algebra”.
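
For finite languages these identities can be checked directly with Python sets. This is a small sketch; `concat`, `EPSILON_LANG`, and `EMPTY_LANG` are illustrative names, with the empty string written as `""`.

```python
def concat(X, Y):
    """Concatenation of two languages: every x in X followed by every y in Y."""
    return {x + y for x in X for y in Y}

L = {"aardvark", "bobcat", "chimpanzee"}
EPSILON_LANG = {""}   # the language containing only the empty string
EMPTY_LANG = set()    # the empty language, containing no strings at all

identity_holds = concat(L, EPSILON_LANG) == L and concat(EPSILON_LANG, L) == L
zero_holds = concat(L, EMPTY_LANG) == EMPTY_LANG and concat(EMPTY_LANG, L) == EMPTY_LANG
```

Note that the language containing only the empty string has one element, while the empty language has none. That difference is exactly why one behaves like the number 1 and the other like 0. The same helper gives the nine strings of L concatenated with itself.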

30 Regular operations – Kleene-* UNIX: search for lines consisting purely of vowels (including the empty line): egrep -i '^(a|e|i|o|u)*$' NOTE: ^ and $ are special symbols in UNIX regular expressions which respectively anchor the pattern at the beginning and end of a line. The trick above can be used to convert any Computability regular expression into an equivalent UNIX form.
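
The anchoring trick can be tried out with Python's standard `re` module as a stand-in for egrep. This is only a sketch: Python's dialect differs from egrep's in details (for instance, `$` in Python also matches just before a trailing newline), and the function name is hypothetical.

```python
import re

# Substring search with an anchored pattern, mirroring egrep -i '^(a|e|i|o|u)*$'.
ANCHORED = re.compile(r"^(a|e|i|o|u)*$", re.IGNORECASE)

def purely_vowels(line):
    """True if the line consists only of vowels (the empty line included)."""
    return ANCHORED.search(line) is not None
```

Although `search` uses substring semantics, the anchors force the match to span the whole line, so the result agrees with whole-string matching of `(a|e|i|o|u)*`.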

31 Regular operations – Kleene-* Computability: Suppose we have a language B = { ba, na } Q: What is the language B * ?

32 Regular operations – Kleene-* A: B* = { ba, na }* = { ε, ba, na, baba, bana, naba, nana, bababa, babana, banaba, banana, nababa, nabana, nanaba, nanana, babababa, bababana, … }

33 Regular operations – Kleene-+ Kleene-+ is just like Kleene-* except that the pattern is forced to occur at least once. UNIX: search for lines consisting purely of vowels (not including the empty line): egrep -i '^(a|e|i|o|u)+$' Computability: B+ = { ba, na }+ = { ba, na, baba, bana, naba, nana, bababa, babana, banaba, banana, nababa, nabana, nanaba, nanana, babababa, bababana, … }
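
Since the star and plus of { ba, na } are infinite, a program can only list them up to a length bound. The sketch below does that; `star_up_to` and `plus_up_to` are hypothetical helper names.

```python
def star_up_to(B, max_len):
    """All strings in the Kleene star of B whose length is at most max_len."""
    result = {""}      # zero repetitions give the empty string
    frontier = {""}
    while frontier:
        # Append one more block from B to each string found in the last round.
        frontier = {w + b for w in frontier for b in B
                    if len(w + b) <= max_len} - result
        result |= frontier
    return result

def plus_up_to(B, max_len):
    """All strings in the Kleene plus of B (at least one block), up to max_len."""
    return {b + w for b in B for w in star_up_to(B, max_len)
            if len(b + w) <= max_len}

B = {"ba", "na"}
```

With a bound of 6 the star contains 1 + 2 + 4 + 8 = 15 strings. The plus contains the same strings except the empty one, because neither block of B is itself empty.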

34 Generating the Regular Languages The real reason that regular languages are called regular is the following: THM: The regular languages are all those languages which can be generated starting from the finite languages by applying the regular operations. This will be proved in the coming lectures. Q: Can we start with even more basic languages than arbitrary finite languages?

35 Generating the Regular Languages A: Yes. We can start with languages consisting of single strings which are themselves just a single character. These are the “atomic” regular languages. EG: To generate the finite language L = { banana, nab } we can start with the atomic languages A = {a}, B = {b}, N = {n}. Then we can express L as: L = (B ∘ A ∘ N ∘ A ∘ N ∘ A) ∪ (N ∘ A ∘ B)
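
The expression for L can be evaluated mechanically from the atomic languages. This is a minimal sketch using Python sets; `concat` and `concat_all` are illustrative helpers, not part of the course material.

```python
from functools import reduce

def concat(X, Y):
    """Concatenation of two languages given as sets of strings."""
    return {x + y for x in X for y in Y}

def concat_all(*languages):
    """Concatenate several languages left to right, starting from {""}."""
    return reduce(concat, languages, {""})

# Atomic languages: one string each, one character each.
A, B, N = {"a"}, {"b"}, {"n"}

# Concatenate along each word, then take the union of the two results.
L = concat_all(B, A, N, A, N, A) | concat_all(N, A, B)
```

Here `L` evaluates to the set containing exactly "banana" and "nab". Starting the fold from `{""}` works because that language is the identity for concatenation.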

36 Blackboard Exercises Express the DFA patterns from the previous board-exercises using regular operations in both UNIX-style and Computability-style.