CS 3813: Introduction to Formal Languages and Automata

CS 3813: Introduction to Formal Languages and Automata Chapter 1 Introduction to the Theory of Computation These class notes are based on material from our textbook, An Introduction to Formal Languages and Automata, 3rd ed., by Peter Linz, published by Jones and Bartlett Publishers, Inc., Sudbury, MA, 2001. They are intended for classroom use only and are not a substitute for reading the textbook.

Acknowledgement
Unless otherwise credited, all class notes on this website are based on the required textbook for this course: Linz, Peter. An Introduction to Formal Languages and Automata, 3rd ed. Sudbury, Mass.: Jones and Bartlett Publishers, 2001. These notes are intended solely for the use of the students in the CS 3813 class at Mississippi State University. Please assume that any errors are mine, and not those of the textbook's author.

Topic of course
What are the fundamental capabilities and limitations of computers?
- To answer this, we will study abstract mathematical models of computers.
- These models abstract away many of the details of computers so that we can focus on the essential aspects of computation.
- This allows us to develop a mathematical theory of computation.

Review of set theory
A set can be specified in two ways:
- list of elements: A = {6, 12, 28}
- characteristic property: B = {x | x is a positive, even integer}
Set membership: 12 ∈ A, 9 ∉ A
Set inclusion: A ⊆ B (A is a subset of B), A ⊂ B (A is a proper subset of B)
Set operations:
- union: A ∪ {9, 12} = {6, 9, 12, 28}
- intersection: A ∩ {9, 12} = {12}
- difference: A - {9, 12} = {6, 28}

Set theory (continued)
Another set operation, called "taking the complement of a set", assumes a universal set.
Let U = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} be the universal set.
Let A = {2, 4, 6, 8}
Then Ā = U - A = {0, 1, 3, 5, 7, 9}
The empty set: ∅ = {}

Set theory (continued)
The cardinality of a set is the number of elements in the set.
Let S = {2, 4, 6}
Then |S| = 3
The powerset of S, written 2^S, is the set of all subsets of S.
2^S = {{}, {2}, {4}, {6}, {2,4}, {2,6}, {4,6}, {2,4,6}}
The number of elements in a powerset is |2^S| = 2^|S|
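
The set operations reviewed above map directly onto Python's built-in set type. The following is a minimal sketch rather than part of the original notes; the helper name `powerset` is chosen here for illustration, and subsets are stored as `frozenset` values because Python sets cannot contain ordinary (mutable) sets.

```python
from itertools import chain, combinations

A = {6, 12, 28}

# Membership and the three binary operations from the review slides.
is_member = 12 in A              # 12 is an element of A
not_member = 9 not in A          # 9 is not an element of A
union = A | {9, 12}
intersection = A & {9, 12}
difference = A - {9, 12}

# The complement is only defined relative to a universal set U.
U = set(range(10))
complement = U - {2, 4, 6, 8}


def powerset(s):
    """Return the set of all subsets of s, each subset as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(subset) for subset in subsets}


S = {2, 4, 6}
P = powerset(S)
# The size of the powerset is 2 raised to the size of S.
sizes_agree = len(P) == 2 ** len(S)
```

Generating subsets by size with `itertools.combinations` keeps the sketch short; a recursive construction would work equally well.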

What does the title of this course mean?
Formal language
- a subset of the set of all possible strings from a set of symbols
- example: the set of all syntactically correct C programs
Automata
- abstract, mathematical models of computers
- examples: finite automata, pushdown automata, Turing machine, RAM, PRAM, many others
Let's consider each of these in turn.

Formal language
Alphabet = finite set of symbols or characters
- examples: Σ = {a, b}, binary, ASCII
String = finite sequence of symbols from an alphabet
- examples: aab, bbaba, also computer programs
A formal language is a set of strings over an alphabet.
Examples of formal languages over the alphabet Σ = {a, b}:
- L1 = {aa, aba, aababa, aa}
- L2 = {all strings containing exactly two a's and any number of b's}
A formal language can be finite or infinite.

Formal languages (continued)
We often use string variables: u = aab, v = bbaba
Operations on strings:
- length: |u| = 3
- reversal: u^R = baa
- concatenation: uv = aabbbaba
The empty string, denoted λ, has some special properties:
- |λ| = 0
- λw = wλ = w

Formal languages (continued)
If w is a string, then w^n stands for the string obtained by repeating w n times.
w^0 = λ
Σ+ = Σ* - {λ}
L^0 = {λ}
L^1 = L
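
The string operations above can be tried out with ordinary Python strings. This is an illustrative sketch, not part of the original notes: the empty string is represented by `""`, and the helper names `power` and `strings_up_to` are assumptions made here. Because the set of all strings over an alphabet is infinite, `strings_up_to` lists only the strings up to a chosen length.

```python
from itertools import product

u = "aab"
v = "bbaba"

length = len(u)           # |u|
reversal = u[::-1]        # u reversed
concatenation = u + v     # uv


def power(w, n):
    """Return w repeated n times; power(w, 0) is the empty string."""
    return w * n


def strings_up_to(alphabet, max_len, include_empty=True):
    """List all strings over alphabet with length at most max_len.

    With include_empty=False the empty string is left out, which mirrors
    removing the empty string from the set of all strings.
    """
    shortest = 0 if include_empty else 1
    return [
        "".join(symbols)
        for n in range(shortest, max_len + 1)
        for symbols in product(sorted(alphabet), repeat=n)
    ]


all_strings = strings_up_to({"a", "b"}, 2)
nonempty_strings = strings_up_to({"a", "b"}, 2, include_empty=False)
```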

Operations on languages
Set operations:
- L1 ∪ L2 = {x | x ∈ L1 or x ∈ L2} is union
- L1 ∩ L2 = {x | x ∈ L1 and x ∈ L2} is intersection
- L1 - L2 = {x | x ∈ L1 and x ∉ L2} is difference
- L̄ = Σ* - L is complement
- L1 ⊕ L2 = (L1 - L2) ∪ (L2 - L1) is "symmetric difference"
String operations:
- L^R = {w^R | w ∈ L} is "reverse of language"
- L1L2 = {xy | x ∈ L1, y ∈ L2} is "concatenation of languages"
- L* = {x = x1…xk | k ≥ 0 and x1, …, xk ∈ L} = L^0 ∪ L^1 ∪ L^2 ∪ . . . is "Kleene star" or "star closure"
- L+ = L^1 ∪ L^2 ∪ . . . is positive closure
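
For finite languages, the operations above can be sketched with Python sets of strings. The function names below (`concat`, `reverse`, `lang_power`, `star_up_to`) are illustrative choices, not notation from the textbook. The Kleene star of a nonempty language is usually infinite, so `star_up_to` computes only the union of the first few powers.

```python
def concat(L1, L2):
    """Concatenation of languages: every x from L1 followed by every y from L2."""
    return {x + y for x in L1 for y in L2}


def reverse(L):
    """Reverse of a language: reverse every string in L."""
    return {w[::-1] for w in L}


def lang_power(L, n):
    """n-th power of L; the 0-th power contains only the empty string."""
    result = {""}
    for _ in range(n):
        result = concat(result, L)
    return result


def star_up_to(L, k):
    """Union of the powers 0 through k of L, a finite slice of the star closure."""
    result = set()
    for n in range(k + 1):
        result |= lang_power(L, n)
    return result


L1 = {"a", "ab"}
L2 = {"b", ""}

product_language = concat(L1, L2)            # contains "a", "ab", "abb"
reversed_language = reverse(L1)              # contains "a", "ba"
symmetric_difference = {"a", "b"} ^ {"b", "c"}
star_slice = star_up_to({"ab"}, 2)
```

Union, intersection, difference, and symmetric difference need no helper functions, since Python's `|`, `&`, `-`, and `^` operators on sets already provide them.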

Some review questions
- What is {λ, 01, 001} ∪ {λ, 00, 10}?
- What is the concatenation of {0, 11, 010} and {λ, 10, 010}?
- What are the 5 shortest strings in the language {0^i 1^i | i ≥ 0}?
- What is the powerset of {a, b, ab}?

Important example of a formal language
- alphabet: ASCII symbols
- string: a particular C++ program
- formal language: set of all legal C++ programs

Grammars
A grammar G is defined as a quadruple: G = (V, T, S, P)
where
- V is a finite set of objects called variables
- T is a finite set of objects called terminal symbols
- S ∈ V is a special symbol called the start symbol
- P is a finite set of productions or "production rules"
Sets V and T are nonempty and disjoint.

Grammars
Production rules have the form:
x → y
where x is an element of (V ∪ T)+ and y is in (V ∪ T)*.
Given a string of the form w = uxv and a production rule x → y, we can apply the rule, replacing x with y, giving z = uyv.
We can then say that w ⇒ z
Read as "w derives z", or "z is derived from w".
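
Applying a production rule is plain string rewriting, which the following sketch makes concrete. It is an illustration under stated assumptions, not the textbook's notation: each symbol is a single character, uppercase letters stand for variables, lowercase letters for terminals, a rule is a pair `(x, y)`, and the empty string is written `""`. The function name `derive_one_step` is hypothetical.

```python
def derive_one_step(w, productions):
    """Return every string z obtainable from w by one rule application.

    For each rule (x, y) and each occurrence of x in w, so that w = u x v,
    the result contains z = u y v.
    """
    results = set()
    for x, y in productions:
        position = w.find(x)
        while position != -1:
            u, v = w[:position], w[position + len(x):]
            results.add(u + y + v)
            position = w.find(x, position + 1)
    return results


# Two example rules: S -> aSb and S -> (empty string).
P = [("S", "aSb"), ("S", "")]

from_start = derive_one_step("S", P)      # contains "aSb" and ""
from_aSb = derive_one_step("aSb", P)      # contains "aaSbb" and "ab"
from_ab = derive_one_step("ab", P)        # no variable left, so no rule applies
```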

Grammars
If u ⇒ v, v ⇒ w, w ⇒ x, x ⇒ y, and y ⇒ z, then we say:
u ⇒* z
This says that u derives z in an unspecified number of steps.
Along the way, we may generate strings which contain variables as well as terminals. These are called sentential forms.

Grammars
What is the relationship between a language and a grammar?
Let G = (V, T, S, P)
The set
L(G) = {w ∈ T* : S ⇒* w}
is the language generated by G.

Grammars
Consider the grammar G = (V, T, S, P), where:
V = {S}
T = {a, b}
S is the start symbol
P consists of the productions
  S → aSb
  S → λ

Grammars
What are some of the strings in this language?
S ⇒ aSb ⇒ ab
S ⇒ aSb ⇒ aaSbb ⇒ aabb
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
It is easy to see that the language generated by this grammar is:
L(G) = {a^n b^n : n ≥ 0}
(See proof on pp. 21-22 in Linz)
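
The derivations above can be automated by searching over sentential forms, starting from S and applying the rules S → aSb and S → λ until no variable remains. The sketch below is one possible way to do this and is not taken from the textbook. It assumes single-character symbols, uppercase variables, and lowercase terminals, and it discards any form that already has more than `max_len` terminals. That pruning makes the search stop for grammars like this one, where every rule application either adds terminals or removes the variable; it is not a general-purpose algorithm.

```python
from collections import deque


def generate(productions, start="S", max_len=6):
    """Collect the terminal strings of length <= max_len derivable from start."""
    language = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if not any(symbol.isupper() for symbol in form):
            language.add(form)        # no variables left: a string of the language
            continue
        for x, y in productions:
            position = form.find(x)
            while position != -1:
                successor = form[:position] + y + form[position + len(x):]
                terminals = sum(1 for s in successor if not s.isupper())
                if terminals <= max_len and successor not in seen:
                    seen.add(successor)
                    queue.append(successor)
                position = form.find(x, position + 1)
    return language


P = [("S", "aSb"), ("S", "")]
strings = generate(P, max_len=6)
# Every generated string has the shape a^n b^n.
all_balanced = all(
    w == "a" * (len(w) // 2) + "b" * (len(w) // 2) for w in strings
)
```

A finite check like this only samples the language; it does not replace the proof referenced in Linz.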

Grammars
Let's go the other way, from a description of a language to a grammar that generates it.
Find a grammar that generates:
L = {a^n b^(n+1) : n ≥ 0}
So the strings of this language will be:
b (0 a's and 1 b)
abb (1 a and 2 b's)
aabbb (2 a's and 3 b's)
. . .
In order to generate a string with no a's and 1 b, you might want to write rules for the grammar that say:
S → ab
a → λ
But you can't do this; a is a terminal, and you can't change a terminal, only variables.

Grammars
So, instead of:
S → ab
a → λ
we create another variable, A (we often use capital letters to stand for variables), to use in place of the terminal, a:
S → Ab
A → λ
Now you might think that we can use another S rule here to generate the other part of the string, the a^n b^n part:
S → aSb
But you can't, because that will generate ab, aabb, etc.
Note, however, that if we use A in place of S, that will solve our problem:
A → aAb

Grammars
So, here are our rules:
S → Ab
A → aAb
A → λ
The S → Ab rule creates a single b terminal on the right, preceded by other strings (including possibly the empty string) on the left.
The A → λ rule allows the single b string to be generated.
Together, the A → aAb rule and the A → λ rule allow ab, aabb, aaabbb, etc. to be generated on the left side of the string.
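
To see the three rules at work, the sketch below replays one particular derivation: apply S → Ab once, apply A → aAb n times, then erase A with A → λ. The function name `derivation` is chosen here for illustration, and the code assumes the variable A occurs exactly once in every sentential form, which holds for this grammar.

```python
def derivation(n):
    """Return the sentential forms in a derivation of a^n b^(n+1)."""
    forms = ["S"]
    current = "Ab"                            # apply S -> Ab
    forms.append(current)
    for _ in range(n):
        current = current.replace("A", "aAb")  # apply A -> aAb
        forms.append(current)
    current = current.replace("A", "")         # apply A -> empty string
    forms.append(current)
    return forms


steps = derivation(2)
# steps is ["S", "Ab", "aAbb", "aaAbbb", "aabbb"]
shortest = derivation(0)[-1]
# shortest is "b": zero a's and one b
```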

Language-recognition problem
There are many types of computational problem. We will focus on the simplest, called the "language-recognition problem."
Given a string, determine whether it belongs to a language or not.
(Practical application for compilers: Is this a valid C++ program?)
We study simple models of computation called "automata," and measure their computational power in terms of the class of languages they can recognize.

Automata
[Figure: general model of an automaton, with an input file, a control unit (with finite states), temporary storage, and output]

"Computer" or Turing machine (Alan Turing 1936)
[Figure: a finite-state control connected by a read/write head to an infinite tape, or "memory"]

Finite automata
Developed in the 1940's and 1950's for neural net models of the brain and for computer hardware design
Finite memory!
Many applications:
- text-editing software: search and replace
- many forms of pattern recognition (including use in WWW search engines)
- compilers: recognizing keywords (lexical analysis)
- sequential circuit design
- software specification and design
- communications protocols
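
As a preview of how a machine with finite memory recognizes a language, here is a small sketch of a finite automaton for the language of strings over {a, b} containing exactly two a's. The formal definition comes later in the course; the state numbering, the transition table `DELTA`, and the function name `accepts` are assumptions made for this illustration. The only "memory" is the current state, which records how many a's have been seen, capped at three.

```python
# State k means "k a's seen so far"; state 3 is a trap for "too many a's".
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 2, (1, "b"): 1,
    (2, "a"): 3, (2, "b"): 2,
    (3, "a"): 3, (3, "b"): 3,
}
START_STATE = 0
ACCEPTING_STATES = {2}


def accepts(string):
    """Return True if string (over the alphabet {a, b}) has exactly two a's."""
    state = START_STATE
    for symbol in string:
        state = DELTA[(state, symbol)]   # raises KeyError for other symbols
    return state in ACCEPTING_STATES


two_as = accepts("babab")
three_as = accepts("aaa")
```

The machine reads the input once, left to right, and never needs more than four states no matter how long the input is.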

Pushdown automata
Noam Chomsky's work in the 1950's and 1960's on grammars for natural languages
Infinite memory, organized as a stack
Applications:
- compilers: parsing computer programs
- programming language design
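
A stack is exactly the extra memory needed to recognize a language such as a^n b^n (n ≥ 0), which requires counting without a fixed bound. The sketch below illustrates the idea only and is not a formal pushdown automaton; the function name `accepts_anbn` is hypothetical. It pushes one marker per a, pops one per b, and accepts when the input ends with an empty stack.

```python
def accepts_anbn(string):
    """Return True if string consists of n a's followed by n b's, n >= 0."""
    stack = []
    seen_b = False
    for symbol in string:
        if symbol == "a":
            if seen_b:
                return False       # an a after a b is out of order
            stack.append("a")      # remember this a
        elif symbol == "b":
            seen_b = True
            if not stack:
                return False       # more b's than a's
            stack.pop()            # match this b against an earlier a
        else:
            return False           # symbol outside the alphabet {a, b}
    return not stack               # accept only if every a was matched


balanced = accepts_anbn("aaabbb")
unbalanced = accepts_anbn("aab")
```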

Computational power
[Figure: automata ordered by computational power, from most to least powerful: TM, LBA, PDA, FSA]

Automata, languages, and grammars
In this course, we will study the relationship between automata, languages, and grammars.
- Recall that a formal language is a set of strings over a finite alphabet.
- Automata are used to recognize languages.
- Grammars are used to generate languages.
(All these concepts fit together.)

Classification of automata, languages, and grammars
Automaton                               Language class
Turing machine                          Recursively enumerable
Linear-bounded automaton                Context sensitive
Nondeterministic push-down automaton    Context free
Finite-state automaton                  Regular

Besides developing a theory of classes of languages and automata, we will study the limits of computation. We will consider the following two important questions:
- What problems are impossible for a computer to solve?
- What problems are too difficult for a computer to solve in practice (although possible to solve in principle)?

Uncomputable (undecidable) problems
Many well-defined (and apparently simple) problems cannot be solved by any computer.
Examples:
- For any program x, does x have an infinite loop?
- For any two programs x and y, do these two programs have the same input/output behavior?
- For any program x, does x meet its specification? (i.e., does it have any bugs?)

Intractable problems
We will learn how to mathematically characterize the difficulty of computational problems.
There is a class of problems that can be solved in a reasonable amount of time and another class that cannot. (What good is it for a problem to be solvable, if it cannot be solved in the lifetime of the universe?)
The field of cryptography, for example, relies on the fact that the computational problem of "breaking a code" is intractable.

Why study the theory of computing?
- Core mathematics of CS (has not changed in over 30 years)
- Many applications, especially in design of compilers and programming languages
- Important to be able to recognize uncomputable and intractable problems
- Need to know this in order to be a computer scientist, not simply a computer programmer