Languages and Machines Unit two: Regular languages and Finite State Automata.


2 Review of week one A language is a set of strings (the set of different things you can say). May be infinite. A string is a sequence of symbols. Minimum length zero, maximum length some finite number. A symbol is just some mark on the page or screen. A language has a finite alphabet of symbols.

3 Review of week one In a context-dependent language, the meaning of a phrase depends on the context. In a context-sensitive language, the structure of a phrase depends on the context. Most natural languages are context-dependent but not context-sensitive. A context-free language is one where the structure of a phrase is always the same, independent of context. A regular language is a context-free language which has simple rules for forming valid strings (e.g. "94", "getWidth()").

4 Classes of formal language From most to least restrictive, each class contained in the next: regular, context-free, context-sensitive, phrase structure.

5 Regular languages Here are examples of strings from a regular language with alphabet {a,b}: a b ab aaaaa ababab

6 Regular languages 1. the empty set is a regular language 2. the set consisting of the empty string (ε) is a regular language 3. the set consisting of a one-symbol string is a regular language 4. a new regular language can be made by taking a string from one regular language and concatenating it with a string from another regular language 5. a new regular language can be made by taking the union of two regular languages
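
Rules 4 and 5 can be tried out on small finite languages by modeling a language as a Python set of strings. This is only a sketch: the helper names concat and union are hypothetical and do not come from the slides.

```python
# Sketch: finite languages modeled as Python sets of strings.
# The names concat and union are illustrative, not from the slides.

def concat(lang1, lang2):
    """Every string of lang1 followed by every string of lang2."""
    return {s + t for s in lang1 for t in lang2}

def union(lang1, lang2):
    """All strings that are in lang1 or in lang2."""
    return lang1 | lang2

EMPTY = set()         # rule 1: the empty set
EPSILON = {""}        # rule 2: the set holding only the empty string
A, B = {"a"}, {"b"}   # rule 3: sets holding a one-symbol string

print(sorted(concat(A, B)))             # rule 4
print(sorted(union(A, concat(A, B))))   # rule 5
```

Concatenating with EPSILON leaves a language unchanged, while concatenating with EMPTY always gives the empty set.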

7 Recognizing regular languages Regular languages can be recognized and interpreted by a finite-state machine. For example, here is a machine to recognize a two-bit string: [diagram: finite-state machine with its acceptor states marked]
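
A recognizer for two-bit strings might be sketched in Python as a transition table plus a loop. The state names start, one_bit and two_bits are assumptions made for this example, not labels taken from the slide.

```python
# Sketch of a deterministic machine that accepts exactly the two-bit strings.
# State names are assumptions, not taken from the slide.

TRANSITIONS = {
    ("start", "0"): "one_bit",
    ("start", "1"): "one_bit",
    ("one_bit", "0"): "two_bits",
    ("one_bit", "1"): "two_bits",
}
ACCEPTING = {"two_bits"}

def accepts(string):
    """Return True if the machine ends in an acceptor state."""
    state = "start"
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:   # no transition for this symbol: reject
            return False
    return state in ACCEPTING

for candidate in ["01", "11", "0", "010"]:
    print(candidate, accepts(candidate))
```

A third bit is rejected because two_bits has no outgoing transitions in the table.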

8 Regular expressions Wouldn’t it be nice if we had a compact way of specifying a regular language? We do! It is a special notation called a regular expression.

9 Examples of regular languages 1. the set of all two-symbol strings containing the letters a and b: (a|b)^2 2. the set of all two-bit strings: (0|1)^2 3. the set of all possible words: (a|..|z)+ 4. the set of all decimal integers: (0|(1|..|9)(0|..|9)*) 5. the set of Java identifiers: JavaLetter JavaLetterOrDigit*

10 More examples of regular languages 1. all the possible three-bit strings: (0|1)^3 2. all the single-digit decimal numbers: (0|1|2|3|4|5|6|7|8|9) or (0|..|9) 3. all the possible repetitions of the traffic-light sequence (red, amber, green, amber): (red amber green amber)*
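
Several of these expressions can be checked with Python's re module. The translation below is an assumption about how the slide notation maps onto Python syntax: a repetition count n is written {n}, and a range such as a|..|z is written as the character class [a-z]. The dictionary keys and the helper in_language are hypothetical names.

```python
import re

# Assumed Python spellings of expressions from the examples above.
PATTERNS = {
    "two_symbols": r"(a|b){2}",       # two-symbol strings over a and b
    "three_bits": r"(0|1){3}",        # three-bit strings
    "words": r"[a-z]+",               # one or more lower-case letters
    "integers": r"0|[1-9][0-9]*",     # decimal integers without leading zeros
}

def in_language(name, string):
    """True if the whole string matches the named pattern."""
    return re.fullmatch(PATTERNS[name], string) is not None

print(in_language("two_symbols", "ab"))
print(in_language("integers", "007"))
```

re.fullmatch is used rather than re.match so that the entire string, not just a prefix, has to belong to the language.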

11 Activity Write down the regular expressions denoting the following regular languages: The language with two strings “the cat” and “the mat”. Arithmetic expressions with two operands, e.g. 3 × 4; the allowed operators are +, -, ×, ÷ and the allowed operands are single-digit decimal numbers. The language consisting of all possible binary strings. The language of HTML tags such as

12 Suggested Answers The language with two strings “the cat” and “the mat”: the (cat | mat) or the (c|m)at. Arithmetic expressions with two operands, e.g. 3 × 4: (0|..|9) (+|-|×|÷) (0|..|9). The language consisting of all possible binary strings: (0|1)*. The language of HTML tags such as

13 A cautionary note You have been using a metalanguage! The regular expression strings form a language having terminal symbols ( ) + * | plus literal symbols, e.g. a stands for the letter a. This can cause problems when the metalanguage and the language get confused, e.g. the language consisting of strings of one to three vertical bars: | | || | |||

14 A cautionary note We can fix this by some ghastly escape convention, e.g. convert the above to "|" | "||" | "|||". Now we have problems with the quote symbol! The best idea is to choose metalanguage symbols which are rarely encountered in the language being described, and use bold-face or color to distinguish them.
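
Programming languages face the same clash between metalanguage and language. As one illustration (Python's convention, not the one used on the slides), the re module escapes a literal vertical bar with a backslash, and re.escape adds the backslash automatically. The variable names escaped and bounded are hypothetical.

```python
import re

# One to three vertical bars, written two ways in Python's re syntax.
escaped = r"\||\|\||\|\|\|"          # backslash-escape every literal bar
bounded = re.escape("|") + "{1,3}"   # same language: one to three escaped bars

for candidate in ["|", "||", "|||", "||||"]:
    print(candidate,
          re.fullmatch(escaped, candidate) is not None,
          re.fullmatch(bounded, candidate) is not None)
```

Both patterns describe the same language; the second is easier to read because the repetition count replaces the written-out alternatives.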

15 Regular languages and regular expressions Each way of forming a regular language has a matching regular expression:
1. the empty set: Ø
2. the set consisting of the empty string: ε
3. the set consisting of a one-symbol string (e.g. "a"): a
4. a new regular language made by taking a string from a regular language and concatenating it with a string from a regular language: a b
5. a new regular language made by taking the union of two regular languages: a | b

16 Regular languages and regular expressions The other ways of forming regular expressions are just shorthand: a^0 = ε, a^1 = a, a^2 = aa, a* = ε | a | aa | aaa | ..., a+ = a | aa | aaa | ...
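
The shorthand can be unfolded mechanically. The sketch below lists the first few strings of a* and a+ up to a chosen length; the function names power, star_up_to and plus_up_to are hypothetical, and the cut-off is needed only because the full languages are infinite.

```python
def power(symbol, n):
    """The symbol repeated n times; zero repetitions give the empty string."""
    return symbol * n

def star_up_to(symbol, limit):
    """The strings of the star language with at most `limit` repetitions."""
    return [power(symbol, n) for n in range(limit + 1)]

def plus_up_to(symbol, limit):
    """Like star_up_to, but starting from one repetition instead of zero."""
    return [power(symbol, n) for n in range(1, limit + 1)]

print(star_up_to("a", 3))
print(plus_up_to("a", 3))
```

The only difference between the two lists is the empty string, which belongs to a* but not to a+.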

17 Regular languages and regular expressions Brackets are used to show precedence of the operations: (a | b)* ≠ a | b*. The default precedence, highest first, is: * or + or ^n, then concatenation, then |.
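
Python's re module follows the same precedence order, so the difference between the bracketed and unbracketed forms can be observed directly. The helper matches and the variable names are hypothetical.

```python
import re

def matches(pattern, string):
    """True if the whole string belongs to the pattern's language."""
    return re.fullmatch(pattern, string) is not None

grouped = r"(a|b)*"    # any mix of a's and b's, including the empty string
ungrouped = r"a|b*"    # star binds tighter: a single a, or a run of b's

print(matches(grouped, "abba"), matches(ungrouped, "abba"))
print(matches(ungrouped, "a"), matches(ungrouped, "bbb"))

# Concatenation binds tighter than |, so ab|c means (ab)|c, not a(b|c).
print(matches(r"ab|c", "c"), matches(r"ab|c", "ac"))
```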

18 Activity Give examples of strings in the following languages: 1. (x | y | z)^3 2. x | y | z* 3. a b^2 4. (a b)^2

19 Suggested Answers Give examples of strings in the following languages: 1. (x | y | z)^3: xzy 2. x | y | z* 3. a b^2: abb 4. (a b)^2: abab

20 From Regular Expressions to Finite State Automata 1. It is an amazing fact that any regular expression has an equivalent finite state automaton which recognizes the language it describes, 2. and every finite state automaton recognizes the language of some regular expression. We will prove these propositions later.

21 Finite State Machines An FSM to add two binary numbers. [diagram: states A to F, with labels 00 and 01; callouts mark the start state, a transition, an input symbol, an output symbol and the end state]

22 Finite state automata These are simple machines with no output symbols. They can only recognize strings of input symbols. Acceptance is shown by a special state.

23 NFAs The kind of finite state automata we shall be using are called nondeterministic finite automata. "Nondeterministic" means we can do naughty things like: have a transition without a symbol label; have two exit transitions with the same symbol; not show the paths which lead to failure.

24 Example of an NFA What regular language does this NFA represent? [diagram: NFA with transitions labeled a, b and c] Answer: a b | a b c | a+
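
An NFA for a b | a b c | a+ can be simulated by tracking the set of states it might be in. The state numbering and layout below are assumptions made for this sketch and are not copied from the slide's diagram; None stands for a transition without a symbol label.

```python
# Sketch of an NFA for  a b | a b c | a+  (assumed layout, not the slide's).
NFA = {
    (0, "a"): {1, 4},   # two exit transitions from state 0 on the same symbol
    (1, "b"): {2},
    (2, "c"): {3},
    (4, "a"): {4},
    (2, None): {5},     # unlabeled transitions into the accepting state
    (3, None): {5},
    (4, None): {5},
}
START, ACCEPTING = 0, {5}

def closure(states):
    """Add every state reachable through unlabeled transitions."""
    todo, seen = list(states), set(states)
    while todo:
        state = todo.pop()
        for nxt in NFA.get((state, None), set()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def accepts(string):
    """Follow all possible paths at once; accept if any path can accept."""
    current = closure({START})
    for symbol in string:
        moved = set()
        for state in current:
            moved |= NFA.get((state, symbol), set())   # failure paths drop out
        current = closure(moved)
    return bool(current & ACCEPTING)

for candidate in ["ab", "abc", "aaa", "abb"]:
    print(candidate, accepts(candidate))
```

Paths that lead to failure are never written down: a missing table entry simply contributes no states, which mirrors the third liberty in the NFA definition.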

25 Examples of conversion from REs to NFAs (a b)^2, a b^2, (a | b)^2, (a | b)* [diagrams: one NFA for each expression]

26 Activity Convert the following regular expressions to NFAs: 1. JavaLetter JavaLetterOrDigit* 2. (red amber green amber)* Convert the following NFAs to REs: 3. [diagram: NFA with transitions a, b] 4. [diagram: NFA with transitions a, b, c, d]

27 Suggested answer 1. [diagram: NFA with transitions javaLetter, javaLetterOrDigit] 2. [diagram: NFA with transitions red, amber, green, amber] 3. (ab)* 4. (ac|bd)+

28 Summary Regular expressions give us a neat notation for describing regular languages. Nondeterministic finite automata (NFAs) provide a diagrammatic version of regular expressions. These notations are equivalent. Finite automata theory is crucial in generating lexical analyzers from regular expressions.