UNIT - I Formal Language and Regular Expressions: Languages Definition regular expressions Regular sets identity rules. Finite Automata: DFA NFA NFA with.

Presentation transcript:

UNIT - I
Formal Language and Regular Expressions: languages, definition of regular expressions, regular sets, identity rules.
Finite Automata: DFA, NFA, NFA with ε-transitions (significance, acceptance of languages), NFA to DFA conversion, minimization of DFA, finite automata with output (Moore and Mealy machines), constructing finite automata for a given regular expression, conversion of finite automata to regular expressions.

What is automata theory?
– Automata theory is the study of abstract computational devices.
– Abstract devices are (simplified) models of real computations.
– Computations happen everywhere: on your laptop, on your cell phone, in nature, …
– Why do we need abstract models?
A simple "computer"
[Figure: a battery, a switch and a light bulb, modeled by two states, off (start) and on, with a transition labeled f in each direction.]
– input: switch
– output: light bulb
– actions: f for "flip switch"
– states: on, off
– The bulb is on if and only if there was an odd number of flips.
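The two-state switch can be simulated in a few lines. This is only a sketch: the function name `bulb_state` and the dictionary encoding of the transition f are illustrative choices, not part of the slides.

```python
def bulb_state(actions, start="off"):
    """Return the bulb state after a sequence of 'f' (flip switch) actions."""
    flip = {"off": "on", "on": "off"}  # the single action f toggles the state
    state = start
    for action in actions:
        if action != "f":
            raise ValueError(f"unknown action: {action!r}")
        state = flip[state]
    return state


print(bulb_state("fff"))  # three flips: an odd number
print(bulb_state("ff"))   # two flips: an even number
```

The state alone remembers whether the number of flips so far is odd or even, which is all this "computer" needs to know.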

Alphabets and Languages
An alphabet is a finite non-empty set. We use the symbol ∑ (sigma) to denote an alphabet.
Examples:
– Binary: ∑ = {0,1}
– All lower case letters: ∑ = {a,b,c,..z}
– Alphanumeric: ∑ = {a-z, A-Z, 0-9}
– DNA molecule letters: ∑ = {a,c,g,t}
Strings
– A string or word is a finite sequence of symbols chosen from ∑.
– The empty string is ε (or "epsilon").
– The length of a string w, denoted by |w|, is equal to the number of (non-ε) characters in the string. E.g., x = 010100 gives |x| = 6; x = 01ε0ε1ε00ε gives |x| = ?
– xy = concatenation of two strings x and y

Powers of an alphabet
Let ∑ be an alphabet.
– ∑^k = the set of all strings of length k
– ∑* = ∑^0 ∪ ∑^1 ∪ ∑^2 ∪ …
– ∑+ = ∑^1 ∪ ∑^2 ∪ ∑^3 ∪ …
L is said to be a language over alphabet ∑ only if L ⊆ ∑*. This is because ∑* is the set of all strings (of all possible lengths, including 0) over the given alphabet ∑.
Examples:
1. Let L be the language of all strings consisting of n 0's followed by n 1's: L = {ε, 01, 0011, 000111, …}
2. Let L be the language of all strings with an equal number of 0's and 1's: L = {ε, 01, 10, 0011, 1100, 0101, 1010, 1001, …}
Definition: Ø denotes the empty language.
Let L = {ε}; is L = Ø?
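The powers of an alphabet can be enumerated directly for small k. The sketch below assumes strings are represented as Python `str` values, with `""` standing for ε. The helper names `power`, `star_up_to` and `in_zeros_then_ones` are made up for illustration. Since ∑* is infinite, only its strings up to a chosen length are generated.

```python
from itertools import product


def power(sigma, k):
    """All strings of length k over the alphabet sigma, i.e. the k-th power of sigma."""
    return {"".join(p) for p in product(sorted(sigma), repeat=k)}


def star_up_to(sigma, n):
    """The finite part of sigma* made of strings of length 0..n."""
    result = set()
    for k in range(n + 1):
        result |= power(sigma, k)
    return result


def in_zeros_then_ones(w):
    """Membership test for the language of n 0's followed by n 1's (n >= 0)."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


sigma = {"0", "1"}
print(sorted(power(sigma, 2)))
print(sorted(w for w in star_up_to(sigma, 4) if in_zeros_then_ones(w)))
```

Filtering the bounded part of ∑* with a membership test is one way to see that a language is just a subset of ∑*.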

Formal Language
A formal language L is a set of finite-length words (or "strings") over some finite alphabet A. ε is the empty word.
Example: A = {a, b, c}, L1 = {ab, c}
Some examples of formal languages:
– the set of all words over {a, b}
– the set { a^n | n is a prime number }
– the set of syntactically correct programs in some programming language

Several operations can be used to produce new languages from given ones. Suppose L1 and L2 are languages over some common alphabet.
– The concatenation L1L2 consists of all strings of the form vw where v is a string from L1 and w is a string from L2.
– The intersection of L1 and L2 consists of all strings which are contained in L1 and also in L2.
– The union of L1 and L2 consists of all strings which are contained in L1 or in L2.
– The complement of the language L1 consists of all strings over the alphabet which are not contained in L1.
– The Kleene star L1* consists of all strings which can be written in the form w1w2...wn with strings wi in L1 and n ≥ 0. Note that this includes the empty string ε because n = 0 is allowed.
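For finite languages these operations map closely onto Python set operations. The following is a minimal sketch, assuming languages are `set`s of `str` and that the Kleene star and the complement are cut off at a maximum length (both are infinite in general). The language L2 and all function names are illustrative.

```python
from itertools import product


def concat(l1, l2):
    """Concatenation L1L2: every v from L1 followed by every w from L2."""
    return {v + w for v in l1 for w in l2}


def kleene_star(lang, max_len):
    """Strings of L* with at most max_len symbols."""
    result = {""}      # n = 0 contributes the empty string
    frontier = {""}
    while frontier:
        longer = {w for w in concat(frontier, lang) if len(w) <= max_len}
        frontier = longer - result
        result |= frontier
    return result


def complement(lang, sigma, max_len):
    """Strings over sigma with at most max_len symbols that are not in lang."""
    universe = {"".join(p) for k in range(max_len + 1)
                for p in product(sorted(sigma), repeat=k)}
    return universe - lang


L1 = {"ab", "c"}
L2 = {"a", "ab"}   # hypothetical second language
print(sorted(concat(L1, L2)))
print(sorted(L1 | L2), sorted(L1 & L2))   # union and intersection
print(sorted(kleene_star(L1, 3)))
print(sorted(complement(L1, {"a", "b", "c"}, 1)))
```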

Regular Expressions
A regular expression defines a regular language over an alphabet ∑:
– Ø and ε each denote a regular language: {} and {ε}
– Any symbol from ∑ is a regular language: for ∑ = {a, b, c}, {a} {b} {c}
– The concatenation of two regular languages is a regular language: for ∑ = {a, b, c}, {ab} {bc} {ca}
– The union (or disjunction) of two regular languages is a regular language: for ∑ = {a, b, c}, {ab|bc} {ca|bb}
– The Kleene closure (denoted by the Kleene star: *) of a regular language is a regular language: for ∑ = {a, b, c}, {a*} {(ab|ca)*}
– Positive closure of a language L: L+ = L* − L^0 = L* − {ε} (this holds when ε is not in L; in general L+ = LL*)
– Parentheses group a sub-language to override operator precedence
– A regular set is a set represented by a regular expression.

RE Examples (here + denotes union)
– L(001) = {001}
– L(0+10*) = {0, 1, 10, 100, 1000, 10000, …}
– L(0*10*) = {1, 01, 10, 010, 0010, …}, i.e. {w | w has exactly a single 1}
– L((∑∑)*) = {w | w is a string of even length}
– L((0(0+1))*) = {ε, 00, 01, 0000, 0001, 0100, 0101, …}
– L((0+ε)(1+ε)) = {ε, 0, 1, 01}
– L(1Ø) = Ø; concatenating the empty set to any set yields the empty set.
– Rε = R
– R+Ø = R
Exercise: Write a regular expression for the set of strings that contains an even number of 1's over ∑ = {0,1}. Treat zero 1's as an even number.
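Several of these examples can be checked mechanically with Python's `re` module. This is a sketch under the following assumptions: the slide's union operator + is written `|`, ε is written as an empty alternative, and only strings up to a small length are enumerated. The helper `language` is hypothetical, not a library function.

```python
import re
from itertools import product


def language(pattern, max_len, sigma="01"):
    """Strings over sigma with at most max_len symbols that fully match pattern."""
    words = ("".join(p) for k in range(max_len + 1)
             for p in product(sigma, repeat=k))
    return {w for w in words if re.fullmatch(pattern, w)}


by_length = lambda w: (len(w), w)

print(sorted(language(r"0|10*", 3), key=by_length))        # 0+10*
print(sorted(language(r"(0(0|1))*", 4), key=by_length))    # (0(0+1))*
print(sorted(language(r"(0|)(1|)", 4), key=by_length))     # (0+ε)(1+ε)

# 0*10* should contain exactly the strings with a single 1.
single_one = {w for w in language(r"[01]*", 4) if w.count("1") == 1}
print(language(r"0*10*", 4) == single_one)
```

`re.fullmatch` is used rather than `re.match` because a regular expression in the formal sense must describe the whole string, not just a prefix.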

Identity Rules

What are the strings represented by
– 10*: a 1 followed by any number of 0s (including no zeros)
– (10)*: any number of copies of 10 (including the null string)
– 0 + 01: the string 0 or the string 01
– 0(0 + 1)*: any string beginning with 0
– (0*1)*: any string not ending with a 0 (including the null string)
Find a regular expression
– The set of bit strings with even length: (00 + 01 + 10 + 11)*
– The set of bit strings ending with a 0, not containing 11, and not the null string: (0 + 10)*(0 + 10), or equivalently (0 + 10)+ using positive closure
– The set of bit strings containing an odd number of 0s: 1*01*(01*01*)*

Finite State Automata
A finite state automaton over an alphabet is:
– a directed graph
– a finite set of states defined by the nodes
– edges are labeled with elements of the alphabet, or the empty string; they define state transitions
– some nodes (or states) are marked as final
– one node is marked as the start state
[Figure legend: the drawing symbols for a transition, a state, a final state and the start state.]

Finite-state Automata
[Figure: an FSA over ∑ = { a, b, c } with states q0, q1, q2, q3, q4 joined in a chain by transitions labeled a, b, c, a; annotations point out a transition, a state, the start state and the final state.]
Representation
– An FSA may also be represented with a state-transition table. The table for the above FSA:

          Input
State   a   b   c
  0     1   Ø   Ø
  1     Ø   2   Ø
  2     Ø   Ø   3
  3     4   Ø   Ø
  4     Ø   Ø   Ø

Given an input string, an FSA will either accept or reject the input.
– If the FSA is in a final (or accepting) state after all input symbols have been consumed, then the string is accepted (or recognized).
– Otherwise (including the case in which an input symbol cannot be consumed), the string is rejected.
Input strings to try on the FSA and state-transition table above (∑ = { a, b, c }):
– IS1: abca
– IS2: ccba
– IS3: abcac
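The acceptance rule can be traced with a table-driven simulation. The sketch below assumes that state 0 is the start state and state 4 is the only final state, which the flattened figure suggests but does not state outright. A missing table entry plays the role of Ø. The names `DELTA`, `START`, `FINAL` and `accepts` are illustrative.

```python
# State-transition table of the example FSA; absent entries mean "no transition".
DELTA = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "c"): 3,
    (3, "a"): 4,
}
START = 0      # assumption: q0 is the start state
FINAL = {4}    # assumption: q4 is the final state


def accepts(word):
    """Return True if the FSA ends in a final state after consuming word."""
    state = START
    for symbol in word:
        state = DELTA.get((state, symbol))
        if state is None:       # the symbol cannot be consumed: reject
            return False
    return state in FINAL


for w in ["abca", "ccba", "abcac"]:
    print(w, "accepted" if accepts(w) else "rejected")
```

Under these assumptions only abca is accepted: ccba fails on its first symbol, and abcac reaches the final state but then cannot consume the trailing c.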

Determinism
– An FSA may be either deterministic (DFSA or DFA) or non-deterministic (NFSA or NFA).
– An FSA is deterministic if its behavior during recognition is fully determined by the state it is in and the symbol to be consumed. I.e., given an input string, only one path may be taken through the FSA.
– Conversely, an FSA is non-deterministic if, given an input string, more than one path may be taken through the FSA.
– One type of non-determinism is ε-transitions, i.e. transitions which consume the empty string (no symbols).
Formal Definition of FSA
A finite state automaton M = (∑, Q, δ, q0, F)
– ∑: alphabet
– Q: set of states
– δ: Q × ∑ → Q, a transition function
– q0: the start state
– F: final states

Non-deterministic Finite Automata
A nondeterministic finite automaton M is a five-tuple M = (Q, ∑, δ, q0, F), where:
– Q is a finite set of states of M
– ∑ is the finite input alphabet of M
– δ: Q × ∑ → power set of Q, is the state transition function mapping a state-symbol pair to a subset of Q
– q0 is the start state of M
– F ⊆ Q is the set of accepting states or final states of M
NFA that recognizes the language of strings that end in 01
[Figure: q0 with a self-loop on 0,1; q0 to q1 on 0; q1 to q2 on 1; q2 is accepting.]
note: δ(q0,0) = {q0,q1}, δ(q1,0) = {}
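An NFA can be simulated by tracking the set of states it could be in after each symbol. The sketch below encodes the "ends in 01" example, assuming the transitions read off the flattened figure (a self-loop on q0 for 0 and 1, q0 to q1 on 0, q1 to q2 on 1) with q2 as the only accepting state. Names are illustrative.

```python
# delta maps a (state, symbol) pair to a SET of states; missing pairs mean {}.
DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
START = "q0"
FINAL = {"q2"}   # assumption: q2 is the accepting state


def nfa_accepts(word):
    """Follow all possible paths at once by keeping the set of current states."""
    current = {START}
    for symbol in word:
        current = set().union(*(DELTA.get((q, symbol), set()) for q in current))
    return bool(current & FINAL)


for w in ["01", "1101", "10", ""]:
    print(repr(w), nfa_accepts(w))
```

The string is accepted if at least one of the tracked states is accepting once the input is exhausted.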

Deterministic Finite Automata
A DFA is an NFA with the following restrictions:
– ε moves are not allowed
– For every state s ∈ S, there is one and only one path from s for every input symbol a ∈ ∑.
[Figure: a DFA with states 0 (start), 1, 2 and 3 and transitions labeled a and b.]
What language is accepted?

Algorithm to construct an NFA for any regular expression (Thompson Construction)
Basic building blocks:
(1) Any letter a of the alphabet is recognized by: [figure]
(2) The empty set Ø is recognized by: [figure]
(3) The empty string ε is recognized by: [figure]

(4) Given regular expressions R and S, assume these boxes represent the finite automata for R and S: [figure]
(5) To construct an NFA for RS (concatenation): [figure]
(6) To construct an NFA for R | S (alternation): [figure]

(7) To construct an NFA for R* (closure): [figure]
Construct NFA for the regular expression (ab*c) | (a(b|c*))
[Figure: the resulting NFA, with transitions labeled a, b, c and ε.]
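The building blocks can be sketched as small combinator functions. This is one common variant of the Thompson construction, not necessarily the exact diagrams on the slides: every fragment has a single start and a single accepting state, and concatenation links two fragments with an ε-edge rather than merging states. The class and function names (`Frag`, `symbol`, `concat`, `alt`, `star`, `accepts`) are invented for this example.

```python
from itertools import count

_ids = count()   # source of fresh state numbers
EPS = None       # label used for ε-transitions


class Frag:
    """NFA fragment: one start state, one accepting state, a list of edges."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges


def symbol(a):
    s, f = next(_ids), next(_ids)
    return Frag(s, f, [(s, a, f)])


def concat(r, s):
    return Frag(r.start, s.accept, r.edges + s.edges + [(r.accept, EPS, s.start)])


def alt(r, s):
    i, f = next(_ids), next(_ids)
    extra = [(i, EPS, r.start), (i, EPS, s.start),
             (r.accept, EPS, f), (s.accept, EPS, f)]
    return Frag(i, f, r.edges + s.edges + extra)


def star(r):
    i, f = next(_ids), next(_ids)
    extra = [(i, EPS, r.start), (r.accept, EPS, f),
             (r.accept, EPS, r.start), (i, EPS, f)]
    return Frag(i, f, r.edges + extra)


def closure(states, edges):
    """All states reachable from `states` using only ε-edges."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(nfa, word):
    current = closure({nfa.start}, nfa.edges)
    for ch in word:
        moved = {dst for src, label, dst in nfa.edges
                 if src in current and label == ch}
        current = closure(moved, nfa.edges)
    return nfa.accept in current


# (ab*c) | (a(b|c*)); each call to symbol() creates fresh states.
left = concat(symbol("a"), concat(star(symbol("b")), symbol("c")))
right = concat(symbol("a"), alt(symbol("b"), star(symbol("c"))))
nfa = alt(left, right)

for w in ["a", "ab", "abbc", "acc", "abb", ""]:
    print(repr(w), accepts(nfa, w))
```

A fragment should be used only once; reusing the same `Frag` object in two places would share states between the two sub-automata.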

NFA to DFA conversion (Subset construction method)

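Convert the given RE into DFA using Subset Construction: ( a | b )* abb
[Figure: NFA with an ε-transition from q0 to q1, a self-loop on q1 labeled a, b, and the chain q1 to q2 on a, q2 to q3 on b, q3 to q4 on b; q4 is the final state.]

Iter.   New state                 ε-closure(move(sj, x))
        name   contains           x = a       x = b
  0     s0     q0, q1             q1, q2      q1
  1     s1     q1, q2             q1, q2      q1, q3
        s2     q1                 q1, q2      q1
  2     s3     q1, q3             q1, q2      q1, q4
  3     s4     q1, q4             q1, q2      q1

s4 contains q4, so s4 is the final state of the DFA.
NFA to DFA
[Figure: the resulting DFA over states s0 to s4. On a, every state goes to s1. On b: s0 to s2, s1 to s3, s2 to s2, s3 to s4, s4 to s2.]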
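The table can be reproduced by a short worklist implementation of the subset construction. This sketch assumes the NFA read off the flattened figure: an ε-edge from q0 to q1, a self-loop on q1 for a and b, then q1 to q2 on a, q2 to q3 on b and q3 to q4 on b, with q4 final. DFA states are represented as frozensets of NFA states, and all names are illustrative.

```python
from collections import deque

EPS = ""   # label used for ε-transitions
NFA = {
    ("q0", EPS): {"q1"},
    ("q1", "a"): {"q1", "q2"},
    ("q1", "b"): {"q1"},
    ("q2", "b"): {"q3"},
    ("q3", "b"): {"q4"},
}
START, FINAL, SIGMA = "q0", {"q4"}, "ab"


def eps_closure(states):
    """All NFA states reachable from `states` through ε-transitions only."""
    stack, seen = list(states), set(states)
    while stack:
        for nxt in NFA.get((stack.pop(), EPS), set()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)


def move(states, symbol):
    """NFA states reachable from `states` by consuming one symbol."""
    return {t for q in states for t in NFA.get((q, symbol), set())}


def subset_construction():
    start = eps_closure({START})
    dfa, seen, queue = {}, {start}, deque([start])
    while queue:
        s = queue.popleft()
        for x in SIGMA:
            t = eps_closure(move(s, x))
            dfa[(s, x)] = t
            if t not in seen:
                seen.add(t)
                queue.append(t)
    finals = {s for s in seen if s & FINAL}
    return start, dfa, seen, finals


def dfa_accepts(word):
    start, dfa, _, finals = subset_construction()
    state = start
    for ch in word:
        state = dfa[(state, ch)]
    return state in finals


start, dfa, states, finals = subset_construction()
for (s, x), t in sorted(dfa.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print(sorted(s), x, "->", sorted(t))
print("abb" , dfa_accepts("abb"), "| abab", dfa_accepts("abab"))
```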

Converting DFAs to REs
1. Combine serial links by concatenation.
2. Combine parallel links by alternation.
3. Remove self-loops by Kleene closure.
4. Select a node (other than initial or final) for removal. Replace it with a set of equivalent links whose path expressions correspond to the in and out links.
5. Repeat steps 1-4 until the graph consists of a single link between the entry and exit nodes.
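These steps correspond to the usual state-elimination procedure, sketched below. Assumptions for the sketch: a fresh entry node and exit node are added so that every original state can be removed, link labels are kept as Python `re` pattern strings (with `None` for "no link" and `""` for ε), and the four-state DFA for strings ending in abb is a hypothetical test input. Function names are illustrative.

```python
import re
from itertools import product


def union(r, s):
    """Parallel links: alternation (None means there is no link)."""
    if r is None:
        return s
    if s is None:
        return r
    return r if r == s else f"(?:{r}|{s})"


def star(r):
    """Self-loop: Kleene closure; no loop or an ε loop contributes ε."""
    return "" if r in (None, "") else f"(?:{r})*"


def dfa_to_regex(states, delta, start, finals):
    entry, exit_ = object(), object()   # fresh entry and exit nodes
    edge = {}

    def add(p, q, r):
        edge[(p, q)] = union(edge.get((p, q)), r)

    add(entry, start, "")
    for f in finals:
        add(f, exit_, "")
    for (p, a), q in delta.items():
        add(p, q, re.escape(a))

    for k in states:                    # remove the original states one by one
        loop = star(edge.pop((k, k), None))
        ins = [(p, r) for (p, q), r in edge.items() if q == k]
        outs = [(q, r) for (p, q), r in edge.items() if p == k]
        for key in [e for e in edge if k in e]:
            del edge[key]
        for (p, r_in), (q, r_out) in product(ins, outs):
            add(p, q, r_in + loop + r_out)   # serial links: concatenation
    return edge.get((entry, exit_))     # None if no string is accepted


# Hypothetical DFA accepting the strings over {a, b} that end in abb.
DELTA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 2,
         (2, "a"): 1, (2, "b"): 3, (3, "a"): 1, (3, "b"): 0}
pattern = dfa_to_regex([0, 1, 2, 3], DELTA, 0, {3})
print(pattern)
print(bool(re.fullmatch(pattern, "aababb")), bool(re.fullmatch(pattern, "abba")))
```

The resulting expression is correct but usually far from minimal, and it depends on the order in which states are removed.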

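Example
[Figure: a state graph with entry node 0 and exit node 5 reduced step by step. Link labels appearing during the reduction include a|b|c, b|c, d(a|b|c)d, b(b|c)d, b(b|c)da and a(b(b|c)da)*d.]
Final expression on the single link from 0 to 5: d(a|b|c)da(b(b|c)da)*d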