Topic 3: Automata Theory. Outline: finite state machines, regular expressions, DFA, NDFA and their equivalence, grammars and the Chomsky hierarchy.

Presentation transcript:

Topic 3: Automata Theory 1

Outline: Finite state machines, regular expressions, DFA, NDFA and their equivalence, grammars and the Chomsky hierarchy. 2

What is Automata Theory? The study of abstract computing devices, or “machines”. Automaton = an abstract computing device. Note: a “device” need not be physical hardware. A fundamental question in computer science: what can different models of machines do, and what can they not do? This is the theory of computation: computability vs. complexity. 3

4 Alan Turing (1912-1954). Father of modern computer science. English mathematician. Studied abstract machines, called Turing machines, even before computers existed. Heard of the Turing test? (A pioneer of automata theory)

Languages & Grammars. Languages: “A language is a collection of sentences (or “words”) of finite length all constructed from a finite alphabet of symbols.” Grammars: “A grammar can be regarded as a device that enumerates the sentences of a language”, nothing more, nothing less (N. Chomsky, Information and Control, Vol 2). Image source: Nowak et al., Nature, vol 417, 2002.

The Chomsky Hierarchy 6 A containment hierarchy of classes of formal languages: Regular (DFA) ⊂ Context-free (PDA) ⊂ Context-sensitive (LBA) ⊂ Recursively enumerable (TM).

The Central Concepts of Automata Theory 7

Alphabet: An alphabet is a finite, non-empty set of symbols. We use the symbol Σ (sigma) to denote an alphabet. Examples: binary: Σ = {0,1}; all lower-case letters: Σ = {a,b,c,...,z}; alphanumeric: Σ = {a-z, A-Z, 0-9}; DNA molecule letters: Σ = {a,c,g,t}; … 8

Strings: A string or word is a finite sequence of symbols chosen from Σ. The empty string is ε (or “epsilon”). The length of a string w, denoted |w|, is the number of (non-ε) characters in the string. E.g., a string x of six symbols has |x| = 6; for x = 01ε0ε1ε00ε, |x| = ? xy = concatenation of two strings x and y. 9
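The length and concatenation rules can be tried out directly. This is a minimal sketch, assuming ε is modelled by Python's empty string `""`; the second string `y` is a hypothetical value chosen only for illustration.

```python
# Assumption: the empty string ε is modelled as Python's "".
EPSILON = ""

# The string 01ε0ε1ε00ε: every ε contributes no symbol.
x = "01" + EPSILON + "0" + EPSILON + "1" + EPSILON + "00" + EPSILON
y = "11"  # hypothetical second string

length_x = len(x)  # counts only the non-ε symbols
xy = x + y         # concatenation of x and y

print(length_x, xy)
```

Because ε adds nothing, `len(xy)` is always `len(x) + len(y)`.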

Languages 10

The Membership Problem 11

12 Languages. Let Σ be a set of characters; Σ is called the alphabet. A language over Σ is a set of strings of characters drawn from Σ.
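Treating a language as a set of strings over an alphabet can be sketched in Python. This is only an illustration: the example language (binary strings with equally many 0s and 1s) and the names `strings_up_to` and `in_language` are assumptions, not part of the slides.

```python
from itertools import product

SIGMA = {"0", "1"}  # the binary alphabet

def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for symbols in product(sorted(alphabet), repeat=n):
            yield "".join(symbols)

def in_language(w):
    """Hypothetical language: strings over SIGMA with as many 0s as 1s."""
    return set(w) <= SIGMA and w.count("0") == w.count("1")

# The members of the language among all strings of length at most 2.
sample = [w for w in strings_up_to(SIGMA, 2) if in_language(w)]
print(sample)
```

Deciding `in_language(w)` for a given `w` is the membership problem for this language.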

13 Examples of Languages. Alphabet = English characters, language = English sentences. Alphabet = ASCII, language = C++ programs (likewise Java, C#).

14 Notation. Languages are sets of strings (finite sequences of characters). We need some notation for specifying which sets we want.

15 Regular Languages. Each regular expression is a notation for a regular language (a set of words). If A is a regular expression, we write L(A) to refer to the language denoted by A.

16 Regular Expression. A regular expression (RE) is defined inductively. Base cases: a, an ordinary character from Σ; ε, the empty string.

17 Regular Expression. R|S = either R or S. RS = R followed by S (concatenation). R* = concatenation of R zero or more times (R* = ε|R|RR|RRR...).

18 RE Extensions. R? = ε|R (zero or one R). R+ = RR* (one or more R). (R) = R (grouping).

19 RE Extensions. [abc] = a|b|c (any of the listed characters). [a-z] = a|b|...|z (range). [^ab] = c|d|... (anything but ‘a’ or ‘b’).
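The extension identities can be checked empirically on short strings. The sketch below assumes Python's `re` module as a stand-in for formal regular expressions (it is a superset of them) and a small hypothetical alphabet {a, b, c, d}; the helper `language` is an illustrative name. Agreement on strings up to length 3 is evidence, not a proof.

```python
import re
from itertools import product

def language(pattern, alphabet="abcd", max_len=3):
    """Set of strings over `alphabet`, up to max_len, that fully match `pattern`."""
    return {
        "".join(t)
        for n in range(max_len + 1)
        for t in product(alphabet, repeat=n)
        if re.fullmatch(pattern, "".join(t))
    }

# (extension, core form); ε is written as an empty alternative.
pairs = [
    ("a?", "(|a)"),      # R? = ε | R
    ("a+", "aa*"),       # R+ = RR*
    ("[abc]", "a|b|c"),  # any of the listed characters
    ("[a-c]", "a|b|c"),  # range
    ("[^ab]", "c|d"),    # anything but a or b, within the alphabet {a,b,c,d}
]
agree = [language(p) == language(q) for p, q in pairs]
print(agree)
```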

20 Regular Expression. RE and strings in L(R): a gives “a”; ab gives “ab”; a|b gives “a”, “b”; (ab)* gives “”, “ab”, “abab”, ...; (a|ε)b gives “ab”, “b”.
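The listed strings can be checked against each expression with Python's `re.fullmatch`, which succeeds only when the whole string matches. This is a sketch under one assumption: since `re` has no literal for ε, (a|ε)b is written as `(a|)b` with an empty alternative. The helper name `in_language` is illustrative.

```python
import re

# Each expression from the slide with the sample strings listed for it.
examples = {
    "a": ["a"],
    "ab": ["ab"],
    "a|b": ["a", "b"],
    "(ab)*": ["", "ab", "abab"],
    "(a|)b": ["ab", "b"],  # (a|ε)b, with ε as an empty alternative
}

def in_language(pattern, w):
    """True if the whole string w is in L(pattern)."""
    return re.fullmatch(pattern, w) is not None

all_match = all(in_language(p, w) for p, ws in examples.items() for w in ws)
print(all_match)
```

Strings outside the language are rejected, for example `in_language("(ab)*", "aba")` is false.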

21 Example: integers. integer: a non-empty string of digits. digit = ‘0’|‘1’|‘2’|‘3’|‘4’|‘5’|‘6’|‘7’|‘8’|‘9’. integer = digit digit*

22 Example: identifiers. identifier: a string of letters or digits starting with a letter. C identifier: [a-zA-Z_][a-zA-Z0-9_]*
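Both definitions translate almost literally into Python's `re` syntax. The sketch below is illustrative; the names `is_integer` and `is_c_identifier` are assumptions, and `fullmatch` is used so that the entire input must belong to the language.

```python
import re

DIGIT = "[0-9]"                            # digit = '0'|'1'|...|'9'
INTEGER = re.compile(DIGIT + DIGIT + "*")  # integer = digit digit*
C_IDENTIFIER = re.compile("[a-zA-Z_][a-zA-Z0-9_]*")

def is_integer(w):
    """True if w is a non-empty string of digits."""
    return INTEGER.fullmatch(w) is not None

def is_c_identifier(w):
    """True if w matches the C identifier expression from the slide."""
    return C_IDENTIFIER.fullmatch(w) is not None

print(is_integer("2015"), is_c_identifier("_tmp1"), is_c_identifier("1abc"))
```

The sketch only covers the lexical shape: it does not exclude C keywords such as `int`.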

23 Recap. L(R) is the language denoted by the regular expression R: the set of strings that R represents.

24 How to Use REs. We need a mechanism to determine whether an input string w belongs to L(R), the language denoted by regular expression R.

25 Acceptor. Such a mechanism is called an acceptor. Its inputs are a string w and a language L; it answers yes if w ∈ L and no if w ∉ L.

26 Finite Automata (FA). Specification: regular expressions. Implementation: finite automata.

27 Finite Automata. A finite automaton consists of: an input alphabet (Σ); a set of states; a start (initial) state; a set of transitions; a set of accepting (final) states.

28 Finite Automaton State Graphs A state The start state An accepting state

29 Finite Automaton State Graphs a A transition

30 Finite Automata. A finite automaton accepts a string if we can follow transitions labelled with the characters of the string from the start state to some accepting state.

31 FA Example. An FA that accepts only “1” (figure: a single transition labelled 1).

32 FA Example. An FA that accepts any number of 1’s followed by a single 0 (figure: transitions labelled 0 and 1).
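The two example automata can be written down with the five components listed earlier (alphabet, states, start state, transitions, accepting states). This is a minimal sketch: the state names A and B and the `FA` class are hypothetical, since the state graphs themselves did not survive in the transcript.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FA:
    alphabet: frozenset
    states: frozenset
    start: str
    transitions: dict    # (state, symbol) -> next state; a missing key means no move
    accepting: frozenset

def accepts(fa, w):
    """Follow one transition per input symbol, starting from the start state."""
    state = fa.start
    for ch in w:
        if (state, ch) not in fa.transitions:
            return False  # no transition for this symbol: reject
        state = fa.transitions[(state, ch)]
    return state in fa.accepting

# Accepts only "1".
only_one = FA(frozenset("1"), frozenset({"A", "B"}), "A",
              {("A", "1"): "B"}, frozenset({"B"}))

# Accepts any number of 1's followed by a single 0.
ones_then_zero = FA(frozenset("01"), frozenset({"A", "B"}), "A",
                    {("A", "1"): "A", ("A", "0"): "B"}, frozenset({"B"}))

print(accepts(only_one, "1"), accepts(ones_then_zero, "1110"))
```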

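33 FA Example. An FA that accepts ab*a. Alphabet: {a,b} (figure: transitions labelled a, b, a).

34 Table Encoding of FA. Transition table with one column per input symbol (a, b) and one row per state; the surviving row reads: state 0: a → 1, b → err.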
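A table-driven acceptor for the ab*a automaton might look like the sketch below. The state numbering (0 = start, 1 = after the first a, 2 = accepting) and the rows for states 1 and 2 are assumptions derived from the expression ab*a, not taken from the slide's table.

```python
ERR = None  # marks a missing transition (the "err" entry)

# One row per state, one column per input symbol.
TABLE = {
    0: {"a": 1, "b": ERR},
    1: {"a": 2, "b": 1},
    2: {"a": ERR, "b": ERR},
}
START, ACCEPTING = 0, {2}

def accepts(w):
    """Look up one table entry per input symbol; reject on the first error."""
    state = START
    for ch in w:
        state = TABLE[state].get(ch, ERR)  # symbols outside {a, b} are errors too
        if state is ERR:
            return False
    return state in ACCEPTING

print(accepts("abba"), accepts("ab"))
```

The loop does constant work per input character, which is why table-driven automata are a common way to implement scanners.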

35 RE → Finite Automata. Can we build a finite automaton for every regular expression? Yes: build the FA inductively, based on the definition of regular expressions.

36 NFA. A nondeterministic finite automaton (NFA) can have multiple transitions for one input in a given state, and can have ε-moves.

37 Epsilon Moves. With an ε-move the machine can move from state A to state B without consuming input (figure: A --ε--> B).

38 NFA. The operation of the automaton is not completely defined by the input. On input “11”, the automaton could be in either state (figure: states A, B, C).

39 Execution of FA. An NFA can choose whether to make ε-moves, and which of multiple transitions to take for a single input.

40 Acceptance of NFA. An NFA can get into multiple states. Rule: an NFA accepts if it can get into a final state (figure: states A, B, C; a transition labelled 0).
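One way to apply the acceptance rule is to track the set of all states the NFA could be in. The sketch below assumes ε-moves are keyed by the empty string and uses a hypothetical three-state NFA for (a|ε)b; it is not the A, B, C automaton from the slides, whose transitions are not recoverable from the transcript.

```python
EPS = ""  # assumption: ε-moves are keyed by the empty string

def eps_closure(states, delta):
    """All states reachable from `states` using only ε-moves."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def nfa_accepts(delta, start, accepting, w):
    """Accept if some sequence of choices ends in an accepting state."""
    current = eps_closure({start}, delta)
    for ch in w:
        moved = {t for s in current for t in delta.get((s, ch), ())}
        current = eps_closure(moved, delta)
    return bool(current & accepting)

# Hypothetical NFA for (a|ε)b: s0 --a--> s1, s0 --ε--> s1, s1 --b--> s2.
DELTA = {("s0", "a"): {"s1"}, ("s0", EPS): {"s1"}, ("s1", "b"): {"s2"}}

print(nfa_accepts(DELTA, "s0", {"s2"}, "ab"), nfa_accepts(DELTA, "s0", {"s2"}, "b"))
```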

41 DFA and NFA. Deterministic finite automata (DFA): one transition per input per state; no ε-moves.

42 Execution of FA. A DFA can take only one path through the state graph; the path is completely determined by the input.

43 NFA vs DFA. NFAs and DFAs recognize the same set of languages (the regular languages). DFAs are easier to implement: they are table driven.

44 NFA vs DFA. For a given language, the NFA can be simpler than the DFA. The DFA can be exponentially larger than the NFA.

45 NFA vs DFA. NFAs are the key to automating the RE → DFA construction.

46 RE → NFA Construction. Thompson’s construction (CACM 1968): build an NFA for each RE term; combine the NFAs with ε-moves.

47 RE → NFA Construction. Subset construction (NFA → DFA): build the simulation; minimize the number of states in the DFA (Hopcroft’s algorithm).
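The subset construction can be sketched as a worklist algorithm in which each DFA state is a set of NFA states. The code below is an illustration under stated assumptions: ε-moves are keyed by the empty string, the input is a hypothetical NFA for (a|ε)b, and the minimization step (Hopcroft's algorithm) is not shown.

```python
EPS = ""  # assumption: ε-moves are keyed by the empty string

def eps_closure(states, delta):
    """Frozen set of NFA states reachable from `states` by ε-moves alone."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(delta, start, accepting, alphabet):
    """Build a DFA whose states are sets of NFA states."""
    start_set = eps_closure({start}, delta)
    states, worklist, dfa_delta = {start_set}, [start_set], {}
    while worklist:
        current = worklist.pop()
        for ch in alphabet:
            moved = {t for s in current for t in delta.get((s, ch), ())}
            target = eps_closure(moved, delta)
            if not target:
                continue  # no move on ch: leave the entry undefined (error)
            dfa_delta[(current, ch)] = target
            if target not in states:
                states.add(target)
                worklist.append(target)
    dfa_accepting = {s for s in states if s & accepting}
    return start_set, states, dfa_delta, dfa_accepting

def dfa_accepts(start_set, dfa_delta, dfa_accepting, w):
    """Run the constructed DFA: exactly one path per input."""
    state = start_set
    for ch in w:
        state = dfa_delta.get((state, ch))
        if state is None:
            return False
    return state in dfa_accepting

# Hypothetical NFA for (a|ε)b.
NFA_DELTA = {("s0", "a"): {"s1"}, ("s0", EPS): {"s1"}, ("s1", "b"): {"s2"}}
dfa_start, dfa_states, dfa_delta, dfa_final = subset_construction(
    NFA_DELTA, "s0", {"s2"}, "ab")
print(len(dfa_states), dfa_accepts(dfa_start, dfa_delta, dfa_final, "ab"))
```

In the worst case the number of reachable subsets grows exponentially with the number of NFA states, which is the blow-up noted in the NFA vs DFA comparison.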

48 RE → NFA Construction. Key idea: an NFA pattern for each symbol and each operator; join them with ε-moves in precedence order.

49 RE → NFA Construction. NFA for a: s0 --a--> s1. NFA for ab: s0 --a--> s1 --ε--> s3 --b--> s4.

50 RE → NFA Construction. NFA for a: s0 --a--> s1.

51 RE → NFA Construction. NFA for a: s0 --a--> s1. NFA for b: s3 --b--> s4.

52 RE → NFA Construction. NFA for a: s0 --a--> s1. NFA for b: s3 --b--> s4. Copies of both are placed side by side, not yet joined.

53 RE → NFA Construction. NFA for a: s0 --a--> s1. NFA for b: s3 --b--> s4. NFA for ab: s0 --a--> s1 --ε--> s3 --b--> s4.

54 RE → NFA Construction. NFA for a | b: s0 --ε--> s1 --a--> s2 --ε--> s5 and s0 --ε--> s3 --b--> s4 --ε--> s5.

55 RE → NFA Construction. NFA for a: s1 --a--> s2.

56 RE → NFA Construction. NFA for a and b: s1 --a--> s2; s3 --b--> s4.

57 RE → NFA Construction. NFA for a | b: s0 --ε--> s1 --a--> s2 --ε--> s5 and s0 --ε--> s3 --b--> s4 --ε--> s5.

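58 RE → NFA Construction. NFA for a*: s0 --ε--> s1 --a--> s2 --ε--> s4, plus s0 --ε--> s4 (zero repetitions) and s2 --ε--> s1 (repeat).

59 RE → NFA Construction. NFA for a: s1 --a--> s2.

60 RE → NFA Construction. NFA for a*: s0 --ε--> s1 --a--> s2 --ε--> s4, plus s0 --ε--> s4 (zero repetitions) and s2 --ε--> s1 (repeat).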

61 Example RE → NFA. NFA for a ( b|c )* (figure: states s0 to s9, transitions labelled a, b and c, joined by ε-moves).

62 Example RE → NFA. Building the NFA for a ( b|c )*. NFA for a: s0 --a--> s1.

63 Example RE → NFA. NFAs for a, b and c: s0 --a--> s1; s4 --b--> s5; s6 --c--> s7.

64 Example RE → NFA. NFA for a (s0, s1) and NFA for b|c (figure: states s3 to s8, transitions labelled b and c, joined by ε-moves).

65 Example RE → NFA. NFA for a and NFA for ( b|c )* (figure: states s0 to s9, transitions labelled a, b and c, joined by ε-moves).

66 Example RE → NFA. NFA for a ( b|c )* (figure: states s0 to s9, transitions labelled a, b and c, joined by ε-moves).
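The construction for a ( b|c )* can be sketched with one small method per pattern (symbol, concatenation, alternation, star). This is an illustrative version of Thompson's construction, not the slides' exact diagram: the class name `Thompson`, the integer state ids and their numbering order are assumptions, and ε-moves are keyed by the empty string.

```python
EPS = ""  # assumption: ε-moves are keyed by the empty string

class Thompson:
    """Builds NFA fragments; each fragment is a (start, final) pair of state ids."""

    def __init__(self):
        self.n_states = 0
        self.delta = {}  # (state, symbol) -> set of successor states

    def _new(self):
        self.n_states += 1
        return self.n_states - 1

    def _edge(self, src, sym, dst):
        self.delta.setdefault((src, sym), set()).add(dst)

    def symbol(self, ch):
        s, f = self._new(), self._new()
        self._edge(s, ch, f)
        return s, f

    def concat(self, left, right):
        self._edge(left[1], EPS, right[0])  # join the two fragments with an ε-move
        return left[0], right[1]

    def union(self, left, right):
        s, f = self._new(), self._new()
        for frag in (left, right):
            self._edge(s, EPS, frag[0])
            self._edge(frag[1], EPS, f)
        return s, f

    def star(self, inner):
        s, f = self._new(), self._new()
        self._edge(s, EPS, inner[0])
        self._edge(inner[1], EPS, f)
        self._edge(s, EPS, f)                # zero repetitions
        self._edge(inner[1], EPS, inner[0])  # repeat
        return s, f

def accepts(delta, start, final, w):
    """Simulate the NFA by tracking the set of reachable states."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            for t in delta.get((stack.pop(), EPS), ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen
    current = closure({start})
    for ch in w:
        current = closure({t for s in current for t in delta.get((s, ch), ())})
    return final in current

t = Thompson()
start, final = t.concat(t.symbol("a"),
                        t.star(t.union(t.symbol("b"), t.symbol("c"))))
print(t.n_states, accepts(t.delta, start, final, "abcbc"))
```

The result has ten states, the same count as s0 to s9 in the example, although the numbering order here follows the order in which fragments are built.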