Approximate string matching using factor automata. Jan Holub and Borivoj Melichar, Theoretical Computer Science, vol. 249, pp. 305-311. Speaker: L. C. Chen. Advisor: R. C. T. Lee

Presentation transcript:

1 Approximate string matching using factor automata. Jan Holub and Borivoj Melichar, Theoretical Computer Science, vol. 249, pp. 305-311. Speaker: L. C. Chen. Advisor: R. C. T. Lee

2 Problem: The edit distance D_L(P, X) between strings P and X is the minimum number of edit operations (replace, insert, and delete) needed to convert P into X. Given a text T of length n, a pattern P of length m, and an integer k with k ≤ m ≤ n, approximate string matching asks whether T contains a substring X whose edit distance D_L(P, X) from P is at most k.
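The problem statement can be made concrete with a small sketch. The function names `edit_distance` and `occurs_approximately` are illustrative, not from the slides, and the second function is a brute-force check over all substrings rather than the automaton-based method the talk describes.

```python
def edit_distance(p: str, x: str) -> int:
    """Levenshtein distance D_L(p, x) via the standard dynamic programme."""
    # prev[j] = distance between the current prefix of p and x[:j]
    prev = list(range(len(x) + 1))
    for i, pc in enumerate(p, start=1):
        curr = [i]  # turning p[:i] into the empty string takes i deletions
        for j, xc in enumerate(x, start=1):
            curr.append(min(
                prev[j] + 1,               # delete pc
                curr[j - 1] + 1,           # insert xc
                prev[j - 1] + (pc != xc),  # match (cost 0) or replace (cost 1)
            ))
        prev = curr
    return prev[-1]


def occurs_approximately(t: str, p: str, k: int) -> bool:
    """Brute force: is there a non-empty substring X of t with D_L(p, X) <= k?"""
    n = len(t)
    return any(
        edit_distance(p, t[i:j]) <= k
        for i in range(n)
        for j in range(i + 1, n + 1)
    )
```

For example, `occurs_approximately("aabbabd", "bab", 1)` holds because the text contains `bab` itself.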

3 Basic definitions. Fac(T): the set of all substrings (factors) of text T. A nondeterministic finite automaton (NFA) is a five-tuple M = (Q, Σ, δ, q_0, F), where Q is a finite set of states, Σ is a finite input alphabet, δ is a mapping from Q × (Σ ∪ {ε}) into the set of subsets of Q, q_0 ∈ Q is the initial state, and F ⊆ Q is the set of final states. M(Fac(T)): a factor automaton, that is, an automaton that accepts Fac(T).
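A minimal sketch of the five-tuple as a data structure, with ε-transitions handled by an ε-closure. The class `NFA`, the marker `EPS`, and the helper `factor_nfa` are assumptions of this sketch. `factor_nfa` builds one simple nondeterministic automaton for Fac(T) (a chain of states with ε-moves from the initial state to every other state, all states final); it is not necessarily the construction used in the paper, and it also accepts the empty string.

```python
from dataclasses import dataclass

EPS = ""  # label used in this sketch for ε-transitions


@dataclass
class NFA:
    states: set
    alphabet: set
    delta: dict    # maps (state, symbol or EPS) -> set of states
    start: object
    finals: set

    def eps_closure(self, states):
        """All states reachable from `states` using ε-transitions only."""
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for r in self.delta.get((q, EPS), set()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    def accepts(self, word):
        current = self.eps_closure({self.start})
        for a in word:
            moved = set()
            for q in current:
                moved |= self.delta.get((q, a), set())
            current = self.eps_closure(moved)
        return bool(current & self.finals)


def factor_nfa(t):
    """Chain 0 -t[0]-> 1 -t[1]-> ... -> n, plus ε-moves from 0 to every state."""
    n = len(t)
    delta = {(i, t[i]): {i + 1} for i in range(n)}
    delta[(0, EPS)] = set(range(1, n + 1))
    all_states = set(range(n + 1))
    return NFA(all_states, set(t), delta, 0, all_states)
```

With `m = factor_nfa("aabbabd")`, the call `m.accepts("bbab")` succeeds, while `m.accepts("aba")` fails because `aba` is not a substring of the text.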

4 Factor automaton. T = aabbabd. Fac(T) = {a, b, d, aa, ab, bb, ba, bd, aab, abb, bba, bab, abd, aabb, abba, bbab, babd, aabba, abbab, bbabd, aabbab, abbabd, aabbabd}. The factor automaton M(Fac(T)) is a deterministic finite automaton (DFA) that accepts all substrings of the given text T.
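The set Fac(T) for this example can be reproduced by direct enumeration. This is only a check of the listed set, not a factor automaton; the function name `factors` is illustrative, and the empty string is left out to match the slide.

```python
def factors(t: str) -> set:
    """All non-empty substrings of t."""
    return {t[i:j] for i in range(len(t)) for j in range(i + 1, len(t) + 1)}


fac = factors("aabbabd")
# Print the factors ordered by length, then alphabetically.
print(sorted(fac, key=lambda s: (len(s), s)))
print(len(fac))  # 23 distinct non-empty factors
```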

5 A suffix tree can also be used to recognize all substrings of T = aabbabd. Fac(T) = {a, b, d, aa, ab, bb, ba, bd, aab, abb, bba, bab, abd, aabb, abba, bbab, babd, aabba, abbab, bbabd, aabbab, abbabd, aabbabd}.
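As a rough illustration of the idea, the sketch below uses an uncompressed suffix trie (nested dictionaries) instead of a true suffix tree. That simplification is an assumption of the sketch: it takes quadratic space, whereas a suffix tree compresses unary paths. A string is a factor of T exactly when it can be spelled out from the root.

```python
def build_suffix_trie(t: str) -> dict:
    """Insert every suffix of t into a trie of nested dicts."""
    root = {}
    for i in range(len(t)):
        node = root
        for ch in t[i:]:
            node = node.setdefault(ch, {})
    return root


def is_factor(trie: dict, x: str) -> bool:
    """x is a substring of the text iff it labels a path starting at the root."""
    node = trie
    for ch in x:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("aabbabd")
```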

6 P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P). L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}. One match, 0 errors.

7 P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P). L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}. Recognize ab.

8 P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P). L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}. Recognize aab.

9 P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P). L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}. Recognize bbab.
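The automaton itself is not preserved in the transcript, so the following is a sketch under assumptions. States are pairs (i, e), where i is the number of pattern symbols consumed and e the number of errors so far, and states with i = |P| are final. Match, replace, and delete (ε) transitions follow the usual Levenshtein NFA. Insert transitions are assumed to exist only strictly inside the pattern and only on a symbol different from the next pattern symbol. That last rule was chosen because, over the alphabet {a, b, d}, it reproduces exactly the 14 strings listed for P = bab, k = 1. The names `accepts_lk` and `language` are illustrative.

```python
from itertools import product


def accepts_lk(p: str, k: int, word: str) -> bool:
    """Simulate the assumed NFA for L_k(p) on `word`."""
    m = len(p)

    def closure(states):
        # Delete: ε-transition (i, e) -> (i + 1, e + 1) skips pattern symbol p[i].
        stack, seen = list(states), set(states)
        while stack:
            i, e = stack.pop()
            if i < m and e < k and (i + 1, e + 1) not in seen:
                seen.add((i + 1, e + 1))
                stack.append((i + 1, e + 1))
        return seen

    current = closure({(0, 0)})
    for c in word:
        nxt = set()
        for i, e in current:
            if i == m:
                continue                     # no transitions leave a final state
            if c == p[i]:
                nxt.add((i + 1, e))          # match
            elif e < k:
                nxt.add((i + 1, e + 1))      # replace p[i] by c
                if i > 0:
                    nxt.add((i, e + 1))      # insert c inside the pattern
        current = closure(nxt)
    return any(i == m for i, _ in current)


def language(p: str, k: int, alphabet: str) -> set:
    """All accepted words; none can be longer than len(p) + k."""
    words = (
        "".join(w)
        for n in range(len(p) + k + 1)
        for w in product(alphabet, repeat=n)
    )
    return {w for w in words if accepts_lk(p, k, w)}
```

Under these assumptions `language("bab", 1, "abd")` yields the 14 strings shown on the slides. A string such as `babb` is rejected even though its edit distance from `bab` is 1, because no transition leaves a final state.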

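10 Definition: Let M_1 and M_2 be finite automata. An automaton for the intersection of M_1 and M_2 is an automaton that accepts exactly the strings accepted by both M_1 and M_2.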
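The formal definition on this slide did not survive in the transcript. As a hedged illustration, here is the standard product construction for automata without ε-transitions: states are pairs, a transition exists only when both components can move on the same symbol, and a pair is final when both components are final. The tuple representation `(delta, start, finals)` and the two toy automata are assumptions of this sketch, not taken from the paper.

```python
def intersect(m1, m2):
    """Product automaton; each m is (delta, start, finals) with
    delta mapping (state, symbol) -> set of states, no ε-transitions."""
    delta1, s1, f1 = m1
    delta2, s2, f2 = m2
    start = (s1, s2)
    delta, finals = {}, set()
    todo, seen = [start], {start}
    while todo:                              # explore reachable pairs only
        q1, q2 = todo.pop()
        if q1 in f1 and q2 in f2:
            finals.add((q1, q2))
        symbols = ({a for (q, a) in delta1 if q == q1}
                   & {a for (q, a) in delta2 if q == q2})
        for a in symbols:
            targets = {(r1, r2)
                       for r1 in delta1[(q1, a)]
                       for r2 in delta2[(q2, a)]}
            delta[((q1, q2), a)] = targets
            for t in targets - seen:
                seen.add(t)
                todo.append(t)
    return delta, start, finals


def accepts(m, word):
    delta, start, finals = m
    current = {start}
    for a in word:
        current = set().union(*(delta.get((q, a), set()) for q in current))
    return bool(current & finals)


# Toy automata over {a, b}: M1 accepts words ending in b, M2 words of even length.
M1 = ({(0, "a"): {0}, (0, "b"): {1}, (1, "a"): {0}, (1, "b"): {1}}, 0, {1})
M2 = ({("e", "a"): {"o"}, ("e", "b"): {"o"},
       ("o", "a"): {"e"}, ("o", "b"): {"e"}}, "e", {"e"})
M = intersect(M1, M2)
```

`accepts(M, "ab")` holds (even length, ends in b), while `accepts(M, "b")` and `accepts(M, "abba")` do not.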

11 T = aabbabd, P = bab, k = 1. Intersection of M(L_k(P)) and M(Fac(T)). Solutions: {ba, bab, bb, bbab, aab, ab}. (All end in {3,0} or {3,1}.)
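The solution set can be cross-checked at the level of languages, without building either automaton: intersect the factors of T with the strings listed for L_k(P). The variable names are illustrative, and `LK_P` is copied from the slides rather than computed.

```python
T = "aabbabd"

# L_k(P) for P = bab, k = 1, as listed on the slides.
LK_P = {"ab", "bb", "ba", "aab", "bab", "dab", "bbb", "bdb",
        "baa", "bad", "bbab", "bdab", "baab", "badb"}

# Fac(T): all non-empty substrings of T.
fac = {T[i:j] for i in range(len(T)) for j in range(i + 1, len(T) + 1)}

solutions = fac & LK_P
print(sorted(solutions, key=lambda s: (len(s), s)))
# ['ab', 'ba', 'bb', 'aab', 'bab', 'bbab']
```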

12 T = aabbabd, P = bab, k = 1. Intersection of M(L_k(P)) and M(Fac(T)).

13 Intersection. T: aabbabd. P: bab. Delete!

14 Intersection. T: aabbabd. P: bab. Match!

15 Intersection. T: aabbabd. P: bab. Delete!

16 Intersection. T: aabbabd. P: bbab. Insert!

17 Intersection. T: aabbabd. P: b aab. Replace!

18 Intersection. T: aabbabd. P: bab. Delete!

19 Lemma The number of automaton is always lower than.

20 T = aabbabd, P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P). L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}.

21 Thank you!