Sandro Spina, John Abela, Department of CS & AI, University of Malta. Mutually compatible and incompatible merges for the search of the smallest consistent DFA.


Mutually compatible and incompatible merges for the search of the smallest consistent DFA
Sandro Spina, John Abela (Department of CS & AI, University of Malta)
Francois Coste (INRIA/IRISA, Campus de Beaulieu, Rennes Cedex, France)

Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI

Evidence Driven State Merging
• The motivation behind our work was to improve the (greedy) heuristic used by EDSM. Work was also carried out on diversifying the search strategy.
• EDSM [Price98] is very effective at inferring regular languages, except when the training data is sparse.
• According to [Price98], Abbadingo-style problems can be solved with high confidence (0.93) when the number of matched state labels is greater than 10.
• EDSM determines its merge sequence greedily, using a heuristic that compares language suffixes between two states in a DFA.
• Three complementary tracks:
 – Improve the heuristic score.
 – Improve the search strategy.
 – Combine the two.
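The suffix-comparison heuristic can be made concrete with a small sketch. The evidence score of a candidate merge counts matching accept/reject labels between the two subtrees being folded together; the state representation, dict layout, and function name below are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of the EDSM evidence score over a prefix-tree
# acceptor.  States are ints, `labels` maps a state to True (accept)
# or False (reject), and `delta` maps state -> {symbol: state}; all
# of these names are illustrative, not taken from the paper's code.

def edsm_score(labels, delta, q1, q2):
    """Count matched accept/reject labels when folding the subtree of
    q2 into that of q1; return None if the merge is invalid."""
    score = 0
    stack = [(q1, q2)]            # prefix tree: no cycle check needed
    while stack:
        a, b = stack.pop()
        la, lb = labels.get(a), labels.get(b)
        if la is not None and lb is not None:
            if la != lb:
                return None       # accept/reject clash: invalid merge
            score += 1            # one more matched state label
        for sym in set(delta.get(a, {})) & set(delta.get(b, {})):
            stack.append((delta[a][sym], delta[b][sym]))
    return score
```

EDSM then greedily executes the valid merge with the highest score; the [Price98] confidence result corresponds to scores greater than 10.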

Sharing Evidence
• Q: Whenever EDSM does not correctly infer the target language, can we (using a greedy depth-first search) improve the learner's merge path by gathering and combining information (state label matches) from multiple valid merges? Does the combination of their evidence scores yield valuable information? Can this information be used to guide the search?
• We think so! Some of the initial results are encouraging.
• Target size convergence improves drastically; the classification rate does not improve consistently.
• EDSM score → based on the analysis of a single merge.
• S-EDSM score → a combination of single-merge analyses.

Pairwise Compatible Merges
• Let M be the set of all possible merges.
• A merge of two states q1 and q2 is valid if all the states in the subtree of q1 are state-compatible with the corresponding states in the subtree of q2.
• Let M1, M2 ∈ M be two valid merges.
• We define the relation ↑ ⊆ M × M as follows: M1 ↑ M2 iff M2 remains a valid merge in the hypothesis obtained by applying M1.
• If M1 ↑ M2, we say that M1 is pairwise compatible with M2.
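The validity and pairwise-compatibility definitions can be sketched as follows. This is only an illustration under assumed data structures (a union-find over states, label and transition dicts); full folding of the merged subtrees into the hypothesis is elided.

```python
# A sketch of the pairwise-compatibility test M1 ↑ M2: apply M1 with a
# union-find over states, then re-check that M2 is still valid in the
# resulting hypothesis.  The layout and helper names are illustrative
# assumptions; full folding of merged subtrees is elided.

class Partition:
    """Union-find over states 0..n-1."""
    def __init__(self, n):
        self.p = list(range(n))
    def find(self, q):
        while self.p[q] != q:
            q = self.p[q]
        return q
    def union(self, a, b):
        self.p[self.find(b)] = self.find(a)

def valid(labels, delta, part, q1, q2):
    """Check merge validity in the quotient automaton induced by part."""
    stack, seen = [(part.find(q1), part.find(q2))], set()
    while stack:
        a, b = stack.pop()
        if a == b or (a, b) in seen:
            continue
        seen.add((a, b))
        la, lb = labels.get(a), labels.get(b)
        if la is not None and lb is not None and la != lb:
            return False                      # accept/reject clash
        for s in set(delta.get(a, {})) & set(delta.get(b, {})):
            stack.append((part.find(delta[a][s]), part.find(delta[b][s])))
    return True

def pairwise_compatible(labels, delta, n, m1, m2):
    part = Partition(n)
    part.union(*m1)                           # apply M1 (roots only)
    return valid(labels, delta, part, *m2)    # is M2 still valid?
```

Note how a merge that is valid on its own can become invalid once another merge has been applied, which is exactly what the relation ↑ captures.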

Pairwise Compatible Merges: a (Simple) Example

Mutually Compatible Merges
• Suppose M1, M2, M3 ∈ M, where M1 ↑ M2 and M2 ↑ M3.
• This does not necessarily imply M1 ↑ M3, because some states in M2 can be labelled differently by M1 and M3. Therefore ↑ is not transitive.
• To make ↑ transitive (the resulting relation is denoted ↕), M1 ↑ M3 needs to be checked as well in order to form the set {M1, M2, M3}.
• The cardinality of a set of mutually compatible merges could direct S-EDSM's heuristic score. This is currently not implemented.
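Since ↑ is not transitive, mutual compatibility of a set of merges has to be established by checking every pair, as a short sketch makes clear. Here `compat(mi, mj)` stands for the pairwise test Mi ↑ Mj and is an assumed helper.

```python
# Mutual compatibility requires every ordered pair of the set to pass
# the pairwise test, because ↑ is not transitive.  `compat` is an
# assumed pairwise-compatibility predicate.
from itertools import permutations

def mutually_compatible(merges, compat):
    return all(compat(a, b) for a, b in permutations(merges, 2))
```

The slide's (unimplemented) suggestion is that the cardinality of such mutually compatible sets could feed into S-EDSM's heuristic score.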

S-EDSM Algorithm

Initial Results
• Shared Evidence Driven State Merging (S-EDSM) implements only pairwise compatibility: for the top 30% of valid merges it builds classes M1 ↑ {M2, …, Mn}, recalculates the scores, and executes the best merge. Various strategies can be implemented.
• In terms of classification rate, we still do not consistently perform better than classic EDSM. S-EDSM better approximates the size of the target automaton; however, this improvement does NOT help on its own. It is only (possibly) an indication of a direction to follow.
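One S-EDSM step as described above can be sketched as follows. The exact rule for combining evidence across compatible merges is not specified on the slide, so the plain sum used here is an illustrative assumption, as are the function names.

```python
# A sketch of one S-EDSM step: keep the top 30% of valid merges by
# EDSM score, boost each one with the evidence of the merges it is
# pairwise compatible with, and execute the best.  The combination
# rule (a plain sum) is an illustrative assumption.

def s_edsm_step(merges, score, compatible):
    ranked = sorted(merges, key=score, reverse=True)
    top = ranked[:max(1, int(len(ranked) * 0.3))]   # top 30% of merges
    def shared(m):
        # own evidence plus that of compatible high-ranking merges
        return score(m) + sum(score(o) for o in top
                              if o != m and compatible(m, o))
    return max(top, key=shared)                     # merge to execute
```

This makes the contrast with EDSM explicit: EDSM would simply return `ranked[0]`, while S-EDSM may prefer a merge whose evidence is reinforced by the merges it keeps alive.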

Results II (400-State Target Size Convergence)
• This graph documents 10 consecutive problems downloaded from Gowachin. The training set consisted of 20,000 strings.

Results III (256-State Target Size Classification)

Pairwise Incompatible Merges for Search
[Figure: classical search tree over merge candidates m1, …, m5 ∈ M, each node branching on merging (=) or not merging (≠) the candidate.]

Pairwise Incompatible Merges for Search
[Figure: candidate limitation after a backtrack: after the no-merge (≠) branch of m1, candidates are restricted to m'3 ∈ M ∩ I(m1); after that of m2, to m'5 ∈ M ∩ I(m2).]

Pairwise Incompatible Merges for Search
• Rationale:
 – A merge m' ∈ I(m) may be tried after m.
 – Introduces diversity in the search.
• EDSM: I(m) may be computed [Coste & Fredouille, ICGI'00].
• S-EDSM: I(m) is available "for free".
• Significant improvement when applied to the first 3 choices.
• Best application of the scheme after the choice m'3 ∈ M ∩ I(m1)?
 – After merging m'3 (=)?
 – After not merging m'3 (≠)?
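The candidate-limitation scheme can be sketched as a simple filter applied after backtracking. `incompatible(m)` stands for the set I(m) and is an assumed helper; the function name and list-based representation are illustrative.

```python
# A sketch of candidate limitation after a backtrack: once the
# no-merge (≠) branch of one or more merges has been taken, only
# candidates incompatible with every rejected merge are kept, which
# introduces diversity in the search.  `incompatible(m)` returns the
# set I(m) (an assumed helper).

def limited_candidates(candidates, rejected, incompatible):
    return [m for m in candidates
            if all(m in incompatible(r) for r in rejected)]
```

With S-EDSM the sets I(m) fall out of the pairwise-compatibility bookkeeping, so this filter costs nothing extra, which is the "for free" remark above.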

Future Directions
• Develop a calculus to describe merge interactions. Implement all the relations and functions of the calculus (mutual compatibility, dominance, etc.) and analyse the results achieved with these different implementations.
• Combine the heuristic with better search strategies and study the best combination of the two. Introduce diversity in the exploration of the search space by limiting the choice of candidate merges after a backtrack.
• Noisy data! Can S-EDSM perform better by combining information across different merges? Perhaps, with the information gathered from merge interactions, S-EDSM can 'discover' noise in the training set.
• Ultimately, we want to see how far DFA learning can be pushed in terms of data sparseness.
• Thank you.