Theory Revision
Chris Murphy

The Problem
Sometimes we:
– Have theories for existing data that do not match new data
– Do not want to repeat learning every time we update the data
– Believe that our rule learners could perform much better if given basic theories to build on

Two Types of Errors in Theories: Over-generalization
– The theory covers negative examples
– Caused by incorrect rules in the theory, or by existing rules missing necessary constraints
– Example: uncle(A,B) :- brother(A,C).
– Solution: uncle(A,B) :- brother(A,C), parent(C,B).

Two Types of Errors in Theories: Over-specialization
– The theory does not cover all positive examples
– Caused by rules having additional, unnecessary constraints, or by rules missing from the theory that are needed to prove some examples
– Example: uncle(A,B) :- brother(A,C), mother(C,B).
– Solution: uncle(A,B) :- brother(A,C), parent(C,B).
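
Not from the slides, but a minimal Python sketch of what "covering a negative example" means for the uncle/2 rules above. The fact database, the constants, and the helper functions are invented purely for illustration.

```python
# Toy coverage check: the over-general rule fires for a negative example,
# while the corrected rule does not.

facts = {
    ("brother", "bob", "alice"),    # bob is a brother of alice
    ("parent",  "alice", "carol"),  # alice is a parent of carol
}

def covers_overgeneral(a, b):
    # uncle(A,B) :- brother(A,C).  B is unconstrained, so the rule fires
    # whenever A has any brother/2 fact at all.
    return any(p == "brother" and x == a for (p, x, y) in facts)

def covers_corrected(a, b):
    # uncle(A,B) :- brother(A,C), parent(C,B).
    return any(("brother", a, c) in facts and ("parent", c, b) in facts
               for (_, _, c) in facts)

# Positive example: uncle(bob, carol).  Negative example: uncle(bob, bob).
print(covers_overgeneral("bob", "carol"), covers_overgeneral("bob", "bob"))  # True True  (covers the negative)
print(covers_corrected("bob", "carol"),  covers_corrected("bob", "bob"))     # True False (error fixed)
```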

What is Theory Refinement?
"…learning systems that have a goal of making small changes to an original theory to account for new data."
A combination of two processes:
– Using a background theory to improve rule effectiveness and adequacy on the data
– Using problem-detection and correction processes to make small adjustments to those theories

Basic Issues Addressed
– Is there an error in the existing theory?
– What part of the theory is incorrect?
– What correction needs to be made?

Theory Refinement Basics
The system is given a beginning theory about the domain
– This theory can be incorrect or incomplete (and often is)
A well-refined theory will:
– Be accurate on new/updated data
– Make as few changes as possible to the original theory
Changes are monitored by a "distance metric" that keeps a count of every change made

The Distance Metric
Counts every addition, deletion, or replacement of a clause
Used to:
– Measure how syntactically corrupted the original theory is
– Determine how good a learning system is at replicating human-created theories
Drawback: it does not recognize equivalent literals such as less(X,Y) and greq(Y,X)
The table on the right shows example distances between theories, as well as their relationship to accuracy
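
As a rough illustration only (an assumption, not the metric used by any particular system below), a theory can be treated as a set of clause strings and the distance taken as the number of additions plus deletions; the clause strings and the function name are invented. As the slide notes, such a purely syntactic count cannot see that two differently written literals are logically equivalent.

```python
def theory_distance(original: set, revised: set) -> int:
    deleted = original - revised   # clauses removed from the original theory
    added   = revised - original   # clauses introduced by the revision
    return len(deleted) + len(added)

original = {
    "uncle(A,B) :- brother(A,C).",
    "aunt(A,B) :- sister(A,C), parent(C,B).",
}
revised = {
    "uncle(A,B) :- brother(A,C), parent(C,B).",  # a rewritten clause counts here as one deletion plus one addition
    "aunt(A,B) :- sister(A,C), parent(C,B).",
}
print(theory_distance(original, revised))  # 2
```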

Why Preserve the Original Theory?
– If you understood the original theory, you will likely understand the new one
– Similar theories will likely retain the ability to use abstract predicates from the original theory

Theory Refinement Systems
– EITHER
– FORTE
– AUDREY II
– KBANN
– FOCL, KR-FOCL, A-EBL, AUDREY, and more

EITHER (Explanation-based and Inductive Theory Extension and Revision)
– First system able to fix both over-generalization and over-specialization
– Able to correct multiple faults: uses one or more failings at a time to learn one or more corrections to a theory
– Able to correct intermediate points in theories
– Uses positive and negative examples
– Able to learn disjunctive rules
– The specialization algorithm does not allow positives to be eliminated
– The generalization algorithm does not allow negatives to be admitted

FORTE
– Attempts to prove all positive and negative examples using the current theory
– When errors are detected:
  – Identify all clauses that are candidates for revision
  – Determine whether each clause needs to be specialized or generalized
  – Determine which operators to test for the various revisions
– The best revision is chosen based on its accuracy when tested on the complete training set
– The process repeats until the system perfectly classifies the training set, or until FORTE finds that no revision improves the accuracy of the theory
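
A pseudocode-style Python sketch of the hill-climbing loop just described; it is an outline, not FORTE's implementation. `propose_revisions` and `accuracy` are hypothetical callables standing in for FORTE's revision-point analysis, its specialization/generalization operators, and its prover.

```python
def revise(theory, examples, propose_revisions, accuracy):
    best_acc = accuracy(theory, examples)
    while best_acc < 1.0:                    # stop once the training set is perfectly classified
        candidates = propose_revisions(theory, examples)         # candidate revised theories
        scored = [(accuracy(t, examples), t) for t in candidates]
        if not scored:
            break
        top_acc, top_theory = max(scored, key=lambda pair: pair[0])
        if top_acc <= best_acc:              # no revision improves accuracy: stop
            break
        theory, best_acc = top_theory, top_acc
    return theory
```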

Specializing a Theory
Needed when one or more negative examples are covered
Ways to fix the problem:
– Delete a clause: simple, just delete and retest
– Add new antecedents to an existing clause (more difficult); FORTE uses two methods:
  – Add one antecedent at a time, as in FOIL, choosing the antecedent that provides the best information gain at each point (see the sketch below)
  – Relational pathfinding: uses graph structures to find new relations in the data
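
A simplified sketch of FOIL-style greedy antecedent addition, an assumption about the general approach rather than FORTE's code. Candidate antecedents are represented as boolean functions over whole examples (real FOIL computes gain over variable bindings), and the `specialize`, `info`, and `candidates` names are invented.

```python
import math

def info(pos, neg):
    # Information content of the current coverage: -log2 P(positive).
    return -math.log2(pos / (pos + neg)) if pos else float("inf")

def specialize(clause_covers, candidates, positives, negatives):
    """clause_covers and each candidate are functions: example -> bool."""
    added = []

    def covered(ex):
        return clause_covers(ex) and all(lit(ex) for lit in added)

    while any(covered(ex) for ex in negatives):
        p0 = sum(covered(ex) for ex in positives)
        n0 = sum(covered(ex) for ex in negatives)
        best = None
        for name, lit in candidates.items():
            p1 = sum(covered(ex) and lit(ex) for ex in positives)
            n1 = sum(covered(ex) and lit(ex) for ex in negatives)
            if p1 == 0:
                continue                      # never eliminate all covered positives
            gain = p1 * (info(p0, n0) - info(p1, n1))   # FOIL-like information gain
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, name, lit)
        if best is None:
            break                             # no antecedent helps; give up on this clause
        added.append(best[2])
    return added
```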

Generalizing a Theory
Needed when positive examples are not covered
Ways FORTE generalizes (the first is sketched below):
– Delete antecedents from an existing clause, either singly or in groups
– Add a new clause:
  – Copy the clause identified at the revision point
  – Purposely over-generalize it
  – Send the over-general rule to the specialization algorithm
– Use the inverse-resolution operators "identification" and "absorption"
  – These use intermediate rules to provide more options for alternative definitions
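
An illustrative sketch of generalization by antecedent deletion, not FORTE's code: drop antecedents from a clause one at a time as long as doing so gains positive coverage without admitting any negatives (the slide also allows deleting groups, which this simplified version omits). The function name and the functional clause representation are assumptions.

```python
def generalize_by_deletion(antecedents, positives, negatives):
    def covers(lits, ex):
        return all(lit(ex) for lit in lits)

    current = list(antecedents)
    improved = True
    while improved:
        improved = False
        base_pos = sum(covers(current, ex) for ex in positives)
        for i in range(len(current)):
            trial = current[:i] + current[i + 1:]       # delete one antecedent
            new_pos = sum(covers(trial, ex) for ex in positives)
            new_neg = sum(covers(trial, ex) for ex in negatives)
            if new_pos > base_pos and new_neg == 0:     # more positives, still no negatives
                current = trial
                improved = True
                break                                   # restart the scan on the shorter clause
    return current
```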

AUDREY II
Runs in two main phases (a sketch of the overall flow follows the next slide):
– Phase 1: the initial domain theory is specialized to eliminate coverage of negative examples
  – At each step a "best" clause is chosen and specialized, and the process repeats
  – The best clause is the one that contributes to the most negative examples being incorrectly classified and is required by the fewest positives
  – If the best clause covers no positives it is deleted; otherwise, literals are added in a FOIL-like manner to eliminate the covered negatives

AUDREY II (continued)
– Phase 2: the revised theory is generalized to cover all positives, without covering any negatives
  – An uncovered positive example is chosen at random and the theory is generalized to cover it; the process repeats until all remaining positives are covered
  – If assumed literals can be removed without decreasing positive coverage, that is done
  – If not, AUDREY II tries replacing literals with a new conjunction of literals (also using a FOIL-type process)
  – If deletion and replacement both fail, the system uses a FOIL-like method to learn entirely new clauses for proving the literal
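
A high-level outline of the two phases described on these slides, written as a pseudocode-style Python sketch. Every `helpers.*` call is a hypothetical placeholder for a component of the real system; the generalization helpers are assumed to return a revised theory, or None when their strategy fails, with learning a new clause as the last resort.

```python
def audrey2_revise(theory, positives, negatives, helpers):
    # Phase 1: specialize until no negative example is covered.
    while helpers.covers_any(theory, negatives):
        clause = helpers.best_faulty_clause(theory, positives, negatives)
        if not helpers.covers_any([clause], positives):
            theory = helpers.delete_clause(theory, clause)                 # covers no positives: drop it
        else:
            theory = helpers.add_literals_foil(theory, clause, negatives)  # FOIL-like literal addition

    # Phase 2: generalize until every positive example is covered.
    while True:
        uncovered = [ex for ex in positives if not helpers.proves(theory, ex)]
        if not uncovered:
            break
        ex = uncovered[0]                                                  # pick an uncovered positive
        theory = (helpers.remove_assumed_literals(theory, ex)
                  or helpers.replace_literals_foil(theory, ex)
                  or helpers.learn_new_clause_foil(theory, ex))            # last resort: learn new clauses
    return theory
```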

KBANN
A system that takes a domain theory of Prolog-style clauses and transforms it into a knowledge-based neural network (KNN)
– Uses the knowledge base (background theory) to determine the topology and initial weights of the KNN
– Different units and links within the KNN correspond to different components of the domain theory
– The topologies of KNNs can differ from the topologies we have seen in ordinary neural networks
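
A minimal sketch of the rules-to-network mapping, as a simplified reconstruction rather than the published algorithm: each rule becomes a unit whose incoming links carry weight +w for its antecedents, with a threshold set so the unit fires only when all antecedents hold (AND), while several rules with the same head feed an OR unit with a low threshold. The toy "cup" rules, the weight value, and the function names are assumptions; the real system uses sigmoid units and then trains the network on examples, as the next slide notes.

```python
W = 4.0  # assumed link weight for antecedents

def and_unit(inputs):            # one conjunctive rule body
    total = sum(W * x for x in inputs)
    return 1.0 if total > (len(inputs) - 0.5) * W else 0.0

def or_unit(inputs):             # several rules with the same head
    total = sum(W * x for x in inputs)
    return 1.0 if total > 0.5 * W else 0.0

# Toy domain theory:  cup :- liftable, holds_liquid.
#                     liftable :- light, has_handle.
#                     liftable :- light, styrofoam.
def network(light, has_handle, styrofoam, holds_liquid):
    liftable = or_unit([and_unit([light, has_handle]),
                        and_unit([light, styrofoam])])
    return and_unit([liftable, holds_liquid])

print(network(1, 1, 0, 1))  # 1.0: provable from the rules
print(network(1, 0, 0, 1))  # 0.0: not provable
```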

KBANN (continued)
– KNNs are trained on example data, and rules are extracted using an "N of M" method, which saves time (a sketch of the idea follows below)
– Domain theories given to KBANN need not contain all of the intermediate terms necessary to learn certain concepts
  – Adding hidden units alongside the units specified by the domain theory allows the network to induce necessary terms not stated in the background knowledge
– Problems arise when interpreting intermediate rules learned from hidden nodes
  – It is difficult to label them based on the inputs they resulted from
  – In one case, the programmers labeled rules based on the section of inputs they were attached to in that topology
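
A sketch of the idea behind "N of M" rule extraction from a trained unit, simplified from the actual procedure and using an invented function name and threshold for "significant" weights: when the significant incoming weights are roughly equal, the unit behaves like "fire when at least N of these M antecedents are true", and N can be read off the unit's threshold.

```python
import math

def n_of_m(weights, threshold):
    significant = [w for w in weights if w > 0.5]       # drop near-zero links (assumed cutoff)
    avg = sum(significant) / len(significant)
    n = math.floor(threshold / avg) + 1                 # smallest count that clears the threshold
    return n, len(significant)

# A unit with five similar positive weights and threshold 10 acts as "3 of 5".
print(n_of_m([3.9, 4.1, 4.0, 3.8, 4.2, 0.1], threshold=10.0))  # (3, 5)
```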

System Comparison
AUDREY II is better than FOCL at theory revision, but it still has room for improvement
– Its revised theories are closer to both the original theory and the human-created correct theory

System Comparison
– AUDREY II is slightly more accurate than FORTE, and its revised theories are closer to both the original and the correct theories
– KR-FOCL addresses some issues of the other systems by allowing the user to decide among changes that have the same accuracy

Applications of Theory Refinement
– Used to identify different parts of both DNA and RNA sequences
– Used to debug basic Prolog programs written by students
– Used to maintain working theories as new data is obtained