1 Theory Revision Chris Murphy

2 The Problem
Sometimes we:
– Have theories for existing data that do not match new data
– Do not want to repeat learning every time we update data
– Believe that our rule learners could perform much better if given basic theories to build on

3 Two Types of Errors in Theories
Over-generalization
– Theory covers negative examples
– Caused by incorrect rules in the theory or by existing rules missing necessary constraints
– Example: uncle(A,B) :- brother(A,C).
– Solution: uncle(A,B) :- brother(A,C), parent(C,B).

4 Two Types of Errors in Theories
Over-specialization
– Theory does not cover all positive examples
– Caused by rules having additional, unnecessary constraints, or by rules missing from the theory that are needed to prove some examples
– Example: uncle(A,B) :- brother(A,C), mother(C,B).
– Solution: uncle(A,B) :- brother(A,C), parent(C,B).
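The two error types can be checked mechanically against labeled examples. The sketch below is illustrative only: the family facts, the example pairs, and the function names are invented here, and the three Python functions are hand translations of the uncle clauses from the slides.

```python
# Hypothetical family facts, invented for illustration.
brother = {("bob", "carol"), ("bob", "dan")}  # brother(A, C)
mother = {("carol", "eve")}                   # mother(C, B)
father = {("dan", "frank")}                   # father(C, B)
parent = mother | father                      # parent(C, B)
people = {"bob", "carol", "dan", "eve", "frank"}

positives = [("bob", "eve"), ("bob", "frank")]  # true uncle pairs
negatives = [("bob", "carol"), ("eve", "bob")]  # pairs that are not uncle pairs


def uncle_overgeneral(a, b):
    # uncle(A,B) :- brother(A,C).   B is never constrained.
    return any((a, c) in brother for c in people)


def uncle_overspecial(a, b):
    # uncle(A,B) :- brother(A,C), mother(C,B).   Fathers are ignored.
    return any((a, c) in brother and (c, b) in mother for c in people)


def uncle_correct(a, b):
    # uncle(A,B) :- brother(A,C), parent(C,B).
    return any((a, c) in brother and (c, b) in parent for c in people)


def errors(theory):
    """Return (negatives wrongly covered, positives wrongly missed)."""
    covered_negatives = [e for e in negatives if theory(*e)]
    missed_positives = [e for e in positives if not theory(*e)]
    return covered_negatives, missed_positives


for theory in (uncle_overgeneral, uncle_overspecial, uncle_correct):
    print(theory.__name__, errors(theory))
```

A covered negative signals over-generalization, and a missed positive signals over-specialization.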

5 What is Theory Refinement?
“…learning systems that have a goal of making small changes to an original theory to account for new data.”
Combination of two processes:
– Using a background theory to improve rule effectiveness and adequacy on data
– Using problem detection and correction processes to make small adjustments to said theories

6 Basic Issues Addressed
– Is there an error in the existing theory?
– What part of the theory is incorrect?
– What correction needs to be made?

7 Theory Refinement Basics
System is given a beginning theory about the domain
– Can be incorrect or incomplete (and often is)
A well-refined theory will:
– Be accurate with new/updated data
– Make as few changes as possible to the original theory
Changes are monitored by a “Distance Metric” that keeps a count of every change made

8 The Distance Metric
Counts every addition, deletion, or replacement of clauses
Used to:
– Measure the syntactic corruptness of the original theory
– Determine how good a learning system is at replicating human-created theories
Drawback: it does not recognize equivalent literals such as less(X,Y) and greq(Y,X)
Table on the right shows examples of distance between theories, as well as its relationship to accuracy
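A minimal sketch of such a count, under assumptions that the slides do not state: a theory is a list of (head, body) clauses, and a removed clause and an added clause that share a head are paired up as one replacement. The function name `theory_distance` and the example theories are hypothetical.

```python
from collections import Counter


def theory_distance(original, revised):
    """Count clause additions, deletions, and replacements between theories."""
    orig, rev = set(original), set(revised)
    removed = orig - rev
    added = rev - orig
    # Assumption: one removed and one added clause with the same head
    # count together as a single replacement.
    removed_heads = Counter(head for head, _ in removed)
    added_heads = Counter(head for head, _ in added)
    replacements = sum((removed_heads & added_heads).values())
    deletions = len(removed) - replacements
    additions = len(added) - replacements
    return {
        "additions": additions,
        "deletions": deletions,
        "replacements": replacements,
        "distance": additions + deletions + replacements,
    }


original = [("uncle(A,B)", ("brother(A,C)", "mother(C,B)"))]
revised = [
    ("uncle(A,B)", ("brother(A,C)", "parent(C,B)")),
    ("parent(A,B)", ("mother(A,B)",)),
]
result = theory_distance(original, revised)
print(result)
```

Because the comparison is purely syntactic, two clauses that differ only by logically equivalent literals would still be counted as a change, which is the drawback noted on the slide.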

9 Why Preserve the Original Theory?
– If you understood the original theory, you’ll likely understand the new one
– Similar theories will likely retain the ability to use abstract predicates from the original theory

10 Theory Refinement Systems
– EITHER
– FORTE
– AUDREY II
– KBANN
– FOCL, KR-FOCL, A-EBL, AUDREY, and more

11 EITHER
Explanation-based and Inductive Theory Extension and Revision
– First system with the ability to fix both over-generalization and over-specialization
– Able to correct multiple faults
– Uses one or more failings at a time to learn one or more corrections to a theory
– Able to correct intermediate points in theories
– Uses positive and negative examples
– Able to learn disjunctive rules
– Specialization algorithm does not allow positives to be eliminated
– Generalization algorithm does not allow negatives to be admitted

12 FORTE
Attempts to prove all positive and negative examples using the current theory
When errors are detected:
– Identify all clauses that are candidates for revision
– Determine whether each clause needs to be specialized or generalized
– Determine what operators to test for various revisions
Best revision is determined based on its accuracy when tested on the complete training set
Process repeats until the system perfectly classifies the training set or until FORTE finds that no revision improves the accuracy of the theory
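The loop described above is a hill-climbing search, which can be sketched as follows. This is not FORTE itself: FORTE works on first-order clauses, while this sketch assumes a propositional stand-in in which a clause is a set of required features and an example is the set of features that hold. All names and data are hypothetical.

```python
def covers(theory, example):
    # A theory covers an example if any clause's antecedents all hold.
    return any(clause <= example for clause in theory)


def accuracy(theory, positives, negatives):
    correct = sum(covers(theory, e) for e in positives)
    correct += sum(not covers(theory, e) for e in negatives)
    return correct / (len(positives) + len(negatives))


def candidate_revisions(theory, features):
    for clause in theory:
        rest = theory - {clause}
        yield rest                          # delete the clause
        for f in features - clause:
            yield rest | {clause | {f}}     # specialize: add an antecedent
        for f in clause:
            yield rest | {clause - {f}}     # generalize: drop an antecedent


def revise(theory, positives, negatives, features):
    best, best_acc = theory, accuracy(theory, positives, negatives)
    while best_acc < 1.0:
        scored = [(accuracy(t, positives, negatives), t)
                  for t in candidate_revisions(best, features)]
        if not scored:
            break
        acc, candidate = max(scored, key=lambda s: s[0])
        if acc <= best_acc:                 # no revision improves accuracy
            break
        best, best_acc = candidate, acc
    return best, best_acc


features = frozenset({"brother_AC", "parent_CB", "female_C"})
positives = [frozenset({"brother_AC", "parent_CB"}),
             frozenset({"brother_AC", "parent_CB", "female_C"})]
negatives = [frozenset({"brother_AC"}),
             frozenset({"brother_AC", "female_C"})]
initial = frozenset({frozenset({"brother_AC"})})   # over-general start

revised, revised_acc = revise(initial, positives, negatives, features)
print(sorted(sorted(c) for c in revised), revised_acc)
```

Each candidate is scored on the complete training set, and the search stops when the set is classified perfectly or when no candidate beats the current theory.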

13 Specializing a Theory
Needs to happen when one or more negatives are covered
Ways to fix the problem:
– Delete a clause: simple, just delete and retest
– Add new antecedents to an existing clause: more difficult; FORTE uses two methods
  – Add one antecedent at a time, like FOIL, choosing the antecedent that provides the best information gain at any point
  – Relational Pathfinding: uses graph structures to find new relations in data
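The information gain used when adding one antecedent at a time can be sketched with the standard FOIL gain formula, gain = t * (log2(p1/(p1+n1)) - log2(p0/(p0+n0))), where p0 and n0 are the positives and negatives covered before adding the antecedent and p1 and n1 are those covered after. The sketch assumes the simplification t = p1 (each positive counted once), and the candidate counts are invented for illustration.

```python
import math


def foil_gain(p0, n0, p1, n1):
    """FOIL-style gain for adding one antecedent to a clause.

    p0, n0: positives and negatives covered before the addition.
    p1, n1: positives and negatives covered after the addition.
    Assumes t = p1, i.e. each covered positive is counted once.
    """
    if p1 == 0:
        return 0.0  # convention used here: no positives left, no gain
    before = math.log2(p0 / (p0 + n0))
    after = math.log2(p1 / (p1 + n1))
    return p1 * (after - before)


# Hypothetical candidates: (positives, negatives) still covered after adding.
candidates = {
    "parent(C,B)": (4, 0),
    "mother(C,B)": (2, 0),
    "person(B)": (4, 4),
}
gains = {lit: foil_gain(4, 4, p1, n1) for lit, (p1, n1) in candidates.items()}
best_literal = max(gains, key=gains.get)
print(gains, best_literal)
```

The antecedent that removes all negatives while keeping every positive scores highest, so it is the one a greedy specializer would add first.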

14 Generalizing a Theory
Need to generalize when positives are not covered
Ways FORTE generalizes:
– Delete antecedents from an existing clause (either singly or in groups)
– Add a new clause
  – Copy the clause identified at the revision point
  – Purposely over-generalize
  – Send the over-general rule to the specialization algorithm
– Use inverse resolution operators “identification” and “absorption”
  – These use intermediate rules to provide more options for alternative definitions

15 AUDREY II
Runs in two main phases:
– Initial domain theory is specialized to eliminate negative coverage
  – At each step, a best clause is chosen, it is specialized, and the process repeats
  – Best clause is the one that contributes to the most negative examples being incorrectly classified and is required by the fewest positives
  – If the best clause covers no positives, it is deleted; otherwise, literals are added in a FOIL-like manner to eliminate covered negatives

16 AUDREY II
– Revised theory is generalized to cover all positives (without covering any negatives)
  – An uncovered positive example is randomly chosen, and the theory is generalized to cover the example
  – Process repeats until all remaining positives are covered
  – If assumed literals can be removed without decreasing positive coverage, that is done
  – If not, AUDREY II tries replacing literals with a new conjunction of literals (also uses a FOIL-type process)
  – If deletion and replacement fail, the system uses a FOIL-like method of determining entirely new clauses for proving the literal

17 KBANN
System that takes a domain theory of Prolog-style clauses and transforms it into a knowledge-based neural network (KNN)
– Uses the knowledge base (background theory) to determine the topology and initial weights of the KNN
– Different units and links within the KNN correspond to various components of the domain theory
– Topologies of KNNs can be different from the topologies that we have seen in neural networks
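How a clause can fix a unit's initial weights is sketched below. The slides do not give the numbers, so the scheme here is an assumption based on how KBANN is commonly described: each positive antecedent gets a link of weight ω, each negated antecedent gets -ω, and a conjunctive rule with P positive antecedents gets a threshold of (P - 0.5)ω. The value ω = 4 and all function names are illustrative.

```python
import math

OMEGA = 4.0  # assumed link weight; not specified in the slides


def rule_to_unit(positive, negated=()):
    """Translate one conjunctive rule into (weights, threshold)."""
    weights = {a: OMEGA for a in positive}
    weights.update({a: -OMEGA for a in negated})
    # The unit should fire only when every positive antecedent is on
    # and no negated antecedent is on.
    threshold = (len(positive) - 0.5) * OMEGA
    return weights, threshold


def activate(unit, inputs):
    weights, threshold = unit
    net = sum(w * inputs.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-(net - threshold)))  # sigmoid unit


# uncle :- brother, parent.   (propositional stand-in for the slide's clause)
uncle_unit = rule_to_unit(["brother", "parent"])
both = activate(uncle_unit, {"brother": 1.0, "parent": 1.0})
only_one = activate(uncle_unit, {"brother": 1.0, "parent": 0.0})
print(round(both, 3), round(only_one, 3))
```

With these initial values the unit already behaves like the rule before any training, and training on examples can then adjust the weights to correct the theory.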

18 KBANN
KNNs are trained on example data, and rules are extracted using an N-of-M method (saves time)
Domain theories for KBANN need not contain all intermediate theories necessary to learn certain concepts
– Adding hidden units along with the units specified by the domain theory allows the network to induce necessary terms not stated in the background info
Problems arise when interpreting intermediate rules learned from hidden nodes
– Difficult to label them based on the inputs they resulted from
– In one case, programmers labeled rules based on the section of info that they were attached to in that topology
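An N-of-M rule fires when at least N of its M antecedents are true. The sketch below shows only the rule form, not KBANN's extraction procedure, and the names are hypothetical. It also checks that a single 2-of-3 rule is equivalent to three separate conjunctive rules, which is one reason the form is compact.

```python
from itertools import product


def n_of_m(n, antecedents, example):
    """True if at least n of the listed antecedents hold in the example."""
    return sum(1 for a in antecedents if example.get(a, False)) >= n


def as_conjunctions(example):
    # The same concept written as three ordinary conjunctive rules.
    a, b, c = example["a"], example["b"], example["c"]
    return (a and b) or (a and c) or (b and c)


# Compare the two forms on every truth assignment of a, b, c.
agreement = all(
    n_of_m(2, ["a", "b", "c"], dict(zip("abc", values)))
    == as_conjunctions(dict(zip("abc", values)))
    for values in product([False, True], repeat=3)
)
print(agreement)
```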

19 System Comparison
AUDREY II is better than FOCL at theory revision, but it still has room for improvement
– Its revised theories are closer to both the original theory and the human-created correct theory

20 System Comparison
– AUDREY II is slightly more accurate than FORTE, and its revised theories are closer to the original and correct theories
– KR-FOCL addresses some issues of other systems by allowing the user to decide among changes that have the same accuracy

21 Applications of Theory Refinement
– Used to identify different parts of both DNA and RNA sequences
– Used to debug student-written basic Prolog programs
– Used to maintain working theories as new data is obtained

