For Monday
Finish chapter 19
No homework

Program 4
Any questions?

Exam 2
Review questions?

Different Ways of Incorporating Knowledge in Learning
Explanation-Based Learning (EBL)
Theory Revision (or Theory Refinement)
Knowledge-Based Inductive Learning (in first-order logic: Inductive Logic Programming, ILP)

Explanation-Based Learning
Requires two inputs:
–Labeled examples (maybe very few)
–Domain theory
Goal:
–To produce operational rules that are consistent with both the examples and the theory
–Classical EBL requires that the theory entail the resulting rules

Why Do EBL?
Often utilitarian or speed-up learning
Example: DOLPHIN
–Uses EBL to improve planning
–Both speed-up learning and improving plan quality

Theory Refinement
Inputs are the same as EBL:
–Theory
–Examples
Goal:
–Fix the theory so that it agrees with the examples
The theory may be incomplete or wrong

Why Do Theory Refinement?
Potentially more accurate than induction alone
Able to learn from fewer examples
May influence the structure of the theory to make it more comprehensible to experts

How Is Theory Refinement Done?
Initial state: the initial theory
Goal state: a theory that fits the training data
Operators: atomic changes to the syntax of a theory:
–Delete rule or antecedent, add rule or antecedent
–Increase parameter, decrease parameter
–Delete node or link, add node or link
Path cost: number of changes made, or total cost of changes made

Theory Refinement as Heuristic Search
Finding the "closest" theory that is consistent with the data is generally intractable (NP-hard).
Complete consistency with the training data is not always desirable, particularly if the data is noisy.
Therefore, most methods employ some form of greedy or hill-climbing search.
Most also employ some form of overfitting avoidance to keep the learned revision from becoming overly complex.
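The greedy search just described can be sketched generically. Here `neighbors` and `error` are hypothetical plug-ins standing in for the edit operators and the data-fit measure; this is an illustration of the search shape, not any particular revision system:

```python
def hill_climb_revision(theory, neighbors, error):
    """Greedy hill-climbing revision: repeatedly take the single edit
    that most reduces error; stop at a local optimum.

    neighbors(t) yields theories one syntactic edit away; error(t)
    scores disagreement with the training data. Both are assumed
    interfaces supplied by the caller.
    """
    best_score = error(theory)
    while True:
        candidates = list(neighbors(theory))
        if not candidates:
            return theory
        candidate = min(candidates, key=error)
        if error(candidate) >= best_score:
            return theory          # local optimum reached
        theory, best_score = candidate, error(candidate)
```

For instance, with "theories" encoded as integers, neighbors as ±1 edits, and error as distance to a target value, the search walks step by step to the target and stops.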

Theory Refinement as Bias
The bias is to learn a theory that is syntactically similar to the initial theory.
Distance can be measured as the number of edit operations needed to revise the theory (edit distance).
This assumes the syntax of the initial theory is "approximately correct."
A bias for minimal semantic revision would simply memorize the exceptions to the theory, which is undesirable with respect to generalizing to novel data.

Inductive Logic Programming
Representation is Horn clauses
Builds rules using background predicates
Rules are potentially much more expressive than attribute-value representations

Example Results
Rules for family relations, from data of primitive or related predicates:
uncle(A,B) :- brother(A,C), parent(C,B).
uncle(A,B) :- husband(A,C), sister(C,D), parent(D,B).
Recursive list programs:
member(X, [X|Y]).
member(X, [Y|Z]) :- member(X, Z).
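The recursive member/2 program corresponds to this Python sketch, with list indexing standing in for unification against [Head|Tail]:

```python
def member(x, lst):
    """member(X, [X|Y]).  member(X, [Y|Z]) :- member(X, Z)."""
    if not lst:
        return False            # no clause matches the empty list
    if lst[0] == x:
        return True             # first clause: head of the list matches
    return member(x, lst[1:])   # second clause: recurse on the tail
```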

ILP
Goal is to induce a Horn-clause definition for some target predicate P, given definitions of background predicates Qi.
Find a syntactically simple definition D for P such that, given the background predicate definitions B:
–For every positive example pi: D ∪ B ⊨ pi
–For every negative example ni: D ∪ B ⊭ ni
Background definitions are provided either:
–Extensionally: a list of ground tuples satisfying the predicate.
–Intensionally: a Prolog definition of the predicate.

Sequential Covering Algorithm
Let P be the set of positive examples.
Until P is empty do:
–Learn a rule R that covers a large number of positives without covering any negatives.
–Add R to the list of learned rules.
–Remove the positives covered by R from P.
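The covering loop can be written down directly. Here `learn_one_rule` is a plug-in for the single-rule learner (FOIL's clause search would fill that role); the threshold learner below is a toy stand-in of my own for demonstration:

```python
def sequential_covering(positives, negatives, learn_one_rule):
    """Greedy covering: learn rules until every positive is covered.

    Assumes learn_one_rule(P, N) returns a predicate (a callable) that
    covers at least one example of P and no example of N.
    """
    remaining = set(positives)
    rules = []
    while remaining:
        rule = learn_one_rule(remaining, negatives)
        rules.append(rule)
        remaining = {p for p in remaining if not rule(p)}
    return rules

def threshold_rule(P, N):
    """Toy single-rule learner over numbers: threshold away from the
    negatives on whichever side still holds uncovered positives."""
    lo, hi = min(N), max(N)
    if any(p < lo for p in P):
        return lambda x: x < lo
    return lambda x: x > hi
```

On positives {1, 2, 8, 9} with negatives {4, 5}, the first pass learns "x < 4" (covering 1 and 2) and the second learns "x > 5" (covering 8 and 9), so two rules suffice.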

This is just an instance of the greedy algorithm for minimum set covering; it does not guarantee that a minimum number of rules is learned, but it tends to learn a reasonably small rule set.
Minimum set covering is an NP-hard problem, and the greedy algorithm is a common approximation algorithm.
There are several ways to learn a single rule, used in various methods.

Strategies for Learning a Single Rule
Top-Down (General to Specific):
–Start with the most general (empty) rule.
–Repeatedly add feature constraints that eliminate negatives while retaining positives.
–Stop when only positives are covered.
Bottom-Up (Specific to General):
–Start with a most specific rule (a complete description of a single instance).
–Repeatedly eliminate feature constraints in order to cover more positive examples.
–Stop when further generalization results in covering negatives.
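A minimal sketch of the top-down strategy over attribute-value data; the dataset shape and the greedy (positives kept, negatives kept) scoring are my own illustration, not FOIL's first-order search:

```python
def top_down_rule(examples, labels, attributes):
    """General-to-specific search: start with the empty rule, greedily
    add the attribute=value test that keeps the most positives and the
    fewest negatives, and stop once no covered negatives remain."""
    covered = list(range(len(examples)))
    rule = {}                                 # conjunction of tests
    while any(not labels[i] for i in covered):
        best = None
        for a in attributes:
            if a in rule:                     # each attribute tested once
                continue
            for v in {examples[i][a] for i in covered}:
                keep = [i for i in covered if examples[i][a] == v]
                score = (sum(labels[i] for i in keep),
                         -sum(not labels[i] for i in keep))
                if best is None or score > best[0]:
                    best = (score, a, v)
        if best is None:
            break                             # no test left to add
        _, a, v = best
        rule[a] = v
        covered = [i for i in covered if examples[i][a] == v]
    return rule
```

On a three-example toy set where both sunny examples are positive, the search stops after a single specialization, returning the rule sky=sunny.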

FOIL
Basic top-down sequential covering algorithm adapted for Prolog clauses.
Background is provided extensionally.
Initialize the clause for target predicate P to P(X1,...,Xr) :- .
Possible specializations of a clause include adding all possible literals:
–Qi(V1,...,Vr)
–not(Qi(V1,...,Vr))
–Xi = Xj
–not(Xi = Xj)
where the X's are variables in the existing clause, at least one of V1,...,Vr is an existing variable, and the others can be new.
Recursive literals are allowed if they do not cause infinite regress.
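The literal-generation rule can be sketched as an enumeration. The predicate list and fresh-variable naming are illustrative, and the equality literals (Xi = Xj) are omitted here for brevity:

```python
from itertools import product

def candidate_literals(clause_vars, predicates, max_new=1):
    """Enumerate FOIL-style specializations: for each background
    predicate Q of the given arity, every argument tuple drawn from the
    existing clause variables plus up to max_new fresh variables, with
    at least one existing variable required; each literal is also
    produced in negated form."""
    pool = list(clause_vars) + [f"V{i}" for i in range(max_new)]
    for name, arity in predicates:
        for args in product(pool, repeat=arity):
            if any(a in clause_vars for a in args):   # reuse a variable
                yield (name, args)
                yield ("not", (name, args))
```

With clause variables X, Y and one binary predicate edge/2, the pool {X, Y, V0} gives 9 argument pairs, one of which (V0, V0) is rejected, leaving 8 literals plus their negations.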

FOIL Input Data
Consider the example of finding a path in a directed acyclic graph.
Intended clauses:
path(X,Y) :- edge(X,Y).
path(X,Y) :- edge(X,Z), path(Z,Y).
Examples:
edge: {…}
path: {…}
Negative examples of the target predicate can be provided directly, or produced indirectly using a closed-world assumption: every pair not among the positive tuples for path.

Example Induction
+ : {…}
- : {…}
Start with the empty rule: path(X,Y) :- .
Among others, consider adding the literal edge(X,Y) (also consider edge(Y,X), edge(X,Z), edge(Z,X), path(Y,X), path(X,Z), path(Z,X), X=Y, and negations).
It covers 6 positive tuples and NO negative tuples.
Create the "base case" and remove the covered examples:
path(X,Y) :- edge(X,Y).

+ : {…}
- : {…}
Start with a new empty rule: path(X,Y) :- .
Consider the literal edge(X,Z) (among others).
All 4 remaining positives satisfy it, but so do 10 of the 20 negatives.
Current rule: path(X,Y) :- edge(X,Z).
Consider the literal path(Z,Y) (as well as edge(X,Y), edge(Y,Z), edge(X,Z), path(Z,X), etc.).
No negatives are covered, completing the clause:
path(X,Y) :- edge(X,Z), path(Z,Y).
The new clause actually covers all remaining positive tuples of path, so the definition is complete.
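The slide's ground tuples did not survive transcription, so here is the same check run on a small hypothetical DAG of my own: the learned two-clause definition should cover exactly the transitive closure of edge/2 and none of the closed-world negatives:

```python
from itertools import product

edges = {(1, 2), (2, 3), (3, 4)}          # hypothetical DAG, not the slide's
nodes = {n for e in edges for n in e}

def closure(edges):
    """Transitive closure of edge/2: the true extension of path/2."""
    paths = set(edges)
    while True:
        new = {(x, w) for (x, y) in paths for (y2, w) in edges if y == y2}
        if new <= paths:
            return paths
        paths |= new

path_pos = closure(edges)
path_neg = set(product(nodes, nodes)) - path_pos   # closed-world negatives

def path(x, y):
    """path(X,Y) :- edge(X,Y).  path(X,Y) :- edge(X,Z), path(Z,Y)."""
    if (x, y) in edges:                                     # base clause
        return True
    return any(path(z, y) for (x2, z) in edges if x2 == x)  # recursive clause
```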

Picking the Best Literal
Based on information gain (similar to ID3):
gain = p · (log2(p / (p + n)) − log2(P / (P + N)))
–P is the number of positives before adding literal L
–N is the number of negatives before adding literal L
–p is the number of positives after adding literal L
–n is the number of negatives after adding literal L
Given n predicates of arity m, there are O(n·2^m) possible literals to choose from, so the branching factor can be quite large.
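The gain criterion as a small function (variable names follow the slide; real FOIL counts variable bindings rather than examples, so treat this as a sketch). The test numbers come from the edge(X,Z) step of the path example: 4 of 4 positives and 10 of 20 negatives remain covered:

```python
from math import log2

def foil_gain(P, N, p, n):
    """FOIL information gain of adding literal L to a clause.

    P, N: positives/negatives covered before adding L;
    p, n: positives/negatives covered after adding L.
    gain = p * (log2(p/(p+n)) - log2(P/(P+N)))
    """
    if p == 0:
        return 0.0             # literal keeps no positives: no gain
    return p * (log2(p / (p + n)) - log2(P / (P + N)))
```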

Other Approaches
Golem
CHILL
FOIDL
BUFOIDL

Domains
Any kind of concept learning where background knowledge is useful:
Natural language processing
Planning
Chemistry and biology
–DNA
–Protein structure