CSCI 5582 Artificial Intelligence
Fall 2006, Lecture 21
Jim Martin

Today 11/14
Review
Hypothesis Learning
–Version Spaces
Break
Relational Learning
–ILP

Review
Supervised machine learning
–Naïve Bayes
–Decision trees
–Decision lists
–Ensembles

Classifiers
These all provide a way to separate objects into classes based on intrinsic features of the object (encoded as sets of feature/value pairs).
They don't necessarily provide a definition for the concept learned.
They can't deal with relational data.

Classifiers
Uncle?

Concept Learning
In concept learning we'd like to learn something akin to a definition: necessary and sufficient conditions for membership in a category
–Rules out all non-members
–Includes all members
And we'd like to be able to deal with relational data.
Assume we're given positive and negative examples of the concept to be learned.

Data Mining
The field of data mining is concerned with the extraction of possibly useful rules or patterns from large amounts of data.

Concept Learning
And, most importantly, the concept to be learned is expressed in terms of predicates/propositions that are already known (that is, we have a domain theory of some kind).

Basics
In the context of concept learning, a hypothesis is just a theory of the concept that
–Includes all members of the category
–Excludes all non-members
A false negative is a positive example that the hypothesis rejects.
A false positive is a negative example that the hypothesis accepts.

Concept Learning: Search
Again, it's just search. We're searching through the space of possible hypotheses to find one (all?) that does exactly what we want: cover all and only the instances of the concept we're trying to learn.

Current Best Hypothesis
Maintain a single hypothesis at a time. Perform surgery on it on demand.
If I give it a positive example and it covers it, or a negative example and it rejects it, the hypothesis is consistent and nothing needs to change.

Current Best Hypothesis
If I give it a positive example and it rejects it…
–False negative. Adjust the theory so that
1. It now covers the new positive example
2. And it still rejects all of the negative examples seen so far

Current Best Hypothesis
If I give it a negative example and it accepts it…
–False positive. Adjust the theory so that
1. It now rejects the new negative example
2. And it still covers all of the positive examples seen so far

CBH
How?
–Depends on the language being used. But the critical notion to exploit is generalization/specialization.

CBH
If you need to cover a falsely rejected positive example…
–You need to generalize your hypothesis
If you need to reject a falsely accepted negative example…
–You need to specialize your hypothesis

How…
Depends on the language… typically
–Dropping/adding conditions for membership
–Adding/removing disjuncts from a definition

So…
Search is just specializing/generalizing a single hypothesis in response to each successive training example, until you cover all and only the positive examples. But…
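To make that loop concrete, here is a minimal sketch of current-best-hypothesis learning for the simplest language the slides mention: a conjunction of attribute/value conditions, generalized by dropping conditions and specialized by adding them. The representation and helper names are illustrative assumptions, not anything from the lecture.

    # CBH sketch: a hypothesis is a dict of required attribute/value pairs;
    # a missing attribute means "don't care".

    def covers(h, example):
        """True if every condition in hypothesis h holds of the example."""
        return all(example.get(a) == v for a, v in h.items())

    def generalize(h, pos):
        """Falsely rejected positive: drop every condition it violates."""
        return {a: v for a, v in h.items() if pos.get(a) == v}

    def specialize(h, neg, pos_seen):
        """Falsely accepted negative: add one condition that rejects it
        while still covering every positive seen so far."""
        for a, v in pos_seen[0].items():
            if neg.get(a) != v and all(p.get(a) == v for p in pos_seen):
                new_h = dict(h)
                new_h[a] = v
                return new_h
        raise ValueError("no consistent specialization: backtracking needed")

    def cbh(labeled_examples):
        h, pos_seen = None, []
        for ex, label in labeled_examples:
            if label:
                pos_seen.append(ex)
            if h is None:
                if label:
                    h = dict(ex)      # seed with the first positive (most specific)
                continue              # with no hypothesis yet, negatives are rejected
            if label and not covers(h, ex):
                h = generalize(h, ex)               # false negative
            elif not label and covers(h, ex):
                h = specialize(h, ex, pos_seen)     # false positive
        return h

Note the ValueError: when no single added condition works, CBH has to backtrack over its earlier choices, which is exactly the problem the next slide raises.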

But
The backtracking inherent in CBH is pretty horrible. It turns out it isn't really required; CBH is making commitments early on that it really doesn't have to make.
–Exploit the hierarchy inherent in the logical structure of the hypotheses

Version Space Learning
You can represent the space of hypotheses by representing certain boundaries, without representing the hypotheses themselves.
In response to false positives and false negatives you simply adjust the boundaries.
At the end, the hypotheses within the boundaries are all consistent with the training data.

Version Space Learning
I give you a single positive training example…
–What's the most general theory you can come up with?
–What's the most specific theory you can come up with?
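The slides leave these as prompts. In the classic candidate-elimination formulation (conjunctive hypotheses over fixed attributes, with '?' meaning don't-care), the most specific theory is the example itself and the most general theory accepts everything. A minimal sketch, with an illustrative tuple representation:

    # Version space boundaries, assuming hypotheses are tuples of attribute
    # values where '?' means "any value is acceptable".

    def init_boundaries(first_positive):
        """S is the example itself (most specific); G is all don't-cares."""
        s = tuple(first_positive)               # e.g. ('red', 'round', 'small')
        g = tuple('?' for _ in first_positive)  # ('?', '?', '?')
        return s, g

    def covers(h, example):
        return all(c == '?' or c == e for c, e in zip(h, example))

    def generalize_s(s, pos):
        """Minimal generalization of S on a false negative: replace each
        mismatching value with '?'."""
        return tuple(c if c == e else '?' for c, e in zip(s, pos))

Later examples push S upward (generalizing) and G downward (specializing) until the two boundaries delimit exactly the consistent hypotheses.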

VS

Version Space Learning
Termination…
–When I run out of training examples
–When the VS collapses: there are no theories left in the space

Break
Look at
–holmes.txt and tarzan.txt in
–

Break
I'll go over the quiz topics Thursday.

Relational Learning and Inductive Logic Programming
Fixed feature vectors are a very limited representation of objects.
Examples or the target concept may require a relational representation that includes multiple entities with relationships among them.
First-order predicate logic is a more powerful representation for handling such relational descriptions.

ILP Example
Learn definitions of family relationships given data for primitive types and relations.
brother(A,C), parent(C,B) -> uncle(A,B)
husband(A,C), sister(C,D), parent(D,B) -> uncle(A,B)
Given the relevant predicates and a database populated with positive and negative examples.
By database I mean sets of tuples for each of the relevant relations.
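To make "sets of tuples for each of the relevant relations" concrete, here is a tiny sketch with an invented two-tuple database and a direct check of the first uncle rule; all of the names are hypothetical.

    # Hypothetical extensional database: each relation is a set of tuples.
    brother = {("tom", "mary")}   # tom is mary's brother
    parent  = {("mary", "ann")}   # mary is ann's parent

    def uncle(a, b):
        """First rule from the slide: brother(A,C), parent(C,B) -> uncle(A,B)."""
        return any((a, c) in brother and (c, b) in parent
                   for c in {x for _, x in brother})

    print(uncle("tom", "ann"))    # True: tom is ann's uncle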

FOIL
First-Order Inductive Learner.
Top-down sequential covering algorithm to learn first-order theories.
Background knowledge is provided extensionally (i.e., a model).
Start with the most general rule possible: (True -> P(x)).
Specialize it on demand…
Specializations of a clause include adding all possible literals, one at a time, to the antecedent…
–A -> P
–B -> P
–C -> P…
where A, B, and C are predicates already in the domain theory.
We're working top-down from the most general hypothesis, so what's driving things?

FOIL
At a high level:
–Start with the most general H
–Repeatedly construct clauses that cover a subset of the positive examples and none of the negative examples
–Then remove the covered positive examples
–Construct another clause
–Repeat until all the positive examples are covered
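A minimal sketch of that outer, sequential-covering loop. Here grow_clause stands in for the literal-by-literal specialization walked through on the following slides, and representing clauses as callables over argument tuples is an illustrative choice, not FOIL's actual bookkeeping.

    def foil(pos, neg, grow_clause):
        """Return a list of clauses that together cover every positive
        example and none of the negatives. pos/neg are sets of argument
        tuples for the target predicate; grow_clause(pos, neg) is assumed
        to return a callable clause covering some positives and no negatives."""
        theory, uncovered = [], set(pos)
        while uncovered:
            clause = grow_clause(uncovered, neg)   # inner loop: add literals
            theory.append(clause)
            uncovered = {t for t in uncovered if not clause(t)}
        return theory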

FOIL Training Data
Background knowledge consists of a complete set of tuples for each background predicate for this universe.
Example: Consider learning a definition for the target predicate path, for finding a path in a directed acyclic graph.
path(X,Y) :- edge(X,Y).
path(X,Y) :- edge(X,Z), path(Z,Y).
edge: {…} (6 tuples)
path: {…} (10 tuples)

FOIL Negative Training Data
Negative examples of the target predicate can be provided directly, or generated indirectly by making a closed world assumption:
–Every pair of constants not in the positive tuples for the path predicate
Negative path tuples: {…} (26 tuples)
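The actual edge and path tuples did not survive transcription. The sketch below regenerates data of the right shape from a hypothetical 6-node DAG; the edge set is an assumption, chosen so that the counts match the slides (6 edge tuples, 10 positive path tuples, 26 closed-world negatives). Positives are the transitive closure of edge, and negatives come from the closed world assumption.

    from itertools import product

    # Hypothetical DAG (an assumption, not recovered from the slides).
    edge = {(1, 2), (1, 3), (3, 6), (4, 2), (4, 6), (6, 5)}
    nodes = {n for pair in edge for n in pair}

    # Positive path tuples: transitive closure of edge.
    path = set(edge)
    changed = True
    while changed:
        changed = False
        for (x, z), (z2, y) in product(edge, path):
            if z == z2 and (x, y) not in path:
                path.add((x, y))
                changed = True

    # Negatives via the closed world assumption: every pair of
    # constants not in the positive path tuples.
    neg_path = set(product(nodes, nodes)) - path

    print(len(edge), len(path), len(neg_path))   # 6 10 26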

Sample FOIL Induction
Pos: {…} (the 10 positive path tuples)
Neg: {…} (the 26 negative tuples)
Start with clause: path(X,Y) :- .
Possible literals to add:
edge(X,X), edge(Y,Y), edge(X,Y), edge(Y,X), edge(X,Z), edge(Y,Z), edge(Z,X), edge(Z,Y), path(X,X), path(Y,Y), path(X,Y), path(Y,X), path(X,Z), path(Y,Z), path(Z,X), path(Z,Y), X=Y, plus negations of all of these.

Sample FOIL Induction
Pos: {…}  Neg: {…}
Test: path(X,Y) :- edge(X,X).
Covers 0 positive examples
Covers 6 negative examples
Not a good literal to try.

Sample FOIL Induction
Pos: {…}  Neg: {…}
Test: path(X,Y) :- edge(X,Y).
Covers 6 positive examples
Covers 0 negative examples
Chosen as best literal. Result is the base clause.

Sample FOIL Induction
Pos: {…} (4 tuples remain)  Neg: {…}
Test: path(X,Y) :- edge(X,Y).
Covers 6 positive examples
Covers 0 negative examples
Chosen as best literal. Result is the base clause.
Remove the covered positive tuples.

Sample FOIL Induction
Pos: {…}  Neg: {…}
Start a new clause: path(X,Y) :- .

Sample FOIL Induction
Pos: {…}  Neg: {…}
Test: path(X,Y) :- edge(X,Y).
Covers 0 positive examples
Covers 0 negative examples
Not a good literal.

Sample FOIL Induction
Pos: {…}  Neg: {…}
Test: path(X,Y) :- edge(X,Z).
Covers all 4 positive examples
Covers 14 of 26 negative examples
Eventually chosen as best possible literal.
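The slides never say how "best" is scored. In Quinlan's FOIL the standard choice is an information-gain measure over variable bindings: Gain(L, R) = t * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0))), where p0 and n0 count the positive and negative bindings covered by the clause R before adding literal L, p1 and n1 count them afterward, and t is the number of positive bindings that remain covered. A literal like edge(X,Z) scores well here because it keeps every positive example while eliminating roughly half of the negatives.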

Sample FOIL Induction
Pos: {…}  Neg: {…}
Clause so far: path(X,Y) :- edge(X,Z).
Remove the negative tuples that are no longer covered.
Expand the remaining tuples to account for the possible values of Z.

Sample FOIL Induction
Pos: {…}  Neg: {…}
Continue specializing the clause: path(X,Y) :- edge(X,Z).

Sample FOIL Induction
Pos: {…}  Neg: {…}
Test: path(X,Y) :- edge(X,Z), edge(Z,Y).
Covers 3 positive examples
Covers 0 negative examples

Sample FOIL Induction
Pos: {…}  Neg: {…}
Test: path(X,Y) :- edge(X,Z), path(Z,Y).
Covers 4 positive examples
Covers 0 negative examples
Eventually chosen as best literal; completes the clause.
The definition is complete, since all of the original positive tuples are covered (by way of covering some expanded tuple).
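Assembling the two learned clauses and checking them against the hypothetical graph data from the earlier sketch (again assuming that invented edge set, plus the path and neg_path sets computed there):

    # The learned two-clause definition, as a recursive Python predicate.
    def learned_path(x, y, edge):
        # path(X,Y) :- edge(X,Y).
        if (x, y) in edge:
            return True
        # path(X,Y) :- edge(X,Z), path(Z,Y).  (terminates: the graph is a DAG)
        return any((x, z) in edge and learned_path(z, y, edge)
                   for z in {b for _, b in edge})

    # Covers every positive tuple and no negative tuple.
    assert all(learned_path(x, y, edge) for x, y in path)
    assert not any(learned_path(x, y, edge) for x, y in neg_path)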

More Realistic Applications
Classifying chemical compounds as mutagenic (cancer causing) based on their graphical molecular structure and chemical background knowledge.
Classifying web documents based on both the content of the page and its links to and from other pages with particular content.
–A web page is a university faculty home page if:
It contains the words “Professor” and “University”, and
It is pointed to by a page with the word “faculty”, and
It points to a page with the words “course” and “exam”.

Rule Learning and ILP Summary
There are effective methods for learning symbolic rules from data using greedy sequential covering and top-down or bottom-up search.
These methods have been extended to first-order logic to learn relational rules and recursive Prolog programs.
Knowledge represented by rules is generally more interpretable by people, allowing human insight into what is learned, and possible human approval and correction of learned knowledge.