Classification Rule Learning (concluded) and Association Rule Learning
Lecture 34 of 42
CIS 732: Machine Learning and Pattern Recognition
Kansas State University, Department of Computing and Information Sciences
Thursday, 12 April 2007
William H. Hsu, Department of Computing and Information Sciences, KSU
Readings: Chapter 10, Mitchell; association rule references

Lecture Outline
– Readings: Chapter 10, Mitchell
– Suggested Exercises: 10.1, 10.2 (Mitchell)
– Sequential Covering Algorithms
  – Learning single rules by search
    – Beam search
    – Alternative covering methods
  – Learning rule sets
– First-Order Rules
  – Learning single first-order rules
  – FOIL: learning first-order rule sets

Lecture Outline
– Readings: Chapter 10, Mitchell; Section 21.4, Russell and Norvig
– Suggested Exercises: 10.5, Mitchell
– Induction as Inverse of Deduction
  – Problem of inductive learning revisited
  – Operators for automated deductive inference
    – Resolution rule for deduction
    – First-order predicate calculus (FOPC) and resolution theorem proving
  – Inverting resolution
    – Propositional case
    – First-order case
– Inductive Logic Programming (ILP)
  – Cigol
  – Progol

Sequential Covering Algorithm (CN2): Review
Algorithm Sequential-Covering (Target-Attribute, Attributes, D, Threshold)
– Learned-Rules ← {}
– New-Rule ← Learn-One-Rule (Target-Attribute, Attributes, D)
– WHILE Performance (New-Rule, D) > Threshold DO
  – Learned-Rules += New-Rule                  // add new rule to set
  – D.Remove-Covered-By (New-Rule)             // remove examples covered by New-Rule
  – New-Rule ← Learn-One-Rule (Target-Attribute, Attributes, D)
– Sort-By-Performance (Learned-Rules, Target-Attribute, D)
– RETURN Learned-Rules
What Does Sequential-Covering Do?
– Learns one rule, New-Rule
– Removes every example in D to which New-Rule applies (every covered example)
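The loop above maps almost line-for-line onto code. Below is a minimal Python sketch; the learn_one_rule and performance callables and the rule.covers method are assumed interfaces for illustration, not part of the original slides.

```python
def sequential_covering(learn_one_rule, performance, examples, threshold):
    """Greedy rule-set learner: learn one rule at a time, then drop the examples it covers."""
    learned_rules = []
    remaining = list(examples)
    new_rule = learn_one_rule(remaining)
    while new_rule is not None and performance(new_rule, remaining) > threshold:
        learned_rules.append(new_rule)                                   # add new rule to set
        remaining = [x for x in remaining if not new_rule.covers(x)]     # remove covered examples
        if not remaining:
            break
        new_rule = learn_one_rule(remaining)
    # sort the final rule set by performance on the full data set
    learned_rules.sort(key=lambda r: performance(r, examples), reverse=True)
    return learned_rules
```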

Learn-One-Rule via (Beam) Search: Review
General-to-specific search over candidate rules (most general rule at the root):
– IF {} THEN Play-Tennis = Yes
  – IF {Humidity = Normal} THEN Play-Tennis = Yes
    – IF {Humidity = Normal, Outlook = Sunny} THEN Play-Tennis = Yes
    – IF {Humidity = Normal, Wind = Strong} THEN Play-Tennis = Yes
    – IF {Humidity = Normal, Wind = Light} THEN Play-Tennis = Yes
    – IF {Humidity = Normal, Outlook = Rain} THEN Play-Tennis = Yes
  – IF {Wind = Strong} THEN Play-Tennis = No
  – IF {Wind = Light} THEN Play-Tennis = Yes
  – IF {Humidity = High} THEN Play-Tennis = No
  – …

Learn-One-Rule Algorithm: Review
Algorithm Learn-One-Rule (Target-Attribute, Attributes, D)
– New-Rule ← most general rule possible
– New-Rule-Neg ← Neg (the negative examples in D)
– WHILE NOT New-Rule-Neg.Empty() DO          // specialize New-Rule
  1. Candidate-Literals ← Generate-Candidates()          // NB: rank by Performance()
  2. Best-Literal ← argmax L ∈ Candidate-Literals Performance (Specialize-Rule (New-Rule, L), Target-Attribute, D)          // all possible new constraints
  3. New-Rule.Add-Precondition (Best-Literal)          // add the best one
  4. New-Rule-Neg ← New-Rule-Neg.Filter-By (New-Rule)
– RETURN (New-Rule)
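A greedy (beam width 1) Python sketch of this specialization loop, assuming propositional examples represented as attribute-to-value dictionaries and a caller-supplied performance function; all names here are illustrative, not the course's own code.

```python
def learn_one_rule(examples, target_attribute, target_value, performance):
    """Greedily specialize the most general rule (empty precondition set) by adding
    the attribute test that scores best, until no covered example is negative."""
    rule = {}                      # preconditions: attribute -> required value
    covered = list(examples)

    def covers(r, x):
        return all(x[a] == v for a, v in r.items())

    while any(x[target_attribute] != target_value for x in covered):
        # candidate specializations: attribute-value pairs seen in still-covered examples
        candidates = {(a, x[a]) for x in covered for a in x
                      if a != target_attribute and a not in rule}
        if not candidates:
            break
        best = max(candidates,
                   key=lambda av: performance(dict(rule, **{av[0]: av[1]}), examples))
        rule[best[0]] = best[1]                       # add the best precondition
        covered = [x for x in covered if covers(rule, x)]
    return rule
```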

Learn-One-Rule: Subtle Issues
How Does Learn-One-Rule Implement Search?
– Effective approach: Learn-One-Rule organizes H in the same general fashion as ID3
– Differences
  – Follows only the most promising branch in the tree at each step
  – Adds only one attribute-value pair (versus splitting on all possible values)
– General-to-specific search (depicted in the figure)
  – Problem: greedy depth-first search is susceptible to local optima
  – Solution approach: beam search (rank by performance, always expand the k best)
– Easily generalizes to multi-valued target functions (how?)
Designing an Evaluation Function to Guide Search
– Performance (Rule, Target-Attribute, D)
– Possible choices
  – Entropy (i.e., information gain), as in ID3
  – Sample accuracy: n_c / n ≡ correct rule predictions / total predictions
  – m-estimate: (n_c + mp) / (n + m), where m ≡ weight and p ≡ prior of the rule RHS
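As a quick worked example of the m-estimate, here is a one-function Python sketch; the numbers in the call are illustrative only, not data from the slides.

```python
def m_estimate(n_c, n, m, p):
    """m-estimate of rule accuracy: smooths raw accuracy n_c/n toward the prior p,
    as if m additional 'virtual' examples with accuracy p had been observed."""
    return (n_c + m * p) / (n + m)

# Illustrative only: a rule correct on 6 of the 8 examples it covers,
# smoothed toward a 0.5 prior with m = 2 virtual examples.
print(m_estimate(6, 8, 2, 0.5))   # 0.7, versus raw accuracy 6/8 = 0.75
```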

Variants of Rule Learning Programs
Sequential or Simultaneous Covering of Data?
– Sequential: isolate components of the hypothesis (e.g., search for one rule at a time)
– Simultaneous: whole hypothesis at once (e.g., search for a whole tree at a time)
General-to-Specific or Specific-to-General?
– General-to-specific: add preconditions, Find-G
– Specific-to-general: drop preconditions, Find-S
Generate-and-Test or Example-Driven?
– Generate-and-test: search through syntactically legal hypotheses
– Example-driven: Find-S, Candidate-Elimination, Cigol (next time)
Post-Pruning of Rules?
– Recall (Lectures 8-9 of 42): very popular overfitting recovery method
What Statistical Evaluation Method?
– Entropy
– Sample accuracy (aka relative frequency)
– m-estimate of accuracy

First-Order Rules
What Are First-Order Rules?
– Well-formed formulas (WFFs) of first-order predicate calculus (FOPC)
– Sentences of first-order logic (FOL)
– Example (recursive)
  – Ancestor (x, y) ← Parent (x, y).
  – Ancestor (x, y) ← Parent (x, z) ∧ Ancestor (z, y).
Components of FOPC Formulas: Quick Intro to Terminology
– Constants: e.g., John, Kansas, 42
– Variables: e.g., Name, State, x
– Predicates: e.g., Father-Of, Greater-Than
– Functions: e.g., age, cosine
– Term: constant, variable, or function(term)
– Literals (atoms): Predicate(term) or its negation (e.g., ¬Greater-Than (age(John), 42))
– Clause: disjunction of literals with implicit universal quantification
– Horn clause: at most one positive literal (H ∨ ¬L1 ∨ ¬L2 ∨ … ∨ ¬Ln)

Learning First-Order Rules
Why Do That?
– Can learn sets of rules such as
  – Ancestor (x, y) ← Parent (x, y).
  – Ancestor (x, y) ← Parent (x, z) ∧ Ancestor (z, y).
– General-purpose (Turing-complete) programming language: PROLOG
  – Programs are such sets of rules (Horn clauses)
  – Inductive logic programming (next time): a kind of program synthesis
Caveat
– Arbitrary inference using first-order rules is semi-decidable
  – Recursively enumerable but not recursive (reduction from the halting problem L_H)
  – Compare: resolution theorem proving; arbitrary queries in Prolog
– Generally, may have to restrict power
  – Inferential completeness
  – Expressive power of Horn clauses
  – Learning part

First-Order Rule: Example
Prolog (FOPC) Rule for Classifying Web Pages
– [Slattery, 1997]
– Course (A) ← Has-Word (A, “instructor”), not Has-Word (A, “good”), Link-From (A, B), Has-Word (B, “assign”), not Link-From (B, C).
– Train: 31/31, test: 31/34
How Are Such Rules Used?
– Implement search-based (inferential) programs
– References
  – Chapters 1-10, Russell and Norvig
  – Online resources

First-Order Inductive Learning (FOIL): Algorithm
Algorithm FOIL (Target-Predicate, Predicates, D)
– Pos ← D.Filter-By (Target-Predicate)               // examples for which it is true
– Neg ← D.Filter-By (Not (Target-Predicate))         // examples for which it is false
– WHILE NOT Pos.Empty() DO                            // learn a new rule
  – New-Rule ← Learn-One-First-Order-Rule (Target-Predicate, Predicates, D)
  – Learned-Rules.Add-Rule (New-Rule)
  – Pos.Remove-Covered-By (New-Rule)
– RETURN (Learned-Rules)
Algorithm Learn-One-First-Order-Rule (Target-Predicate, Predicates, D)
– New-Rule ← the rule that predicts Target-Predicate with no preconditions
– New-Rule-Neg ← Neg
– WHILE NOT New-Rule-Neg.Empty() DO                   // specialize New-Rule
  1. Candidate-Literals ← Generate-Candidates()          // based on Predicates
  2. Best-Literal ← argmax L ∈ Candidate-Literals FOIL-Gain (L, New-Rule, Target-Predicate, D)          // all possible new literals
  3. New-Rule.Add-Precondition (Best-Literal)             // add the best one
  4. New-Rule-Neg ← New-Rule-Neg.Filter-By (New-Rule)
– RETURN (New-Rule)

Specializing Rules in FOIL
Learning Rule: P(x1, x2, …, xk) ← L1 ∧ L2 ∧ … ∧ Ln.
Candidate Specializations
– Add a new literal to get a more specific Horn clause
– Form of literal
  – Q(v1, v2, …, vr), where at least one of the v_i in the created literal must already exist as a variable in the rule
  – Equal(x_j, x_k), where x_j and x_k are variables already present in the rule
  – The negation of either of the above forms of literals

Information Gain in FOIL
Function FOIL-Gain (L, R, Target-Predicate, D)
– FOIL-Gain (L, R) ≡ t · ( lg (p1 / (p1 + n1)) − lg (p0 / (p0 + n0)) )
Where
– L ≡ candidate predicate to add to rule R
– p0 ≡ number of positive bindings of R
– n0 ≡ number of negative bindings of R
– p1 ≡ number of positive bindings of R + L
– n1 ≡ number of negative bindings of R + L
– t ≡ number of positive bindings of R also covered by R + L
Note
– −lg (p0 / (p0 + n0)) is the optimal number of bits to indicate the class of a positive binding covered by R
– Compare: entropy (information gain) measure in ID3
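The gain function transcribes directly into Python using the quantities defined above; the numbers in the example call are illustrative only.

```python
from math import log2

def foil_gain(p0, n0, p1, n1, t):
    """FOIL-Gain: information gained about the positive bindings of rule R
    when candidate literal L is added.
    p0, n0: positive / negative bindings of R
    p1, n1: positive / negative bindings of R + L
    t:      positive bindings of R still covered by R + L"""
    return t * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))

# Illustrative numbers only: adding L keeps 8 of 10 positive bindings
# while discarding most of the 20 negative ones.
print(foil_gain(p0=10, n0=20, p1=8, n1=2, t=8))   # ≈ 10.1 bits
```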

FOIL: Learning Recursive Rule Sets
Recursive Rules
– So far: ignored the possibility of recursive WFFs
  – New literals added to the rule body could refer to the target predicate itself
  – i.e., the predicate occurring in the rule head
– Example
  – Ancestor (x, y) ← Parent (x, z) ∧ Ancestor (z, y).
  – Rule: IF Parent (x, z) ∧ Ancestor (z, y) THEN Ancestor (x, y)
Learning Recursive Rules from Relations
– Given: an appropriate set of training examples
– Can learn using FOIL-based search
  – Requirement: Ancestor ∈ Predicates (the symbol is a member of the candidate set)
  – Recursive rules still have to outscore competing candidates on FOIL-Gain
– NB: how to ensure termination? (well-founded ordering, i.e., no infinite recursion)
– [Quinlan, 1990; Cameron-Jones and Quinlan, 1993]

FOIL: Summary
Extends the Sequential-Covering Algorithm
– Handles the case of learning first-order rules similar to Horn clauses
– Result: more powerful rules for the performance element (automated reasoning)
General-to-Specific Search
– Adds literals (predicates and negations over functions, variables, constants)
– Can learn sets of recursive rules
  – Caveat: might learn infinitely recursive rule sets
  – Has been shown to successfully induce recursive rules in some cases
Overfitting
– If there is no noise, might keep adding new literals until the rule covers no negative examples
– Solution approach: tradeoff (heuristic evaluation function on rules)
  – Accuracy, coverage, complexity
  – FOIL-Gain: an MDL function
  – Overfitting recovery in FOIL: post-pruning

Apriori: A Generate-and-Test Approach
Any subset of a frequent itemset must be frequent
– If {toner, paper, DVD-Rs} is frequent, so is {toner, paper}
– Every transaction having {toner, paper, DVD-Rs} also contains {toner, paper}
Apriori pruning principle: if an itemset is infrequent, its supersets should not be generated or tested
Method
– Generate length-(k+1) candidate itemsets from length-k frequent itemsets, and
– Test the candidates against the DB
Performance studies show its efficiency and scalability
[Agrawal & Srikant, 1994; Mannila et al., 1994]

The Apriori Algorithm: An Example (minimum support count = 2)
Database TDB:
  Tid   Items
  10    A, C, D
  20    B, C, E
  30    A, B, C, E
  40    B, E
1st scan → C1 (candidate 1-itemsets with support counts):
  {A}: 2, {B}: 3, {C}: 3, {D}: 1, {E}: 3
L1 (frequent 1-itemsets):
  {A}: 2, {B}: 3, {C}: 3, {E}: 3
C2 (candidate 2-itemsets, generated from L1):
  {A, B}, {A, C}, {A, E}, {B, C}, {B, E}, {C, E}
2nd scan → C2 with support counts:
  {A, B}: 1, {A, C}: 2, {A, E}: 1, {B, C}: 2, {B, E}: 3, {C, E}: 2
L2 (frequent 2-itemsets):
  {A, C}: 2, {B, C}: 2, {B, E}: 3, {C, E}: 2
C3 (candidate 3-itemsets, generated from L2):
  {B, C, E}
3rd scan → L3 (frequent 3-itemsets):
  {B, C, E}: 2

The Apriori Algorithm
Pseudo-code:
  C_k: candidate itemsets of size k
  L_k: frequent itemsets of size k
  L_1 = {frequent items};
  for (k = 1; L_k ≠ ∅; k++) do begin
    C_{k+1} = candidates generated from L_k;
    for each transaction t in database do
      increment the count of all candidates in C_{k+1} that are contained in t
    L_{k+1} = candidates in C_{k+1} with min_support
  end
  return ∪_k L_k;
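As one concrete (non-authoritative) rendering of this pseudo-code, the short Python sketch below mines the TDB from the example slide with a minimum support count of 2 and recovers the same L1, L2, and L3; it is a teaching sketch, not an optimized implementation.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise Apriori sketch: returns {itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # first scan: count single items to obtain L1
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    L = {s: c for s, c in counts.items() if c >= min_support}
    frequent, k = dict(L), 1
    while L:
        # generate C_{k+1}: self-join L_k, then prune by the Apriori principle
        candidates = set()
        for a, b in combinations(sorted(L, key=sorted), 2):
            union = a | b
            if len(union) == k + 1 and all(frozenset(s) in L for s in combinations(union, k)):
                candidates.add(union)
        # next scan: count candidate supports in one pass over the database
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        L = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(L)
        k += 1
    return frequent

# Reproduces the TDB example above with minimum support count 2
tdb = [{'A', 'C', 'D'}, {'B', 'C', 'E'}, {'A', 'B', 'C', 'E'}, {'B', 'E'}]
for itemset, sup in sorted(apriori(tdb, 2).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```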

Important Details of Apriori
How to generate candidates?
– Step 1: self-joining L_k
– Step 2: pruning
How to count supports of candidates?
Example of candidate generation
– L_3 = {abc, abd, acd, ace, bcd}
– Self-joining: L_3 * L_3
  – abcd from abc and abd
  – acde from acd and ace
– Pruning: acde is removed because ade is not in L_3
– C_4 = {abcd}

How to Generate Candidates?
Suppose the items in L_{k-1} are listed in an order
Step 1: self-joining L_{k-1}
  insert into C_k
  select p.item_1, p.item_2, …, p.item_{k-1}, q.item_{k-1}
  from L_{k-1} p, L_{k-1} q
  where p.item_1 = q.item_1, …, p.item_{k-2} = q.item_{k-2}, p.item_{k-1} < q.item_{k-1}
Step 2: pruning
  forall itemsets c in C_k do
    forall (k-1)-subsets s of c do
      if (s is not in L_{k-1}) then delete c from C_k
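A small Python sketch of the self-join and prune steps above (illustrative, not the original course code); run on the L3 example from the previous slide it produces C4 = {abcd} and prunes acde because ade is not frequent.

```python
from itertools import combinations

def generate_candidates(L_prev):
    """Self-join L_{k-1} on its first k-2 items, then prune any candidate
    that has an infrequent (k-1)-subset (the Apriori principle)."""
    prev = sorted(tuple(sorted(s)) for s in L_prev)
    size = len(prev[0])                      # k-1
    joined = set()
    for p, q in combinations(prev, 2):
        if p[:-1] == q[:-1]:                 # first k-2 items agree
            joined.add(tuple(sorted(set(p) | set(q))))
    prev_set = set(prev)
    return [c for c in joined
            if all(s in prev_set for s in combinations(c, size))]

# L3 = {abc, abd, acd, ace, bcd}
L3 = [('a','b','c'), ('a','b','d'), ('a','c','d'), ('a','c','e'), ('b','c','d')]
print(generate_candidates(L3))   # [('a','b','c','d')]; acde is pruned
```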

Challenges of Frequent Pattern Mining
Challenges
– Multiple scans of the transaction database
– Huge number of candidates
– Tedious workload of support counting for candidates
Improving Apriori: general ideas
– Reduce passes of transaction-database scans
– Shrink the number of candidates
– Facilitate support counting of candidates

Partition: Scan Database Only Twice
Any itemset that is potentially frequent in DB must be frequent in at least one of the partitions of DB
– Scan 1: partition the database and find local frequent patterns
– Scan 2: consolidate global frequent patterns
A. Savasere, E. Omiecinski, and S. Navathe. An efficient algorithm for mining association rules in large databases. In VLDB ’95.
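A schematic Python sketch of the two-scan partition idea, assuming some frequent-itemset miner (for instance, the Apriori sketch above) is passed in as mine_frequent; the function and parameter names are illustrative.

```python
def partition_mining(transactions, min_rel_support, n_partitions, mine_frequent):
    """Two-scan partition strategy: any globally frequent itemset must be locally
    frequent in at least one partition, so local winners are the only candidates.
    mine_frequent(partition, min_count) -> dict of locally frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    size = (len(transactions) + n_partitions - 1) // n_partitions
    # Scan 1: local frequent itemsets per partition become the global candidate set
    candidates = set()
    for i in range(0, len(transactions), size):
        part = transactions[i:i + size]
        local_min = max(1, int(min_rel_support * len(part)))   # floor keeps extra candidates, never loses one
        candidates |= set(mine_frequent(part, local_min))
    # Scan 2: count every candidate once over the whole database
    global_min = min_rel_support * len(transactions)
    counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
    return {c: n for c, n in counts.items() if n >= global_min}
```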

Partition Patterns and Databases
Frequent patterns can be partitioned into subsets according to the f-list
– F-list = f-c-a-b-m-p
– Patterns containing p
– Patterns having m but no p
– …
– Patterns having c but none of a, b, m, p
– Pattern f
Completeness and non-redundancy

FP-Growth Algorithm
Grow long patterns from short ones using local frequent items
– “abc” is a frequent pattern
– Get all transactions having “abc”: DB|abc
– “d” is a local frequent item in DB|abc ⇒ abcd is a frequent pattern
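The FP-tree data structure itself is more than a short snippet, but the divide-and-conquer idea above can be sketched with plain projected (conditional) transaction lists. The Python sketch below illustrates pattern growth under that simplification; it is not the actual FP-growth implementation (no FP-tree, and items are ordered alphabetically rather than by descending frequency), yet on the toy TDB it recovers the same frequent itemsets as the Apriori example.

```python
def grow_patterns(transactions, min_support, suffix=frozenset()):
    """Pattern growth by database projection (the FP-growth strategy, minus the FP-tree):
    for each frequent item, recurse on its conditional (projected) database."""
    counts = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    frequent_items = sorted(i for i, c in counts.items() if c >= min_support)
    patterns = {}
    for idx, item in enumerate(frequent_items):
        new_pattern = suffix | {item}
        patterns[new_pattern] = counts[item]
        # conditional database: transactions containing `item`, restricted to items
        # later in the ordering, so each pattern is generated exactly once
        later = set(frequent_items[idx + 1:])
        projected = [frozenset(t & later) for t in transactions if item in t]
        patterns.update(grow_patterns(projected, min_support, new_pattern))
    return patterns

# Same toy database as the Apriori example, minimum support count 2
tdb = [{'A', 'C', 'D'}, {'B', 'C', 'E'}, {'A', 'B', 'C', 'E'}, {'B', 'E'}]
print(grow_patterns([frozenset(t) for t in tdb], 2))
```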

Benefits of the FP-tree Structure
Completeness
– Preserves complete information for frequent pattern mining
– Never breaks a long pattern of any transaction
Compactness
– Reduces irrelevant information: infrequent items are gone
– Items in frequency-descending order: the more frequently occurring, the more likely to be shared
– Never larger than the original database (not counting node-links and the count field)

Terminology
Induction and Deduction
– Induction: finding h such that ∀ ⟨x_i, f(x_i)⟩ ∈ D, (B ∧ h ∧ x_i) ⊢ f(x_i)
  – Inductive learning: B ≡ background knowledge (inductive bias, etc.)
– Developing inverse deduction operators
  – Deduction: finding entailed logical statements F(A, B) = C, where A ∧ B ⊢ C
  – Inverse deduction: finding hypotheses O(B, D) = h, where ∀ ⟨x_i, f(x_i)⟩ ∈ D, (B ∧ h ∧ x_i) ⊢ f(x_i)
– Resolution rule: deductive inference rule (from P ∨ L and ¬L ∨ R, infer P ∨ R)
  – Propositional logic: boolean terms, connectives (∧, ∨, ¬, ⇒)
  – First-order predicate calculus (FOPC): well-formed formulas (WFFs), aka clauses (defined over literals, connectives, implicit quantifiers)
– Inverse entailment: inverse of the resolution operator
Inductive Logic Programming (ILP)
– Cigol: ILP algorithm that uses inverse resolution
– Progol: sequential covering (general-to-specific search) algorithm for ILP
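For reference, the propositional-case operators mentioned above can be written out as follows (following Mitchell, Section 10.7), where L is a literal occurring in clause C1 whose negation occurs in C2:
– Resolution: C = (C1 − {L}) ∪ (C2 − {¬L})
– Inverse resolution (one choice of second parent clause, given C and C1): C2 = (C − (C1 − {L})) ∪ {¬L}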

Summary Points
Induction as Inverse of Deduction
– Problem of induction revisited
  – Definition of induction
  – Inductive learning as a specific case
  – Role of induction and deduction in automated reasoning
– Operators for automated deductive inference
  – Resolution rule (and operator) for deduction
  – First-order predicate calculus (FOPC) and resolution theorem proving
– Inverting resolution
  – Propositional case
  – First-order case (inverse entailment operator)
Inductive Logic Programming (ILP)
– Cigol: inverse resolution (very susceptible to combinatorial explosion)
– Progol: sequential covering, general-to-specific search using inverse entailment
Next Week: Knowledge Discovery in Databases (KDD), Final Review