CSCI 5582 Fall 2006 CSCI 5582 Artificial Intelligence Lecture 22 Jim Martin

CSCI 5582 Fall 2006 Today 11/16
Finish up ILP/FOIL
Break
–HW questions
Quiz review
–Probabilistic sequence processing
–Supervised ML
Learning Classifiers
–DTs, DLs, Naïve Bayes, Ensembles, SVMs
Concept Learning
–Version Spaces, FOIL

CSCI 5582 Fall 2006 Relational Learning and Inductive Logic Programming
Fixed feature vectors are a very limited representation of objects.
The examples or the target concept may require a relational representation that includes multiple entities with relationships among them.
First-order predicate logic is a more powerful representation for handling such relational descriptions.

CSCI 5582 Fall 2006 ILP Example
Learn definitions of family relationships given data for primitive types and relations.
brother(A,C), parent(C,B) -> uncle(A,B)
husband(A,C), sister(C,D), parent(D,B) -> uncle(A,B)
We are given the relevant predicates and a database populated with positive and negative examples.
By "database" I mean sets of tuples for each of the relevant relations.
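As a concrete illustration of such a database (not from the slides; the constants and facts below are hypothetical), here is a minimal Python sketch:

```python
# Hypothetical extensional database: each relation is a set of tuples of constants.
background = {
    "brother": {("bob", "carl")},
    "husband": {("dan", "eve")},
    "sister":  {("eve", "carl")},
    "parent":  {("carl", "fay")},
}

# Labeled examples of the target predicate uncle/2.
positive = {("bob", "fay"), ("dan", "fay")}
negative = {("carl", "fay"), ("fay", "bob")}

def holds(relation, args):
    """Check a ground atom against the extensional background knowledge."""
    return tuple(args) in background[relation]

print(holds("parent", ["carl", "fay"]))   # True
```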

CSCI 5582 Fall 2006 FOIL
First-Order Inductive Logic: a top-down sequential covering algorithm that learns first-order theories.
Background knowledge is provided extensionally (i.e., as a model).
Start with the most general rule possible: (T -> P(x))
Specialize it on demand…
Specializations of a clause are formed by adding possible literals, one at a time, to the antecedent:
–A -> P
–B -> P
–C -> P…
where A, B, and C are predicates already in the domain theory.
We're working top-down from the most general hypothesis, so what's driving things?

CSCI 5582 Fall 2006 FOIL
At a high level:
–Start with the most general H.
–Repeatedly construct clauses that cover a subset of the positive examples and none of the negative examples.
–Then remove the covered positive examples.
–Construct another clause.
–Repeat until all the positive examples are covered.
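The outer loop above can be sketched in a few lines of Python; `learn_one_clause` stands in for the greedy clause search described on the following slides, so this is a simplified sketch rather than FOIL's actual implementation:

```python
def sequential_covering(positives, negatives, learn_one_clause):
    """Greedy covering loop: learn clauses until every positive example is covered.

    learn_one_clause(pos, neg) is assumed to return a clause plus the set of
    positive examples it covers, while covering no negatives.
    """
    theory = []
    remaining = set(positives)
    while remaining:
        clause, covered = learn_one_clause(remaining, negatives)
        if not covered:          # no progress possible; stop instead of looping forever
            break
        theory.append(clause)
        remaining -= covered     # drop the positives this clause explains
    return theory
```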

CSCI 5582 Fall 2006 FOIL
Constructing candidate clauses. Candidate literals to add to the antecedent:
–Any predicate (negated or not), with variables as arguments, but every added literal must contain at least one variable that appears in an earlier literal or in the head of the clause
–Equality constraints on variables
–Some arithmetic
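A minimal sketch of candidate-literal generation under these constraints (the predicate/arity table and helper names are my own, and equality/arithmetic literals are omitted for brevity):

```python
from itertools import product

def candidate_literals(predicates, used_vars, new_var="Z"):
    """Enumerate literals predicate(args) whose arguments are existing variables
    or one new variable, requiring at least one existing (shared) variable.

    predicates: dict mapping predicate name -> arity, e.g. {"edge": 2, "path": 2}.
    used_vars:  variables already appearing in the head or in earlier literals.
    """
    variables = list(used_vars) + [new_var]
    literals = []
    for pred, arity in predicates.items():
        for args in product(variables, repeat=arity):
            if any(a in used_vars for a in args):      # must connect to the clause so far
                literals.append((pred, args))
                literals.append(("not_" + pred, args))  # negated form of the literal
    return literals

# Example: candidates for specializing T -> path(X,Y)
print(candidate_literals({"edge": 2, "path": 2}, used_vars=["X", "Y"]))
```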

CSCI 5582 Fall 2006 FOIL Training Data
Background knowledge consists of the complete set of tuples for each background predicate in this universe.
Example: Consider learning a definition for the target predicate path, for finding a path in a directed acyclic graph:
edge(X,Y) -> path(X,Y)
edge(X,Z) ^ path(Z,Y) -> path(X,Y)
edge: {,,,,, }
path: {,,,,,,,,, }

CSCI 5582 Fall 2006 FOIL Negative Training Data
Negative examples of the target predicate can be provided directly, or generated indirectly by making a closed-world assumption:
–Every pair of constants not among the positive tuples for the path predicate is taken as negative.
Negative path tuples: {,,,,,,,,,,,,,,,,, }
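A sketch of generating negatives under the closed-world assumption; the node names and path facts below are hypothetical stand-ins, not the actual tuples from the slide:

```python
from itertools import product

def closed_world_negatives(constants, positive_tuples):
    """Every pair of constants not listed as a positive tuple is taken as negative."""
    return {pair for pair in product(constants, repeat=2) if pair not in positive_tuples}

# Hypothetical graph: nodes a..d with path facts a->b, b->c, a->c.
nodes = ["a", "b", "c", "d"]
pos_path = {("a", "b"), ("b", "c"), ("a", "c")}
neg_path = closed_world_negatives(nodes, pos_path)
print(len(neg_path))   # 16 pairs total - 3 positives = 13 negatives
```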

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,,,,,,, }
Neg: {,,,,,,,,,,,,,,,,, }
Start with clause: T -> path(X,Y)
Possible literals to add: edge(X,X), edge(Y,Y), edge(X,Y), edge(Y,X), edge(X,Z), edge(Y,Z), edge(Z,X), edge(Z,Y), path(X,X), path(Y,Y), path(X,Y), path(Y,X), path(X,Z), path(Y,Z), path(Z,X), path(Z,Y), X=Y, plus negations of all of these.

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,,,,,,, }
Neg: {,,,,,,,,,,,,,,,,, }
Test: edge(X,X) -> path(X,Y)
Covers 0 positive examples
Covers 6 negative examples
Not a good literal to try.

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,,,,,,, }
Neg: {,,,,,,,,,,,,,,,,, }
Test: edge(X,Y) -> path(X,Y)
Covers 6 positive examples
Covers 0 negative examples
Chosen as best literal. Result is base clause.
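The slide doesn't show how "best" is scored; FOIL's usual choice is an information-gain heuristic, sketched here under the simplifying assumption that coverage is counted over examples rather than variable bindings:

```python
import math

def foil_gain(p0, n0, p1, n1):
    """Sketch of FOIL's information gain for adding a literal to a clause.

    p0, n0: positive/negative examples covered before adding the literal.
    p1, n1: positive/negative examples covered after adding the literal.
    The multiplier should be the number of positive bindings still covered;
    here it is approximated by p1.
    """
    if p1 == 0:
        return float("-inf")                     # covers no positives: useless
    before = math.log2(p0 / (p0 + n0))
    after = math.log2(p1 / (p1 + n1))
    return p1 * (after - before)

# Comparing the two literals tested on these slides (10 positives, 26 negatives):
print(foil_gain(10, 26, 0, 6))   # edge(X,X): -inf, rejected
print(foil_gain(10, 26, 6, 0))   # edge(X,Y): clearly positive gain
```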

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,, }
Neg: {,,,,,,,,,,,,,,,,, }
Test: edge(X,Y) -> path(X,Y)
Covers 6 positive examples
Covers 0 negative examples
Chosen as best literal. Result is base clause.
Remove covered positive tuples.

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,, }
Neg: {,,,,,,,,,,,,,,,,, }
Start new clause: T -> path(X,Y)

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,, }
Neg: {,,,,,,,,,,,,,,,,, }
Test: edge(X,Y) -> path(X,Y)
Covers 0 positive examples
Covers 0 negative examples
Not a good literal.

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,, }
Neg: {,,,,,,,,,,,,,,,,, }
Test: edge(X,Z) -> path(X,Y)
Covers all 4 positive examples
Covers 14 of 26 negative examples
Eventually chosen as best possible literal.

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,, }
Neg: {,,,,,,,,,,,,, }
Test: edge(X,Z) -> path(X,Y)
Covers all 4 positive examples
Covers 14 of 26 negative examples
Eventually chosen as best possible literal.
Negatives still covered; remove the negative examples that are no longer covered.

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,, }
Neg: {,,,,,,,,,,,,, }
Test: edge(X,Z) -> path(X,Y)
Covers all 4 positive examples
Covers 14 of 26 negative examples
Eventually chosen as best possible literal.
Negatives still covered; remove the negative examples that are no longer covered.
Expand tuples to account for possible Z values.

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,,, }
Neg: {,,,,,,,,,,,,, }
Test: edge(X,Z) -> path(X,Y)
Covers all 4 positive examples
Covers 14 of 26 negative examples
Eventually chosen as best possible literal.
Negatives still covered; remove the negative examples that are no longer covered.
Expand tuples to account for possible Z values.
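"Expand tuples to account for possible Z values" means each covered (X, Y) pair is extended with every binding of the new variable Z that satisfies edge(X, Z). A hypothetical sketch:

```python
def expand_bindings(bindings, edge_facts):
    """Extend each (x, y) binding with every z such that edge(x, z) holds.

    bindings:   set of (x, y) pairs currently covered by the clause.
    edge_facts: set of (a, b) pairs for which edge(a, b) is true.
    Returns (x, y, z) triples; a binding with no valid z is dropped (uncovered).
    """
    return {(x, y, z) for (x, y) in bindings for (a, z) in edge_facts if a == x}

# Hypothetical facts: edges a->b and b->c; the pair (a, c) gets one extension.
edges = {("a", "b"), ("b", "c")}
print(expand_bindings({("a", "c"), ("d", "c")}, edges))   # {('a', 'c', 'b')}
```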

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,,, }
Neg: {,,,,,,,,,,,,, }
Test: edge(X,Z) -> path(X,Y)
Covers all 4 positive examples
Covers 14 of 26 negative examples
Eventually chosen as best possible literal.
Negatives still covered; remove the negative examples that are no longer covered.
Expand tuples to account for possible Z values.

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,,,, }
Neg: {,,,,,,,,,,,,, }
Test: edge(X,Z) -> path(X,Y)
Covers all 4 positive examples
Covers 14 of 26 negative examples
Eventually chosen as best possible literal.
Negatives still covered; remove the negative examples that are no longer covered.
Expand tuples to account for possible Z values.

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,,,, }
Neg: {,,,,,,,,,,,, }
Test: edge(X,Z) -> path(X,Y)
Covers all 4 positive examples
Covers 14 of 26 negative examples
Eventually chosen as best possible literal.
Negatives still covered; remove the negative examples that are no longer covered.
Expand tuples to account for possible Z values.

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,,,, }
Neg: {,,,,,,,,,,,, }
Continue specializing clause: edge(X,Z) -> path(X,Y)

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,,,, }
Neg: {,,,,,,,,,,,, }
Test: edge(X,Z) ^ edge(Z,Y) -> path(X,Y)
Covers 3 positive examples
Covers 0 negative examples

CSCI 5582 Fall 2006 Sample FOIL Induction
Pos: {,,,,,, }
Neg: {,,,,,,,,,,,, }
Test: edge(X,Z) ^ path(Z,Y) -> path(X,Y)
Covers 4 positive examples
Covers 0 negative examples
Eventually chosen as best literal; completes the clause.
Definition complete, since all of the original positive tuples are covered (each by way of at least one of its expanded tuples).
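The learned theory is thus the intended recursive definition: edge(X,Y) -> path(X,Y) and edge(X,Z) ^ path(Z,Y) -> path(X,Y). A small Python sketch of evaluating that definition against an extensional edge relation (the graph below is hypothetical):

```python
def path_holds(x, y, edges, seen=None):
    """Evaluate the learned two-clause definition of path/2 over a set of edge facts."""
    seen = seen or set()
    if (x, y) in edges:                       # clause 1: edge(X,Y) -> path(X,Y)
        return True
    for (a, z) in edges:                      # clause 2: edge(X,Z) ^ path(Z,Y) -> path(X,Y)
        if a == x and (x, z) not in seen:
            if path_holds(z, y, edges, seen | {(x, z)}):
                return True
    return False

edges = {("a", "b"), ("b", "c"), ("c", "d")}  # hypothetical graph
print(path_holds("a", "d", edges))            # True
print(path_holds("d", "a", edges))            # False
```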

CSCI 5582 Fall 2006 More Realistic Applications
Classifying chemical compounds as mutagenic (cancer-causing) based on their graphical molecular structure and chemical background knowledge.
Classifying web documents based on both the content of the page and its links to and from other pages with particular content.
–A web page is a university faculty home page if:
It contains the words "Professor" and "University", and
It is pointed to by a page with the word "faculty", and
It points to a page with the words "course" and "exam".

CSCI 5582 Fall 2006 Rule Learning and ILP Summary There are effective methods for learning symbolic rules from data using greedy sequential covering and top-down or bottom-up search. These methods have been extended to first-order logic to learn relational rules and recursive Prolog programs. Knowledge represented by rules is generally more interpretable by people, allowing human insight into what is learned and possible human approval and correction of learned knowledge.

CSCI 5582 Fall 2006 Break HW questions? –Elizabeth?

CSCI 5582 Fall 2006 Break
The next quiz will be on 11/28. It will cover the ML material and the probabilistic sequence material.
The readings for this quiz are:
–Chapter 18
–Chapter 19
–Chapter 20:
–HMM chapter posted on the web

CSCI 5582 Fall 2006 Quiz Topics
Sequence processing
Classifiers
Concept Learning

CSCI 5582 Fall 2006 Probabilistic Sequence Processing
Reading: Assigned Chapter
–Know the three problems:
P(observations | model)
Argmax P(state sequence | observations, model)
Argmax P(model parameters | observations, model structure)

CSCI 5582 Fall 2006 Probabilistic Sequence Processing
The basis for the techniques
–Independence assumptions
–For a first-order HMM:
1. ?
2. ?

CSCI 5582 Fall 2006 Probabilistic Sequence Processing
The basis for the techniques
–Independence assumptions
–For a first-order HMM:
1. The current state depends only on the previous state.
2. ?

CSCI 5582 Fall 2006 Probabilistic Sequence Processing
The basis for the techniques
–Independence assumptions
–For a first-order HMM:
1. The current state depends only on the previous state.
2. The current output depends only on the current state.
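Written out (in my notation, not the slides'), these two assumptions give the usual first-order HMM factorization of the joint probability of a state sequence q1..qT and observation sequence o1..oT:

```latex
P(q_{1:T}, o_{1:T}) = \prod_{t=1}^{T} P(q_t \mid q_{t-1})\, P(o_t \mid q_t)
```

with P(q_1 | q_0) read as the initial state distribution.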

CSCI 5582 Fall 2006 Probabilistic Sequence Processing
Know the computations
Know the algorithms
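For example, the first computation, P(observations | model), is handled by the forward algorithm. A minimal sketch, assuming a discrete HMM with dense transition and emission tables indexed by integers (the parameters below are made up):

```python
def forward(obs, init, trans, emit):
    """Compute P(observations | model) for a discrete HMM by the forward algorithm.

    init[i]     = P(q_1 = i)
    trans[i][j] = P(q_{t+1} = j | q_t = i)
    emit[i][o]  = P(o_t = o | q_t = i)
    """
    alpha = [init[i] * emit[i][obs[0]] for i in range(len(init))]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(len(alpha))) * emit[j][o]
                 for j in range(len(init))]
    return sum(alpha)

# Tiny two-state example with made-up parameters; observation symbols are 0 and 1.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
print(forward([0, 1, 0], init, trans, emit))
```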

CSCI 5582 Fall 2006 Classifiers
Chapter 18
Basic induction task
–DT, DL, Naïve Bayes
–Ensembles (bagging, boosting)
–For SVMs, just focus on the two main ideas (not the math)
Learning theory
–What are the key elements?
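As a reminder of the flavor of one of these classifiers, a tiny Naïve Bayes sketch over boolean features (purely illustrative; the add-one smoothing and toy data are my own assumptions, not from the lecture):

```python
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (feature_tuple, label). Returns priors, feature counts, and n."""
    priors = Counter(label for _, label in examples)
    counts = defaultdict(Counter)                      # (label, position) -> value counts
    for feats, label in examples:
        for i, v in enumerate(feats):
            counts[(label, i)][v] += 1
    return priors, counts, len(examples)

def classify(feats, priors, counts, n):
    best, best_score = None, float("-inf")
    for label, c in priors.items():
        score = c / n
        for i, v in enumerate(feats):
            # add-one (Laplace) smoothing over the two boolean values
            score *= (counts[(label, i)][v] + 1) / (c + 2)
        if score > best_score:
            best, best_score = label, score
    return best

data = [((1, 0), "spam"), ((1, 1), "spam"), ((0, 0), "ham"), ((0, 1), "ham")]
model = train_naive_bayes(data)
print(classify((1, 0), *model))   # "spam"
```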

CSCI 5582 Fall 2006 Classifiers
Theory: three main things we discussed:
1. ?
2. ?
3. ?

CSCI 5582 Fall 2006 Classifiers
Theory: three main things we discussed:
1. Size of the hypothesis space
2. ?
3. ?

CSCI 5582 Fall 2006 Classifiers
Theory: three main things we discussed:
1. Size of the hypothesis space
2. Number of training examples
3. ?

CSCI 5582 Fall 2006 Classifiers
Theory: three main things we discussed:
1. Size of the hypothesis space
2. Number of training examples
3. Occam's razor
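These three pieces come together in the standard PAC sample-complexity bound from the Chapter 18 learning-theory material (stated here in my notation as a reminder): a consistent learner is probably approximately correct if the number of training examples satisfies

```latex
m \ge \frac{1}{\epsilon}\left(\ln\frac{1}{\delta} + \ln|H|\right)
```

where |H| is the size of the hypothesis space, epsilon the allowed error, and delta the allowed failure probability; the Occam point is that smaller hypothesis spaces need fewer examples.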

CSCI 5582 Fall 2006 Classifiers
Practice:
1. Training and testing
2. Learning as search
 1. Managing choices
 2. Making choices
 3. Termination conditions

CSCI 5582 Fall 2006 Training Set

CSCI 5582 Fall 2006 Concept Learning
Chapter 19:
Focus on what the task is
How it's different from classifier induction
–Version space learning
–How FOIL works