Combining Inductive and Analytical Learning
Chapter 12 of Machine Learning, Tom M. Mitchell
Kyung-Soo Han (한경수), Natural Language Processing Lab, Korea University
July 9, 1999


Contents

- Motivation
- Inductive-Analytical Approaches to Learning
- Using Prior Knowledge to Initialize the Hypothesis
  - The KBANN Algorithm
- Using Prior Knowledge to Alter the Search Objective
  - The TANGENTPROP Algorithm
  - The EBNN Algorithm
- Using Prior Knowledge to Augment Search Operators
  - The FOCL Algorithm

Motivation (1/2)

Inductive vs. analytical learning:

                   Inductive Learning                Analytical Learning
  Goal:            Hypothesis fits data              Hypothesis fits domain theory
  Justification:   Statistical inference             Deductive inference
  Advantages:      Requires little prior knowledge   Learns from scarce data
  Pitfalls:        Scarce data, incorrect bias       Imperfect domain theory

A spectrum of learning tasks:
- Most practical learning problems lie somewhere between these two extremes.

Motivation (2/2)

What kinds of learning algorithms can we devise that make use of approximate prior knowledge, together with available data, to form general hypotheses?
- Domain-independent algorithms that employ explicitly input domain-dependent knowledge.

Desirable properties:
- Given no domain theory, learn as well as inductive methods.
- Given a perfect domain theory, learn as well as analytical methods.
- Given an imperfect domain theory and imperfect training data, combine the two to outperform both purely inductive and purely analytical methods.
- Accommodate arbitrary and unknown errors in the domain theory.
- Accommodate arbitrary and unknown errors in the training data.

The Learning Problem

Given:
- A set of training examples D, possibly containing errors
- A domain theory B, possibly containing errors
- A space of candidate hypotheses H

Determine:
- A hypothesis that best fits the training examples and the domain theory
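One way to make "best fits" precise, following the chapter's discussion, is as a weighted sum of the two kinds of error, where the constants k_D and k_B express the relative trust placed in the data and the theory:

$$h^* = \underset{h \in H}{\arg\min}\; k_D \, \mathrm{error}_D(h) + k_B \, \mathrm{error}_B(h)$$

Here error_D(h) is the proportion of training examples misclassified by h, and error_B(h) is the probability that h disagrees with the domain theory B on a randomly drawn instance.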

Hypothesis Space Search

Learning as a task of searching through a hypothesis space, characterized by:
- the hypothesis space H and an initial hypothesis
- the set of search operators O, which define the individual search steps
- the goal criterion G, which specifies the search objective

Methods for using prior knowledge; use it to:
- derive an initial hypothesis from which to begin the search
- alter the objective G of the hypothesis space search
- alter the available search steps O

Using Prior Knowledge to Initialize the Hypothesis

Two steps:
1. Initialize the hypothesis to perfectly fit the domain theory.
2. Inductively refine this initial hypothesis as needed to fit the training data.

KBANN (Knowledge-Based Artificial Neural Network):
1. Analytical step: create an initial network equivalent to the domain theory.
2. Inductive step: refine the initial network using BACKPROPAGATION.

Given:
- A set of training examples
- A domain theory consisting of nonrecursive, propositional Horn clauses

Determine:
- An artificial neural network that fits the training examples, biased by the domain theory

(Table 12.2, p. 341; a sketch of the analytical step follows below.)
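A minimal sketch of the analytical step, assuming the domain theory is given as nonrecursive propositional Horn clauses (here, roughly the book's Cup example) listed bottom-up; the weight constant, the helper names, and the omission of KBANN's extra near-zero connections are all simplifications:

```python
import numpy as np

W = 8.0  # weight large enough that truth values survive several sigmoid layers

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy domain theory (roughly the book's Cup example), listed bottom-up;
# a negated antecedent would be written with a leading '~'.
clauses = {
    "Stable":     ["BottomIsFlat"],
    "Graspable":  ["HasHandle"],
    "Liftable":   ["Graspable", "Light"],
    "OpenVessel": ["HasConcavity", "ConcavityPointsUp"],
    "Cup":        ["Stable", "Liftable", "OpenVessel"],
}

def build_unit(antecedents):
    """Weights and bias that make a sigmoid unit behave as the clause's AND:
    +W per positive antecedent, -W per negated one, bias -(n - 0.5) * W."""
    weights = {a.lstrip("~"): (-W if a.startswith("~") else W) for a in antecedents}
    n_pos = sum(1 for a in antecedents if not a.startswith("~"))
    return weights, -(n_pos - 0.5) * W

def forward(instance):
    """Evaluate the theory-equivalent network bottom-up on a 0/1 feature dict.
    (Full KBANN also adds near-zero weights from every other feature at each
    layer, giving BACKPROPAGATION room to revise the theory later.)"""
    values = dict(instance)
    for head, ants in clauses.items():
        weights, bias = build_unit(ants)
        values[head] = sigmoid(bias + sum(w * values[f] for f, w in weights.items()))
    return values

example = {"BottomIsFlat": 1, "HasHandle": 1, "Light": 1,
           "HasConcavity": 1, "ConcavityPointsUp": 1}
print(forward(example)["Cup"])  # near 1: the initial network mimics the theory
```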

Example: The Cup Learning Task

(Figures: the neural network equivalent to the domain theory, and the result of inductively refining that network on the training data.)

Remarks

KBANN vs. BACKPROPAGATION:
- Given an approximately correct domain theory and scarce training data, KBANN generalizes more accurately than BACKPROPAGATION.
  - Classifying promoter regions in DNA: BACKPROPAGATION error rate 8/106, KBANN error rate 4/106.
- Bias:
  - KBANN: a domain-specific theory.
  - BACKPROPAGATION: a domain-independent syntactic bias toward small weight values.

Using Prior Knowledge to Alter the Search Objective

Use of prior knowledge:
- Incorporate it into the error criterion minimized by gradient descent.
- The network must then fit a combined function of the training data and the domain theory.

Form of prior knowledge:
- Derivatives of the target function.
- Certain types of prior knowledge can be expressed quite naturally this way.
- Example: recognizing handwritten characters, where "the identity of the character is independent of small translations and rotations of the image."

The TANGENTPROP Algorithm

Domain knowledge:
- Expressed as derivatives of the target function with respect to transformations of its inputs.

Training derivatives:
- TANGENTPROP assumes the various training derivatives of the target function are provided along with the training values.

Error function (Table 12.4, p. 349; a sketch follows below):

$$E = \sum_i \left[ \big(f(x_i) - \hat{f}(x_i)\big)^2 + \mu \sum_j \left( \left. \frac{\partial f(s_j(\alpha, x_i))}{\partial \alpha} - \frac{\partial \hat{f}(s_j(\alpha, x_i))}{\partial \alpha} \right|_{\alpha = 0} \right)^2 \right]$$

where s_j(α, x) is the j-th transformation (e.g., a rotation or translation) of input x with parameter α, and μ is a constant that determines the relative importance of fitting training values versus fitting training derivatives.
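A minimal sketch of a TANGENTPROP-style objective, assuming the prior knowledge is invariance (the provided training derivative is 0, so the penalty reduces to the squared model derivative), with the α-derivative estimated by finite differences; the single-unit model, the translation transform, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def net(w, x):
    """Toy differentiable model: a single sigmoid unit."""
    return 1.0 / (1.0 + np.exp(-(w @ x)))

def translate(x, alpha):
    """s(alpha, x): shift a 1-D signal by alpha samples (linear interpolation)."""
    idx = np.arange(len(x)) - alpha
    return np.interp(idx, np.arange(len(x)), x)

def tangent_penalty(w, x, eps=0.5):
    """Finite-difference estimate of d/d_alpha net(w, s(alpha, x)) at alpha = 0;
    since the asserted training derivative is 0 (invariance), any nonzero
    estimate is penalized."""
    d = (net(w, translate(x, eps)) - net(w, translate(x, -eps))) / (2 * eps)
    return d ** 2

def tangentprop_loss(w, X, y, mu=0.1):
    """Squared error on training values plus the mu-weighted derivative error."""
    value_err = sum((yi - net(w, xi)) ** 2 for xi, yi in zip(X, y))
    deriv_err = sum(tangent_penalty(w, xi) for xi in X)
    return value_err + mu * deriv_err

X = rng.normal(size=(5, 16))                  # five random 16-sample "signals"
y = rng.integers(0, 2, size=5).astype(float)  # arbitrary 0/1 training values
w = rng.normal(scale=0.1, size=16)
print(tangentprop_loss(w, X, y))  # the quantity gradient descent would minimize
```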

Remarks

TANGENTPROP combines prior knowledge with the observed training data by minimizing an objective function that measures both:
- the network's error with respect to the training example values, and
- the network's error with respect to the desired derivatives.

TANGENTPROP is not robust to errors in the prior knowledge:
- The weight μ needs to be selected automatically, which motivates the EBNN algorithm.

The EBNN Algorithm (1/2)

Input:
- A set of training examples of the form ⟨x_i, f(x_i)⟩, with no training derivatives provided
- A domain theory represented by a set of previously trained neural networks

Output:
- A new neural network that approximates the target function f

Algorithm:
1. Create a new, fully connected feedforward network to represent the target function.
2. For each training example, determine the corresponding training derivatives.
3. Use the TANGENTPROP algorithm to train the target network.

The EBNN Algorithm (2/2)

Computation of training derivatives:
- EBNN computes them itself for each observed training example:
  - explain each training example in terms of the given domain theory, then
  - extract training derivatives from this explanation.
- These derivatives provide important information for distinguishing relevant from irrelevant features.

Weighting the relative importance of the inductive and analytical components of learning:
- The weight μ_i is chosen independently for each training example,
- by considering how accurately the domain theory predicts the training value for that particular example.

Error function (Figure 12.7, p. 353; a sketch follows below):

$$E = \sum_i \left[ \big(f(x_i) - \hat{f}(x_i)\big)^2 + \mu_i \sum_j \left( \left. \frac{\partial A(x)}{\partial x^j} - \frac{\partial \hat{f}(x)}{\partial x^j} \right|_{x = x_i} \right)^2 \right], \qquad \mu_i \equiv 1 - \frac{|A(x_i) - f(x_i)|}{c}$$

where A(x) is the domain theory's prediction for input x, x_i is the i-th training instance, x^j is the j-th component of the input vector x, and c is a normalizing constant ensuring 0 ≤ μ_i ≤ 1.
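A minimal sketch of EBNN's derivative extraction and per-example weighting, assuming the domain theory is a single differentiable network A whose input gradient is estimated by finite differences; the stand-in network, the constant c, and all names are illustrative:

```python
import numpy as np

def domain_theory(x):
    """Stand-in for a previously trained domain-theory network A(x)."""
    w = np.array([0.8, -0.3, 0.0, 0.5])
    return 1.0 / (1.0 + np.exp(-(w @ x)))

def input_gradient(f, x, eps=1e-4):
    """Finite-difference gradient of f with respect to each input component;
    these play the role of the training derivatives EBNN extracts from the
    domain theory's explanation of the example."""
    g = np.zeros_like(x)
    for j in range(len(x)):
        e = np.zeros_like(x)
        e[j] = eps
        g[j] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def ebnn_weight(x, fx, c=1.0):
    """mu_i = 1 - |A(x_i) - f(x_i)| / c: trust the analytical component more
    on examples whose training value the domain theory predicts well."""
    return 1.0 - abs(domain_theory(x) - fx) / c

x_i, f_i = np.array([1.0, 0.0, 1.0, 1.0]), 1.0
mu_i = ebnn_weight(x_i, f_i)
target_derivs = input_gradient(domain_theory, x_i)
print(mu_i, target_derivs)  # fed into a TANGENTPROP-style objective per example
```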

Remarks

EBNN vs. symbolic explanation-based learning:
- The domain theory consists of neural networks rather than Horn clauses.
- The relevant dependencies take the form of derivatives.
- EBNN accommodates imperfect domain theories.
- EBNN learns a fixed-size neural network:
  - it requires constant time to classify new instances, but
  - it may be unable to represent sufficiently complex functions.

Combining Inductive & Analytical Learning16 Using Prior Knowledge to Augment Search Operators q The FOCL Algorithm  Two operators for generating candidate specializations 1. Add a single new literal 2. Add a set of literals that constitute logically sufficient conditions for the target concept, according to the domain theory 2 select one of the domain theory clauses whose head matches the target concept. 2 Unfolding: Each nonoperational literal is replaced, until the sufficient conditions have been restated in terms of operational literals. 2 Pruning: the literal is removed unless its removal reduces classification accuracy over the training examples.  FOCL selects among all these candidate specializations, based on their performance over the data  domain theory is used in a fashion that biases the learner  leaves final search choices to be made based on performance over the training data  Figure 12.8(p.358)  Figure 12.9(p.361)