
1. Combining Inductive and Analytical Learning
Chapter 12 of Machine Learning, Tom M. Mitchell
Korea University Natural Language Processing Lab, Kyeong-Su Han
July 9, 1999

2. Contents
- Motivation
- Inductive-Analytical Approaches to Learning
- Using Prior Knowledge to Initialize the Hypothesis
  - The KBANN Algorithm
- Using Prior Knowledge to Alter the Search Objective
  - The TANGENTPROP Algorithm
  - The EBNN Algorithm
- Using Prior Knowledge to Augment Search Operators
  - The FOCL Algorithm

3. Motivation (1/2)
- Inductive vs. Analytical Learning

                 | Inductive Learning              | Analytical Learning
  Goal           | Hypothesis fits data            | Hypothesis fits domain theory
  Justification  | Statistical inference           | Deductive inference
  Advantages     | Requires little prior knowledge | Learns from scarce data
  Pitfalls       | Scarce data, incorrect bias     | Imperfect domain theory

- A spectrum of learning tasks
  - Most practical learning problems lie somewhere between these two extremes of the spectrum.

4. Motivation (2/2)
- What kinds of learning algorithms can we devise that make use of approximate prior knowledge, together with available data, to form general hypotheses?
  - Domain-independent algorithms that employ explicitly input domain-dependent knowledge
- Desirable properties
  - No domain theory → learn as well as purely inductive methods
  - Perfect domain theory → learn as well as purely analytical methods
  - Imperfect domain theory and imperfect training data → combine the two to outperform either inductive or analytical methods alone
  - Accommodate arbitrary and unknown errors in the domain theory
  - Accommodate arbitrary and unknown errors in the training data

5. The Learning Problem
- Given:
  - A set of training examples D, possibly containing errors
  - A domain theory B, possibly containing errors
  - A space of candidate hypotheses H
- Determine:
  - A hypothesis that best fits the training examples and the domain theory

6. Hypothesis Space Search
- Learning as a task of searching through a hypothesis space, characterized by
  - the hypothesis space H
  - an initial hypothesis h0
  - the set of search operators O, which define the individual search steps
  - the goal criterion G, which specifies the search objective
- Methods for using prior knowledge (see the sketch after this list): use prior knowledge to
  - derive an initial hypothesis from which to begin the search (KBANN)
  - alter the objective G of the hypothesis space search (TANGENTPROP, EBNN)
  - alter the available search steps O (FOCL)
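A minimal sketch of this view of learning as search, with each of the chapter's three methods attached to the component it modifies. The class name and field types are illustrative assumptions, not notation from the book.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class SearchProblem:
    hypothesis_space: Any                    # H
    initial_hypothesis: Any                  # h0 -- KBANN derives this from the domain theory
    operators: List[Callable]                # O  -- FOCL adds theory-derived specialization operators
    goal_criterion: Callable[[Any], float]   # G  -- TANGENTPROP/EBNN fold the theory into the error criterion
```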

7. Using Prior Knowledge to Initialize the Hypothesis
- Two steps
  1. Initialize the hypothesis to perfectly fit the domain theory
  2. Inductively refine this initial hypothesis as needed to fit the training data
- KBANN (Knowledge-Based Artificial Neural Network); the analytical step is sketched below
  1. Analytical step: create an initial network equivalent to the domain theory
  2. Inductive step: refine the initial network using BACKPROPAGATION
- Given:
  - A set of training examples
  - A domain theory consisting of nonrecursive, propositional Horn clauses
- Determine:
  - An artificial neural network that fits the training examples, biased by the domain theory
- See Table 12.2 (p. 341)
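A minimal sketch of KBANN's analytical step: each Horn clause becomes a sigmoid unit whose weights and bias make it behave like the clause. The clause representation, the helper name, the constant W, and the specific Cup clause are illustrative assumptions; the book describes this construction in prose and tables, not code.

```python
# Hypothetical sketch of KBANN's analytical step for nonrecursive,
# propositional Horn clauses.

W = 4.0  # "large" weight assigned to antecedents from the domain theory

def clause_to_unit(antecedents):
    """antecedents: list of (literal_name, negated) pairs for one Horn clause.
    Returns (weights, bias) for a sigmoid unit that fires only when every
    non-negated antecedent is 1 and every negated antecedent is 0."""
    weights = {}
    n_positive = 0
    for name, negated in antecedents:
        weights[name] = -W if negated else W
        if not negated:
            n_positive += 1
    bias = -(n_positive - 0.5) * W
    return weights, bias

# Example clause: Cup :- Stable, Liftable, OpenVessel
weights, bias = clause_to_unit(
    [("Stable", False), ("Liftable", False), ("OpenVessel", False)])
print(weights, bias)  # {'Stable': 4.0, 'Liftable': 4.0, 'OpenVessel': 4.0} -10.0
```

KBANN additionally adds near-zero-weight connections from all remaining inputs, so the subsequent inductive step (ordinary backpropagation over the training examples) can introduce dependencies the domain theory missed.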

8. Example: The Cup Learning Task
- [Figure] Neural network equivalent to the domain theory
- [Figure] Result of refining the network

9. Remarks
- KBANN vs. Backpropagation
  - When given an approximately correct domain theory and scarce training data, KBANN generalizes more accurately than Backpropagation
    - Classifying promoter regions in DNA:
      - Backpropagation: error rate 8/106
      - KBANN: error rate 4/106
  - Bias
    - KBANN: domain-specific theory
    - Backpropagation: domain-independent syntactic bias toward small weight values

10. Using Prior Knowledge to Alter the Search Objective
- Use of prior knowledge
  - Incorporate it into the error criterion minimized by gradient descent
  - The network must then fit a combined function of the training data and the domain theory
- Form of prior knowledge
  - Derivatives of the target function
  - Certain types of prior knowledge can be expressed quite naturally as derivatives
  - Example: recognizing handwritten characters (see the sketch below)
    - "The identity of the character is independent of small translations and rotations of the image."
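To make the invariance concrete, here is a minimal sketch (not from the book; the image array and the one-pixel horizontal shift are assumptions) of how "the label does not change under small translations" becomes a training derivative: the derivative of the target function along the translation direction is asserted to be zero, and the direction itself can be approximated by finite differences.

```python
import numpy as np

def translation_tangent(image, dx=1):
    """Approximate d s(alpha, image) / d alpha at alpha = 0, where s shifts the
    image horizontally by alpha pixels (with wraparound, for simplicity).
    The invariance prior asserts the target function's derivative along this
    direction is zero."""
    shifted = np.roll(image, dx, axis=1)   # s(dx, image)
    return (shifted - image) / dx          # finite-difference tangent vector

# Toy usage: a 4x4 "image"
img = np.arange(16, dtype=float).reshape(4, 4)
print(translation_tangent(img))
```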

11. The TANGENTPROP Algorithm
- Domain knowledge
  - Expressed as derivatives of the target function with respect to transformations of its inputs
- Training derivatives
  - TANGENTPROP assumes various training derivatives of the target function are provided
- Error function (sketch below)

  E = \sum_i \left[ \left( f(x_i) - \hat{f}(x_i) \right)^2 + \mu \sum_j \left( \left. \frac{\partial f(s_j(\alpha, x_i))}{\partial \alpha} - \frac{\partial \hat{f}(s_j(\alpha, x_i))}{\partial \alpha} \right|_{\alpha=0} \right)^2 \right]

  - s_j(\alpha, x): the jth transformation (e.g., rotation or translation) of input x by parameter \alpha
  - \mu: constant that determines the relative importance of fitting training values vs. fitting training derivatives
- See Table 12.4 (p. 349)
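A self-contained sketch of the TANGENTPROP objective for a single linear unit f_hat(x) = w . x (sigmoid omitted for brevity). The data, the weight vector, and the random tangent vectors are made up for illustration; this is not reference code from the book.

```python
import numpy as np

def tangentprop_loss(w, X, y, tangents, mu=0.1):
    """X: (n, d) inputs, y: (n,) target values,
    tangents: (n, d) directions d s(alpha, x_i)/d alpha at alpha = 0
              (e.g., the pixel-space direction of a small translation),
    mu: relative weight of the derivative term."""
    preds = X @ w
    value_err = np.sum((y - preds) ** 2)
    # For a linear unit, d f_hat(s(alpha, x))/d alpha = w . (d s/d alpha).
    # The invariance prior says the corresponding target derivative is 0.
    deriv_err = np.sum((tangents @ w - 0.0) ** 2)
    return value_err + mu * deriv_err

# Toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
y = rng.normal(size=5)
tangents = rng.normal(size=(5, 8))
w = np.zeros(8)
print(tangentprop_loss(w, X, y, tangents))
```

Gradient descent on this combined loss is what lets the network fit the training values and the asserted derivatives at the same time.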

12. Remarks
- TANGENTPROP combines the prior knowledge with the observed training data by minimizing an objective function that measures both
  - the network's error with respect to the training example values
  - the network's error with respect to the desired derivatives
- TANGENTPROP is not robust to errors in the prior knowledge
  - Need to select μ automatically → the EBNN algorithm

13. The EBNN Algorithm (1/2)
- Input
  - A set of training examples of the form ⟨x_i, f(x_i)⟩
  - A domain theory represented by a set of previously trained neural networks
- Output
  - A new neural network that approximates the target function f
- Algorithm
  1. Create a new, fully connected feedforward network to represent the target function
  2. For each training example, determine the corresponding training derivatives from the domain theory
  3. Use the TANGENTPROP algorithm to train the target network

14. The EBNN Algorithm (2/2)
- Computation of training derivatives (sketch below)
  - EBNN computes them itself for each observed training example:
    - Explain each training example in terms of the given domain theory
    - Extract training derivatives from this explanation
      - The derivatives provide important information for distinguishing relevant from irrelevant features
- How to weight the relative importance of the inductive and analytical components of learning
  - The weight μ_i is chosen independently for each training example
  - It reflects how accurately the domain theory predicts the training value for this particular example
- Error function (Figure 12.7, p. 353)

  E = \sum_i \left[ \left( f(x_i) - \hat{f}(x_i) \right)^2 + \mu_i \sum_j \left( \left. \frac{\partial A(x)}{\partial x^j} - \frac{\partial \hat{f}(x)}{\partial x^j} \right|_{x = x_i} \right)^2 \right],
  \quad \mu_i \equiv 1 - \frac{|A(x_i) - f(x_i)|}{c}

  - A(x): domain theory prediction for input x
  - x_i: the ith training instance
  - x^j: the jth component of the vector x
  - c: normalizing constant (ensuring 0 ≤ μ_i ≤ 1)
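A hedged sketch of the two EBNN-specific ingredients: extracting training derivatives from the domain theory's prediction and computing the per-example weight μ_i. The "domain theory" below is a toy differentiable function standing in for the previously trained networks; names and constants are illustrative assumptions.

```python
import numpy as np

def domain_theory_A(x):
    """Toy stand-in for the domain theory's prediction A(x)."""
    return 1.0 / (1.0 + np.exp(-(x[0] - 0.5 * x[1])))

def numeric_gradient(f, x, eps=1e-5):
    """Finite-difference estimate of dA/dx^j at x: the extracted training derivatives."""
    grad = np.zeros_like(x)
    for j in range(len(x)):
        x_plus, x_minus = x.copy(), x.copy()
        x_plus[j] += eps
        x_minus[j] -= eps
        grad[j] = (f(x_plus) - f(x_minus)) / (2 * eps)
    return grad

def ebnn_weight(x_i, f_i, c=1.0):
    """mu_i = 1 - |A(x_i) - f(x_i)| / c: trust the analytical component more
    when the domain theory predicts this example's value accurately."""
    return 1.0 - abs(domain_theory_A(x_i) - f_i) / c

x_i = np.array([1.0, 0.2])
f_i = 1.0                                          # observed training value
derivs = numeric_gradient(domain_theory_A, x_i)    # training derivatives for x_i
mu_i = ebnn_weight(x_i, f_i)
print(derivs, mu_i)
```

The derivatives and μ_i are then fed into a TANGENTPROP-style combined error, so noisy parts of the domain theory are automatically down-weighted on the examples where they predict poorly.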

15. Remarks
- EBNN vs. symbolic explanation-based learning
  - The domain theory consists of neural networks rather than Horn clauses
  - Relevant dependencies take the form of derivatives
  - EBNN accommodates imperfect domain theories
  - EBNN learns a fixed-size neural network
    - Requires constant time to classify new instances
    - May be unable to represent sufficiently complex functions

16. Using Prior Knowledge to Augment Search Operators
- The FOCL Algorithm (unfolding is sketched after this list)
  - Two operators for generating candidate specializations:
    1. Add a single new literal
    2. Add a set of literals that constitute logically sufficient conditions for the target concept, according to the domain theory
       - Select one of the domain theory clauses whose head matches the target concept
       - Unfolding: each nonoperational literal is replaced by its domain-theory definition, until the sufficient conditions are stated entirely in terms of operational literals
       - Pruning: each literal is removed unless its removal reduces classification accuracy over the training examples
  - FOCL selects among all these candidate specializations based on their performance over the data
    - The domain theory is used in a fashion that biases the learner
    - Final search choices are still made based on performance over the training data
  - See Figure 12.8 (p. 358) and Figure 12.9 (p. 361)
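A minimal propositional sketch of FOCL's unfolding operator. Real FOCL works with first-order Horn clauses; the dictionary representation and the simplified cup domain theory below are illustrative assumptions, not FOCL's actual data structures.

```python
# Domain theory: head -> list of alternative bodies (each body is a list of literals)
domain_theory = {
    "Cup":      [["Stable", "Liftable", "OpenVessel"]],
    "Stable":   [["BottomIsFlat"]],
    "Liftable": [["Graspable", "Light"]],
}
operational = {"BottomIsFlat", "Graspable", "Light", "OpenVessel"}

def unfold(literals):
    """Replace nonoperational literals by their domain-theory definitions until
    only operational literals remain (uses the first body of each clause)."""
    result = []
    for lit in literals:
        if lit in operational:
            result.append(lit)
        else:
            result.extend(unfold(domain_theory[lit][0]))
    return result

# Logically sufficient conditions for the target concept Cup, in operational terms:
print(unfold(["Cup"]))
# ['BottomIsFlat', 'Graspable', 'Light', 'OpenVessel']
```

After unfolding, FOCL prunes literals whose removal does not reduce accuracy over the training examples; the surviving conjunction becomes one candidate specialization that competes with the ordinary single-literal additions on the data.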

