CS623: Introduction to Computing with Neural Nets (lecture-18) Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.


Learning in Boltzmann machines
The meaning of learning here is learning a probability distribution.
Example: rectangle learning
- The learning algorithm is presented with + and – examples.
- + : points within the target rectangle ABCD
- – : points outside the target rectangle ABCD
[Figure: target rectangle T with corners A, B, D, C; + points inside, – points outside]

Learning Algorithm
The algorithm has to output a hypothesis h that is a good estimate of the target rectangle T.
Probably Approximately Correct (PAC) learning is used.
- Let U be the universe of points.
- Let c ⊆ U; c is called a concept.
- A collection C of such concepts c is called a concept class.
[Figure: a concept c shown as a region inside the universe U]

Learning Algorithm (contd.)
There is a probability distribution Pr, unknown and arbitrary but fixed, which produces a stream of labelled examples over time.
The probability distribution is generated by a "teacher" or "oracle".
We want the learning algorithm to learn the concept c.

Learning Algorithm (contd.)
We say that c is PAC-learnt by a hypothesis h.
Learning takes place in 2 phases:
- Training phase (loading)
- Testing phase (generalization)
c ⊕ h, the symmetric difference of c and h, is the error region: it contains the false positives (points in h but not in c) and the false negatives (points in c but not in h).
If the examples coming from c ⊕ h have low probability, then generalization is good.
Learning means learning the following things:
- Approximating the distribution
- Assigning a (community-accepted) name
[Figure: concept c and hypothesis h inside the universe U; their overlap is the region of agreement, and c ⊕ h is the error region of false positives and false negatives]
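To make the error region concrete, here is a minimal Python sketch (not from the lecture; the uniform sampling over the unit square and the function names are assumptions) that estimates Pr(c ⊕ h) by Monte Carlo sampling when both the concept c and the hypothesis h are axis-aligned rectangles.

```python
import random

def in_rect(rect, point):
    """rect = (x_min, y_min, x_max, y_max); returns True if point lies inside."""
    x, y = point
    x_min, y_min, x_max, y_max = rect
    return x_min <= x <= x_max and y_min <= y <= y_max

def error_region_probability(c, h, n_samples=100_000):
    """Estimate Pr(c XOR h) under a uniform distribution on the unit square.
    A point counts as an error if c and h disagree on it (false +ve or false -ve)."""
    disagreements = 0
    for _ in range(n_samples):
        p = (random.random(), random.random())
        if in_rect(c, p) != in_rect(h, p):   # point lies in the symmetric difference c XOR h
            disagreements += 1
    return disagreements / n_samples

# Example: a hypothesis slightly smaller than the target rectangle
target = (0.2, 0.2, 0.8, 0.8)
hypothesis = (0.25, 0.25, 0.8, 0.8)
print(error_region_probability(target, hypothesis))
```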

Key insights from 40 years of Machine Learning
- Learning in a vacuum is impossible: the learner must already have a lot of knowledge.
- The learner must know "what to" learn; this prior restriction is called the Inductive Bias.
Example: learning the rectangle ABCD. The target is to learn the boundary defined by the points A, B, C and D.

[Figure: the target rectangle T with corners A, B, D, C]

PAC
We want Pr(c ⊕ h) ≤ ε.
The definition of Probably Approximately Correct learning is:
P[ Pr(c ⊕ h) ≤ ε ] ≥ 1 – δ
where
ε = accuracy factor
δ = confidence factor
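For example, with ε = 0.1 and δ = 0.05 the definition reads: with probability at least 0.95 over the random draw of the training examples, the learnt hypothesis h differs from the target concept c on a region of probability at most 0.1.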

PAC - Example
The probability distribution produces +ve and –ve examples.
Keep track of (x_min, y_min) and (x_max, y_max) over the +ve examples and build the hypothesis rectangle out of them.
[Figure: x-y plane with the learnt rectangle A, B, D, C]

PAC - Algorithm
In the case of the rectangle, one can prove that the learning algorithm, viz.,
1. Ignore the –ve examples.
2. Build the tightest rectangle from (x_min, y_min) and (x_max, y_max) of the +ve points,
PAC-learns the target rectangle if the number of examples is ≥ (4/ε) ln(4/δ).
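A minimal sketch of this learner, under assumptions not stated in the slides (examples are given as labelled ±1 pairs; the function names are mine): it ignores negative examples, returns the tightest bounding rectangle of the positive ones, and computes the sample-size bound quoted above.

```python
import math

def pac_sample_size(epsilon, delta):
    """Number of examples sufficient for the rectangle learner: m >= (4/eps) * ln(4/delta)."""
    return math.ceil((4.0 / epsilon) * math.log(4.0 / delta))

def learn_rectangle(examples):
    """examples: list of ((x, y), label) with label +1 or -1.
    Ignore negative examples and return the tightest axis-aligned rectangle
    around the positive ones, as (x_min, y_min, x_max, y_max)."""
    positives = [p for p, label in examples if label == +1]
    if not positives:
        return None  # no positive example seen; no rectangle can be formed
    xs = [x for x, _ in positives]
    ys = [y for _, y in positives]
    return (min(xs), min(ys), max(xs), max(ys))

print(pac_sample_size(epsilon=0.1, delta=0.05))  # 176 examples suffice for these settings
```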

Illustration of the basic idea of Boltzmann Machine
Task: to learn the identity function y = x.
The setting is probabilistic: x = 1 or x = –1 with uniform probability, i.e. P(x=1) = 0.5, P(x=–1) = 0.5.
For x = 1, y = 1 with probability 0.9.
For x = –1, y = –1 with probability 0.9.
[Figure: input neuron x connected to output neuron y by a weight w, with the table of x, y values]
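The following small sketch (variable names and sampling code are assumptions, not from the slides) generates data from the process just described; this is the joint distribution that a Boltzmann machine with input neuron x, output neuron y and weight w would be trained to reproduce.

```python
import random
from collections import Counter

def sample_xy():
    """Draw one (x, y) pair: x is uniform over {+1, -1};
    y copies x with probability 0.9 and flips it with probability 0.1."""
    x = random.choice([+1, -1])
    y = x if random.random() < 0.9 else -x
    return x, y

# Empirical joint distribution over (x, y): roughly 0.45, 0.05, 0.05, 0.45
counts = Counter(sample_xy() for _ in range(100_000))
for pair, n in sorted(counts.items()):
    print(pair, n / 100_000)
```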

Illustration of the basic idea of Boltzmann Machine (contd.)
Let
α = output neuron states
β = input neuron states
P(α|β) = observed probability distribution
Q(α|β) = desired probability distribution
Q(β) = probability distribution on the input states β

Illustration of the basic idea of Boltzmann Machine (contd.)
The divergence D (the Kullback–Leibler divergence) is given as:
D = Σ_α Σ_β Q(α|β) Q(β) ln [ Q(α|β) / P(α|β) ]
Using ln z ≥ 1 – 1/z, i.e. ln [ Q(α|β)/P(α|β) ] ≥ 1 – P(α|β)/Q(α|β):
D ≥ Σ_α Σ_β Q(α|β) Q(β) [ 1 – P(α|β)/Q(α|β) ]
  = Σ_α Σ_β Q(α|β) Q(β) – Σ_α Σ_β P(α|β) Q(β)
  = Σ_α Σ_β Q(α,β) – Σ_α Σ_β P(α,β)   {Q(α,β) and P(α,β) are the joint distributions}
  = 1 – 1 = 0
So D ≥ 0, with equality exactly when P(α|β) = Q(α|β) for all α, β.
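As a quick numerical check (a sketch with assumed names, reusing the identity-function example above), D can be computed directly: it is positive for an imperfect machine and 0 when the observed conditional distribution P equals the desired Q.

```python
import math

# Desired conditional Q(y|x) and input marginal Q(x) for the identity-function example
Q_cond = {(+1, +1): 0.9, (-1, +1): 0.1, (+1, -1): 0.1, (-1, -1): 0.9}  # key = (y, x)
Q_in = {+1: 0.5, -1: 0.5}

def divergence(P_cond):
    """D = sum_y sum_x Q(y|x) Q(x) ln[ Q(y|x) / P(y|x) ]"""
    return sum(Q_cond[(y, x)] * Q_in[x] * math.log(Q_cond[(y, x)] / P_cond[(y, x)])
               for y in (+1, -1) for x in (+1, -1))

# An imperfect machine that outputs y = x with probability 0.7
P_imperfect = {(+1, +1): 0.7, (-1, +1): 0.3, (+1, -1): 0.3, (-1, -1): 0.7}
print(divergence(P_imperfect))   # > 0
print(divergence(Q_cond))        # = 0 when P matches Q
```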