ECE 471/571 - Lecture 14 Gradient Descent.

General Approach to Learning

Optimization methods
- Newton's method
- Gradient descent
- Exhaustive search through the solution space

Objective functions
- Maximum a-posteriori probability
- Maximum likelihood estimate
- Fisher's linear discriminant
- Principal component analysis
- k-nearest neighbor
- Perceptron

General Approach to Learning

Specify a model (objective function) and estimate its parameters.
Use optimization methods to find the parameters:
- 1st derivative = 0
- Gradient descent
- Exhaustive search through the solution space
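
As a concrete illustration of these three routes (a minimal MATLAB sketch, not from the original slides; the toy objective J(w) = (w - 3)^2 + 1 and all constants are illustrative choices):

% Minimize J(w) = (w - 3)^2 + 1 three ways, matching the options above.
J  = @(w) (w - 3).^2 + 1;
dJ = @(w) 2*(w - 3);

% 1) Set the 1st derivative to zero: 2*(w - 3) = 0  =>  w = 3 (closed form)
w_closed = 3;

% 2) Gradient descent: w <- w - c * J'(w)
w = 0; c = 0.1;
for k = 1:100
    w = w - c * dJ(w);
end
w_gd = w;

% 3) Exhaustive search over a grid of candidate solutions
grid = -10:0.01:10;
[~, idx] = min(J(grid));
w_grid = grid(idx);

fprintf('closed form: %g, gradient descent: %g, grid search: %g\n', ...
        w_closed, w_gd, w_grid);

All three land near w = 3; the rest of the lecture focuses on the second route.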

Newton-Raphson Method

Used to find solutions to equations, i.e., find x such that f(x) = 0.

Newton-Raphson Method vs. Gradient Descent

Example functions: f(x) = x^2 - 5x - 4, f(x) = x cos(x)

Newton-Raphson method
- Used to find solutions to equations: find x such that f(x) = 0
- The approach
  - Step 1: select an initial x^0
  - Step 2: x^{k+1} = x^k - \frac{f(x^k)}{f'(x^k)}
  - Step 3: if |x^{k+1} - x^k| < \epsilon, then stop; else set x^k = x^{k+1} and go back to Step 2

Gradient descent
- Used to find optima, i.e., solutions to f'(x) = 0: find x* such that f(x*) <= f(x) for x near x*
- The approach
  - Step 1: select an initial x^0
  - Step 2: x^{k+1} = x^k - \frac{f'(x^k)}{f''(x^k)} = x^k - c f'(x^k)
  - Step 3: if |x^{k+1} - x^k| < \epsilon, then stop; else set x^k = x^{k+1} and go back to Step 2
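
A hedged MATLAB sketch of the two update rules above; the starting point, tolerance, iteration caps, and learning rate are illustrative choices, not values from the slides. The quadratic f(x) = x^2 - 5x - 4 from the example list is used.

f  = @(x) x.^2 - 5*x - 4;        % root finding: solve f(x) = 0
df = @(x) 2*x - 5;               % f'(x); for this f, f''(x) = 2

% Newton-Raphson: x^{k+1} = x^k - f(x^k)/f'(x^k)
x = 10;                           % initial guess x^0
for k = 1:50
    x_new = x - f(x)/df(x);
    if abs(x_new - x) < 1e-6, break; end
    x = x_new;
end
root = x_new;                     % approaches (5 + sqrt(41))/2, about 5.70

% Gradient descent: x^{k+1} = x^k - c*f'(x^k), driving f'(x) toward 0
x = 10; c = 0.1;                  % fixed learning rate c in place of 1/f''(x)
for k = 1:200
    x_new = x - c*df(x);
    if abs(x_new - x) < 1e-6, break; end
    x = x_new;
end
minimizer = x_new;                % approaches 2.5, the minimum of f

fprintf('root: %g   minimizer: %g\n', root, minimizer);

Note the structural similarity: the same iterate-and-check loop, but one drives f(x) to zero while the other drives f'(x) to zero.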

On the Learning Rate
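
To see why the choice of c matters, here is a small illustration (the function f(x) = x^2 and the specific rates are chosen here, not taken from the slides): each gradient-descent step gives x <- (1 - 2c)x, so the iterates shrink only when |1 - 2c| < 1.

df = @(x) 2*x;                   % f(x) = x^2, f'(x) = 2x
for c = [0.05 0.45 1.1]          % small, moderate, too large
    x = 5;                        % same starting point each time
    for k = 1:20
        x = x - c * df(x);
    end
    fprintf('c = %.2f  ->  x after 20 steps: %g\n', c, x);
end
% c = 0.05 shrinks x slowly, c = 0.45 converges quickly,
% c = 1.10 makes |1 - 2c| > 1, so the iterates diverge.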

Geometric Interpretation

[Figure: curve with tangent lines at successive iterates x^0 and x^1; the gradient (slope) of the tangent at each point determines the direction and size of the next step.]
Reference: http://www.teacherschoice.com.au/Maths_Library/Calculus/tangents_and_normals.htm

% [x] = gd(x0, c, epsilon)
% - Demonstrate gradient descent
% - x0: the initial x
% - c: the learning rate
% - epsilon: controls the accuracy of the solution
% - x: an array that tracks x in each iteration
%
% Note: try c = 0.1, 0.2, 1; epsilon = 0.01, 0.1
function [x] = gd(x0, c, epsilon)

% plot the curve
clf;
fx = -10:0.1:10;
[fy, dfy] = g(fx);
plot(fx, fy);
hold on;

% find the minimum
x(1) = x0;
finish = 0;
i = 1;
while ~finish
    [y, dy] = g(x(i));
    plot(x(i), y, 'r*');
    pause
    x(i+1) = x(i) - c * dy;          % gradient descent update
    if abs(x(i+1) - x(i)) < epsilon
        finish = 1;
    end
    i = i + 1;
end
x(i)                                  % display the final estimate

% plot the trial points
[y, dy] = g(x);
for i = 1:length(x)
    plot(x(i), y(i), 'r*');
end

% objective function and its derivative
function [y, dy] = g(x)
y = 50*sin(x) + x .* x;
dy = 50*cos(x) + 2*x;
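
Example invocation from the MATLAB prompt, following the suggestions in the header comment (the starting point 9 is an arbitrary choice, not a value from the slides); press a key after each iteration to step through the descent:

x = gd(9, 0.1, 0.01);    % small learning rate: slow but steady descent
x = gd(9, 1, 0.01);      % large learning rate: watch for overshooting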