Question about Gradient Descent (Hung-yi Lee)

Presentation transcript:

Question about Gradient Descent (Hung-yi Lee)

Larger gradient, larger steps? Best step: for a quadratic $y = ax^2 + bx + c$ (with $a > 0$), the minimum sits at $x = -\frac{b}{2a}$, so the best step from a point $x_0$ is $\left|x_0 + \frac{b}{2a}\right| = \frac{|2ax_0 + b|}{2a}$, and the numerator $|2ax_0 + b|$ is exactly $|y'(x_0)|$.
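A minimal numeric check of this claim (the coefficients and starting point below are my own example, not from the slides): stepping from $x_0$ by $\frac{|2ax_0 + b|}{2a}$ against the sign of the gradient lands exactly on the minimum.

```python
# Hypothetical coefficients and starting point, chosen for illustration.
a, b, c = 2.0, -8.0, 3.0      # y = 2x^2 - 8x + 3, minimum at x = -b/(2a) = 2
x0 = 5.0

first = 2 * a * x0 + b        # y'(x0) = 12.0
second = 2 * a                # y''(x0) = 4.0

best_step = abs(first) / second                   # |first derivative| / second derivative
x1 = x0 - (1 if first > 0 else -1) * best_step    # move against the gradient

print(x1, -b / (2 * a))       # 2.0 2.0 -- one step reaches the minimum exactly
```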

Contradiction. In original gradient descent, a larger gradient means a larger step: $w^{t+1} = w^t - \eta\, g^t$. But Adagrad and RMSprop also divide by an accumulation of past first derivatives, e.g. Adagrad: $w^{t+1} = w^t - \frac{\eta}{\sqrt{\sum_{i=0}^{t}(g^i)^2}}\, g^t$. The gradient in the numerator says "larger gradient, larger step", while the same gradients in the denominator say the opposite.
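To make the contradiction concrete, here is a minimal sketch of the two update rules (function and variable names are mine; `eps` is the usual small constant for numerical stability, not from the slides):

```python
import math

def vanilla_step(w, grad, lr=0.1):
    # larger gradient -> larger step: grad appears only in the numerator
    return w - lr * grad

def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
    # accum is the running sum of squared past gradients
    accum += grad ** 2
    # grad in the numerator pulls toward a larger step,
    # sqrt(accum) in the denominator pulls toward a smaller one
    return w - lr * grad / (math.sqrt(accum) + eps), accum
```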

Second Derivative. Best step: $\frac{|2ax_0 + b|}{2a} = \frac{|y'(x_0)|}{y''(x_0)}$, i.e. the best step is $\frac{|\text{First derivative}|}{\text{Second derivative}}$.
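The slide states the result without derivation; a standard way to see it (not spelled out on the slide) is to minimize the second-order Taylor approximation of the loss around $x_0$:

```latex
% Second-order Taylor expansion of the loss around x_0:
%   f(x_0 + \Delta) \approx f(x_0) + f'(x_0)\Delta + (1/2) f''(x_0)\Delta^2
% Minimizing over \Delta gives the best step:
\begin{align*}
\frac{d}{d\Delta}\left[ f'(x_0)\,\Delta + \tfrac{1}{2} f''(x_0)\,\Delta^2 \right]
  &= f'(x_0) + f''(x_0)\,\Delta = 0 \\
\Delta^* &= -\frac{f'(x_0)}{f''(x_0)},
\qquad |\Delta^*| = \frac{|f'(x_0)|}{f''(x_0)}
\end{align*}
```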

More than one parameter: "larger $|$first derivative$|$ means farther from the minimum" only holds when comparing points on the same parameter. Along $w_1$ (smaller second derivative), point a is farther from the minimum than point b and has the larger first derivative (a > b); along $w_2$ (larger second derivative), likewise c > d. Comparing across parameters, however, c has a larger first derivative than a yet is closer to its minimum (c < a), so the first derivative alone misleads; the best step $\frac{|\text{First derivative}|}{\text{Second derivative}}$ takes the curvature into account.
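A hypothetical two-parameter illustration of this failure (the curvatures and points below are my own numbers, not the slide's): treat each parameter as an independent quadratic centered at zero, so the distance to the minimum along each axis is just $|w|$.

```python
# Two independent quadratic directions: f1(w) = a1*w^2, f2(w) = a2*w^2.
a1, a2 = 0.5, 5.0            # second derivatives: 2*a1 = 1 (flat), 2*a2 = 10 (steep)

w_a = 4.0                    # point a on the w1 axis
w_c = 1.0                    # point c on the w2 axis

g_a = 2 * a1 * w_a           # |first derivative| at a: 4.0
g_c = 2 * a2 * w_c           # |first derivative| at c: 10.0

# c has the larger first derivative, yet a is farther from its minimum:
print(g_c > g_a, abs(w_a) > abs(w_c))    # True True

# dividing by each second derivative recovers the true distances:
print(g_a / (2 * a1), g_c / (2 * a2))    # 4.0 1.0
```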

What to do with Adagrad and RMSprop? The best step is $\frac{|\text{First derivative}|}{\text{Second derivative}}$, but computing second derivatives adds cost, so use the first derivative to estimate the second: where the second derivative is larger, first derivatives sampled over the parameter's range are typically larger as well, and where it is smaller they are typically smaller. The root of the summed squared past gradients in the Adagrad and RMSprop denominators therefore serves as a cheap proxy for the second derivative.
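A small sketch of why this estimate works (the sampling setup is my illustration, not from the slides): for a quadratic $f(w) = aw^2$, first derivatives at random points are $2aw$, so their root mean square scales directly with the second derivative $2a$, which is exactly the quantity the Adagrad denominator tracks.

```python
import math, random

random.seed(0)

def rms_of_gradients(a, n=10_000):
    # sample first derivatives 2*a*w at random points w in [-1, 1]
    grads = (2 * a * random.uniform(-1.0, 1.0) for _ in range(n))
    return math.sqrt(sum(g * g for g in grads) / n)

print(rms_of_gradients(0.5))   # ~0.58: small second derivative, small RMS
print(rms_of_gradients(5.0))   # ~5.8: 10x the curvature, ~10x the RMS
```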

Acknowledgement: this question was raised by 李廣和.

Thanks for your attention!