
1 CSC 4510 – Machine Learning, 4: Regression (continued). Dr. Mary-Angela Papalaskari, Department of Computing Sciences, Villanova University. Course website: www.csc.villanova.edu/~map/4510/ The slides in this presentation are adapted from the Stanford online ML course, http://www.ml-class.org/

2 Last time: Introduction to linear regression. Intuition: least squares approximation. Intuition: gradient descent algorithm. Hands on: simple example using Excel.

3 Today: How to apply gradient descent to minimize the cost function for regression. Linear algebra refresher.

4 Reminder: sample problem. Housing prices (Portland, OR): scatter plot of price (in 1000s of dollars) vs. size (feet²).

5 Reminder: notation. m = number of training examples; x = “input” variable / features; y = “output” variable / “target” variable.

Training set of housing prices (Portland, OR):

Size in feet² (x)   Price ($) in 1000s (y)
2104                460
1416                232
1534                315
852                 178
...                 ...

6 Reminder: learning algorithm for hypothesis function h. Training set → learning algorithm → h; h maps the size of a house to an estimated price. Linear hypothesis: h_θ(x) = θ₀ + θ₁x (univariate linear regression).

8 Gradient descent algorithm, applied to the linear regression model.

Linear regression model: h_θ(x) = θ₀ + θ₁x, with cost function J(θ₀, θ₁) = (1/2m) Σᵢ (h_θ(x^(i)) − y^(i))².

Gradient descent: repeat until convergence { θⱼ := θⱼ − α · ∂/∂θⱼ J(θ₀, θ₁), for j = 0 and j = 1 }.

9 Today: How to apply gradient descent to minimize the cost function for regression: 1. a closer look at the cost function; 2. applying gradient descent to find the minimum of the cost function. Linear algebra refresher.

10 Hypothesis: h_θ(x) = θ₀ + θ₁x. Parameters: θ₀, θ₁. Cost function: J(θ₀, θ₁) = (1/2m) Σᵢ₌₁…m (h_θ(x^(i)) − y^(i))². Goal: minimize J(θ₀, θ₁) over θ₀, θ₁.
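The cost function on this slide can be sketched in plain Python, using the four-example training set from slide 5:

```python
# Squared-error cost J(theta0, theta1) for univariate linear regression,
# on the training set from slide 5 (size in ft^2, price in $1000s).
xs = [2104, 1416, 1534, 852]
ys = [460, 232, 315, 178]

def h(theta0, theta1, x):
    # Hypothesis: h_theta(x) = theta0 + theta1 * x
    return theta0 + theta1 * x

def cost(theta0, theta1, xs, ys):
    # J(theta0, theta1) = (1 / 2m) * sum_i (h(x_i) - y_i)^2
    m = len(xs)
    return sum((h(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

print(cost(0, 0, xs, ys))  # 49541.625 -- the all-zero hypothesis is a poor fit
```

Any parameters that track the data at all (e.g. θ₁ = 0.2) give a much smaller J, which is what gradient descent will exploit.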

11 Simplified (set θ₀ = 0): Hypothesis: h_θ(x) = θ₁x. Parameter: θ₁. Cost function: J(θ₁) = (1/2m) Σᵢ (θ₁x^(i) − y^(i))². Goal: minimize J(θ₁) over θ₁.

12 Plots (θ₀ = 0, θ₁ = 1): left, h_θ(x) = x over the data (for fixed θ₁ this is a function of x); right, the corresponding value of J(θ₁) (a function of the parameter θ₁).

13 Plots (θ₀ = 0, θ₁ = 0.5): left, h_θ(x) = 0.5x over the data; right, J(0.5) marked on the J(θ₁) curve.

14 Plots (θ₀ = 0, θ₁ = 0): left, h_θ(x) = 0; right, J(0) marked on the J(θ₁) curve.
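The three J(θ₁) values on slides 12–14 can be reproduced numerically. The dataset (1, 1), (2, 2), (3, 3) below is an assumption (the plot data itself is not recoverable from the transcript), chosen so that θ₁ = 1 fits exactly:

```python
# J(theta1) for the simplified hypothesis h(x) = theta1 * x (theta0 = 0),
# evaluated at the three slopes shown on slides 12-14.
# The dataset (1,1), (2,2), (3,3) is an assumed stand-in for the plot data.
xs = [1, 2, 3]
ys = [1, 2, 3]

def cost_simplified(theta1):
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

for theta1 in (1.0, 0.5, 0.0):
    print(theta1, cost_simplified(theta1))
# theta1 = 1.0 fits this data exactly, so J(1.0) = 0;
# J(0.5) and J(0.0) are progressively worse.
```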

15 What if θ₀ ≠ 0? Back to the full problem. Hypothesis: h_θ(x) = θ₀ + θ₁x. Parameters: θ₀, θ₁. Cost function: J(θ₀, θ₁) = (1/2m) Σᵢ (h_θ(x^(i)) − y^(i))². Goal: minimize J(θ₀, θ₁) over θ₀, θ₁.

16 Plot: h_θ(x) = 10 + 0.1x over the housing data, price ($) in 1000s vs. size in feet² (for fixed θ₀, θ₁ this is a function of x); alongside, J(θ₀, θ₁) (a function of the parameters θ₀, θ₁).

17 Plot: the cost J(θ₀, θ₁) as a 3-D surface over the (θ₀, θ₁) plane.

18–21 Plots: contour plot of J(θ₀, θ₁) (a function of the parameters θ₀, θ₁) next to the hypothesis line for several choices of (θ₀, θ₁) (for fixed θ₀, θ₁ this is a function of x); points closer to the contour minimum correspond to lines that fit the data better.

22 Today: How to apply gradient descent to minimize the cost function for regression: 1. a closer look at the cost function; 2. applying gradient descent to find the minimum of the cost function. Linear algebra refresher.

23 Have some function J(θ₀, θ₁); want min over θ₀, θ₁ of J(θ₀, θ₁). Gradient descent algorithm outline: start with some (θ₀, θ₁) (e.g. θ₀ = 0, θ₁ = 0); keep changing θ₀ and θ₁ to reduce J(θ₀, θ₁) until we hopefully end up at a minimum.

24 Have some function J(θ₀, θ₁); want min over θ₀, θ₁ of J(θ₀, θ₁). Gradient descent algorithm: repeat until convergence { θⱼ := θⱼ − α · ∂/∂θⱼ J(θ₀, θ₁), simultaneously for j = 0 and j = 1 }.

25 In the update rule, α is the learning rate (it controls the step size) and ∂/∂θⱼ J(θ₀, θ₁) is the derivative term (it gives the step direction).

26 If α is too small, gradient descent can be slow. If α is too large, gradient descent can overshoot the minimum: it may fail to converge, or even diverge.
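Both failure modes are easy to see on a one-dimensional example. This sketch (not from the slides) runs gradient descent on J(θ) = θ², whose derivative is 2θ and whose minimum is at 0:

```python
# Effect of the learning rate alpha on gradient descent for J(theta) = theta^2,
# starting from theta = 1. Update: theta := theta - alpha * J'(theta).
def descend(alpha, steps=20, theta=1.0):
    for _ in range(steps):
        theta = theta - alpha * 2 * theta
    return theta

print(descend(0.01))  # too small: after 20 steps, still far from the minimum at 0
print(descend(0.4))   # reasonable: essentially at 0
print(descend(1.1))   # too large: |theta| grows each step -- divergence
```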

27 At a local minimum the derivative term is zero, so the update θ₁ := θ₁ − α · 0 leaves the current value of θ₁ unchanged: gradient descent stops moving.

28 Gradient descent can converge to a local minimum even with the learning rate α fixed: as we approach a minimum, the derivative term shrinks, so the steps automatically become smaller.

29 Applying gradient descent to the linear regression model: computing the partial derivatives of the squared-error cost gives ∂/∂θ₀ J(θ₀, θ₁) = (1/m) Σᵢ (h_θ(x^(i)) − y^(i)) and ∂/∂θ₁ J(θ₀, θ₁) = (1/m) Σᵢ (h_θ(x^(i)) − y^(i)) · x^(i).

30 Gradient descent algorithm for linear regression: repeat { θ₀ := θ₀ − α (1/m) Σᵢ (h_θ(x^(i)) − y^(i)); θ₁ := θ₁ − α (1/m) Σᵢ (h_θ(x^(i)) − y^(i)) · x^(i) }, updating θ₀ and θ₁ simultaneously.
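The update rules above can be sketched directly in Python, on the four-example training set from slide 5. The learning rate and iteration count are illustrative choices, not values from the course:

```python
# Batch gradient descent for univariate linear regression, with the
# simultaneous update of theta0 and theta1 from slide 30.
xs = [2104, 1416, 1534, 852]
ys = [460, 232, 315, 178]

def gradient_step(theta0, theta1, alpha):
    m = len(xs)
    errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    grad0 = sum(errors) / m                            # dJ/dtheta0
    grad1 = sum(e * x for e, x in zip(errors, xs)) / m  # dJ/dtheta1
    # Simultaneous update: both gradients use the OLD theta values.
    return theta0 - alpha * grad0, theta1 - alpha * grad1

theta0, theta1 = 0.0, 0.0
for _ in range(1000):
    theta0, theta1 = gradient_step(theta0, theta1, alpha=1e-7)
print(theta0, theta1)  # theta1 settles near the ~0.2 slope of this dataset
```

The tiny α is needed because the unscaled sizes (hundreds to thousands of ft²) make the θ₁ gradient very large; feature scaling would allow a larger learning rate.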

31–33 Plots: the surface and contour plots of J(θ₀, θ₁), with θ₀ and θ₁ on the horizontal axes, showing the path gradient descent takes toward the minimum.

34–42 Plots: a sequence of gradient descent iterations. Each frame pairs the current hypothesis line over the housing data (for fixed θ₀, θ₁, a function of x) with the current position on the contour plot of J(θ₀, θ₁) (a function of the parameters θ₀, θ₁); the line fits the data better as the trajectory approaches the minimum.

43 “Batch” gradient descent. “Batch”: each step of gradient descent uses all the training examples. Alternative: process part of the dataset for each step of the algorithm. The slides in this presentation are adapted from the Stanford online ML course, http://www.ml-class.org/
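The “alternative” the slide mentions can be sketched as a mini-batch variant: each step estimates the gradient from a random subset of the examples instead of all of them. This is an illustrative sketch, not an implementation from the course:

```python
import random

# Mini-batch gradient descent: each step uses only `batch_size` randomly
# chosen training examples, so a step is cheaper than a full batch step.
xs = [2104, 1416, 1534, 852]
ys = [460, 232, 315, 178]

def minibatch_step(theta0, theta1, alpha, batch_size=2, rng=random):
    batch = rng.sample(list(zip(xs, ys)), batch_size)
    b = len(batch)
    errors = [(theta0 + theta1 * x - y, x) for x, y in batch]
    grad0 = sum(e for e, _ in errors) / b
    grad1 = sum(e * x for e, x in errors) / b
    return theta0 - alpha * grad0, theta1 - alpha * grad1

random.seed(0)  # deterministic batches for reproducibility
theta0, theta1 = 0.0, 0.0
for _ in range(2000):
    theta0, theta1 = minibatch_step(theta0, theta1, alpha=1e-7)
print(theta0, theta1)  # noisier than batch descent, but ends near the same slope
```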

44 What’s next? We are not in univariate regression anymore:

Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
2104           5                    1                    45                    460
1416           3                    2                    40                    232
1534           3                    2                    30                    315
852            2                    1                    36                    178

46 Today: How to apply gradient descent to minimize the cost function for regression: 1. a closer look at the cost function; 2. applying gradient descent to find the minimum of the cost function. Linear algebra refresher.

47 Linear Algebra Review

48 Matrix: a rectangular array of numbers. Dimension of a matrix: number of rows × number of columns, e.g. 4 × 2. Matrix elements (entries): the “i, j entry” is the one in the i-th row and j-th column.

49 Another example: representing communication links in a network of nodes a–e. Adjacency matrix: entry (i, j) gives the weight of the link between node i and node j (0 = no link). Two example networks with their adjacency matrices (the first matrix is symmetric; the second is not):

   a b c d e        a b c d e
a  0 1 2 0 3     a  0 1 0 0 2
b  1 0 0 0 0     b  0 1 0 0 0
c  2 0 0 1 1     c  1 0 0 1 0
d  0 0 1 0 1     d  0 0 1 0 1
e  3 0 1 1 0     e  0 0 0 0 0
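The first adjacency matrix from this slide translates directly to nested lists; the `neighbors` helper below is an illustrative addition, not from the slides:

```python
# The symmetric adjacency matrix from slide 49 as nested lists:
# adj[i][j] is the weight of the link between node i and node j (0 = no link).
nodes = ["a", "b", "c", "d", "e"]
adj = [
    [0, 1, 2, 0, 3],  # a
    [1, 0, 0, 0, 0],  # b
    [2, 0, 0, 1, 1],  # c
    [0, 0, 1, 0, 1],  # d
    [3, 0, 1, 1, 0],  # e
]

def neighbors(node):
    # Nodes with a nonzero entry in this node's row are directly linked to it.
    i = nodes.index(node)
    return [nodes[j] for j, w in enumerate(adj[i]) if w != 0]

print(neighbors("a"))  # ['b', 'c', 'e']
```

Symmetry (adj[i][j] == adj[j][i]) is what makes this matrix describe undirected links.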

50 Vector: an n × 1 matrix, also called an n-dimensional vector. The i-th element of a vector y is written yᵢ.

51 Vectors may be 1-indexed (elements y₁ … yₙ, the usual math convention) or 0-indexed (elements y₀ … yₙ₋₁, the usual programming convention).

52 Matrix addition: add the corresponding entries; the two matrices must have the same dimensions.

53 Scalar multiplication: multiply every entry of the matrix by the scalar.

54 Combination of operands: the two operations can be mixed, e.g. 3A + B.
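Slides 52–54 can be sketched with plain nested lists; the example matrices are illustrative, not the slides’ own:

```python
# Matrix addition and scalar multiplication on nested lists.
def mat_add(A, B):
    # Entrywise sum; A and B must have the same dimensions.
    assert len(A) == len(B) and len(A[0]) == len(B[0])
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(c, A):
    # Multiply every entry by the scalar c.
    return [[c * a for a in row] for row in A]

A = [[1, 0], [2, 5], [3, 1]]
B = [[4, 0.5], [2, 5], [0, 1]]
print(mat_add(A, B))       # [[5, 0.5], [4, 10], [3, 2]]
print(scalar_mul(3, A))    # [[3, 0], [6, 15], [9, 3]]
# A combination of operands, e.g. 3A + B:
print(mat_add(scalar_mul(3, A), B))
```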

55 Matrix-vector multiplication.

56 Details: an m × n matrix A (m rows, n columns) times an n × 1 matrix x (an n-dimensional vector) gives an m-dimensional vector y. To get yᵢ, multiply A’s i-th row element-wise with the vector x and add the products up.
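The row-by-row rule on slide 56 translates directly to code; the 3 × 2 example matrix is illustrative:

```python
# Matrix-vector product: an m x n matrix A times an n-vector x gives an
# m-vector y, where y_i is the dot product of A's i-th row with x.
def mat_vec(A, x):
    assert all(len(row) == len(x) for row in A)
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 3], [4, 0], [2, 1]]  # 3 x 2 matrix
x = [1, 5]                    # 2-dimensional vector
print(mat_vec(A, x))          # [16, 4, 7]
```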

57 Example: a worked matrix-vector product.

58 House sizes: to evaluate h_θ(x) = θ₀ + θ₁x on every house at once, form the m × 2 matrix whose rows are [1, x^(i)] and multiply it by the vector [θ₀; θ₁]; the result is the m-vector of predicted prices.
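A sketch of slide 58’s trick; the parameter values below are illustrative (the slide’s own θ values are not preserved in the transcript):

```python
# Predicting all house prices with one matrix-vector product: prepend a 1 to
# each size to form the design matrix, then multiply by [theta0, theta1].
sizes = [2104, 1416, 1534, 852]
design = [[1, s] for s in sizes]   # m x 2 matrix, rows [1, x_i]
theta = [-40, 0.25]                # hypothetical [theta0, theta1]

def mat_vec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

predictions = mat_vec(design, theta)
print(predictions)  # h(x) = -40 + 0.25 * size for each house
```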

59 Example: matrix-matrix multiplication.

60 Details: an m × k matrix A (m rows, k columns) times a k × n matrix B (k rows, n columns) gives an m × n matrix C. The i-th column of C is obtained by multiplying A with the i-th column of B (for i = 1, 2, …, n).

61 Example: a worked matrix-matrix product.

62 House sizes: given three competing hypotheses h_θ(x) = θ₀ + θ₁x (three different parameter vectors), evaluate all of them on every house at once by multiplying the m × 2 data matrix by the 2 × 3 matrix whose columns are the three parameter vectors.
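A sketch of slide 62’s construction; all three sets of θ values are illustrative, not the slide’s:

```python
# Evaluating three competing hypotheses on all houses at once: multiply the
# m x 2 design matrix by a 2 x 3 matrix whose columns are the parameter vectors.
def mat_mul(A, B):
    # C[i][j] = dot product of A's i-th row and B's j-th column.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

sizes = [2104, 1416, 1534, 852]
design = [[1, s] for s in sizes]       # m x 2
thetas = [[-40, 200, -150],            # row of hypothetical theta0 values
          [0.25, 0.1, 0.4]]            # row of hypothetical theta1 values

predictions = mat_mul(design, thetas)  # m x 3: one column per hypothesis
print(predictions[0])  # the first house's price under each hypothesis
```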

63 Let A and B be matrices. Then in general A × B ≠ B × A (matrix multiplication is not commutative).
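A quick check with two illustrative 2 × 2 matrices:

```python
# Matrix multiplication is not commutative: AB != BA in general.
def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 1], [0, 0]]
B = [[0, 0], [2, 0]]
print(mat_mul(A, B))  # [[2, 0], [0, 0]]
print(mat_mul(B, A))  # [[0, 0], [2, 2]]
```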

64 Example: let A and B be specific matrices; compute A × B and B × A to verify that they differ. (Matrix multiplication is associative, however: (A × B) × C = A × (B × C).)

65 Identity matrix, denoted I (or Iₙₓₙ or Iₙ): the square matrix with 1s on the diagonal and 0s elsewhere. For any matrix A, A · I = I · A = A. Examples of identity matrices: 2 × 2, 3 × 3, 4 × 4.

66 Matrix inverse, A⁻¹: if A is an m × m (square) matrix and it has an inverse, then A · A⁻¹ = A⁻¹ · A = I. Matrices that don’t have an inverse are called “singular” or “degenerate”.

67 Matrix transpose: let A be an m × n matrix, and let B = Aᵀ. Then B is an n × m matrix with Bᵢⱼ = Aⱼᵢ; the rows of A become the columns of Aᵀ.
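Slides 65–67 can be checked together in code. The example matrices are illustrative; the 2 × 2 inverse uses the standard determinant formula, which is an addition beyond what the slides show:

```python
# Identity, inverse, and transpose for small matrices as nested lists.
def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    # B = A^T: B[i][j] = A[j][i]; an m x n matrix becomes n x m.
    return [list(col) for col in zip(*A)]

def inverse_2x2(A):
    # Standard 2x2 formula: the inverse exists only when det = ad - bc != 0;
    # otherwise A is "singular" ("degenerate").
    (a, b), (c, d) = A
    det = a * d - b * c
    assert det != 0, "singular matrix"
    return [[d / det, -b / det], [-c / det, a / det]]

I2 = [[1, 0], [0, 1]]
A = [[3, 4], [2, 16]]
print(mat_mul(A, inverse_2x2(A)))         # the 2 x 2 identity (up to rounding)
print(transpose([[1, 2, 0], [3, 5, 9]]))  # [[1, 3], [2, 5], [0, 9]]
```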

68 What’s next? We are not in univariate regression anymore: the housing data has multiple features (size, number of bedrooms, number of floors, age of home) predicting price.

