CSE 4705 Artificial Intelligence

1 CSE 4705 Artificial Intelligence
Jinbo Bi Department of Computer Science & Engineering

2 Machine learning (1) Supervised learning algorithms

3 Supervised learning: definition
Given a collection of examples (the training set), where each example contains a set of attributes (independent variables), one of which is the target (dependent variable), find a model that predicts the target as a function of the values of the other attributes. Goal: previously unseen examples should be predicted as accurately as possible. A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets; the training set is used to build the model and the test set to validate it.
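The split described above can be sketched in a few lines of NumPy; the 80/20 ratio and the toy sine data are illustrative choices, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 10 examples, 1 attribute, 1 target.
X = np.linspace(0, 1, 10).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel()

# Shuffle the example indices, then split 80% train / 20% test.
idx = rng.permutation(len(X))
n_train = int(0.8 * len(X))
train_idx, test_idx = idx[:n_train], idx[n_train:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
```

Shuffling before splitting matters: if the data are ordered (e.g. by x), a contiguous split would give a test set unlike the training set.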

4 Supervised learning: classification
When the dependent variable is categorical, it is a classification problem.

5 Supervised learning: regression
When the dependent variable is continuous, it is a regression problem.

6 Practice A very simple example
We observe the following data; how can we find a function f that best approximates y = f(x)? This dataset was created using y = sin(2πx) + N(0, σ), where σ = 0.1.
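A minimal sketch of the simulation described on this slide, assuming the x values are drawn uniformly from [0, 1] (the slides later say x is evenly distributed on [0, 1]) and N = 10 points:

```python
import numpy as np

rng = np.random.default_rng(42)
N, sigma = 10, 0.1

x = rng.uniform(0, 1, N)                              # inputs on [0, 1]
y = np.sin(2 * np.pi * x) + rng.normal(0, sigma, N)   # target plus Gaussian noise
```

The learning task is then to recover something close to sin(2πx) from the noisy pairs (x, y) alone.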

7 Simple regression algorithms
Least squares methods: linear regression; polynomial regression with least squares. Very important practical issues: overfitting and underfitting. Regularized linear regression: ridge regression; the least absolute shrinkage and selection operator (LASSO).

8 Least squares We wish to use some real-valued input variables x to predict the value of a target y We collect training data of pairs (xi,yi), i=1,…N Suppose we have a model f that maps each x example to a value of y’ Sum of squares function: Sum of the squares of the deviation between the observed target value y and the predicted value y’

9 Least squares Find a function f such that the sum of squares is minimized For example, your function is in the form of linear functions f (x) = wTx Least squares with a linear function of parameters w is called “linear regression”

10 Linear regression Linear regression has a closed-form solution for w.
The minimum is attained at the zero derivative: setting the derivative of the sum of squares to zero gives w = (XᵀX)⁻¹Xᵀy, where X stacks the training inputs as rows and y collects the targets.

11 Linear regression A simple example where we observe three data points
(x1, y1), (x2, y2) and (x3, y3), where each x is a vector of 2 entries.
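A minimal numeric sketch of this setup; the particular points are illustrative, not from the slides. Each x has 2 entries (here a constant 1 playing the role of an intercept feature, plus one input), and w is found from the normal equations w = (XᵀX)⁻¹Xᵀy:

```python
import numpy as np

# Three examples, each x a vector of 2 entries; targets follow y = 1 + 2*x exactly.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

# Closed-form least squares: solve (X^T X) w = X^T y.
w = np.linalg.solve(X.T @ X, X.T @ y)
```

Because the targets here lie exactly on a line, the recovered w reproduces the generating coefficients (1 and 2).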

12 Polynomial fitting with least squares
For example, suppose our function is no longer linear but a polynomial of x. Assume we have one independent variable x, and we are building a polynomial of order M to approximate y: f(x, w) = w₀ + w₁x + w₂x² + … + w_M x^M.

13 Polynomial fitting with least squares
Just as least squares has a closed-form solution for linear regression, we have the same closed-form solution when the function is a polynomial: the powers 1, x, x², …, x^M act as features, so the model is still linear in the coefficients w.
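A sketch of this reduction, assuming a cubic (M = 3) and a few points on the sine curve: build the matrix of powers 1, x, …, x^M and solve the resulting linear least-squares problem.

```python
import numpy as np

M = 3                                      # polynomial order
x = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y = np.sin(2 * np.pi * x)                  # noise-free targets for illustration

# Columns 1, x, x^2, ..., x^M: polynomial fitting is linear regression in w.
Phi = np.vander(x, M + 1, increasing=True)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
y_hat = Phi @ w
```

For these five symmetric points a cubic happens to pass through all of them exactly, so the fitted values match the targets; with noisy data the least-squares fit would instead smooth over the points.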

14 Supervised learning – practical issues
Underfitting and overfitting. Before introducing these important concepts, let us study the behavior of a simple regression algorithm: polynomial regression.

15 Polynomial curve fitting
x is evenly distributed on [0, 1]; y = f(x) + random error, i.e. y = sin(2πx) + ε, ε ~ N(0, σ).

16 Polynomial curve fitting

17 Polynomial curve fitting

18 Sum-of-Squares error function

19 0th order polynomial

20 1st order polynomial

21 3rd order polynomial

22 9th order polynomial

23 Over-fitting Root-Mean-Square (RMS) error: E_RMS = √(2E(w*)/N).
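Dividing by N before taking the square root lets us compare training and test sets of different sizes. A sketch, assuming the convention E_RMS = √(2E(w*)/N) with E(w) = ½ Σ (y_i − f(x_i))², which reduces to the ordinary root-mean-square of the residuals:

```python
import numpy as np

def rms_error(y_true, y_pred):
    # E_RMS = sqrt(2 E(w*) / N) with E(w*) = 0.5 * sum of squared residuals,
    # i.e. the root of the mean squared residual.
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

y_true = np.array([0.0, 1.0, 0.0, -1.0])
y_pred = np.array([0.1, 0.9, 0.1, -0.9])
err = rms_error(y_true, y_pred)
```

Here every residual has magnitude 0.1, so the RMS error is 0.1 regardless of N.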

24 Over-fitting If we create 100 more points using the same simulation procedure, we can measure the Root-Mean-Square (RMS) error on this new data as well.

25 Polynomial coefficients

26 Data set size 9th Order Polynomial

27 Data set size 9th Order Polynomial

28 Regularization Penalize large coefficient values: ridge regression adds a penalty λ‖w‖² on the squared norm of the coefficients to the sum-of-squares error.

29 Regularization Let us fit a polynomial of order 9 again, but this time using regularization.

30 Regularization

31 Regularization vs.

32 Polynomial coefficients

33 Ridge regression Derive the analytic solution to the optimization problem for ridge regression. Again, setting the first-order derivative to 0 gives w = (XᵀX + λI)⁻¹Xᵀy.
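A sketch of this closed form on the order-9 polynomial fit from the earlier slides; the data generation and the value of λ are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 10)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 10)

M, lam = 9, 1e-3                          # order-9 polynomial, penalty lambda
Phi = np.vander(x, M + 1, increasing=True)

# Ridge closed form: w = (Phi^T Phi + lambda*I)^{-1} Phi^T y
w_ridge = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ y)
```

Compared with the unregularized least-squares fit (which with 10 points and 10 coefficients interpolates the noise), the ridge solution has a much smaller coefficient norm, which is exactly the shrinkage effect the slides illustrate.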

34 Regularization Penalize large coefficient values in another way: LASSO (the least absolute shrinkage and selection operator) penalizes the sum of absolute values of the coefficients, λ‖w‖₁. There is no closed-form solution, so we need to run an iterative algorithm.
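The slides do not name a specific iterative algorithm; one standard choice is proximal gradient descent (ISTA), sketched below on a toy problem where only the first of five features matters. The step size, penalty strength, and data are illustrative assumptions:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1: shrink each entry toward 0 by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=5000):
    """Minimize 0.5*||y - Xw||^2 + lam*||w||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)           # gradient of the smooth part
        w = soft_threshold(w - grad / L, lam / L)
    return w

# Toy problem: y depends (almost) only on the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = 3.0 * X[:, 0] + rng.normal(0, 0.1, 50)
w = lasso_ista(X, y, lam=5.0)
```

The L1 penalty drives the irrelevant coefficients to (near) zero while keeping the relevant one close to its true value, which is the "selection" part of the operator's name.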

35 Questions?

