Linear Regression review


Linear Regression review

Videos:
- http://youtu.be/GAmzwIkGFgE
- http://youtu.be/ocGEhiLwDVc
- http://youtu.be/qPga0OBV-O8
- http://youtu.be/MwokVxy5tvg

Search and LR

- LR minimizes the sum of the squared errors between the regression line and the data points.
- LR finds the values of A and B in y = Ax + B that minimize the sum of squared errors.
- Are there other ways of "finding" A and B? Yes.
- Do they guarantee minimizing the sum of squared errors?
- Suppose the relationship is not linear?
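To make this concrete, here is a minimal pure-Python sketch of LR's one-step solution, using the standard closed-form least-squares formulas; the data points are made up for illustration:

    x = [1.0, 2.0, 3.0, 4.0]
    y = [2.1, 3.9, 6.2, 8.1]
    n = len(x)

    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Closed-form least-squares solution for y = A*x + B:
    #   A = sum((xi - x_mean) * (yi - y_mean)) / sum((xi - x_mean)^2)
    #   B = y_mean - A * x_mean
    A = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
         / sum((xi - x_mean) ** 2 for xi in x))
    B = y_mean - A * x_mean

No search is needed: the formulas land on the minimizing A and B directly.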

Problem solving as search

Through the lens of search, all problems look the same:
- There is a space of candidate solutions.
- There is a candidate-solution generator.
- There is a way to measure progress, so you can tell when you have found a "good solution".
- You can compare two candidates and tell which is better.
- Every candidate has a cost (to minimize) or a utility (to maximize) that can guide progress.

Generate and test

    repeat:
        candidate = generate()
        if test(candidate) == "good solution":
            break
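A runnable sketch of generate and test for the line-fitting problem. The random candidate range, the error threshold standing in for "good solution", and the attempt cap are arbitrary assumptions, not part of the slide:

    import random

    x = [1.0, 2.0, 3.0, 4.0]
    y = [2.1, 3.9, 6.2, 8.1]

    def error(A, B):
        # Sum of squared errors between the line y = A*x + B and the data.
        return sum((A * xi + B - yi) ** 2 for xi, yi in zip(x, y))

    def generate():
        # Candidate = a random (A, B) pair from an assumed range.
        return random.uniform(-10, 10), random.uniform(-10, 10)

    solution = None
    for _ in range(1_000_000):       # cap the number of attempts
        A, B = generate()
        if error(A, B) < 0.5:        # arbitrary "good enough" threshold
            solution = (A, B)
            break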

Search

How is linear regression like generate and test? Linear regression has an exceptionally good generator: it produces a "good solution" in a single iteration. But it only works on linearly related data, just as quadratic regression only works on quadratically related data.
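As an illustration (not part of the original slides), NumPy's polyfit is one such one-shot generator: degree 1 gives linear regression, degree 2 quadratic regression, and each assumes its own model form. The data here is made up:

    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.array([1.2, 2.1, 5.3, 10.2, 16.9])   # roughly quadratic data

    A, B = np.polyfit(x, y, 1)       # one-shot linear fit: y = A*x + B
    a, b, c = np.polyfit(x, y, 2)    # one-shot quadratic fit: y = a*x**2 + b*x + c
    # The straight line underfits this data; the parabola fits well.
    # Each generator is only "very good" for data matching its model form.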

Poorly understood data

- Stock markets
- GDP
- Cancer
- Car buying
- Aisle stocking
- Recommendations
- Images, videos

Poorly understood data

Visualization can help when the data is two- or three-dimensional (maybe up to 10 dimensions), but this is still an art. Generate and test might be slow. Consider using a simple generator for least-squares regression: one that generates all possible values of A and B within [-1024.00 ... +1024.00], as sketched below. Now suppose we have 100 dimensions?
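A sketch of that naive generator, with two assumptions not in the slide: the grid must be discretized (step 1.0 here, since "all possible values" over the reals is impossible), and the data is made up. Even so, it is painfully slow:

    x = [1.0, 2.0, 3.0, 4.0]
    y = [2.1, 3.9, 6.2, 8.1]

    def error(A, B):
        return sum((A * xi + B - yi) ** 2 for xi, yi in zip(x, y))

    # Step 1.0 gives 2049 values per parameter: ~4.2 million candidates
    # for just two parameters. This already takes seconds to run.
    steps = [i - 1024.0 for i in range(2049)]
    best = (float("inf"), None, None)
    for A in steps:
        for B in steps:
            err = error(A, B)
            if err < best[0]:
                best = (err, A, B)
    # With 100 parameters the grid would have 2049**100 points:
    # exhaustive generate-and-test does not scale.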

Can we use gradients?

A gradient is a local slope. If we can tell which of two candidates is better, can we make progress towards a solution? Think about the Connect 4 learner: if one set of weights is better than another, can we make progress towards the "best" set of weights?
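One way to act on that idea is plain gradient descent on the sum of squared errors. This sketch is illustrative only; the starting point, learning rate, and iteration count are assumptions:

    x = [1.0, 2.0, 3.0, 4.0]
    y = [2.1, 3.9, 6.2, 8.1]

    A, B = 0.0, 0.0            # arbitrary starting candidate
    lr = 0.01                  # assumed learning rate
    for _ in range(5000):
        # Gradient of the sum of squared errors with respect to A and B.
        grad_A = sum(2 * (A * xi + B - yi) * xi for xi, yi in zip(x, y))
        grad_B = sum(2 * (A * xi + B - yi) for xi, yi in zip(x, y))
        A -= lr * grad_A       # step downhill along the local slope
        B -= lr * grad_B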

Search algorithm

    solution_old = generate()
    solution_new = generate()
    repeat:
        if evaluate(solution_new) < evaluate(solution_old):
            solution_old = solution_new
            solution_new = modify(solution_new)
        else:
            solution_new = modify(solution_old)
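A runnable version of this loop for line fitting. The generate() range, the perturbation size in modify(), and the iteration budget are assumptions made for illustration:

    import random

    x = [1.0, 2.0, 3.0, 4.0]
    y = [2.1, 3.9, 6.2, 8.1]

    def evaluate(s):
        A, B = s
        return sum((A * xi + B - yi) ** 2 for xi, yi in zip(x, y))

    def generate():
        return (random.uniform(-10, 10), random.uniform(-10, 10))

    def modify(s):
        # Perturb the candidate by a small random step.
        A, B = s
        return (A + random.gauss(0, 0.1), B + random.gauss(0, 0.1))

    solution_old = generate()
    solution_new = generate()
    for _ in range(10_000):
        if evaluate(solution_new) < evaluate(solution_old):
            solution_old = solution_new       # keep the better candidate
            solution_new = modify(solution_new)
        else:
            solution_new = modify(solution_old)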

Issues

- Time versus quality
- Limiting the search space
- Discretizing the search space
- Susceptibility to local optima