CSE 5331/7331 F'07© Prentice Hall1 CSE 5331/7331 Fall 2007 Regression Margaret H. Dunham Department of Computer Science and Engineering Southern Methodist.

Presentation transcript:

Slide 1: CSE 5331/7331 Fall 2007, Regression
Margaret H. Dunham, Department of Computer Science and Engineering, Southern Methodist University
Some slides extracted from Data Mining, Introductory and Advanced Topics, Prentice Hall, 2002.

Slide 2: Table of Contents
Linear Regression
Nonlinear Regression
Logistic Regression
Metrics

Slide 3: Remember High School?
y = mx + b
You need two points to determine a straight line.
You need two points to find values for m and b.
THIS IS REGRESSION
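As a small illustration of the slide's point, the sketch below recovers m and b from two points. The function name `line_through` and the sample points are hypothetical, chosen only for this example.

```python
def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b passing through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        # A vertical line cannot be written as y = m*x + b.
        raise ValueError("vertical line: slope is undefined")
    m = (y2 - y1) / (x2 - x1)  # slope: rise over run
    b = y1 - m * x1            # intercept: solve y1 = m*x1 + b for b
    return m, b


m, b = line_through((1.0, 3.0), (3.0, 7.0))
print(m, b)
```

With more than two noisy points there is usually no single line through all of them, which is why regression looks for the best-fitting line instead of an exact one.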

Slide 4: Regression
Predict future values based on past values.
Linear regression assumes a linear relationship exists:
$y = c_0 + c_1 x_1 + \dots + c_n x_n$
Find values to best fit the data.

Slide 5: Linear Regression

Slide 6: Linear Regression
Assume the data fits a predefined function.
Determine the best values for the regression coefficients $c_0, c_1, \dots, c_n$.
Assume an error term: $y = c_0 + c_1 x_1 + \dots + c_n x_n + \epsilon$
Estimate the error using the mean squared error over the training set.
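A minimal sketch of the single-predictor case, y = c0 + c1*x, may help make this concrete. It uses the standard closed-form least-squares solution and then reports the mean squared error on the training set. The function names and the sample data are assumptions made for illustration; they do not come from the slides.

```python
def fit_line(xs, ys):
    """Least-squares estimates (c0, c1) for y = c0 + c1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx               # slope
    c0 = mean_y - c1 * mean_x    # intercept
    return c0, c1


def mean_squared_error(xs, ys, c0, c1):
    """Average squared difference between observed and predicted y."""
    return sum((y - (c0 + c1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.0]
c0, c1 = fit_line(xs, ys)
print(c0, c1, mean_squared_error(xs, ys, c0, c1))
```

The closed form only covers one predictor; with several predictors the same criterion is usually solved with matrix methods.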

Slide 7: Linear Regression, Poor Fit
Why use least squares?
A linear model does not always work well.

Slide 8: Nonlinear Regression
Data does not nicely fit a straight line.
Fit data to a curve.
Many possible functions.
Not as easy and straightforward as linear regression.
How nonlinear regression works:
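One common way to fit a curve, sketched here under stated assumptions, is to pick a function family and transform it into a linear problem. The example assumes an exponential model y = a * exp(b * x) with all y > 0, so that ln y = ln a + b * x can be fitted as a straight line. This is only one of the many possible functions the slide mentions, not necessarily the method the slides had in mind, and the function name and data are hypothetical.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on (x, ln y). Requires y > 0."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx                       # slope of the line in log space
    a = math.exp(mean_l - b * mean_x)   # intercept, mapped back from ln a
    return a, b


# Hypothetical noise-free data generated from y = 2 * exp(0.5 * x).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
print(a, b)
```

Note that this minimizes squared error in log space rather than in the original units. Models that cannot be linearized this way are typically fitted with iterative optimization, which is part of why nonlinear regression is less straightforward.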

Slide 9: Logistic Regression
Generalized linear model.
Predict a discrete outcome:
- Binomial (binary) logistic regression
- Multinomial logistic regression
One dependent variable.
Reference: Logistic Regression by Gerard E. Dallal

Slide 10: Logistic Regression (cont'd)
Log odds function: $\ln\left(\frac{p}{1-p}\right)$
p is the probability that the outcome is 1.
Odds: the probability that the event occurs divided by the probability that it does not occur, $\frac{p}{1-p}$.
The log odds function is strictly increasing as p increases.
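The definitions on this slide can be checked numerically. The sketch below computes the odds and the natural-log odds for a few probabilities, along with the inverse mapping from log odds back to a probability. The function names are chosen for this example only.

```python
import math


def odds(p):
    """Odds of an event with probability p, for 0 <= p < 1."""
    return p / (1.0 - p)


def log_odds(p):
    """Natural log of the odds (the logit), for 0 < p < 1."""
    return math.log(odds(p))


def inverse_log_odds(z):
    """Map a log-odds value back to a probability (the logistic function)."""
    return 1.0 / (1.0 + math.exp(-z))


# Log odds grows as p grows, and is 0 when p = 0.5 (even odds).
for p in (0.1, 0.5, 0.9):
    print(p, odds(p), log_odds(p))
```

Logistic regression models the log odds as a linear function of the predictors, and the inverse mapping turns that linear score back into a probability.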

Slide 11: Why Log Odds?
Shape of the curve is desirable.
Relationship to probability.
Range: $-\infty$ to $+\infty$

Slide 12: P-value
The probability of obtaining a value at least as large as the observed value, assuming the null hypothesis is true.
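As a hedged illustration, suppose the variable in question is a test statistic that follows a standard normal distribution. That distribution is an assumption made for this example; the slide does not name one. The probability of a value above the observed one is then the upper-tail area, which can be computed from the complementary error function in Python's standard library.

```python
import math


def upper_tail_p_value(z):
    """P(Z >= z) for a standard normal variable Z (one-sided p-value)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical observed statistics.
for z in (0.0, 1.0, 1.96, 3.0):
    print(z, upper_tail_p_value(z))
```

A small p-value means the observed value would be unusual if the assumed distribution were correct.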

Slide 13: Correlation
Examine the degree to which the values for two variables behave similarly.
Correlation coefficient r:
r = 1: perfect correlation
r = -1: perfect but opposite correlation
r = 0: no correlation
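The three reference values of r can be reproduced with a short sketch of the Pearson correlation coefficient. The slide does not say which coefficient it means, so Pearson's r is an assumption here, as are the function name and the sample data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1.0, 2.0, 3.0]
print(pearson_r(xs, [2.0, 4.0, 6.0]))  # y rises with x
print(pearson_r(xs, [6.0, 4.0, 2.0]))  # y falls as x rises
print(pearson_r(xs, [1.0, 2.0, 1.0]))  # no linear trend
```

Pearson's r only measures linear association, so r = 0 does not rule out a nonlinear relationship.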

Slide 14: Covariance
Degree to which two variables vary in the same manner.
Correlation is normalized and covariance is not.
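To see what "normalized" means in practice, the sketch below rescales one variable by a factor of 100. The covariance changes by the same factor, while the correlation, computed as the covariance divided by the product of the standard deviations, stays the same. The sample covariance (dividing by n - 1) is used here as an assumption; the function names and data are illustrative.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance normalized by both standard deviations."""
    std_x = math.sqrt(covariance(xs, xs))
    std_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
scaled_xs = [100.0 * x for x in xs]  # same data in different units

print(covariance(xs, ys), covariance(scaled_xs, ys))    # depends on units
print(correlation(xs, ys), correlation(scaled_xs, ys))  # unit-free
```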

Slide 15: Residual
Error: the difference between the desired output and the predicted output.
May actually use the sum of squares.
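A short sketch shows why the sum of squares is often preferred to the raw errors: positive and negative residuals can cancel when added, whereas their squares cannot. The function names and the values are hypothetical.

```python
def residuals(desired, predicted):
    """Per-example error: desired output minus predicted output."""
    return [d - p for d, p in zip(desired, predicted)]


def sum_of_squares(errors):
    """Sum of squared errors; always non-negative."""
    return sum(e * e for e in errors)


desired = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 8.0]
errors = residuals(desired, predicted)

print(errors)                  # signed errors
print(sum(errors))             # signed errors partly cancel
print(sum_of_squares(errors))  # squared errors do not cancel
```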