LECTURE 12 Multiple regression analysis Epsy 640 Texas A&M University.


Multiple regression analysis

The test of the overall hypothesis that y is unrelated to all predictors, equivalent to

  H0: ρ²y·12…p = 0
  H1: ρ²y·12…p > 0

is tested by

  F = [ R²y·12…p / p ] / [ (1 − R²y·12…p) / (n − p − 1) ]
  F = [ SSreg / p ] / [ SSe / (n − p − 1) ]
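The overall F formula above can be sketched as a small function; the values of R², p, and n below are made-up numbers for illustration, not figures from the lecture.

```python
# Sketch: overall F test for multiple regression, computed from R^2.
# F = [R^2 / p] / [(1 - R^2) / (n - p - 1)], with df = (p, n - p - 1).
def overall_f(r2, p, n):
    """F statistic for H0: the squared multiple correlation is zero."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))

# Illustrative values: R^2 = .60, p = 3 predictors, n = 100 cases.
f = overall_f(0.60, 3, 100)
print(round(f, 2))  # compare against the critical F(3, 96) value
```

The same F can equivalently be computed from the sums of squares as (SSreg / p) / (SSe / (n − p − 1)), since R² = SSreg / SSy.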

Multiple regression analysis

Table 8.2: Multiple regression table for Sums of Squares

  SOURCE        df         Sum of Squares   Mean Square        F
  x1, x2 …      p          SSreg            SSreg / p          (SSreg / p) / (SSe / (n − p − 1))
  e (residual)  n − p − 1  SSe              SSe / (n − p − 1)
  total         n − 1      SSy              SSy / (n − 1)

Multiple regression analysis predicting Depression from LOCUS OF CONTROL, SELF-ESTEEM, and SELF-RELIANCE

[Fig. 8.4: Venn diagram for multiple regression with two predictors and one outcome measure, showing SSx1, SSx2, SSy, SSreg, and SSe]

[Fig. 8.5: Venn diagram of the Type I contributions of SSx1 and SSx2 to SSy]

[Fig. 8.6: Venn diagram of the Type III (unique) contributions of SSx1 and SSx2 to SSy]

Multiple Regression ANOVA table

Table 8.3: Multiple regression table for Sums of Squares of each predictor

  SOURCE   df     Sum of Squares   Mean Square     F (Type I)
  Model    2      SSreg            SSreg / 2       (SSreg / 2) / (SSe / (n − 3))
  x1       1      SSx1             SSx1 / 1        (SSx1 / 1) / (SSe / (n − 3))
  x2       1      SSx2·x1          SSx2·x1 / 1     (SSx2·x1 / 1) / (SSe / (n − 3))
  e        n − 3  SSe              SSe / (n − 3)
  total    n − 1  SSy              SSy / (n − 1)
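The Type I (sequential) decomposition in Table 8.3 can be sketched in a few lines: SSx1 is the regression SS when x1 enters alone, and SSx2·x1 is what the full two-predictor model adds beyond that. The data below are made up for illustration.

```python
# Sketch: Type I (sequential) sums of squares for two predictors.

def ss_reg_simple(y, x):
    """Regression SS for y on a single predictor (with intercept)."""
    n = len(y)
    my, mx = sum(y) / n, sum(x) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy ** 2 / sxx

def ss_reg_two(y, x1, x2):
    """Regression SS for y on two predictors, via the 2x2 normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    yc = [v - my for v in y]
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    saa = sum(v * v for v in a)
    sbb = sum(v * v for v in b)
    sab = sum(u * v for u, v in zip(a, b))
    say = sum(u * v for u, v in zip(a, yc))
    sby = sum(u * v for u, v in zip(b, yc))
    det = saa * sbb - sab * sab
    b1 = (say * sbb - sby * sab) / det   # slope for x1
    b2 = (sby * saa - say * sab) / det   # slope for x2
    return b1 * say + b2 * sby           # SSreg for the full model

# Illustrative data (hypothetical, not from the lecture).
y  = [4.0, 7.0, 6.0, 9.0, 10.0, 13.0]
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

ss_x1 = ss_reg_simple(y, x1)                      # Type I SS for x1 (entered first)
ss_x2_given_x1 = ss_reg_two(y, x1, x2) - ss_x1    # Type I SS for x2 | x1
```

Note that SSx2·x1 depends on entry order: entering x2 first would generally give it a different Type I SS, which is exactly the order-dependence the Venn diagrams illustrate.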

PATH DIAGRAM FOR REGRESSION

[Path diagram: predictors X1 and X2 (correlated, r = .4) each pointing to outcome Y with β = .5 and β = .6, plus residual path e]

  R² = β1·ry1 + β2·ry2 = (.5)(.74) + (.6)(.8) = .85

[Path diagram for the Depression example: LOC. CON., SELF-EST, and SELF-REL predicting DEPRESSION; R² = .60, residual path e (1 − R² = .40)]

Shrinkage R²

Different definitions exist; ask which is being used:
– What is the population value for a sample R²?
  R²s = 1 − (1 − R²)(n − 1) / (n − k − 1)
– What is the cross-validation shrinkage from sample to sample?
  R²sc = 1 − (1 − R²)(n + k) / (n − k)
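The two shrinkage formulas above can be compared directly; the sample values of R², n, and k below are illustrative, not from the lecture.

```python
# Sketch of the two shrinkage formulas from the slide.
def r2_population_estimate(r2, n, k):
    """Estimate of the population R^2: 1 - (1 - R^2)(n - 1)/(n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def r2_cross_validation(r2, n, k):
    """Cross-validation shrinkage: 1 - (1 - R^2)(n + k)/(n - k)."""
    return 1 - (1 - r2) * (n + k) / (n - k)

# Illustrative values: R^2 = .60 from n = 50 cases and k = 3 predictors.
r2, n, k = 0.60, 50, 3
print(round(r2_population_estimate(r2, n, k), 3))  # shrinks below .60
print(round(r2_cross_validation(r2, n, k), 3))     # shrinks further still
```

As the formulas show, the cross-validation estimate always shrinks more than the population estimate, and both penalties grow as k approaches n.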

Estimation Methods

Types of estimation:
– Ordinary Least Squares (OLS): minimize the sum of squared errors around the prediction line
– Generalized Least Squares (GLS): a regression technique used when the error terms from an ordinary least squares regression display non-random patterns such as autocorrelation or heteroskedasticity
– Maximum Likelihood
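For the single-predictor case, the OLS criterion (minimizing the sum of squared errors) has the familiar closed-form solution, sketched below with made-up data.

```python
# Sketch: OLS slope and intercept that minimize the sum of squared errors
# around the prediction line, for a single predictor.
def ols_fit(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # slope = sum of cross-products / sum of squared deviations of x
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    b0 = my - b1 * mx          # intercept makes the line pass through (Mx, My)
    return b0, b1

# Illustrative data lying exactly on y = 2x.
b0, b1 = ols_fit([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

GLS generalizes this by weighting the squared errors according to an assumed error covariance structure; ML (next slide) instead maximizes the likelihood of the data under a chosen distribution.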

Maximum Likelihood Estimation

There is nothing visual about the maximum likelihood method, but it is a powerful method and, at least for large samples, very precise. Maximum likelihood estimation begins with writing a mathematical expression known as the Likelihood Function of the sample data. Loosely speaking, the likelihood of a set of data is the probability of obtaining that particular set of data, given the chosen probability distribution model. This expression contains the unknown model parameters. The values of these parameters that maximize the sample likelihood are known as the Maximum Likelihood Estimates, or MLEs. Maximum likelihood estimation is a totally analytic maximization procedure.

MLEs and likelihood functions generally have very desirable large-sample properties:
– they become unbiased minimum-variance estimators as the sample size increases
– they have approximate normal distributions and approximate sample variances that can be calculated and used to generate confidence bounds
– likelihood functions can be used to test hypotheses about models and parameters

With small samples, MLEs may not be very precise and may even generate a line that lies above or below the data points. There are only two drawbacks to MLEs, but they are important ones:
– with small numbers of failures (less than 5, and sometimes less than 10, is small), MLEs can be heavily biased and the large-sample optimality properties do not apply
– calculating MLEs often requires specialized software for solving complex non-linear equations. This is less of a problem as time goes by, as more statistical packages add MLE analysis capability every year.
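A minimal illustration of the idea: for normally distributed data with known σ, the value of μ that maximizes the log-likelihood is the sample mean. The sketch below (with made-up data) recovers it by brute-force grid search, standing in for the numerical maximization that real MLE software performs.

```python
import math

# Sketch: MLE of the mean of a normal distribution via grid search.
def normal_loglik(data, mu, sigma):
    """Log-likelihood of the data under Normal(mu, sigma)."""
    n = len(data)
    return (-n / 2) * math.log(2 * math.pi * sigma ** 2) \
           - sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2)

data = [4.1, 5.0, 5.9, 6.2, 4.8]                 # illustrative observations
grid = [i / 100 for i in range(300, 800)]        # candidate mu values 3.00..7.99
mle_mu = max(grid, key=lambda m: normal_loglik(data, m, 1.0))

sample_mean = sum(data) / len(data)              # closed-form MLE of mu
# mle_mu agrees with sample_mean to the grid resolution (0.01)
```

In this simple case the analytic and numerical answers coincide; in models without closed-form solutions (the common case), only the numerical maximization is available, which is the "specialized software" drawback noted above.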

Outliers

Leverage (for a single predictor):
  Li = 1/n + (Xi − Mx)² / Σ(X − Mx)²   (min = 1/n, max = 1)
Values larger than 1/n by a large amount should be of concern.

Cook's Di = Σ(Ŷj − Ŷj(i))² / [(k + 1)·MSres]
– the difference between predicted Y with and without case i
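The single-predictor leverage formula is easy to compute directly; the data below are hypothetical, with one X value placed far from the mean to show how leverage flags it.

```python
# Sketch: leverage for a single predictor,
# L_i = 1/n + (X_i - M_x)^2 / sum((X - M_x)^2), bounded by [1/n, 1].
def leverages(x):
    n = len(x)
    mx = sum(x) / n
    ssx = sum((xi - mx) ** 2 for xi in x)
    return [1 / n + (xi - mx) ** 2 / ssx for xi in x]

x = [1.0, 2.0, 3.0, 4.0, 20.0]   # last case is far from the mean of X
L = leverages(x)
# L[-1] is much larger than 1/n = 0.2, flagging the last case
```

A useful check on the computation: with an intercept and one predictor, the leverages always sum to 2, so an average leverage of 2/n gives a baseline against which large values stand out.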

Outliers In SPSS Regression, under the SAVE option, both leverage and Cook’s D will be computed and saved as new variables with values for each case