SA3101 Final Examination Solution


SA3101 Regression Analysis
Solutions of the Final Examination, Semester II, 2001-2002 (4/28/2019)

1. (d), (c), (b), (a), (d).

2. (a) The matrix plot looks like: (plot not reproduced in this transcript)
   (b) The matrix plot looks like: (plot not reproduced in this transcript)

2. (continued)
(c) The potential-residual plot: (plot not reproduced in this transcript)
(d) n = 50, p = 4. Test H0: beta3 = 2 vs. H1: beta3 ≠ 2. T3 = -4.37, df = n - p - 1 = 50 - 4 - 1 = 45, and t(45, .025) = 2.01. The graph indicating the rejection regions and T3: (plot not reproduced). Since T3 = -4.37 < -2.01, the statistic falls into the left-tail rejection region, so H0 is rejected at the 5% level of significance.
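The two-sided t-test decision in part (d) can be sketched in Python. To keep the sketch dependency-free it reuses the tabulated critical value t(45, .025) = 2.01 quoted above rather than computing the quantile:

```python
def reject_two_sided_t(t_stat, t_crit):
    """Two-sided t-test: reject H0 when the statistic lands in either tail."""
    return abs(t_stat) > t_crit

n, p = 50, 4
df = n - p - 1          # 50 - 4 - 1 = 45
t_crit = 2.01           # t(45, .025), from the t-table
t3 = -4.37              # observed statistic for beta3
print(reject_two_sided_t(t3, t_crit))  # True: H0 rejected at the 5% level
```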

2. (continued)
(d) Test H0: beta1 = 0 and beta4 = 0 vs. H1: H0 is false. F = 2.1, df = (2, 45). The graph indicating the rejection region and F: (plot not reproduced). Since F does not fall into the rejection region, H0 is not rejected; thus the reduced model is adequate.
(e) The index plot of Cook's distance: (plot not reproduced in this transcript)
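The F-test decision in part (d) follows the same pattern. The slide does not quote the critical value, so the value 3.20 below is an assumption taken from standard F tables for F(.05; 2, 45):

```python
def reject_f(f_stat, f_crit):
    """Upper-tail F-test: reject H0 when F exceeds the critical value."""
    return f_stat > f_crit

f_stat = 2.1     # observed F for H0: beta1 = beta4 = 0
f_crit = 3.20    # approx F(.05; 2, 45), assumed from standard tables (not on the slide)
print(reject_f(f_stat, f_crit))  # False: H0 not rejected, reduced model adequate
```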

3. (a) We should use the studentized residuals here. The index plot of the studentized residuals: (plot not reproduced). Observations 4, 5, 7, and 19 are outliers, using the cut-off value 2.
(b) Observations 4 and 5 are high-leverage points, using the cut-off value 2(p+1)/n = 0.5 for Pii.
(c) Observations 4, 5, 7, and 19 are influential points, using the cut-off value 2((p+1)/(n-p-1))^.5 = 1.1547.
(d) The rough potential-residual plot: (plot not reproduced in this transcript)
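The cut-offs quoted in parts (b) and (c) can be checked directly. The sample sizes are an inference from the quoted numbers (2(p+1)/n = 0.5 implies n = 20 with p = 4), not stated on the slide:

```python
import math

n, p = 20, 4   # inferred from the quoted cut-off 2(p+1)/n = 0.5; not stated on the slide

leverage_cut = 2 * (p + 1) / n                        # hat-diagonal cut-off for high leverage
influence_cut = 2 * math.sqrt((p + 1) / (n - p - 1))  # DFFITS-style cut-off for influence

print(leverage_cut)              # 0.5
print(round(influence_cut, 4))   # 1.1547
```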

4. n = 48, p = 5, R^2(F) = 30.7%.
(a) R0^2 = 0. R^2 reflects the proportion of the total variability explained by the predictors; since there are no predictors in this model, R^2 = 0.
(b) X3 (Taxes) is introduced first, since its R^2 = 16.6% is the largest among all single-predictor models. Test H0: Y = beta0 + e vs. H1: Y = beta0 + X3 beta3 + e. F = [(R3^2 - R0^2)/1] / [(1 - R3^2)/(48 - 2)] = (.166/.834)*46 = 9.156, df = (1, 46), F(.05; 1, 46) = 4.04, so X3 should be introduced into the model. Thus the variable "Taxes" is the first factor affecting the domestic migration behavior of American people.
(c) X1 (Crime), X3 (Taxes), and X4 (Educ.) should be introduced first, since the associated R^2 is the largest among all three-variable models. Test H0: Y = beta0 + X3 beta3 + e vs. H1: Y = beta0 + X1 beta1 + X3 beta3 + X4 beta4 + e. R^2(F) = .307, df(F) = 48 - 3 - 1 = 44; R^2(R) = .166, df(R) = 48 - 1 - 1 = 46. F = [(.307 - .166)/2] / [(1 - .307)/44] = 4.476, df = (2, 44), F(.05; 2, 44) = 3.19. Reject H0: the reduced model is not adequate, that is, X1 and X4 should be introduced. Thus Crime, Taxes, and Education are the three major concerns when American people consider domestic migration.
(d) Either X5 or X2 should be deleted first, because R^2 decreases by almost 0 when X5 or X2 is deleted.
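The two nested-model F statistics in parts (b) and (c) come from the same R^2-based formula, which can be sketched and checked against the quoted values:

```python
def partial_f(r2_full, r2_red, k, df_full):
    """F statistic for a nested-model comparison expressed through R^2:
    F = [(R2_full - R2_red)/k] / [(1 - R2_full)/df_full]."""
    return ((r2_full - r2_red) / k) / ((1 - r2_full) / df_full)

# Part (b): adding X3 to the intercept-only model (n = 48)
f_b = partial_f(0.166, 0.0, 1, 48 - 2)
# Part (c): adding X1 and X4 to the model containing X3
f_c = partial_f(0.307, 0.166, 2, 48 - 3 - 1)
print(round(f_b, 3), round(f_c, 3))  # 9.156 4.476
```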

4. (continued)
(e) Model "34" is the best model, since the associated R^2 is the largest among all two-variable models. Note that

X's    13     23     24     34     35
R^2    16.8%  19.5%  8.5%   26.7%  17.4%
Ra^2   13.5%  15.9%  4.43%  23.4%  13.7%

Thus, according to adjusted R^2, model "34" is also the best. In fact,

Ra^2 = 1 - (n-1)/(n-p-1) * (1 - R^2),

so for fixed p, Ra^2 is strictly increasing in R^2. This is the main reason the two criteria give the same answer.
(f)

p      0   1      2      3      4      5
R^2    0   16.6%  26.7%  30.7%  30.7%  30.7%
Ra^2   0   14.8%  23.4%  26.0%  24.3%  22.45%

For p = 3, 4, 5 the R^2 changes by almost 0, while from p = 2 to p = 3 the change is large. Thus p = 3 gives the best model; the associated adjusted R^2 confirms this conclusion.
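The adjusted-R^2 formula quoted in part (e) reproduces the tabulated values; a quick check of a few entries:

```python
def adj_r2(r2, n, p):
    """Adjusted R^2: Ra^2 = 1 - (n-1)/(n-p-1) * (1 - R^2)."""
    return 1 - (n - 1) / (n - p - 1) * (1 - r2)

n = 48
# Reproduce a few entries of the tables above (values in %):
print(round(100 * adj_r2(0.267, n, 2), 1))   # model "34": 23.4
print(round(100 * adj_r2(0.307, n, 3), 1))   # p = 3: 26.0
print(round(100 * adj_r2(0.307, n, 5), 2))   # p = 5: 22.45
```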