Published by Raymond Turner. Modified over 9 years ago.
1
Diploma in Statistics: Introduction to Regression, Lecture 5.1
1. Review
2. Transforming data, the log transform
   i. liver fluke egg hatching rate
   ii. explaining CEO remuneration
   iii. brain weights and body weights
3. SLR with transformed data
4. Transforming X, quadratic fit
5. Other options
2
Using t values
Convention: n > 30 is big, n < 30 is small.
$z_{0.05} = 1.96 \approx 2$
$t_{30,\,0.05} = 2.04 \approx 2$
4
Homework 4.2.1
Quantify the extent of the recovery in Year 6, Q3.
$\hat{P} = 1030\,Q1 + 1292\,Q2 + 1210\,Q3 + 1279\,Q4 + 33.7\,\mathrm{Time}$
Year 6 Q2: $P = 1657$, $\hat{P} = 1292 + 33.7 \times 22 = 2033$, $P - \hat{P} = 1657 - 2033 = -376$
Year 6 Q3: $P = 2185$, $\hat{P} = 1210 + 33.7 \times 23 = 1985$, $P - \hat{P} = 2185 - 1985 = 200$
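The arithmetic on this slide can be checked with a short Python sketch. The coefficients and the Time values 22 and 23 are taken from the slide; the function names are illustrative only and not part of the course material.

```python
# Rounded coefficients of the no-constant seasonal model quoted on the slide.
QUARTER_COEF = {"Q1": 1030, "Q2": 1292, "Q3": 1210, "Q4": 1279}
TIME_COEF = 33.7


def fitted(quarter, time):
    """Fitted value: the quarter's level plus the linear time trend."""
    return QUARTER_COEF[quarter] + TIME_COEF * time


def residual(actual, quarter, time):
    """Actual minus fitted value."""
    return actual - fitted(quarter, time)


# Year 6 Q2 is Time = 22 and Year 6 Q3 is Time = 23, as on the slide.
for label, actual, quarter, time in [("Year 6 Q2", 1657, "Q2", 22),
                                     ("Year 6 Q3", 2185, "Q3", 23)]:
    print(f"{label}: fitted = {fitted(quarter, time):.0f}, "
          f"residual = {residual(actual, quarter, time):+.0f}")
```

The change from a large negative residual in Q2 to a positive residual in Q3 is one way to quantify the recovery.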
5
Homework 4.2.2
List correspondences between the output from the original regression and the output from the alternative regression.
Confirm that the coefficients of Q1, Q2 and Q3 in the original are the corresponding coefficients in the alternative with the Q4 coefficient added.
6
Predictor      Coef  SE Coef      T      P
Noconstant
Q1          1029.87    23.41  43.99  0.000
Q2          1292.35    24.45  52.85  0.000
Q3          1210.42    25.55  47.37  0.000
Q4          1278.70    26.71  47.88  0.000
Time         33.725    1.619  20.83  0.000

S = 40.9654

Predictor      Coef  SE Coef      T      P
Constant    1278.70    26.71  47.88  0.000
Q1          -248.82    26.36  -9.44  0.000
Q2            13.65    26.11   0.52  0.609
Q3           -68.27    25.96  -2.63  0.019
Time         33.725    1.619  20.83  0.000

S = 40.9654
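A minimal sketch of the check asked for in Homework 4.2.2, using the coefficients printed in the two outputs above. It assumes the Constant in the second output plays the role of the Q4 coefficient in the first; the helper name is hypothetical.

```python
# Coefficients copied from the two Minitab outputs.
no_constant = {"Q1": 1029.87, "Q2": 1292.35, "Q3": 1210.42, "Q4": 1278.70}
with_constant = {"Constant": 1278.70, "Q1": -248.82, "Q2": 13.65, "Q3": -68.27}


def implied_quarter_levels(fit):
    """Quarter levels implied by the with-constant fit (Q4 is the baseline)."""
    base = fit["Constant"]
    levels = {q: base + fit[q] for q in ("Q1", "Q2", "Q3")}
    levels["Q4"] = base
    return levels


levels = implied_quarter_levels(with_constant)
for q in ("Q1", "Q2", "Q3", "Q4"):
    gap = levels[q] - no_constant[q]
    print(f"{q}: {levels[q]:.2f} vs {no_constant[q]:.2f} (gap {gap:+.2f})")
```

Any gap of about 0.01 reflects rounding in the printed coefficients. The Time coefficient and S are identical in the two outputs.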
7
Homework 4.2.3
1. Calculate the simple linear regressions of Jobtime on each of T_Ops and Units. Confirm the corresponding t-values.
2. Calculate the simple linear regression of Jobtime on Ops per Unit. Comment on the negative correlation of Jobtime with Ops per Unit in the light of the corresponding t-value.
3. Confirm the calculation of the $R^2$ values.
8
Solution 4.2.3
2. Calculate the simple linear regression of Jobtime on Ops per Unit. Comment on the negative correlation of Jobtime with Ops per Unit in the light of the corresponding t-value.
Comment: The t-value is not statistically significant; the negative correlation is consistent with chance variation and has no substantive meaning.
9
Variance Inflation Factors
Convention: there is a problem if $R_k^2 > 90\%$ or $\mathrm{VIF}_k > 10$, where $R_k^2$ is the $R^2$ from regressing $X_k$ on the other explanatory variables.
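The two thresholds in the convention are the same statement, because the variance inflation factor is defined as VIF_k = 1/(1 - R_k^2), with R_k^2 the R-squared from regressing X_k on the other explanatory variables. A small Python sketch (the function name is illustrative) shows the mapping:

```python
def vif(r_squared):
    """Variance inflation factor for a given R^2 of X_k on the other X's."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# The VIF grows slowly at first and then rapidly as R^2 approaches 1.
for r2 in (0.0, 0.5, 0.8, 0.9, 0.95):
    print(f"R^2 = {r2:.2f} -> VIF = {vif(r2):.1f}")
```

An R-squared of 90% corresponds to a VIF of 10, so either form of the rule flags the same variables.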
10
What to do?
Get new X values, to break the correlation pattern
– impractical in observational studies
Choose a subset of the X variables
– manually
– automatically
   stepwise regression
   other methods
11
Residential load survey data
Data collected by a US electricity supplier during an investigation of the factors that influence peak demand for electricity by residential customers.
Load is demand at the system peak demand hour, in kW,
Size is house size, in SqFt/1000,
Income (X2) is annual family income, in $/1000,
AirCon (X3) is air conditioning capacity, in tons,
Index (X4) is the house appliance index, in kW,
Residents (X5) is the number in the house on a typical day.
12
Matrix plot
13
Results

All variables in:

Predictor        Coef     SE Coef      T      P
Constant       0.1263      0.2289   0.55  0.585
Size          -2.6689      0.9059  -2.95  0.006
Income     0.00027912  0.00007892   3.54  0.001
AirCon        0.42462     0.03472  12.23  0.000
Index      0.00038137  0.00007884   4.84  0.000
Residents     0.00197     0.02218   0.09  0.930

Income deleted:

Predictor     Coef  SE Coef      T      P
Constant    -397.0    492.7  -0.81  0.426
Size       10943.3    594.2  18.42  0.000
AirCon       -1.86    75.45  -0.02  0.980
Index       0.0721   0.1709   0.42  0.676
Residents    38.65    47.75   0.81  0.424
14
Exercise
Calculate the VIF for Size. Comment.
Homework
Calculate variance inflation factors for all explanatory variables. Discuss.
15
Multicollinearity
when there is perfect correlation within the X variables.
Example: Indicators
Illustration: Minitab
17
(i) Hatching of liver fluke eggs
The life cycle of the liver fluke
18
Hatching of liver fluke eggs: Duration and Success rate
21
(ii) Explaining CEO Compensation and Company Sales (Forbes magazine, May 1994)
22
Explaining CEO Remuneration, bivariate log transformation
23
(iii) Mammals' Brainweight vs Bodyweight
24
Scatterplot view
25
Scatterplot view, log transform
26
Scatterplot view, Dinosaurs deleted
27
Histogram view
28
Histogram view, log transform
29
Changing spread with log
38
Why the log transform works
High spread at high X transformed to low spread at high Y
Low spread at low X transformed to high spread at low Y
39
Why the log transform works
10 to 100 is transformed to $\log_{10}(10)$ to $\log_{10}(10^2)$, i.e., 1 to 2
1/10 = 0.1 to 1/100 = 0.01 is transformed to $\log_{10}(10^{-1})$ to $\log_{10}(10^{-2})$, i.e., -1 to -2
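The point of these examples is that intervals covering the same ratio have the same width on the log scale, however different their raw widths. A short Python sketch (the function name and the third interval are illustrative additions):

```python
import math


def log10_spread(low, high):
    """Width of the interval [low, high] after a base-10 log transform."""
    return math.log10(high) - math.log10(low)


# Each interval spans a factor of 10, so each has width 1 on the log scale,
# although the raw widths differ by several orders of magnitude.
for low, high in [(0.01, 0.1), (10, 100), (1000, 10000)]:
    print(f"{low} to {high}: raw width = {high - low}, "
          f"log10 width = {log10_spread(low, high):.3f}")
```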
41
SLR with transformed data
LBrainW versus LBodyW
The regression equation is
LBrainW = 0.932 + 0.753 LBodyW

Predictor     Coef  SE Coef      T      P
Constant   0.93237  0.04170  22.36  0.000
LBodyW     0.75309  0.02858  26.35  0.000

S = 0.302949
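A straight line on the log-log scale corresponds to a power law on the original scale. The sketch below back-transforms the fitted equation, assuming LBrainW and LBodyW are base-10 logs (as in the earlier log examples); the measurement units are not stated on the slide, and the function name is illustrative.

```python
INTERCEPT = 0.932  # fitted constant from the slide
SLOPE = 0.753      # fitted LBodyW coefficient from the slide


def predicted_brain_weight(body_weight):
    """Back-transformed fit: BrainW = 10**INTERCEPT * BodyW**SLOPE.

    Assumes base-10 logs; units follow whatever the data set uses.
    """
    return 10 ** INTERCEPT * body_weight ** SLOPE


# The slope acts on ratios: doubling body weight multiplies the predicted
# brain weight by 2**SLOPE, whatever the starting body weight.
ratio = predicted_brain_weight(2.0) / predicted_brain_weight(1.0)
print(f"Doubling body weight multiplies predicted brain weight by {ratio:.3f}")
```

Because the slope is below 1, predicted brain weight grows less than proportionally with body weight.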
42
Application: Do humans conform?
[Scatterplot with the Human case labelled]
43
Application: Do humans conform?
Delete the Human data,
calculate the regression,
predict human LBrainW and
compare to actual, relative to s
44
Application: Do humans conform?
Regression Analysis: LBrainW versus LBodyW
The regression equation is
LBrainW = 0.924 + 0.744 LBodyW

Predictor     Coef  SE Coef      T      P
Constant   0.92410  0.03933  23.50  0.000
LBodyW     0.74383  0.02706  27.48  0.000

S = 0.285036
45
Application: Do humans conform?
LBodyW(Human) = 1.79239
LBrainW(Human) = 3.12057
Predicted LBrainW = 0.924 + 0.744 × 1.79239 = 2.25754
Residual = 3.12057 - 2.25754 = 0.86303
Residual / s = 0.86303 / 0.285036 = 3.03
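The calculation on this slide can be reproduced as follows. All numbers are taken from the slides (the fit with the Human case deleted); the function name is illustrative.

```python
# Fit with the Human case deleted, rounded as in the printed equation.
INTERCEPT, SLOPE, S = 0.924, 0.744, 0.285036


def compare_to_fit(log_body, log_brain):
    """Return (fitted value, residual, residual relative to s)."""
    fit = INTERCEPT + SLOPE * log_body
    resid = log_brain - fit
    return fit, resid, resid / S


fit, resid, relative = compare_to_fit(1.79239, 3.12057)
print(f"Predicted LBrainW = {fit:.5f}")
print(f"Residual          = {resid:.5f}")
print(f"Residual / s      = {relative:.2f}")
```

Dividing by s alone ignores the uncertainty in the fitted line itself, so a fully studentised version of this ratio would be slightly smaller.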
46
Deleted residuals
For each potentially exceptional case:
– delete the case
– calculate the regression from the rest
– use the fitted equation to calculate a deleted fitted value
– calculate deleted residual = observed value – deleted fitted value
Minitab does this automatically for all cases!
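The steps above can be sketched in plain Python for a simple linear regression. This is an illustration of the procedure as listed, not Minitab's implementation; the data are made up for the example and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def deleted_residuals(xs, ys):
    """For each case: delete it, refit on the rest, and return
    observed value minus deleted fitted value."""
    out = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(rest_x, rest_y)
        out.append(ys[i] - (intercept + slope * xs[i]))
    return out


# Made-up data: the first five points lie exactly on y = 1 + 2x,
# and the last point sits well above that line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0, 18.0]
for x, y, d in zip(xs, ys, deleted_residuals(xs, ys)):
    print(f"x = {x:.0f}, y = {y:.0f}, deleted residual = {d:+.2f}")
```

Because the exceptional case is left out of its own fit, it cannot pull the line towards itself, which is why its deleted residual is larger than its ordinary residual.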
47
Application: Do humans conform?
With 63 cases, we do not expect to see any cases with residuals exceeding 3 standard deviations.
On the other hand, recalling the scatter plot, the humans do not appear particularly exceptional.
The dotplot view of deleted residuals emphasises this: Water opossums appear more exceptional.
[Dotplot of deleted residuals, with the Human and Water Opossum cases labelled]
50
Optimising a nicotine extraction process
In determining the quantity of nicotine in different samples of tobacco, temperature is a key variable in optimising the extraction process. A study of this phenomenon involving analysis of 18 samples produced these data.
51
Optimising a nicotine extraction process
Regression Analysis: Nicotine versus Temperature
The regression equation is
Nicotine = 2.61 + 0.0247 Temperature

Predictor        Coef   SE Coef      T      P
Constant       2.6086    0.2121  12.30  0.000
Temperature  0.024656  0.003579   6.89  0.000

S = 0.217412   R-Sq = 74.8%
54
Optimising a nicotine extraction process, quadratic fit
The regression equation is
Nicotine = 1.20 + 0.0767 Temperature - 0.000453 Temp-sqr

Predictor          Coef    SE Coef      T      P
Constant         1.2041     0.6312   1.91  0.076
Temperature     0.07674    0.02257   3.40  0.004
Temp-sqr     -0.0004529  0.0001943  -2.33  0.034

S = 0.192398   R-Sq = 81.5%
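Since the Temp-sqr coefficient is negative, the fitted curve has a maximum at Temperature = -b1 / (2 b2). The sketch below computes that turning point from the printed coefficients; the slide does not do this calculation, and whether the turning point lies inside the range of temperatures actually studied cannot be checked from the output alone, so treat it as illustrative.

```python
# Coefficients from the quadratic fit on the slide.
B0, B1, B2 = 1.2041, 0.07674, -0.0004529


def fitted_nicotine(temperature):
    """Fitted value from the quadratic regression equation."""
    return B0 + B1 * temperature + B2 * temperature ** 2


def turning_point(b1, b2):
    """Temperature at which a quadratic b0 + b1*t + b2*t**2 is stationary."""
    return -b1 / (2 * b2)


t_star = turning_point(B1, B2)
print(f"Turning point at Temperature = {t_star:.1f}")
print(f"Fitted Nicotine there        = {fitted_nicotine(t_star):.2f}")
```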
56
Optimising a nicotine extraction process, quadratic fit, case 5 excluded
The regression equation is
Nicotine = 1.21 + 0.0750 Temperature - 0.000419 Temp-sqr

Predictor          Coef    SE Coef      T      P
Constant         1.2096     0.5129   2.36  0.033
Temperature     0.07504    0.01835   4.09  0.001
Temp-sqr     -0.0004189  0.0001583  -2.65  0.019

S = 0.156321   R-Sq = 88.6%
58
5. Other options
Other functions,
– e.g., 1/Y, Y, $Y^2$, etc., same for X
Generalised linear models,
– choose a function of Y, a model for etc.
59
Reading
EM Section 6.7.1
Hamilton, Ch. 5
Extra Notes: More on log