Published by Leon Long. Modified over 9 years ago.
1
Example
x: 1 2 3 4 5 6 7 8
y: 6 7 5 2 9 6 7 6
2
We wish to check for a non-zero correlation between x and y.
3
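We know that, if there is no correlation in the population, $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}$ follows a t distribution with $n-2$ degrees of freedom, where r is the sample correlation coefficient. Let the true correlation coefficient be ρ. Then test the hypotheses: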
4
H0: ρ = 0
H1: ρ ≠ 0
It has already been shown that r = 0.1458. Thus, with n = 8, $t = \dfrac{0.1458\sqrt{6}}{\sqrt{1-0.1458^2}} = 0.3610$.
5
The cut-off points of the t distribution with 6 degrees of freedom, leaving 2.5% in each tail, are ±2.447. The t value of 0.3610 lies between -2.447 and 2.447, so H0 is not rejected.
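The correlation test above can be reproduced with a short R sketch. The variable names (`t_stat`, `t_crit`, `reject_h0`) are illustrative choices and do not come from the slides; the data are the x and y values of the example.

```r
# Data from the example slide
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(6, 7, 5, 2, 9, 6, 7, 6)
n <- length(x)

r <- cor(x, y)                               # sample correlation coefficient
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)    # test statistic on n - 2 = 6 df
t_crit <- qt(0.975, df = n - 2)              # upper 2.5% point of t with 6 df

print(round(c(r = r, t = t_stat, crit = t_crit), 4))

# Two-sided test at the 5% level: reject H0 only if |t| exceeds the cut-off
reject_h0 <- abs(t_stat) > t_crit
print(reject_h0)

# cor.test() carries out the same test and also reports a p-value
ct <- cor.test(x, y)
print(ct$statistic)
```

`cor.test()` should give the same t statistic, so the manual calculation is mainly useful for seeing where the 6 degrees of freedom and the ±2.447 cut-off enter.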
6
There is no evidence of a non-zero correlation between x and y. Similarly, we can check whether the slope b is significantly different from 0.
8
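So the value of b is 0.1190. Now carry out a hypothesis test.
H0: b = 0
H1: b ≠ 0
The standard error of b is $\hat{\sigma} / \sqrt{\sum (x_i - \bar{x})^2}$, where $\hat{\sigma}^2$ is the estimate of the error variance. This is calculated in R as 0.3298.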
9
The test statistic is t = (estimate of b - hypothesised value of b) / (standard error of b). This calculates as (0.1190 - 0)/0.3298 = 0.3608.
10
Again, t tables using 6 degrees of freedom give cut-off points of -2.447 and 2.447 for 2.5% in each tail.
11
Since the test statistic t (0.3608) is less than the cut-off point 2.447, we do not reject the null hypothesis H0. There is no evidence at the 5% level of a non-zero value of b.
To confirm this, the 95% CI is: 0.1190 ± 2.447 × 0.3298 = (-0.688, 0.926).
Notice that this interval includes zero.
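A minimal R sketch of the slope test and its confidence interval, using the same x and y data. The object names (`fit`, `ci_manual`, `ci_r`) are illustrative. Because R works with unrounded values, its t statistic may differ from 0.3608 in the last decimal place.

```r
# Data from the example slide
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(6, 7, 5, 2, 9, 6, 7, 6)

fit <- lm(y ~ x)

# Row of the coefficient table for the slope:
# Estimate, Std. Error, t value, Pr(>|t|)
slope_row <- summary(fit)$coefficients["x", ]
b  <- unname(slope_row["Estimate"])
se <- unname(slope_row["Std. Error"])
print(round(slope_row, 4))

# 95% confidence interval built by hand from the t cut-off point
t_crit <- qt(0.975, df = df.residual(fit))
ci_manual <- c(b - t_crit * se, b + t_crit * se)
print(round(ci_manual, 3))

# confint() computes the same interval directly
ci_r <- confint(fit, "x", level = 0.95)
print(ci_r)
```

The interval contains zero exactly when the two-sided test at the 5% level does not reject H0, which is why the two checks agree.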
12
Confidence Intervals for Variance
We quoted earlier that $\dfrac{(n-2)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-2}$. This can be used to obtain a confidence interval for $\sigma^2$.
Recall the earlier example:
y: 3.5 3.2 3.0 2.9 4.0 2.5 2.3
x: 3.1 3.4 3.0 3.2 3.9 2.8 2.2
13
The estimate of the error variance $\sigma^2$ is 0.07884. Now $\chi^2$ with 5 degrees of freedom is equal to 0.8312 for the "bottom" 2.5% and 12.83 for the "top" 2.5%.
The 95% CI for $\sigma^2$ is (5 × 0.07884/12.83, 5 × 0.07884/0.8312), i.e. (0.031, 0.474).
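The interval for the error variance can be checked in R along the following lines. This is a sketch under the assumption that the variance estimate comes from the simple linear regression of y on x; the names `s2`, `chi_lo`, `chi_hi` and `ci_var` are illustrative.

```r
# Data from the earlier example
y <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)

fit <- lm(y ~ x)
df  <- df.residual(fit)            # n - 2 = 5 degrees of freedom
s2  <- sum(resid(fit)^2) / df      # estimate of the error variance

# Lower and upper 2.5% points of chi-squared with 5 df
chi_lo <- qchisq(0.025, df)
chi_hi <- qchisq(0.975, df)

# The larger chi-squared point gives the lower limit, and vice versa
ci_var <- c(df * s2 / chi_hi, df * s2 / chi_lo)
print(round(c(s2 = s2, chi_lo = chi_lo, chi_hi = chi_hi), 4))
print(round(ci_var, 3))
```

Note that the interval is not symmetric about the estimate, because the chi-squared distribution is skewed.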
15
Trees Example: more than one variable
19
The residual plot suggests that the linear model is satisfactory. The R squared value seems quite high though, so from physical arguments we force the line to pass through the origin.
21
The R squared value is higher now, but the residual plot is not so random.
23
We might now ask if we can find a model with both explanatory variables, height and girth. Physical considerations suggest that we should explore the very simple model
$\text{Volume} = b_1 \times \text{height} \times \text{girth}^2 + \varepsilon$
This is basically the formula for the volume of a cylinder.
25
So the equation is: $\text{Volume} = 0.002108 \times \text{height} \times \text{girth}^2 + \varepsilon$
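A fit of this form can be sketched in R as follows. This assumes the slides use R's built-in `trees` data set (columns `Girth`, `Height`, `Volume`), which the transcript does not state explicitly; the name `fit_cyl` is illustrative.

```r
# Assumption: the data are R's built-in `trees` data set
# (Girth in inches, Height in feet, Volume in cubic feet).
data(trees)

# Regression through the origin on the single term height * girth^2.
# `0 +` removes the intercept; I() protects the arithmetic inside the formula.
fit_cyl <- lm(Volume ~ 0 + I(Height * Girth^2), data = trees)

b1 <- unname(coef(fit_cyl))
print(b1)                        # compare with the coefficient quoted above

# Residuals against fitted values, to look for remaining structure
plot(fitted(fit_cyl), resid(fit_cyl),
     xlab = "Fitted volume", ylab = "Residual")
abline(h = 0, lty = 2)
```

If the data set differs from the one used in the slides, the coefficient will not match the quoted value, but the model specification stays the same.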
28
The residuals are considerably smaller than those from any of the previous models considered. Further graphical analysis fails to reveal any further obvious dependence on either of the explanatory variables, girth or height. Further analysis also shows that inclusion of a constant term in the model does not significantly improve the fit. Model 4 is thus the most satisfactory of the models considered for the data.
29
However, this is regression "through the origin", so it may be more satisfactory to rewrite Model 4 as
$\dfrac{\text{volume}}{\text{height} \times \text{girth}^2} = b_1 + \varepsilon$
30
so that $b_1$ can then just be regarded as the mean of the observations of $\text{volume} / (\text{height} \times \text{girth}^2)$. Recall that $\varepsilon$ is assumed to have location measure (here mean) 0.
31
Compare with 0.002108 found earlier
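The rewritten model can be sketched in R by averaging the ratio directly. As before, this assumes the built-in `trees` data set is the one used in the slides; `ratio`, `b1_mean` and `fit_ratio` are illustrative names.

```r
# Assumption: the data are R's built-in `trees` data set
data(trees)

# One observation of volume / (height * girth^2) per tree
ratio <- with(trees, Volume / (Height * Girth^2))

# Under the rewritten model, b1 is estimated by the sample mean of the ratios
b1_mean <- mean(ratio)
print(b1_mean)

# An intercept-only regression gives the same estimate, plus a standard error
fit_ratio <- lm(ratio ~ 1)
print(summary(fit_ratio)$coefficients)
```

The two estimates of b1 need not agree exactly: the through-the-origin fit weights large trees more heavily, while the mean of the ratios weights every tree equally.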
32
Practical Question 2
y     x1    x2
3.5   3.1   30
3.2   3.4   25
3.0   3.0   20
2.9   3.2   30
4.0   3.9   40
2.5   2.8   25
2.3   2.2   30
33
So y = -0.2138 + 0.8984x1 + 0.01745x2 + e
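The fitted equation can be reproduced with `lm()`. In this sketch the object name `multregress` follows the name used later in the slides; its definition here is an assumption, since the transcript does not show the original call.

```r
# Data from Practical Question 2
y  <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)

# Multiple regression of y on both explanatory variables
# (assumed form of the call behind `multregress`)
multregress <- lm(y ~ x1 + x2)

coefs <- coef(multregress)       # intercept, coefficient of x1, coefficient of x2
print(signif(coefs, 4))

# Full coefficient table with standard errors and t tests
print(summary(multregress)$coefficients)
```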
35
Use >plot(multregress)
36
> ynew=c(y,12)
> x1new=c(x1,20)
> x2new=c(x2,100)
> multregressnew=lm(ynew~x1new+x2new)
39
Very large influence
40
Second Example
> ynew=c(y,40)
> x1new=c(x1,10)
> x2new=c(x2,50)
> multregressnew=lm(ynew~x1new+x2new)
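One way to quantify the influence of each added point is to refit the model and inspect leverage and Cook's distance. The helper `refit_with_point` below is hypothetical (it is not in the slides); `hatvalues()` and `cooks.distance()` are standard R functions.

```r
# Data from Practical Question 2
y  <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)

# Hypothetical helper: append one observation, refit, return diagnostics
refit_with_point <- function(y_add, x1_add, x2_add) {
  d <- data.frame(y  = c(y, y_add),
                  x1 = c(x1, x1_add),
                  x2 = c(x2, x2_add))
  fit <- lm(y ~ x1 + x2, data = d)
  list(coef     = coef(fit),
       leverage = hatvalues(fit),        # diagonal of the hat matrix
       cook     = cooks.distance(fit))   # influence of each point on the fit
}

first  <- refit_with_point(12, 20, 100)  # first added point
second <- refit_with_point(40, 10, 50)   # second added point

# Coefficients before and after adding each point
print(rbind(original = coef(lm(y ~ x1 + x2)),
            first    = first$coef,
            second   = second$coef))

# The added observation is row 8 in both refits
print(round(first$leverage, 3))
print(round(second$leverage, 3))
print(round(first$cook, 3))
print(round(second$cook, 3))
```

A point far from the rest of the x values has high leverage; whether it is also influential depends on how far its y value lies from what the other points would predict.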