Example

x: 1 2 3 4 5 6 7 8
y: 6 7 5 2 9 6 7 6

We wish to check for a non-zero correlation.

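We know that, when the true correlation is zero, the statistic t = r √(n − 2) / √(1 − r²) follows a t distribution with n − 2 degrees of freedom. Let the true correlation coefficient be ρ. Then test the hypotheses:

H0: ρ = 0
H1: ρ ≠ 0

It has already been shown that r = 0.1458. Thus, with n = 8,

t = 0.1458 × √6 / √(1 − 0.1458²) = 0.3610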

The cut-off points for the t distribution with 6 degrees of freedom, with 2.5% in each of the top and bottom tails, are ±2.447. The t value of 0.3610 lies between −2.447 and 2.447, so H0 is accepted (that is, not rejected) at the 5% level.
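The calculation above can be reproduced in R. This is a minimal sketch using the x and y values from the example; the variable names (`t_stat`, `t_crit`, `ct`) are chosen here for illustration and do not come from the slides.

```r
# Data from the example
x <- 1:8
y <- c(6, 7, 5, 2, 9, 6, 7, 6)
n <- length(x)

r      <- cor(x, y)                            # sample correlation coefficient
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)      # test statistic on n - 2 degrees of freedom
t_crit <- qt(0.975, df = n - 2)                # upper 2.5% point of t with 6 df
p_val  <- 2 * pt(-abs(t_stat), df = n - 2)     # two-sided p-value

# cor.test() carries out the same two-sided test in a single call
ct <- cor.test(x, y)

print(c(r = r, t = t_stat, critical = t_crit, p = p_val))
```

Comparing `abs(t_stat)` with `t_crit` mirrors the table look-up in the slides, while `ct$p.value` gives the same decision without tables.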

There is no evidence of a non-zero correlation between x and y. Similarly, we can check whether the slope b is significantly different from 0.

So the value of b is 0.1190. Now carry out a hypothesis test.

H0: b = 0
H1: b ≠ 0

The standard error of b is s.e.(b) = σ̂ / √Σ(x − x̄)², where σ̂² is the estimate of the error variance. This is calculated in R as 0.3298.

The test statistic is t = (b − 0) / s.e.(b). This calculates as (0.1190 − 0)/0.3298 = 0.3608.

Again, t tables using 6 degrees of freedom give a cut-off point of 2.447 for 2.5% in each tail, so the cut-off points are −2.447 and 2.447.

Since the test statistic t (0.3608) is less than this cut-off point, we accept the null hypothesis H0. There is no evidence at the 5% level of a non-zero value of b.

To confirm this, the 95% CI is: 0.1190 ± 2.447 × 0.3298 = (−0.688, 0.926). Notice that this includes zero.
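The slope test and the confidence interval can be checked with `lm()`. This is a sketch on the same eight data points; the object names (`fit`, `slope_row`, `ci`) are illustrative and not taken from the slides.

```r
# Data from the example
x <- 1:8
y <- c(6, 7, 5, 2, 9, 6, 7, 6)

fit <- lm(y ~ x)

# Row of the coefficient table for the slope:
# estimate, standard error, t value and two-sided p-value
slope_row <- summary(fit)$coefficients["x", ]
b      <- unname(slope_row["Estimate"])
se_b   <- unname(slope_row["Std. Error"])
t_stat <- unname(slope_row["t value"])

# 95% confidence interval for the slope (uses the t distribution with 6 df)
ci <- confint(fit, "x", level = 0.95)

print(slope_row)
print(ci)
```

Working from unrounded values, the t statistic comes out as 0.3610, identical to the statistic from the correlation test; the 0.3608 in the hand calculation reflects the rounding of b and its standard error to four decimal places.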

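Confidence Intervals for Variance

We quoted earlier that (n − 2) σ̂² / σ² follows a χ² distribution with n − 2 degrees of freedom. This can be used to obtain a confidence interval for σ².

Recall the earlier example:

y: 3.5 3.2 3.0 2.9 4.0 2.5 2.3
x: 3.1 3.4 3.0 3.2 3.9 2.8 2.2

The estimate of the error variance is σ̂² = 0.07884, with n − 2 = 5 degrees of freedom. Now χ² with 5 degrees of freedom is equal to 0.8312 for the “bottom” 2.5% and 12.83 for the “top” 2.5%.

The 95% CI for σ² is (5 × 0.07884/12.83, 5 × 0.07884/0.8312), i.e. (0.031, 0.474).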
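The interval can be reproduced in R with `qchisq()`. The sketch below refits the simple regression of y on x for the seven observations; the names `s2`, `chi` and `ci` are illustrative.

```r
# Data from the earlier example
y <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)

fit <- lm(y ~ x)
df  <- df.residual(fit)                 # n - 2 = 5
s2  <- sum(resid(fit)^2) / df           # estimate of the error variance

chi <- qchisq(c(0.025, 0.975), df)      # bottom and top 2.5% points of chi-squared
ci  <- df * s2 / rev(chi)               # divide by the top point for the lower limit

print(c(s2 = s2, lower = ci[1], upper = ci[2]))
```

Note that the larger χ² point gives the lower limit and the smaller one the upper limit, which is why the quantiles are reversed before dividing.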

Trees Example

More than one variable

The residual plot suggests that the linear model is satisfactory, and the R-squared value is quite high. From physical arguments, though, we force the line to pass through the origin.

The R squared value is higher now, but the residual plot is not so random.

We might now ask if we can find a model with both explanatory variables, height and girth. Physical considerations suggest that we should explore the very simple model

Volume = b1 × height × (girth)² + ε

This is basically the formula for the volume of a cylinder.

So the equation is:

Volume = 0.002108 × height × (girth)² + ε

The residuals are considerably smaller than those from any of the previous models considered. Further graphical analysis fails to reveal any further obvious dependence on either of the explanatory variables, girth or height. Further analysis also shows that inclusion of a constant term in the model does not significantly improve the fit. Model 4 is thus the most satisfactory of the models considered for the data.
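A fit of this kind can be sketched in R as follows. The slides do not say which data set was used, so the code assumes it is R's built-in `trees` data (Girth in inches, Height in feet, Volume in cubic feet); the object names `model4` and `model4c` are hypothetical.

```r
# ASSUMPTION: the data are R's built-in 'trees' data set; the slides do not confirm this.
data(trees)

# Model 4: regression through the origin on the single term height * girth^2
model4 <- lm(Volume ~ 0 + I(Height * Girth^2), data = trees)
b1 <- unname(coef(model4))

# The same model with a constant term, to see whether the constant improves the fit
model4c <- lm(Volume ~ I(Height * Girth^2), data = trees)

print(b1)
print(anova(model4, model4c))   # F test for the added constant term
print(summary(resid(model4)))   # size of the residuals
```

The `0 +` in the formula removes the intercept, and `I()` is needed so that `*` and `^` are treated as arithmetic rather than as formula operators.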

However, this is regression “through the origin”, so it may be more satisfactory to rewrite Model 4 as

volume / (height × (girth)²) = b1 + ε

so that b1 can then just be regarded as the mean of the observations of volume / (height × (girth)²); recall that ε is assumed to have location measure (here mean) 0.

Compare with 0.002108 found earlier
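The rewritten model can be tried out by averaging the ratio directly. As before, this sketch assumes the data are R's built-in `trees` data set, which the slides do not confirm; `ratio` and `b1_ratio` are illustrative names.

```r
# ASSUMPTION: the data are R's built-in 'trees' data set.
data(trees)

# One observation of volume / (height * girth^2) per tree
ratio <- with(trees, Volume / (Height * Girth^2))

b1_ratio <- mean(ratio)     # estimate of b1 as a plain mean
ratio_ci <- t.test(ratio)$conf.int   # 95% confidence interval for that mean

print(b1_ratio)
print(ratio_ci)
```

The two estimates of b1 need not agree exactly: the through-origin least-squares fit gives more weight to large trees, whereas the mean of the ratios weights every tree equally.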

Practical Question 2

y    x1   x2
3.5  3.1  30
3.2  3.4  25
3.0  3.0  20
2.9  3.2  30
4.0  3.9  40
2.5  2.8  25
2.3  2.2  30

So y = -0.2138 + 0.8984x1 + 0.01745x2 + e
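The fitted equation can be reproduced with `lm()`. The object name `multregress` is the one used later in the slides; the vector `b` is an illustrative name for the coefficients.

```r
# Data from Practical Question 2
y  <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)

# Multiple regression of y on both explanatory variables
multregress <- lm(y ~ x1 + x2)
b <- unname(coef(multregress))   # intercept, coefficient of x1, coefficient of x2

print(summary(multregress))
```

`summary(multregress)` also reports the standard errors and t values, so each coefficient can be tested against zero in the same way as the slope in the simple regression.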

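Use

> plot(multregress)

> # add one extra observation: y = 12, x1 = 20, x2 = 100
> ynew=c(y,12)
> x1new=c(x1,20)
> x2new=c(x2,100)
> multregressnew=lm(ynew~x1new+x2new)

Very large influence

Second Example

> # add a different extra observation: y = 40, x1 = 10, x2 = 50
> ynew=c(y,40)
> x1new=c(x1,10)
> x2new=c(x2,50)
> multregressnew=lm(ynew~x1new+x2new)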
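One way to put a number on the influence of the added observation is Cook's distance. The slides rely on `plot()` for this; the sketch below is an alternative numerical check, and the helper `fit_with()` is a hypothetical convenience function, not part of the slides.

```r
# Data from Practical Question 2
y  <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)

# Hypothetical helper: refit the model with one extra observation appended
fit_with <- function(y_add, x1_add, x2_add) {
  d <- data.frame(y = c(y, y_add), x1 = c(x1, x1_add), x2 = c(x2, x2_add))
  lm(y ~ x1 + x2, data = d)
}

original <- lm(y ~ x1 + x2)
first    <- fit_with(12, 20, 100)   # first example
second   <- fit_with(40, 10, 50)    # second example

# Cook's distance for each observation; the added point is observation 8
cd1 <- cooks.distance(first)
cd2 <- cooks.distance(second)

# Coefficients side by side, to see how far the extra point moves the fit
print(rbind(original = coef(original), first = coef(first), second = coef(second)))
print(round(cd1, 3))
print(round(cd2, 3))
```

A Cook's distance well above 1 is commonly read as a sign of a highly influential observation, which is the numerical counterpart of the "Residuals vs Leverage" panel drawn by `plot()`.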
