Revision of basic statistics: hypothesis testing (principles, testing a proportion, testing a mean, testing the difference between two means) and estimation.


1 Revision of basic statistics
Hypothesis testing: principles; testing a proportion; testing a mean; testing the difference between two means.
Estimation.

2 Principles of hypothesis testing
Null vs alternative hypothesis.
The null is assumed true until proven otherwise.
If the evidence is inconsistent with the null, reject it in favour of the alternative.
E.g. $H_0$: a coin is fair vs $H_1$: a coin is biased towards heads.
Evidence (data): 20 heads in 25 tosses.
This evidence seems unlikely if $H_0$ were true, hence reject $H_0$.
The probability of evidence at least this extreme (20 or more heads) under $H_0$ is actually 0.2%.
We usually reject if this probability is below 5% (the significance level of the test).
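The 0.2% figure for the coin example can be checked with an exact binomial sum. This is a minimal sketch, assuming a fair coin under the null; the helper name `upper_tail_binomial` is illustrative and not part of the slides.

```python
from math import comb


def upper_tail_binomial(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Probability of 20 or more heads in 25 tosses of a fair coin.
prob = upper_tail_binomial(20, 25)
print(f"P(at least 20 heads | fair coin) = {prob:.4f}")  # about 0.0020, i.e. 0.2%
```

Because the probability is far below the usual 5% significance level, the data lead us to reject the null of a fair coin.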

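3 Testing a proportion
$H_0$: 10% of people are left handed, i.e. $H_0: \pi = 0.1$.
$H_1$: the proportion is not 10%, i.e. $H_1: \pi \neq 0.1$.
The sample proportion $p$ is a random variable and should be somewhere near the true value.
Its probability distribution under $H_0$ is $p \sim N\left(\pi, \frac{\pi(1-\pi)}{n}\right)$.
Hence the test statistic is
$$z = \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$$
This is, in general,
$$z = \frac{\text{sample estimate} - \text{hypothesised value}}{\text{standard error}}$$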

4 Using data to calculate the test statistic
If 7 out of a group of 50 are left handed, then $p = 7/50 = 0.14$ and the test statistic is
$$z = \frac{0.14 - 0.1}{\sqrt{0.1 \times 0.9 / 50}} = 0.94$$
This is less than $z^* = 1.96$, the critical value which cuts off 5% in the two tails of the Normal distribution (2.5% in each tail).
Hence we cannot reject $H_0$.
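The calculation for the left-handedness example can be sketched as follows. The function name `proportion_z` is an illustrative assumption; the standard error uses the hypothesised proportion, as the null distribution requires.

```python
from math import sqrt


def proportion_z(successes: int, n: int, pi0: float) -> float:
    """z statistic for H0: pi = pi0, with the standard error computed under H0."""
    p = successes / n
    standard_error = sqrt(pi0 * (1 - pi0) / n)
    return (p - pi0) / standard_error


z = proportion_z(7, 50, 0.1)
Z_CRIT_5PCT_TWO_TAIL = 1.96  # critical value quoted on the slide
reject_h0 = abs(z) > Z_CRIT_5PCT_TWO_TAIL
print(f"z = {z:.2f}, reject H0: {reject_h0}")  # z = 0.94, reject H0: False
```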

5 Testing a mean
A firm selling franchises claims that the average weekly income of a franchise is at least £2000. A sample of 40 such franchises finds an average weekly income of £1770 with s.d. £450. Is the claim justified?
$H_0: \mu = 2000$ vs $H_1: \mu < 2000$.
Significance level for test: 1% (we want to avoid a false accusation).
Critical value: $z^* = 2.33$.
Test statistic:
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{1770 - 2000}{450/\sqrt{40}} = -3.23$$
Since $z < -z^*$ we reject $H_0$.
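The franchise test can be reproduced in a few lines. This is a sketch under the slide's assumptions (large sample, Normal approximation, one-tail test at 1%); `z_for_mean` is a hypothetical helper name.

```python
from math import sqrt


def z_for_mean(sample_mean: float, hypothesised_mean: float,
               sample_sd: float, n: int) -> float:
    """z statistic for a test of a single mean, using the sample s.d."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - hypothesised_mean) / standard_error


z = z_for_mean(1770, 2000, 450, 40)
Z_CRIT_1PCT_ONE_TAIL = 2.33  # critical value quoted on the slide
reject_h0 = z < -Z_CRIT_1PCT_ONE_TAIL  # lower-tail test, so compare with -z*
print(f"z = {z:.2f}, reject H0: {reject_h0}")  # z = -3.23, reject H0: True
```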

6 The prob-value approach
Instead of comparing the test statistic to the critical value, we could compare the prob-value to the significance level (1% in this case).
The prob-value is the area in the tail of the distribution beyond the value of the test statistic.
In this case ($z = -3.23$) the prob-value is about 0.0006 (0.06%), found from the standard Normal distribution. A table that stops at $z = 3.0$ gives only the upper bound 0.0013 (0.13%).
Since the prob-value is below 1% we reject $H_0$.
[Figure: Normal distribution with 1% in the tail beyond $-2.33$ and the smaller prob-value tail beyond $-3.23$.]
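The tail area can also be computed directly rather than read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative. The exact tail area beyond $z = -3.23$ comes out at roughly 0.0006, smaller than the 0.0013 that corresponds to $z = -3.0$, and either figure is below the 1% significance level.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

z = -3.23
one_tail = std_normal.cdf(z)   # area to the left of z (lower-tail test)
two_tail = 2 * one_tail        # prob-value if the test were two-tailed
table_bound = std_normal.cdf(-3.0)  # tail area at z = -3.0, about 0.0013

print(f"one-tail prob-value = {one_tail:.4%}")
print(f"two-tail prob-value = {two_tail:.4%}")
print(f"tail area at z = -3.0: {table_bound:.4%}")
print("reject H0 at 1%:", one_tail < 0.01)
```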

7 How to reject the null hypothesis
Method 1: test statistic > critical value (in absolute value): 3.23 > 2.33.
Method 2 (prob-value): prob-value < significance level: 0.06% < 1% (the table bound of 0.13% at $z = 3.0$ is also below 1%).
Note the different direction of the inequality!
Both reject the null. If in doubt, draw the diagram!
Watch out for:
Choice of significance level (5% or 1%).
One vs two tail test. If we had a two tail test, the prob-value would be doubled, to about 0.12%, and this is what we would compare to 1%.

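8 Testing the difference of two means
A sample of 40 students five years ago found an average expenditure on text books per annum of £87 (at today's prices) with s.d. £21. A current survey of 50 students found average expenditure of £77 with s.d. £30. Has expenditure declined?
$H_0: \mu_1 - \mu_2 = 0$ vs $H_1: \mu_1 - \mu_2 > 0$.
Random variable:
$$\bar{x}_1 - \bar{x}_2 \sim N\left(\mu_1 - \mu_2, \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)$$
Significance level: 5%. Critical value: $z^* = 1.64$.
Test statistic:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \frac{(87 - 77) - 0}{\sqrt{21^2/40 + 30^2/50}} = 1.86$$
Decision: $z > z^*$ hence reject $H_0$. Or, the prob-value associated with 1.86 is 3.14% < 5%, hence reject.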
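The text-book expenditure comparison can be sketched as below, assuming independent large samples so that the Normal approximation applies. The function name `two_mean_z` is illustrative. Note that the prob-value computed from the unrounded statistic differs slightly from the table value for $z = 1.86$.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_z(mean1: float, sd1: float, n1: int,
               mean2: float, sd2: float, n2: int) -> float:
    """z statistic for H0: mu1 - mu2 = 0 with independent samples."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


z = two_mean_z(87, 21, 40, 77, 30, 50)
prob_value = 1 - NormalDist().cdf(z)  # upper-tail area, since H1: mu1 - mu2 > 0
Z_CRIT_5PCT_ONE_TAIL = 1.64  # critical value quoted on the slide

print(f"z = {z:.2f}")  # z = 1.86
print(f"prob-value = {prob_value:.2%}")
print("reject H0:", z > Z_CRIT_5PCT_ONE_TAIL)
```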

9 The t distribution
When testing a mean with small samples, we use the t distribution instead of the Normal. (But note that regression coefficients follow the t distribution whatever the sample size.)
A sample of 12 National Lottery outlets finds an average sale of 800 tickets per week, with s.d. 140. Does this suggest the original target of 700 has been exceeded?
$H_0: \mu = 700$ vs $H_1: \mu > 700$.
Significance level: 5%. Critical value: $t^* = 1.796$ (d.f. = 11).
Test statistic:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{800 - 700}{140/\sqrt{12}} = 2.47$$
$2.47 > 1.796$ hence reject $H_0$. Alternatively, the prob-value associated with 2.47 is 1.6%, which is below 5%.
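The lottery example uses the same formula as the large-sample test of a mean; only the reference distribution changes. The sketch below computes the statistic and compares it with the critical value from the slide. The Python standard library has no t distribution, so the critical value is taken from the slide rather than computed; `t_statistic` is a hypothetical helper name.

```python
from math import sqrt


def t_statistic(sample_mean: float, hypothesised_mean: float,
                sample_sd: float, n: int) -> float:
    """t statistic for a test of a single mean; d.f. = n - 1."""
    return (sample_mean - hypothesised_mean) / (sample_sd / sqrt(n))


n = 12
t = t_statistic(800, 700, 140, n)
degrees_of_freedom = n - 1
T_CRIT_5PCT_DF11 = 1.796  # one-tail 5% critical value for d.f. = 11, from the slide

print(f"t = {t:.2f} with {degrees_of_freedom} d.f.")  # t = 2.47 with 11 d.f.
print("reject H0:", t > T_CRIT_5PCT_DF11)  # reject H0: True
```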

10 Estimation
An alternative approach to hypothesis testing.
The sample mean or proportion is a point estimate.
Around this we build a confidence interval.
For the Normal distribution, the 95% CI is given by: point estimate $\pm$ 1.96 standard errors.
For the franchising example above, we have
$$1770 \pm 1.96 \times \frac{450}{\sqrt{40}} = 1770 \pm 139.5 = [1630.5,\ 1909.5]$$
The interval has a width of about 280, expressing our uncertainty.
For the t distribution, the interval is given by: point estimate $\pm$ $t^*$ standard errors, where $t^*$ is obtained from tables, using the appropriate degrees of freedom (d.f. = n - 1 for the mean).
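The confidence interval for the franchising example can be computed as follows. This is a minimal sketch assuming the Normal approximation; the function name `normal_ci` is illustrative, and the multiplier is obtained from `statistics.NormalDist` rather than hard-coded as 1.96.

```python
from math import sqrt
from statistics import NormalDist


def normal_ci(estimate: float, standard_error: float,
              level: float = 0.95) -> tuple[float, float]:
    """Two-sided confidence interval: estimate +/- z* standard errors."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z_star * standard_error
    return estimate - margin, estimate + margin


standard_error = 450 / sqrt(40)
low, high = normal_ci(1770, standard_error)
print(f"95% CI: [{low:.1f}, {high:.1f}], width {high - low:.0f}")
```

For a small sample, the same function shape applies with $t^*$ from tables substituted for the Normal multiplier.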

