Published by Leilani Botting. Modified over 10 years ago.
1
Hypothesis Testing
2
To define a statistical test we:
1. Choose a statistic (called the test statistic).
2. Divide the range of possible values for the test statistic into two parts:
– The Acceptance Region
– The Critical Region
3
To perform a statistical test we:
1. Collect the data.
2. Compute the value of the test statistic.
3. Make the decision:
– If the value of the test statistic is in the Acceptance Region we decide to accept $H_0$.
– If the value of the test statistic is in the Critical Region we decide to reject $H_0$.
4
The z-test for Proportions Testing the probability of success in a binomial experiment
5
Situation
A success-failure experiment has been repeated n times. The probability of success p is unknown. We want to test
– $H_0: p = p_0$ (some specified value of p)
against
– $H_A: p \neq p_0$
6
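The Test Statistic
$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$, where $\hat{p} = x/n$ is the observed proportion of successes (x successes in n trials).
The Acceptance and Critical Region (two-tailed critical region)
Accept $H_0$ if: $-z_{\alpha/2} \le z \le z_{\alpha/2}$
Reject $H_0$ if: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$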
7
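The Acceptance and Critical Region
One-tailed critical regions. These are used when the alternative hypothesis ($H_A$) is one-sided.
For $H_A: p > p_0$:
Accept $H_0$ if: $z \le z_{\alpha}$
Reject $H_0$ if: $z > z_{\alpha}$
For $H_A: p < p_0$:
Accept $H_0$ if: $z \ge -z_{\alpha}$
Reject $H_0$ if: $z < -z_{\alpha}$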
8
The Acceptance and Critical Region
One-tailed critical region (upper tail, $H_A: p > p_0$)
Accept $H_0$ if: $z \le z_{\alpha}$
Reject $H_0$ if: $z > z_{\alpha}$
9
The Acceptance and Critical Region
One-tailed critical region (lower tail, $H_A: p < p_0$)
Accept $H_0$ if: $z \ge -z_{\alpha}$
Reject $H_0$ if: $z < -z_{\alpha}$
10
Comments
Whether you use a one-tailed or a two-tailed test is determined by the choice of the alternative hypothesis $H_A$.
The alternative hypothesis, $H_A$, is usually the research hypothesis: the hypothesis that the researcher is trying to “prove”.
11
Examples
1. A person wants to determine if a coin should be accepted as being fair. Let p be the probability that a head is tossed. One is trying to determine if there is a difference (positive or negative) from the fair value of p, so the alternative hypothesis is two-sided ($H_A: p \neq 1/2$).
12
2. A researcher is interested in determining if a new procedure is an improvement over the old procedure. The probability of success for the old procedure is $p_0$ (known). The probability of success for the new procedure is p (unknown). One is trying to determine if the new procedure is better (i.e. $p > p_0$).
13
3. A researcher is interested in determining if a new procedure is no longer worth considering. The probability of success for the old procedure is $p_0$ (known). The probability of success for the new procedure is p (unknown). One is trying to determine if the new procedure is definitely worse than the one presently being used (i.e. $p < p_0$).
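The three situations above map onto a two-sided, an upper-tailed and a lower-tailed z-test for a proportion. A minimal Python sketch follows, assuming the usual normal-approximation statistic z = (p̂ − p0)/√(p0(1 − p0)/n); the function name `z_test_proportion`, its arguments and the coin counts are hypothetical illustrations, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(x, n, p0, alternative="two-sided", alpha=0.05):
    """Return (z, reject) for H0: p = p0, given x successes in n trials."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "two-sided":      # H_A: p != p0
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":      # H_A: p > p0
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":         # H_A: p < p0
        reject = z < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, reject


# Hypothetical coin data (not from the slides): 60 heads in 100 tosses.
z_coin, reject_coin = z_test_proportion(60, 100, 0.5)
print(round(z_coin, 2), reject_coin)
```

With these made-up counts z is about 2.0, which falls in the two-tailed critical region at α = 0.05.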
14
The z-test for the Mean of a Normal Population
Let $\mu$ denote the mean of a normal population. We want to test hypotheses about $\mu$.
15
The Situation
Let $x_1, x_2, x_3, \ldots, x_n$ denote a sample from a normal population with mean $\mu$ and standard deviation $\sigma$. We want to test if the mean, $\mu$, is equal to some given value $\mu_0$. Obviously, if the sample mean $\bar{x}$ is close to $\mu_0$ the null hypothesis should be accepted; otherwise the null hypothesis should be rejected.
16
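The Test Statistic
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, where s is the sample standard deviation (used in place of $\sigma$ when n is large).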
17
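The Acceptance and Critical Region
This depends on $H_0$ and $H_A$.
Two-tailed critical region ($H_A: \mu \neq \mu_0$):
Accept $H_0$ if: $-z_{\alpha/2} \le z \le z_{\alpha/2}$
Reject $H_0$ if: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
One-tailed critical regions:
For $H_A: \mu > \mu_0$: accept $H_0$ if $z \le z_{\alpha}$; reject $H_0$ if $z > z_{\alpha}$
For $H_A: \mu < \mu_0$: accept $H_0$ if $z \ge -z_{\alpha}$; reject $H_0$ if $z < -z_{\alpha}$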
18
Example
A manufacturer of glucosamine capsules claims that each capsule contains on average 500 mg of glucosamine. To test this claim, n = 40 capsules were selected and the amount of glucosamine (X) was measured in each capsule. Summary statistics:
19
We want to test:
$H_0: \mu = 500$ (the manufacturer's claim is correct)
against
$H_A: \mu \neq 500$ (the manufacturer's claim is not correct)
20
The Test Statistic
$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{\bar{x} - 500}{s/\sqrt{40}} = -2.75$
21
The Critical Region and Acceptance Region
Using $\alpha = 0.05$: $z_{\alpha/2} = z_{0.025} = 1.960$
We accept $H_0$ if $-1.960 \le z \le 1.960$
and reject $H_0$ if $z < -1.960$ or $z > 1.960$
22
The Decision
Since z = -2.75 < -1.960, we reject $H_0$.
Conclusion: the manufacturer's claim is incorrect.
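The decision in this example can be reproduced with a short Python sketch. It assumes the large-sample statistic z = (x̄ − μ0)/(s/√n) and takes the reported value z = -2.75 directly, because the summary statistics are not preserved in this transcript. The helper names `z_statistic` and `two_tailed_decision` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(xbar, mu0, s, n):
    """z = (xbar - mu0) / (s / sqrt(n)); s stands in for sigma when n is large."""
    return (xbar - mu0) / (s / sqrt(n))


def two_tailed_decision(z, alpha=0.05):
    """Return (critical value z_{alpha/2}, True if H0 is rejected)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z_crit, abs(z) > z_crit


# Value of the test statistic reported for the glucosamine example.
z_crit, reject = two_tailed_decision(-2.75)
print(round(z_crit, 3), reject)  # prints: 1.96 True
```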
23
Student's t-test
24
Recall: The z-test for means
The Test Statistic: $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
25
Comments
The sampling distribution of this statistic is the standard normal distribution.
The replacement of $\sigma$ by s leaves this distribution unchanged only if the sample size n is large.
26
For small sample sizes:
The sampling distribution of $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ is called Student's t distribution with n – 1 degrees of freedom.
27
Properties of Student’s t distribution
Similar to the standard normal distribution:
– Symmetric
– Unimodal
– Centred at zero
Larger spread about zero.
– The reason for this is the increased variability introduced by replacing $\sigma$ by s.
As the sample size increases (degrees of freedom increase) the t distribution approaches the standard normal distribution.
29
Figure: density of the t distribution compared with the standard normal distribution.
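The larger spread of the t statistic for small samples can be seen in a small simulation. This is a sketch under assumed settings (n = 6, a standard normal population, 20000 repetitions, a fixed seed); none of these values come from the slides. It counts how often each statistic falls outside ±1.960.

```python
import random
from math import sqrt

random.seed(0)
n, reps = 6, 20000          # assumed small sample size and repetition count
mu, sigma = 0.0, 1.0        # assumed population parameters

exceed_z = exceed_t = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))  # sample std. dev.
    z = (xbar - mu) / (sigma / sqrt(n))   # uses the true sigma
    t = (xbar - mu) / (s / sqrt(n))       # sigma replaced by s
    exceed_z += abs(z) > 1.960
    exceed_t += abs(t) > 1.960

frac_z = exceed_z / reps
frac_t = exceed_t / reps
print(frac_z, frac_t)
```

The fraction for z should be close to 0.05, while the fraction for t should be noticeably larger, which is why the t table gives larger critical values for small degrees of freedom.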
30
The Situation
Let $x_1, x_2, x_3, \ldots, x_n$ denote a sample from a normal population with mean $\mu$ and standard deviation $\sigma$. Both $\mu$ and $\sigma$ are unknown. We want to test if the mean, $\mu$, is equal to some given value $\mu_0$.
31
The Test Statistic
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
The sampling distribution of the test statistic is the t distribution with n – 1 degrees of freedom.
32
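The Alternative Hypothesis $H_A$ and the corresponding Critical Region:
$H_A: \mu \neq \mu_0$: reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
$H_A: \mu > \mu_0$: reject $H_0$ if $t > t_{\alpha}$
$H_A: \mu < \mu_0$: reject $H_0$ if $t < -t_{\alpha}$
$t_{\alpha}$ and $t_{\alpha/2}$ are critical values under the t distribution with n – 1 degrees of freedom.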
33
Critical values for the t-distribution (upper-tail area $\alpha$ or $\alpha/2$)
34
Critical values for the t-distribution are provided in tables. A link to these tables is given with today’s lecture.
35
Look up df (degrees of freedom). Look up $\alpha$.
36
Note: the values tabled for df = ∞ are the same values for the standard normal distribution
37
Example
Let $x_1, x_2, x_3, x_4, x_5, x_6$ denote weight loss from a new diet for n = 6 cases. Assume that $x_1, x_2, x_3, x_4, x_5, x_6$ is a sample from a normal population with mean $\mu$ and standard deviation $\sigma$. Both $\mu$ and $\sigma$ are unknown. We want to test:
$H_0: \mu = 0$ (new diet is not effective)
versus
$H_A: \mu > 0$ (new diet is effective)
38
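The Test Statistic
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, with $\mu_0 = 0$ (no weight loss on average)
The Critical Region: reject $H_0$ if $t > t_{\alpha}$, with n – 1 = 5 degrees of freedom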
39
The Data The summary statistics:
40
The Test Statistic
The Critical Region (using $\alpha = 0.05$): reject $H_0$ if $t > t_{0.05} = 2.015$ (5 degrees of freedom)
Conclusion: accept $H_0$.
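The one-sample t-test for this kind of example can be sketched in Python as follows. The weight-loss values are hypothetical placeholders, since the data slide is not preserved in this transcript, and the critical value 2.015 is the tabled $t_{0.05}$ for 5 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical weight losses for n = 6 cases (NOT the data from the slides).
weight_loss = [1.5, -0.5, 2.0, -1.0, 0.5, 0.5]

n = len(weight_loss)
xbar = mean(weight_loss)
s = stdev(weight_loss)               # sample standard deviation (divisor n - 1)

mu0 = 0.0                            # H0: mu = 0, the diet is not effective
t = (xbar - mu0) / (s / sqrt(n))     # t statistic with n - 1 = 5 d.f.

t_crit = 2.015                       # tabled t_{0.05} for 5 degrees of freedom
reject = t > t_crit                  # one-tailed test, H_A: mu > 0
print(round(t, 3), reject)
```

For these made-up numbers t is about 1.07, which does not exceed 2.015, so $H_0$ would be accepted.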
41
Confidence Intervals
42
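Confidence Intervals for the mean of a Normal Population, $\mu$, using the Standard Normal distribution:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ (with $\sigma$ replaced by s when n is large)
Confidence Intervals for the mean of a Normal Population, $\mu$, using the t distribution:
$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$ (with n – 1 degrees of freedom)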
43
The Data The summary statistics:
44
Example Let x 1, x 2, x 3, x 4, x 5, x 6 denote weight loss from a new diet for n = 6 cases. The Data: The summary statistics:
45
Confidence Intervals (use $\alpha = 0.05$)
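A t-based confidence interval for a small sample can be computed as in the sketch below. The data are hypothetical placeholders (the values from the slides are not preserved), and 2.571 is the tabled $t_{0.025}$ for 5 degrees of freedom, which corresponds to $\alpha = 0.05$.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical weight losses for n = 6 cases (NOT the data from the slides).
weight_loss = [1.5, -0.5, 2.0, -1.0, 0.5, 0.5]

n = len(weight_loss)
xbar = mean(weight_loss)
s = stdev(weight_loss)

t_half_alpha = 2.571                 # tabled t_{0.025} for n - 1 = 5 d.f.
margin = t_half_alpha * s / sqrt(n)
ci_lower, ci_upper = xbar - margin, xbar + margin
print(round(ci_lower, 3), round(ci_upper, 3))
```

With these made-up numbers the 95% interval contains 0, which agrees with accepting $H_0: \mu = 0$ in a test on the same data.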
46
Comparing Populations: Proportions and Means
47
Sums, Differences, Combinations of R.V.’s
A linear combination of random variables X, Y, ... is a combination of the form:
L = aX + bY + …
where a, b, etc. are numbers, positive or negative.
Most common:
Sum = X + Y
Difference = X – Y
Simple linear combination of X: bX + a
48
Means of Linear Combinations
If L = aX + bY + … then the mean of L is:
Mean(L) = a Mean(X) + b Mean(Y) + …
Most common:
Mean(X + Y) = Mean(X) + Mean(Y)
Mean(X – Y) = Mean(X) – Mean(Y)
Mean(bX + a) = b Mean(X) + a
49
Variances of Linear Combinations
If X, Y, ... are independent random variables and L = aX + bY + … then
Variance(L) = $a^2$ Variance(X) + $b^2$ Variance(Y) + …
Most common:
Variance(X + Y) = Variance(X) + Variance(Y)
Variance(X – Y) = Variance(X) + Variance(Y)
Variance(bX + a) = $b^2$ Variance(X)
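The rules for means and variances can be checked exactly on a small example. The sketch below assumes X and Y are two independent fair dice (an illustrative choice, not from the slides) and enumerates all 36 equally likely outcomes with exact fractions.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
prob = Fraction(1, 36)   # probability of each (x, y) pair for independent fair dice


def mean_var(f):
    """Exact mean and variance of f(X, Y) by enumerating all outcomes."""
    m = sum(prob * f(x, y) for x, y in product(faces, faces))
    v = sum(prob * (f(x, y) - m) ** 2 for x, y in product(faces, faces))
    return m, v


mean_x, var_x = mean_var(lambda x, y: x)
mean_sum, var_sum = mean_var(lambda x, y: x + y)
mean_diff, var_diff = mean_var(lambda x, y: x - y)
mean_lin, var_lin = mean_var(lambda x, y: 3 * x + 2)

print(mean_x, var_x)        # 7/2 35/12
print(mean_sum, var_sum)    # 7 35/6
print(mean_diff, var_diff)  # 0 35/6
print(mean_lin, var_lin)    # 25/2 105/4
```

Note that the variance of the difference is the sum of the variances, not their difference, and that the constant a in bX + a does not affect the variance.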
50
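Combining Independent Normal Random Variables
If X, Y, ... are independent normal random variables, then L = aX + bY + … is normally distributed. In particular:
X + Y is normal with mean $\mu_X + \mu_Y$ and standard deviation $\sqrt{\sigma_X^2 + \sigma_Y^2}$
X – Y is normal with mean $\mu_X - \mu_Y$ and standard deviation $\sqrt{\sigma_X^2 + \sigma_Y^2}$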
51
Comparing proportions Situation We have two populations (1 and 2) Let p 1 denote the probability (proportion) of “success” in population 1. Let p 2 denote the probability (proportion) of “success” in population 2. Objective is to compare the two population proportions
52
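We want to test either:
$H_0: p_1 = p_2$ against the two-sided alternative $H_A: p_1 \neq p_2$
or
$H_0: p_1 = p_2$ against a one-sided alternative, $H_A: p_1 > p_2$ or $H_A: p_1 < p_2$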
53
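The test statistic:
$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$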
54
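Where:
A sample of $n_1$ is selected from population 1 resulting in $x_1$ successes, so $\hat{p}_1 = x_1/n_1$.
A sample of $n_2$ is selected from population 2 resulting in $x_2$ successes, so $\hat{p}_2 = x_2/n_2$.
$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$ is the pooled estimate of the common proportion when $p_1 = p_2$.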
55
Logic:
56
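The Alternative Hypothesis $H_A$ and the corresponding Critical Region:
$H_A: p_1 \neq p_2$: reject $H_0$ if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
$H_A: p_1 > p_2$: reject $H_0$ if $z > z_{\alpha}$
$H_A: p_1 < p_2$: reject $H_0$ if $z < -z_{\alpha}$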
57
Example
In a national study to determine if there was an increase in mortality due to pipe smoking, a random sample of $n_1 = 1067$ male nonsmoking pensioners was observed for a five-year period. In addition, a sample of $n_2 = 402$ male pensioners who had smoked a pipe for more than six years was observed for the same five-year period. At the end of the five-year period, $x_1 = 117$ of the nonsmoking pensioners had died while $x_2 = 54$ of the pipe-smoking pensioners had died. Is the mortality rate for pipe smokers higher than that for non-smokers?
58
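We want to test:
$H_0: p_1 = p_2$ (pipe smoking does not increase the mortality rate)
against
$H_A: p_1 < p_2$ (the mortality rate for pipe smokers, $p_2$, is higher than that for non-smokers, $p_1$)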
59
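The test statistic:
$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$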
60
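Note:
$\hat{p}_1 = 117/1067 \approx 0.1097$, $\hat{p}_2 = 54/402 \approx 0.1343$, and the pooled proportion is $\hat{p} = (117 + 54)/(1067 + 402) = 171/1469 \approx 0.1164$.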
61
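The test statistic:
$z = \frac{0.1097 - 0.1343}{\sqrt{0.1164(1 - 0.1164)\left(\frac{1}{1067} + \frac{1}{402}\right)}} \approx -1.31$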
62
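We reject $H_0$ if $z < -z_{\alpha} = -z_{0.05} = -1.645$, where z is based on $\hat{p}_1 - \hat{p}_2$ (non-smokers minus pipe smokers).
This is not true, hence we accept $H_0$.
Conclusion: there is not a significant ($\alpha = 0.05$) increase in the mortality rate due to pipe smoking.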
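The calculation for the pipe-smoking example can be reproduced with the counts given in the example. This sketch assumes the pooled-proportion form of the z statistic, based on p̂1 − p̂2; the transcript does not preserve the formula actually shown on the slide, so that choice is an assumption.

```python
from math import sqrt
from statistics import NormalDist

n1, x1 = 1067, 117    # non-smoking pensioners: sample size, deaths
n2, x2 = 402, 54      # pipe-smoking pensioners: sample size, deaths

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pool = (x1 + x2) / (n1 + n2)            # pooled estimate under H0: p1 = p2

se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)  # about 1.645
reject = z < -z_crit                      # H_A: p1 < p2 (lower-tailed)
print(round(z, 2), reject)
```

Under these assumptions z is roughly -1.31, which is not below -1.645, in line with the conclusion that $H_0$ is accepted.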
63
Estimating a difference in proportions using confidence intervals
Situation
We have two populations (1 and 2).
Let $p_1$ denote the probability (proportion) of “success” in population 1.
Let $p_2$ denote the probability (proportion) of “success” in population 2.
Objective is to estimate the difference in the two population proportions, $\delta = p_1 - p_2$.
64
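Confidence Interval for $\delta = p_1 - p_2$
100P% = 100(1 – $\alpha$)% confidence interval:
$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$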
65
Example
Estimating the increase in the mortality rate for pipe smokers over that for non-smokers, $\delta = p_2 - p_1$
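A 95% confidence interval for $\delta = p_2 - p_1$ can be sketched with the pipe-smoking counts ($n_1 = 1067$, $x_1 = 117$, $n_2 = 402$, $x_2 = 54$). The sketch assumes the usual large-sample interval with an unpooled standard error; the interval shown on the original slide is not preserved in this transcript.

```python
from math import sqrt
from statistics import NormalDist

n1, x1 = 1067, 117    # non-smoking pensioners
n2, x2 = 402, 54      # pipe-smoking pensioners

p1_hat = x1 / n1
p2_hat = x2 / n2
delta_hat = p2_hat - p1_hat               # estimated increase in mortality rate

# Unpooled standard error of the difference in sample proportions.
se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

alpha = 0.05
z_half = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.960
ci_lower = delta_hat - z_half * se
ci_upper = delta_hat + z_half * se
print(round(ci_lower, 4), round(ci_upper, 4))
```

Under these assumptions the interval runs from about -0.014 to about 0.063. It contains 0, which is consistent with the test not finding a significant increase.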
66
Comparing Means
Situation
We have two normal populations (1 and 2).
Let $\mu_1$ and $\sigma_1$ denote the mean and standard deviation of population 1.
Let $\mu_2$ and $\sigma_2$ denote the mean and standard deviation of population 2.
Let $x_1, x_2, x_3, \ldots, x_n$ denote a sample from normal population 1.
Let $y_1, y_2, y_3, \ldots, y_m$ denote a sample from normal population 2.
Objective is to compare the two population means.
67
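We want to test either:
$H_0: \mu_1 = \mu_2$ against the two-sided alternative $H_A: \mu_1 \neq \mu_2$
or
$H_0: \mu_1 = \mu_2$ against a one-sided alternative, $H_A: \mu_1 > \mu_2$ or $H_A: \mu_1 < \mu_2$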
68
Consider the test statistic:
$z = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}}}$
69
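If $\mu_1 = \mu_2$, the statistic $z = \frac{\bar{x} - \bar{y}}{\sqrt{\sigma_1^2/n + \sigma_2^2/m}}$ will have a standard normal distribution.
This will also be true for the approximation $z = \frac{\bar{x} - \bar{y}}{\sqrt{s_x^2/n + s_y^2/m}}$ (obtained by replacing $\sigma_1$ by $s_x$ and $\sigma_2$ by $s_y$) if the sample sizes n and m are large (greater than 30).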
70
Note:
71
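The Alternative Hypothesis $H_A$ and the corresponding Critical Region:
$H_A: \mu_1 \neq \mu_2$: reject $H_0$ if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
$H_A: \mu_1 > \mu_2$: reject $H_0$ if $z > z_{\alpha}$
$H_A: \mu_1 < \mu_2$: reject $H_0$ if $z < -z_{\alpha}$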
72
Example
A study was interested in determining if an exercise program had some effect on the reduction of blood pressure in subjects with abnormally high blood pressure. For this purpose a sample of n = 500 patients with abnormally high blood pressure were required to adhere to the exercise regime. A second sample of m = 400 patients with abnormally high blood pressure were not required to adhere to the exercise regime. After a period of one year the reduction in blood pressure was measured for each patient in the study.
73
We want to test:
$H_0: \mu_1 \le \mu_2$ (the exercise group did not have a higher average reduction in blood pressure)
vs
$H_A: \mu_1 > \mu_2$ (the exercise group did have a higher average reduction in blood pressure)
74
The test statistic:
$z = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}}$
75
Suppose the data has been collected and:
76
The test statistic:
77
We reject $H_0$ if $z > z_{\alpha} = z_{0.05} = 1.645$.
This is true, hence we reject $H_0$.
Conclusion: there is a significant ($\alpha = 0.05$) effect due to the exercise regime on the reduction in blood pressure.
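The large-sample comparison of two means can be sketched as below. The sample sizes n = 500 and m = 400 come from the example, but the means and standard deviations are hypothetical placeholders, because the summary statistics are not preserved in this transcript.

```python
from math import sqrt
from statistics import NormalDist

n, m = 500, 400            # sample sizes from the example

# Hypothetical summary statistics (NOT the values from the slides).
xbar, s_x = 10.0, 4.0      # exercise group: mean reduction, standard deviation
ybar, s_y = 8.0, 4.5       # control group: mean reduction, standard deviation

se = sqrt(s_x ** 2 / n + s_y ** 2 / m)
z = (xbar - ybar) / se

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)   # about 1.645
reject = z > z_crit                        # H_A: mu_1 > mu_2 (upper-tailed)
print(round(z, 2), reject)
```

With these made-up numbers z is far above 1.645, so $H_0$ would be rejected, matching the direction of the conclusion in the example.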
78
Estimating a difference in means using confidence intervals
Situation
We have two populations (1 and 2).
Let $\mu_1$ denote the mean of population 1.
Let $\mu_2$ denote the mean of population 2.
Objective is to estimate the difference in the two population means, $\delta = \mu_1 - \mu_2$.
79
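Confidence Interval for $\delta = \mu_1 - \mu_2$ (large samples):
$\bar{x} - \bar{y} \pm z_{\alpha/2} \sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}$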
80
Example
Estimating the increase in the average reduction in blood pressure due to the exercise regime, $\delta = \mu_1 - \mu_2$
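A large-sample confidence interval for $\delta = \mu_1 - \mu_2$ can be sketched in the same way. Only the sample sizes (n = 500 and m = 400) are taken from the blood pressure example; the means and standard deviations below are hypothetical placeholders.

```python
from math import sqrt
from statistics import NormalDist

n, m = 500, 400            # sample sizes from the example

# Hypothetical summary statistics (NOT the values from the slides).
xbar, s_x = 10.0, 4.0      # exercise group
ybar, s_y = 8.0, 4.5       # control group

delta_hat = xbar - ybar
se = sqrt(s_x ** 2 / n + s_y ** 2 / m)

alpha = 0.05
z_half = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.960
ci_lower = delta_hat - z_half * se
ci_upper = delta_hat + z_half * se
print(round(ci_lower, 3), round(ci_upper, 3))
```

For these made-up inputs the interval lies entirely above 0, as would be expected when the corresponding test rejects $H_0$.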
81
Comparing Means – small samples
Situation
We have two normal populations (1 and 2).
Let $\mu_1$ and $\sigma_1$ denote the mean and standard deviation of population 1.
Let $\mu_2$ and $\sigma_2$ denote the mean and standard deviation of population 2.
Let $x_1, x_2, x_3, \ldots, x_n$ denote a sample from normal population 1.
Let $y_1, y_2, y_3, \ldots, y_m$ denote a sample from normal population 2.
Objective is to compare the two population means.
82
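We want to test either:
$H_0: \mu_1 = \mu_2$ against the two-sided alternative $H_A: \mu_1 \neq \mu_2$
or
$H_0: \mu_1 = \mu_2$ against a one-sided alternative, $H_A: \mu_1 > \mu_2$ or $H_A: \mu_1 < \mu_2$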
83
Consider the test statistic:
$z = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}}$
84
If the sample sizes (m and n) are large, the statistic will have approximately a standard normal distribution.
This will not be the case if the sample sizes (m and n) are small.
85
The t test – for comparing means – small samples
Situation
We have two normal populations (1 and 2).
Let $\mu_1$ and $\sigma$ denote the mean and standard deviation of population 1.
Let $\mu_2$ and $\sigma$ denote the mean and standard deviation of population 2.
Note: we assume that the standard deviation for each population is the same, $\sigma_1 = \sigma_2 = \sigma$.
86
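Let $\bar{x}$ and $s_x$ denote the mean and standard deviation of the sample from population 1, and let $\bar{y}$ and $s_y$ denote the mean and standard deviation of the sample from population 2.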
87
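The pooled estimate of $\sigma$:
$s_{Pooled} = \sqrt{\frac{(n - 1)s_x^2 + (m - 1)s_y^2}{n + m - 2}}$
Note: both $s_x$ and $s_y$ are estimators of $\sigma$. These can be combined to form a single estimator of $\sigma$, $s_{Pooled}$.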
88
The test statistic:
$t = \frac{\bar{x} - \bar{y}}{s_{Pooled}\sqrt{\frac{1}{n} + \frac{1}{m}}}$
If $\mu_1 = \mu_2$ this statistic has a t distribution with n + m – 2 degrees of freedom.
89
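The Alternative Hypothesis $H_A$ and the corresponding Critical Region:
$H_A: \mu_1 \neq \mu_2$: reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
$H_A: \mu_1 > \mu_2$: reject $H_0$ if $t > t_{\alpha}$
$H_A: \mu_1 < \mu_2$: reject $H_0$ if $t < -t_{\alpha}$
$t_{\alpha}$ and $t_{\alpha/2}$ are critical points under the t distribution with n + m – 2 degrees of freedom.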
90
Example
A study was interested in determining if administration of a drug reduces cancerous tumour size. For this purpose n + m = 9 test animals are implanted with a cancerous tumour. n = 3 are selected at random and administered the drug. The remaining m = 6 are left untreated. Final tumour sizes are measured at the end of the test period.
91
We want to test:
$H_0: \mu_1 \ge \mu_2$ (the treated group did not have a lower average final tumour size)
vs
$H_A: \mu_1 < \mu_2$ (the treated group did have a lower average final tumour size)
92
The test statistic:
$t = \frac{\bar{x} - \bar{y}}{s_{Pooled}\sqrt{\frac{1}{n} + \frac{1}{m}}}$
93
Suppose the data has been collected and:
94
The test statistic:
95
We reject $H_0$ if $t < -t_{\alpha} = -t_{0.05} = -1.895$, with d.f. = n + m – 2 = 7.
This is not true, hence we accept $H_0$.
Conclusion: the drug treatment does not result in a significantly ($\alpha = 0.05$) smaller final tumour size.
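The pooled two-sample t-test can be sketched as follows. The tumour sizes are hypothetical placeholders for n = 3 treated and m = 6 untreated animals, since the data slide is not preserved in this transcript. The critical value 1.895 is the tabled $t_{0.05}$ for 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical final tumour sizes (NOT the data from the slides).
treated = [1.8, 2.1, 2.4]                    # n = 3, drug administered
untreated = [1.9, 2.6, 2.2, 2.8, 2.0, 2.3]   # m = 6, left untreated

n, m = len(treated), len(untreated)
xbar, ybar = mean(treated), mean(untreated)
var_x, var_y = variance(treated), variance(untreated)   # sample variances

# Pooled estimate of the common standard deviation sigma.
s_pooled = sqrt(((n - 1) * var_x + (m - 1) * var_y) / (n + m - 2))

t = (xbar - ybar) / (s_pooled * sqrt(1 / n + 1 / m))

t_crit = 1.895                    # tabled t_{0.05} for n + m - 2 = 7 d.f.
reject = t < -t_crit              # H_A: mu_1 < mu_2 (lower-tailed)
print(round(t, 3), reject)
```

For these made-up values t is about -0.85, which is not below -1.895, so $H_0$ would be accepted.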