Hypothesis Testing. To define a statistical Test we 1.Choose a statistic (called the test statistic) 2.Divide the range of possible values for the test.


1 Hypothesis Testing

2 To define a statistical test we: 1. Choose a statistic (called the test statistic). 2. Divide the range of possible values for the test statistic into two parts: the Acceptance Region and the Critical Region.

3 To perform a statistical test we: 1. Collect the data. 2. Compute the value of the test statistic. 3. Make the decision: if the value of the test statistic is in the Acceptance Region, we decide to accept H0; if it is in the Critical Region, we decide to reject H0.

4 The z-test for Proportions Testing the probability of success in a binomial experiment

5 Situation: A success-failure experiment has been repeated n times. The probability of success p is unknown. We want to test H0: p = p0 (some specified value of p) against HA: p ≠ p0 (or, for a one-sided test, HA: p > p0 or HA: p < p0).

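6 The Test Statistic: z = (p̂ − p0) / sqrt(p0(1 − p0)/n), where p̂ = x/n is the observed proportion of successes. The Acceptance and Critical Region (two-tailed critical region, HA: p ≠ p0): Accept H0 if −z_{α/2} ≤ z ≤ z_{α/2}. Reject H0 if z < −z_{α/2} or z > z_{α/2}.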

7 The Acceptance and Critical Region: One-tailed critical regions are used when the alternative hypothesis (HA) is one-sided. For HA: p > p0, accept H0 if z ≤ z_α and reject H0 if z > z_α. For HA: p < p0, accept H0 if z ≥ −z_α and reject H0 if z < −z_α.

8 The Acceptance and Critical Region (one-tailed critical region, HA: p > p0): Accept H0 if z ≤ z_α. Reject H0 if z > z_α.

9 The Acceptance and Critical Region (one-tailed critical region, HA: p < p0): Accept H0 if z ≥ −z_α. Reject H0 if z < −z_α.

10 Comments: Whether you use a one-tailed or a two-tailed test is determined by the choice of the alternative hypothesis HA. The alternative hypothesis, HA, is usually the research hypothesis: the hypothesis that the researcher is trying to “prove”.

11 Examples 1. A person wants to determine if a coin should be accepted as being fair. Let p be the probability that a head is tossed. One is trying to determine if p differs (in either direction) from the fair value p = 1/2, so the alternative hypothesis is two-sided (HA: p ≠ 1/2).

12 2. A researcher is interested in determining if a new procedure is an improvement over the old procedure. The probability of success for the old procedure is p0 (known). The probability of success for the new procedure is p (unknown). One is trying to determine if the new procedure is better (i.e. p > p0).

13 3. A researcher is interested in determining if a new procedure is no longer worth considering. The probability of success for the old procedure is p0 (known). The probability of success for the new procedure is p (unknown). One is trying to determine if the new procedure is definitely worse than the one presently being used (i.e. p < p0).
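The z-test for a proportion described on the preceding slides can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `proportion_z_test`, its `alternative` argument, and the coin data (60 heads in 100 tosses) are all assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alternative="two-sided", alpha=0.05):
    """Return (z, reject_h0) for H0: p = p0 given x successes in n trials."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "two-sided":    # HA: p != p0
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":    # HA: p > p0
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":       # HA: p < p0
        reject = z < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, reject


# Hypothetical coin example: 60 heads in 100 tosses, H0: p = 0.5, HA: p != 0.5.
z_coin, reject_coin = proportion_z_test(60, 100, 0.5)
print(f"z = {z_coin:.3f}, reject H0: {reject_coin}")
```

The three branches correspond to the two-tailed and the two one-tailed critical regions; only the choice of HA changes which one is used.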

14 The z-test for the Mean of a Normal Population. We want to test a hypothesis about μ, where μ denotes the mean of a normal population.

15 The Situation: Let x1, x2, x3, …, xn denote a sample from a normal population with mean μ and standard deviation σ. We want to test whether the mean, μ, is equal to some given value μ0. If the sample mean is close to μ0, the null hypothesis should be accepted; otherwise the null hypothesis should be rejected.

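16 The Test Statistic: z = (x̄ − μ0) / (σ/√n), where x̄ is the sample mean. If σ is unknown and n is large, σ is replaced by the sample standard deviation s.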

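17 The Acceptance and Critical Region. This depends on H0 and HA. Two-tailed critical region (HA: μ ≠ μ0): accept H0 if −z_{α/2} ≤ z ≤ z_{α/2}; reject H0 if z < −z_{α/2} or z > z_{α/2}. One-tailed critical regions: for HA: μ > μ0, accept H0 if z ≤ z_α and reject H0 if z > z_α; for HA: μ < μ0, accept H0 if z ≥ −z_α and reject H0 if z < −z_α.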

18 Example: A manufacturer of glucosamine capsules claims that each capsule contains on average 500 mg of glucosamine. To test this claim, n = 40 capsules were selected and the amount of glucosamine (X) was measured in each capsule. Summary statistics: (values not reproduced in this transcript)

19 We want to test: H0: μ = 500 (the manufacturer's claim is correct) against HA: μ ≠ 500 (the manufacturer's claim is not correct).

20 The Test Statistic: z = (x̄ − 500) / (s/√40) = −2.75

21 The Critical Region and Acceptance Region: Using α = 0.05, z_{α/2} = z_{0.025} = 1.960. We accept H0 if −1.960 ≤ z ≤ 1.960 and reject H0 if z < −1.960 or z > 1.960.

22 The Decision: Since z = −2.75 < −1.960, we reject H0. Conclude: the manufacturer's claim is incorrect.
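The decision in the glucosamine example can be checked numerically. The sketch below takes z = −2.75 and α = 0.05 from the slides and uses Python's standard-library `statistics.NormalDist` to recover the critical value; the two-sided p-value is an extra not shown on the slides.

```python
from statistics import NormalDist

alpha = 0.05
z = -2.75  # value of the test statistic reported on the slide

std_normal = NormalDist()                    # standard normal: mean 0, sd 1
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.960
reject_h0 = abs(z) > z_crit                  # two-tailed critical region
p_value = 2 * std_normal.cdf(-abs(z))        # two-sided p-value (not on the slide)

print(f"critical value = {z_crit:.3f}, reject H0: {reject_h0}, p-value = {p_value:.4f}")
```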

23 Student's t-test

24 Recall: The z-test for means. The Test Statistic: z = (x̄ − μ0) / (σ/√n), approximated by (x̄ − μ0) / (s/√n) when σ is unknown.

25 Comments: The sampling distribution of this statistic is the standard normal distribution. The replacement of σ by s leaves this distribution unchanged only if the sample size n is large.

26 For small sample sizes: The sampling distribution of t = (x̄ − μ0) / (s/√n) is called Student's t distribution with n − 1 degrees of freedom.

27 Properties of Student's t distribution: Similar to the standard normal distribution (symmetric, unimodal, centred at zero), but with larger spread about zero. The reason for this is the increased variability introduced by replacing σ by s. As the sample size (and hence the degrees of freedom) increases, the t distribution approaches the standard normal distribution.
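These properties can be illustrated by evaluating the t density directly. The sketch below is not from the slides: it assumes the standard closed form of the Student t density and a helper name `t_pdf` chosen for this example, and it uses `math.lgamma` so that large degrees of freedom do not overflow.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + t * t / df))


normal_pdf = NormalDist().pdf

# Tails: at t = 3 the t density with 5 d.f. is well above the normal density.
print(t_pdf(3, 5), normal_pdf(3))

# Convergence: with many degrees of freedom the two densities nearly coincide.
print(t_pdf(3, 10_000), normal_pdf(3))
```

The heavier tails are why t critical values are larger than the corresponding z critical values for small samples.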

28

29 [Figure: the t distribution compared with the standard normal distribution]

30 The Situation: Let x1, x2, x3, …, xn denote a sample from a normal population with mean μ and standard deviation σ. Both μ and σ are unknown. We want to test whether the mean, μ, is equal to some given value μ0.

31 The Test Statistic: t = (x̄ − μ0) / (s/√n). The sampling distribution of the test statistic is the t distribution with n − 1 degrees of freedom.

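32 The Alternative Hypothesis HA and the Critical Region: for HA: μ ≠ μ0, reject H0 if t < −t_{α/2} or t > t_{α/2}; for HA: μ > μ0, reject H0 if t > t_α; for HA: μ < μ0, reject H0 if t < −t_α. Here t_α and t_{α/2} are critical values under the t distribution with n − 1 degrees of freedom.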

33 Critical values for the t-distribution (tail area α or α/2)

34 Critical values for the t-distribution are provided in tables. A link to these tables is given with today's lecture.

35 Look up df. Look up α.

36 Note: the values tabled for df = ∞ are the same values for the standard normal distribution

37 Example: Let x1, x2, x3, x4, x5, x6 denote weight loss from a new diet for n = 6 cases. Assume that x1, x2, x3, x4, x5, x6 is a sample from a normal population with mean μ and standard deviation σ. Both μ and σ are unknown. We want to test H0: μ ≤ 0 (new diet is not effective) versus HA: μ > 0 (new diet is effective).

38 The Test Statistic: t = (x̄ − 0) / (s/√n) = x̄ / (s/√n). The Critical Region: Reject H0 if t > t_α, with n − 1 = 5 degrees of freedom.

39 The Data The summary statistics:

40 The Test Statistic and the Critical Region (using α = 0.05): Reject H0 if t > t_{0.05} = 2.015 (5 degrees of freedom). Conclusion: Accept H0.
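The one-sample t-test can be sketched as follows. The transcript does not reproduce the weight-loss data, so the six values below are hypothetical and were chosen only to illustrate the calculation; the critical value 2.015 is the tabled t value for α = 0.05 with 5 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """t statistic for H0: mu = mu0, using the sample standard deviation s."""
    n = len(data)
    return (mean(data) - mu0) / (stdev(data) / sqrt(n))


# Hypothetical weight losses for n = 6 cases (NOT the data from the slides).
weight_loss = [2.0, 1.0, 1.4, -1.8, 0.9, 2.3]

T_CRIT_05_DF5 = 2.015  # tabled t_{0.05} for 5 degrees of freedom
t_stat = one_sample_t(weight_loss, 0.0)
reject_h0 = t_stat > T_CRIT_05_DF5  # one-tailed region for HA: mu > 0

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

With these made-up values the statistic falls short of the critical value, so H0 is accepted, the same kind of conclusion the slide reports for the real data.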

41 Confidence Intervals

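42 Confidence Intervals for the mean of a Normal Population, μ, using the standard normal distribution: x̄ ± z_{α/2} σ/√n (with s in place of σ when σ is unknown and n is large). Confidence Intervals for the mean of a Normal Population, μ, using the t distribution: x̄ ± t_{α/2} s/√n, with n − 1 degrees of freedom.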

43 The Data The summary statistics:

44 Example: Let x1, x2, x3, x4, x5, x6 denote weight loss from a new diet for n = 6 cases. The Data: The summary statistics:

45 Confidence Intervals (use α = 0.05)
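A t-based confidence interval for the mean can be computed as in the sketch below. The data are again hypothetical, since the transcript omits the actual values; 2.571 is the tabled t_{0.025} for 5 degrees of freedom, giving a 95% interval.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(data, t_crit):
    """Return (lower, upper) limits of x-bar +/- t_crit * s / sqrt(n)."""
    n = len(data)
    centre = mean(data)
    margin = t_crit * stdev(data) / sqrt(n)
    return centre - margin, centre + margin


# Hypothetical weight losses for n = 6 cases (NOT the data from the slides).
weight_loss = [2.0, 1.0, 1.4, -1.8, 0.9, 2.3]

T_CRIT_025_DF5 = 2.571  # tabled t_{0.025} for 5 degrees of freedom
lower, upper = t_confidence_interval(weight_loss, T_CRIT_025_DF5)
print(f"95% confidence interval for the mean: ({lower:.3f}, {upper:.3f})")
```

For these values the interval contains 0, which matches what a two-sided test at α = 0.05 would conclude on the same hypothetical data.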

46 Comparing Populations Proportions and means

47 Sums, Differences, Combinations of R.V.'s: A linear combination of random variables X, Y, ... is a combination of the form L = aX + bY + …, where a, b, etc. are numbers (positive or negative). Most common: Sum = X + Y; Difference = X − Y; simple linear combination of X: bX + a.

48 Means of Linear Combinations: If L = aX + bY + …, the mean of L is Mean(L) = a Mean(X) + b Mean(Y) + … Most common: Mean(X + Y) = Mean(X) + Mean(Y); Mean(X − Y) = Mean(X) − Mean(Y); Mean(bX + a) = b Mean(X) + a.

49 Variances of Linear Combinations: If X, Y, ... are independent random variables and L = aX + bY + …, then Variance(L) = a² Variance(X) + b² Variance(Y) + … Most common: Variance(X + Y) = Variance(X) + Variance(Y); Variance(X − Y) = Variance(X) + Variance(Y); Variance(bX + a) = b² Variance(X).

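50 Combining Independent Normal Random Variables: If X, Y, ... are independent normal random variables, then L = aX + bY + … is normally distributed. In particular: X + Y is normal with mean Mean(X) + Mean(Y) and variance Variance(X) + Variance(Y); X − Y is normal with mean Mean(X) − Mean(Y) and variance Variance(X) + Variance(Y).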
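The rules for means and variances of linear combinations can be written as a small helper. The function name `linear_combination_moments` and the numbers for X and Y are assumptions made for this illustration; the variance rule is valid only for independent random variables, as the slides state.

```python
def linear_combination_moments(terms):
    """Mean and variance of L = aX + bY + ... for independent variables.

    terms is a list of (coefficient, mean, variance) triples.
    """
    mean_l = sum(c * m for c, m, _ in terms)
    var_l = sum(c * c * v for c, _, v in terms)
    return mean_l, var_l


# Hypothetical independent variables: X with mean 10, variance 4;
# Y with mean 6, variance 9.
sum_mean, sum_var = linear_combination_moments([(1, 10, 4), (1, 6, 9)])
diff_mean, diff_var = linear_combination_moments([(1, 10, 4), (-1, 6, 9)])

print(sum_mean, sum_var)    # mean and variance of X + Y
print(diff_mean, diff_var)  # mean and variance of X - Y
```

The variance of the difference equals the variance of the sum because the coefficient −1 is squared; this is what makes the standard errors in the two-sample tests a sum of two terms.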

51 Comparing proportions. Situation: We have two populations (1 and 2). Let p1 denote the probability (proportion) of “success” in population 1. Let p2 denote the probability (proportion) of “success” in population 2. The objective is to compare the two population proportions.

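52 We want to test either H0: p1 = p2 against the two-sided alternative HA: p1 ≠ p2, or H0 against a one-sided alternative (HA: p1 > p2 or HA: p1 < p2).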

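53 The test statistic: z = (p̂1 − p̂2) / sqrt( p̂(1 − p̂)(1/n1 + 1/n2) )

54 Where: A sample of n1 is selected from population 1 resulting in x1 successes, so p̂1 = x1/n1. A sample of n2 is selected from population 2 resulting in x2 successes, so p̂2 = x2/n2. The pooled estimate of the common proportion under H0 is p̂ = (x1 + x2)/(n1 + n2).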

55 Logic:

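56 The Alternative Hypothesis HA and the Critical Region: for HA: p1 ≠ p2, reject H0 if z < −z_{α/2} or z > z_{α/2}; for HA: p1 > p2, reject H0 if z > z_α; for HA: p1 < p2, reject H0 if z < −z_α.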

57 Example: In a national study to determine if there was an increase in mortality due to pipe smoking, a random sample of n1 = 1067 male nonsmoking pensioners was observed for a five-year period. In addition, a sample of n2 = 402 male pensioners who had smoked a pipe for more than six years was observed for the same five-year period. At the end of the five-year period, x1 = 117 of the nonsmoking pensioners had died while x2 = 54 of the pipe-smoking pensioners had died. Is the mortality rate for pipe smokers higher than that for non-smokers?

58 We want to test: H0: p1 ≥ p2 against HA: p1 < p2 (the mortality rate is higher for pipe smokers).

59 The test statistic:

60 Note:

61 The test statistic:

62 We reject H0 if z falls in the one-tailed critical region at α = 0.05 (critical value z_{0.05} = 1.645). This is not true, hence we accept H0. Conclusion: There is not a significant (α = 0.05) increase in the mortality rate due to pipe-smoking.
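The pipe-smoking example can be reproduced from the counts given on the slide. The sketch below assumes the usual pooled-proportion form of the two-sample z statistic, since the transcript does not reproduce the formula; the function name `two_proportion_z` is also an assumption. The smokers are passed first so that a large positive z supports the alternative that their mortality rate is higher.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x_a, n_a, x_b, n_b):
    """z statistic for H0: p_a = p_b using the pooled proportion estimate."""
    p_a, p_b = x_a / n_a, x_b / n_b
    pooled = (x_a + x_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / std_error


# Counts from the slide: 54 of 402 pipe smokers and 117 of 1067 non-smokers died.
z_stat = two_proportion_z(54, 402, 117, 1067)
z_crit = NormalDist().inv_cdf(0.95)  # z_{0.05}, about 1.645
reject_h0 = z_stat > z_crit          # one-tailed: smokers' rate is higher

print(f"z = {z_stat:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic stays below the critical value, in agreement with the slide's decision to accept H0.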

63 Estimating a difference in proportions using confidence intervals. Situation: We have two populations (1 and 2). Let p1 denote the probability (proportion) of “success” in population 1. Let p2 denote the probability (proportion) of “success” in population 2. The objective is to estimate the difference in the two population proportions, δ = p1 − p2.

64 Confidence Interval for δ = p1 − p2, with confidence level 100P% = 100(1 − α)%: (p̂1 − p̂2) ± z_{α/2} sqrt( p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2 )

65 Example: Estimating the increase in the mortality rate for pipe smokers over that for non-smokers, δ = p2 − p1.
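A 95% confidence interval for δ = p2 − p1 can be computed from the same counts. This sketch assumes the standard unpooled standard error for a difference of two proportions; the transcript does not show the slide's own formula or result, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_difference_ci(x_a, n_a, x_b, n_b, alpha=0.05):
    """Confidence interval for p_a - p_b using the unpooled standard error."""
    p_a, p_b = x_a / n_a, x_b / n_b
    std_error = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    margin = NormalDist().inv_cdf(1 - alpha / 2) * std_error
    return (p_a - p_b) - margin, (p_a - p_b) + margin


# delta = p2 - p1: pipe smokers (54 of 402) minus non-smokers (117 of 1067).
ci_lower, ci_upper = proportion_difference_ci(54, 402, 117, 1067)
print(f"95% confidence interval for delta: ({ci_lower:.4f}, {ci_upper:.4f})")
```

Under these assumptions the interval includes zero, which agrees with the non-significant test result.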

66 Comparing Means. Situation: We have two normal populations (1 and 2). Let μ1 and σ1 denote the mean and standard deviation of population 1. Let μ2 and σ2 denote the mean and standard deviation of population 2. Let x1, x2, x3, …, xn denote a sample from normal population 1. Let y1, y2, y3, …, ym denote a sample from normal population 2. The objective is to compare the two population means.

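67 We want to test either H0: μ1 = μ2 against the two-sided alternative HA: μ1 ≠ μ2, or H0 against a one-sided alternative (HA: μ1 > μ2 or HA: μ1 < μ2).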

68 Consider the test statistic: z = (x̄ − ȳ) / sqrt( σ1²/n + σ2²/m )

69 If μ1 = μ2, this statistic will have a standard normal distribution. This will also be true for the approximation (obtained by replacing σ1 by s_x and σ2 by s_y) if the sample sizes n and m are large (greater than 30).

70 Note:

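71 The Alternative Hypothesis HA and the Critical Region: for HA: μ1 ≠ μ2, reject H0 if z < −z_{α/2} or z > z_{α/2}; for HA: μ1 > μ2, reject H0 if z > z_α; for HA: μ1 < μ2, reject H0 if z < −z_α.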

72 Example A study was interested in determining if an exercise program had some effect on reduction of Blood Pressure in subjects with abnormally high blood pressure. For this purpose a sample of n = 500 patients with abnormally high blood pressure were required to adhere to the exercise regime. A second sample m = 400 of patients with abnormally high blood pressure were not required to adhere to the exercise regime. After a period of one year the reduction in blood pressure was measured for each patient in the study.

73 We want to test: H0: μ1 ≤ μ2 (the exercise group did not have a higher average reduction in blood pressure) vs HA: μ1 > μ2 (the exercise group did have a higher average reduction in blood pressure).

74 The test statistic:

75 Suppose the data has been collected and:

76 The test statistic:

77 We reject H0 if z > z_{0.05} = 1.645. This is true, hence we reject H0. Conclusion: There is a significant (α = 0.05) effect due to the exercise regime on the reduction in blood pressure.
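The large-sample comparison of two means can be sketched as below. The sample sizes n = 500 and m = 400 come from the slides, but the transcript does not reproduce the summary statistics, so the means and standard deviations used here are hypothetical placeholders; the function name `two_sample_z` is also an assumption.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar, s_x, n, ybar, s_y, m):
    """Large-sample z statistic for H0: mu1 = mu2 (s_x, s_y replace sigma1, sigma2)."""
    return (xbar - ybar) / sqrt(s_x**2 / n + s_y**2 / m)


# Hypothetical summary statistics (NOT the values from the slides):
# exercise group: mean reduction 10, s = 4, n = 500
# control group:  mean reduction 8,  s = 4, m = 400
z_stat = two_sample_z(10.0, 4.0, 500, 8.0, 4.0, 400)
z_crit = NormalDist().inv_cdf(0.95)  # z_{0.05} for HA: mu1 > mu2
reject_h0 = z_stat > z_crit

print(f"z = {z_stat:.3f}, reject H0: {reject_h0}")
```

With samples this large, even a modest difference in means gives a large z, because the standard error shrinks with the square root of the sample sizes.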

78 Estimating a difference in means using confidence intervals. Situation: We have two populations (1 and 2). Let μ1 denote the mean of population 1. Let μ2 denote the mean of population 2. The objective is to estimate the difference in the two population means, δ = μ1 − μ2.

79 Confidence Interval for δ = μ1 − μ2: (x̄ − ȳ) ± z_{α/2} sqrt( s_x²/n + s_y²/m )

80 Example: Estimating the increase in the average reduction in blood pressure due to the exercise regime, δ = μ1 − μ2.

81 Comparing Means (small samples). Situation: We have two normal populations (1 and 2). Let μ1 and σ1 denote the mean and standard deviation of population 1. Let μ2 and σ2 denote the mean and standard deviation of population 2. Let x1, x2, x3, …, xn denote a sample from normal population 1. Let y1, y2, y3, …, ym denote a sample from normal population 2. The objective is to compare the two population means.

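82 We want to test either H0: μ1 = μ2 against the two-sided alternative HA: μ1 ≠ μ2, or H0 against a one-sided alternative (HA: μ1 > μ2 or HA: μ1 < μ2).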

83 Consider the test statistic: z = (x̄ − ȳ) / sqrt( s_x²/n + s_y²/m )

84 If the sample sizes (m and n) are large the statistic will have approximately a standard normal distribution This will not be the case if sample sizes (m and n) are small

85 The t test for comparing means (small samples). Situation: We have two normal populations (1 and 2). Let μ1 and σ denote the mean and standard deviation of population 1. Let μ2 and σ denote the mean and standard deviation of population 2. Note: we assume that the standard deviation for each population is the same: σ1 = σ2 = σ.

86 Let

87 The pooled estimate of σ. Note: both s_x and s_y are estimators of σ. These can be combined to form a single estimator of σ: s_Pooled = sqrt( ((n − 1) s_x² + (m − 1) s_y²) / (n + m − 2) ).

88 The test statistic: t = (x̄ − ȳ) / ( s_Pooled · sqrt(1/n + 1/m) ). If μ1 = μ2 this statistic has a t distribution with n + m − 2 degrees of freedom.

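89 The Alternative Hypothesis HA and the Critical Region: for HA: μ1 ≠ μ2, reject H0 if t < −t_{α/2} or t > t_{α/2}; for HA: μ1 > μ2, reject H0 if t > t_α; for HA: μ1 < μ2, reject H0 if t < −t_α. Here t_α and t_{α/2} are critical points under the t distribution with n + m − 2 degrees of freedom.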

90 Example: A study was interested in determining if administration of a drug reduces cancerous tumour size. For this purpose n + m = 9 test animals are implanted with a cancerous tumour. n = 3 are selected at random and administered the drug. The remaining m = 6 are left untreated. Final tumour sizes are measured at the end of the test period.

91 We want to test: H0: μ1 ≥ μ2 (the treated group did not have a lower average final tumour size) vs HA: μ1 < μ2 (the treated group did have a lower average final tumour size).

92 The test statistic:

93 Suppose the data has been collected and:

94 The test statistic:

95 We reject H0 if t < −t_{0.05} = −1.895, with d.f. = n + m − 2 = 7. This is not true, hence we accept H0. Conclusion: The drug treatment does not result in a significantly (α = 0.05) smaller final tumour size.
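The pooled two-sample t-test can be sketched as follows. The group sizes n = 3 and m = 6 match the slides, but the tumour sizes themselves are not reproduced in the transcript, so the values below are hypothetical; −1.895 is the negative of the tabled t_{0.05} for 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic assuming equal population standard deviations."""
    n, m = len(x), len(y)
    pooled_var = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    return (mean(x) - mean(y)) / (sqrt(pooled_var) * sqrt(1 / n + 1 / m))


# Hypothetical final tumour sizes (NOT the data from the slides).
treated = [1.8, 2.1, 2.4]                    # n = 3
untreated = [2.0, 2.3, 2.6, 2.9, 2.2, 2.4]   # m = 6

T_CRIT_05_DF7 = 1.895  # tabled t_{0.05} for n + m - 2 = 7 degrees of freedom
t_stat = pooled_t(treated, untreated)
reject_h0 = t_stat < -T_CRIT_05_DF7  # one-tailed: treated mean is lower

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

For these made-up values the statistic does not reach the critical region, so H0 is accepted; with only 9 animals, a fairly large difference would be needed to reach significance.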

