2.3 Hypothesis Testing
- Test for one and two means
- Test for one and two proportions
WHY DO WE TEST HYPOTHESES? To make decisions about populations based on sample information. Example: we wish to know whether a medicine is really effective in curing a disease, so we take a sample of patients, record the effect of the medicine on them, and make a decision from those data. To reach a decision, it is useful to make assumptions about the population. Such assumptions may or may not be true and are called statistical hypotheses.
Definitions
Hypothesis Test: A process of using sample data and statistical procedures to decide whether or not to reject a hypothesis (statement) about the value of a population parameter (or about the characteristics of its distribution).
Null Hypothesis, $H_0$: Generally a statement that a population parameter has a specific value. The null hypothesis is initially assumed to be true; it is therefore the hypothesis to be tested.
Alternative Hypothesis, $H_1$: A statement about the same population parameter that is used in the null hypothesis. Generally it specifies that the population parameter has a value different, in some way, from the value given in the null hypothesis. Rejection of the null hypothesis implies acceptance of this alternative hypothesis.
Test Statistic: A function of the sample data on which the decision is to be based.
Critical/Rejection region: The set of values of the test statistic for which the null hypothesis will be rejected.
Critical point: The first (or boundary) value of the critical region.
P-value: The probability, calculated from the test statistic under the assumption that $H_0$ is true, of obtaining a result at least as extreme as the one observed. The smaller the p-value, the more strongly the data contradict $H_0$.
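To make the p-value definition concrete, the sketch below computes the p-value of a Z test statistic for the three usual forms of $H_1$. It assumes the test statistic follows the standard normal distribution under $H_0$; the function name `z_p_value` and the labels for the alternatives are illustrative choices, not part of the notes.

```python
from statistics import NormalDist


def z_p_value(z, alternative="two-sided"):
    """Return the p-value of a Z statistic, assuming Z ~ N(0, 1) under H0.

    alternative: "greater" (H1: parameter > value in H0),
                 "less"    (H1: parameter < value in H0),
                 "two-sided" (H1: parameter != value in H0).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        return 1.0 - std_normal.cdf(z)
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "two-sided":
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# A Z value on the usual two-tailed 5% boundary gives a p-value close to 0.05.
print(round(z_p_value(1.96), 3))
```

A small p-value means that a statistic this extreme would be rare if $H_0$ were true, which is why it counts as evidence against $H_0$.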
Procedure for hypothesis testing
1. Define the question to be tested and formulate the hypotheses that state the problem.
2. Choose the appropriate test statistic and calculate its value from the sample. The choice of test statistic depends on the probability distribution of the random variable involved in the hypothesis.
3. Establish the test criterion by determining the critical value and the critical region.
4. Draw a conclusion: reject or do not reject the null hypothesis.
Example 2.7: The average monthly earnings for women in managerial and professional positions is RM 2400. Do men in the same positions have average monthly earnings that are higher than those for women? A random sample of n = 40 men in managerial and professional positions showed $\bar{x}$ = RM 3600 and s = RM 400. Test the appropriate hypothesis using $\alpha$ = 0.01.
Solution: The hypotheses to be tested are:
$H_0: \mu = 2400$
$H_1: \mu > 2400$
We use the normal distribution since n > 30 (n = 40).
Rejection region: $Z > z_{0.01} = 2.33$
Test statistic: $Z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{3600 - 2400}{400/\sqrt{40}} = 18.97$
Conclusion: Since 18.97 > 2.33, Z falls in the rejection region. We reject $H_0$ and conclude that average monthly earnings for men in managerial and professional positions are significantly higher than those for women.
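The arithmetic of Example 2.7 can be checked with a short script. This is a minimal sketch using only the figures given in the example; the variable names are illustrative, and the critical value is taken from the standard normal distribution rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 2400.0    # mean under H0 (women's average monthly earnings, RM)
x_bar = 3600.0  # sample mean for men (RM)
s = 400.0       # sample standard deviation (RM)
n = 40          # sample size
alpha = 0.01    # significance level

# One-sample Z statistic (n > 30, so s is used in place of sigma).
z = (x_bar - mu0) / (s / sqrt(n))

# Right-tailed test: reject H0 when Z exceeds the upper alpha point.
z_crit = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > z_crit

print(f"Z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

The script reproduces Z = 18.97 and the critical value 2.33 used in the conclusion.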
Example 2.8: Aisyah makes “kerepek ubi” and sells them in packets of 100 g each. 12 randomly selected packets of “kerepek ubi” are taken and their weights in g are recorded as follows:
98 102 100 96 91 97 94 101
Perform the required hypothesis test at the 5% significance level to check whether the mean weight per packet of “kerepek ubi” is not equal to 100 g.
Solution: The hypotheses to be tested are:
$H_0: \mu = 100$
$H_1: \mu \neq 100$
We use the t distribution (small sample, n = 12), two-tailed test.
Rejection region: $|t| > t_{0.025,\,11} = 2.201$
Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = -2.737$
Conclusion: Since -2.737 < -2.201, t falls in the rejection region. We reject $H_0$ and conclude that the mean weight per packet of “kerepek ubi” is not equal to 100 g.
Exercise 2.9: A teacher claims that the students in Class A put in more hours studying than other students. The mean number of hours spent studying per week is 25 hours with a standard deviation of 3 hours per week. A sample of 27 Class A students was selected at random, and the mean number of hours spent studying per week was found to be 26 hours. Can the teacher’s claim be accepted at the 5% significance level? Answer: Z = 1.7321, Do not reject $H_0$
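Hypothesis testing for the difference between two population means, $\mu_1 - \mu_2$
Test hypotheses: $H_0$ states a value for $\mu_1 - \mu_2$ (commonly 0); $H_1$ states that $\mu_1 - \mu_2$ is different from, greater than, or less than that value.
Test statistics
i) Variances $\sigma_1^2$ and $\sigma_2^2$ are known, and $n_1$ and $n_2$ are samples of any size:
$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
ii) If the population variances $\sigma_1^2$ and $\sigma_2^2$ are unknown, then the following table shows the different formulas that may be used, depending on the sample sizes and the assumption on the population variances.
Equality of variances when $\sigma_1^2$, $\sigma_2^2$ are unknown, by sample size:
$\sigma_1^2 = \sigma_2^2$, $n_1, n_2 > 30$: $Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $S_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
$\sigma_1^2 = \sigma_2^2$, $n_1, n_2 \le 30$: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, with $v = n_1 + n_2 - 2$ degrees of freedom
$\sigma_1^2 \neq \sigma_2^2$, $n_1, n_2 > 30$: $Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
$\sigma_1^2 \neq \sigma_2^2$, $n_1, n_2 \le 30$: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, with $v = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$ degrees of freedom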
Example 2.9: The mean lifetime of 30 bulbs produced by company A is 50 hours and the mean lifetime of 35 bulbs produced by company B is 48 hours. If the standard deviation of all bulbs produced by company A is 3 hours and the standard deviation of all bulbs produced by company B is 3.5 hours, test at the 1% significance level whether the mean lifetime of bulbs produced by company A is longer than that of company B. (Variances are known.)
Solution:
$H_0: \mu_A = \mu_B$
$H_1: \mu_A > \mu_B$
Test statistic: $Z = \dfrac{50 - 48}{\sqrt{\dfrac{3^2}{30} + \dfrac{3.5^2}{35}}} = 2.48$
Since $Z = 2.48 > z_{0.01} = 2.33$, we reject $H_0$. The mean lifetime of bulbs produced by company A is longer than that of company B at the 1% significance level.
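As a check on Example 2.9, the sketch below evaluates the two-sample Z statistic for known population standard deviations, using only the numbers stated in the example. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Company A and company B summary figures from Example 2.9.
n_a, mean_a, sigma_a = 30, 50.0, 3.0
n_b, mean_b, sigma_b = 35, 48.0, 3.5
alpha = 0.01

# Known variances: standard error of the difference between sample means.
std_error = sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)
z = (mean_a - mean_b) / std_error  # difference under H0 is 0

# Right-tailed test, H1: mu_A > mu_B.
z_crit = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > z_crit

print(f"Z = {z:.4f}, critical value = {z_crit:.4f}, reject H0: {reject_h0}")
```

The statistic comes out at about 2.48, which exceeds the 1% critical value of about 2.33, in agreement with the decision to reject $H_0$.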
Example 2.10: A mathematics placement test was given to two classes of 45 and 55 students respectively. In the first class the mean grade was 75 with a standard deviation of 8, while in the second class the mean grade was 80 with a standard deviation of 7. Is there a significant difference between the performances of the two classes at the 5% level of significance? Assume the population variances are equal.
Solution:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
$S_p^2 = \dfrac{(45 - 1)(8^2) + (55 - 1)(7^2)}{45 + 55 - 2} = 55.73$
Test statistic: $Z = \dfrac{75 - 80}{\sqrt{55.73}\sqrt{\dfrac{1}{45} + \dfrac{1}{55}}} = -3.33$
Since $Z = -3.33 < -z_{0.025} = -1.96$, we reject $H_0$. So there is a significant difference between the performances of the two classes at the 5% level of significance.
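The pooled-variance calculation behind Example 2.10 can be sketched as follows. It assumes the standard pooled estimate of the common variance and, because both samples exceed 30, compares the statistic with the standard normal distribution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Class summaries from Example 2.10.
n1, mean1, s1 = 45, 75.0, 8.0
n2, mean2, s2 = 55, 80.0, 7.0
alpha = 0.05

# Pooled variance, valid under the stated assumption of equal variances.
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

# Test statistic for H0: mu1 - mu2 = 0.
z = (mean1 - mean2) / (sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2))

# Two-tailed test: split alpha equally between the two tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(f"Sp^2 = {pooled_var:.2f}, Z = {z:.2f}, "
      f"critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

Taking the absolute value of Z handles both tails at once, so the same comparison works whichever class has the higher mean.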
Exercise 2.10: A sample of 60 maids from country A earn an average of RM300 per week with a standard deviation of RM16, while a sample of 60 maids from country B earn an average of RM250 per week with a standard deviation of RM18. Test at the 5% significance level whether the average earnings of country A maids exceed the average earnings of country B maids by more than RM40 per week. Answer: Z = 16.0817, Reject $H_0$
Example 2.11: When working properly, a machine that is used to make chips for calculators produces 4% defective chips. Whenever the machine produces more than 4% defective chips, it needs an adjustment. To check if the machine is working properly, the quality control department at the company often takes a sample of chips and inspects them to determine if they are good or defective. One such random sample of 200 chips taken recently from the production line contained 14 defective chips. Test at the 5% significance level whether or not the machine needs an adjustment.
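One way to work Example 2.11 is the usual large-sample Z test for a single proportion, sketched below. The hypotheses $H_0: p = 0.04$ against $H_1: p > 0.04$ and the use of the normal approximation are assumptions read off from the wording of the example, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.04       # proportion defective when the machine works properly (H0)
n = 200         # chips inspected
defective = 14  # defective chips found in the sample
alpha = 0.05

p_hat = defective / n  # sample proportion

# The standard error uses p0 because it is computed assuming H0 is true.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Right-tailed test, H1: p > 0.04 (machine needs an adjustment).
z_crit = NormalDist().inv_cdf(1 - alpha)
needs_adjustment = z > z_crit

print(f"p_hat = {p_hat:.2f}, Z = {z:.3f}, "
      f"critical value = {z_crit:.3f}, reject H0: {needs_adjustment}")
```

Under these assumptions Z is about 2.165, which exceeds the critical value of about 1.645, so $H_0$ is rejected and the sample suggests that the machine needs an adjustment.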
Exercise 2.11: A manufacturer of a detergent claimed that his detergent is at least 95% effective in removing tough stains. In a sample of 300 people who had used the detergent, 279 claimed that they were satisfied with the result. Determine whether the manufacturer’s claim is true at the 1% significance level. Answer: Do not reject $H_0$
Example 2.12: A researcher wanted to estimate the difference between the percentages of users of two toothpastes who will never switch to another toothpaste. In a sample of 500 users of toothpaste A taken by the researcher, 100 said that they will never switch to another toothpaste. In another sample of 400 users of toothpaste B taken by the same researcher, 68 said that they will never switch to another toothpaste. At the 1% significance level, can we conclude that the proportion of users of toothpaste A who will never switch to another toothpaste is higher than the proportion of users of toothpaste B who will never switch to another toothpaste?
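Example 2.12 can be worked with the usual large-sample Z test for two proportions, sketched below. The pooled estimate of the common proportion under $H_0: p_A = p_B$ is the standard textbook choice and is assumed here rather than taken from the notes; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Toothpaste A and toothpaste B samples from Example 2.12.
n_a, loyal_a = 500, 100
n_b, loyal_b = 400, 68
alpha = 0.01

p_a = loyal_a / n_a
p_b = loyal_b / n_b

# Pooled proportion: the best estimate of the common p when H0 is true.
p_pooled = (loyal_a + loyal_b) / (n_a + n_b)

z = (p_a - p_b) / sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))

# Right-tailed test, H1: p_A > p_B.
z_crit = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > z_crit

print(f"p_A = {p_a:.2f}, p_B = {p_b:.2f}, Z = {z:.3f}, "
      f"critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

Under these assumptions Z is about 1.148, below the critical value of about 2.326, so $H_0$ is not rejected: the samples do not show that the proportion for toothpaste A is higher.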
Exercise 2.12: In a study to reduce the number of deaths due to dengue fever, samples of 150 people who had developed symptoms of the fever were taken from each of two districts, district A and district B. The people in district A were given a new medication in addition to the usual ones, while the people in district B were given only the usual medication. It was found that 120 people from district A and 90 people from district B recovered from the fever. Test the hypothesis that the new medication together with the usual ones is better at curing the fever than the usual medication alone, using a 5% level of significance. Answer: Reject $H_0$
Solve using p-value
Seven people who have a problem with obesity were placed on a diet for one month. Their weights at the beginning and the end of the month were recorded as follows (assume the variances are equal). Can we conclude that there is a difference in the means of the two populations at the 5% significance level (95% confidence level)?
Subject   Begin (in kg)   End (in kg)
1         105             85
2         120
3         90              75
4         110             95
5         100
6         104             88
7         98              72
1. Construct the hypotheses.
2. Test statistic.
3. p-value (obtained from the Excel output).
4. Rejection region: reject $H_0$ if p-value < $\alpha$.
5. Conclusion: Since p-value < $\alpha$, we reject $H_0$. We can conclude that there is a difference in the means of the two populations.
EXERCISES
Exercise 2.13
1. A new concrete mix is being designed to provide adequate compressive strength for concrete blocks. The specification for a particular application calls for the blocks to have a mean compressive strength greater than 1350 kPa. A sample of 100 blocks is produced and tested. Their mean compressive strength is 1366 kPa and their standard deviation is 70 kPa. Test the hypothesis using $\alpha$ = 0.05. Answer: Do not reject $H_0$
2. A study compared the properties of welds made using carbon dioxide as a shielding gas with those of welds made using a mixture of argon and carbon dioxide. One property studied was the diameter of inclusions, which are particles embedded in the weld. A sample of 544 inclusions in welds made using argon shielding averaged 0.37 in diameter, with a standard deviation of 0.25. A sample of 581 inclusions in welds made using carbon dioxide shielding averaged 0.40 in diameter, with a standard deviation of 0.26. Can you conclude that the mean diameters of inclusions differ between the two shielding gases? Assume the population variances are not equal. Answer: Reject $H_0$
3. A method for measuring orthometric heights above sea level is presented. For a sample of 1225 baselines, 926 gave results that were within the class C spirit leveling tolerance limits. Can we conclude that this method produces results within the tolerance limits more than 75% of the time?
4. A survey asked companies which methods played a major role in the risk management strategy of their firms. In a sample of 43 oil companies, 22 indicated that risk transfer played a major role, while in a sample of 93 construction companies, 55 reported that risk transfer played a major role. Can we conclude at the 5% level that the proportion of oil companies in which risk transfer plays a major role is less than the proportion of construction companies in which it does?
5. The nicotine content in milligrams of two samples of tobacco was found to be as follows. Can we conclude that there is a difference between the means of these two samples at the 5% significance level (95% confidence level)? Use the p-value.
Sample A: 24 27 26 21 25
Sample B: 30 28 31 22 36
6. A researcher wants to compare two companies that appraise the value of residential homes. He selected a sample of 10 residential properties and scheduled both firms for an appraisal. The results were then analyzed, giving the following output:
t-Test: Two-Sample Assuming Equal Variances
                               Variable 1   Variable 2
Mean                           226.8        222.2
Variance                       208.8444     204.1778
Observations                   10           10
Pooled Variance                206.5111
Hypothesized Mean Difference   0
df                             18
t Stat                         0.715766
P(T<=t) one-tail               0.241659
t Critical one-tail            1.734064
P(T<=t) two-tail               0.483319
t Critical two-tail            2.100922
From the output, can we conclude that there is a difference between the means of these two populations at the 5% significance level (95% confidence level)? Use the p-value.
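To see how the entries of such an output fit together, the sketch below recomputes the pooled variance and the t statistic from the reported means and variances, and then applies the p-value rule. It assumes both samples contain 10 observations (consistent with df = 18) and a hypothesized mean difference of 0; the p-value itself is copied from the output, not recomputed.

```python
from math import sqrt

# Figures copied from the two-sample t-test output above.
mean1, mean2 = 226.8, 222.2
var1, var2 = 208.8444, 204.1778
n1 = n2 = 10            # assumption: 10 observations each, so df = 18
p_two_tail = 0.483319   # reported "P(T<=t) two-tail"
alpha = 0.05            # 5% significance level

# Pooled variance and degrees of freedom for the equal-variance t test.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df

# t statistic for a hypothesized mean difference of 0.
t_stat = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# p-value rule: reject H0 only when the p-value is below alpha.
reject_h0 = p_two_tail < alpha

print(f"pooled variance = {pooled_var:.4f}, df = {df}, t = {t_stat:.6f}")
print(f"p-value = {p_two_tail}, alpha = {alpha}, reject H0: {reject_h0}")
```

The recomputed pooled variance (206.5111) and t statistic (about 0.7158) agree with the "Pooled Variance" and "t Stat" rows, which is a quick way to confirm that an output has been read correctly before applying the p-value rule.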