EPIDEMIOLOGY AND BIOSTATISTICS DEPT. 2011 Estimating Population Value with Hypothesis Testing.



LULU E. BUDIMAN

Introduction

Not every member of a population can be examined, so we use data from a sample drawn from that population to estimate some measure of the population itself, such as the mean. The sample provides our best estimate of the exact 'truth' about the population. The method of sampling depends on the data available, but the ideal method is random sampling, because every member of the population has an equal chance of being selected.

We estimate limits within which we expect the 'truth' about the population to lie and state how confident we are about this estimate. There are therefore two types of estimate of a population parameter:
– Point estimate: one particular value
– Interval estimate: an interval centred on the point estimate

Estimating population

A point estimate is a single number used to estimate a population parameter. The best point estimate of the population mean is the sample mean. How accurately the sample mean estimates the population mean depends on how well the sample represents the population. An interval estimate is a range of values used to estimate a population parameter.

Hypothesis Testing

Statistics used to test hypotheses share a common general form. Hypothesis testing is generally used when some comparison is to be made.

Hypothesis testing is the use of statistics to assess how compatible the observed data are with a given hypothesis. A hypothesis, in statistics, is a claim or statement about a property of a population.

The usual process of hypothesis testing consists of four steps.
1. Formulate the null hypothesis (commonly, that the observations are the result of pure chance) and the alternative hypothesis (commonly, that the observations show a real effect combined with a component of chance variation).
2. Identify a test statistic that can be used to assess the truth of the null hypothesis.
3. Compute the P-value, which is the probability that a test statistic at least as extreme as the one observed would be obtained assuming that the null hypothesis were true. The smaller the P-value, the stronger the evidence against the null hypothesis.
4. Compare the P-value to an acceptable significance value (sometimes called an alpha value, $\alpha$). If the P-value is at or below $\alpha$, the observed effect is statistically significant, the null hypothesis is rejected, and the alternative hypothesis is accepted.

Example: Suppose we give a new cancer treatment to a group of patients, and their survival rate differs from the survival rate of those who do not receive the new treatment. What we are testing is whether the sample of patients who receive the new treatment comes from the population we already know about (cancer patients without the treatment).

Outcomes compared: Treatment A (survive / not survive) versus Treatment B (survive / not survive).

Hypotheses: what are $H_0$ and $H_1$?

The hypothesis states that a parameter of the study population (a mean, proportion, relative risk, or coefficient of correlation), which can be estimated only by observing a sample, is equal to a specified value. If the estimated value of the parameter turns out to be close enough to the hypothesized value, we accept (that is, fail to reject) the hypothesis. If not, we may have to reject the hypothesis.

A significance test estimates the likelihood that an observed result (e.g. a difference between two groups) is due to chance. In other words, a significance test is used to find out whether a result observed in a sample can be considered to exist in the population from which the sample was drawn.

Example: We are investigating the medical risks associated with a certain occupation. We take a random sample of 20 men aged [...] and their mean systolic blood pressure is found to be [...] mmHg. Suppose past experience has told us that in the population at large the mean systolic blood pressure for men of this age group is $\mu$ = [...] mmHg with standard deviation $\sigma$ = 15.1 mmHg. Does the evidence of our sample indicate an increased blood pressure associated with this occupation?

Suppose, for the moment, we propose the hypothesis that there is no increase in blood pressure in this occupation, so that the sample of 20 men can be regarded as a random sample from the whole population of men aged [...] years. Then we know (from past experience) that the means of samples of 20 will be distributed normally about a mean of $\mu$ = [...] mmHg, with standard deviation $\sigma/\sqrt{n} = 15.1/\sqrt{20} = 3.38$ (the standard error of the mean).

From what we know of the normal distribution, sample means outside the range $\mu \pm 1.96 \times 3.38$, i.e. outside [...] to [...] mmHg, would occur in only 5% of samples of this size, i.e. with probability 0.05. Our sample mean lies outside this range because it is [...] mmHg. What can we conclude?
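The arithmetic behind this range can be checked with a short Python sketch. It uses only the values that survive in the example ($\sigma$ = 15.1 mmHg, n = 20); the variable names are illustrative, and the population mean itself is not given here, so only the standard error and the half-width of the range are computed.

```python
import math

sigma = 15.1  # population standard deviation in mmHg (from the example)
n = 20        # sample size (from the example)

# Standard error of the mean: sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)

# Half-width of the range that contains 95% of sample means
half_width = 1.96 * standard_error

print(f"standard error = {standard_error:.2f} mmHg")
print(f"95% of sample means lie within mu +/- {half_width:.2f} mmHg")
```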

1. Our hypothesis that there is no increase in systolic blood pressure in this occupation is correct, and our sample mean was large purely by an unfortunate sampling fluke. That is, a result as extreme as our sample mean, which has a probability of less than 0.05, just happened to occur.
2. Our hypothesis that there is no increase in systolic blood pressure in this occupation is wrong.

We cannot be sure which of these alternatives is correct, but because the probability that (1) is the correct conclusion is too small, we are obliged to conclude (2). Thus we conclude that it is likely that there is an increase in systolic blood pressure among men in this occupation, and the probability P that we are wrong is less than 0.05. We write this as p < 0.05. This type of argument is called a significance test.

TEST STATISTIC PROVED !!


From the formula for the 95% confidence interval for $\mu$:

$$\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$

or equivalently, if

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

is numerically greater than 1.96, we say the difference between $\bar{x}$ and $\mu$ is significant at the 5% level and we write p < 0.05. If Z is numerically greater than 2.58, the difference is significant at the 1% level and we write p < 0.01.

From the formula for the 95% confidence interval for $\pi$:

$$\pi = p \pm 1.96\,\sqrt{\frac{\pi(1-\pi)}{n}}$$

or equivalently, with

$$Z = \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$$

if Z is numerically greater than 1.96, the difference is significant at the 5% level and we write p < 0.05; if Z is numerically greater than 2.58, the difference is highly significant (p < 0.01).

Estimation Process (figure): a population whose mean $\mu$ is unknown; a random sample with mean $\bar{X} = 50$; conclusion: "I am 95% confident that $\mu$ is between 40 and 60."

Interpretation of a Confidence Interval

$(1 - \alpha) \times 100\%$ is:
– the percentage of confidence intervals, constructed from different samples, that will actually contain the population mean;
– the probability that you obtain a confidence interval that contains the population mean.

Often it is more useful to quote two limits between which the parameter is expected to lie, together with the probability of it lying in that range. The limits are called the confidence limits and the interval between them the confidence interval, e.g. we are 95% confident that the mean male height lies between 158 cm and 175 cm.
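The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original slides: the population values (`mu = 50`, `sigma = 10`), the sample size, the number of trials, and the function name `coverage` are all hypothetical choices, and $\sigma$ is treated as known.

```python
import math
import random


def coverage(mu=50.0, sigma=10.0, n=25, trials=2000, z=1.96, seed=1):
    """Fraction of simulated 95% intervals (sigma known) that contain mu.

    All default values are hypothetical illustration values.
    """
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Does the interval around this sample mean contain the true mean?
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / trials


print(f"share of intervals containing mu: {coverage():.3f}")
```

The printed share should come out close to 0.95, which is what "95% confident" refers to: the procedure captures the population mean in about 95% of repeated samples.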

The width of the confidence interval depends on three sensible factors:
– the degree of confidence we wish to have in it, i.e. the chance of it including the 'truth', e.g. 95%;
– the size of the sample, n;
– the amount of variation among the members of the sample, i.e. its standard deviation, s.
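A minimal Python sketch of how each of the three factors moves the width, assuming a normal-approximation interval for a mean (width $= 2 z s / \sqrt{n}$). The function name `ci_width` and the numbers passed to it are hypothetical.

```python
import math
from statistics import NormalDist


def ci_width(confidence, n, s):
    """Full width of a normal-approximation confidence interval for a mean."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * s / math.sqrt(n)


base = ci_width(0.95, 25, 10.0)          # hypothetical baseline
print(ci_width(0.99, 25, 10.0) > base)   # higher confidence widens the interval
print(ci_width(0.95, 100, 10.0) < base)  # a larger sample narrows it
print(ci_width(0.95, 25, 20.0) > base)   # more variation widens it
```

Because n enters through $\sqrt{n}$, the sample size must be quadrupled to halve the width.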

P-value

The P-value is the probability of observing a sample statistic at least as extreme as the test statistic, assuming the null hypothesis is true.

Interpret the results: if the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
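For a test statistic that follows the standard normal distribution, the two-sided P-value can be computed directly. The sketch below assumes a two-sided test; the function name `two_sided_p_value` is hypothetical.

```python
import math


def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))


# The familiar critical values correspond to P-values near 0.05 and 0.01
for z in (1.96, 2.58):
    print(f"z = {z}: P = {two_sided_p_value(z):.4f}")
```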

Conclusions in hypothesis testing
* Always test the null hypothesis:
  - Reject $H_0$
  - Fail to reject $H_0$

Hypothesis test of a population mean, $\mu$

The variable X is normally distributed in the population with mean $\mu$ and variance $\sigma^2$. Two situations are considered: (1) $\sigma^2$ known (from previous experience); (2) $\sigma^2$ unknown.

1. $\sigma^2$ known

The same procedure applies to a test of any parameter that is estimated by a statistic whose sampling distribution is normal. The procedure is:
a. Specify $H_0: \mu = \mu_0$, where $\mu_0$ is a particular value.
b. Specify $H_1: \mu \neq \mu_0$, say.
c. Select a random sample of observations, $x_1, x_2, \ldots, x_n$.
d. Compute from the sample $\bar{x} = \sum x_i / n$.
e. Consider the test statistic

$$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

f. Determine the critical region from tables of the standard normal distribution (see Table 1). Since the specification of $H_1$ has no direction, the critical region consists of both tails of the distribution. Thus, for a two-tailed test at the $2\alpha$ level of significance, reject $H_0$ if $|Z| > Z(\alpha)$, i.e. if $Z > Z(\alpha)$ or $Z < -Z(\alpha)$. In particular, if $2\alpha = 0.05$ then $Z(\alpha) = 1.96$; if $2\alpha = 0.01$ then $Z(\alpha) = 2.58$.
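Steps c to f can be sketched in Python as follows. This is an illustration only: the function name `z_test_mean`, its parameters, and the sample values are hypothetical, and the critical value is taken from the standard library's normal distribution instead of a printed table.

```python
import math
from statistics import NormalDist


def z_test_mean(xs, mu0, sigma, two_alpha=0.05):
    """Two-tailed z test of H0: mu = mu0 when sigma is known.

    Returns (Z, critical value Z(alpha), reject H0?).
    """
    n = len(xs)
    x_bar = sum(xs) / n                               # step d
    z = (x_bar - mu0) / (sigma / math.sqrt(n))        # step e
    z_crit = NormalDist().inv_cdf(1 - two_alpha / 2)  # step f
    return z, z_crit, abs(z) > z_crit


# Hypothetical data: eight observations, H0: mu = 50, known sigma = 4
sample = [52, 48, 55, 51, 49, 53, 50, 54]
z, z_crit, reject = z_test_mean(sample, mu0=50, sigma=4)
print(f"Z = {z:.3f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

Here $|Z|$ is below the critical value, so $H_0$ is not rejected for these made-up data.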

Standard Normal Distribution Table

2. $\sigma^2$ unknown

a. Consider the test statistic

$$T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

where s is the sample estimator of $\sigma$. T has a t-distribution on $n - 1$ degrees of freedom.
b. Determine the critical region from tables of the t-distribution (Table 2). For a two-tailed test at the $2\alpha$ level of significance, reject $H_0$ if $|T| > t_{n-1}(\alpha)$, i.e. if $T > t_{n-1}(\alpha)$ or $T < -t_{n-1}(\alpha)$.

Example: The sleeping times from nine observations are 25, 31, 24, 28, 29, 30, 31, 33 and 35 min. From these we wish to test, at the 0.05 significance level, $H_0: \mu = 26$ versus $H_1: \mu \neq 26$. Suppose that the population variance is unknown and must be estimated from the sample. We assume the nine observations are from a normal population.

From these data, we compute $\bar{x} = 29.56$, $s^2 = 12.53$, $s = 3.539$. From Table 2 (appendix), $t_8 = 2.306$, and we reject $H_0$ if the computed $|T|$ exceeds 2.306. The computed T is

$$T = \frac{29.56 - 26}{3.539/\sqrt{9}} = 3.02$$

which exceeds 2.306; thus we reject $H_0$ at the 0.05 significance level.
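The worked example can be reproduced with Python's standard library. The data, $\mu_0 = 26$, and the critical value 2.306 come from the example itself; the variable names are illustrative. Without intermediate rounding, T comes out at about 3.01 instead of 3.02, and the conclusion is unchanged.

```python
import math
from statistics import mean, stdev

times = [25, 31, 24, 28, 29, 30, 31, 33, 35]  # sleeping times in minutes
mu0 = 26        # hypothesized mean under H0
t_crit = 2.306  # two-tailed 5% critical value for 8 degrees of freedom (from the table)

n = len(times)
x_bar = mean(times)
s = stdev(times)  # sample standard deviation, divisor n - 1
t_stat = (x_bar - mu0) / (s / math.sqrt(n))

print(f"mean = {x_bar:.2f}, s^2 = {s**2:.2f}, s = {s:.3f}, T = {t_stat:.2f}")
print("reject H0" if abs(t_stat) > t_crit else "fail to reject H0")
```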


Hypothesis test of a population proportion, $\pi$

The procedure is to:
a. Specify $H_0: \pi = \pi_0$, where $\pi_0$ is a particular value.
b. Specify $H_1: \pi \neq \pi_0$, say.
c. Select a random sample of n individuals and determine the number, x, of them with the characteristic.
d. Compute from the sample $p = x / n$.
e. Consider the test statistic

$$Z = \frac{p - \pi_0}{\sqrt{\pi_0 (1 - \pi_0)/n}}$$

This test statistic has a standard normal distribution.
f. Determine the critical region from tables of the standard normal distribution. For a two-tailed test at the $2\alpha$ level of significance, reject $H_0$ if $|Z| > Z(\alpha)$, i.e. if $Z > Z(\alpha)$ or $Z < -Z(\alpha)$.
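A minimal Python sketch of the test statistic for a proportion. The function name `z_test_proportion` and the counts in the usage example are hypothetical; the standard error uses the hypothesized value $\pi_0$, as in step e.

```python
import math


def z_test_proportion(x, n, pi0):
    """Z statistic for H0: pi = pi0, given x individuals with the characteristic out of n."""
    p = x / n                                # step d: sample proportion
    se = math.sqrt(pi0 * (1 - pi0) / n)      # standard error under H0
    return (p - pi0) / se                    # step e


# Hypothetical data: 45 of 150 individuals have the characteristic, H0: pi = 0.25
z_prop = z_test_proportion(45, 150, 0.25)
print(f"Z = {z_prop:.3f}, reject H0 at the 5% level: {abs(z_prop) > 1.96}")
```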

PROBLEMS

1. The mean level of prothrombin in the normal population is known to be 20 mg/100 ml of plasma, with a standard deviation of 4 mg/100 ml. A sample of 40 patients showing vitamin K deficiency has a mean prothrombin level of 18.5 mg/100 ml. How reasonable is it to conclude that the true mean for patients with vitamin K deficiency is the same as that for the normal population?
2. The height of adults living in a suburban area of a large city has a mean equal to 160 cm, with standard deviation 7.5 cm. In a sample of 178 adults living in the inner city area, the mean height is found to be 156 cm. Assuming the same standard deviation for the two groups, are the mean heights significantly different?

3. A program to stop smoking expects to obtain a 75% success rate. The observed number of definitive cessations in a group of 100 adults attending the program is 80. Is this sufficient evidence to conclude that the success rate has increased?
4. From population mortality data, suppose that 4% of males aged 65 die within one year. If it is found that 60 of such males in a group of 1000 die within a year, is this evidence of an increase in mortality in this sample?