Published by Thomas Webster. Modified over 9 years ago.
1
Statistical Hypothesis Testing
2
Suppose you have a random variable X (number of vehicle accidents in a year, stock market returns, time between El Niño events, etc.) from some unknown distribution D. You formulate a hypothesis that a real-valued function f of this random variable satisfies the condition

H0: f(X) = c   vs.   H1: f(X) > c (or f(X) < c)

Some usual examples are f(X) = E(X), i.e. the population mean is equal to some known value c = µ0, or f(X) = Var(X) and c = σ0². For historical reasons, H0 is called the null hypothesis and H1 the alternative.

Given some samples from the distribution D, we want to decide whether to accept or reject H0, based on statistics computed from the samples. Statistical hypothesis testing is a statement about probabilities. Nothing is guaranteed. A good test should lead to a correct decision "most of the time".
3
What you need:

1) A good sample statistic or estimator for f(X), call it m(X1, X2, …, Xn).
What is a good sample statistic? Example: f(X) = E(X), the expected value of X. A bad statistic: you have 100 samples, and you use m = (X1 + X2 + … + X10)/10 as your sample mean, ignoring 90 of them. A better statistic: m = (X1 + X2 + … + Xn)/n. It is good because it is unbiased (E(m) equals the theoretical mean of D in the above example) and has minimum variance among unbiased linear combinations of the samples. Other criteria for goodness are consistency and linearity.

2) Large enough sample size.
The more data, the better the sample estimate. The law of large numbers states that the sample mean (X1 + X2 + … + Xn)/n converges to the true mean of the distribution D as n -> ∞, provided that the population mean exists and the Xi are independent, each with distribution D.
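The difference between the "bad" and the "better" statistic can be checked by simulation. The sketch below is only an illustration under assumed settings: a normal population with mean 5 and standard deviation 2, and the hypothetical helper names `mean_of_first` and `compare_estimators`, none of which come from the slides. Both estimators should average out near the true mean, but the one using 10 of the 100 samples should scatter far more from run to run.

```python
import random
import statistics


def mean_of_first(samples, k):
    """Average of the first k observations only."""
    return sum(samples[:k]) / k


def compare_estimators(trials=2000, n=100, k=10, mu=5.0, sigma=2.0, seed=0):
    """Repeat the experiment many times and collect both estimates of the mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    partial, full = [], []
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]  # assumed population
        partial.append(mean_of_first(xs, k))  # "bad": uses 10 of 100 samples
        full.append(mean_of_first(xs, n))     # "better": uses every sample
    return partial, full


partial, full = compare_estimators()
var_partial = statistics.variance(partial)
var_full = statistics.variance(full)
print(f"mean of estimates:     partial={statistics.mean(partial):.3f}, "
      f"full={statistics.mean(full):.3f}")
print(f"variance of estimates: partial={var_partial:.4f}, full={var_full:.4f}")
```

Both estimators are unbiased, so the printed means should be close to each other. The gap shows up in the variances, which is why the full-sample mean is preferred.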
4
3) f(X) exists and is finite.
Implicit in the previous statements is the assumption that f(X) exists and has a finite value. For example, if f(X) = E(X), then we are assuming that the distribution D has a finite mean. This may not be the case, for example when D is a Cauchy distribution. The simplest (standard) Cauchy distribution has pdf (probability density function)

p(x) = 1 / (π (1 + x²)),   -∞ < x < ∞

The integral of x p(x) over the real line does not converge, so the mean is not defined, and neither is the variance. It therefore does not make sense to test for E(X) = µ0. Be careful what you are testing for.

By the way, the ratio of two independent normal N(0,1) random variables is Cauchy distributed.
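A small simulation can make both remarks about the Cauchy distribution concrete. This is a sketch under assumptions, not part of the original slides: the helper names (`cauchy_cdf`, `ratio_of_normals`, `empirical_cdf`), the sample size and the seed are all illustrative. It draws ratios of two independent N(0,1) values, compares their empirical distribution with the standard Cauchy CDF 1/2 + arctan(x)/π, and records the running mean at a few sample sizes so you can see that it does not settle down the way a finite-mean distribution would.

```python
import math
import random


def cauchy_cdf(x):
    """CDF of the standard Cauchy distribution."""
    return 0.5 + math.atan(x) / math.pi


def ratio_of_normals(rng):
    """Ratio of two independent N(0,1) draws (denominator redrawn if exactly 0)."""
    den = 0.0
    while den == 0.0:
        den = rng.gauss(0.0, 1.0)
    return rng.gauss(0.0, 1.0) / den


def empirical_cdf(xs, x):
    """Fraction of the sample that is <= x."""
    return sum(1 for v in xs if v <= x) / len(xs)


rng = random.Random(1)  # fixed seed so the run is repeatable
samples = [ratio_of_normals(rng) for _ in range(50_000)]

# Largest gap between the empirical and theoretical CDF at a few points.
checkpoints = [-2.0, -1.0, 0.0, 1.0, 2.0]
max_gap = max(abs(empirical_cdf(samples, x) - cauchy_cdf(x)) for x in checkpoints)
print(f"largest CDF gap at checkpoints: {max_gap:.4f}")

# Running mean at increasing sample sizes; for a Cauchy sample it keeps jumping.
running_means = {}
total = 0.0
for i, v in enumerate(samples, start=1):
    total += v
    if i in (100, 1_000, 10_000, 50_000):
        running_means[i] = total / i
for size, value in running_means.items():
    print(f"running mean after {size:>6} samples: {value:.3f}")
```

Rerunning with different seeds is instructive: the CDF gap stays small, while the running means change unpredictably because a single extreme draw can dominate the sum.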
5
No assumption is made that the distribution D is normal, and it does not have to be. However, it is known that if D has finite mean and finite variance, and if the sample size n is large (roughly 100 or more), then the statistic m(X1, X2, …, Xn) = (X1 + X2 + … + Xn)/n is approximately normally distributed. This is the Central Limit Theorem of probability. The result is often used to justify applying normal-distribution statistics to large datasets.

It is important to observe that the distribution D' of the test statistic m (which is not the same as D) is completely known once the null hypothesis is assumed to be true.
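The Central Limit Theorem claim can be checked numerically. The following sketch assumes an exponential population (chosen here only because it is clearly non-normal), n = 100 and 5000 repetitions; the function name `standardized_mean` is hypothetical. If the sample mean is close to normal, roughly 95% of the standardized means should fall within ±1.96.

```python
import math
import random


def standardized_mean(rng, n, rate=1.0):
    """Draw n exponential samples and return (mean - mu) / (sigma / sqrt(n))."""
    xs = [rng.expovariate(rate) for _ in range(n)]
    mu = sigma = 1.0 / rate  # an exponential has mean and std. dev. both 1/rate
    return (sum(xs) / n - mu) / (sigma / math.sqrt(n))


rng = random.Random(42)  # fixed seed so the run is repeatable
z_values = [standardized_mean(rng, n=100) for _ in range(5000)]

# For a standard normal, about 95% of the mass lies within +/-1.96.
inside = sum(1 for z in z_values if abs(z) <= 1.96) / len(z_values)
print(f"fraction of standardized means within +/-1.96: {inside:.3f}")
```

Lowering `n` to a small value such as 5 should make the agreement visibly worse, which is the practical meaning of "large enough" in the slide.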
6
How does it work:

Let m = m(X1, X2, …, Xn) be our test statistic. m is itself a random variable, with probability distribution D'. Assuming that the null hypothesis is true determines D' completely, so we can find the value q (either directly or using lookup tables) such that

Prob(Z > m) = q

where Z is a random variable with distribution D' and m is the value of the statistic computed from the sample. This q is commonly called the p-value. We then select a number α (called the significance level, usually .05), and

reject H0 if q < α
accept H0 if q > α
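The procedure above can be sketched in a few lines for one simple case. The assumptions here are mine, not the slides': the statistic is a sample mean whose standard deviation σ is known, so that D' under H0 is normal with mean µ0 and standard deviation σ/√n. The function names `upper_tail_q` and `decide` and the numbers in the usage lines are hypothetical. The slide does not say what to do when q equals α exactly; this sketch treats that case as "accept".

```python
import math
from statistics import NormalDist


def upper_tail_q(m, mu0, sigma, n):
    """q = Prob(Z > m), where Z follows the null distribution of the sample mean."""
    null_dist = NormalDist(mu=mu0, sigma=sigma / math.sqrt(n))  # D' under H0
    return 1.0 - null_dist.cdf(m)


def decide(q, alpha=0.05):
    """Reject H0 when q < alpha; otherwise accept (ties treated as accept)."""
    return "reject H0" if q < alpha else "accept H0"


# Hypothetical numbers: H0 says the mean is 30, sigma = 3, and n = 36 samples.
q_far = upper_tail_q(m=31.0, mu0=30.0, sigma=3.0, n=36)   # two standard errors above
q_near = upper_tail_q(m=30.5, mu0=30.0, sigma=3.0, n=36)  # one standard error above
print(f"m = 31.0: q = {q_far:.4f} -> {decide(q_far)}")
print(f"m = 30.5: q = {q_near:.4f} -> {decide(q_near)}")
```

When σ is not known and the sample is small, the null distribution is no longer normal and a Student's t distribution is the usual replacement.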
7
What can happen:

               H0 true                 H0 false
Accept H0      p                       β (type II error)
Reject H0      1 - p (type I error)    1 - β

α = 1 - p is the significance level of the test. This is usually set to a value of 0.05. This means that we want the probability of making a type I error (a false positive) to be small, i.e.

Prob(Reject H0 | H0 is true) = 0.05
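The two columns of this table can be estimated by simulation. The sketch below is an illustration under assumed settings (a one-sided test on a normal mean with known σ = 3, n = 25, µ0 = 30); the helper names `rejects_h0` and `rejection_rate` are hypothetical. Sampling with H0 true should give a rejection rate near α, and sampling with H0 false gives an estimate of 1 - β.

```python
import math
import random
from statistics import NormalDist


def rejects_h0(xs, mu0, sigma, alpha=0.05):
    """One-sided test of H0: mean = mu0 against mean > mu0, with known sigma."""
    n = len(xs)
    m = sum(xs) / n
    q = 1.0 - NormalDist(mu0, sigma / math.sqrt(n)).cdf(m)
    return q < alpha


def rejection_rate(true_mu, mu0=30.0, sigma=3.0, n=25, trials=10_000, seed=7):
    """Fraction of simulated samples (drawn with mean true_mu) that reject H0."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        xs = [rng.gauss(true_mu, sigma) for _ in range(n)]
        rejections += rejects_h0(xs, mu0, sigma)
    return rejections / trials


type_i_rate = rejection_rate(true_mu=30.0)  # H0 true: estimates alpha = 1 - p
power_at_32 = rejection_rate(true_mu=32.0)  # H0 false: estimates 1 - beta
print(f"Prob(reject | H0 true)  ~ {type_i_rate:.3f}")
print(f"Prob(reject | mean = 32) ~ {power_at_32:.3f}  (so beta ~ {1 - power_at_32:.3f})")
```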
8
Another way of interpreting this: if H0 is in fact true, there is only a 5% chance that the test will wrongly reject it.

Note that accepting the null hypothesis H0 is a weaker conclusion than rejecting H0. It only means that the test provides no evidence to contradict our assumption that H0 is true.

In some cases, it may be desirable to minimize type II errors as well. The power of a test is defined by

Prob(Reject H0 | H0 is false) = 1 - β

The more powerful a test, the smaller the chance of making a type II error. The function

H(u, m) = Prob(Z > m | f(X) = u)

is known as the power function of the test.
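To see how the power function behaves, here is a sketch for one specific case. The assumptions are not from the slides: a sample mean with known σ = 3 and n = 25, tested one-sided against µ0 = 30 at α = 0.05, with the hypothetical name `power_function`. The threshold `m_crit` is taken as the value of m whose upper-tail probability under H0 is exactly α, so H(µ0, m_crit) should equal α and H should rise as the true mean u moves above µ0.

```python
import math
from statistics import NormalDist


def power_function(u, m, sigma, n):
    """H(u, m) = Prob(Z > m | true mean = u) for a sample mean with known sigma."""
    return 1.0 - NormalDist(mu=u, sigma=sigma / math.sqrt(n)).cdf(m)


# Hypothetical test settings.
mu0, sigma, n, alpha = 30.0, 3.0, 25, 0.05

# Rejection threshold: the m whose upper-tail probability under H0 equals alpha.
m_crit = NormalDist(mu=mu0, sigma=sigma / math.sqrt(n)).inv_cdf(1.0 - alpha)

true_means = (30.0, 30.5, 31.0, 31.5, 32.0)
powers = [power_function(u, m_crit, sigma, n) for u in true_means]
for u, h in zip(true_means, powers):
    print(f"true mean {u:4.1f}: Prob(reject H0) = {h:.3f}")
```

Increasing `n` in this sketch steepens the curve, which is one way to reduce type II errors without changing α.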
9
Example: Annual rainfall (in inches) for 8 years at some locality:

34.1, 33.7, 27.4, 31.1, 30.9, 35.2, 28.4, 32.1

Rainfall is historically known to be normally distributed with mean 30 inches but with unknown variability. It is hypothesized that the mean annual rainfall has increased of late.

H0: µ = 30
H1: µ > 30

Sample statistic m = X̄ = 31.6; sample variance s² = 7.5. Note that (X̄ - µ)/(s/√n) has Student's t distribution with 8 - 1 = 7 degrees of freedom (written t7 below), therefore

Prob(Z > 31.6 | µ = 30) = Prob(t7 > (31.6 - 30)/√(7.5/8)) = Prob(t7 > 1.65) = .072 > .05

So we accept the null hypothesis at the .05 significance level. This test gives no evidence to support µ > 30 inches.

Suppose we have evidence from another dataset collected nearby that the mean annual rainfall is actually 31 in. The power of our test is then

Prob(Z > 31.6 | µ = 31) = Prob(t7 > (31.6 - 31)/√(7.5/8)) = Prob(t7 > 0.62) = .28

with β = 1 - .28 = .72.
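The arithmetic in this example can be reproduced with the standard library alone. The sketch below is one possible way to do it, not the method used to prepare the slides: the helper names `t_pdf` and `t_upper_tail` are hypothetical, and the Student's t tail probability is obtained by Simpson's rule on the density rather than from a lookup table. Because it uses the unrounded mean and variance, the results should be close to, but not digit-for-digit identical with, the rounded figures quoted above.

```python
import math
import statistics

rainfall = [34.1, 33.7, 27.4, 31.1, 30.9, 35.2, 28.4, 32.1]


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Prob(T > t), via Simpson's rule on [0, |t|] and symmetry (steps is even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # Prob(0 < T < |t|)
    return 0.5 - area if t >= 0 else 0.5 + area


n = len(rainfall)
m = statistics.mean(rainfall)       # sample mean
s2 = statistics.variance(rainfall)  # sample variance (divides by n - 1)
std_err = math.sqrt(s2 / n)
df = n - 1

t_null = (m - 30.0) / std_err       # statistic under H0: mu = 30
q = t_upper_tail(t_null, df)
decision = "reject H0" if q < 0.05 else "accept H0"
print(f"mean = {m:.2f}, s^2 = {s2:.2f}, t = {t_null:.2f}, q = {q:.3f} -> {decision}")

# Same tail probability evaluated at mu = 31, as in the second part of the example.
tail_at_31 = t_upper_tail((m - 31.0) / std_err, df)
print(f"Prob(Z > m | mu = 31) = {tail_at_31:.2f}")
```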
10
Conclusion: Hypothesis testing is just another tool for analyzing data. It cannot be relied upon by itself as proof or disproof of an assumption. It may not even be practical. Independent verification of the test result using other methods is always necessary. “Absence of evidence is not the same as evidence of absence.” - Carl Sagan