Chapter 2 HYPOTHESIS TESTING


1 Chapter 2 HYPOTHESIS TESTING
BAE 5333 Applied Water Resources Statistics
Biosystems and Agricultural Engineering Department
Division of Agricultural Sciences and Natural Resources
Oklahoma State University
Source: Dr. Dennis R. Helsel & Dr. Edward J. Gilroy, 2006 Applied Environmental Statistics Workshop, and Statistical Methods in Water Resources

2 Which Statistical Test to Use?
- Type of variables
  - Grouped and continuous
  - Qualitative and quantitative
- Distribution of data
  - Normal
  - Non-normal

3 Statistical Test Based on Variable Type

4 Parametric vs. Nonparametric
Parametric test
- Assumes a specific underlying distribution for the data, or invokes the Central Limit Theorem (see next slide)
- Computed with the mean and standard deviation (i.e., uses parameters)
- The normal distribution is the most common assumption; the data should look symmetric

5 Central Limit Theorem (CLT)
Given a distribution with mean μ and variance σ², the sampling distribution of the mean approaches a normal distribution with mean μ and variance σ²/N as the sample size N increases.
- No matter what the shape of the original distribution, the sampling distribution of the mean approaches a normal distribution.
- For most distributions, a normal distribution is approached very quickly as N increases.
- Parametric tests use the sample mean, which the CLT tells us will be approximately normally distributed.

6 Central Limit Theorem Example
A computer sampled N values from a uniform distribution and computed their mean. This procedure was performed 500 times for each sample size: N = 1, 4, 7, and 10.
[Figure: uniform distribution (probability vs. value x, between a and b) and the distribution of the mean for each sample size N]
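The experiment described on this slide can be approximated with a short simulation. This is a minimal sketch, not the original program: the endpoints a = 0 and b = 1, the fixed seed, and the function name `simulate_sample_means` are all assumptions made for illustration.

```python
import random
import statistics

def simulate_sample_means(n, repeats=500, low=0.0, high=1.0, seed=0):
    """Draw `repeats` samples of size n from Uniform(low, high); return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.uniform(low, high) for _ in range(n))
        for _ in range(repeats)
    ]

# Sample sizes from the slide; a = 0 and b = 1 are assumed endpoints.
results = {}
for n in (1, 4, 7, 10):
    means = simulate_sample_means(n)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"N={n:2d}  mean of means={results[n][0]:.3f}  sd of means={results[n][1]:.3f}")
```

As N grows, the means should stay centered near (a + b)/2 while their spread shrinks, which is the behavior the CLT describes.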

7 Central Limit Theorem Examples
- Central Limit Theorem Applet: demonstrates the CLT using dice rolling
  - Large number of rolls → bell-shaped curve
  - One die → approximately uniform distribution
- The Central Limit Theorem - How to Tame Wild Populations

8 Central Limit Theorem, cont.
A sampling distribution of the mean always has less variability than the population (for n > 1).

σs = σp / √n
  σs = standard deviation of the sampling distribution
  σp = standard deviation of the population
  n = sample size

- n = 100 → 1/10 the variability of the population
- n = 10,000 → 1/100 the variability of the population

The sampling distribution will act more like a normal distribution as the sample size increases, for any type of distribution.

If the CLT did not exist, we could not use parametric statistics, since we could not reliably estimate a parameter like the mean by using an average derived from a much smaller sample.
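The relationship σs = σp / √n is easy to check numerically. The sketch below is illustrative only: the helper name `sampling_sd` and the population standard deviation of 20 are hypothetical values, not taken from the slides.

```python
import math

def sampling_sd(population_sd, n):
    """Standard deviation of the sampling distribution of the mean: sigma_p / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return population_sd / math.sqrt(n)

population_sd = 20.0  # hypothetical population standard deviation
for n in (1, 100, 10_000):
    ratio = sampling_sd(population_sd, n) / population_sd
    print(f"n={n:6d}  sigma_s={sampling_sd(population_sd, n):.2f}  fraction of sigma_p={ratio:.2f}")
```

A sample size of 100 gives one tenth of the population standard deviation, and a sample size of 10,000 gives one hundredth.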

9 Parametric vs. Nonparametric
Nonparametric test
- Does not assume a specific distribution for the data (distribution free)
- Uses the median and percentiles
- Information is extracted from the positions (ranks) of the data
- Data may be symmetric or skewed
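One way to see why rank-based methods tolerate skewed data is to inflate a single value and compare what changes. The sketch below uses made-up numbers, and the `ranks` helper is a hypothetical illustration rather than a function from any particular library.

```python
import statistics

def ranks(values):
    """Return 1-based ranks, assigning the average rank to tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

clean = [12.0, 15.0, 9.0, 20.0, 14.0]    # hypothetical data
skewed = [12.0, 15.0, 9.0, 200.0, 14.0]  # same data, largest value inflated

print(ranks(clean), ranks(skewed))                            # ranks are identical
print(statistics.median(clean), statistics.median(skewed))    # median is unchanged
print(statistics.fmean(clean), statistics.fmean(skewed))      # mean shifts strongly
```

The ranks and the median are identical for both lists, while the mean moves from 14 to 50.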

10 Choice Based on Data Distribution
Close to a normal distribution?
- Yes: parametric or nonparametric test
- No: use a nonparametric test or transform the data

11 How Hypothesis Tests Work
1. Choose a test procedure based on the type of variable and the data distribution
2. Establish the hypotheses
   - Null (Ho)
   - Alternative (Ha)
3. Select alpha (the significance level)
4. Compute the test statistic and obtain the p-value (a measure of how much evidence we have against the null hypothesis)
5. Compare the p-value to alpha
6. Reject Ho if p-value < alpha
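The steps above can be sketched end to end with a two-sample permutation test, a nonparametric procedure chosen here only because it needs nothing beyond the standard library; the slides do not prescribe it. The data, the function name `permutation_test`, and the number of permutations are all assumptions.

```python
import random
import statistics

def permutation_test(a, b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed difference, estimated p-value).
    """
    rng = random.Random(seed)
    observed = statistics.fmean(a) - statistics.fmean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random, as Ho allows
        diff = statistics.fmean(pooled[:len(a)]) - statistics.fmean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Hypothetical measurements for two groups.
group_a = [48.0, 55.0, 61.0, 52.0, 58.0, 64.0, 50.0, 57.0]
group_b = [39.0, 44.0, 47.0, 41.0, 50.0, 43.0, 38.0, 46.0]

alpha = 0.05  # significance level chosen before looking at the data
observed, p_value = permutation_test(group_a, group_b)
decision = "reject Ho" if p_value < alpha else "do not reject Ho"
print(f"difference={observed:.3f}  p-value={p_value:.4f}  decision: {decision}")
```

Here Ho is "no difference between the groups" and Ha is the two-sided alternative, so the test counts shuffled differences at least as large in absolute value as the observed one.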

12 Null vs. Alternative Hypothesis
Null Hypothesis (Ho)
- Assumed true until found unlikely by the data
- The null state: no difference between groups, no correlation, no trend, etc.
Alternative Hypothesis (Ha)
- Chosen if the null hypothesis is not supported by the data
- Two types: one-sided, two-sided

13 Total Suspended Solids (TSS) in the Cimarron River
Null Hypothesis (Ho): TSS is the same in the Cimarron River and the Arkansas River
Alternative Hypothesis (Ha)
- Two-sided: TSS is different in the two rivers
- One-sided: TSS is higher in the Cimarron River
- One-sided: TSS is lower in the Cimarron River
Reject the null hypothesis when p < alpha

14 Alternative Hypothesis
One- or two-sided?
Two-sided (not equal to)
- Change can be high or low
- "Has TSS changed?"
- "Are values different between the Cimarron River and the Arkansas River?"
One-sided (< or >)
- Change is directional
- "Has TSS increased?"
- "Are values higher in the Cimarron River than in the Arkansas River?"
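The practical difference between the two kinds of alternative shows up in the p-value. The sketch below assumes a test statistic that follows a standard normal distribution under Ho (a z statistic); that choice and the value z = 1.8 are illustrative assumptions, not part of the slides.

```python
from statistics import NormalDist

def p_values(z):
    """Return two-sided and both one-sided p-values for a z statistic."""
    cdf = NormalDist().cdf
    return {
        "two-sided": 2 * (1 - cdf(abs(z))),   # Ha: not equal
        "one-sided (higher)": 1 - cdf(z),     # Ha: greater than
        "one-sided (lower)": cdf(z),          # Ha: less than
    }

alpha = 0.05
p = p_values(1.8)  # hypothetical test statistic
for name, value in p.items():
    verdict = "reject Ho" if value < alpha else "do not reject Ho"
    print(f"{name:20s} p={value:.4f}  {verdict}")
```

With this statistic the "higher" one-sided test rejects Ho at alpha = 0.05 while the two-sided test does not, because the two-sided p-value is twice as large. The direction of a one-sided test should be fixed before the data are examined.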

15 Two-sided Test
[Figure: two-sided test, with outliers (p value) shown in red]

16 One-sided Test
[Figure: one-sided test, with outliers (p value) shown in red]

17 How Do You Decide an Alpha Level?
- Alpha (the significance level) is a management decision
  - What percent error can I live with?
- Set by the scientist, legislation, or management
- Common choice: 5% (alpha = 0.05)
- Alpha is the probability of rejecting the null hypothesis when it is actually true (Type I Error)

18 How Do You Decide an Alpha Level? (continued)
Choosing an alpha level: the strength of evidence needed depends on the study objectives and the cost of rejecting the null hypothesis.
- Small alpha → preponderance of evidence
- Smaller alpha → clear and convincing
- Very small alpha → beyond a reasonable doubt

19 Type I and Type II Errors
Type I Error (α): occurs if one rejects the null hypothesis when it is true
Type II Error (β): occurs if one does NOT reject the null hypothesis when it is false

                    Ho True                     Ho False
Reject Ho           Type I Error (α)            Correct Decision (1 - β)
Do Not Reject Ho    Correct Decision (1 - α)    Type II Error (β)

Source: Elementary Statistics, 6th ed., Allan Bluman, McGraw-Hill
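Both error rates can be estimated by simulation. The sketch below repeatedly runs a two-sided z-test with known σ, once with Ho true and once with Ho false. The test choice, the effect size of 0.5, the sample size of 25, and the function name `rejection_rate` are assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(true_mean, null_mean=0.0, sigma=1.0, n=25,
                   alpha=0.05, trials=4000, seed=0):
    """Fraction of simulated two-sided z-tests (known sigma) that reject Ho: mean == null_mean."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        z = (sample_mean - null_mean) / (sigma / n ** 0.5)
        if abs(z) > critical:
            rejections += 1
    return rejections / trials

type_i_rate = rejection_rate(true_mean=0.0)  # Ho true: estimates alpha
power = rejection_rate(true_mean=0.5)        # Ho false: estimates 1 - beta
type_ii_rate = 1 - power                     # estimates beta
print(f"Type I rate={type_i_rate:.3f}  power={power:.3f}  Type II rate={type_ii_rate:.3f}")
```

When Ho is true the rejection rate should sit close to the chosen alpha. When Ho is false the rejection rate estimates 1 - β, which depends on the effect size and the sample size.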

20 Type III Error
- Correctly rejecting the null hypothesis for the wrong reason.
- Correctly rejecting the null hypothesis, but incorrectly attributing the cause.
- Correctly identifying an effect, but incorrectly attributing the cause of the effect.

21 Reading Assignment
Chapter 4, Hypothesis Testing (pages 97 to 116), in Statistical Methods in Water Resources by D.R. Helsel and R.M. Hirsch

