Published by Aleesha Day. Modified over 9 years ago.
1
Statistics for Water Science
Hypothesis Testing: Fundamental concepts and a survey of methods
Unit 5: Module 17, Lecture 2
2
Developed by: Host Updated: Jan. 21, 2004 U5-m17b-s2
Statistics
A branch of mathematics dealing with the collection, analysis, interpretation and presentation of masses of numerical data:
Descriptive Statistics (Lecture 1): basic description of a variable
Hypothesis Testing (Lecture 2): asks the question, is X different from Y?
Predictions (Lecture 3): what will happen if…
3
Objectives
Introduce the basic concepts and assumptions of significance tests
  Distributions on parade
  Developing hypotheses
  What is “true”?
Survey statistical methods for testing for differences in populations of numbers
  Sample size issues
  Appropriate tests
What we won’t do:
  Elaborate on mathematical underpinnings of tests (take a good stats course for this!)
4
From our last lecture
The mean: a measure of central tendency
The standard deviation: a measure of the ‘spread’ of the data
5
Tales of the normal distribution
Many kinds of data follow this symmetrical, bell-shaped curve, often called a Normal Distribution. Normal distributions have statistical properties that allow us to predict the probability of getting a certain observation by chance.
6
Tales of the normal distribution
When sampling a variable, you are most likely to obtain values close to the mean:
68% within 1 SD
95% within 2 SD
[Figure: normal distribution curve; x-axis ticks 2.0, 1.0, 0, 1.0, 2.0]
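The 68% and 95% figures can be checked numerically. The sketch below is an illustration rather than part of the original lecture; the function name `prob_within` is made up here, and it assumes the variable is exactly normally distributed.

```python
import math


def prob_within(k_sd: float) -> float:
    """Probability that a normally distributed value lies within
    k_sd standard deviations of the mean (either side)."""
    # For a normal variable, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k_sd / math.sqrt(2))


for k in (1, 2):
    print(f"within {k} SD: {prob_within(k):.1%}")
```

Running it prints values close to the rounded 68% and 95% quoted on the slide.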
7
Tales of the normal distribution
Note that a couple of values fall outside the 95% (2 SD) interval. These are improbable.
[Figure: normal distribution curve; x-axis ticks 2.0, 1.0, 0, 1.0, 2.0]
8
Tales of the normal distribution
The essence of hypothesis testing: if an observation falls in one of the tails of a distribution, it would be improbable for a member of that population, so there is reason to suspect that it is not part of that population.
[Figure: normal distribution curve; x-axis ticks 2.0, 1.0, 0, 1.0, 2.0]
9
“Significant Differences”
A difference is considered significant if the probability of getting that difference by random chance is very small.
P value: the probability of getting a difference at least as large as the one observed by chance alone, if there is no real difference
Historically we use p < 0.05
10
The probability of detecting a significant difference is influenced by:
The magnitude of the effect
A big difference is more likely to be significant than a small one
11
The probability of detecting a significant difference is influenced by:
The spread of the data
If the standard deviation is low, it will be easier to detect a significant difference
12
The probability of detecting a significant difference is influenced by:
The number of observations
A large sample is more likely to detect a difference than a small sample
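The three influences listed on these slides (effect size, spread, sample size) can be seen together in a rough power calculation. This is a minimal sketch under stated assumptions, not the lecture's own method: it assumes a one-tailed z-test on a sample mean with a known standard deviation, and the function name `power_one_tailed` and the numbers passed to it are hypothetical.

```python
from statistics import NormalDist


def power_one_tailed(effect: float, sd: float, n: int, alpha: float = 0.05) -> float:
    """Approximate probability of detecting a true increase of size `effect`
    in the mean, using a one-tailed z-test on n observations."""
    std_normal = NormalDist()                # mean 0, standard deviation 1
    z_crit = std_normal.inv_cdf(1 - alpha)   # about 1.645 for alpha = 0.05
    standard_error = sd / n ** 0.5
    # The test statistic is shifted by effect / standard_error when the effect is real.
    return 1 - std_normal.cdf(z_crit - effect / standard_error)


# Hypothetical values, for illustration only.
print(power_one_tailed(effect=0.5, sd=1.0, n=10))   # baseline
print(power_one_tailed(effect=1.0, sd=1.0, n=10))   # bigger effect
print(power_one_tailed(effect=0.5, sd=0.5, n=10))   # less spread
print(power_one_tailed(effect=0.5, sd=1.0, n=40))   # more observations
```

Each of the last three calls should report a higher probability than the baseline, matching the three statements above.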
13
Hypothesis testing
Hypothesis: a statement which can be proven false
Null hypothesis (Ho): “There is no difference”
Alternative hypothesis (HA): “There is a difference…”
In statistical testing, we try to “reject the null hypothesis”
If the null hypothesis is false, it is likely that our alternative hypothesis is true
“False”: there is only a small probability that the results we observed could have occurred by chance if the null hypothesis were true
14
Common probability levels
Alpha level    Chance       Significance          Reject null hypothesis?
P > 0.05                    Not significant       No
P < 0.05       1 in 20      Significant           Yes
P < 0.01       1 in 100     Significant           Yes
P < 0.001      1 in 1000    Highly significant    Yes
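The table of probability levels amounts to a simple lookup. The sketch below encodes it directly; the function names `interpret_p` and `reject_null` are hypothetical, and the labels follow the slide's wording rather than any formal standard.

```python
def interpret_p(p: float) -> str:
    """Label a p-value using the probability levels listed on the slide."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.001:
        return "highly significant"
    if p < 0.05:
        return "significant"
    return "not significant"


def reject_null(p: float, alpha: float = 0.05) -> bool:
    """Reject the null hypothesis when p falls below the chosen alpha level."""
    return p < alpha


for p in (0.20, 0.03, 0.004, 0.0002):
    print(p, interpret_p(p), reject_null(p))
```

The alpha level should be chosen before looking at the data; 0.05 is used as the default here only because the lecture names it as the historical convention.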
15
Types of statistical errors (you could be right, you could be wrong)
               Accept Ho               Reject Ho
Ho is true     Correct decision        Type I error (alpha)
Ho is false    Type II error (beta)    Correct decision
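A small simulation can make the Type I error cell concrete: when Ho really is true, a test run at alpha = 0.05 should still reject it in roughly 1 of 20 trials. This is an illustrative sketch only; the function name `type_i_error_rate`, the sample size, the trial count and the seed are all arbitrary choices, and it assumes a one-tailed z-test with a known standard deviation of 1.

```python
import random
from statistics import NormalDist


def type_i_error_rate(trials: int = 20000, n: int = 10,
                      alpha: float = 0.05, seed: int = 1) -> float:
    """Fraction of trials in which a true null hypothesis is wrongly rejected."""
    rng = random.Random(seed)                 # seeded so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value
    standard_error = 1.0 / n ** 0.5           # known SD of 1
    rejections = 0
    for _ in range(trials):
        # Ho is true by construction: every sample comes from a mean-0 population.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / standard_error
        if z > z_crit:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Type I error rate: {rate:.3f}")
```

The printed rate should land near the chosen alpha, which is what the "Alpha" label in the table refers to.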
16
Examples of type I and type II errors
[Figure: distribution plot with Type I Error and Type II Error regions labeled; x-axis ticks 2.0, 1.0, 0, 1.0, 2.0]
17
Common statistical tests
Question                                                       Test
Does a single observation belong to a population of values?    Z-test
Are two (or more) populations of numbers different?            t-test, F-test (ANOVA)
Is there a relationship between x and y?                       Regression
Is there a trend in the data (special case of the above)?      Regression
18
Does a single observation belong to a population of values: The Z-test
On June 26, 2002, a temperature probe reading at 7 m depth in Medicine Lake was 20.3 °C. Is this unusually high for June?
Note: this is a “one-tailed test”; we just want to know if it’s high
We’re not asking if it is unusually low or high (2-tailed)
19
The Z distribution
The Z distribution is the standard normal distribution: a Normal Distribution with special properties:
Mean = 0
Variance = 1
Z = (observed value - mean) / standard error
Standard error = standard deviation / sqrt(n)
20
Medicine Lake example
Calculate the Z-score for the observed data
Compare the Z-score with the critical value for a one-tailed test (1.645)
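The critical value 1.645 is the point on the standard normal distribution that leaves 5% of the area in the upper tail. As a quick check, it can be computed with Python's standard library; the function name `critical_z` is made up for this sketch.

```python
from statistics import NormalDist


def critical_z(alpha: float = 0.05, tails: int = 1) -> float:
    """Critical Z value for a one- or two-tailed test at the given alpha."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha evenly between the two tails.
    return NormalDist().inv_cdf(1 - alpha / tails)


print(round(critical_z(0.05, tails=1), 3))  # one-tailed
print(round(critical_z(0.05, tails=2), 3))  # two-tailed
```

The two-tailed value is larger because only 2.5% of the area is allowed in each tail.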
21
The Deep Math…
Z = (observed value - mean) / standard error
Standard error = standard deviation / sqrt(n)
Z = (20.3 - 19.7) / 0.08 = 6.89 (the inputs shown are rounded; 0.6 / 0.08 taken literally gives 7.5, and either value far exceeds the critical value)
Since 6.89 > the critical Z value of 1.645, our deep temperature is significantly higher than the June average temperature.
Further exploration shows that a storm the previous day caused the warmer surface waters to mix into the deeper waters.
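The same calculation can be written as a short script. This is a sketch, not the lecture's own worksheet: the function name `z_test_upper` is hypothetical, and the inputs are the rounded values printed on the slide (20.3, 19.7 and a standard error of 0.08), so the computed Z differs somewhat from the 6.89 quoted there.

```python
from statistics import NormalDist


def z_test_upper(observed: float, mean: float, standard_error: float):
    """One-tailed (upper) Z-test: return the Z-score and its p-value."""
    z = (observed - mean) / standard_error
    p = 1 - NormalDist().cdf(z)   # area in the upper tail beyond z
    return z, p


# Rounded values as shown on the slide.
z, p = z_test_upper(observed=20.3, mean=19.7, standard_error=0.08)
print(f"Z = {z:.2f}")
print("significantly high" if z > 1.645 else "not significantly high")
```

Because the question is only whether the reading is unusually high, just the upper tail is used; a two-tailed version would double the tail area.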
22
Are two populations different: The t-test
Also called Student’s t-test. “Student” was the pseudonym of a statistician who worked for the Guinness brewery
Useful for “small” samples (<30)
One of the most basic statistical tests; it can be performed in Excel or any common statistical package
23
Are two populations different: The t-test
24
Are two populations different: The t-test
Same principle as the Z-test: calculate a t value, and assess the probability of getting that value
25
In Excel
Formula: @ttest(Pop1, Pop2, #Tails, TestType)
#Tails: 1 or 2
TestType:
  1 - paired (if there is a logical pairing of XY data)
  2 - equal variance
  3 - unequal variance
The test returns the exact probability value
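For readers working outside Excel, the statistic behind the unequal-variance option (TestType 3, often called Welch's t-test) can be computed by hand. The sketch below uses only the Python standard library and returns the t value and its approximate degrees of freedom; it does not compute the probability, which needs the t distribution from a statistics package. The function name `welch_t` and the temperature readings are hypothetical.

```python
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """t statistic and approximate degrees of freedom for two samples
    whose variances are not assumed to be equal."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / (se2_a + se2_b) ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Made-up temperature readings (degrees C) for two lakes.
lake_1 = [20.1, 19.8, 20.4, 20.0, 19.9]
lake_2 = [18.9, 19.2, 19.0, 19.4, 18.8]
t, df = welch_t(lake_1, lake_2)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The t value is then compared against the t distribution with `df` degrees of freedom, which is the step Excel performs internally before returning the probability.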
26
Example: 1-tailed temperature comparison
@ttest(Pop1, Pop2, 1, 3) = 1.5 × 10^-149
27
ANOVA: Tests of multiple populations
ANOVA: analysis of variance
Compare 2 or more populations
  Surface temperatures for 3 lakes
Can handle single or multiple factors
  One-way ANOVA: comparing lakes
  Two-way ANOVA: compare two factors
    Temperature x Light effects on algal populations
  Repeated measures ANOVA: compare factors over time
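A one-way ANOVA compares the variation between group means with the variation inside each group, and summarizes the comparison as an F ratio. The following is a minimal sketch of that ratio, assuming each group is a plain list of numbers; the function name `one_way_anova_f` and the lake values are invented for illustration, and turning F into a probability still requires the F distribution from a statistics package.

```python
def one_way_anova_f(groups):
    """F ratio for a one-way ANOVA: between-group mean square divided by
    within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Made-up surface temperatures for three lakes.
lakes = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(one_way_anova_f(lakes))
```

A large F means the lakes differ by more than the scatter within each lake would explain; an F near 1 or below suggests no detectable difference.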
28
Next Time: Regression - Finding relationships among variables