1
Data Analysis
2
A Few Necessary Terms
Categorical Variable: Discrete groups, such as Type of Reach (Riffle, Run, Pool)
Continuous Variable: Measurements along a continuum, such as Flow Velocity
What type of variable would “Mottled Sculpin/m²” be?
What type of variable is “Substrate Type”?
What type of variable is “% of bank that is undercut”?
3
A Few Necessary Terms
Explanatory Variable: Independent variable. On the x-axis. The variable you use as a predictor.
Response Variable: Dependent variable. On the y-axis. The variable that is hypothesized to depend on, or be predicted by, the explanatory variable.
4
Statistical Tests: Appropriate Use
For our data, the response variable will always be continuous.
T-test: A categorical explanatory variable with 2 options.
ANOVA: A categorical explanatory variable with >2 options.
Regression: A continuous explanatory variable.
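The rule above can be written as a small decision helper. This is only a sketch: the function name `choose_test` and its parameters are hypothetical, and it assumes a continuous response variable, as stated on the slide.

```python
def choose_test(explanatory_is_categorical: bool, n_groups: int = 0) -> str:
    """Suggest a test for a continuous response variable (hypothetical helper)."""
    if not explanatory_is_categorical:
        # A continuous explanatory variable calls for regression.
        return "regression"
    if n_groups < 2:
        raise ValueError("a categorical explanatory variable needs at least 2 groups")
    # Exactly 2 groups: t-test. More than 2 groups: ANOVA.
    return "t-test" if n_groups == 2 else "ANOVA"


print(choose_test(True, 2))   # two sites
print(choose_test(True, 3))   # Riffle, Run, Pool
print(choose_test(False))     # Flow Velocity as the explanatory variable
```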
5
Statistical Tests
Hypothesis Testing: In statistics, we are always testing a Null Hypothesis (H0) against an Alternative Hypothesis (Ha).
Test Statistic: A single number calculated from the sample (such as t or F) that is used to obtain the p-value.
p-value: The probability of observing our data, or more extreme data, assuming the null hypothesis is correct.
Statistical Significance: We reject the null hypothesis if the p-value is below a set value, usually 0.05.
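One way to see what "our data or more extreme data, assuming the null hypothesis is correct" means is an exact permutation test. Note that this is a different technique from the t-test, ANOVA, and regression covered in these slides; it is used here only because it can be computed by counting. The data values and the function name `permutation_p_value` are made up for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(a, b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = a + b
    total = sum(pooled)
    observed = abs(mean(a) - mean(b))
    as_extreme = 0
    n_splits = 0
    # Under H0 the group labels are arbitrary, so every way of splitting
    # the pooled values into groups of the same sizes is equally likely.
    for idx in combinations(range(len(pooled)), len(a)):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / len(a) - (total - sum_a) / len(b))
        n_splits += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            as_extreme += 1
    return as_extreme / n_splits


# Hypothetical densities at two sites, for illustration only.
site_a = [0.5, 1.0, 1.5]
site_b = [3.5, 4.0, 4.5]
p = permutation_p_value(site_a, site_b)
print(f"p-value = {p:.2f}")
```

With only 3 observations per group there are just 20 possible splits, so the smallest two-sided p-value this test can produce is 2/20 = 0.10. Even perfectly separated groups cannot reach 0.05 at this sample size.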
6
Student’s T-Test
Tests the statistical significance of the difference between means from two independent samples
7
Compares the mean of a continuous response variable between the 2 groups of a categorical variable
[Figure: Mottled Sculpin/m² at Cross Plains vs. Salmo Pond]
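As a rough sketch of what the t-test computes, the pooled-variance (Student's) t statistic can be worked out by hand. The density values below are invented for illustration and are not the class data; `pooled_t_statistic` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's t statistic (equal variances assumed) and degrees of freedom."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical Mottled Sculpin/m2 values, for illustration only.
cross_plains = [2.0, 3.0, 4.0]
salmo_pond = [5.0, 6.0, 7.0]

t, df = pooled_t_statistic(cross_plains, salmo_pond)
print(f"t = {t:.3f}, df = {df}")  # prints: t = -3.674, df = 4
```

For df = 4, the two-sided critical value at 0.05 is 2.776, so these made-up numbers would lead us to reject H0. In practice, a library routine such as `scipy.stats.ttest_ind` returns the p-value directly.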
8
Precautions and Limitations
Meet Assumptions
  Observations come from data with a normal distribution (check with a histogram)
  Samples are independent
  Variances are assumed equal (check with a boxplot)
  No other sample biases
Interpreting the p-value
9
Analysis of Variance (ANOVA)
Tests the statistical significance of the difference between means from two or more independent samples
[Figure: Mottled Sculpin/m² for Riffle, Pool, and Run, with the Grand Mean]
[Link: ANOVA website]
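A minimal sketch of the one-way ANOVA calculation: the F statistic compares how far the group means sit from the grand mean with how much the observations vary within each group. The values are invented for illustration, and `one_way_anova_f` is a hypothetical helper name.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group variation: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical Mottled Sculpin/m2 values by reach type, for illustration only.
riffle = [4.0, 5.0, 6.0]
run = [2.0, 3.0, 4.0]
pool = [1.0, 2.0, 3.0]

f_stat, df_b, df_w = one_way_anova_f([riffle, run, pool])
print(f"F = {f_stat:.2f}, df = ({df_b}, {df_w})")  # prints: F = 7.00, df = (2, 6)
```

The p-value is then read from the F distribution with these degrees of freedom. A large F means the group means differ by more than the within-group scatter would suggest.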
10
Precautions and Limitations
Meet Assumptions
  Observations come from data with a normal distribution
  Samples are independent
  Variances are assumed equal
  No other sample biases
Interpreting the p-value
Pairwise t-tests to follow
11
Simple Linear Regression
What is it? Least squares line
When is it appropriate to use?
Assumptions?
What does the p-value mean? The R-squared value?
How to do it in Excel
12
Simple Linear Regression
Tests the statistical significance of a relationship between two continuous variables, Explanatory and Response
13
Precautions and Limitations
Meet Assumptions
  Observations come from data with a normal distribution
  Samples are independent
  Variances are assumed equal
  Relationship is linear
  No other sample biases
Interpret the p-value and R-squared value.
14
Residual Plots
Residuals are the vertical distances from the observed points to the best-fit line.
For a least-squares line with an intercept, the residuals always sum to zero.
Regression chooses the best-fit line that minimizes the sum of squared residuals. It is called the Least Squares Line.
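The least squares line and its residuals can be computed directly from the data. The sketch below uses invented x and y values and hypothetical helper names. It checks two things: that the residuals sum to zero, and that a slightly different line has a larger sum of squared residuals.

```python
from statistics import mean


def least_squares_line(x, y):
    """Slope and intercept of the least squares line."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def sum_sq_residuals(x, y, slope, intercept):
    """Sum of squared vertical distances from the points to a given line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical explanatory (x) and response (y) values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 2.0, 2.5, 4.0, 5.5]

slope, intercept = least_squares_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print("residuals sum to zero:", abs(sum(residuals)) < 1e-9)

# Any other line gives a larger sum of squared residuals.
best = sum_sq_residuals(x, y, slope, intercept)
other = sum_sq_residuals(x, y, slope + 0.1, intercept)
print("least squares line is better:", best < other)
```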
15
Residuals
16
Residual vs. Fitted Value Plots
[Figure: Observed Values (Points) and Model Values (Line)]
17
Residual Plots Can Help Test Assumptions
[Figure panels: “Normal” Scatter; Curve (linearity); Fan Shape: Unequal Variance]
18
Have we violated any assumptions?
19
R-Squared and P-value
High R-Squared, low p-value (significant relationship)
20
R-Squared and P-value
Low R-Squared, low p-value (significant relationship)
21
R-Squared and P-value
High R-Squared, high p-value (no significant relationship)
22
R-Squared and P-value
Low R-Squared, high p-value (no significant relationship)
23
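P-value indicates whether the relationship between the two variables is statistically significant. It measures the strength of the evidence against the null hypothesis, not the strength of the relationship itself.
R-Squared indicates how much of the variance in the response variable is explained by the explanatory variable. You can think of this as a measure of predictability. If this is low, other variables likely play a role. If this is high, it DOES NOT INDICATE A SIGNIFICANT RELATIONSHIP!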
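A short sketch of why R-squared and significance are separate questions. In simple linear regression, the size of the t statistic for the slope depends on both R-squared and the sample size n: |t| = sqrt(R² (n - 2) / (1 - R²)), with n - 2 degrees of freedom. The data and the two contrasting cases below are invented for illustration, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def r_squared(x, y):
    """Proportion of variance in y explained by the least squares line on x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def slope_t_statistic(r2, n):
    """Absolute t statistic for the slope, with n - 2 degrees of freedom."""
    return sqrt(r2 * (n - 2) / (1 - r2))


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 2.0, 2.5, 4.0, 5.5]
print(f"R-squared = {r_squared(x, y):.3f}")  # prints: R-squared = 0.968

# High R-squared but only 3 points: |t| = 3.00 with df = 1.
print(f"{slope_t_statistic(0.9, 3):.2f}")
# Low R-squared but 30 points: |t| = 2.65 with df = 28.
print(f"{slope_t_statistic(0.2, 30):.2f}")
```

The two-sided 0.05 critical values are 12.706 for df = 1 and 2.048 for df = 28. So the first case (R² = 0.9, n = 3) is not significant, while the second (R² = 0.2, n = 30) is.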