Download presentation
Presentation is loading. Please wait.
Published byKristian McBride Modified over 9 years ago
1
Steps 3 & 4: Analysis & interpretation of evidence
2
2 Hypothesis testing: a simple scenario SCENARIO 1 DO measured upstream & downstream over 9 months — Upstream = 9.3 mg/L — Downstream = 8.4 mg/L Difference significant at P<0.05 SCENARIO 2 DO measured upstream & downstream over 3 months — Upstream = 7.9 mg/L — Downstream = 4.2 mg/L Difference not significant at P<0.05 Which scenario presents a stronger case for DO causing impairment?
3
3 Using statistics responsibly Use caution in interpreting differences Look at magnitude & consistency of differences, rather than statistical significance Statistical significance detects differences exceeding natural variance — Does not detect stressor effects — Does not equal biological significance Can use statistics, but also use your head — Consider relationship between minimum detectable difference (power) & biologically relevant difference
4
4 That said, graphs & statistics can help a lot… STEP 2 List candidate causes STEP 3 Evaluate data from case STEP 4 Evaluate data from elsewhere Scatterplots Boxplots Correlation analysis Conditional probability analysis Regression analysis Predicting environmental condition from biological observations (PECBO) Quantile regression Classification & regression tree Species sensitivity distributions (SSDs)
5
5
6
6
7
7
8
8
9
9 Sampling/data collection issues Want to collect candidate cause & biological response variables at same sites, at same times Want to time sampling to accurately assess exposures — For continuous exposures, try to catch sensitive life stages & conditions that enhance exposure or effects — For episodic exposures, try to characterize episodes through continuous, event, and post-event sampling Want to collect good reference comparisons, in space & time — At unimpaired sites, at same times you sample impaired sites
10
10
11
11 Data analysis tools Species sensitivity distribution (SSD) generator — Tool that calculates & plots proportion of species affected at different levels of exposure in laboratory toxicity tests
12
12 Data analysis tools Species sensitivity distribution (SSD) generator — Tool that calculates & plots proportion of species affected at different levels of exposure in laboratory toxicity tests Command-line R scripts — Powerful and free statistical package [http://www.r-project.org/] — R scripts provided for PECBO (predicting environmental conditions from biological observations) CADStat — Menu-driven package of data visualization & statistical techniques, based on JGR (Java GUI for R)
13
13 R R is a powerful and free statistical package… [http://www.r-project.org/] …but it’s run at the command line, so there’s a steep learning curve
14
14 JGR/CADStat: a gentler introduction to R JGR provides point-and-click tools for basic R functions — Loading data — Setting up a workspace — Managing data in your workspace — Installing packages CADStat adds several tools to JGR — Point-and-click interfaces for tools useful in causal analysis — Rendering of analysis results in browser window — Additional help files – Scatterplots – Boxplots – Correlation analysis – Linear regression – Quantile regression – Conditional probability analysis – PECBO
15
15 Load your data into CADStat, or use example data…
16
16
17
17 Where are we? Step 1 — Biological effects of interest defined — Impaired & reference sites identified Step 2 — Map completed — Conceptual model developed — Candidate causes identified — Data identified & collected Now we’re ready to analyze available data & use results as evidence in Steps 3 & 4
18
18 Begin with data exploration Want to examine relationships between different variables — Linear & more complex — Expected & unexpected 1 st step is visualizing data — Scatterplots & boxplots — No statistics, hypotheses, or p-values
19
19 Looking at your data… examples from exercise Which candidate cause do the data relate to? Do the data support or weaken the case for the candidate cause? What type of evidence do the data represent? PC1 – referencePC2 – impaired Dissolved oxygen (mg/L)7.37.9 Canopy covermoderatelow In Fall 2004, an unpermitted industrial discharge with high metal concentrations was discovered and removed, just upstream of PC2. EPT taxa richness over time 2004200520062007 PC118161516 PC2891315
20
20 Boxplots Depict distribution of observations – Center box indicates spread of 50% of observations – Whiskers indicate either max & min values, or 1.5X the interquartile range Good data exploration & visualization tool Can be used to evaluate data from case & data from elsewhere – Spatial/temporal co-occurrence – Stressor-response relationships from the field or from other field studies – Causal pathway
21
21 Scatterplots Graphical displays of matched data – Dependent variable (y-axis) vs. independent variable (x-axis) – Often dependent variable is biological response, independent variable is measure of candidate cause Conductivity (uS/cm) Good data exploration & visualization tool – Can view several at once in scatterplot matrix Can be used to evaluate data from case & from elsewhere – Stressor-response relationships from the field, from other field studies, & from laboratory studies – Causal pathway
22
22 Examples from exercise: scatterplots Do the data support or weaken the case for temperature as a candidate cause? What type(s) of evidence do these data provide? Maximum summer water temperature (˚C) EPT taxa richness PC1 PC2
23
23 Examples from exercise: scatterplots Do the data support or weaken the cases for copper & zinc as candidate causes? What type(s) of evidence do these data provide? Log [Zn], ug/L EPT taxa richness Log [Cu], ug/L
24
24 Method for measuring degree of association between 2 variables in a matched data set Elevation Water Temp Air Temp Log Conductivity Elevation1-0.75-0.25-0.02 Water Temp-0.7510.640.07 Air Temp-0.250.6410.39 Log Conductivity -0.020.070.391 Correlation analysis – Correlation coefficient provides quantitative measure of degree to which 2 variables co-vary Good data exploration tool Can be used to evaluate data from case – Stressor-response relationships from the field – Causal pathway
25
25 Method for quantifying relationship between dependent & independent variable – Can have ≥ 1 independent variables Regression analysis Models can be used to: – Predict value of response variable for new values of explanatory variable – Estimate value of explanatory variable needed to account for change in response variable – Model natural variation in stressors, to define reference expectations 90 th percentile prediction limits
26
26 Quantile regression Models relationship between specified quantile of response variable and explanatory variable (stressor) – 50 th quantile gives median line, under which 50% of observed responses occur – 90 th quantile gives line under which 90% of observed responses occur Provides means of estimating location of upper boundary on scatterplot Used to evaluate stressor-response relationships from other field studies (Step 4: Evaluating data from elsewhere)
27
27 Applying quantile regression SUPPORTS WEAKENS
28
28 Uses taxa list & niche characteristics to predict environmental conditions at site Predicting environmental conditions from biological observations (PECBO) Ameletus Diphetor Predicted temperature if both Ameletus & Diphetor are present at site Maximum likelihood inference based on taxon- environment relationships Useful when environmental data not available
29
29 Using PECBO to verify predictions R 2 = 0.67 Biologically-based predictions can provide valuable supporting evidence, under “Verified predictions” — If stressor elevated at site, can test hypothesis that biologically-based prediction should also be elevated Tools for calculating taxon- environmental relationships & PECBO available on CADDIS
30
30 Species sensitivity distributions (SSDs) An SSD is a statistical distribution describing the variation of responses of a set of species to a stressor; it can be used to estimate the proportion of species adversely affected at a given stressor intensity. SSD plot showing LC 50 s for arthropods exposed to dissolved cadmium
31
31 Now where are we? Step 1 — Biological effects of interest defined — Impaired & reference sites identified Now we’re ready to evaluate & score the evidence across candidate causes Step 2 — Map completed — Conceptual model developed — Candidate causes identified — Data identified & collected Steps 3 & 4 — Available data examined & analyzed — Available data organized into types of evidence
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.