Steps 3 & 4: Analysis & interpretation of evidence.

Presentation transcript:

Steps 3 & 4: Analysis & interpretation of evidence

2 Hypothesis testing: a simple scenario
SCENARIO 1: DO measured upstream & downstream over 9 months
- Upstream = 9.3 mg/L
- Downstream = 8.4 mg/L
- Difference significant at P<0.05
SCENARIO 2: DO measured upstream & downstream over 3 months
- Upstream = 7.9 mg/L
- Downstream = 4.2 mg/L
- Difference not significant at P<0.05
Which scenario presents a stronger case for DO causing impairment?

3 Using statistics responsibly
- Use caution in interpreting differences
- Look at magnitude & consistency of differences, rather than statistical significance
- Statistical significance detects differences exceeding natural variance
  - Does not detect stressor effects
  - Does not equal biological significance
- Can use statistics, but also use your head
  - Consider relationship between minimum detectable difference (power) & biologically relevant difference
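The relationship between minimum detectable difference and biologically relevant difference can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a normal approximation for a two-sided, two-sample comparison with equal sample sizes and a common standard deviation, and the standard deviation, sample sizes, and "relevant" difference are hypothetical values, not data from the case.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_difference(sd, n_per_group, alpha=0.05, power=0.80):
    """Approximate smallest true difference in means that a two-sided,
    two-sample comparison is likely to detect.

    Assumes a normal approximation, equal group sizes, and a common
    standard deviation (all simplifications made for illustration).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return (z_alpha + z_power) * sd * sqrt(2 / n_per_group)


# Hypothetical values: DO standard deviation (mg/L) and the difference
# judged to matter biologically (mg/L).
sd_do = 1.5
relevant_difference = 2.0

for n in (3, 9, 30):
    mdd = minimum_detectable_difference(sd_do, n)
    verdict = "can" if mdd <= relevant_difference else "cannot"
    print(f"n = {n:2d} per site: MDD = {mdd:.2f} mg/L, "
          f"so the design {verdict} resolve a {relevant_difference} mg/L difference")
```

When the minimum detectable difference is larger than the biologically relevant difference, a "not significant" result says little about whether the stressor matters, which is the situation in Scenario 2.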

4 That said, graphs & statistics can help a lot…
Steps:
- STEP 2 List candidate causes
- STEP 3 Evaluate data from case
- STEP 4 Evaluate data from elsewhere
Tools:
- Scatterplots
- Boxplots
- Correlation analysis
- Conditional probability analysis
- Regression analysis
- Predicting environmental condition from biological observations (PECBO)
- Quantile regression
- Classification & regression tree
- Species sensitivity distributions (SSDs)

9 Sampling/data collection issues
- Want to collect candidate cause & biological response variables at same sites, at same times
- Want to time sampling to accurately assess exposures
  - For continuous exposures, try to catch sensitive life stages & conditions that enhance exposure or effects
  - For episodic exposures, try to characterize episodes through continuous, event, and post-event sampling
- Want to collect good reference comparisons, in space & time
  - At unimpaired sites, at same times you sample impaired sites

12 Data analysis tools
- Species sensitivity distribution (SSD) generator
  - Tool that calculates & plots proportion of species affected at different levels of exposure in laboratory toxicity tests
- Command-line R scripts
  - R is a powerful and free statistical package
  - R scripts provided for PECBO (predicting environmental conditions from biological observations)
- CADStat
  - Menu-driven package of data visualization & statistical techniques, based on JGR (Java GUI for R)

13 R
- R is a powerful and free statistical package…
- …but it’s run at the command line, so there’s a steep learning curve

14 JGR/CADStat: a gentler introduction to R
- JGR provides point-and-click tools for basic R functions
  - Loading data
  - Setting up a workspace
  - Managing data in your workspace
  - Installing packages
- CADStat adds several tools to JGR
  - Point-and-click interfaces for tools useful in causal analysis
  - Rendering of analysis results in browser window
  - Additional help files
- CADStat tools:
  - Scatterplots
  - Boxplots
  - Correlation analysis
  - Linear regression
  - Quantile regression
  - Conditional probability analysis
  - PECBO

15 Load your data into CADStat, or use example data…

17 Where are we?
- Step 1
  - Biological effects of interest defined
  - Impaired & reference sites identified
- Step 2
  - Map completed
  - Conceptual model developed
  - Candidate causes identified
  - Data identified & collected
Now we’re ready to analyze available data & use results as evidence in Steps 3 & 4

18 Begin with data exploration
- Want to examine relationships between different variables
  - Linear & more complex
  - Expected & unexpected
- 1st step is visualizing data
  - Scatterplots & boxplots
  - No statistics, hypotheses, or p-values

19 Looking at your data… examples from exercise
- Which candidate cause do the data relate to?
- Do the data support or weaken the case for the candidate cause?
- What type of evidence do the data represent?
                          PC1 (reference)        PC2 (impaired)
Dissolved oxygen (mg/L)   [value not recovered]  [value not recovered]
Canopy cover              moderate               low
In Fall 2004, an unpermitted industrial discharge with high metal concentrations was discovered and removed, just upstream of PC2.
[Figure: EPT taxa richness over time at the PC sites]

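The quantities a boxplot draws can be computed directly, as in the sketch below. It assumes the 1.5X interquartile range convention, with each whisker drawn to the most extreme observation inside the fence, and an "inclusive" quartile rule; plotting packages differ on both choices. The EPT richness values are hypothetical.

```python
from statistics import quantiles


def boxplot_stats(values, whisker=1.5):
    """Return the box, whisker, and outlier values for one boxplot.

    Assumes whiskers extend to the most extreme observations within
    `whisker` x IQR of the box; other conventions exist.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical EPT taxa richness at nine sampling events.
ept_richness = [4, 9, 10, 11, 12, 12, 13, 14, 15]
stats = boxplot_stats(ept_richness)
for name, value in stats.items():
    print(f"{name}: {value}")
```

The center box spans q1 to q3, so it covers the middle 50% of observations; anything listed under "outliers" would be plotted as an individual point beyond the whiskers.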

21 Scatterplots
- Graphical displays of matched data
  - Dependent variable (y-axis) vs. independent variable (x-axis)
  - Often dependent variable is biological response, independent variable is measure of candidate cause
- Good data exploration & visualization tool
  - Can view several at once in scatterplot matrix
- Can be used to evaluate data from case & from elsewhere
  - Stressor-response relationships from the field, from other field studies, & from laboratory studies
  - Causal pathway
[Figure: scatterplot; axis label: Conductivity (uS/cm)]

22 Examples from exercise: scatterplots
- Do the data support or weaken the case for temperature as a candidate cause?
- What type(s) of evidence do these data provide?
[Figure: scatterplot of EPT taxa richness and maximum summer water temperature (°C), with PC1 and PC2 marked]

23 Examples from exercise: scatterplots
- Do the data support or weaken the cases for copper & zinc as candidate causes?
- What type(s) of evidence do these data provide?
[Figures: scatterplots of EPT taxa richness against Log [Zn], ug/L and Log [Cu], ug/L]

24 Correlation analysis
- Method for measuring degree of association between 2 variables in a matched data set
  - Correlation coefficient provides quantitative measure of degree to which 2 variables co-vary
- Good data exploration tool
- Can be used to evaluate data from case
  - Stressor-response relationships from the field
  - Causal pathway
[Figure: pairwise matrix of Elevation, Water Temp, Air Temp, Log Conductivity]
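As a minimal sketch of what a correlation coefficient measures, the function below computes Pearson's r for one pair of matched variables. Pearson's r is an assumption here, since the slide does not say which coefficient is used, and the elevation and water temperature values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two matched sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be matched and contain at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical matched observations from six sites.
elevation_m = [1200, 1350, 1500, 1650, 1800, 1950]
water_temp_c = [18.2, 17.1, 15.8, 15.0, 13.6, 12.9]

r = pearson_r(elevation_m, water_temp_c)
print(f"r = {r:.3f}")
```

Values near +1 or -1 indicate that the two variables co-vary strongly; a value near 0 indicates little linear association. A strong correlation between two candidate causes is also a warning that their effects may be hard to separate.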

25 Regression analysis
- Method for quantifying relationship between dependent & independent variable
  - Can have ≥ 1 independent variables
- Models can be used to:
  - Predict value of response variable for new values of explanatory variable
  - Estimate value of explanatory variable needed to account for change in response variable
  - Model natural variation in stressors, to define reference expectations
[Figure label: 90th percentile prediction limits]
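The first two uses of a regression model can be sketched with ordinary least squares on a single explanatory variable. This is an illustration, not the CADStat implementation; the function names and the stressor-response values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be matched and contain at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must vary to fit a slope")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_response(slope, intercept, x_new):
    """Predicted response for a new value of the explanatory variable."""
    return intercept + slope * x_new


def stressor_for_response(slope, intercept, y_target):
    """Explanatory-variable value at which the fitted line reaches y_target."""
    return (y_target - intercept) / slope


# Hypothetical data: log10 zinc concentration vs. EPT taxa richness.
log_zn = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ept_richness = [21, 19, 15, 14, 10, 8]

slope, intercept = fit_line(log_zn, ept_richness)
print(f"EPT richness = {intercept:.1f} + ({slope:.1f}) * log10[Zn]")
print(f"Predicted richness at log10[Zn] = 1.8: "
      f"{predict_response(slope, intercept, 1.8):.1f}")
print(f"log10[Zn] associated with richness of 12: "
      f"{stressor_for_response(slope, intercept, 12):.2f}")
```

A fitted line describes association in the data it was fit to; whether that association counts as evidence for a candidate cause still depends on how the sites were chosen and what else varies with the stressor.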

26 Quantile regression
- Models relationship between specified quantile of response variable and explanatory variable (stressor)
  - 50th quantile gives median line, under which 50% of observed responses occur
  - 90th quantile gives line under which 90% of observed responses occur
- Provides means of estimating location of upper boundary on scatterplot
- Used to evaluate stressor-response relationships from other field studies (Step 4: Evaluating data from elsewhere)
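To make the idea concrete, the sketch below fits a quantile line by brute force. It relies on the fact that, for a single explanatory variable, a line minimizing the quantile ("pinball") loss can always be found among the lines passing through two observations. This is a small-data illustration only; real analyses would use a dedicated routine, and the data values and function names are hypothetical.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Asymmetric absolute loss minimized by the tau-th quantile."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def quantile_line(x, y, tau):
    """Brute-force quantile regression line for one explanatory variable.

    Tries every line through two observations and keeps the one with the
    smallest total pinball loss. Cost grows as n cubed, so this is only
    suitable for small illustrative data sets.
    """
    points = list(zip(x, y))
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line is not a valid fit
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(pinball_loss(yi - (intercept + slope * xi), tau)
                   for xi, yi in points)
        if best is None or loss < best[0]:
            best = (loss, slope, intercept)
    if best is None:
        raise ValueError("need at least two observations with different x")
    return best[1], best[2]


# Hypothetical wedge-shaped stressor-response data.
stressor = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ept_richness = [22, 14, 20, 9, 17, 12, 5, 11, 3, 6]

slope_90, intercept_90 = quantile_line(stressor, ept_richness, 0.9)
below = sum(1 for xi, yi in zip(stressor, ept_richness)
            if yi <= intercept_90 + slope_90 * xi + 1e-9)
print(f"90th quantile line: richness = {intercept_90:.2f} + ({slope_90:.2f}) * stressor")
print(f"{below} of {len(stressor)} observations lie on or below the line")
```

Comparing an observation from the case against such an upper-boundary line from other field studies is one way to judge whether the observed response is plausible at the observed stressor level.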

27 Applying quantile regression
[Figure: quantile regression examples, with panels labeled SUPPORTS and WEAKENS]

28 Predicting environmental conditions from biological observations (PECBO)
- Uses taxa list & niche characteristics to predict environmental conditions at site
- Maximum likelihood inference based on taxon-environment relationships
- Useful when environmental data not available
[Figure: taxon-environment curves for Ameletus and Diphetor, showing predicted temperature if both Ameletus & Diphetor are present at site]
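The maximum likelihood step can be sketched as follows. This is not the CADDIS R implementation: the bell-shaped taxon-environment curves, their optimum, tolerance, and peak-probability parameters, and the simple grid search are all hypothetical choices made to show the logic.

```python
from math import exp, log

# Hypothetical taxon-environment relationships:
# taxon -> (temperature optimum in deg C, tolerance in deg C, peak probability)
TAXON_CURVES = {
    "Ameletus": (12.0, 3.0, 0.8),
    "Diphetor": (18.0, 3.0, 0.8),
}


def presence_probability(taxon, temperature):
    """Probability of collecting a taxon at a given temperature
    (assumed bell-shaped response curve)."""
    optimum, tolerance, p_max = TAXON_CURVES[taxon]
    return p_max * exp(-((temperature - optimum) ** 2) / (2 * tolerance ** 2))


def log_likelihood(temperature, observed):
    """Log-likelihood of the observed presences/absences at a temperature."""
    total = 0.0
    for taxon, present in observed.items():
        p = presence_probability(taxon, temperature)
        total += log(p) if present else log(1 - p)
    return total


def predict_temperature(observed, low=0.0, high=30.0, step=0.1):
    """Grid-search the temperature that makes the observed taxa most likely."""
    n_steps = int(round((high - low) / step))
    grid = [low + i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda t: log_likelihood(t, observed))


both_present = {"Ameletus": True, "Diphetor": True}
only_ameletus = {"Ameletus": True, "Diphetor": False}
print(f"Both present: {predict_temperature(both_present):.1f} deg C")
print(f"Only Ameletus present: {predict_temperature(only_ameletus):.1f} deg C")
```

With both taxa present, the most likely temperature falls where the two curves overlap; recording a taxon as absent pulls the prediction away from that taxon's optimum.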

29 Using PECBO to verify predictions
- Biologically-based predictions can provide valuable supporting evidence, under “Verified predictions”
  - If stressor elevated at site, can test hypothesis that biologically-based prediction should also be elevated
- Tools for calculating taxon-environment relationships & PECBO available on CADDIS
[Figure annotation: R² = 0.67]

30 Species sensitivity distributions (SSDs)
- An SSD is a statistical distribution describing the variation of responses of a set of species to a stressor; it can be used to estimate the proportion of species adversely affected at a given stressor intensity.
[Figure: SSD plot showing LC50s for arthropods exposed to dissolved cadmium]
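One common way to build an SSD is to fit a log-normal distribution to species' LC50s, as sketched below. The log-normal form is an assumption (the slide does not say which distribution the SSD generator fits), and the LC50 values are hypothetical, not the cadmium data in the figure.

```python
from math import log10
from statistics import NormalDist, mean, stdev


def fit_log_normal_ssd(lc50_values):
    """Fit a normal distribution to log10(LC50) across species."""
    logs = [log10(v) for v in lc50_values]
    return NormalDist(mu=mean(logs), sigma=stdev(logs))


def proportion_affected(ssd, concentration):
    """Estimated proportion of species whose LC50 is at or below
    the given concentration."""
    return ssd.cdf(log10(concentration))


def hazardous_concentration(ssd, proportion=0.05):
    """Concentration at which the given proportion of species is affected
    (proportion=0.05 gives the value often called HC5)."""
    return 10 ** ssd.inv_cdf(proportion)


# Hypothetical LC50s (ug/L) for seven species.
lc50_ug_per_l = [3.0, 8.0, 15.0, 40.0, 90.0, 250.0, 600.0]

ssd = fit_log_normal_ssd(lc50_ug_per_l)
print(f"Proportion affected at 20 ug/L: {proportion_affected(ssd, 20.0):.2f}")
print(f"HC5: {hazardous_concentration(ssd):.2f} ug/L")
```

Reading the curve in one direction gives the proportion of species affected at a site's measured concentration; reading it in the other gives the concentration expected to affect a chosen proportion of species.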

31 Now where are we?
- Step 1
  - Biological effects of interest defined
  - Impaired & reference sites identified
- Step 2
  - Map completed
  - Conceptual model developed
  - Candidate causes identified
  - Data identified & collected
- Steps 3 & 4
  - Available data examined & analyzed
  - Available data organized into types of evidence
Now we’re ready to evaluate & score the evidence across candidate causes