Developing O/E (Observed-to-Expected) Models for Assessing Biological Condition Chuck Hawkins Western Center for Monitoring and Assessment of Freshwater.

Developing O/E (Observed-to-Expected) Models for Assessing Biological Condition
Chuck Hawkins, Western Center for Monitoring and Assessment of Freshwater Ecosystems, Utah State University
11 May 2006. National Water Quality Monitoring Council, 5th National Monitoring Conference, San Jose, California

Content of Short Course
– O/E as a concept.
– E: simple idea, not so easy to estimate.
– Sampling, probabilities of capture, and E.
– E as a function of taxon-specific probabilities of capture.
– Predicting E: it only hurts for a while!
– Errors, inferences, and two types of assessments.
– O/E and the WSA – understanding the numbers.

What is O/E? O/E is a measure of the taxonomic completeness of the biological community observed at a site. Example: E = 8 taxa, O = 3 taxa, so O/E = 3/8 = 0.38.

O/E Allows Comparison of “Apples” and “Oranges”

O/E is a Site-Specific, Standardized Measure of Biodiversity Loss. Example: a site with O = 7 and E = 10 and a site with O = 21 and E = 30 both have O/E = 0.70.
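A minimal sketch of why the ratio standardizes across sites, using the two O and E pairs from the slide. The function name and the site labels are illustrative assumptions, not part of the course material.

```python
def o_over_e(observed, expected):
    """Return O/E, the fraction of the expected taxa that were observed."""
    if expected <= 0:
        raise ValueError("E must be positive")
    return observed / expected

# Site labels are hypothetical; the O and E values are the ones on the slide.
sites = {"naturally species-poor site": (7, 10),
         "naturally species-rich site": (21, 30)}

for name, (o, e) in sites.items():
    print(f"{name}: O = {o}, E = {e}, O/E = {o_over_e(o, e):.2f}")
```

Both sites have lost the same share of their expected taxa, even though the raw taxa counts differ threefold.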

E: simple idea, not so easy to estimate. The task is to describe, accurately and precisely, the biota expected in different waterbodies.

Hypothetical Variation in Probabilities of Capture of Several Taxa Along One Natural Environmental Gradient. [Figure: probability of capture plotted against temperature for several taxa.] The challenge is compounded because taxa pc’s are simultaneously controlled by several natural factors.

A Segue into the Messy Issue of Probabilities. Sampling means uncertainty! Sampling error, probabilities of capture, and E.

[Figure: number of taxa plotted against sampling effort, with a lab sub-sample, a field sample, and a complete census marked.] The actual composition and number of taxa in any given sample will have a random component.

E as a function of taxon-specific probabilities of capture. Although E = 8 taxa is a true statement, the picture of distinct composition is misleading. The real composition associated with E is actually a bit fuzzy. [Figure legend: PC = 1, PC = 0.8, PC = 0.5, PC = 0.2.]

Sampling and Probabilities of Capture

Taxon      Replicate samples with taxon (of 10)   Freq (pc)
Baetis     10                                     1.0
Perla       8                                     0.8
Corixa      5                                     0.5
Drunella    5                                     0.5
Epeorus     1                                     0.1

Species count in replicate samples 1 to 10: 3, 3, 3, 2, 4, 3, 2, 2, 4, 3 (mean 2.9).
E = ∑ pc = mean number of taxa per sample = 2.9.
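The identity E = ∑ pc can be checked with a short sketch. The presence/absence pattern below is an assumption: the transcript keeps only each taxon's frequency and the species count per replicate, so the arrangement was chosen to match those totals and is not the slide's actual layout.

```python
# Presence (1) or absence (0) of each taxon in 10 replicate samples.
# ASSUMPTION: which replicates held which taxa is not recoverable; this
# arrangement merely reproduces the slide's row and column totals.
replicates = {
    "Baetis":   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Perla":    [1, 1, 1, 0, 1, 1, 0, 1, 1, 1],
    "Corixa":   [1, 1, 1, 1, 0, 0, 0, 0, 1, 0],
    "Drunella": [0, 0, 0, 0, 1, 1, 1, 0, 1, 1],
    "Epeorus":  [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
}
n_samples = 10

# Probability of capture = fraction of replicates containing the taxon.
pc = {taxon: sum(hits) / n_samples for taxon, hits in replicates.items()}

# E is the sum of the probabilities of capture.
expected = sum(pc.values())

# The same number falls out as the mean species count per replicate.
species_counts = [sum(column) for column in zip(*replicates.values())]
mean_count = sum(species_counts) / n_samples

print(pc)
print(f"E = {expected:.1f}, mean taxa per sample = {mean_count:.1f}")
```

No single replicate contains 2.9 taxa; E is an average over samples whose composition varies.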

Calculating O/E is simple… if we can estimate probabilities of capture.

How O/E is Calculated: the sum of taxa pc’s estimates the number of taxa (E) that should be observed given standard sampling.

Taxon      pc     O
Atherix    0.92   ●
Baetis     0.86   ●
Caenis     0.70
Drunella   0.63
Epeorus    0.51   ●
Farula     0.32
Gyrinus    0.07
Hyalella   0.00
Total      E = 4.01   O = 3

O/E = 3 / 4.01 = 0.75
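The arithmetic on this slide, as a short sketch. The pc values and observed taxa are the slide's; the variable names are illustrative.

```python
# Predicted probabilities of capture for the assessed site (from the slide).
pc = {
    "Atherix": 0.92, "Baetis": 0.86, "Caenis": 0.70, "Drunella": 0.63,
    "Epeorus": 0.51, "Farula": 0.32, "Gyrinus": 0.07, "Hyalella": 0.00,
}
# Taxa actually collected at the site (marked with a dot on the slide).
observed = {"Atherix", "Baetis", "Epeorus"}

E = sum(pc.values())                        # expected number of taxa
O = sum(1 for taxon in pc if taxon in observed)  # predicted taxa that were found
print(f"E = {E:.2f}, O = {O}, O/E = {O / E:.2f}")
```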

Predicting E: it only hurts for a while! Two basic approaches:
– Model many individual species (logistic regression models) and then combine the many predictions.
– Model a few assemblage types and then ‘back out’ probabilities of capture for individual species.
We do the latter.

Yes, explaining how E is predicted can be a little complicated. “In layman’s terms? I’m afraid I don’t know any layman’s terms.” [Cartoon: a mock physics-style formula for E.]

The basic approach to estimating pc’s from predictions of assemblage type was worked out several years ago. It is empirical modeling that derives predictions of the probability of capturing a species at a new location from observations at ‘reference’ sites.
Moss, D., M. T. Furse, J. F. Wright, and P. D. Armitage. 1987. The prediction of the macro-invertebrate fauna of unpolluted running-water sites in Great Britain using environmental data. Freshwater Biology 17:41-52.
A primer is on our web page: www.cnr.usu.edu/wmc

Three Major Steps in Estimating E
1. Classify reference sites based on their biological similarity.
2. Predict the class of a new site from environmental attributes with a discriminant functions model.
3. Weight frequencies of occurrence of taxa within classes by the site’s probabilities of class membership to estimate pc’s and then E.

Classifying Reference Sites (sites within classes are seldom spatially clustered). [Figure: classes A to H labeled by environmental features: cool water, high elevation vs. warm water, low elevation; limestone vs. granite watershed; spring vs. stream; small, medium, or big stream size.] Environmental features associated with biologically defined classes.

Frequencies of occurrence of taxa in three reference classes (slide labels: A, B, C; Small, Big, Medium; Granite), one class per line:
Hydropsyche 100%, Caenis 95%, Baetis 90%, Tricorythodes 80%, Drunella grandis 70%
Baetis 100%, Drunella grandis 85%, Arctopsyche 80%, Neophylax 75%, Optioservus 70%
Baetis 95%, Epeorus 90%, Simulium 90%, Arctopsyche 75%, Zapada 70%
We could use these numbers to estimate E at a new site belonging to one of these stream ‘types’, but… what if the site is ‘medium-big’, etc.?

Discriminant Functions Models Classify New Sites in Terms of Their Probabilities of Class Membership. [Diagram: biologically defined reference classes (Class A, Class B, Class C, Class D) and reference site predictor variables (catchment area, geology, latitude, longitude, elevation, etc.) feed a discriminant analysis, which yields the discriminant model.]
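A rough sketch of how predictor values could be turned into probabilities of class membership. This is a simplified stand-in, not the model used in the course: it assumes Gaussian classes with a pooled diagonal covariance and priors proportional to class size, whereas a full discriminant functions model uses the complete pooled covariance matrix. The reference sites, predictors, and function name are hypothetical.

```python
import math

def class_membership_probabilities(reference, new_site):
    """Posterior probability that new_site belongs to each reference class.

    reference: {class name: list of predictor tuples for its reference sites}
    ASSUMPTION: pooled diagonal covariance, priors proportional to class size.
    """
    n_vars = len(new_site)
    n_total = sum(len(sites) for sites in reference.values())
    means = {
        k: [sum(s[j] for s in sites) / len(sites) for j in range(n_vars)]
        for k, sites in reference.items()
    }
    # Pooled within-class variance of each predictor.
    pooled_var = []
    for j in range(n_vars):
        ss = sum((s[j] - means[k][j]) ** 2
                 for k, sites in reference.items() for s in sites)
        pooled_var.append(ss / (n_total - len(reference)))
    # Log prior plus log Gaussian density, dropping terms common to all classes.
    scores = {}
    for k, sites in reference.items():
        dist2 = sum((new_site[j] - means[k][j]) ** 2 / pooled_var[j]
                    for j in range(n_vars))
        scores[k] = math.log(len(sites) / n_total) - 0.5 * dist2
    # Normalize to probabilities (subtract the max for numerical stability).
    top = max(scores.values())
    weights = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Hypothetical reference sites: (log10 catchment area in km2, elevation in km).
reference = {
    "A": [(1.0, 2.1), (1.2, 2.4), (0.9, 2.2)],
    "B": [(2.0, 1.2), (2.3, 1.0), (1.9, 1.3)],
    "C": [(3.0, 0.3), (3.2, 0.5), (2.9, 0.4)],
}
probs = class_membership_probabilities(reference, new_site=(1.5, 1.8))
print({k: round(p, 4) for k, p in probs.items()})
```

The output is a probability for every class rather than a single assigned class, which is what handles a site that falls between stream ‘types’.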

By Weighting Taxon Frequencies of Occurrence within a Class by the Probabilities of Class Membership, We Can Estimate Individual Taxon Probabilities of Capture. The discriminant model converts a site’s predictor variable values into probabilities of class membership.

Class   Probability of Class Membership   Frequency of Taxon in Class   Class Contribution to PC
A       0.5                               0.6                           0.30
B       0.4                               0.2                           0.08
C       0.1                               0.0                           0.00
D       0.0                               0.0                           0.00

Probability of Taxon Being in a Sample if the Site is in Reference Condition = 0.38
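The weighting step for one taxon, sketched with the slide's numbers: pc is the sum over classes of (probability of class membership) × (frequency of the taxon in that class). Variable names are illustrative.

```python
# Probabilities that the new site belongs to each reference class (from the slide).
class_prob = {"A": 0.5, "B": 0.4, "C": 0.1, "D": 0.0}
# Fraction of reference sites in each class where the taxon was collected.
taxon_freq = {"A": 0.6, "B": 0.2, "C": 0.0, "D": 0.0}

# Each class contributes its taxon frequency, weighted by membership probability.
contribution = {k: class_prob[k] * taxon_freq[k] for k in class_prob}
pc = sum(contribution.values())

for k, c in contribution.items():
    print(f"Class {k}: {class_prob[k]:.1f} x {taxon_freq[k]:.1f} = {c:.2f}")
print(f"pc = {pc:.2f}")
```

Repeating this for every taxon found at reference sites and summing the resulting pc’s gives E for the site.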

The model estimates the pc’s of every taxon (i.e., those observed in at least one reference site sample) at every assessed site, not just a few as shown here for illustration. Also, if a taxon is predicted to have a pc of zero, it does not contribute to O!

Taxon      pc     O
Atherix    0.70   *
Baetis     0.92   *
Caenis     0.86
Drunella   0.63
Epeorus    0.51   *
Farula     0.38
Gyrinus    0.07
Hyalella   0.00   * (observed, but not counted in O)
Total      E = 4.07   O = 3

O/E = 3 / 4.07 = 0.74
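A sketch of the counting rule in this example: an observed taxon with a predicted pc of zero is left out of O. The function name is hypothetical; the pc values and observations are the slide's.

```python
def score_site(pc, observed_taxa):
    """Return (O, E, O/E), counting only observed taxa with predicted pc > 0."""
    expected = sum(pc.values())
    counted = [t for t in observed_taxa if pc.get(t, 0.0) > 0.0]
    return len(counted), expected, len(counted) / expected

pc = {
    "Atherix": 0.70, "Baetis": 0.92, "Caenis": 0.86, "Drunella": 0.63,
    "Epeorus": 0.51, "Farula": 0.38, "Gyrinus": 0.07, "Hyalella": 0.00,
}
# Hyalella was collected, but the model predicts pc = 0 for it at this site.
observed_taxa = ["Atherix", "Baetis", "Epeorus", "Hyalella"]

O, E, oe = score_site(pc, observed_taxa)
print(f"O = {O}, E = {E:.2f}, O/E = {oe:.2f}")
```

Four taxa were collected, but O is 3 because Hyalella was not expected at a site like this one.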

Errors, inferences, and two types of assessments. Model error. Inferring site condition. Inferring regional conditions.

Need to Estimate Prediction Error for Site Assessments. Is a site with O/E = 0.8 impaired? [Figure labels: E, O, O/E, 1.]

Statistical Issues Regarding Inferences of Impairment (Single Samples). Statistical hypothesis: is the observed O/E value for a single sample from the same distribution of values estimated for reference sites? That is, the site is either equivalent to reference or not. We should ideally set a threshold value to balance Type I and II errors. It is easy to set the Type I error rate, but Type II errors are problematic. The 10th and 90th percentiles of reference site values have been used.
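A minimal sketch of the percentile thresholds mentioned above. The reference O/E values are made up for illustration, and the interpolation rule is whatever Python's `statistics.quantiles` uses by default; other percentile definitions would give slightly different cutoffs.

```python
import statistics

# HYPOTHETICAL O/E values from reference sites (illustration only).
reference_oe = [0.72, 0.81, 0.85, 0.88, 0.90, 0.93, 0.95, 0.97, 0.98, 1.00,
                1.01, 1.03, 1.04, 1.06, 1.08, 1.10, 1.13, 1.16, 1.21, 1.30]

# Nine cut points dividing the data into tenths; the first and last are the
# 10th and 90th percentiles.
deciles = statistics.quantiles(reference_oe, n=10)
lower, upper = deciles[0], deciles[-1]

def assess(site_oe):
    """Compare a single-sample O/E with the reference 10th and 90th percentiles."""
    if site_oe < lower:
        return "below reference range"
    if site_oe > upper:
        return "above reference range"
    return "within reference range"

print(f"10th percentile = {lower:.3f}, 90th percentile = {upper:.3f}")
print("O/E = 0.80:", assess(0.80))
```

Using the 10th percentile as the lower threshold fixes the Type I error rate at about 10% for reference-quality sites; it says nothing about the Type II error rate.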

How Good can a Model Be? The SD of O/E values calculated at reference quality sites is a measure of overall model error:
– Part sampling error
– Part prediction error (random and systematic)
A model can be no more precise than random sampling error. A model should be no worse than a null model, i.e., one that assumes all sites have similar biota.
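One way to sketch the null model: give every site the same E, computed from overall taxon frequencies across all reference sites, and take the SD of the resulting O/E values. The reference-site taxa lists below are hypothetical.

```python
import statistics

# HYPOTHETICAL taxa collected at six reference sites (illustration only).
reference_sites = {
    "R1": {"Baetis", "Epeorus", "Zapada"},
    "R2": {"Baetis", "Epeorus", "Simulium", "Zapada"},
    "R3": {"Baetis", "Hydropsyche", "Caenis"},
    "R4": {"Baetis", "Hydropsyche", "Caenis", "Tricorythodes", "Simulium"},
    "R5": {"Baetis", "Epeorus"},
    "R6": {"Hydropsyche", "Caenis", "Simulium"},
}
n_sites = len(reference_sites)
all_taxa = set().union(*reference_sites.values())

# Null model: a taxon's pc is its frequency across ALL reference sites,
# so every site gets the same E regardless of its environment.
null_pc = {t: sum(t in taxa for taxa in reference_sites.values()) / n_sites
           for t in all_taxa}
null_e = sum(null_pc.values())

null_oe = [len(taxa) / null_e for taxa in reference_sites.values()]
null_mean = statistics.mean(null_oe)
null_sd = statistics.stdev(null_oe)
print(f"null E = {null_e:.2f}, mean O/E = {null_mean:.2f}, SD = {null_sd:.2f}")
```

A predictive model earns its keep only if the SD of its reference-site O/E values is smaller than this null SD.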

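For Regional Assessments, We Want to Compare the Distribution of Observed O/E Values Among Sites with the Expected Distribution. [Figure: distribution of O/E values expected if all sites are in reference condition and the actual distribution, on an O/E axis marked at 1, with stream miles divided among Ref, Fair, and Poor classes (slide values: 35%, 40%, 25%).]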

Statistical Issues Regarding Inferences of Impairment (Multiple Sites and Replicated Samples at a Site) Statistical Hypothesis: is the observed mean different from 1 (the reference mean)? This test allows us to ask questions regarding how impaired a site or population of sites is. Sensitivity of the test is a function of model precision and sample size. Methods for balancing Type I and II errors are well worked out. Replicate samples at a site allow estimation of confidence limits around estimates of O/E.
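A sketch of the replicate-sample case: a one-sample t statistic against the reference mean of 1, plus 95% confidence limits. The replicate O/E values are hypothetical, and the critical value is hard-coded for 4 degrees of freedom, so the snippet only applies to five replicates as written.

```python
import math
import statistics

# HYPOTHETICAL O/E values from five replicate samples at one site.
replicate_oe = [0.78, 0.85, 0.81, 0.90, 0.76]

n = len(replicate_oe)
mean_oe = statistics.mean(replicate_oe)
std_err = statistics.stdev(replicate_oe) / math.sqrt(n)

# Test statistic for H0: mean O/E = 1 (the reference mean).
t_stat = (mean_oe - 1.0) / std_err

# Two-sided 95% critical value of Student's t for df = 4 (valid for n = 5 only).
T_CRIT_95_DF4 = 2.776
ci = (mean_oe - T_CRIT_95_DF4 * std_err, mean_oe + T_CRIT_95_DF4 * std_err)
differs_from_reference = abs(t_stat) > T_CRIT_95_DF4

print(f"mean O/E = {mean_oe:.2f}, t = {t_stat:.2f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), differs from 1: {differs_from_reference}")
```

More replicates or a more precise model shrink the standard error, which is the sense in which test sensitivity depends on model precision and sample size.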

O/E and the WSA – understanding the numbers.
– WSA reference sites
– How many models?
– Reference site classes
– Model predictors
– Model performance
– Assessment results

1097 Reference Sites in Three Super-Ecoregions: West, Plains, and Eastern Highlands.

Sample Sizes

              West   Plains   East
Calibration   527    140      217
Validation    125    40       48

Great variability in geographic distribution of sites within classes. The Western model used 30 classes of streams for modeling. (Graphic courtesy of Pete Ode.)

Predictor Variables (columns: West, Plains, E. Highlands). Entries in slide order: Longitude, ---, Longitude, Elevation, ---, Day of Year, Basin Area, Stream Slope, ---, Air Temperature, Freeze-Free Days, Air Temperature, Log Precipitation, Wet Days.

Variation in Predictor Variable Values Within and Among the Eastern Highland Reference Site Classes

Model Performance

                    West   Plains   E. Highlands
Validation sites
  Mean              0.99   0.95     0.99
  SD (model)        0.20   0.24     0.18
  SD (null)         0.26   0.30     0.22
Test sites
  Mean              0.84   0.86     0.81

Sites sampled for the Wadeable Streams Assessment by EPA Region.

Biodiversity status of the Nation’s streams as measured by O/E. Data summarized as % of stream miles in each of 4 O/E classes.

Concluding Remarks. O/E has an intuitive biological meaning. It means the same thing everywhere. Its derivation and interpretation are independent of type and knowledge of stressors in the region. It is quantitative, but…

Our interpretations of assessments are still only as good as our understanding of aquatic ecosystems.
