Fixed sites and monitoring/bioassessment bias in ephemeral systems
Wayne Robinson, Charles Sturt University


Context
Aggregating assessments to large or small spatial scales
More than one sample in time
For example, assessing a local asset, a region, or a national program
The Living Murray condition monitoring program

Review of typical approaches: fixed sites
Average differences between sites
Average trend of sites
Bad for bias and true status
Great for trend? Assumes the initial bias is constant through time

Review of typical approaches: random sites
Differences between population averages
Trend of population averages
Not as good for trend for 8–15 years
Less bias, thus great for status

Review of typical approaches: rotating panel
Trend + population averages
Trade-off between fixed and rotating
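
The rotating panel idea can be illustrated with a small visiting schedule. This is only a sketch under assumptions that are not in the slides: the function name `rotating_panel`, the site labels, and the panel sizes are all hypothetical, and the site list is assumed to be in random order already. A core of sites is visited every year (which supports trend) while the remaining sites are visited in rotating groups (which spreads coverage across the population).

```python
def rotating_panel(site_ids, n_fixed, panel_size, n_years):
    """Return {year: [site ids]} for a simple rotating-panel design.

    The first n_fixed sites are visited every year; the remaining sites
    are split into consecutive panels that are visited in turn.
    """
    fixed = site_ids[:n_fixed]
    rotating_pool = site_ids[n_fixed:]
    n_panels = len(rotating_pool) // panel_size
    if n_panels == 0:
        raise ValueError("not enough sites to form a rotating panel")
    schedule = {}
    for year in range(n_years):
        k = year % n_panels  # index of the panel visited this year
        panel = rotating_pool[k * panel_size:(k + 1) * panel_size]
        schedule[year] = fixed + panel
    return schedule


# Hypothetical example: 12 sites, 3 always visited, 3 rotating per year.
sites = [f"S{i:02d}" for i in range(1, 13)]
schedule = rotating_panel(sites, n_fixed=3, panel_size=3, n_years=4)
for year, visited in schedule.items():
    print(year, visited)
```

With three rotating panels, the fourth year revisits the first panel, so every rotating site contributes to trend at a three-year interval.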

Sites drop out for numerous reasons
  Access unavailable
  Landholder tolerance
  No longer representative
  Managed
  Act of nature
  Equipment failure
  Logistics
  Etc.
The longer a study runs, the higher site attrition is likely to be and the less likely the samples are to represent the (initial) population
These dropouts are generally termed 'missing at random' (MAR)

Dropout sites
Sometimes sites drop out because of the 'moving sampling frame', e.g. less watered area: missing not at random (MNAR)
Original sites have an inclusion probability when selected
New sites have different 'inclusion probabilities' when selected
Some data providers may add some more sites
Requires statistical adjustments in calculations
Q1. Should we really be worried about dropout sites? I.e. are there consequences for the assessments?

Furthermore, in ephemeral systems, sites are sometimes added because of the moving sampling frame
New sites have different inclusion probabilities when selected
Q1. Should we really be worried about the moving sampling frame? I.e. are there consequences for the assessments?
Imagine doing a shopping basket survey for a large city, based on the city limits of 40 years ago!


Common methods for dealing with dropout sites
Oversample subjects/sites at the start to allow for attrition
  e.g. common approach in medical studies
Restrict the sampling frame to subjects/sites less likely to drop out
  e.g. only report on more permanent sites (SRA)
Refresh the samples by adding new subjects/sites*
  e.g. stock market indices
  requires adjustments for different inclusion probabilities
*This is the only approach that deals with an expanding sampling frame
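
To make "adjustments for different inclusion probabilities" concrete, here is a minimal sketch of an inverse-probability-weighted mean (a Hájek-type ratio estimator). It is offered as one standard way of making such an adjustment, not as the calculation used in this study; the function name, the scores, and the inclusion probabilities are all hypothetical.

```python
def weighted_mean(values, inclusion_probs):
    """Mean with each site weighted by 1 / (its inclusion probability)."""
    weights = [1.0 / p for p in inclusion_probs]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


# Hypothetical nativeness scores.
# Three original sites were selected with probability 0.8;
# two refresh sites were selected later with probability 0.4.
scores = [0.6, 0.7, 0.8, 0.3, 0.4]
probs = [0.8, 0.8, 0.8, 0.4, 0.4]

unweighted = sum(scores) / len(scores)
weighted = weighted_mean(scores, probs)
print(f"unweighted mean: {unweighted:.2f}")
print(f"weighted mean:   {weighted:.2f}")
```

Each refresh site here stands in for more of the current frame than an original site does, so it receives a larger weight, and the weighted mean moves toward the refresh-site scores.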


The Living Murray – Icon Sites
Large-scale, asset-based monitoring program
Three very similar forests relevant to this report
  Koondrook-Perricoota Forest
  Gunbower Forest
  Barmah-Millewa Forest
All use similar within-site sampling protocols for sampling fish
Different site selection protocols
Prescribed by MDBA to use fixed monitoring sites

$60 Million in works to flood KP forest

Delivering up to 6 ML/Day

Up to 100 days/year

Koondrook-Perricoota Forest
32,000 ha
Starting from a dry state
Missed the brief about using fixed sites (luckily)
Estimated 32 sites for decent estimates of the status of native fish
The forest flooded in 2010; 80% of available sites were sampled at random (n = 32), and there has been a census almost every other year
54 sites in total sampled for fish since 2011
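
Sampling 80% of the available sites means the finite population correction matters a great deal for precision. The sketch below shows the usual correction for the standard error of a mean under simple random sampling without replacement. The population size N = 40 is an assumption inferred from n = 32 being 80% of available sites, and the sample standard deviation of 0.2 is purely hypothetical.

```python
import math


def se_mean_fpc(sample_sd, n, N):
    """Standard error of a sample mean with the finite population correction.

    Uses SE = (s / sqrt(n)) * sqrt(1 - n / N), valid for simple random
    sampling without replacement from a population of N sites.
    """
    fpc = math.sqrt(1.0 - n / N)
    return sample_sd / math.sqrt(n) * fpc


n_sampled = 32       # from the slide
n_available = 40     # assumption: 32 is 80% of the available sites
sd = 0.2             # hypothetical standard deviation of site scores

se_plain = sd / math.sqrt(n_sampled)
se_corrected = se_mean_fpc(sd, n_sampled, n_available)
print(f"SE ignoring the correction: {se_plain:.4f}")
print(f"SE with the correction:     {se_corrected:.4f}")
```

Under these assumptions the correction factor is sqrt(0.2), so the corrected standard error is less than half the uncorrected one, and it falls to zero for a full census.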

Koondrook-Perricoota Forest (fish sampling)
Starting from a dry state
Missed the brief about using fixed sites (luckily)
The forest flooded in 2010; 80% of available sites were sampled at random (n = 32), and there has been a census almost every other year
54 sites in total sampled since 2010
All available habitat in the forest is mapped each year before sampling

Barmah-Millewa Forest (fish sampling)
21 fixed sites across 3 sampling strata: permanent river sites, permanent creek sites, wetland/lake sites
Dropout and refresh notes, in slide order: 1 MAR in two years; 1 refresh site; 1 MNAR for three years; occasional MAR; 1 refresh site; 5 MNAR for various years; no MAR
This stratum is equivalent to KPF IRES/wetlands
They do not know what their sampling frame is

Koondrook-Perricoota Forest (figure; coverage labels shown: Census, 80%, 91%, Census)


Retrospective look at KPF fish monitoring data
How 'bad' is the bias caused by dropout sites (after 5 years)?
Randomly select n sites from those sampled in year 1
  A. Follow only these through the study (no refresh)
  B. Supplement with refresh sites where possible, from
       the 2011 frame
       all available sites in any year
  C. Compare with the census data
Do this lots of times with varying sample sizes
All samples are subjected to additional calculations for:
  inclusion probabilities
  size of waterbody
  finite population corrections
CAVEAT! Only 80% of the population was sampled in 2011, so the initial sample is also subject to random sampling bias. This is assumed not to be too large, as such a large proportion of the population was sampled.
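
The resampling procedure above can be sketched as a small simulation. Everything in it is hypothetical: the synthetic population, the scores, the dropout pattern, and the function names are invented for illustration and are not the KPF data. For brevity the sketch covers only steps A and C plus a refresh from all sites available in the current year, and it uses equal weights, omitting the inclusion-probability, waterbody-size, and finite population adjustments.

```python
import random
from statistics import mean

N_YEARS = 5
rng = random.Random(1)  # fixed seed so the sketch is repeatable


def build_population():
    """Hypothetical frame: {site: {year: score}} for years a site holds water."""
    pop = {}
    for i in range(20):  # sites in the initial (year 0) frame
        last = N_YEARS - 1 if i < 10 else (i - 10) // 2  # some dry out early
        pop[f"old{i}"] = {t: 0.8 + rng.uniform(-0.05, 0.05)
                          for t in range(last + 1)}
    for j in range(20):  # sites that enter the frame in later years
        first = 1 + j // 5
        pop[f"new{j}"] = {t: 0.4 + rng.uniform(-0.05, 0.05)
                          for t in range(first, N_YEARS)}
    return pop


def frame(pop, year):
    """All sites available for sampling in a given year."""
    return [s for s, scores in pop.items() if year in scores]


def one_replicate(pop, n, year, refresh):
    """Sample n sites in year 0, then estimate the mean score in `year`."""
    initial = rng.sample(frame(pop, 0), n)
    current = frame(pop, year)
    current_set = set(current)
    sample = [s for s in initial if s in current_set]  # surviving sites
    if refresh:  # top up from the current frame
        candidates = [s for s in current if s not in sample]
        sample += rng.sample(candidates, min(n - len(sample), len(candidates)))
    if not sample:
        return None  # every sampled site dropped out
    return mean(pop[s][year] for s in sample)


def relative_bias(pop, n, year, refresh, reps=500):
    """Average relative difference between resampled means and the census mean."""
    census = mean(pop[s][year] for s in frame(pop, year))
    estimates = [one_replicate(pop, n, year, refresh) for _ in range(reps)]
    estimates = [e for e in estimates if e is not None]
    return (mean(estimates) - census) / census


pop = build_population()
for year in range(N_YEARS):
    a = relative_bias(pop, n=7, year=year, refresh=False)
    b = relative_bias(pop, n=7, year=year, refresh=True)
    print(f"year {year}: no refresh {a:+.1%}, refresh from current frame {b:+.1%}")
```

In this synthetic frame the later-arriving sites score lower than the original ones, so following only the original sites overstates the census mean, and topping up from the current frame reduces that overstatement without removing it.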

Interpreting the results
Red reference line is the census mean
Green reference line is the mean of the initial group of sites (a little bias here)
Histogram is the distribution of the random samples
All results are basic fish nativeness scores
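
One way to read a histogram of resampled means against the census reference line is to compute the relative bias and check whether the census mean falls inside the central 95% of the resampled values. This is a sketch of a generic percentile check, not necessarily the test behind the results that follow; the function name and the example numbers are hypothetical.

```python
from statistics import mean, quantiles


def summarise(resampled_means, census_mean):
    """Return (relative bias, central 95% interval, census inside interval?)."""
    rel_bias = (mean(resampled_means) - census_mean) / census_mean
    # n=40 gives cut points every 2.5%; the first and last bound the central 95%.
    cuts = quantiles(resampled_means, n=40, method="inclusive")
    lo, hi = cuts[0], cuts[-1]
    return rel_bias, (lo, hi), lo <= census_mean <= hi


# Hypothetical resampled means spread evenly from 0.60 to 0.80,
# compared with a hypothetical census mean of 0.50.
resampled = [0.60 + 0.01 * i for i in range(21)]
rel_bias, interval, covered = summarise(resampled, census_mean=0.50)
print(f"relative bias: {rel_bias:+.0%}")
print(f"central 95% of resampled means: {interval[0]:.3f} to {interval[1]:.3f}")
print(f"census mean inside interval: {covered}")
```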

Results: N = 7 sites
Comparable with BMF; BMF does not use refresh sites
Index 1: Proportion native fish species richness
No bias in my bootstrap sampling methodology (random error)
No refresh = biased
Refresh from initial frame = biased
Refresh from all available sites = [less] biased
Adjusting site weights using new site inclusion probabilities practically eliminates bias
Bias values shown on the slide, in slide order (panel labels not recoverable):
Panel | No refresh | Refresh, all available sites | Refresh, initial frame | Adjusted site weights
1 | +16% | +7% | +17% | +1% ns
2 | +14% | +8% | +14% | +4% ns
3 | +13% | +6% | +17% | -2% ns
4 | -3% ns | -0.3% ns | -2% | -0.3% ns
5 | -22% | -0.0% ns | -22% | -4%

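Results: N = 7 sites
Comparable with BMF; BMF does not use refresh sites
Index 2: Proportion native fish catch
Adjusting site weights using new site inclusion probabilities practically eliminates bias
Bias values shown on the slide, in slide order (panel labels not recoverable):
Panel | No refresh | Refresh, all available sites | Refresh, initial frame | Adjusted site weights
1 | +10% | +3% | +8% | +3% ns
2 | +2% ns | +1% ns | +2% | +1% ns
3 | +16% | +6% | +18% | -1% ns
4 | +12% | +3% | +8% | +1% ns
5 | -31% | -0.0% ns | -31% | -6%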

Results: N = 7 sites
Comparable with BMF; BMF does not use refresh sites
Index 3: Proportion native fish biomass
Adjusting site weights using new site inclusion probabilities practically eliminates bias
(Random) starting bias is not constant through time
Bias values shown on the slide, in slide order (panel labels not recoverable):
Panel | No refresh | Refresh, all available sites | Refresh, initial frame | Adjusted site weights
1 | +49% | +7% | +26% | -14% ns
2 | +46% | +33% | +54% | -26% ns
3 | -40% | +10% ns | +21% | -6% ns
4 | -22% | -0.1% ns | -8% | +7% ns
5 | +25% | -0.0% ns | +25% | +5%

Summary
It is clear that not reviewing the sampling frame for each survey leaves the results susceptible to bias
When sites drop out they should be replaced using the current sampling frame
Inclusion probabilities are required for correct calculations

Questions
My question for you: do you have a good, big census data set that I can borrow?

Data providers
NSW DPI Fisheries, Arthur Rylah Institute, North Central Catchment Management Authority, Murray-Darling Basin Authority