Sampling Strategies for Chinook-Salmon Spawning Populations

Presentation transcript:

Sampling Strategies for Chinook-Salmon Spawning Populations 4/15/2017 Jean-Yves Pip Courbois, Steve Katz, Chris Jordan, Michelle Rub, and Ashley Steel (NOAA Fisheries, NWFSC); Russel F. Thurow and Daniel J. Isaak (U.S. Forest Service, Rocky Mountain Research Station). What sampling strategies should be used for estimating the number of chinook redds on a river network*? Status estimation: the number of spring-chinook redds in the Middle Fork Salmon River in one year. Our objective here is simple: to determine which sampling strategy should be used to estimate chinook spawning status. *A lot like the Middle Fork Salmon R. 1

Chinook redds. At the conclusion of the annual spawning migration, adult female chinook prepare a spawning bed, called a redd. Disturbed gravels (light-colored area) indicate a chinook redd. The total number of redds is an indicator of population health, now and in the future. 2

The Middle Fork Salmon River. National Wild and Scenic River in the Frank Church River of No Return Wilderness, a roadless area. Drains about 7,330 square kilometers (km2) of central Idaho. Two level 4 HUCs and 126 level 6 HUCs. Home to 15 native fishes, including 7 salmonid taxa. Spring chinook salmon are ESA listed. 655 km of chinook spawning reaches. Index reaches: delineated in the 1950s and changed periodically; those shown have been used since approximately 1995 and represent "good salmon habitat". [Map of the watershed with north arrow and scale bar in kilometers.] 3

Number of redds: "the Truth". Since 1995 we have counted the number of redds in the entire watershed via helicopter; where necessary, reaches were sampled by foot. This study uses six years of data, which will be considered the truth:
Year          1995  1996  1997  1998  2001  2002
Total redds     20    83   424   661  1789  1730
4

Examples: small and large runs. [Figure panels: 1995 and 2002.] 5

Objectives. Criteria: design-based standard error of the estimator; coverage probability (how often a 95% confidence interval actually contains the number of redds); cost. Sampling and measurement unit: 200-meter reaches (N = 3,274). Keep things fair by sampling the same total length of stream: sampling fractions of 0.1 and 0.05 (n = 327 and 164), that is, 65.4 and 32.8 km sampled. Although some standard errors can be calculated analytically, the coverage needs to be addressed via simulation. 6
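
The sample sizes and stream lengths quoted above follow from the frame size and the reach length. A quick check, as a sketch: the names are illustrative, and rounding n to the nearest whole reach is an assumption that happens to reproduce the slide's figures.

```python
# Frame from the slide: N = 3,274 reaches of 200 m (0.2 km) each.
N_REACHES = 3274
REACH_LENGTH_KM = 0.2

def sample_size(fraction, n_units=N_REACHES):
    """Number of reaches for a sampling fraction (rounding is an assumption)."""
    return round(n_units * fraction)

def sampled_length_km(n, reach_length_km=REACH_LENGTH_KM):
    """Total stream length covered by n reaches, to one decimal place."""
    return round(n * reach_length_km, 1)

for fraction in (0.10, 0.05):
    n = sample_size(fraction)
    print(fraction, n, sampled_length_km(n))
```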

Methods. Use simulation: resample the population over and over. 7
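
A minimal sketch of the resampling idea for one strategy, simple random sampling with the expansion estimator and a normal-theory 95% interval. The population below is synthetic and is not the Middle Fork data; the function names, the number of replicates, and the interval construction are all assumptions made for illustration.

```python
import random
import statistics

def srs_total_estimate(sample, n_units):
    """Expansion estimate of the total and its estimated standard error under SRS."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s2 = statistics.variance(sample)          # sample variance
    fpc = 1 - n / n_units                     # finite population correction
    return n_units * mean, n_units * (fpc * s2 / n) ** 0.5

def simulate_srs(population, n, reps=1000, z=1.96, seed=1):
    """Resample a fully known population and summarize the estimator."""
    rng = random.Random(seed)
    truth = sum(population)
    estimates, hits = [], 0
    for _ in range(reps):
        t_hat, se = srs_total_estimate(rng.sample(population, n), len(population))
        estimates.append(t_hat)
        hits += abs(t_hat - truth) <= z * se  # does the interval cover the truth?
    return {
        "truth": truth,
        "mean_estimate": statistics.fmean(estimates),
        "empirical_se": statistics.pstdev(estimates),
        "coverage": hits / reps,
    }

# Synthetic stand-in population: most reaches hold no redds.
pop_rng = random.Random(0)
population = [pop_rng.choice([0, 0, 0, 0, 0, 0, 0, 1, 2, 5]) for _ in range(3274)]
result = simulate_srs(population, n=327)
print(result)
```

Dividing the empirical standard error by the true total gives a normalized standard error that can be compared across run sizes.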

Costs & crew-trips. Each sampling unit in the MF is assigned to an access point. There are two types of access points, air fields and trailheads, at the same price. Cost for an access site = maximum distance from the access site to the sampled reaches in each "direction" along the network. Total cost = sum of the costs for the 15 access sites. 4 "directions" = 4 round trips required. 8
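
The cost rule above can be sketched as follows. The data layout (one tuple per sampled reach) and the site names are hypothetical, and whether a round trip doubles the distance is not stated on the slide, so distances are counted once here.

```python
from collections import defaultdict

def travel_cost(sampled_reaches):
    """Sum, over access sites and directions, of the farthest sampled reach.

    sampled_reaches: iterable of (access_site, direction, distance_km) tuples,
    one per sampled reach (a hypothetical layout).
    """
    farthest = defaultdict(float)
    for site, direction, distance_km in sampled_reaches:
        key = (site, direction)
        farthest[key] = max(farthest[key], distance_km)
    return sum(farthest.values())

# Made-up sample: two directions from one air field, one from a trailhead.
sample = [
    ("airfield_A", "upstream", 4.2),
    ("airfield_A", "upstream", 11.0),
    ("airfield_A", "downstream", 6.5),
    ("trailhead_B", "upstream", 20.4),
]
cost = travel_cost(sample)
print(cost)  # 11.0 + 6.5 + 20.4 kilometers
```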

[Map: distances from access points in 5 km intervals.] Many areas require a hike of over 20 km. Maximum distance is 33 km. 9

The sampling strategies. Index: sample the index reaches, or take an SRS within the index only. Simple random sampling (SRS). Cluster sampling: simple random sampling of 1 km length units. Systematic sampling: sort the tributaries in random order and sample systematically along the resulting line. Stratify by index: sample independently within and outside the index regions. Adaptive cluster sampling: choose segments with a simple random sample; if sampled sites have redds, sample adjacent segments. Spatially balanced design: GRTS, selecting segments as sampling units rather than points. This is the generalized random tessellation stratified design from EPA's EMAP designs and now from these two projects represented here. 10

Index sampling. When the sample size is smaller than the number of segments in the index region, a simple random sample of the segments within the index is collected. There are two ways to estimate the number of redds from the index sample. (1) Assume there are no redds outside the index: estimates will be too small ("all"). (2) Assume that the average number of redds per segment outside the index is the same as inside, and simply inflate the index estimator: estimates will be too large ("rep"). 11
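
The two index estimators can be written side by side. This is a sketch: the number of index segments is not given in the slides, so `n_index=500` and the sample counts below are hypothetical.

```python
def index_estimates(index_sample, n_index, n_total):
    """Return the "all" and "rep" estimates of the total number of redds.

    index_sample: redd counts from an SRS of segments inside the index
    n_index:      number of segments in the index region (hypothetical here)
    n_total:      number of segments in the whole network
    """
    mean_per_segment = sum(index_sample) / len(index_sample)
    index_all = n_index * mean_per_segment  # assumes zero redds outside the index
    index_rep = n_total * mean_per_segment  # assumes index density everywhere
    return index_all, index_rep

# Illustrative numbers only.
all_est, rep_est = index_estimates([0, 3, 1, 0, 4], n_index=500, n_total=3274)
print(all_est, rep_est)
```

The "rep" estimate always exceeds the "all" estimate by the factor n_total / n_index, which is why the two bracket the truth when redds are denser inside the index than outside.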

Systematic sampling. Order the tributaries in random order along a line. Choose the sampling interval, k, so that the final sample size is approximately n. Select a random number, r, between 1 and k. Sample reaches r, r+k, r+2k, …, r+(n-1)k. Systematic sampling is cluster sampling in which each cluster is made up of units far apart in space and only one cluster is sampled. [Diagram: reaches r, r+k, r+2k, r+3k, r+4k marked along the line at interval k.] 12
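
The steps above can be sketched as a short function. The tributary names and reach ids are made up, and taking k as the integer part of (number of reaches / n) is one possible choice of interval, not necessarily the one used in the study.

```python
import random

def systematic_sample(tributaries, n, seed=None):
    """1-in-k systematic sample of reaches after randomly ordering tributaries.

    tributaries: dict mapping tributary name -> list of reach ids in stream order
    """
    rng = random.Random(seed)
    names = list(tributaries)
    rng.shuffle(names)                      # random order of tributaries
    line = [reach for name in names for reach in tributaries[name]]
    k = max(1, len(line) // n)              # sampling interval
    r = rng.randint(1, k)                   # random start between 1 and k
    return line[r - 1::k]                   # reaches r, r+k, r+2k, ...

# Hypothetical network of 100 reaches on three tributaries.
tribs = {
    "A": [f"A{i}" for i in range(40)],
    "B": [f"B{i}" for i in range(35)],
    "C": [f"C{i}" for i in range(25)],
}
sample = systematic_sample(tribs, n=10, seed=42)
print(sample)
```

When the number of reaches is not a multiple of n, the realized sample size can differ slightly from n, which matches the "approximately n" wording.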

Stratify by Index. Stratify by index and oversample index reaches. Simple random sample in each stratum. Allocation: equal allocation usually does not perform well; proportional allocation does not oversample index sites, so it will probably not have good precision; optimal allocation requires knowing the standard deviation. Year: 1995 1996 1997 1998 2001 2002. Proportion in index: 0.76 0.54 0.48 0.42 0.46. 13
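
A sketch of how the allocations differ, assuming "optimal" means Neyman allocation with equal per-unit costs (sample size in each stratum proportional to stratum size times stratum standard deviation). The slides do not state the formula, and the stratum sizes and standard deviations below are hypothetical.

```python
def proportional_allocation(n, stratum_sizes):
    """Sample sizes proportional to the number of units in each stratum."""
    total = sum(stratum_sizes)
    return [n * size / total for size in stratum_sizes]

def neyman_allocation(n, stratum_sizes, stratum_sds):
    """Sample sizes proportional to (stratum size) x (stratum standard deviation)."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * w / total for w in weights]

# Hypothetical strata: [index, other]; redd counts vary more inside the index.
sizes = [500, 2774]
sds = [3.0, 0.5]
prop = proportional_allocation(327, sizes)
opt = neyman_allocation(327, sizes, sds)
print([round(x, 1) for x in prop], [round(x, 1) for x in opt])
```

With a larger standard deviation inside the index, the Neyman rule shifts sample toward the index stratum, which is the oversampling the slide describes; in practice the results would be rounded to whole reaches.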

Adaptive cluster sampling. The original sample is a simple random sample. If a sampled site meets the criteria, also sample the sites in its neighborhood. Criteria: presence of redds. Neighborhood: the segments directly upstream and downstream; at confluences, both legs are included as neighbors. Continue until the sites do not meet the criteria. Two sample sizes: ADAPT-EN equates the expected final sample size; ADAPT-N equates the size of the original SRS. [Diagram: numbered segments 1 to 6 on a stream network, marked as meeting or not meeting the criteria; the final sample includes segments 2, 1, 3, 4, and 6.] 14
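
A sketch of the expansion rule on a single unbranched stream. This is a simplification: the real network has confluences, where both legs would count as neighbors. The counts and function name are hypothetical.

```python
import random

def adaptive_cluster_sample(counts, n_initial, seed=None):
    """Adaptive cluster sample on one unbranched stream (a simplification).

    counts: redd count per segment, ordered along the stream.
    A sampled segment with at least one redd triggers sampling of its
    upstream and downstream neighbors, repeated until no new segment qualifies.
    """
    rng = random.Random(seed)
    initial = rng.sample(range(len(counts)), n_initial)
    sampled = set(initial)
    to_check = list(initial)
    while to_check:
        i = to_check.pop()
        if counts[i] == 0:
            continue                         # criterion not met: stop expanding
        for j in (i - 1, i + 1):             # adjacent segments
            if 0 <= j < len(counts) and j not in sampled:
                sampled.add(j)
                to_check.append(j)
    return sorted(initial), sorted(sampled)

counts = [0, 0, 2, 3, 0, 0, 1, 0, 0, 0]
initial, final = adaptive_cluster_sample(counts, n_initial=3, seed=7)
print(initial, final)
```

The final sample size is random, which is why the two variants above fix either the expected final size or the initial SRS size.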

Results: Normalized standard error of estimators
Run size                 20    83   424   661  1789  1730
GRTS                   75.6  35.1  16.6  13.7  11.0  10.9
SYS                    68.4  38.4  17.0  14.0  10.2  10.8
STRS-index (optimal)   59.6  31.5  17.4  14.3  13.3  12.3
ADAPT-N                76.4  35.8  18.2  14.8  12.1  11.8
SRS                    36.3  19.5  16.1  13.8  (only four values survive; their columns are uncertain)
ADAPT-EN               76.6  35.9  21.3  18.5  19.2  18.1
Cluster                93.2  44.5  29.6  24.9  24.4  23.6
Index (all)            34.4  29.6  32.5  32.7  34.6  31.1
Index (rep)             247   158   132   129   122   134
This is the CV of t-hat. Now we can compare these over the run size. 15

Standard error estimation for systematic strategy. This problem is not evident in the GRTS design. 16

Results: Empirical coverage probability
Run size: 20 83 424 661 1789 1730
GRTS: 81.1 90.7 93.0 92.0 93.4 93.8
SYS: 88.7 92.6 91.2 96.0 94.1
STRS-index (optimal): 77.3 91.3 92.8 94.6 93.2
ADAPT-N: 83.3 93.6 94.4 94.2
SRS: 82.4 89.8 92.7 94.0 93.9
ADAPT-EN: 82.5 92.3
Cluster: 75.1 86.7 89.9 91.8 92.2
Index (all): 98.7 99.8 92.9 77.6 32.7 55.9
Index (rep): 89.4 2.0 0.2
Rows with fewer than six values lost entries in the transcript, so those values cannot be matched to run sizes. 17

Costs. [Figure: cost of each strategy in kilometers traveled.] 18

Relative precision per cost. [Figure: precision per cost, in units of 1/km traveled, against run size, at a 10% sampling fraction.] There are several possibilities for combining precision and cost. Two simple approaches are to hold one fixed and find the minimum of the other. Otherwise, here is the precision per mean cost, in 1/km traveled. One could also plot the precision per standard deviation of cost, or use something such as CI width per km traveled. In either case the y-axis is difficult to interpret. The standard errors are standardized by the size of the run and then multiplied by the cost in kilometers traveled by foot, so the precision is unit free and the denominator is in km. For small runs, either stratified by index or SRS-1km gives the best bang for our buck; for medium runs, stratified by index; for large runs, systematic. 19
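
One reading of the note above, assumed here rather than confirmed by the slides, is that the plotted quantity is the reciprocal of (normalized standard error x cost), which gives units of 1/km. The standard errors and the cost in this sketch are made up.

```python
def precision_per_cost(standard_error, run_size, cost_km):
    """Reciprocal of (SE / run size) x cost; units are 1/km traveled.

    This formula is an interpretation of the slide note, not a stated definition.
    """
    normalized_se = standard_error / run_size   # unit free
    return 1.0 / (normalized_se * cost_km)

# Two hypothetical strategies with the same travel cost but different precision.
a = precision_per_cost(standard_error=190.0, run_size=1730, cost_km=400.0)
b = precision_per_cost(standard_error=250.0, run_size=1730, cost_km=400.0)
print(a, b)
```

Under this reading a strategy scores higher either by being more precise at the same cost or by being cheaper at the same precision.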

Conclusions. Precision: for medium to large runs, the systematic strategies (systematic and GRTS) are best, although the standard error is difficult to estimate for the systematic strategy; for small runs, stratified by index is best, but it requires optimal allocation, which is difficult to determine. Cost and precision: for small runs, cheap strategies are best, either index or SRS-1km; for medium runs, intermediately priced designs (stratify by index); for large runs, precise strategies are best, either the systematic strategies or stratified by index. 20

Six years. [Figure panels: 1997 and 1998.] 21

Discussion. The adaptive cluster strategy is not as precise as the other designs. It is optimal for rare, clustered populations: during small years the redds are not clustered enough, during large years they are not rare enough, and only during the medium years does it compete with other designs. Many of the designs require extra information (stratified, adaptive). These results suggest more complex designs, such as combining stratified with systematic or adaptive. Real vs. simulated data? 22

Acknowledgements. Tony Olsen (US-EPA); Damon Holzer and George Pess (NOAA-Fisheries). Funding for this research has been provided by the NOAA-Fisheries Northwest Fisheries Science Center Cumulative Risk Initiative and partially by the US EPA cooperative agreement CR29096 to Oregon State University and its subagreement E0101B-A to the University of Washington. This research has not been formally reviewed by NOAA-Fisheries or the EPA. The views expressed in this document are solely those of the authors; NOAA-Fisheries and the EPA do not endorse any products or commercial services mentioned herein. Lucas Boone Courbois, born August 4, 2004, Seattle WA. 23

24

Six years. [Figure panels: 1995 and 1996.] 25

Six years. [Figure panels: 2001 and 2002.] 26

Stratify by 6th field HUC 27

Points vs. Lines. Pick points: points are picked along the stream continuum and the measurement unit is constructed around each point. Advantages: measurement units of different sizes are easily implemented. Disadvantages: difficulty with overlapping units; an inadvertent variable-probability design because of confluences and headwaters; analysis may be complicated. Pick segments: the universe is segmented before sampling and segments are picked from the population of segments. Advantages: simple to implement; simple estimators. Disadvantages: difficult frame construction before sampling; cannot accommodate varying lengths of sampling unit. 28

Methods. A sampling strategy comprises a sampling design and an estimator. Sampling and measurement unit: 3,274 200-meter segments. The measurement design assumes no measurement error. Estimator for the total, sample design, and confidence interval. 29

Adaptive Cluster Sampling. Use the draw-by-draw probability estimator (Thompson 1992), built from w_i, the average number of redds in the network to which segment i belongs, together with its variance. 30
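
The formulas did not survive in the transcript. The sketch below assumes the slide refers to the modified Hansen-Hurwitz estimator described by Thompson (1992): the total is estimated by N times the mean of the w_i over the initial sample, with the usual SRS variance estimator applied to the w_i. It uses a single unbranched stream, treats a segment without redds as a network of size one, and the counts and sampled indices are hypothetical.

```python
def network_mean(counts, i):
    """Mean redd count over the network containing segment i (unbranched stream)."""
    if counts[i] == 0:
        return 0.0                  # a segment without redds is its own network
    lo = hi = i
    while lo > 0 and counts[lo - 1] > 0:
        lo -= 1
    while hi < len(counts) - 1 and counts[hi + 1] > 0:
        hi += 1
    block = counts[lo:hi + 1]
    return sum(block) / len(block)

def hansen_hurwitz_total(network_means, n_units):
    """Estimated total and estimated variance from the initial-sample w_i values."""
    n = len(network_means)
    w_bar = sum(network_means) / n
    s2_w = sum((w - w_bar) ** 2 for w in network_means) / (n - 1)
    total_hat = n_units * w_bar
    var_hat = n_units ** 2 * (1 - n / n_units) * s2_w / n
    return total_hat, var_hat

counts = [0, 0, 2, 3, 0, 0, 1, 0, 0, 0]
initial_sample = [0, 1, 2, 4, 6]    # indices of the initial SRS (made up)
w = [network_mean(counts, i) for i in initial_sample]
total_hat, var_hat = hansen_hurwitz_total(w, n_units=len(counts))
print(w, total_hat, var_hat)
```

Only the initial sample enters the estimator; the adaptively added segments contribute through the network means.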

Access to MFSR. Roadless area. Airplane access possible. 31

air vs. car access 32

Index sample. Not sure how to build estimates for the total number of redds in the Middle Fork. Options: expand the current estimator (assume the same density outside the index), or use the current estimate (assume 0 redds outside the index).
Year                      1995  1996  1997  1998  2001  2002
Number counted in index     19    62   290   448  1178  1199
Total number of redds       20    83   424   661  1789  1730
33

34

Stratify by Index. Oversample index sites, where most redds are located. Simple random sample in each stratum.
Equal allocation:
Year          1995   1996   1997   1998    2001    2002
(unlabeled)   5.33  12.68  36.61  47.34  121.43  106.98
Coverage      90.4   94.6   94.2   94.8    92.9    93.4
Proportional allocation:
Year          1995   1996   1997   1998    2001    2002
(unlabeled)   7.77  15.26  41.08  52.37  124.90  115.56
Coverage      88.0   94.7   95.0   94.9    94.4    93.6
35

Stratify by index. Optimal allocation, using:
Year          1995  1996  1997  1998  2001  2002
n index        746   530   475   464   407   445
n other        230   446   501   512   569   531
Proportion in index: 0.76 0.54 0.48 0.42 0.46
Year          1995   1996   1997   1998    2001    2002
(unlabeled)   5.49  12.76  36.60  47.26  120.50  106.58
Coverage      92.0   95.0   94.3   95.5    92.9    93.5
36

Stratify by index. Using:
Year          1995  1996  1997  1998  2001  2002
n index        746   530   475   464   407   445
n other        230   446   501   512   569   531
37

To do. Stratify by 6th field HUC. Better estimators for adaptive designs. Cost function including road/airplane travel, in crew trips/day units. 38