1 of 27 How Many Samples do I Need? Part 2 Presenter: Sebastian Tindall (60 minutes) (5 minute “stretch” break) DQO Training Course Day 1 Module 5.

Presentation transcript:

1 of 27 How Many Samples do I Need? Part 2 Presenter: Sebastian Tindall (60 minutes) (5 minute “stretch” break) DQO Training Course Day 1 Module 5

2 of 27 Total Uncertainty = Analytical + Sampling & Sub-sampling + Natural heterogeneity of the site. Uncertainty is additive! Remember that uncertainty is additive across all steps in sampling and analysis.
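A minimal sketch of how additive uncertainty is usually computed. It assumes the three components are independent, so that their variances (not their standard deviations) add; the slide does not state this, and the numbers below are hypothetical, chosen only for illustration.

```python
import math

def total_sd(analytical_sd, subsampling_sd, heterogeneity_sd):
    """Combine independent uncertainty components by adding their variances."""
    total_variance = analytical_sd**2 + subsampling_sd**2 + heterogeneity_sd**2
    return math.sqrt(total_variance)

# Hypothetical standard deviations in mg/kg (not taken from the slides).
components = {
    "analytical": 5.0,
    "sampling and sub-sampling": 20.0,
    "site heterogeneity": 40.0,
}

total = total_sd(*components.values())

# Report each component's share of the total variance.
for name, sd in components.items():
    share = sd**2 / total**2
    print(f"{name}: {share:.1%} of total variance")
print(f"total standard deviation: {total:.1f} mg/kg")
```

Under these assumed values the analytical step contributes only a small share of the total variance, which is the point the following slides develop.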

3 of 27 What is the one phenomenon that causes ALL sampling error? HETEROGENEITY

4 of 27 The SYSTEM functions as if it believes that… Prescriptive Analytical Methods = Analytical Uncertainty Automatically Managed; Data Quality = Data Uncertainty Automatically Managed; Decision Quality = Decision Uncertainty Automatically Managed

5 of 27 Take-Home Message: Non-Representative Sample + Perfect Analytical Chemistry = “BAD” DATA

6 of 27 Sample vs. Analytical Certainty. TOTAL ERROR: Analytical = 5%, Sampling = 95%. Paired results (Onsite / Lab): 1,280 / 1,220; 331 / 286; 500 / 416; 164 / 136; 24,000 / 27,700; 27,800 / 42,800; 39,800 / 41,400
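A rough sketch of what the paired numbers on this slide suggest. It assumes that each onsite/lab pair refers to one sample location and that the pairs come from different locations; the slide gives no units and does not say how its 5% / 95% split was derived, so this does not reproduce that calculation.

```python
# (onsite, lab) result pairs as listed on the slide; units are not given there.
pairs = [
    (1280, 1220), (331, 286), (500, 416), (164, 136),
    (24000, 27700), (27800, 42800), (39800, 41400),
]

def relative_percent_difference(a, b):
    """Absolute difference of two results as a percentage of their mean."""
    return abs(a - b) / ((a + b) / 2) * 100

# Disagreement between the two analyses of each pair.
rpds = [relative_percent_difference(onsite, lab) for onsite, lab in pairs]

# Variation from one pair to the next, as a max/min ratio of the pair means.
pair_means = [(onsite + lab) / 2 for onsite, lab in pairs]
spread = max(pair_means) / min(pair_means)

for (onsite, lab), rpd in zip(pairs, rpds):
    print(f"onsite {onsite:>6}  lab {lab:>6}  RPD {rpd:5.1f}%")
print(f"largest onsite-vs-lab RPD: {max(rpds):.1f}%")
print(f"highest pair mean is {spread:.0f} times the lowest")
```

The onsite and lab results within a pair differ by tens of percent at most, while the pairs themselves differ from each other by orders of magnitude.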

7 of 27 Dilemma! The average is not specific to an area or to a sub-sample size. None of the equations for the number of samples, the average, or the standard deviation include size or area. Some guidance suggests 1 sample per 20 cu yd, but this is indefensible. You must decide on the scale of the decision or exposure unit to represent the population of interest, and you must sample within the scale of the decision unit.

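8 of 27 Typical Sampling Design. EPA, “Methods for Evaluating the Attainment of Soil Cleanup Standards - Vol 1”, 1989, Equation 6.6. [Equation image not reproduced in the transcript. Slide annotations on the equation: “Wrong”, “Often”, “Usually way off or unknown”, “Assumed Normal Distribution”]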
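The equation itself did not survive in the transcript. As a hedged illustration only, the sketch below uses the standard normal-theory sample-size formula for comparing a mean against a fixed threshold, n = σ² · ((z₁₋α + z₁₋β) / (Cs − μ₁))², which is assumed here to be the kind of equation the slide annotates; it is not copied from the EPA document. The function name, parameter names, and input values are all hypothetical.

```python
import math
from statistics import NormalDist

def samples_needed(sigma, cleanup_standard, mu1, alpha=0.05, beta=0.20):
    """Normal-theory sample size for testing a mean against a threshold.

    sigma            -- assumed population standard deviation
    cleanup_standard -- threshold the mean is compared with (Cs)
    mu1              -- mean at which the false-negative rate beta is specified
    alpha, beta      -- tolerated false-positive and false-negative rates
    """
    z = NormalDist().inv_cdf  # standard normal quantile function
    n = sigma**2 * ((z(1 - alpha) + z(1 - beta)) / (cleanup_standard - mu1)) ** 2
    return math.ceil(n)

# Hypothetical inputs: the answer depends heavily on the guessed sigma.
for guessed_sigma in (25, 50, 100):
    n = samples_needed(guessed_sigma, cleanup_standard=100, mu1=75)
    print(f"sigma = {guessed_sigma:>3} -> n = {n}")
```

Because n grows with the square of sigma, a standard deviation that is "way off or unknown" and a distribution that is not actually normal can make the computed n badly wrong.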

9 of 27 Typical Sampling Design (cont.) n  5: 1. Will usually fail to truly capture the heterogeneity of the population(s). 2. Results in large uncertainty, which is seldom identified, quantified, or even acknowledged.

10 of 27 Uncertainty. $M_o$ = mode, $M_d$ = median, $M_n$ = mean. Normal: $M_o = M_d = M_n$. Lognormal: $M_o < M_d < M_n$. The percentage of the time when $\bar{x} < \mu$ is high when n is small.
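A small simulation sketch of the last statement, assuming a lognormal population with hypothetical parameters (log-scale mean 0, log-scale standard deviation 1.5). It counts how often the mean of n random samples falls below the true population mean.

```python
import math
import random

def fraction_below_true_mean(n, sigma=1.5, trials=5000, seed=1):
    """Fraction of simulated sample means that fall below the true mean."""
    rng = random.Random(seed)
    # Mean of a lognormal distribution with log-scale mean 0 and std sigma.
    true_mean = math.exp(sigma**2 / 2)
    below = 0
    for _ in range(trials):
        sample_mean = sum(rng.lognormvariate(0.0, sigma) for _ in range(n)) / n
        if sample_mean < true_mean:
            below += 1
    return below / trials

for n in (5, 10, 30, 50):
    print(f"n = {n:>2}: sample mean below true mean in "
          f"{fraction_below_true_mean(n):.0%} of trials")
```

For a symmetric distribution this fraction would be about one half at every n; with right-skewed data, small samples underestimate the mean more often than they overestimate it.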

11 of 27 Why Decisions Are Suspect: Failure to define the population accurately; failure to obtain representative data from the population of interest; failure to accurately determine the frequency distribution of the COPCs; failure to select the appropriate statistical method for generating adequate samples; failure to use the appropriate UCL in making the decision.

12 of 27 Definitions of Representativeness. A sample is representative when the mean square of the Sampling Error (SE), $r^2_{SE}$, is not larger than a certain standard of representativeness regarded as acceptable, $r^2_{0SE}$. Representativeness is the sum of the square of the mean of SE ($m_{SE}$) and the variance of SE ($s^2_{SE}$): $r^2_{SE} = m^2_{SE} + s^2_{SE} \le r^2_{0SE}$. Source: Preparation of Soil Sampling Protocols: Sampling Techniques and Strategies, EPA/600/R-92/128, July 1992
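A minimal sketch of this criterion, using hypothetical relative sampling errors and a hypothetical acceptance standard. It takes the variance as the population variance, so that the squared mean plus the variance equals the mean square of the errors.

```python
import statistics

def representativeness(sampling_errors):
    """Return r2_SE = m_SE**2 + s2_SE for a list of sampling errors."""
    m_se = statistics.fmean(sampling_errors)       # mean of SE (bias)
    s2_se = statistics.pvariance(sampling_errors)  # variance of SE (spread)
    return m_se**2 + s2_se

# Hypothetical relative sampling errors from repeated sampling of one lot.
errors = [0.10, -0.05, 0.20, 0.00, 0.15]
r2_se = representativeness(errors)

# Hypothetical acceptance standard r2_0SE, set during planning.
r2_0se = 0.02
is_representative = r2_se <= r2_0se

print(f"r2_SE = {r2_se:.4f}, standard = {r2_0se}, representative: {is_representative}")
```

Both a biased sampling procedure (large mean error) and an imprecise one (large variance) increase the mean square, so either can push a sample past the acceptance standard.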

13 of 27 Definitions of Representativeness. (1) A sample collected in such a manner that the sampling error is less than a specified amount. (2) A sample of a universe or whole that can be expected to exhibit the average properties of the universe or whole (40 CFR ). (3) A sample that answers a question about a population with a specified confidence. Source: Sampling for Environmental Activities, Envirostat, 2003

14 of 27 Definitions of Representativeness. Representativeness expresses the degree to which sample data accurately and precisely represent a characteristic of a population, parameter variations at a sampling point, or an environmental condition. Representativeness is a qualitative parameter which is most concerned with the proper design of the sampling program. The representativeness criterion is best satisfied by making certain that sampling locations are selected properly and a sufficient number of samples are collected. Representativeness is addressed by describing sampling techniques and the rationale used to select sampling locations. Source: DQOs for Remedial Response Activities: Development Process, US EPA 1987

15 of 27 Quantity of Data Matters. Why? WARNING: The Statistician General has determined that drawing conclusions from insufficient data may be hazardous to your decisions.

16 of 27 Sample Size Rules of Thumb “Samples of less than 10 are usually too small to rely on sample estimates even in ‘nice’ parametric cases.” “In many practical contexts, the number 30 is used as a ‘minimum’ sample size.” M.R. Chernick in Bootstrap Methods: A Practitioner's Guide, 1999, pp. 150, 151.

17 of 27 Sample Size Rules of Thumb “Although it is always dangerous to set ‘rules of thumb’ for sample sizes, I would suggest that in most cases it would be wise to take n ≥ 50.” M.R. Chernick in Bootstrap Methods: A Practitioner's Guide, 1999, p. 151.

18 of 27 Sample Size Rules of Thumb “For practical purposes it will be assumed here that a ‘too small number’ is less than 30, and a ‘large number’ is at least 50.” Pierre Gy in Sampling for Analytical Purposes, 1998, p. 70.

19 of 27 Sample Size Rules of Thumb “In practice, there appears to be no simple rule for determining how large n should be….If the distribution is highly skewed, an n of 50 or more may be required.” Richard Gilbert in Statistical Methods for Environmental Pollution Monitoring, 1987, p. 140.

20 of 27 Quantity of Data Matters. Why? “If the sample size is ‘large’ then most traditional estimators will yield the same conclusions and simple estimators suffice.” H. Lacayo, Jr. in Environmental Statistics: Handbook of Statistics Volume 12, 1994, p. 891.

21 of 27 “Lacking distribution information, it is impossible to devise an optimal sampling strategy.” Jenkins et al., “Assessment of Sampling Error Associated with Collection and Analysis of Soil Samples at Explosives-Contaminated Sites”, U.S. Army Corps of Engineers, Cold Regions Research & Engineering Laboratory, p. 1.

22 of 27 How Many Samples do I Need? Begin with the End in Mind. [Diagram labels: DATA; Contaminant Concentrations in the Spatial Distribution of the Population; Population Frequency Distribution; Correct Equation for n (Statistical Method); Alternative Sample Designs; Optimal Sampling Design; The end]

23 of 27 Q: Where do you obtain the contaminant distribution information in order to select the correct sampling design to ensure representativeness, etc? A: From sampling data. Q: How much sampling data do you need? A: Depends upon the consequences of making the wrong decision.

24 of 27 Sample Representativeness. Are we honestly addressing heterogeneity (sampling uncertainty)? Now we are finally able to address this issue, defensibly and affordably! Use cheaper analytical technologies that allow you to increase sample density. Use real-time measurements at the site of the sample to support real-time decision-making. This works IF we are willing to honestly balance analytical uncertainty against overall data uncertainty.

25 of 27 The TRIAD Approach Systematic Planning Dynamic Work Plans Real-Time Measurement Technologies

26 of 27 Unifying Concept for TRIAD: Managing Uncertainty. Systematic planning: identify decision goals with tolerable overall uncertainty; identify major uncertainties (those that cause decision error); identify a strategy to manage each major uncertainty. Use the Field Analytical Method (FAM) and a Dynamic Work Plan (DWP) to effectively manage sampling uncertainty (sample representativeness). Use various strategies to manage analytical uncertainty when using FAM.

27 of 27 End of Module 5 Thank you Questions? We will now take a 5-minute Afternoon “Stretch” Break. Please be back in 5 minutes