1 of 45 How Many Samples do I Need? Part 3 Presenter: Sebastian Tindall 60 minutes DQO Training Course Day 1 Module 6.

Presentation transcript:

1 of 45 How Many Samples do I Need? Part 3 Presenter: Sebastian Tindall 60 minutes DQO Training Course Day 1 Module 6

2 of 45 How Many Samples do I Need? REMEMBER: HETEROGENEITY IS THE RULE!

3 of 45 Sampling for Environmental Activities Chuck Ramsey EnviroStat, Inc. PO Box 636 Fort Collins, CO fax

4 of 45 Sampling for Analytical Purposes Pierre Gy Translated by A.G. Royle John Wiley & Sons 1998 ISBN:

5 of 45 Pierre Gy’s Sampling Theory and Sampling Practice: Heterogeneity, Sampling Correctness, and Statistical Process Control, 2nd Edition Francis F. Pitard CRC Press 1993 ISBN:

6 of 45 Seven Major Sampling Errors
• Fundamental Error - FE
• Grouping and Segregation Error - GSE
• Materialization Error - ME
  – Delimitation Error - DE
  – Extraction Error - EE
• Preparation Error - PE
• Trends - $CE_2$
• Cycles - $CE_3$

7 of 45 Seven Major Sampling Errors $SE = FE + GSE + DE + EE + PE + CE_2 + CE_3$

8 of 45 Ramsey’s “Rules”
• All measurements are an average
• With discrete sampling, the sample average is a random variable
• With discrete sampling, the sample SD is an artifact of the sample collection process

9 of 45 Ramsey’s “Rules”
• Heterogeneity is the rule
• Multi-increment sampling can drive a skewed distribution towards normal (by invoking the CLT)
• $FE^2$ is
  – proportional to particle size
  – inversely proportional to mass
• Lab data are suspect (error can be large)
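A minimal sketch of the CLT point above, assuming a hypothetical lognormal population of grab-sample concentrations. The distribution parameters, the seed, the 30-increment composite, and every function name are illustrative assumptions, not values from the course.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation divided by the cubed (population) SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)

rng = random.Random(42)  # fixed seed so the run is repeatable

def discrete_result():
    # Hypothetical right-skewed concentration of one small grab sample.
    return rng.lognormvariate(0.0, 1.0)

def multi_increment_result(k=30):
    # One multi-increment sample: the physical average of k increments.
    return statistics.fmean([discrete_result() for _ in range(k)])

discrete = [discrete_result() for _ in range(4000)]
composite = [multi_increment_result() for _ in range(4000)]

skew_discrete = skewness(discrete)
skew_composite = skewness(composite)
print(f"skewness of discrete results:        {skew_discrete:.2f}")
print(f"skewness of 30-increment composites: {skew_composite:.2f}")
```

Under these assumptions the composite results should come out noticeably less skewed than the discrete ones. How many increments are enough depends on how skewed the real population is.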

10 of 45 Ramsey’s “Rules” (cont.)
• Good sampling technique is critical
• Typical sample sizes will underestimate the mean
• Quality control (QC) is important
  – NO boilerplate (e.g., PARCC)
  – QC must be problem-specific
• Maximize the use of onsite analysis to guide planning & decisions
• DQOs are the most important component of the process
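One way to see why small sample sizes tend to underestimate the mean of a skewed population is to simulate many small campaigns. This is a sketch under assumed conditions: the lognormal parameters, the campaign size of 5, and the names below are all hypothetical.

```python
import math
import random
import statistics

rng = random.Random(7)                     # fixed seed for repeatability
MU, SIGMA = 0.0, 1.5                       # hypothetical lognormal parameters
true_mean = math.exp(MU + SIGMA ** 2 / 2)  # exact mean of that lognormal

def campaign_mean(n):
    """Mean of one sampling campaign of n discrete samples."""
    return statistics.fmean([rng.lognormvariate(MU, SIGMA) for _ in range(n)])

campaigns = 5000
n_samples = 5
below = sum(campaign_mean(n_samples) < true_mean for _ in range(campaigns))
fraction_below = below / campaigns

print(f"true mean: {true_mean:.2f}")
print(f"fraction of {n_samples}-sample campaigns below the true mean: {fraction_below:.2f}")
```

In a right-skewed population most of the mass sits in rare high values, so a small campaign usually misses them and lands below the true mean more often than above it.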

11 of 45 Ramsey’s “Rules” (cont.)
• One measurement is a crap shoot:
  – Tremendous heterogeneity (variability) between:
    • Particles within a sample
    • Aliquots of a sample
    • Duplicate samples
• Never base a decision on ONE grab sample
  – Always collect X increments and use AT LEAST one multi-increment sample to make the decision

12 of 45 Average Exposure
In discrete sampling:
• the sample mean is a random variable.
• the 95% UCL is a random variable.
• the sample range is a random variable.
• the sample standard deviation is a random variable.
• the sample standard deviation is an artifact of the sample collection process.
• n (# samples) is NOT proportional to the size of the population (e.g., area, mass, or volume).

13 of 45 Average from discrete sampling is a random variable
[Figure: sampling locations labeled A and B]
Average A = 16 ppm; Average B = 221 ppm
Average depends on locations sampled

14 of 45 Hot Spots
• 1,000,000 g at site; 100,000 g > AL (AL = action level)
• Take 10 samples: 1 > AL
• Remove that 1, re-sample = “clean”
• Wrong! If 100,000 g > AL, minus 1, there are still 99,999 g > AL
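The hot-spot arithmetic above can be checked in a few lines. This sketch assumes each sample is a 1 g unit drawn independently at random from the site, which is a simplification introduced here rather than something the slide states; the variable names are illustrative.

```python
site_mass_g = 1_000_000
above_al_g = 100_000
n_samples = 10

# Fraction of the site above the action level, and expected hits in 10 samples.
fraction_above = above_al_g / site_mass_g
expected_hits = n_samples * fraction_above

# Remove the single 1 g unit that tested above the AL.
remaining_above_g = above_al_g - 1
remaining_fraction = remaining_above_g / (site_mass_g - 1)

# Chance that a fresh round of 10 samples finds nothing above the AL.
p_all_clean = (1 - remaining_fraction) ** n_samples

print(f"expected samples above AL: {expected_hits:.1f}")
print(f"mass still above AL after removal: {remaining_above_g:,} g")
print(f"chance the re-sample looks clean: {p_all_clean:.3f}")
```

Under these assumptions a "clean" re-sample is a fairly likely outcome even though almost all of the contaminated mass is still in place.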

15 of 45 “Hot Spots” Simply Means: “I want to look at units (e.g., mass, volume) that are becoming smaller and smaller and smaller and smaller and smaller and smaller and smaller” $ $ $ $ $ $ $ $ $

16 of 45 Additional Population Considerations
Sample support: “physical size, shape and orientation of the material that is extracted from the sampling unit that is actually available to be measured or observed, and therefore, to represent the sampling unit.”
• Assure enough sample for analyses.
• Specify how the sample support will be processed and sub-sampled for analysis.
Source: EPA Guidance on Choosing a Sampling Design for Environmental Data Collection, EPA QA/G-5S, December 2002, EPA/240/R-02/005

17 of 45 Sub-Sampling
The DQO must define what represents the population in terms of laboratory sample size.
Typical laboratory sample sizes that are digested or extracted: metals - 1 g, volatiles - 5 g, semi-volatiles - 30 g.
The 1 g or 30 g sample analyzed by the lab is supposed to represent a larger area/mass (e.g., an acre). Does it?

18 of 45 Multi-Increment Sampling is the Way to Go Next slides show “How to” perform multi-increment sampling

19 of 45 n = m * k (shown: k = 3, m = 2)
• Collect “n” samples
• Group into “k” increments
• Combine “k” into “m” multi-increments
(figure label: FAM/Laboratory)
Remember: we want the AVERAGE over the Decision Unit

20 of 45 Multi-Increment Sampling
n = number of samples required; k = increments; m = samples analyzed

n = m * k        Mass of m   Mass of k   Total mass sent to lab
100 = 100 * 1    1 kg        1000 g      100 kg
100 = 50 * 2     1 kg        500 g       50 kg
100 = 25 * 4     1 kg        250 g       25 kg
100 = 20 * 5     1 kg        200 g       20 kg
100 = 10 * 10    1 kg        100 g       10 kg
100 = 5 * 20     1 kg        50 g        5 kg
100 = 4 * 25     1 kg        40 g        4 kg
100 = 2 * 50     1 kg        20 g        2 kg
100 = 1 * 100    1 kg        10 g        1 kg
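The table follows a simple pattern: every multi-increment sample sent to the lab weighs 1 kg, so each of its k increments weighs 1000 g / k and the lab receives m kilograms in total. A short sketch that regenerates the rows (variable names are illustrative):

```python
n = 100               # total number of increments collected
sample_mass_g = 1000  # each multi-increment sample sent to the lab weighs 1 kg

rows = []
for m in (100, 50, 25, 20, 10, 5, 4, 2, 1):
    k, remainder = divmod(n, m)
    if remainder:
        raise ValueError(f"n={n} is not divisible by m={m}")
    increment_mass_g = sample_mass_g / k      # mass of one increment
    total_mass_kg = m * sample_mass_g / 1000  # total mass sent to the lab
    rows.append((m, k, increment_mass_g, total_mass_kg))

for m, k, inc_g, total_kg in rows:
    print(f"{n} = {m} * {k}: increment {inc_g:g} g, total to lab {total_kg:g} kg")
```

Holding n fixed, raising k cuts the number of lab analyses (m) while the same 100 locations are still physically represented.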

21 of 45 Multi-Increment Sampling is the Way to Go (flowchart)
Exposure unit = decision unit [DU] (1) → Calculate d, FE, and mass (2, 3, 4) → 10 scoops (5) → Samples & QC (6) → Lab (7) → Re-calculate particle size (8) → Grind (9) → Sub-sample mass for lab analysis (10) → Analyze entire sub-sample (11) → Average concentration for DU (12, 13)

22 of 45 Multi-Increment Sampling is the Way to Go
1. Agree on exposure unit or decision unit.
2. Select or measure a reasonable maximum sample particle size.
3. Select the FE.
4. Calculate the mass of sample needed based on the FE and particle size.
5. Select n, m, & k.
6. Using a square scoop large enough to capture the maximum particle size, collect enough sample increments (k) to equal the mass calculated in #4 and place in a jar, combining increments into one “sample”.
7. Repeat within a given decision unit to produce replicates (duplicates, triplicates, etc.) to generate QC “samples”.
8. Deliver the sample and QC sample(s) to the lab (m).

23 of 45 Multi-Increment Sampling is the Way to Go, continued
9. Calculate the particle size of sample needed based on the desired sub-sampling FE and the mass that the lab normally uses for a given analysis (extraction).
10. Lab may have to grind entire mass of field sample (& QCs) to the agreed-upon maximum analytical particle size in #9.
11. Lab must perform one-dimensional sub-sampling of entire mass [spread entire ground sample on flat surface in thin layer, then systematically or randomly collect sufficient small-mass sub-sampling increments to equal the mass the laboratory requires for an analysis; do likewise for each QC sample].
12. Combine sub-sampling increments into the “sample”, then digest/extract/analyze the sample and QC samples.
13. Calculate the COPC concentration from each sample.
14. Concentration represents average concentration or activity per decision unit.

24 of 45 Comparison of Discrete vs. Multi-Increment
Remember (in discrete sampling):
1. An average is a random variable;
2. The SD is an artifact of the sample collection process.

25 of 45 SHOW VDT File X-bar as Random Variable

26 of 45 Effects of Grinding a Soil Walsh, Marianne E.; Ramsey, Charles A.; Jenkins, Thomas F., The Effect of Particle Size Reduction by Grinding on Subsampling Variance for Explosives Residues in Soil, Chemosphere 49 (2002)

27 of 45 Fundamental Error
$FE^2 \approx \dfrac{22.5\, d^3}{M}$
FE = fundamental error
M = mass of sample (g)
d = maximum particle size, <5% oversize (cm)
EPA/600/R-92/128, July 1992

28 of 45 Fundamental Error
$FE^2 \approx \dfrac{22.5\, d^3}{M}$, with $22.5 \approx c \cdot l \cdot f \cdot g$
• c - mineralogical factor - density factor (for soil ~2.5)
• l - liberation factor (between 0 and 1)
• f - shape factor (for soil ~0.5)
• g - granulometric factor (~0.25)

29 of 45 Fundamental Error
Solve for particle size: $d = \left(\dfrac{FE^2\, M}{22.5}\right)^{1/3}$
OR
Solve for mass of sample: $M = \dfrac{22.5\, d^3}{FE^2}$
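The three forms of the relationship can be written as small helpers. This is a sketch that assumes the approximation $FE^2 \approx 22.5\, d^3 / M$ with d in cm, M in grams, and FE as a fraction; the function names and the demo values (0.5 cm, 20%) are hypothetical.

```python
import math

GY_CONSTANT = 22.5  # assumed lumped sampling constant (c*l*f*g)

def fundamental_error(mass_g, d_cm):
    """Relative fundamental error (as a fraction) for a sample mass and top particle size."""
    return math.sqrt(GY_CONSTANT * d_cm ** 3 / mass_g)

def mass_for_fe(fe, d_cm):
    """Sample mass (g) needed to reach a target FE at a given particle size."""
    return GY_CONSTANT * d_cm ** 3 / fe ** 2

def particle_size_for_fe(fe, mass_g):
    """Maximum particle size (cm) that reaches a target FE at a given sample mass."""
    return (fe ** 2 * mass_g / GY_CONSTANT) ** (1 / 3)

# Round trip with hypothetical inputs: 0.5 cm particles and a 20% target FE.
demo_mass_g = mass_for_fe(0.20, 0.5)
demo_fe = fundamental_error(demo_mass_g, 0.5)
demo_d_cm = particle_size_for_fe(0.20, demo_mass_g)
print(f"mass: {demo_mass_g:.4f} g, FE: {demo_fe:.2f}, d: {demo_d_cm:.2f} cm")
```

Because d enters as a cube, halving the particle size cuts the required mass by a factor of eight, which is why grinding is usually cheaper than shipping more material.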

30 of 45 Constant Particle Size
Mass      FE
9217 g    20%
4097 g    30%
Particle size (cm):
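The particle-size value did not survive in this slide's text, but it can be back-solved from the two mass/FE pairs. The sketch below assumes the approximation $FE^2 \approx 22.5\, d^3 / M$; the helper name is hypothetical and the result is an inference, not a figure quoted from the slide.

```python
GY_CONSTANT = 22.5  # assumed lumped sampling constant

def implied_particle_size_cm(mass_g, fe):
    """Particle size (cm) implied by a given sample mass and FE (as a fraction)."""
    return (fe ** 2 * mass_g / GY_CONSTANT) ** (1 / 3)

d_from_20_percent = implied_particle_size_cm(9217, 0.20)
d_from_30_percent = implied_particle_size_cm(4097, 0.30)
print(f"d implied by 9217 g at 20%: {d_from_20_percent:.3f} cm")
print(f"d implied by 4097 g at 30%: {d_from_30_percent:.3f} cm")
```

Both rows imply essentially the same particle size, roughly 2.54 cm, which is consistent with the slide holding particle size constant while mass and FE vary.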

31 of 45 Examples of FE, Mass, Particle Size

32 of 45 Examples of FE, Mass, Particle Size
May not work well, or at all, with some media:
• Clay
• Water
• Air

33 of 45 Example
• Soil-like material
• Largest particle about 4 mm
• Action limit is 500 ppm
• Analytical aliquot is one gram
• Is this acceptable?
Compliments of EnviroStat, Inc.

34 of 45 Example (cont.)
Check particle size representativeness (EPA/600/R-92/128, July 1992), with d = 0.4 cm and M = 1 g:
$FE = \sqrt{\dfrac{22.5\, d^3}{M}} = \sqrt{\dfrac{22.5 \times (0.4)^3}{1}} = 1.2$
FE percent = 1.2 * 100 = 120%

35 of 45 Example (cont.) What mass is required to reduce FE to 15%? But the lab can analyze 10 grams at most.
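The slide's worked equation is not in the transcript. Assuming the approximation $FE^2 \approx 22.5\, d^3 / M$ (the form that reproduces the 120% figure for a 1 g aliquot at 4 mm), the arithmetic would run as follows:

```latex
\[
M = \frac{22.5\, d^3}{FE^2}
  = \frac{22.5 \times (0.4\ \text{cm})^3}{(0.15)^2}
  = \frac{1.44}{0.0225}
  = 64\ \text{g}
\]
```

That is well above the 10 g the lab can analyze, so collecting more mass alone does not solve the problem at a 4 mm particle size.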

36 of 45 Example (cont.) To what particle size does the sample need to be reduced to achieve an FE of 15%?
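Again assuming $FE^2 \approx 22.5\, d^3 / M$, and taking the one-gram analytical aliquot from the start of the example, the required particle size works out as:

```latex
\[
d = \left(\frac{FE^2\, M}{22.5}\right)^{1/3}
  = \left(\frac{(0.15)^2 \times 1\ \text{g}}{22.5}\right)^{1/3}
  = (0.001)^{1/3}
  = 0.1\ \text{cm}
\]
```

So a one-gram aliquot would need material ground to about 1 mm.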

37 of 45 Example (cont.) What is the FE if you take 64 grams, grind it to 0.1 cm, and take one gram (ignoring all the other errors)?
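The slide's answer is not in the transcript. One way to estimate it, assuming $FE^2 \approx 22.5\, d^3 / M$ and assuming the two stages are independent so that their variances add (both are assumptions made here, not statements from the slide):

```latex
\[
FE_{\text{field}}^2 = \frac{22.5 \times (0.4)^3}{64} = 0.0225,
\qquad
FE_{\text{sub}}^2 = \frac{22.5 \times (0.1)^3}{1} = 0.0225
\]
\[
FE_{\text{total}} = \sqrt{FE_{\text{field}}^2 + FE_{\text{sub}}^2}
  = \sqrt{0.045} \approx 0.21 \quad (\text{about } 21\%)
\]
```

Each stage on its own meets 15%, but under this assumption the combined fundamental error is larger than either stage alone.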

38 of 45 Example (cont.)
• Option 1
  – take at least 64 grams and grind to 0.1 cm
  – analyze one gram
• Option 2
  – take at least 64 grams and grind to 0.22 cm
  – analyze 10 grams
• Other options
  – investigate/estimate sampling factors (clfg)
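The two options can be compared numerically. This sketch assumes $FE^2 \approx 22.5\, d^3 / M$ for each stage, a field sample collected at the original 0.4 cm particle size, and independent stages whose variances add; the function names are illustrative.

```python
import math

GY_CONSTANT = 22.5  # assumed lumped sampling constant

def stage_fe(mass_g, d_cm):
    """Fundamental error (fraction) of one sampling stage."""
    return math.sqrt(GY_CONSTANT * d_cm ** 3 / mass_g)

def combined_fe(*stage_errors):
    """Combine independent stage errors by adding their variances (an assumption)."""
    return math.sqrt(sum(fe ** 2 for fe in stage_errors))

field_fe = stage_fe(64, 0.4)  # 64 g field sample at 4 mm

# Particle size that gives 15% FE for a 10 g aliquot (Option 2's grind target).
d_for_10_g = (0.15 ** 2 * 10 / GY_CONSTANT) ** (1 / 3)

option_1 = combined_fe(field_fe, stage_fe(1, 0.1))    # grind to 0.1 cm, analyze 1 g
option_2 = combined_fe(field_fe, stage_fe(10, 0.22))  # grind to 0.22 cm, analyze 10 g

print(f"field-stage FE: {field_fe:.3f}")
print(f"grind target for a 10 g aliquot: {d_for_10_g:.3f} cm")
print(f"Option 1 combined FE: {option_1:.3f}")
print(f"Option 2 combined FE: {option_2:.3f}")
```

Under these assumptions the two options give nearly the same combined fundamental error, so the choice between them comes down to grinding effort versus the aliquot mass the lab can handle.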

39 of 45 Multi-increment Sampling
• Saves money by taking fewer samples to make decision
• Eliminates the classical statistics obstacles
• Samples are representative of population
• Results are defensible
• Does not excite the public
• Faster
• Cheaper

40 of 45 Key Points
• All measurements are an average
• In discrete sampling,
  – the sample average is a random variable
  – the sample range is a random variable
  – the sample UCL is a random variable
  – the sample standard deviation is a random variable
• In discrete sampling, the SD is an artifact of the sample collection process
• Heterogeneity is the rule
• Multi-increment sampling can save your butt!
• Multi-increment sampling can get you defensible data within your sampling & analyses budget

41 of 45 Key Points (cont.)
• Due to inherent heterogeneity, collecting a representative sample is difficult
• Managing Uncertainty approach and “Ramsey’s Rules” advocate
  – using cheaper, real-time, on-site methods
  – increasing sample density or coverage
• Controlling laboratory analysis quality does not control all error
• Errors occur in each step of the collection and analysis process

42 of 45 Key Points (cont.)
• Managing Uncertainty approach encourages use of DWP to provide flexibility to obtain sufficient sample density
• The larger the “mass”, the lower the sampling error
• The smaller the “particle”, the lower the sampling error
• Proper sub-sampling is critical
• Sample design must assess the normal, skewed, and badly skewed distributions
• For badly skewed distributions, computer simulations are needed
• Multi-increment samples drive the distribution to normal

43 of 45 How Many Samples do I Need? REMEMBER: HETEROGENEITY IS THE RULE!

44 of 45 Summary
• Use Classical Statistical sampling approach: Very likely to fail to get representative data in most cases
• Use Other Statistical sampling approaches: Bayesian, Geo-statistics, Kriging
• Use M-Cubed Approach: Based on Massive FAM
• Use Multi-Increment sampling approach: Can use classical statistics; Cheaper; Faster; Defensible: restricted to surfaces (soils, sediments, etc.)
MASSIVE DATA Required

45 of 45 End of Module 6 Thank you Questions? Comments? This concludes our presentation for Day 1 See you here at 8:30 AM tomorrow for Day 2.