Published by Neil Lucas. Modified over 9 years ago.
1 of 45 How Many Samples do I Need? Part 1
DQO Training Course, Day 1, Module 4
Presenter: Sebastian Tindall
60 minutes (15-minute 1st afternoon break)
2 of 45 Topics to Discuss in Module 4
• How many samples, based on
  – Census
  – Sampling
• Types of decision error
• Definitions of common statistical terms
3 of 45 How Many Samples do I Need?
• Quick & Dirty Method: n = 5
• Budget Method: n = (total $) / ($ per sample)
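As a small illustration of the Budget Method, the sketch below divides a total budget by the cost per sample and rounds down to a whole number of samples. The function name and the dollar amounts are hypothetical and are not taken from the course material.

```python
def budget_method_n(total_dollars: float, dollars_per_sample: float) -> int:
    """Return how many whole samples the budget can pay for."""
    if dollars_per_sample <= 0:
        raise ValueError("dollars_per_sample must be positive")
    # Floor division: a partially funded sample cannot be analyzed.
    return int(total_dollars // dollars_per_sample)


# Hypothetical figures: a $60,000 budget at $3,000 per sample.
n = budget_method_n(60_000, 3_000)
print(f"Budget Method: n = {n}")
```

Nothing in this calculation reflects the variation in the material, the decision being made, or the tolerance for mistakes.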
4 of 45 How Many Samples do I Need? It depends!
• What is the underlying variation in the material being sampled?
• What is the decision?
• What is the tolerance for mistakes?
• How will the data be used?
5 of 45 How Many Samples do I Need? (The Real Answer) Just Enough!
6 of 45 How Many Samples do I Need? REMEMBER: HETEROGENEITY IS THE RULE!
7 of 45 Decisions with Absolute Certainty
• Requires knowing the “true condition” of the population in question
  – Perform a census
• Collect and analyze every possible member of the population in question
8 of 45 Decisions with Absolute Certainty (cont.)
• Population
  – Universe of items (elements) within the spatial boundary
    • All the possible soil samples in the Smiths’ backyard
    • All the people in the U.S.A.
  – Translation: you have to count/measure (sample) EVERY single member of the population
9 of 45 Football Field
[Figure: One-Acre Football Field, with a dimension label of 30'0"]
10 of 45 Number of Samples in a One-Acre Field
How many surface soil samples can I take from a one-acre field?
• A one-acre field measures 272.25 feet by 160 feet (43,560 square feet).
• If one surface soil sample = 2.5” x 2.5” x 6” deep, then...
• ...there are approximately 1,000,000 possible surface soil samples in a one-acre field.
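The figure of roughly one million samples can be checked with a few lines of arithmetic. The sketch below uses only the dimensions given on the slide (a 272.25 ft by 160 ft field and a 2.5 in by 2.5 in sample footprint); the variable names are illustrative.

```python
import math

FIELD_LENGTH_FT = 272.25
FIELD_WIDTH_FT = 160.0
SAMPLE_SIDE_IN = 2.5
INCHES_PER_FOOT = 12

length_in = FIELD_LENGTH_FT * INCHES_PER_FOOT
width_in = FIELD_WIDTH_FT * INCHES_PER_FOOT

# Estimate 1: field surface area divided by the footprint of one sample.
samples_by_area = (length_in * width_in) / SAMPLE_SIDE_IN**2

# Estimate 2: whole samples that fit on a square grid, with no partial
# samples along the edges.
samples_by_grid = (math.floor(length_in / SAMPLE_SIDE_IN)
                   * math.floor(width_in / SAMPLE_SIDE_IN))

print(f"By area: {samples_by_area:,.0f} possible samples")
print(f"By grid: {samples_by_grid:,} possible samples")
```

Both estimates land slightly above one million, which is consistent with the rounded figure on the slide.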
11 of 45 Cost of Sampling Entire One-Acre Field
How much would it cost to know the true condition of the one-acre field?
If it costs $3,000 to test one surface soil sample, it would cost $3,000,000,000 to test all possible population units.
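The census cost follows directly from the two numbers on the slides: $3,000 per sample and about 1,000,000 possible samples. The sketch below repeats that multiplication and, for contrast, prices a handful of samples; the helper name `sampling_cost` and the choice of five samples are illustrative.

```python
COST_PER_SAMPLE = 3_000        # dollars, from the slide
POSSIBLE_SAMPLES = 1_000_000   # rounded count for the one-acre field


def sampling_cost(n_samples: int) -> int:
    """Cost in dollars of collecting and testing n_samples samples."""
    return n_samples * COST_PER_SAMPLE


census_cost = sampling_cost(POSSIBLE_SAMPLES)
few_samples_cost = sampling_cost(5)

print(f"Census of the field: ${census_cost:,}")
print(f"Five samples:        ${few_samples_cost:,}")
```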
12 of 45 Testing All Possible Samples: CENSUS
• Testing all possible population units (samples) is the ONLY way to know the true condition of the site with absolute certainty
• However, time and money considerations usually prevent us from doing this
13 of 45 Decisions with Absolute Certainty
• Perform a census
  – Totally impractical
• Therefore, we can never make a decision with absolute certainty
• So what’s left to do?
14 of 45 Testing a Few Samples (from the larger population): ESTIMATION
• Estimates of the true condition of the site are usually made from a few (representative) samples
  – Taking a few samples (making a few measurements) and using them to represent the site
  – Making inferences (even sweeping claims) about the population of interest based on these few samples
15 of 45 The Process of Estimation
• An estimate is just an educated guess based on incomplete information
• Educated guesses will be wrong, to some degree
• In other words, the process of estimation contains inherent errors
16 of 45 Estimation Errors
• Are NOT mistakes. They do not suggest that anything was done improperly
• Are an inherent part of the process of estimation
• Are simply deviations from the true condition of the site
• Introduce uncertainty into the decision-making process
17 of 45 Consequences of Uncertainty
[Diagram labels: Estimation Errors, Decision Errors]
• Decision errors are true mistakes
• Examples:
  – Walking away from a dirty site
  – Cleaning up a clean site
• Decision errors can be managed
18 of 45 Decision Errors
• Are acceptable or tolerable... within limits
• We set tolerable limits on the percentage of time we are willing to:
  – Walk away from a dirty site
  – Clean up a clean site
19 of 45 Where do errors occur?
[Diagram labels: Planning, Sampling, Analysis, Data vs. Decision]
20 of 45 Definition of Terms
• Population
  – Everyone or everything of interest
  – Example: All the people in this class
• Sample
  – Some subset of the population
  – Example: Five people randomly chosen from the class
21 of 45 Definition of Terms
• Population Parameter
  – The true value of the population characteristic (e.g., age) that can only be known if all possible samples are measured
  – Example: true mean age of all the people in the class, calculated using data from every member of the population
• Sample Statistic
  – The estimated value of the population characteristic that is calculated from sample data
  – Example: estimate of the true mean age of all people in the class, calculated using data from a subset (sample) of the population
22 of 45 Comparison
• Population Parameter
  – Represents “true condition” of the population
  – Decisions can be made with 100% certainty (0% uncertainty)
• Sample Statistic
  – Represents “estimated condition” of the population
  – Decisions cannot be made with 100% certainty
23 of 45 Class Question?
• What is the true mean age in this class?
• What is the estimated mean age in this class?
  – Randomly select 5 ages
• 2nd estimated mean age in this class?
  – Randomly select 15 ages
(See Computer Age Demo)
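The class exercise can be imitated with a short simulation. The sketch below is not the course's Computer Age Demo; it invents a hypothetical class of 30 people with made-up ages, computes the true mean from every member (a census), and then estimates the mean from random samples of 5 and 15 ages.

```python
import random
import statistics

# Hypothetical population: 30 made-up ages. The seed only makes the
# run repeatable; it has no other significance.
rng = random.Random(42)
class_ages = [rng.randint(22, 65) for _ in range(30)]

# Population parameter: uses every member of the population.
true_mean = statistics.mean(class_ages)


def estimated_mean(population, n, rng):
    """Sample statistic: mean of n ages drawn without replacement."""
    return statistics.mean(rng.sample(population, n))


est_5 = estimated_mean(class_ages, 5, rng)
est_15 = estimated_mean(class_ages, 15, rng)

print(f"True mean age (census of 30): {true_mean:.1f}")
print(f"Estimate from 5 ages:         {est_5:.1f}")
print(f"Estimate from 15 ages:        {est_15:.1f}")
```

Each estimate generally differs from the true mean by some amount. Rerunning with a different seed gives different estimates, while the true mean of a given class stays fixed.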
24 of 45 True Mean Age of All the People in This Class
• In this case - where we are only interested in measuring a small group of people who are all in the same room at the same time - it is not too difficult to determine the true mean age with 100% certainty. But:
  – What if some people failed to respond?
  – What if some people “fudged” a little?
  – What if some of the response forms got lost?
25 of 45 Types of Decision Errors
• Before we can talk about acceptable limits for making decision errors, we must first understand what correct decisions and decision errors look like and define some terms
• There are two types of correct decisions and two types of decision errors that can be made
26 of 45 Graph of Perfect Decision Making
[Figure: Ideal Decision Rule. y-axis: Chance of Deciding Site is Dirty (0.0, 0.5, 1.0); x-axis: True Mean ²²⁶Ra Concentration (Low to High); Action Level at 6 pCi/g]
27 of 45 Graph of Typical Decision Making
[Figure: Typical Curve. y-axis: Chance of Deciding Site is Dirty (0.0, 0.5, 1.0); x-axis: True Mean ²²⁶Ra Concentration (Low to High); Action Level at 6 pCi/g]
28 of 45 Decision Performance Goal Diagram
[Figure: Typical Curve. Null Hypothesis: The Site is dirty. y-axis: Probability of deciding that the site is dirty (0.0, 0.5, 1.0); x-axis: True mean COPC Concentration, with 75 and 100 marked; Lower Bound of Gray Region, Action Level, and The Gray Region labeled. True State of Site: Site is dirty / Site is clean. Alternative Action: Walk away from site / Clean up site]
29 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: axis of 95% UCL COPC Concentration running from DL PSQ to ∞, with the Action Level at 100; sample mean X_A = 75; UCL 1A = 95 and UCL 1B = 110. Is Site dirty? / Is Site clean? Alternative Action: Walk away from site / Clean up site]
30 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: axis of 95% UCL COPC Concentration running from DL PSQ to ∞, with the Action Level at 100; sample mean X_B = 110; UCL B = 120. Is Site dirty? / Is Site clean? Alternative Action: Walk away from site / Clean up site]
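The decision rule on these two slides compares a 95% upper confidence limit (UCL) on the sample mean with the Action Level of 100. The sketch below is one way that comparison might be coded. It makes several assumptions the slides do not state: the UCL is computed with a one-sided Student's t interval, a UCL exactly equal to the Action Level counts as dirty, and the five measurements are invented so that their mean is 75. The t value 2.132 is the standard one-sided 95% critical value for 4 degrees of freedom.

```python
import math
import statistics


def ucl_of_mean(measurements, t_critical):
    """One-sided upper confidence limit on the mean (t-based; an assumption)."""
    n = len(measurements)
    mean = statistics.mean(measurements)
    std_err = statistics.stdev(measurements) / math.sqrt(n)
    return mean + t_critical * std_err


def apply_decision_rule(ucl, action_level):
    """Null hypothesis: the site is dirty. Walk away only if the UCL is below the Action Level."""
    if ucl >= action_level:
        return "clean up site"
    return "walk away from site"


ACTION_LEVEL = 100.0
T_95_DF4 = 2.132  # one-sided 95% t value for n - 1 = 4 degrees of freedom

# Hypothetical measurements with a sample mean of 75.
measurements = [60.0, 70.0, 75.0, 80.0, 90.0]
ucl = ucl_of_mean(measurements, T_95_DF4)
print(f"Sample mean = {statistics.mean(measurements):.1f}, 95% UCL = {ucl:.2f}")
print("Decision:", apply_decision_rule(ucl, ACTION_LEVEL))

# The UCL values that appear on the two slides: 95, 110, and 120.
for slide_ucl in (95.0, 110.0, 120.0):
    print(f"UCL = {slide_ucl:.0f}: {apply_decision_rule(slide_ucl, ACTION_LEVEL)}")
```

With a sample mean of 75, the decision depends on how wide the confidence interval is: a UCL of 95 stays below the Action Level, while a UCL of 110 does not.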
31 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: axis of 95% UCL COPC Concentration running from DL PSQ to ∞, with the Action Level at 100; shows True Mean, Sample Mean, UCL, and Deviation. Is Site dirty? / Is Site clean? Alternative Action: Walk away from site / Clean up site]
Conclusion: Site is dirty.
Action: Clean up a dirty site.
A correct decision.
32 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: axis of 95% UCL COPC Concentration running from DL PSQ to ∞, with the Action Level at 100; shows True Mean, Sample Mean, UCL, and Deviation. Is Site dirty? / Is Site clean? Alternative Action: Walk away from site / Clean up site]
Conclusion: Site is clean.
Action: Walk away from a dirty site.
An incorrect decision.
33 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: axis of 95% UCL COPC Concentration running from DL PSQ to ∞, with the Action Level at 100; shows True Mean, Sample Mean, UCL, and Deviation. Is Site dirty? / Is Site clean? Alternative Action: Walk away from site / Clean up site]
Conclusion: Site is clean.
Action: Walk away from a clean site.
A correct decision.
34 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: axis of 95% UCL COPC Concentration running from DL PSQ to ∞, with the Action Level at 100; shows True Mean, Sample Mean, UCL, and Deviation. Is Site dirty? / Is Site clean? Alternative Action: Walk away from site / Clean up site]
Conclusion: Site is dirty.
Action: Clean up a clean site.
An incorrect decision.
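The four outcomes walked through on the preceding slides (two correct decisions and two decision errors) can be summarized in a small function. The sketch below assumes that a site is dirty when its true mean is at or above the Action Level and that the site is judged dirty when the UCL is at or above the Action Level. The function name and the example concentrations are hypothetical. In practice the true mean is unknown, so this only illustrates the logic.

```python
def classify_decision(true_mean, ucl, action_level):
    """Return (action taken, whether the decision was correct)."""
    site_is_dirty = true_mean >= action_level   # true state, normally unknown
    decided_dirty = ucl >= action_level         # what the data lead us to conclude

    verb = "Clean up" if decided_dirty else "Walk away from"
    state = "dirty" if site_is_dirty else "clean"
    return f"{verb} a {state} site", decided_dirty == site_is_dirty


ACTION_LEVEL = 100.0

# Hypothetical (true mean, UCL) pairs, one for each of the four outcomes.
cases = [(120.0, 130.0), (105.0, 95.0), (80.0, 90.0), (90.0, 105.0)]
for true_mean, ucl in cases:
    action, correct = classify_decision(true_mean, ucl, ACTION_LEVEL)
    verdict = "a correct decision" if correct else "an incorrect decision"
    print(f"True mean {true_mean:.0f}, UCL {ucl:.0f}: {action} ({verdict})")
```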
35 of 45
[Figure: Decision Performance Goal Diagram. Null Hypothesis: The Site is dirty. y-axis: Probability of deciding that the True Mean is greater than or equal to the Action Level (0.0, 0.5, 1.0); x-axis: True mean COPC Concentration, with 75 and 100 marked; Lower Bound of Gray Region, Action Level, and The Gray Region labeled; shows True Mean, Sample Mean, UCL, and Deviation. True State of Site: Site is dirty / Site is clean. Alternative Action: Walk away from site / Clean up site]
When the True Mean is well above the Action Level...
... then there should be a high probability that the Sample Mean UCL will also be above the Action Level...
... and it is highly likely that we will correctly decide to clean up a dirty site.
36 of 45
[Figure: Decision Performance Goal Diagram. Null Hypothesis: The Site is dirty. y-axis: Probability of deciding that the site is dirty (0.0, 0.5, 1.0); x-axis: True mean COPC Concentration, with 75 and 100 marked; Lower Bound of Gray Region, Action Level, and The Gray Region labeled; shows True Mean, Sample Mean, UCL, and Deviation. True State of Site: Site is dirty / Site is clean. Alternative Action: Walk away from site / Clean up site]
If the True Mean is well below the Lower Bound of the Gray Region...
... then there should be a very low probability that the Sample Mean UCL will be above the Action Level...
... and it is highly unlikely that we will incorrectly decide to clean up a clean site.
37 of 45
[Figure: Decision Performance Goal Diagram. Null Hypothesis: The Site is dirty. y-axis: Probability of deciding that the site is dirty (0.0, 0.5, 1.0); x-axis: True mean COPC Concentration, with 75 and 100 marked; Lower Bound of Gray Region, Action Level, and The Gray Region labeled; shows True Mean, Sample Mean, UCL, and Deviation. True State of Site: Site is dirty / Site is clean. Alternative Action: Walk away from site / Clean up site]
When the True Mean is IN the gray region...
... then there is an increased probability that the Sample Mean UCL will be above the Action Level...
... and that we will agree to incorrectly decide to clean up a clean site.
38 of 45 Decision Performance Goal Diagram
[Figure: Typical Curve. Null Hypothesis: The Site is dirty. y-axis: Probability of deciding that the site is dirty (0.0, 0.5, 1.0); x-axis: True mean COPC Concentration, with 75 and 100 marked; Lower Bound of Gray Region, Action Level, and The Gray Region labeled. True State of Site: Site is dirty / Site is clean. Alternative Action: Walk away from site / Clean up site]
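The shape of the typical curve can be reproduced numerically. The sketch below computes the probability of deciding that the site is dirty as a function of the true mean, under simplifying assumptions that are not part of the slides: measurements are normally distributed with a known standard deviation, the UCL is the sample mean plus a z-multiple of the standard error, and the site is judged dirty when the UCL is at or above the Action Level. The Action Level of 100 comes from the slides. The standard deviation of 30, the nine samples, and the reading of 75 as the Lower Bound of the Gray Region are assumptions.

```python
import math
from statistics import NormalDist


def prob_decide_dirty(true_mean, action_level, sigma, n, confidence=0.95):
    """P(UCL >= action level) for a given true mean (normal model, known sigma)."""
    std_err = sigma / math.sqrt(n)
    z = NormalDist().inv_cdf(confidence)
    # UCL = sample mean + z * std_err, so the UCL reaches the action level
    # exactly when the sample mean reaches this threshold.
    threshold = action_level - z * std_err
    return 1.0 - NormalDist(true_mean, std_err).cdf(threshold)


ACTION_LEVEL = 100.0
SIGMA = 30.0     # assumed population standard deviation
N_SAMPLES = 9    # assumed number of samples

for true_mean in (40, 75, 90, 100, 150):
    p = prob_decide_dirty(true_mean, ACTION_LEVEL, SIGMA, N_SAMPLES)
    print(f"True mean {true_mean:>3}: P(decide dirty) = {p:.3f}")
```

Under these assumptions the probability is close to 0 well below the gray region, close to 1 well above the Action Level, and takes intermediate values in between. Increasing `N_SAMPLES` or decreasing `SIGMA` steepens the curve.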
39 of 45 Managing Uncertainty is a Balancing Act
[Figure: two balances. PRP 1 Focus: Sampling and Analyses Cost vs. Unnecessary Disposal and/or Cleanup Cost ($$). Regulatory 1 Focus: Sampling and Analyses Cost vs. Threat to Public Health and Environment ($$)]
40 of 45 Key Points
• We will never know the true condition of the site - time and money prevent this
• Therefore we must estimate the true condition through sampling
• Estimates based on samples are not factual statements about the site. They are educated guesses
• Estimates must be in error - because they use incomplete information
41 of 45 Key Points (cont.)
• Errors are not mistakes - just deviations from the truth
• Errors (deviations) introduce uncertainty into the decision-making process
• Errors and uncertainty can be managed so that you can still get the job done and prove that you did it
42 of 45 Key Points (cont.)
• The DQO Process is designed to help you manage uncertainty and:
  – Get the job done efficiently
  – Prove that you did it defensibly
43 of 45 Primary Benefit of the DQO Process: Managing uncertainty through “FAILING TO PLAN….. IS PLANNING TO FAIL”
44 of 45 How Many Samples do I Need? REMEMBER: HETEROGENEITY IS THE RULE!
45 of 45 End of Module 4
Summary of Parts 1, 2, 3 will be at the end of Module 6
Questions?
Thank you
We will now take a 15-minute break. Please be back in 15 minutes.