1 of 45 How Many Samples do I Need? Part 1 Presenter: Sebastian Tindall 60 minutes (15 minute 1st Afternoon Break) DQO Training Course Day 1 Module 4.

1 1 of 45 How Many Samples do I Need? Part 1 Presenter: Sebastian Tindall 60 minutes (15 minute 1st Afternoon Break) DQO Training Course Day 1 Module 4

2 2 of 45 Topics to Discuss in Module 4
• How many samples based on
  – Census
  – Sampling
• Types of decision error
• Definitions of common statistical terms

3 3 of 45 How Many Samples do I Need?
Quick & Dirty Method: n = 5
Budget Method: n = (total $) ÷ ($ per sample)

4 4 of 45 How Many Samples do I Need? It depends!
• What is the underlying variation in the material being sampled?
• What is the decision?
• What is the tolerance for mistakes?
• How will the data be used?

5 5 of 45 How Many Samples do I Need? (The Real Answer) Just Enough!

6 6 of 45 How Many Samples do I Need? REMEMBER: HETEROGENEITY IS THE RULE!

7 7 of 45 Decisions with Absolute Certainty
• Requires knowing the “true condition” of the population in question
  – Perform a census
    • Collect and analyze every possible member of the population in question

8 8 of 45 Decisions with Absolute Certainty (cont.)
• Population
  – Universe of items (elements) within the spatial boundary
    • All the possible soil samples in the Smiths’ backyard
    • All the people in the U.S.A.
  – Translation: you have to count/measure (sample) EVERY single member of the population

9 9 of 45 One-Acre Football Field
[Figure: a football field with a one-acre area marked; dimension label 30'0"]

10 10 of 45 Number of Samples in a One-Acre Field
How many surface soil samples can I take from a one-acre field?
A one-acre field measures 272.25 feet by 160 feet (43,560 square feet).
If one surface soil sample = 2.5” x 2.5” x 6” deep, then there are approximately 1,000,000 possible surface soil samples in a one-acre field.
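The arithmetic on this slide can be checked with a short sketch. It assumes each sample occupies a 2.5 in x 2.5 in footprint with no gaps or overlap and that only the surface layer is counted; the function and variable names are illustrative and do not come from the slides.

```python
# Count non-overlapping 2.5 in x 2.5 in surface sample footprints in one acre.
FIELD_LENGTH_FT = 272.25   # from the slide
FIELD_WIDTH_FT = 160.0     # from the slide
SAMPLE_SIDE_IN = 2.5       # sample footprint is 2.5 in x 2.5 in (depth is ignored)


def possible_surface_samples(length_ft, width_ft, side_in):
    """Field area divided by the footprint of one sample, both in square inches."""
    field_area_sq_in = (length_ft * 12) * (width_ft * 12)
    return field_area_sq_in / (side_in ** 2)


area_sq_ft = FIELD_LENGTH_FT * FIELD_WIDTH_FT
n_units = possible_surface_samples(FIELD_LENGTH_FT, FIELD_WIDTH_FT, SAMPLE_SIDE_IN)

print(f"Field area: {area_sq_ft:,.0f} sq ft")
print(f"Possible surface samples: {n_units:,.0f}")
```

Under these assumptions the count comes to a little over one million, which the slide rounds to 1,000,000.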

11 11 of 45 Cost of Sampling Entire One-Acre Field How much would it cost to know the true condition of the one-acre field? If it costs $3000 to test one surface soil sample, it would cost $3,000,000,000 to test all possible population units.
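A minimal sketch of the cost arithmetic, set beside the "budget method" from slide 3. The $3,000 unit cost and the 1,000,000 population units are from the slides; the $60,000 budget and the function names are hypothetical and used only for illustration.

```python
COST_PER_SAMPLE_USD = 3_000     # from the slide
POPULATION_UNITS = 1_000_000    # rounded count of possible samples in one acre


def census_cost(units, cost_per_sample):
    """Cost of testing every possible population unit."""
    return units * cost_per_sample


def budget_sample_size(total_budget, cost_per_sample):
    """Budget method: the number of whole samples the budget can pay for."""
    return total_budget // cost_per_sample


print(f"Census cost: ${census_cost(POPULATION_UNITS, COST_PER_SAMPLE_USD):,}")

# Hypothetical budget (not from the slides), shown for comparison.
HYPOTHETICAL_BUDGET_USD = 60_000
n = budget_sample_size(HYPOTHETICAL_BUDGET_USD, COST_PER_SAMPLE_USD)
print(f"Samples affordable on ${HYPOTHETICAL_BUDGET_USD:,}: {n}")
```

The budget method returns a sample count, but it says nothing about whether that count supports the decision at an acceptable error rate.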

12 12 of 45 Testing All Possible Samples: CENSUS
• Testing all possible population units (samples) is the ONLY way to know the true condition of the site with absolute certainty
• However, time and money considerations usually prevent us from doing this

13 13 of 45 Decisions with Absolute Certainty
• Perform a census
  – Totally impractical
• Therefore, we can never make a decision with absolute certainty
• So what’s left to do?

14 14 of 45 Testing a Few Samples (from the larger population): ESTIMATION
• Estimates of the true condition of the site are usually made from a few (representative) samples
  – Taking a few samples (making a few measurements) and using them to represent the site
  – Making inferences (even sweeping claims) about the population of interest based on these few samples

15 15 of 45 The Process of Estimation
• An estimate is just an educated guess based on incomplete information
• Educated guesses will be wrong, to some degree
• In other words, the process of estimation contains inherent errors

16 16 of 45 Estimation Errors
• Are NOT mistakes. They do not suggest that anything was done improperly
• Are an inherent part of the process of estimation
• Are simply deviations from the true condition of the site
• Introduce uncertainty into the decision-making process

17 17 of 45 Consequences of Uncertainty
[Diagram labels: Estimation Errors, Decision Errors]
• Decision errors are true mistakes
• Examples:
  – Walking away from a dirty site
  – Cleaning up a clean site
• Decision errors can be managed

18 18 of 45 Decision Errors
• Are acceptable or tolerable, within limits
• We set tolerable limits on the percentage of time we are willing to:
  – Walk away from a dirty site
  – Clean up a clean site

19 19 of 45 Where do errors occur?
[Diagram labels: Planning, Sampling, Analysis, Data vs. Decision]

20 20 of 45 Definition of Terms
• Population
  – Everyone or everything of interest
  – Example: All the people in this class
• Sample
  – Some subset of the population
  – Example: Five people randomly chosen from the class

21 21 of 45 Definition of Terms
• Population Parameter
  – The true value of the population characteristic (e.g., age) that can only be known if all possible samples are measured
  – Example: true mean age of all the people in the class, calculated using data from every member of the population
• Sample Statistic
  – The estimated value of the population characteristic that is calculated from sample data
  – Example: estimate of the true mean age of all people in the class, calculated using data from a subset (sample) of the population

22 22 of 45 Comparison
• Population Parameter
  – Represents “true condition” of the population
  – Decisions can be made with 100% certainty (0% uncertainty)
• Sample Statistic
  – Represents “estimated condition” of the population
  – Decisions cannot be made with 100% certainty

23 23 of 45 Class Question?
• What is the true mean age in this class?
• What is the estimated mean age in this class?
  – Randomly select 5 ages
• 2nd estimated mean age in this class?
  – Randomly select 15 ages
(See Computer Age Demo)
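The class exercise can be imitated with a small simulation. This is only a sketch and not the course's Computer Age Demo: the 30 ages are randomly generated stand-ins, and the names `class_ages` and `estimated_mean` are hypothetical.

```python
import random
import statistics

random.seed(4)  # fixed seed so the run is repeatable

# Hypothetical class of 30 people; these ages are made up for illustration.
class_ages = [random.randint(22, 65) for _ in range(30)]


def estimated_mean(population, n):
    """Sample statistic: mean of n members drawn at random without replacement."""
    return statistics.mean(random.sample(population, n))


true_mean = statistics.mean(class_ages)   # population parameter (needs a census)
est_5 = estimated_mean(class_ages, 5)     # first estimate, n = 5
est_15 = estimated_mean(class_ages, 15)   # second estimate, n = 15

print(f"True mean age (all 30): {true_mean:.1f}")
print(f"Estimate from 5 ages:   {est_5:.1f} (deviation {est_5 - true_mean:+.1f})")
print(f"Estimate from 15 ages:  {est_15:.1f} (deviation {est_15 - true_mean:+.1f})")
```

A single draw of 15 is not guaranteed to land closer to the true mean than a single draw of 5; the larger sample is only closer on average over many repetitions.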

24 24 of 45 True Mean Age of All the People in This Class
• In this case, where we are only interested in measuring a small group of people who are all in the same room at the same time, it is not too difficult to determine the true mean age with 100% certainty. But:
  – What if some people failed to respond?
  – What if some people “fudged” a little?
  – What if some of the response forms got lost?

25 25 of 45 Types of Decision Errors
• Before we can talk about acceptable limits for making decision errors, we must first understand what correct decisions and decision errors look like and define some terms
• There are two types of correct decisions and two types of decision errors that can be made

26 26 of 45 Graph of Perfect Decision Making
[Figure: ideal decision rule. Y-axis: chance of deciding site is dirty (0.0 to 1.0). X-axis: true mean Ra-226 concentration, low to high, with the action level at 6 pCi/g.]

27 27 of 45 Graph of Typical Decision Making
[Figure: typical curve. Y-axis: chance of deciding site is dirty (0.0 to 1.0). X-axis: true mean Ra-226 concentration, low to high, with the action level at 6 pCi/g.]

28 28 of 45 Decision Performance Goal Diagram
Null Hypothesis: The site is dirty.
[Figure: typical curve. Y-axis: probability of deciding that the site is dirty (0.0 to 1.0). X-axis: true mean COPC concentration, with the lower bound of the gray region at 75 and the action level at 100; the gray region lies between them. True state of site: “site is clean” (alternative action: walk away from site) or “site is dirty” (alternative action: clean up site).]

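29 29 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: COPC concentration axis running from DL to ∞, with the PSQ “Is site clean?” (alternative action: walk away from site) versus “Is site dirty?” (alternative action: clean up site). Marked values: sample mean X_A at 75, action level at 100, and two candidate 95% UCLs (UCL 1A and UCL 1B) with axis marks at 95 and 110.]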

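30 30 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: COPC concentration axis running from DL to ∞, with the PSQ “Is site clean?” (alternative action: walk away from site) versus “Is site dirty?” (alternative action: clean up site). Marked values: action level at 100, sample mean X_B at 110, 95% UCL (UCL B) at 120.]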
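The decision rule pictured on the two slides above compares a 95% UCL on the mean with the action level. The sketch below shows one common way to do that and is not necessarily the course's method: it assumes a one-sided Student's t UCL (mean + t · s / √n), and the data sets are made up so that both have a sample mean of 75, like X_A. The t values are standard one-sided 95% table entries.

```python
import math
import statistics

# One-sided 95% Student's t critical values for 1 to 10 degrees of freedom.
T_95 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015,
        6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833, 10: 1.812}


def ucl95(values):
    """One-sided 95% upper confidence limit on the mean (assumed t-based)."""
    n = len(values)
    if n - 1 not in T_95:
        raise ValueError("this sketch only tabulates t for 2 to 11 samples")
    std_err = statistics.stdev(values) / math.sqrt(n)
    return statistics.mean(values) + T_95[n - 1] * std_err


def apply_decision_rule(values, action_level):
    """Null hypothesis: site is dirty. Walk away only if the UCL is below the action level."""
    if ucl95(values) < action_level:
        return "walk away from site"
    return "clean up site"


ACTION_LEVEL = 100  # from the slides

# Hypothetical COPC results (not from the slides); both sets have a mean of 75.
tight = [70, 72, 75, 78, 80]
scattered = [30, 50, 75, 100, 120]

for name, data in (("tight", tight), ("scattered", scattered)):
    print(name, round(ucl95(data), 1), apply_decision_rule(data, ACTION_LEVEL))
```

Both data sets have the same sample mean, yet the more variable one produces a UCL above the action level and leads to cleanup. That is the sense in which the number of samples needed depends on the underlying variation.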

31 31 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: COPC concentration axis running from DL to ∞ with the action level at 100; PSQ “Is site clean?” (alternative action: walk away from site) versus “Is site dirty?” (alternative action: clean up site). The true mean, the sample mean 95% UCL and the deviation between them are marked.]
Conclusion: Site is dirty. Action: Clean up a dirty site. A correct decision.

32 32 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: COPC concentration axis running from DL to ∞ with the action level at 100; PSQ “Is site clean?” (alternative action: walk away from site) versus “Is site dirty?” (alternative action: clean up site). The true mean, the sample mean 95% UCL and the deviation between them are marked.]
Conclusion: Site is clean. Action: Walk away from a dirty site. An incorrect decision.

33 33 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: COPC concentration axis running from DL to ∞ with the action level at 100; PSQ “Is site clean?” (alternative action: walk away from site) versus “Is site dirty?” (alternative action: clean up site). The true mean, the sample mean 95% UCL and the deviation between them are marked.]
Conclusion: Site is clean. Action: Walk away from a clean site. A correct decision.

34 34 of 45 Decision-Making Procedure: Apply Decision Rule
[Figure: COPC concentration axis running from DL to ∞ with the action level at 100; PSQ “Is site clean?” (alternative action: walk away from site) versus “Is site dirty?” (alternative action: clean up site). The true mean, the sample mean 95% UCL and the deviation between them are marked.]
Conclusion: Site is dirty. Action: Clean up a clean site. An incorrect decision.
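The four outcomes on slides 31 to 34 can be summarized as a small lookup. This is a sketch under two assumptions: "dirty" means the true mean is at or above the action level, and the decision is made by comparing the sample mean UCL with the action level. In practice the true mean is unknown, so the function is only a teaching aid; its name and return format are hypothetical.

```python
def classify_decision(true_mean, ucl, action_level):
    """Return (conclusion, action, is_correct) for one of the four possible outcomes."""
    site_is_dirty = true_mean >= action_level   # true state (never known in practice)
    decide_dirty = ucl >= action_level          # what the decision rule concludes

    conclusion = "Site is dirty" if decide_dirty else "Site is clean"
    verb = "Clean up" if decide_dirty else "Walk away from"
    state = "dirty" if site_is_dirty else "clean"
    action = f"{verb} a {state} site"
    return conclusion, action, site_is_dirty == decide_dirty


ACTION_LEVEL = 100  # from the slides

# Hypothetical (true mean, UCL) pairs chosen to hit each of the four cases.
cases = [(120, 130), (105, 95), (80, 90), (90, 110)]
for true_mean, ucl in cases:
    conclusion, action, ok = classify_decision(true_mean, ucl, ACTION_LEVEL)
    verdict = "correct" if ok else "incorrect"
    print(f"true mean {true_mean}, UCL {ucl}: {conclusion}. {action}. A(n) {verdict} decision.")
```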

35 35 of 45 [Figure: decision performance diagram. Null hypothesis: the site is dirty. Y-axis: probability of deciding that the true mean is greater than or equal to the action level (0.0 to 1.0). X-axis: true mean COPC concentration, with the lower bound of the gray region at 75 and the action level at 100; the gray region lies between them. True state of site: “site is clean” (alternative action: walk away from site) or “site is dirty” (clean up site). The true mean, the sample mean UCL and the deviation between them are marked.]
When the True Mean is well above the Action Level, there should be a high probability that the Sample Mean UCL will also be above the Action Level, and it is highly likely that we will correctly decide to clean up a dirty site.

36 36 of 45 [Figure: decision performance diagram. Null hypothesis: the site is dirty. Y-axis: probability of deciding that the site is dirty (0.0 to 1.0). X-axis: true mean COPC concentration, with the lower bound of the gray region at 75 and the action level at 100; the gray region lies between them. True state of site: “site is clean” (alternative action: walk away from site) or “site is dirty” (clean up site). The true mean, the sample mean UCL and the deviation between them are marked.]
If the True Mean is well below the Lower Bound of the Gray Region, there should be a very low probability that the Sample Mean UCL will be above the Action Level, and it is highly unlikely that we will incorrectly decide to clean up a clean site.

37 37 of 45 [Figure: decision performance diagram. Null hypothesis: the site is dirty. Y-axis: probability of deciding that the site is dirty (0.0 to 1.0). X-axis: true mean COPC concentration, with the lower bound of the gray region at 75 and the action level at 100; the gray region lies between them. True state of site: “site is clean” (alternative action: walk away from site) or “site is dirty” (clean up site). The true mean, the sample mean UCL and the deviation between them are marked.]
When the True Mean is IN the gray region, there is an increased probability that the Sample Mean UCL will be above the Action Level, and that we will agree to incorrectly decide to clean up a clean site.

38 38 of 45 Decision Performance Goal Diagram
Null Hypothesis: The site is dirty.
[Figure: typical curve. Y-axis: probability of deciding that the site is dirty (0.0 to 1.0). X-axis: true mean COPC concentration, with the lower bound of the gray region at 75 and the action level at 100; the gray region lies between them. True state of site: “site is clean” (alternative action: walk away from site) or “site is dirty” (alternative action: clean up site).]
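The shape of the typical curve can be sketched numerically. This illustration assumes the sample mean is normally distributed with a known standard deviation, so that the 95% UCL is the sample mean plus z times the standard error. The standard deviation of 30, the sample sizes and the function name are hypothetical; only the action level (100) and the lower bound of the gray region (75) come from the slides.

```python
import math
from statistics import NormalDist


def prob_decide_dirty(true_mean, action_level, sigma, n, confidence=0.95):
    """P(UCL >= action level), i.e. the chance of deciding the site is dirty.

    Assumes sample mean ~ Normal(true_mean, sigma / sqrt(n)) with sigma known.
    """
    std_err = sigma / math.sqrt(n)
    z = NormalDist().inv_cdf(confidence)
    # The UCL reaches the action level exactly when the sample mean reaches this value.
    threshold = action_level - z * std_err
    return 1.0 - NormalDist(mu=true_mean, sigma=std_err).cdf(threshold)


ACTION_LEVEL = 100   # from the slides
LBGR = 75            # lower bound of the gray region, from the slides
SIGMA = 30           # hypothetical standard deviation of individual results

print("true mean   n=5     n=20")
for mu in (50, LBGR, 90, ACTION_LEVEL, 125):
    p5 = prob_decide_dirty(mu, ACTION_LEVEL, SIGMA, 5)
    p20 = prob_decide_dirty(mu, ACTION_LEVEL, SIGMA, 20)
    print(f"{mu:9d}   {p5:.3f}   {p20:.3f}")
```

Under these assumptions the probability of deciding "dirty" is 0.95 when the true mean sits exactly at the action level, and taking more samples steepens the curve, lowering the chance of cleaning up a site whose true mean is at the lower bound of the gray region.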

39 39 of 45 Managing Uncertainty is a Balancing Act
[Figure: two balances weighing costs ($$). PRP 1° focus: sampling and analyses cost versus unnecessary disposal and/or cleanup cost. Regulatory 1° focus: sampling and analyses cost versus threat to public health and environment.]

40 40 of 45 Key Points
• We will never know the true condition of the site: time and money prevent this
• Therefore we must estimate the true condition through sampling
• Estimates based on samples are not factual statements about the site. They are educated guesses
• Estimates must be in error, because they use incomplete information

41 41 of 45 Key Points (cont.)
• Errors are not mistakes, just deviations from the truth
• Errors (deviations) introduce uncertainty into the decision-making process
• Errors and uncertainty can be managed so that you can still get the job done and prove that you did it

42 42 of 45 Key Points (cont.)
• The DQO Process is designed to help you manage uncertainty and:
  – Get the job done efficiently
  – Prove that you did it defensibly

43 43 of 45 Primary Benefit of the DQO Process: Managing uncertainty through “FAILING TO PLAN….. IS PLANNING TO FAIL”

44 44 of 45 How Many Samples do I Need? REMEMBER: HETEROGENEITY IS THE RULE!

45 45 of 45 End of Module 4
Summary of Parts 1, 2, 3 will be at the end of Module 6.
Questions?
Thank you.
We will now take a 15-minute break. Please be back in 15 minutes.

