Putting Confidence Into Your Lab's Results
Alan Steele, Barry Wood & Rob Douglas, National Research Council, Ottawa, Canada

Presentation transcript:

Slide 1: Putting Confidence Into Your Lab's Results
Alan Steele, Barry Wood & Rob Douglas
National Research Council Canada / Conseil national de recherches Canada, Ottawa, Canada
NCSL International, July 2001

Slide 2: Outline
Comparison Measurements
– interpretation of results
– proficiency testing for accreditation
Probability Calculus
– confidence intervals
– confidence levels
A Toolkit for Excel
– some Visual Basic code
A Worked Example
– with real comparison data
Conclusions

Slide 3: Proficiency Testing
– Accreditation bodies routinely specify regularly scheduled "proficiency testing" as a requirement for maintaining accreditation
– Usually the Pilot Laboratory for the comparison is the National Metrology Institute
– Usually the Pilot Laboratory result is taken as the comparison reference value, and the participants' results are evaluated against this "truth"
– This is a time-consuming and expensive exercise!

Slide 4: Comparisons
– Measurement comparisons provide the main experimental evidence for "equivalence"
– In general, all participants measure a common artifact and their various results are analyzed from a single common perspective
– The participants may be different laboratories, or different measurement stations on your shop floor

Slide 5: Key Comparisons and NMIs
– National Metrology Institutes have recently signed a "Mutual Recognition Arrangement" in which the validity of their Calibration and Measurement Capabilities is expressed
– The scientific underpinning for this arrangement is a series of "Key Comparisons" which are conducted at the very highest levels of metrology
– In practice, they are not much different from the proficiency tests already in general use among accredited laboratories around the world

Slide 6: Reporting Results
A metrologist reports a result in two parts:
– the mean value: m_L
– the uncertainty: u_L
The results are plotted as data points with error bars

Slide 7: Uncertainty Budgets
– The ISO Guide to the Expression of Uncertainty in Measurement is widely used as the basis for formulating and publishing laboratory uncertainty statements regarding measurement capabilities
– "Error bars" are an intrinsically probabilistic description of our belief in "what will happen next time", based on what we have done in the past

Slide 8: Probability Distributions
– An ISO Guide-compliant uncertainty statement means that the error bars represent the most expert opinion about the underlying normal (Gaussian) probability distribution
– The fancy name for working with these distributions is Probability Calculus
– In general, we are interested in integrals of the probability distribution
– Integration is only "fancy addition"

Slide 9: Confidence Levels
– A confidence level is what we get upon integrating a probability distribution over a given range [a, b]
– The fractional probability of observing a value between a and b is the normalized integration of the probability distribution function in the range [a, b]
– This is just addition of all the 'bits' of the function between a and b
– 1σ ↔ 68%; 2σ ↔ 95%

Slide 10: Confidence Intervals
– Remember: a confidence level is what we get by integrating the distribution over a given range [a, b]
– The confidence interval is the fancy name for the range associated with the confidence level
– The range [−1σ, +1σ] is the 68% confidence interval
– The range [−2σ, +2σ] is the 95% confidence interval
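The "probability calculus" of Slides 9 and 10 can be tried directly in Excel. Below is a minimal Visual Basic sketch (the function name and layout are ours, for illustration only, not the Toolkit's code) that evaluates a normal confidence level over a range [a, b] using Excel's built-in cumulative normal function, and checks the 1σ and 2σ values quoted above.

  ' Confidence level of a normal distribution with mean meanValue and
  ' standard uncertainty stdUncertainty over the range [a, b]:
  ' the normalized integral of the pdf, i.e. Phi((b-m)/u) - Phi((a-m)/u).
  Public Function NormalConfidenceLevel(a As Double, b As Double, _
                                        meanValue As Double, stdUncertainty As Double) As Double
      NormalConfidenceLevel = WorksheetFunction.NormSDist((b - meanValue) / stdUncertainty) _
                            - WorksheetFunction.NormSDist((a - meanValue) / stdUncertainty)
  End Function

  ' Quick check of the slide's numbers (run from the VBA Immediate window):
  ' ?NormalConfidenceLevel(-1, 1, 0, 1)   ' ~0.683  (1 sigma <-> 68%)
  ' ?NormalConfidenceLevel(-2, 2, 0, 1)   ' ~0.954  (2 sigma <-> 95%)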

Slide 11: Why would you want to do this?
– Lots of time and energy (and expense!) is invested in creating a laboratory result in a comparison
– Getting the maximum amount of information from a measurement comparison is desirable
– You'd like to show off your "confidence" to colleagues (and auditors!)
– Quantifying things is what we do as metrologists
– Your clients may want specific quantified answers to questions of Demonstrated Equivalence based on your Proficiency Testing results

Slide 12: How hard is it to do this?
– With normal distributions, the arithmetic is pretty easy
– You can try this for yourself and really see how it works… or you can let us do it for you!
– We have generated simple expressions to help evaluate normal confidence levels and normal confidence intervals, using well-known statistical methods developed over the last hundred years or so
– We have put these expressions into a Toolkit for Excel

Slide 13: A Toolkit for Excel
– At NRC, we have written a Quantified Demonstrated Equivalence Toolkit for Microsoft Excel®
– The Toolkit is freely available by contacting us
– We'll add you to our mailing list and send you a copy of the sample spreadsheet with the Toolkit, plus a "User's Guide" in .pdf format

Slide 14: Toolkit Functions and Macros
The Toolkit contains Functions to:
– calculate pair uncertainties (including correlations)
– calculate weighted averages
– calculate confidence levels
– calculate confidence intervals
The Toolkit contains Macros to:
– generate bilateral "tables of equivalence"
– generate bilateral "tables of confidence intervals"
– generate bilateral "tables of confidence levels"
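For readers who want a feel for what such Functions involve, here is a hedged sketch of the standard formulas usually behind them: the pair uncertainty of a difference between two correlated results, and an inverse-variance weighted average. The names and code below are illustrative assumptions, not the Toolkit's own implementation.

  ' Standard uncertainty of the difference (m1 - m2) for two results
  ' with standard uncertainties u1, u2 and correlation coefficient r.
  Public Function PairUncertainty(u1 As Double, u2 As Double, r As Double) As Double
      PairUncertainty = Sqr(u1 ^ 2 + u2 ^ 2 - 2 * r * u1 * u2)
  End Function

  ' Inverse-variance weighted mean of a column of values, weights 1/u^2.
  Public Function WeightedAverage(labValues As Range, labUncerts As Range) As Double
      Dim i As Long, w As Double, sumW As Double, sumWX As Double
      For i = 1 To labValues.Count
          w = 1 / labUncerts.Cells(i).Value ^ 2
          sumW = sumW + w
          sumWX = sumWX + w * labValues.Cells(i).Value
      Next i
      WeightedAverage = sumWX / sumW
  End Function

In a worksheet these would be called like any built-in function, e.g. =WeightedAverage(B2:B14, C2:C14).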

Slide 15: Toolkit Philosophy and Operation
Functions and Macros are built right into the Spreadsheet, and work just like "regular" Excel components

Slide 16: Toolkit Philosophy and Operation
– The code is written in Visual Basic
– You can examine the code to see how it works
– Long variable names help to "self-document" the programs
– You don't have to look at the code or write your own functions to use the QDE Toolkit from NRC

Slide 17: A Worked Example
13 Laboratories participated in a Proficiency Test at 10 kΩ

Slide 18: Comparison to the NMI: E_n
– One common measure of success in Proficiency Tests is the "Normalized Error"
– This is the ratio of the laboratory deviation to the expanded uncertainty:
  E_n (k = 2) = |m_Lab − m_Ref| / sqrt(U_Lab² + U_Ref²)
– Generally, the Laboratory "passes" when E_n < 1
– E_n is a dimensionless quantity
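The E_n formula above maps directly onto a one-line worksheet Function. This is a sketch with an illustrative name (not the Toolkit's), taking the expanded (k = 2) uncertainties as inputs.

  ' Normalized error E_n = |m_Lab - m_Ref| / sqrt(U_Lab^2 + U_Ref^2),
  ' where ULab and URef are the expanded (k = 2) uncertainties.
  Public Function NormalizedError(mLab As Double, ULab As Double, _
                                  mRef As Double, URef As Double) As Double
      NormalizedError = Abs(mLab - mRef) / Sqr(ULab ^ 2 + URef ^ 2)
  End Function

  ' In a worksheet cell: =NormalizedError(B2, C2, $B$1, $C$1) < 1 gives a pass/fail flag.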

Slide 19: Comparison to the NMI: QDC
– A quantified approach to Proficiency Tests is to ask the following question:
– What is the probability that a repeat comparison would yield results such that Lab 1's 95% uncertainty interval encompasses the Pilot Lab value?
– We call this "Quantified Demonstrated Confidence"
– QDC is a dimensionless quantity expressed in %
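One way to evaluate the probability described here, assuming uncorrelated normal distributions: treat the difference seen in a repeat comparison as normally distributed about the observed difference d = m_Lab − m_Ref with pair uncertainty u_pair = sqrt(u_Lab² + u_Ref²), and integrate that distribution over the Lab's 95% (k = 2) interval. The sketch below follows that reading; it is our illustration of the idea, not the Toolkit's code.

  ' QDC: probability that, in a repeat comparison, the Lab's 95% (k = 2)
  ' uncertainty interval encompasses the reference value.
  ' Assumes uncorrelated, normal distributions (a simplifying assumption).
  Public Function QuantifiedDemonstratedConfidence(mLab As Double, uLab As Double, _
                                                   mRef As Double, uRef As Double) As Double
      Dim d As Double, uPair As Double, halfWidth As Double
      d = mLab - mRef                          ' observed difference
      uPair = Sqr(uLab ^ 2 + uRef ^ 2)         ' standard uncertainty of the difference
      halfWidth = 2 * uLab                     ' Lab's 95% interval half-width (k = 2)
      QuantifiedDemonstratedConfidence = _
          WorksheetFunction.NormSDist((halfWidth - d) / uPair) _
        - WorksheetFunction.NormSDist((-halfWidth - d) / uPair)
  End Function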

Slide 20: Comparison to the NMI: E_n vs QDC
– E_n (Normalized Error) and QDC (Quantified Demonstrated Confidence) are both dimensionless quantities
– E_n and its interpretation as an acceptance criterion are difficult to explain to non-metrologists
– QDC and its numerical value are easily explained to non-metrologists
– Note that when E_n = 1 (and U_Ref << U_Lab), QDC = 50%
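The 50% limit in the last bullet can be checked numerically with the two sketch functions above, using hypothetical numbers chosen so that E_n = 1 while the reference uncertainty is negligible:

  ' Immediate-window check: d = 2, u_Lab = 1 (so U_Lab = 2 and E_n = 1), u_Ref ~ 0
  ' ?NormalizedError(2, 2, 0, 0.002)                    ' ~1.0
  ' ?QuantifiedDemonstratedConfidence(2, 1, 0, 0.001)   ' ~0.50, i.e. QDC ~ 50%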

Slide 21: Comparison to the NMI: QDE_0.95
– A different quantified approach to Proficiency Tests is to ask the following question:
– Within what confidence interval can I expect the Lab 1 value and the Pilot Lab value to agree, with a 95% confidence level?
– We call this "Quantified Demonstrated Equivalence"
– QDE_0.95 is a dimensioned quantity, same units as V
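Under the same normal, uncorrelated assumptions as the QDC sketch, the interval asked for here can be found numerically: search for the half-width L such that the difference seen in a repeat comparison falls inside [−L, +L] with 95% probability. The bisection below is one way to do that; it is our sketch with an illustrative name, and the Toolkit itself may use closed-form expressions instead.

  ' QDE_0.95: half-width L of the interval [-L, +L] within which the
  ' difference between the two values would fall with 95% probability
  ' in a repeat comparison (normal, uncorrelated assumptions).
  Public Function QDE95(mLab As Double, uLab As Double, _
                        mRef As Double, uRef As Double) As Double
      Dim d As Double, uPair As Double
      Dim lo As Double, hi As Double, midPoint As Double, p As Double
      Dim i As Integer
      d = Abs(mLab - mRef)
      uPair = Sqr(uLab ^ 2 + uRef ^ 2)
      lo = 0
      hi = d + 10 * uPair                      ' generous upper bound
      For i = 1 To 60                          ' bisection to ample precision
          midPoint = (lo + hi) / 2
          p = WorksheetFunction.NormSDist((midPoint - d) / uPair) _
            - WorksheetFunction.NormSDist((-midPoint - d) / uPair)
          If p < 0.95 Then lo = midPoint Else hi = midPoint
      Next i
      QDE95 = (lo + hi) / 2
  End Function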

Slide 22: Comparison between Labs: Agreement
– We can ask similar questions about agreement between any two participants in the experiment:
– Within what confidence interval (in ppm) can I expect the Lab 1 value and the Lab 2 value to agree, with a 95% confidence level?

Slide 23: Comparison between Labs: Confidence
– What if we ask: What is the probability that a repeat comparison would yield results such that Lab 1's 95% uncertainty interval encompasses Lab 2's value?
– Or how about: What is the probability that a repeat comparison would yield results such that Lab 2's 95% uncertainty interval encompasses Lab 1's value?

Slide 24: Comparison between Labs: Confidence
The answers to these questions of Quantified Demonstrated Confidence are shown here

Slide 25: Quantifying Equivalence
– What is the probability that a repeat comparison would have a Lab 2 value within Lab 1's 95% uncertainty interval?
– Probability Calculus tells us the answer: QDC = 47%
– This is exactly the type of "awkward question" that a Client might ask!

Slide 26: Quantifying Equivalence
– What is the probability that a repeat comparison would have a Lab 1 value within Lab 2's 95% uncertainty interval?
– Probability Calculus tells us the answer: QDC = 22%
– These subtly different "awkward" questions have very different "straightforward" answers!

Slide 27: Tricky things about Equivalence
– Equivalence is not transitive: Lab 1 and Lab 2 may both be "equivalent" to the Pilot, but not to each other!
– Equivalence is not commutative: we are asking two very different questions here! (From the worked example: QDC = 47% in one direction, but only 22% in the other.)
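The non-commutativity is easy to reproduce with the QDC sketch shown after Slide 19, using purely hypothetical numbers (not the comparison data on these slides) in which Lab 1 reports a much larger uncertainty than Lab 2:

  ' Hypothetical numbers only: Lab 1 reports 8 with u = 5; Lab 2 reports 0 with u = 1
  ' ?QuantifiedDemonstratedConfidence(8, 5, 0, 1)   ' ~0.65: Lab 2 value inside Lab 1's wide 95% interval
  ' ?QuantifiedDemonstratedConfidence(0, 1, 8, 5)   ' ~0.09: Lab 1 value inside Lab 2's narrow 95% interval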

Slide 28: Conclusions
– You are already doing quite a bit of Probability Calculus when you present your results
– The arithmetic for quantified calculations is very straightforward when we have Normal Distributions
– Adding Statistical Confidence explicitly into your Lab's results helps you to explain them to non-metrologists, and to present precisely what Proficiency Testing has demonstrated for:
  – equivalence from different National Laboratories
  – accreditation assessment
  – your clients
  – your factory floor