STATISTICAL TESTS AND ERROR ANALYSIS

PRECISION AND ACCURACY
PRECISION – Reproducibility of the result
ACCURACY – Nearness to the “true” value

EXPERIMENTAL ERROR
Uncertainty in every experiment (measurement).
Finding errors: How sure are you that the experimentally obtained value is close to the “true” value? How close is it?

SYSTEMATIC / DETERMINATE ERROR
- Reproducible under the same conditions in the same experiment
- Can be detected and corrected for
- It is always positive or always negative
To detect a systematic error:
- Use Standard Reference Materials
- Run a blank sample
- Use different analytical methods
- Participate in “round robin” experiments (different labs and people running the same analysis)

RANDOM / INDETERMINATE ERROR
- Uncontrolled variables in the measurement
- Can be positive or negative
- Cannot be corrected for
- Random errors are independent of each other
Random errors can be reduced by:
- Better experiments (equipment, methodology, training of analyst)
- Large number of replicate samples
Random errors show a Gaussian distribution for a large number of replicates.
They can be described using statistical parameters.

For a large number of experimental replicates the results approach an ideal smooth curve called the GAUSSIAN or NORMAL DISTRIBUTION CURVE.
Characterised by:
- The mean value, $\bar{x}$: gives the center of the distribution
- The standard deviation, $s$: measures the width of the distribution

The mean or average, $\bar{x}$ = the sum of the measured values ($x_i$) divided by the number of measurements ($n$):
$\bar{x} = \frac{\sum_i x_i}{n}$
The standard deviation, $s$, measures how closely the data are clustered about the mean (i.e. the precision of the data):
$s = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n - 1}}$
NOTE: The quantity “n − 1” = degrees of freedom

Other ways of expressing the precision of the data:
- Variance: Variance = $s^2$
- Relative standard deviation: RSD = $s / \bar{x}$
- Percent RSD / coefficient of variation: %RSD = $100 \times s / \bar{x}$
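The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the replicate values and the function name `summarize` are hypothetical.

```python
import math

def summarize(values):
    """Return mean, sample standard deviation, variance and %RSD."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance uses n - 1 degrees of freedom.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    s = math.sqrt(variance)
    rsd_percent = 100 * s / mean
    return mean, s, variance, rsd_percent

# Hypothetical replicate results (arbitrary units), for illustration only.
replicates = [10.1, 10.3, 9.9, 10.2, 10.0]
mean, s, variance, rsd_percent = summarize(replicates)
print(f"mean = {mean:.3f}, s = {s:.3f}, variance = {variance:.4f}, %RSD = {rsd_percent:.2f}")
```

Rounding is left to the print statement, in line with the advice not to round a standard deviation calculation until the very end.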

POPULATION DATA
For an infinite set of data, n → ∞:
$\bar{x}$ → µ (population mean) and $s$ → σ (population std. dev.)
The experiment that produces a small standard deviation is more precise. Remember, greater precision does not imply greater accuracy.
Experimental results are commonly expressed in the form: mean ± standard deviation

The Gaussian curve equation:
$y = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x - \mu)^2 / 2\sigma^2}$
$\frac{1}{\sigma\sqrt{2\pi}}$ = normalisation factor. It guarantees that the area under the curve is unity.
Probability of measuring a value in a certain range = area below the graph of that range.
The Gaussian curve whose area is unity is called a normal error curve. Here µ = 0 and σ = 1.

The standard deviation measures the width of the Gaussian curve. (The larger the value of σ, the broader the curve.)

Range       Percentage of measurements
µ ± 1σ      68.3
µ ± 2σ      95.5
µ ± 3σ      99.7

The more times you measure, the more confident you are that your average value is approaching the “true” value. The uncertainty decreases in proportion to $1/\sqrt{n}$.
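The percentages in the table can be checked numerically. The sketch below assumes an ideal Gaussian and uses the standard identity that the fraction within µ ± kσ equals erf(k/√2). The helper names are hypothetical.

```python
import math

def fraction_within(k):
    """Fraction of an ideal Gaussian population lying within mu +/- k*sigma."""
    return math.erf(k / math.sqrt(2))

def standard_error(s, n):
    """Uncertainty of the mean: falls off as 1/sqrt(n)."""
    return s / math.sqrt(n)

for k in (1, 2, 3):
    print(f"mu +/- {k} sigma: {100 * fraction_within(k):.1f} %")

# Quadrupling the number of measurements halves the uncertainty of the mean.
print(standard_error(4.0, 1), standard_error(4.0, 4))
```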

EXAMPLE
Replicate results were obtained for the analysis of lead in blood. Calculate the mean and the standard deviation of this set of data.

Replicate   [Pb] / ppb
1           752
2           756
3
4           751
5           760

NB: DON’T round a std dev. calculation until the very end.

Result: 754 ± 4 ppb Pb
Also: Variance = $s^2$

Pb – where from?
- Motor vehicle emissions
- Lead plumbing
- Pewter
- Lead-based paints
- Weathering of Pb minerals

Lead is readily absorbed through the gastrointestinal tract. In blood, 95% of the lead is in the red blood cells and 5% in the plasma. About 70-90% of the lead assimilated goes into the bones, then liver and kidneys. Lead readily replaces calcium in bones.

The symptoms of lead poisoning depend upon many factors, including the magnitude and duration of lead exposure (dose), chemical form (organic is more toxic than inorganic), the age of the individual (children and the unborn are more susceptible) and the overall state of health (Ca, Fe or Zn deficiency enhances the uptake of lead).

European Community Environmental Quality Directive – 50 µg/L in drinking water
World Health Organisation – recommended tolerable intake of Pb per day for an adult – 430 µg

Food stuffs: < 2 mg/kg Pb
Next to highways: 20-950 mg/kg Pb
Near battery works: 34-600 mg/kg Pb
Metal processing sites: 45-2714 mg/kg Pb

CONFIDENCE INTERVALS
The confidence interval is the expression stating that the true mean, µ, is likely to lie within a certain distance from the measured mean, $\bar{x}$. It is calculated using Student’s t.
The confidence interval is given by:
$\mu = \bar{x} \pm \frac{t s}{\sqrt{n}}$
where t is the value of Student’s t taken from the table, s is the standard deviation and n is the number of measurements.

A ‘t’ test is used to compare sets of measurements. Usually 95% probability is good enough.

Example: The mercury content in fish samples was determined as follows: 1.80, 1.58, 1.64, 1.49 ppm Hg. Calculate the 50% and 90% confidence intervals for the mercury content.
Find: $\bar{x}$ = 1.63, s = 0.130
50% confidence: t = 0.765 for n − 1 = 3
$\mu = 1.63 \pm \frac{(0.765)(0.130)}{\sqrt{4}} = 1.63 \pm 0.05$
There is a 50% chance that the true mean lies between 1.58 and 1.68 ppm.
90% confidence: t = 2.353 for n − 1 = 3
$\mu = 1.63 \pm \frac{(2.353)(0.130)}{\sqrt{4}} = 1.63 \pm 0.15$
There is a 90% chance that the true mean lies between 1.48 and 1.78 ppm.
(Figure: number line showing the 50% interval, 1.58 to 1.68, inside the 90% interval, 1.48 to 1.78.)
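The mercury example can be reproduced with a short script. This is a sketch only: the t values are hard-coded from a standard Student’s t table for 3 degrees of freedom, and the names `T_TABLE_DF3` and `confidence_interval` are hypothetical.

```python
import math
import statistics

# Student's t for 3 degrees of freedom, copied from a standard t table.
T_TABLE_DF3 = {50: 0.765, 90: 2.353}

hg_ppm = [1.80, 1.58, 1.64, 1.49]

def confidence_interval(values, t):
    """Return (mean, half_width) so that mu = mean +/- half_width."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample std dev, n - 1 in the denominator
    return mean, t * s / math.sqrt(n)

for level, t in T_TABLE_DF3.items():
    mean, half_width = confidence_interval(hg_ppm, t)
    print(f"{level}% confidence: {mean:.2f} +/- {half_width:.2f} ppm Hg")
```

The mean and half-width are reported separately because the slide rounds the mean to 1.63 before adding and subtracting the uncertainty.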

Confidence intervals - experimental uncertainty

APPLYING STUDENT’S T:
1) COMPARISON OF MEANS
Comparison of a measured result with a ‘known’ (standard) value:
$t_{calc} = \frac{|\text{known value} - \bar{x}|}{s} \sqrt{n}$
If $t_{calc}$ > $t_{table}$ at the 95% confidence level ⇒ the results are considered to be different ⇒ the difference is significant!
Statistical tests give only probabilities. They do not relieve us of the responsibility of interpreting our results!
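A comparison with a known value might look like the sketch below. The measurements and the known value are illustrative numbers, not data from the slides; the tabulated t (3.182 for 3 degrees of freedom at 95%) comes from a standard t table, and `t_vs_known` is a hypothetical helper name.

```python
import math
import statistics

def t_vs_known(values, known_value):
    """Return (t_calc, degrees_of_freedom) for a measured set vs a known value."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    t_calc = abs(known_value - mean) / s * math.sqrt(n)
    return t_calc, n - 1

# Illustrative data only (e.g. wt% of an analyte in a reference material).
measured = [3.29, 3.22, 3.30, 3.23]
known = 3.19
t_calc, dof = t_vs_known(measured, known)

T_TABLE_95_DF3 = 3.182  # standard table value, 95% confidence, 3 degrees of freedom
print(f"t_calc = {t_calc:.2f}, dof = {dof}, significant = {t_calc > T_TABLE_95_DF3}")
```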

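2) COMPARISON OF REPLICATE MEASUREMENTS
One sample, many measurements.
For 2 sets of data with number of measurements $n_1$, $n_2$ and means $\bar{x}_1$, $\bar{x}_2$:
$t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$
where $s_{pooled}$ = pooled std dev. from both sets of data:
$s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$
If $t_{calc}$ > $t_{table}$ at the 95% confidence level ⇒ the difference between results is significant.
Degrees of freedom = ($n_1 + n_2 - 2$)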
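The pooled comparison can be sketched from summary statistics alone. The means, standard deviations and counts below are hypothetical, and `pooled_t` is an illustrative helper name; the tabulated t of 2.306 (95%, 8 degrees of freedom) is taken from a standard t table.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t_calc, s_pooled, degrees_of_freedom) for two replicate sets."""
    dof = n1 + n2 - 2
    s_pooled = math.sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / dof)
    t_calc = abs(mean1 - mean2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))
    return t_calc, s_pooled, dof

# Hypothetical summary statistics for two sets of five replicates.
t_calc, s_pooled, dof = pooled_t(10.2, 0.3, 5, 10.8, 0.4, 5)

T_TABLE_95_DF8 = 2.306  # standard table value, 95% confidence, 8 degrees of freedom
print(f"s_pooled = {s_pooled:.3f}, t_calc = {t_calc:.2f}, dof = {dof}")
print("difference significant:", t_calc > T_TABLE_95_DF8)
```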

3) COMPARISON OF INDIVIDUAL DIFFERENCES
Many samples, one measurement.
Use two different analytical methods, A and B, to make single measurements on several different samples. Perform a t test on the individual differences ($d_i$) between results:
$t_{calc} = \frac{|\bar{d}|}{s_d} \sqrt{n}$
where
$s_d = \sqrt{\frac{\sum_i (d_i - \bar{d})^2}{n - 1}}$
and $\bar{d}$ = the average difference between methods A and B, n = number of pairs of data.
If $t_{calc}$ > $t_{table}$ at the 95% confidence level ⇒ the difference between results is significant.
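A paired comparison could be coded as follows. The paired results for methods A and B are hypothetical, as is the helper name `paired_t`; the tabulated t of 2.776 (95%, 4 degrees of freedom) is from a standard t table.

```python
import math
import statistics

def paired_t(method_a, method_b):
    """Return (t_calc, degrees_of_freedom) from paired single measurements."""
    differences = [a - b for a, b in zip(method_a, method_b)]
    n = len(differences)
    d_mean = statistics.mean(differences)
    s_d = statistics.stdev(differences)
    return abs(d_mean) / s_d * math.sqrt(n), n - 1

# Hypothetical results: five samples, each measured once by each method.
method_a = [1.2, 2.5, 3.1, 4.8, 5.0]
method_b = [1.0, 2.6, 2.9, 4.5, 4.9]
t_calc, dof = paired_t(method_a, method_b)

T_TABLE_95_DF4 = 2.776  # standard table value, 95% confidence, 4 degrees of freedom
print(f"t_calc = {t_calc:.2f}, dof = {dof}, significant = {t_calc > T_TABLE_95_DF4}")
```

With these made-up numbers t_calc stays below the tabulated value, so the two methods would not be judged significantly different.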

Example: Are the two methods used comparable? (Compare the individual differences, $d_i$.)

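F TEST
COMPARISON OF TWO STANDARD DEVIATIONS
$F_{calc} = \frac{s_1^2}{s_2^2}$, with the larger standard deviation in the numerator so that F ≥ 1.
If $F_{calc}$ > $F_{table}$ at the 95% confidence level ⇒ the std devs are considered to be different ⇒ the difference is significant.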
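As a small sketch, F can be computed with the larger variance placed in the numerator. The two standard deviations are hypothetical and `f_calc` is an illustrative name; the result still has to be compared against a tabulated F for the relevant degrees of freedom.

```python
def f_calc(s1, s2):
    """Ratio of variances, arranged so that the result is >= 1."""
    larger, smaller = max(s1, s2), min(s1, s2)
    return larger**2 / smaller**2

# Hypothetical standard deviations from two sets of replicates.
f_value = f_calc(0.3, 0.4)
print(f"F_calc = {f_value:.3f}")
```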

Q TEST FOR BAD DATA
$Q_{calc} = \frac{\text{gap}}{\text{range}}$
The range is the total spread of the data. The gap is the difference between the “bad” point and the nearest value.
Example: 12.2 12.4 12.5 12.6 12.9
(Figure: the gap and the range marked on the data.)
If $Q_{calc}$ > $Q_{table}$ ⇒ discard the questionable point.

EXAMPLE: The following replicate analyses were obtained when standardising a solution: 0.1067 M, 0.1071 M, 0.1066 M and 0.1050 M. One value appears suspect. Determine if it can be ascribed to accidental error at the 90% confidence level.
Arrange in increasing order: 0.1050 M, 0.1066 M, 0.1067 M, 0.1071 M
$Q = \frac{\text{gap}}{\text{range}} = \frac{0.1066 - 0.1050}{0.1071 - 0.1050} = 0.7619$
BUT these values are very close ⇒ rather do another analysis to confirm!
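The Q calculation is easy to script. The sketch below assumes the suspect point is whichever end value sits furthest from its neighbour; `q_calc` is a hypothetical helper name, and the tabulated Q still has to be looked up for the number of observations.

```python
def q_calc(values):
    """Return Q = gap / range for the more isolated end point of the data."""
    data = sorted(values)
    data_range = data[-1] - data[0]
    gap_low = data[1] - data[0]      # gap if the lowest value is suspect
    gap_high = data[-1] - data[-2]   # gap if the highest value is suspect
    return max(gap_low, gap_high) / data_range

molarities = [0.1067, 0.1071, 0.1066, 0.1050]
q_value = q_calc(molarities)
print(f"Q_calc = {q_value:.4f}")
```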

STATISTICS OF SAMPLING
A chemical analysis can only be as meaningful as the sample!
Sampling – process of collecting a representative sample for analysis
OVERALL VARIANCE = ANALYTICAL VARIANCE + SAMPLING VARIANCE

Where does the sampling variance come from?
Consider a powder mixture containing $n_A$ particles of type A and $n_B$ particles of type B.
Probability of drawing A: $p = \frac{n_A}{n_A + n_B}$
Probability of drawing B: $q = \frac{n_B}{n_A + n_B} = 1 - p$
If n particles are randomly drawn, the expected number of A particles will be np and the standard deviation of many drawings will be:
$\sigma_n = \sqrt{npq}$

How much of the sample should be analysed?
Std dev.: $\sigma_n = \sqrt{npq}$, where p, q = fractions of each kind of particle present
Relative std dev.: $R = \frac{\sigma_n}{n} = \sqrt{\frac{pq}{n}}$
Relative variance: $R^2 = \frac{pq}{n}$ ⇒ $nR^2 = pq$
The mass of sample (m) is proportional to the number of particles (n) drawn, therefore:
$K_s = mR^2$
where R = RSD as a % and $K_s$ (sampling constant) = mass of sample required to reduce the relative sampling standard deviation to 1%.
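Rearranging K_s = mR² gives the sample mass needed for a target RSD. The sampling constant used below is a made-up value for illustration, and `mass_for_rsd` is a hypothetical helper name.

```python
def mass_for_rsd(k_s_grams, rsd_percent):
    """Sample mass (g) needed to reach a given relative sampling std dev (%)."""
    return k_s_grams / rsd_percent**2

K_S = 36.0  # hypothetical sampling constant, in grams
for rsd in (1.0, 2.0, 3.0):
    print(f"R = {rsd:.0f} %  ->  m = {mass_for_rsd(K_S, rsd):.1f} g")
```

By definition the mass at R = 1% equals K_s itself, and doubling the tolerated RSD cuts the required mass by a factor of four.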

How many samples/replicates to analyse?
Rearranging Student’s t equation, with e = µ − $\bar{x}$:
$e = \frac{t s_s}{\sqrt{n}}$
Required number of replicate analyses:
$n = \frac{t^2 s_s^2}{e^2}$
where
µ = true population mean
$\bar{x}$ = measured mean
n = number of samples needed
$s_s^2$ = variance of the sampling operation
e = sought-for uncertainty
Since the number of degrees of freedom is not known at this stage, the value of t for n → ∞ is used to estimate n. The process is then repeated a few times until a constant value for n is found.

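Example: In analysing a lot with random sample variation, there is a sampling deviation of 5%. Assuming negligible error in the analytical procedure, how many samples must be analysed to give 90% confidence that the error in the mean is within 4% of the true value?
For 90% confidence: t = 1.645 (for n → ∞) as the first estimate, giving $n = \frac{(1.645)^2 (0.05)^2}{(0.04)^2} \approx 4$. Repeating with t for n − 1 degrees of freedom until n is constant gives:
n = 6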
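The iteration described above can be sketched as a loop. The t values are hard-coded from a standard Student’s t table at 90% confidence, rounding to the nearest whole number of samples is an assumption of this sketch, and `required_replicates` is a hypothetical helper name.

```python
# Student's t at 90% confidence, keyed by degrees of freedom (standard table values).
T90 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015,
       6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833, 10: 1.812}
T90_INF = 1.645  # value for n -> infinity

def required_replicates(s_s, e, max_iter=20):
    """Iterate n = t^2 * s_s^2 / e^2 until n stops changing."""
    t = T90_INF
    n_previous = None
    for _ in range(max_iter):
        n = max(2, round(t**2 * s_s**2 / e**2))
        if n == n_previous:
            return n
        n_previous = n
        t = T90.get(n - 1, T90_INF)  # t for n - 1 degrees of freedom
    return n_previous

n_needed = required_replicates(s_s=0.05, e=0.04)
print("samples needed:", n_needed)
```

With the example values the estimate moves through 4, 9, 5, 7 and 6 before settling at n = 6, which matches the slide.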

SAMPLE STORAGE
Not only are the sampling and sample preparation important, but the sample storage is also critical.
The composition of the sample may change with time due to, for example, the following:
- reaction with air
- reaction with light
- absorption of moisture
- interaction with the container
Glass is a notorious ion exchanger which can alter the concentration of trace ions in solution. Thus plastic (especially Teflon) containers are frequently used.
Ensure all containers are clean to prevent contamination.

EXAMPLE: (for you to do)
Consider a random mixture containing 4.00 g of Na2CO3 (ρ = 2.532 g/mL) and 96.00 g of K2CO3 (ρ = 2.428 g/mL) with an approximated uniform spherical radius of 0.075 mm. How many particles of Na2CO3 are in the mixture? And K2CO3?
Na2CO3: 4.00 g at 2.532 g/mL: $V = \frac{m}{\rho}$ = 1.58 mL = 1.58 cm³
K2CO3: 96.00 g at 2.428 g/mL: $V = \frac{m}{\rho}$ = 39.54 mL = 39.54 cm³

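V(Na2CO3) = 1.58 cm³
V(K2CO3) = 39.54 cm³
Particles: r = 0.075 mm = 0.0075 cm, so the volume of one particle is $\frac{4}{3}\pi r^3$ = 1.77×10⁻⁶ cm³
Number of particles = total volume / volume of one particle:
n(Na2CO3) = 8.94×10⁵ particles
n(K2CO3) = 2.24×10⁷ particles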

EXAMPLE: Consider a random mixture containing 4.00 g of Na2CO3 (ρ = 2.532 g/mL) and 96.00 g of K2CO3 (ρ = 2.428 g/mL) with an approximated uniform spherical radius of 0.075 mm. What is the expected number of particles in 0.100 g of the mixture?
8.94×10² particles of Na2CO3 and 2.24×10⁴ particles of K2CO3 in a 0.1 g sample
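The particle counts can be reproduced with a short calculation. This sketch assumes ideal, equal-sized spheres as stated in the example; `particle_count` is a hypothetical helper name.

```python
import math

def particle_count(mass_g, density_g_per_ml, radius_cm):
    """Number of equal spherical particles in a given mass of material."""
    total_volume = mass_g / density_g_per_ml           # cm^3 (1 mL = 1 cm^3)
    particle_volume = 4 / 3 * math.pi * radius_cm**3   # cm^3 per particle
    return total_volume / particle_volume

RADIUS_CM = 0.0075  # 0.075 mm
n_na2co3 = particle_count(4.00, 2.532, RADIUS_CM)
n_k2co3 = particle_count(96.00, 2.428, RADIUS_CM)

# A 0.100 g sample is 1/1000 of the 100 g mixture.
sample_fraction = 0.100 / 100.0
print(f"whole mixture: {n_na2co3:.3g} Na2CO3, {n_k2co3:.3g} K2CO3")
print(f"0.100 g sample: {n_na2co3 * sample_fraction:.3g} Na2CO3, "
      f"{n_k2co3 * sample_fraction:.3g} K2CO3")
```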

EXAMPLE: Calculate the relative standard deviation in the number of particles for each type in the 0.100 g sample of the mixture.
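One possible way to work this out is sketched below, applying σ = √(npq) to the particle counts quoted for the 0.100 g sample. Dividing σ by the count of each particle type to get its relative standard deviation is an assumption of this sketch, and the variable names are illustrative.

```python
import math

# Expected particle counts in the 0.100 g sample, as quoted in the example.
n_na2co3 = 8.94e2
n_k2co3 = 2.24e4

n_total = n_na2co3 + n_k2co3
p = n_na2co3 / n_total   # probability of drawing a Na2CO3 particle
q = n_k2co3 / n_total    # probability of drawing a K2CO3 particle (= 1 - p)

sigma = math.sqrt(n_total * p * q)  # std dev in the number of either type

rsd_na2co3 = 100 * sigma / n_na2co3
rsd_k2co3 = 100 * sigma / n_k2co3
print(f"sigma = {sigma:.1f} particles")
print(f"RSD Na2CO3 = {rsd_na2co3:.2f} %, RSD K2CO3 = {rsd_k2co3:.3f} %")
```

The same absolute standard deviation applies to both particle types, so the minor component (Na2CO3) carries the much larger relative uncertainty.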