Presentation is loading. Please wait.

Presentation is loading. Please wait.

Quality Assurance & Quality Control University of New Mexico Kristin Vanderbilt William Michener James Brunt Troy Maddux University of South Carolina Don.

Similar presentations


Presentation on theme: "Quality Assurance & Quality Control University of New Mexico Kristin Vanderbilt William Michener James Brunt Troy Maddux University of South Carolina Don."— Presentation transcript:

1 Quality Assurance & Quality Control University of New Mexico Kristin Vanderbilt William Michener James Brunt Troy Maddux University of South Carolina Don Edwards

2 Instruction Materials Primary References Michener and Brunt (2000) Ecological Data: Design, Management and Processing. Blackwell Science. Edwards (2000) Ch. 4 Brunt (2000) Ch. 2 Michener (2000) Ch. 7 Dux, J.P. 1986. Handbook of Quality Assurance for the Analytical Chemistry Laboratory. Van Nostrand Reinhold Company Mullins, E. 1994. Introduction to Control Charts in the Analytical Laboratory: Tutorial Review. Analyst 119: 369 – 375.

3 Outline: Define QA/QC QC procedures Designing data sheets Data entry using validation rules, filters, lookup tables QA procedures Graphics and Statistics Outlier detection Samples Simple linear regression

4 QA/QC “mechanisms [that] are designed to prevent the introduction of errors into a data set, a process known as data contamination” Brunt 2000

5 Errors (2 types) Commission: Incorrect or inaccurate data in a dataset Can be easy to find Malfunctioning instrumentation Sensor drift Low batteries Damage Animal mischief Data entry errors Omission Difficult or impossible to find Inadequate documentation of data values, sampling methods, anomalies in field, human errors

6 Quality Control “mechanisms that are applied in advance, with a priori knowledge to ‘control’ data quality during the data acquisition process” Brunt 2000

7 Quality Assurance “mechanisms [that] can be applied after the data have been collected, entered in a computer and analyzed to identify errors of omission and commission” graphics statistics Brunt 2000

8 QA/QC Activities Defining and enforcing standards for formats, codes, measurement units and metadata. Checking for unusual or unreasonable patterns in data. Checking for comparability of values between data sets. Brunt 2000

9 Outline: Define QA/QC QC procedures Designing data sheets Data entry using validation rules, filters, lookup tables QA procedures Graphics and Statistics Outlier detection Samples Simple linear regression

10 Flowering Plant Phenology – Data Entry Form Design Four sites, each with 3 transects Each species will have phenological class recorded

11 PlantLife Stage ______________ _______________ What’s wrong with this data sheet? Data Collection Form Development:

12 PlantLife Stage ardiP/G V B FL FR M S D NP arpuP/G V B FL FR M S D NP atcaP/G V B FL FR M S D NP bamuP/G V B FL FR M S D NP zigrP/G V B FL FR M S D NP P/G V B FL FR M S D NP PHENOLOGY DATA SHEET Collectors:_________________________________ Date:___________________ Time:_________ Location: black butte, deep well, five points, goat draw Transect: 1 2 3 Notes: _________________________________________ P/G = perennating or germinatingM = dispersing V = vegetatingS = senescing B = buddingD = dead FL = floweringNP = not present FR = fruiting

13 Outline: Define QA/QC QC procedures Designing data sheets Data entry using validation rules, filters, lookup tables QA procedures Graphics and Statistics Outlier detection Samples Simple linear regression

14

15 Other methods for preventing data contamination Double-keying of data by independent data entry technicians followed by computer verification for agreement Randomly check 10% of data to calculate frequency of errors Use text-to-speech program to read data back Filters for “illegal” data Computer/database programs Legal range of values Sanity checks Edwards 2000

16 Table of Possible Values and Ranges Raw Data File Illegal Data Filter Report of Probable Errors Flow of Information in Filtering “Illegal” Data Edwards 2000

17 Illegal-data filter in SAS Data Checkum; Set all; Message=repeat(“ ”,39); If Y1 1 then do; message=“Y1 is not on the interval [0,1]”; output; end; If Y3>Y4 then do; message=“Y3 is larger than Y4”; output; end; …….(add as many checks as desired) If message NE repeat(“ ”, 39); Keep ID message; Proc Print Data=Checkum;

18 Outline: Define QA/QC QC procedures Designing data sheets Data entry using validation rules, filters, lookup tables QA procedures Graphics and Statistics Outlier detection Samples Simple linear regression

19 Identifying Sensor Errors: Comparison of data from three Met stations, Sevilleta LTER

20 Identification of Sensor Errors: Comparison of data from three Met stations, Sevilleta LTER

21 QA/QC in the Lab: Using Control Charts

22 Laboratory quality control using statistical process control charts: Determine whether analytical system is “in control” by examining Mean Variability (range)

23 All control charts have three basic components: a centerline, usually the mathematical average of all the samples plotted. upper and lower statistical control limits that define the constraints of common cause variations. performance data plotted over time.

24 Time Mean LCL UCL X-Bar Control Chart

25 Constructing an X-Bar control chart Each point represents a check standard run with each group of 20 samples, for example UCL = mean + 3*standard deviation

26 Things to look for in a control chart: The point of making control charts is to look at variation, seeking patterns or statistically unusual values. Look for: 1 data point falling outside the control limits 6 or more points in a row steadily increasing or decreasing 8 or more points in a row on one side of the centerline 14 or more points alternating up and down

27 Linear trend

28 Outline: Define QA/QC QC procedures Designing data sheets Data entry using validation rules, filters, lookup tables QA procedures Graphics and Statistics Outlier detection Samples Simple linear regression

29 Outliers An outlier is “an unusually extreme value for a variable, given the statistical model in use” The goal of QA is NOT to eliminate outliers! Rather, we wish to detect unusually extreme values. Edwards 2000

30 Outlier Detection “the detection of outliers is an intermediate step in the elimination of [data] contamination” Attempt to determine if contamination is responsible and, if so, flag the contaminated value. If not, formally analyse with and without “outlier(s)” and see if results differ. Edwards 2000

31 Methods for Detecting Outliers Graphics Scatter plots Box plots Histograms Normal probability plots Formal statistical methods Grubbs’ test Edwards 2000

32 X-Y scatter plots of gopher tortoise morphometrics Michener 2000

33 Example of exploratory data analysis using SAS Insight® Michener 2000

34 Box Plot Interpretation Upper quartile Median Lower quartile Upper adjacent value Pts. > upper adjacent value Lower adjacent value Pts. < lower adjacent value Inter- quartile range

35 Box Plot Interpretation Inter- quartile range IQR = Q(75%) – Q(25%) Upper adjacent value = largest observation < (Q(75%) + (1.5 X IQR)) Lower adjacent value = smallest observation > (Q(25%) - (1.5 X IQR))

36 Box Plots Depicting Statistical Distribution of Soil Temperature

37 Statistical tests for outliers assume that the data are normally distributed…. CHECK THIS ASSUMPTION!

38 Normal density and Cumulative Distribution Functions Edwards 2000

39 Normal Plot of 30 Observations from a Normal Distribution Edwards 2000

40 Normal Plots from Non-normally Distributed Data

41 Grubbs’ Text is a Statistical Test to determine if a point is an outlier.

42 Grubbs’ test for outlier detection: T n = (Y n – Y bar )/S where Y n is the possible outlier, Y bar is the mean of the sample, and S is the standard deviation of the sample Contamination exists if T n is greater than T.01[n]

43 Example of Grubbs’ test for outliers: rainfall in acre-feet from seeded clouds (Simpson et al. 1975) 4.17.717.531.4 32.740.692.4115.3 118.3119.0129.6198.6 200.7242.5255.0274.7 274.7302.8334.1430.0 489.1703.4978.0 1656.01697.82745.6 T 26 = 3.539 > 3.029; Contaminated Edwards 2000 But Grubbs’ test is sensitive to non-normality……

44 Edwards 2000 Checking Assumptions on Rainfall Data Contaminated Normally distributed

45 Simple Linear Regression: check for model-based…… Outliers Influential (leverage) points

46 Influential points in simple linear regression: A “leverage” point is a point with an unusual regressor value that has more weight in determining regression coefficients than the other data values. Edmonds 2000

47 Edwards 2000 Influential Data Points in a Simple Linear Regression

48 Edwards 2000 Influential Data Points in a Simple Linear Regression

49 Edwards 2000 Influential Data Points in a Simple Linear Regression

50 Edwards 2000 Influential Data Points in a Simple Linear Regression

51 Edwards 2000 Brain weight vs. body weight, 63 species of terrestrial mammals Leverage pts. “Outliers”

52 Edwards 2000 Logged brain weight vs. logged body weight “Outliers”

53 Outliers in simple linear regression Observation 62

54 Outliers: identify using studentized residuals Contamination may exist if |r i | > t  /2, n-3  = 0.01

55 Simple linear regression: Outlier identification n = 86 t  /2,83 = 1.98

56 Simple linear regression: detecting leverage points h i = (1/n) + (x i – x) 2 /(n-1)S x 2 A point is a leverage point if h i > 4/n, where n is the number of points used to fit the regression

57 Regression with leverage point: Soil nitrate vs. soil moisture

58 Regression without leverage point Observation 46

59 Output from SAS: Leverage points n = 336 h i cutoff value: 4/336=0.012

60 Process from raw data to verified data to validated data (reviewed by a qualified scientist) is called data maturity, and implies increasing confidence in the quality of the data through time

61 Data Quality (Confidence in the Data) Data Maturity Raw Data Verified Data Validated Data

62 Questions?


Download ppt "Quality Assurance & Quality Control University of New Mexico Kristin Vanderbilt William Michener James Brunt Troy Maddux University of South Carolina Don."

Similar presentations


Ads by Google