AACB ASM 2003 THE POWER OF ERROR DETECTION OF WESTGARD MULTI-RULES: A RE-EVALUATION Graham Jones Department of Chemical Pathology St Vincent’s Hospital, Sydney
Background
Westgard multi-rules are claimed to increase the power of error detection of laboratory QC procedures. Power Function Charts can quantify the ability of these rules to detect changes in assay performance. Examples of Power Function Charts are available on the Westgard QC website. In this poster I re-evaluate the power of error detection of QC rules that require data from more than one QC run (multi-run rules).
Hypothesis
The correct model for assessing the power of error detection of multi-run QC rules should credit these rules with a benefit only if the error has not already been detected in the QC runs required to gather data for those rules. This hypothesis was modelled and compared with data on the Westgard website.
Nomenclature
Single-run rules: all data are contained in a single QC run.
- For n=2: 1 3s and 2 2s
- For n=4: 1 3s, 2 2s and 4 1s
Multi-run rules: require data from more than one QC run.
- For n=2: 4 1s and 10 x
- For n=4: 8 x
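The rules named above can be written as small predicates on control results expressed as z-scores (deviation from the target mean in multiples of SD). This is a minimal sketch for the n=2 case, assuming the commonly published rule definitions (1 3s: one result beyond 3 SD; 2 2s: two consecutive results beyond the same 2 SD limit; 4 1s: four consecutive results beyond the same 1 SD limit; 10 x: ten consecutive results on the same side of the mean) and assuming the multi-run rules are applied across both control materials. The function names are hypothetical and are not taken from the poster.

```python
from typing import Sequence


def all_beyond(z: Sequence[float], limit: float) -> bool:
    """True if every value is above +limit, or every value is below -limit."""
    return all(v > limit for v in z) or all(v < -limit for v in z)


def rule_1_3s(run: Sequence[float]) -> bool:
    """Single-run rule: any one result in the run beyond +/-3 SD."""
    return any(abs(v) > 3.0 for v in run)


def rule_2_2s(run: Sequence[float]) -> bool:
    """Single-run rule (n=2): both results beyond the same 2 SD limit."""
    return len(run) >= 2 and all_beyond(run[-2:], 2.0)


def rule_4_1s(history: Sequence[float]) -> bool:
    """Multi-run rule (n=2): last 4 results beyond the same 1 SD limit."""
    return len(history) >= 4 and all_beyond(history[-4:], 1.0)


def rule_10_x(history: Sequence[float]) -> bool:
    """Multi-run rule (n=2): last 10 results on the same side of the mean."""
    return len(history) >= 10 and all_beyond(history[-10:], 0.0)


# Example: two consecutive runs of two controls each (illustrative values).
previous_run = [1.4, 1.2]
current_run = [1.6, 1.1]
print(rule_1_3s(current_run))                 # False
print(rule_2_2s(current_run))                 # False
print(rule_4_1s(previous_run + current_run))  # True
```

With n=2, 4 1s needs two runs and 10 x needs five runs before it can fire, which is why both count as multi-run rules here.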
Methods
Power Function Charts were produced using a Microsoft Excel spreadsheet. QC results were simulated using a random number generator with a normal distribution. Changes in bias were modelled by adding various constants to the output. QC rules were evaluated by the frequency with which they were triggered at each change in bias. Westgard multi-rules with n=2 were evaluated for bias detection: 1 3s/2 2s/4 1s/10 x. Changes in random error were not modelled.
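The simulation described above can be sketched in Python rather than Excel. This is an illustrative reconstruction under stated assumptions, not the spreadsheet used for the poster: results are drawn as z-scores from a standard normal distribution, the bias shift is a constant added to every result, five runs of two controls are generated so that the 10 x window is full, and the rules are evaluated at the last run with the shift present throughout. The names `rejects` and `power`, the trial count and the seed are all hypothetical.

```python
import random

N_PER_RUN = 2  # controls per QC run (n=2)
N_RUNS = 5     # runs needed to fill the 10 x window when n=2


def all_beyond(z, limit):
    """True if every value is above +limit, or every value is below -limit."""
    return all(v > limit for v in z) or all(v < -limit for v in z)


def rejects(history):
    """Apply 1 3s / 2 2s / 4 1s / 10 x at the latest run.

    `history` is a flat list of z-scores, oldest first, whose last
    N_PER_RUN entries are the current run.
    """
    current = history[-N_PER_RUN:]
    if any(abs(v) > 3 for v in current):                     # 1 3s
        return True
    if all_beyond(current, 2):                               # 2 2s
        return True
    if len(history) >= 4 and all_beyond(history[-4:], 1):    # 4 1s
        return True
    if len(history) >= 10 and all_beyond(history[-10:], 0):  # 10 x
        return True
    return False


def power(shift, trials=5000, seed=1):
    """Estimate the probability of rejection for a bias shift given in SD."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Normally distributed results with a constant bias added.
        history = [rng.gauss(0, 1) + shift for _ in range(N_PER_RUN * N_RUNS)]
        hits += rejects(history)
    return hits / trials


if __name__ == "__main__":
    for shift in (0.0, 1.0, 2.0, 3.0, 4.0):
        print(f"shift {shift:.1f} SD: P(reject) = {power(shift):.3f}")
```

Plotting `power(shift)` against `shift` gives a power function curve for the combined rules. Because the multi-run rules are counted whether or not an earlier run would already have been rejected, this sketch corresponds to the unadjusted model.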
Hypothesis - Graphical Display
[Figure: QC chart with mean, +/-2SD and +/-3SD limits plotted against QC run, with the change in assay bias marked.]
- Within-run rules evaluate performance (1 3s/2 2s)
- Multi-run rule evaluates performance (10 x across both materials); it only adds benefit if the shift was NOT detected by QC events 1-4
This display uses 10 x as an example of a multi-run rule.
Results
[Figure, panels A and B. y-axis: Probability for Rejection; x-axis: Shift in Bias (multiples of SD).]
Graph A - Original data from Westgard website.
Graph B - Model of data from Westgard website. Multi-run rules fire even if shift would have been detected previously.
[Figure, panels C and D. y-axis: Probability for Rejection; x-axis: Shift in Bias (multiples of SD). Annotations mark the shifts detected with 90% certainty from the full multi-rules and from the within-run rules.]
Graph C - Westgard data adjusted for hypothesis. Multi-run rules fire only if shift would NOT have been detected previously.
Graph D - Model of individual rules from Graph C. Multi-run rules fire only if shift would NOT have been detected previously.
Results
A Power Function Chart from the Westgard website showing multi-rules for bias detection with n=2 is shown in Graph A. The change in bias which Westgard claims the full multi-rules can detect with 90% certainty is about 2.0 times the SD of the assay (Graph A). My model of the Westgard data, with multi-run rules triggered even if the change in bias would have been previously detected, agrees well with the website data (Graph B). In the Westgard model the multi-run rules (10 x and 4 1s) enhance the error detection over the within-run rules (Graphs A and B).
The model excluding multi-run rules when a shift would have been detected previously is shown in Graph C. When these previously detected shifts are removed from the data, the smallest change in bias that can be detected with 90% certainty rises to about 3.3 times the assay SD (Graph C), so the power of error detection is reduced. With this model the multi-run rules do not add to the within-run rules for confident error detection. When the individual rules are plotted (Graph D), it can be seen that the multi-run rules never add to the error detection with 90% certainty. The multi-run rules can be considered warning rules.
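The difference between the two models can be illustrated with a short simulation that scores the same simulated runs three ways: within-run rules only, multi-run rules counted unconditionally, and multi-run rules counted only when no earlier run contributing data to them was rejected. This is a sketch under assumptions, not the poster's spreadsheet: "previously detected" is taken to mean that a within-run rule (1 3s or 2 2s) fired in an earlier run inside the rule's window, n=2, and the function names, trial count and seed are hypothetical.

```python
import random

N_PER_RUN = 2  # controls per QC run (n=2)
N_RUNS = 5     # runs spanned by the 10 x rule when n=2


def all_beyond(z, limit):
    """True if every value is above +limit, or every value is below -limit."""
    return all(v > limit for v in z) or all(v < -limit for v in z)


def within_run(run):
    """Within-run rejection: 1 3s or 2 2s on a single run."""
    return any(abs(v) > 3 for v in run) or all_beyond(run, 2)


def simulate(shift, trials=5000, seed=1):
    """Return rejection probabilities at the latest run under three scorings."""
    rng = random.Random(seed)
    counts = {"within": 0, "unconditional": 0, "conditional": 0}
    for _ in range(trials):
        runs = [[rng.gauss(0, 1) + shift for _ in range(N_PER_RUN)]
                for _ in range(N_RUNS)]
        flat = [v for run in runs for v in run]

        w = within_run(runs[-1])           # 1 3s / 2 2s on the current run
        r41 = all_beyond(flat[-4:], 1)     # 4 1s over the last two runs
        r10 = all_beyond(flat[-10:], 0)    # 10 x over all five runs

        # Assumption: a shift counts as already detected if a within-run
        # rule fired in an earlier run that supplies data to the rule.
        prior_41 = within_run(runs[-2])
        prior_10 = any(within_run(run) for run in runs[:-1])

        counts["within"] += w
        counts["unconditional"] += w or r41 or r10
        counts["conditional"] += (w or (r41 and not prior_41)
                                  or (r10 and not prior_10))
    return {name: c / trials for name, c in counts.items()}


if __name__ == "__main__":
    for shift in (0.0, 1.0, 2.0, 3.0, 4.0):
        print(shift, simulate(shift))
```

By construction the conditional estimate lies between the within-run and unconditional estimates for every shift; how close it sits to the within-run curve near the 90% level is the quantity of interest when comparing the two models.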
Conclusion
The multi-run rules, as described on the Westgard website, give a falsely low estimate of the change in bias which can be detected with 90% certainty. The 10 x and 4 1s rules add little to the overall error detection at the 90% confidence level with 2 QC samples per run. Multi-run rules are similarly non-contributory with 4 QC samples per run (data not shown).