The Role of the Tukey algorithm in validation procedures for prices data in a consumer price index: the UK experience on this and more general aspects.

Presentation transcript:

The Role of the Tukey algorithm in validation procedures for prices data in a consumer price index: the UK experience on this and more general aspects of data editing
David Fenwick

Editing of CPI prices data: Central Premise
1. Editing is a non-trivial issue. It can have a systematic numerical impact on measured inflation, which can lead to bias
2. Automated editing can improve the quality of the index and increase operational efficiency
3. Both of the above statements apply to the Tukey algorithm
Auditing and editing need to be in “real time” because shop prices can change quickly

Some background on ONS editing procedures – salient points
Most prices are collected by handheld computer with the facility for interactive editing in real time
Two distinct algorithms operate at HQ to identify outliers amongst price quotes, often operating in parallel
–Scrutiny, two tests by reference to “average” price of same/similar items (2k from 100k)
  The minimum-maximum test
  The percentage change test
–Tukey, in essence a more sophisticated version of scrutiny. This is applied to price quotes not identified by scrutiny as outliers (4k from 100k)
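The two “scrutiny” tests can be pictured with a minimal sketch. The slides do not give the reference prices or tolerances ONS uses, so everything below is an assumption made for illustration: the function name, the comparison of the quote against an “average” price for the min-max test, the comparison against the previous price for the percentage change test, and all three threshold values.

```python
def scrutiny_flags(price, previous_price, average_price,
                   min_factor=0.5, max_factor=2.0, max_pct_change=50.0):
    """Return the names of the scrutiny tests that a price quote fails.

    All thresholds are hypothetical placeholders, not ONS parameters.
    """
    flags = []

    # Minimum-maximum test: is the quote inside a band around the
    # "average" price of the same/similar items?
    lowest_allowed = min_factor * average_price
    highest_allowed = max_factor * average_price
    if not (lowest_allowed <= price <= highest_allowed):
        flags.append("min-max")

    # Percentage change test: has the quote moved too far from the
    # previous price (assumed reference point)?
    pct_change = abs(price - previous_price) / previous_price * 100.0
    if pct_change > max_pct_change:
        flags.append("percentage-change")

    return flags


if __name__ == "__main__":
    # A quote close to both its previous price and the average passes both tests.
    print(scrutiny_flags(price=10.0, previous_price=9.5, average_price=10.0))
```

A quote that fails either test would be treated as a scrutiny outlier and, per the slides, would not be passed on to Tukey.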

Some background on ONS editing procedures – salient points
At the time of the study the presumption was that an outlier was incorrect, and therefore declared invalid, unless positively verified by reference to metadata from the price collector or by checking the quote with the shopkeeper
–All “scrutiny” & half of Tukey outliers were subject to positive verification (most were correct).
–The rest, that is, those not positively verified, were assumed incorrect.
–The number explicitly accepted after verification was 100 times the number explicitly rejected (indicates imbalance & potential bias)

Some background on ONS editing procedures – salient points
Automated filtering mechanisms avoid manual examination of large numbers of prices over a short time
But automated filtering mechanisms need to be supported by “well-informed” manual editing
–Especially when there can be unpredictable variations in prices (e.g. seasonal goods, sales)
Two main issues for ONS & motivation for the research
–The efficiency of its editing procedures
–The impact on the accuracy of the RPI/CPI.

The Tukey Algorithm
Price quotes are ordered by the corresponding price ratios
Highest and lowest 5 per cent are flagged for further investigation and excluded
–Price ratios equal to 1 are excluded (i.e. no price change)
Arithmetic mean (AM) of remaining price ratios is used to divide them into a lower and an upper set, and the lower/upper trimmed means (the means of the two sets) are calculated
The upper and lower Tukey limits used to flag those price observations which warrant attention are then calculated as follows:
–TU = AM + 2.5 (AMU – AM)
–TL = AM – 2.5 (AM – AML)
where AML is the lower trimmed mean and AMU is the upper trimmed mean.
Tukey maximises use of immediate price history
–Can be used for monthly or annual change
–AML & AMU can be regularly recalculated
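The steps on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the ONS implementation: the function names are hypothetical, unchanged prices (ratio exactly 1) are dropped before the 5 per cent trim, the number trimmed from each tail is rounded down, and ratios exactly equal to AM are left out of both the lower and upper sets.

```python
from statistics import mean


def tukey_limits(price_ratios, trim_percent=5, multiplier=2.5):
    """Return (TL, TU), the lower and upper Tukey limits for a set of price ratios."""
    # Exclude unchanged prices (ratio of 1) and order the rest.
    changed = sorted(r for r in price_ratios if r != 1.0)

    # Exclude the highest and lowest 5 per cent (rounded down, an assumption).
    n_trim = len(changed) * trim_percent // 100
    core = changed[n_trim:len(changed) - n_trim]
    if not core:
        raise ValueError("no price ratios left after trimming")

    # AM splits the remaining ratios into a lower and an upper set.
    am = mean(core)
    lower = [r for r in core if r < am]
    upper = [r for r in core if r > am]
    if not lower or not upper:
        raise ValueError("too few distinct price ratios to form both sets")

    aml = mean(lower)  # lower trimmed mean
    amu = mean(upper)  # upper trimmed mean

    tl = am - multiplier * (am - aml)
    tu = am + multiplier * (amu - am)
    return tl, tu


def flag_outliers(price_ratios, **kwargs):
    """Return the price ratios that fall outside the Tukey limits."""
    tl, tu = tukey_limits(price_ratios, **kwargs)
    return [r for r in price_ratios if r < tl or r > tu]


if __name__ == "__main__":
    ratios = [0.2] + [0.9] * 9 + [1.1] * 9 + [3.0] + [1.0] * 5
    print(tukey_limits(ratios))
    print(flag_outliers(ratios))
```

Exposing the multiplier and trim percentage as arguments reflects the point made later in the slides that the explicit parameters can be adjusted, for example to target only extreme outliers.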

Issue 1: efficiency of editing procedures
Considerable overlap between
–Interactive editing in field
  Min max test, metadata (S=sale; R=recovery from sale), logistical checks (R must follow S)
–“Scrutiny”
  Not real time & less sophisticated than Tukey but quick to identify extreme outliers & can be run without a large body of prices data
–Tukey
  Sophisticated but not interactive
  More efficient than “scrutiny”, which identifies many “outliers” that are valid price quotes
  “Scrutiny” reduces efficiency of Tukey by prior exclusion of “scrutiny” outliers from Tukey
–“Scrutiny” retained because quick & simple, but the presumption of an outlier being “wrong until proven right” challenged
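The field check that “R must follow S” can be sketched as a pass over one item's monthly indicator codes. The slides give only the rule itself, so the function name, the representation of the history as a list of strings, and the treatment of other codes (they leave the sale state unchanged) are assumptions for illustration.

```python
def recoveries_without_sale(codes):
    """Return the positions of "R" codes that are not preceded by an open "S".

    `codes` is one item's monthly indicator history, e.g. ["", "S", "", "R"].
    """
    violations = []
    on_sale = False
    for month, code in enumerate(codes):
        if code == "S":
            on_sale = True
        elif code == "R":
            if not on_sale:
                # Recovery recorded with no preceding sale: flag for the collector.
                violations.append(month)
            on_sale = False
        # Any other code leaves the sale state unchanged (assumption).
    return violations


if __name__ == "__main__":
    print(recoveries_without_sale(["", "S", "", "R"]))
```

Running such a check interactively on the handheld computer lets the price collector correct the code while still in the shop.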

Issue 2: impact on index & potential bias
Set of outliers may not necessarily adequately overlap set of “incorrect” prices
–Initial investigations showed number of Tukey (& “Scrutiny”) outliers which were incorrect was small.
Better to focus on
–Extreme outliers
–Prices that have not changed for many months (which editing ignores)
Study undertaken of clothing sub-index of RPI

Clothing: underlying analysis
Were clothes prices really lower than 15 years ago?

Clothing: underlying analysis

Deeper sales & shallower recoveries?
–But not uniform across clothing

The research questions
Is the data editor at HQ more reliable than the price collector?
–Automated editing automatically over-riding price quotes for replacement items from “non-comparable” to “comparable” where small price difference
–“Scrutiny” can overturn in a few seconds a considered decision by price collector
  “Odd” results: only 10% as many “recoveries” as sales
Correct categorisation of replacement item as comparable or non-comparable is important.
–Tukey doesn’t test this
–“Scrutiny”
An examination of original & final indicator codes shows over half of “non-comparable” replacements were reclassified at HQ (a half to “comparable”) & 10% of “C”s were changed to “N”s
Analysis of price relatives indicates the latter might have an undue influence on “Scrutiny” with a cumulative effect on index

Research question: what has been the numerical impact of edited changes in “indicator” codes?
ONS records original and final (after all editing) “indicator” codes
–Original indicator codes gave higher index

Research question: what has been the numerical impact of edited changes in “indicator” codes?

Research question: is the price collector’s judgement better than HQ’s?
Auditor back check of indicator codes
–Collectors’ decisions generally accurate & better than HQ’s

Further thoughts on Tukey
Tukey is robust
–The implicit thresholds defined by Tukey are not subject to substantial revision on receipt of new data
Tukey could be used more effectively
–Explicit parameters can be adjusted to identify extreme outliers
–Double application
  First apply Tukey to initial prices data and then to full set
  Validate but suspend validation decisions from first application until confirmed by second
Tukey cannot be applied to centrally collected prices or centrally calculated indices
–Too few quotes
–Rely on scrutiny
–But “centrals” represent the biggest inherent risk to the index (small number of quotes, large weight)

Conclusions
Data editing “filtering” mechanisms
–Can increase efficiency
–But can exclude “correct” prices & do not guarantee a better index
Tukey can be undermined
–By other editing procedures
–By the presumption that an outlier is an invalid price unless subsequently positively validated by editing
–By automatic re-coding of non-comparable “replacements” as comparable “replacements”, e.g. where the price difference is small and there is no reference to the indicator code
The judgement of the price collector is generally to be preferred
Centrally collected prices and centrally calculated indices account for 40% of the basket and Tukey doesn’t help these

END OF PRESENTATION