Published by Barry Hudson. Modified over 9 years ago.
1
Study of Editing and Imputation Practices at Statistics Finland
Janika Konnu and Pauli Ollila, Statistics Finland
Q2010: Editing session, Wednesday 5th of May, 11.00-12.30
2
Editing Project of Statistics Finland
Development project of two years.
Targets: to provide good E&I practices, help make statistics more effective, improve quality, reduce workload, and save costs.
Project components:
- INTERNAL E&I STUDY OF STATFI
- EXTERNAL E&I STUDY
- DEVELOPMENTAL WORK FOR THE NEEDS OF STATFI
- INFORMATION AND EDUCATION
5 May 2010, Janika Konnu, Pauli Ollila
3
Internal E&I Study of StatFi
Forms the basis for the work of the project.
Describes the current E&I situation at StatFi.
Reveals points where developmental resources should be allocated in later phases of the project.
INTERNAL E&I STUDY OF STATFI
- SURVEY OF E&I PRACTICES AT STATFI
- DETAILED STUDIES OF E&I IN SOME STATISTICS
- OTHER STUDIES (e.g. auditing reports)
Part 1: Pauli Ollila
Part 2: Janika Konnu
4
Survey of E&I Practices at StatFi
SURVEY OF E&I PRACTICES AT STATFI
Conducted in January 2010 using a web questionnaire.
Directed to all statistics of StatFi, providing information from all relevant statistics (exceptions: statistics that had been finished, were about to be finished, were in transition, etc.).
Equivalence = one response also covers one or more other statistics.

| Statistics department | Responses | Equivalences | Statistics in all |
|---|---|---|---|
| Population Statistics | 34 | 17 | 51 |
| Social Statistics | 18 | 0 | 18 |
| Prices and Wages | 17 | 7 | 24 |
| Economic Statistics | 20 | 11 | 31 |
| Business Trends | 20 | 4 | 24 |
| Business Structures | 25 | 12 | 37 |
| ALL | 134 | 51 | 185 |
5
Topics of E&I Survey
SURVEY OF E&I PRACTICES AT STATFI
The survey tried to cover all important aspects of editing and imputation.
The set of questions was reviewed and tested by E&I and survey experts together with subject-matter staff.
The structure allowed free-text comments on every page. This proved to be a very valuable asset.
Topics:
- SURVEYS, REGISTERS, SOURCE DATA
- DATA COLLECTION METHODS
- PRELIMINARY OPERATIONS
- ERROR RECOGNITION PRACTICES
- MISSING VALUE PRINCIPLES
- ERROR CORRECTION AND IMPUTATION
- REPORTING
- DATA ARCHIVING
6
Analysing and Utilising the Results
SURVEY OF E&I PRACTICES AT STATFI
DATA BASE OF PRACTICES IN STATISTICS
- DISTRIBUTIONS OF PRACTICES AT VARIOUS LEVELS
- MAKING “STATISTICS TYPES” BY COMMON PRACTICES
- STUDYING E&I PROCESSES (string of practices, descriptions)
PROVIDES GOOD BASIS FOR THE DEVELOPMENTAL WORK OF EDITING PROJECT
VALUABLE INFORMATION FOR PLANS OF STATISTICS DEPARTMENTS AND OTHER INSTANCES
7
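Example 1: Work time spent for editing and imputation in statistics (%)
DISTRIBUTIONS OF PRACTICES AT VARIOUS LEVELS

| Statistics department | Missing | 0-10 | 11-20 | 21-30 | 31-40 | 41-50 | 51-60 | 61-70 | 71-80 | ALL |
|---|---|---|---|---|---|---|---|---|---|---|
| Population Statistics | 2 | 23 | 7 | 1 | 4 | 4 | 1 | 0 | 9 | 51 |
| Social Statistics | 0 | 9 | 4 | 1 | 2 | 0 | 2 | 0 | 0 | 18 |
| Prices and Wages | 1 | 11 | 3 | 2 | 3 | 1 | 1 | 2 | 0 | 24 |
| Economic Statistics | 8 | 8 | 4 | 2 | 0 | 3 | 3 | 0 | 3 | 31 |
| Business Trends | 0 | 11 | 3 | 2 | 2 | 0 | 1 | 5 | 0 | 24 |
| Business Structures | 2 | 13 | 1 | 5 | 2 | 1 | 0 | 1 | 12 | 37 |
| ALL | 13 | 75 | 22 | 13 | 13 | 9 | 8 | 8 | 24 | 185 |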
8
Example 2: Type of data in making statistics at Statistics Finland
DISTRIBUTIONS OF PRACTICES AT VARIOUS LEVELS
SUR = survey, REG = register, SOU = source data

| Statistics department | SUR | REG | SOU | SUR+REG | SUR+SOU | REG+SOU | SUR+REG+SOU | ALL |
|---|---|---|---|---|---|---|---|---|
| Population Statistics | 0 | 12 | 7 | 4 | 4 | 9 | 15 | 51 |
| Social Statistics | 1 | 2 | 2 | 10 | 1 | 1 | 1 | 18 |
| Prices and Wages | 0 | 1 | 1 | 2 | 12 | 0 | 8 | 24 |
| Economic Statistics | 4 | 0 | 8 | 4 | 4 | 1 | 11 | 32 |
| Business Trends | 0 | 2 | 0 | 10 | 0 | 1 | 9 | 22 |
| Business Structures | 4 | 1 | 2 | 4 | 1 | 5 | 21 | 38 |
| ALL | 9 | 18 | 20 | 34 | 22 | 17 | 65 | 185 |
9
Example 3: Technical editing at the unit level
DISTRIBUTIONS OF PRACTICES AT VARIOUS LEVELS
Statistics with unit-level processing

| Practice | Pop. Stat. (44) | Soc. Stat. (15) | Pric. & Wages (22) | Econ. Stat. (24) | Busin. Trends (22) | Busin. Struct. (32) | ALL (159) |
|---|---|---|---|---|---|---|---|
| Unit-level examination with a computer | 19 | 10 | 17 | 23 | 18 | 29 | 116 |
| Logical checks using a program or otherwise | 37 | 13 | 8 | 21 | 13 | 25 | 117 |
| Defining non-valid variable values | 31 | 12 | 8 | 14 | 11 | 19 | 95 |
| Listing extreme values of variables | 13 | 11 | 9 | 10 | 11 | 24 | 78 |
| Comparing with previous or other values | 34 | 10 | 14 | 22 | 13 | 23 | 116 |
| Ratio of values of two variables or different time points, other functions | 16 | 8 | 5 | 13 | 4 | 19 | 65 |
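The kinds of checks listed in Example 3 can be illustrated with a small sketch. This is only an illustration and not StatFi's implementation: the field names (`region`, `employees`, `wages`, `turnover`, `prev_turnover`), the median/MAD rule for extreme values, and the ratio limits are assumptions made up for the example.

```python
from statistics import median

def technical_edit_flags(records, valid_regions, ratio_limits=(0.5, 2.0), k=3.0):
    """Return {unit id: list of raised check names} for a list of record dicts.

    All field names and thresholds are hypothetical, chosen for illustration.
    """
    turnovers = [r["turnover"] for r in records]
    med = median(turnovers)
    # Median absolute deviation: a robust measure of spread.
    mad = median([abs(t - med) for t in turnovers])

    flags = {}
    for r in records:
        raised = []
        # Non-valid variable value: code outside the allowed code list.
        if r["region"] not in valid_regions:
            raised.append("non_valid_value")
        # Logical check: wages paid although no employees are reported.
        if r["employees"] == 0 and r["wages"] > 0:
            raised.append("logical_check")
        # Extreme value: further than k MADs from the median.
        if mad > 0 and abs(r["turnover"] - med) > k * mad:
            raised.append("extreme_value")
        # Ratio to the previous time point, skipped if no previous value.
        prev = r.get("prev_turnover")
        if prev:
            ratio = r["turnover"] / prev
            if not ratio_limits[0] <= ratio <= ratio_limits[1]:
                raised.append("ratio_to_previous")
        flags[r["id"]] = raised
    return flags
```

Flags only mark units for attention; what happens to a flagged value (recontact, deduction, imputation) is a separate treatment decision.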
10
Example 4: Model editing at the unit level
DISTRIBUTIONS OF PRACTICES AT VARIOUS LEVELS
Statistics with unit-level processing

| Practice | Pop. Stat. (44) | Soc. Stat. (15) | Pric. & Wages (22) | Econ. Stat. (24) | Busin. Trends (22) | Busin. Struct. (32) | ALL (159) |
|---|---|---|---|---|---|---|---|
| Defining how certain each variable is to be right when variables conflict (reliability weight, Fellegi-Holt minimum change principle) | 6 | 3 | 2 | 0 | 6 | 0 | 17 |
| Comparing modelled value and observed value | 0 | 1 | 4 | 8 | 1 | 1 | 15 |
| Modelling the risk that variable values / observations are erroneous (e.g. selective editing) | 1 | 1 | 1 | 0 | 0 | 0 | 3 |
| Finding problematic values by defining the importance of the observation, the so-called sensitivity function (reveals the effect of the observation on the estimate) | 0 | 5 | 12 | 0 | 7 | 6 | 30 |
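The idea of a sensitivity function, which measures the effect of an observation on the estimate, can be sketched as a simple score. The formula below (survey weight times the absolute difference between the reported and an expected value, relative to the estimated total) is one common textbook form and is an assumption here; the slides do not say which function StatFi statistics use. The field names are hypothetical.

```python
def sensitivity_scores(units, total_estimate):
    """Score each unit by its potential effect on an estimated total.

    score = weight * |reported - expected| / total_estimate
    `expected` could come from a model or a previous period (assumption).
    """
    if total_estimate <= 0:
        raise ValueError("total_estimate must be positive")
    return {
        u["id"]: u["weight"] * abs(u["reported"] - u["expected"]) / total_estimate
        for u in units
    }

def units_to_review(scores, threshold):
    """Return unit ids whose score exceeds the threshold, largest score first."""
    selected = [uid for uid, score in scores.items() if score > threshold]
    return sorted(selected, key=lambda uid: scores[uid], reverse=True)
```

Only the units returned by `units_to_review` would go to manual examination; the threshold controls the trade-off between workload and the possible error left in the estimate.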
11
Example 5: Macro editing
DISTRIBUTIONS OF PRACTICES AT VARIOUS LEVELS
Statistics with unit-level processing

| Practice | Pop. Stat. (44) | Soc. Stat. (15) | Pric. & Wages (22) | Econ. Stat. (24) | Busin. Trends (22) | Busin. Struct. (32) | ALL (159) |
|---|---|---|---|---|---|---|---|
| Studying distributions and cross-tabulations | 32 | 15 | 6 | 15 | 6 | 23 | 97 |
| Information from calculating preliminary estimates (e.g. mean, total, correlation, deviation) | 23 | 14 | 10 | 15 | 7 | 26 | 95 |
| Controlling the joint effect of survey weights and exceptional values | 0 | 5 | 4 | 0 | 1 | 5 | 15 |
| Comparing with estimates from previous occasion(s), valid limits for estimates (e.g. time series) | 15 | 11 | 15 | 18 | 10 | 26 | 95 |
| Using graphical methods | 8 | 8 | 5 | 13 | 7 | 15 | 56 |
| Studying aggregated data | 25 | 6 | 19 | 19 | 17 | 28 | 114 |
| Comparing with other possible data | 28 | 10 | 8 | 18 | 7 | 27 | 98 |
12
Example 6: Treatment types (not imputation)
DISTRIBUTIONS OF PRACTICES AT VARIOUS LEVELS
Statistics with unit-level processing

| Practice | Pop. Stat. (44) | Soc. Stat. (15) | Pric. & Wages (22) | Econ. Stat. (24) | Busin. Trends (22) | Busin. Struct. (32) | ALL (159) |
|---|---|---|---|---|---|---|---|
| Contacting the respondent and asking for the value, or getting it from the paper questionnaire of the postal enquiry | 27 | 5 | 17 | 20 | 16 | 30 | 115 |
| Fetching the previous value (cold deck) | 6 | 2 | 13 | 11 | 8 | 20 | 60 |
| Getting the value from another observation or another source | 12 | 5 | 13 | 14 | 14 | 25 | 83 |
| Getting the real value by reasoning based on the information of the observation in question | 27 | 7 | 8 | 21 | 13 | 27 | 103 |
| Correcting automatically with program lines including conditions or based on a list of erroneous values (e.g. ‘america’ = ‘United States’) | 37 | 8 | 6 | 14 | 10 | 18 | 93 |
| Correcting automatically based on risk functions (e.g. selective editing) | 0 | 0 | 1 | 0 | 6 | 0 | 7 |
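Two of the treatments in Example 6, correction from a list of erroneous values and fetching the previous value (cold deck), are simple enough to sketch. The code below is an illustration under assumptions: only the `'america'` entry comes from the slide, the other list entry and the data layout (dicts keyed by unit id, `None` for a missing value) are made up.

```python
# Lookup list of known erroneous values. 'america' is the slide's example;
# 'usa' is an extra, hypothetical entry.
CORRECTIONS = {"america": "United States", "usa": "United States"}

def correct_from_list(value, corrections=CORRECTIONS):
    """Replace a known erroneous value; return other values unchanged."""
    key = value.strip().lower()
    return corrections.get(key, value)

def cold_deck(current, previous):
    """Fill missing values (None) with the same unit's previous value.

    Returns the filled dict and the set of unit ids that were filled,
    so that treated values can be reported afterwards.
    """
    filled = {}
    treated = set()
    for unit_id, value in current.items():
        if value is None and previous.get(unit_id) is not None:
            filled[unit_id] = previous[unit_id]
            treated.add(unit_id)
        else:
            filled[unit_id] = value
    return filled, treated
```

A unit with no previous value stays missing, so it can still be routed to another treatment such as recontacting the respondent.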
13
Example 1: Statistics with no unit-level processing
MAKING “STATISTICS TYPES” BY COMMON PRACTICES
Collecting statistics utilise statistics and tabulations from several sources; once the information has been gathered, it is put into the form required by the statistics (6 statistics).
Strict processing statistics are based on one or more data sets (statistical data, external source data or a register), which are used strictly without changes to make the statistics (9 statistics).
Calculation model statistics rely on existing, already edited data and/or tabulations/statistics, which are used to carry out the mathematical or statistical calculation model required by the statistics (11 statistics).
14
Example 2: Different types of utilising statistics (i.e. estimates from other sources)
MAKING “STATISTICS TYPES” BY COMMON PRACTICES
Direct use of statistics: statistics (estimates) are fed straight into the process of making statistics, or they go through a standard treatment before the process.
Additions and checks: statistics (estimates) are used for treating missing values and errors and/or for various checks.
Making expansion weights: statistics (estimates) and distributions are utilised for making weights that expand the results to the population level (e.g. calibration).
Index calculation
Account calculation
A part of calculating results: all other purposes of using statistics (estimates) in calculating the results (excluding index and account calculation).
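As an illustration of "making expansion weights", the sketch below computes post-stratification weights, the simplest special case of calibration: each sampled unit in a stratum gets the known population count of the stratum divided by the number of sampled units in it. The function name and the input layout are assumptions for the example; the slides do not describe how individual statistics compute their weights.

```python
from collections import Counter

def poststratified_weights(sample_strata, population_counts):
    """Return one expansion weight per sampled unit.

    sample_strata: stratum label of each sampled unit, in sample order.
    population_counts: known population size per stratum, e.g. taken
    from other statistics or a register (hypothetical input).
    """
    sample_counts = Counter(sample_strata)
    unknown = set(sample_counts) - set(population_counts)
    if unknown:
        raise KeyError(f"no population count for strata: {sorted(unknown)}")
    # Weight = population count / sample count within the unit's stratum.
    return [population_counts[s] / sample_counts[s] for s in sample_strata]
```

By construction the weights within each stratum sum to that stratum's population count, so weighted totals are expanded to the population level.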
15
Example 3: Types of data collection
MAKING “STATISTICS TYPES” BY COMMON PRACTICES
Only statistics with data collection

| Type of data collection | Pop. Stat. | Soc. Stat. | Pric. & Wages | Econ. Stat. | Busin. Trends | Busin. Struct. | ALL |
|---|---|---|---|---|---|---|---|
| Full Blaise-based data collection | 0 | 7 | 0 | 2 | 1 | 0 | 10 |
| Paper questionnaire collection only | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| Diary surveys | 0 | 2 | 0 | 0 | 0 | 0 | 2 |
| XCOLA-based data collection | 2 | 0 | 0 | 2 | 1 | 5 | 10 |
| XCOLA and paper combination | 0 | 0 | 1 | 0 | 3 | 0 | 4 |
| XCOLA and Excel combination | 0 | 0 | 8 | 2 | 3 | 2 | 15 |
| Other web collection made in StatFi | 0 | 0 | 1 | 0 | 6 | 0 | 7 |
| Web collection via external server | 5 | 1 | 3 | 4 | 1 | 19 | 33 |
| Excel-based data delivery | 3 | 0 | 3 | 3 | 1 | 4 | 14 |
| Other data delivery or transfer | 10 | 0 | 2 | 3 | 3 | 0 | 18 |
| ALL | 20 | 11 | 18 | 16 | 19 | 30 | 114 |
16
Detailed interviews with statistics
DETAILED STUDIES OF E&I IN SOME STATISTICS
Interviews with different types of statistics from the production and editing point of view
Informal discussions with 1-2 interviewers and 1-2 persons from the statistics
Reports finalised with the interviewees and made available to everyone at StatFi
17
Most common methods for editing and imputation
DETAILED STUDIES OF E&I IN SOME STATISTICS
Editing:
- deterministic checking rules
- local checking
- distributional checking
- use of other sources or historical data
Imputation:
- manual
- cold deck
- average
- hot deck
- automatic imputation (checking lists)
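Two of the imputation methods named above, average and hot deck, can be sketched in a few lines. This is a minimal illustration under assumptions: records are dicts, a missing value is `None`, and donors are taken from the same imputation class; the key names and the random choice of donor are not from the slides.

```python
import random
from collections import defaultdict

def impute_within_classes(records, value_key, class_key, method="average", seed=0):
    """Impute missing values (None) from observed values in the same class.

    method="average": use the class mean of the observed values.
    method="hot_deck": copy the value of a randomly chosen donor in the class.
    Returns new record dicts with an added boolean "imputed" field.
    """
    if method not in ("average", "hot_deck"):
        raise ValueError(f"unknown method: {method}")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    # Collect observed (donor) values per class.
    donors = defaultdict(list)
    for r in records:
        if r[value_key] is not None:
            donors[r[class_key]].append(r[value_key])

    result = []
    for r in records:
        new = dict(r)
        pool = donors.get(r[class_key])
        if r[value_key] is None and pool:
            if method == "average":
                new[value_key] = sum(pool) / len(pool)
            else:
                new[value_key] = rng.choice(pool)
            new["imputed"] = True
        else:
            # Observed value, or no donor available in the class.
            new["imputed"] = False
        result.append(new)
    return result
```

The `imputed` flag keeps treated values identifiable, which supports the reporting and archiving topics covered by the survey.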
18
General impression of editing and imputation in StatFi
DETAILED STUDIES OF E&I IN SOME STATISTICS
Usually the respondent is contacted again
Deduction is used where possible
Staff have strong subject-matter knowledge and awareness of current events
Staff are very interested in, and willing to work for, methodological improvements