Antonio Bernardi - Fulvia Cerroni - Viviana De Giorgi (Istat) An application to the Tax Authority Source (Sector Studies) Session: Administrative data.

Presentation transcript:

Antonio Bernardi - Fulvia Cerroni - Viviana De Giorgi (Istat)
A methodological process for assessing variables coming from administrative sources: an application to the Tax Authority Source (Sector Studies)
Session: Administrative data, 10 July 2008

Agenda
Part 1 - Scheme for assessing administrative sources for statistical use
Part 2 - The process for assessing variables: the theory
Part 3 - An application to the Tax Authority Source - Sector Studies (SS)

Background and motivations
- use of administrative archives in place of statistical surveys
- much more information on small and medium enterprises
- reducing the statistical burden
- development of a general scheme for validating administrative data as statistical ones
- focus on the process of assessing quantitative variables that have a benchmark
- Sector Studies (SS) compared with the statistical survey on SMEs as a benchmark source

Part 1 - Scheme for assessing administrative sources 1/2

Part 1 - Scheme for assessing administrative sources 2/2
Preliminary judgement on an administrative archive
Is it possible to identify a well defined universe? | yes/no
Reference population for coverage | yes (specify)
Mean coverage level | (specify percentage)
Coverage level (by existing disaggregation) | between … and … (specify)
Are there any benchmark variables? | yes (specify)/no
Can data be imported in a SAS format? | yes/no
Data delivery timeliness | (specify)
Is a formal request needed for data release? | yes/no
Variables' classifications | specify existing problems
Judgement | we can/cannot go on processing the source

Part 2 - Scheme for assessing quantitative variables having a benchmark
Input: data (archives)
Stages: qualitative assessment, quantitative assessment
Output: variable's assessment for statistical use

Part 2 - Qualitative and quantitative assessment of a variable 1/2
1. Outlier detection: irregular values/outliers
Irregular values: legal and economic constraints are taken into account; no systematic scheme exists for them
Outliers: 2 out of 3 criteria should be satisfied
i. statistical/probabilistic (Bienaymé–Chebyshev)
ii. computational/explorative (k-means clustering method)
iii. deterministic (relative differences within the threshold values of ±5%, ±2% or ±1%)
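The two-out-of-three rule for outliers can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the slides do not say which quantity the Bienaymé–Chebyshev criterion is applied to (here, the deviations between source and benchmark), the multiplier `k` is hypothetical, and the k-means criterion is represented by a precomputed list of flags rather than an actual clustering run. All function names are illustrative.

```python
from statistics import mean, pstdev


def chebyshev_flags(values, k=2.0):
    """Flag values more than k standard deviations from the mean.

    The Bienaymé-Chebyshev inequality bounds the share of such values
    by 1/k**2 for any distribution. The choice of k is an assumption.
    """
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [False] * len(values)
    return [abs(v - m) > k * s for v in values]


def deterministic_flags(source, benchmark, threshold=0.05):
    """Flag units whose relative difference from the benchmark exceeds the threshold."""
    flags = []
    for s, b in zip(source, benchmark):
        if b == 0:
            flags.append(s != 0)  # relative difference undefined: flag any non-zero value
        else:
            flags.append(abs(s - b) / abs(b) > threshold)
    return flags


def two_of_three(*criteria):
    """A unit is an outlier when at least two criteria flag it."""
    return [sum(votes) >= 2 for votes in zip(*criteria)]


# Hypothetical data: the fifth unit reports 500 against a benchmark of 100.
source = [100, 102, 98, 101, 500, 99]
benchmark = [100, 100, 100, 100, 100, 100]
deviations = [s - b for s, b in zip(source, benchmark)]

# Placeholder for the clustering criterion (assumed to come from a k-means run).
cluster_flags = [False, False, False, False, True, False]

outliers = two_of_three(
    chebyshev_flags(deviations, k=2.0),
    deterministic_flags(source, benchmark, threshold=0.05),
    cluster_flags,
)
```

The voting step keeps the three criteria independent, so any one of them can be replaced (for example, a different threshold among ±5%, ±2% or ±1%) without touching the others.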

Part 2 - Quantitative assessment of a variable 2/2
2. Standard validation: for both the source variable and its benchmark
- calculate the main descriptive statistics (mean, standard deviation, median, skewness, kurtosis) and check whether the distance between the two variables decreases from the raw to the trimmed distribution
- through the kernel histogram, check whether the series have the same graphical shape and whether the distribution of the deviations is symmetric, leptokurtic and with a zero mean
3. Practical validation: useful for specific surveys and studies, to check the level of concordance between the variable and its benchmark
- Frequency validation: concordance by class frequencies, simple index of dissimilarity, Cohen coefficient, relative weights of frequencies on the main diagonal, verification of correspondence by a log-linear model goodness-of-fit test
- By-group validation: per-group concordance, by checking the linearity of the groups' means
- Micro-data validation: robust point to point correspondence through regression techniques
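The standard validation step can be sketched in Python as follows. The sketch assumes symmetric trimming of each sorted distribution and uses the gap between the two means as the "distance"; the slides specify neither choice, and the data and function names are hypothetical.

```python
from statistics import mean, median, pstdev


def describe(values):
    """Main descriptive statistics: mean, std, median, skewness, excess kurtosis."""
    m, s, n = mean(values), pstdev(values), len(values)
    if s == 0:
        skew = kurt = float("nan")  # shape measures are undefined for a constant series
    else:
        skew = sum((v - m) ** 3 for v in values) / (n * s ** 3)
        kurt = sum((v - m) ** 4 for v in values) / (n * s ** 4) - 3.0
    return {"mean": m, "std": s, "median": median(values),
            "skewness": skew, "excess_kurtosis": kurt}


def trim(values, share=0.05):
    """Drop the lowest and highest `share` of the sorted values (assumed trimming rule)."""
    ordered = sorted(values)
    cut = int(len(ordered) * share)
    return ordered[cut:len(ordered) - cut] if cut else ordered


# Hypothetical source and benchmark values; the source contains one extreme value.
source = [10, 12, 11, 13, 12, 11, 10, 12, 13, 90]
benchmark = [10, 11, 12, 13, 12, 11, 10, 12, 13, 14]

raw_gap = abs(describe(source)["mean"] - describe(benchmark)["mean"])
trimmed_gap = abs(describe(trim(source, 0.1))["mean"]
                  - describe(trim(benchmark, 0.1))["mean"])
```

A positive `excess_kurtosis` computed on the deviations would correspond to the leptokurtic shape mentioned in the check, and a `skewness` near zero to the symmetry condition.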

Part 3 - Assessing the source: the accounting table of Sector Studies
Preliminary judgement on the accounting table of Sector Studies
Is it possible to identify a well defined universe? | yes
Reference population for coverage | Italian Business Register (ASIA)
Mean coverage level | 79.4%
Coverage level (by existing disaggregation) | between 65% and 90%
Are there any benchmark variables? | yes (SME survey)
Can data be imported in a SAS format? | yes
Data delivery timeliness | 15-month time lag
Is a formal request needed for data release? | yes
Variables' classifications | some differences exist but they can be overcome
Judgement | the accounting table can be processed through the procedure for assessing variables

Part 3 - Assessing the variable: the total cost 1/5
Qualitative assessment
First hypothesis: assess each cost variable of Sector Studies against its own SME survey benchmark
Results: comparing definitions is not effective for each variable. Even when the definitions are forced to match, the numerical evaluation is not effective: an appropriate combination of variables and its new benchmark should be taken into account
Second hypothesis: assess the total cost of Sector Studies against the total cost of the SME survey
Total cost of SS = Total cost of SME survey

Part 3 - Assessing the variable: the total cost 2/5
Quantitative assessment
Outlier detection and standard validation

Part 3 - Assessing the variable: the total cost 3/5
Fig 1. Distribution of the deviations of SS from SME survey values

Part 3 - Assessing the variable: the total cost 4/5
Practical validation
Frequency validation: the two sources are not independent: the percentage of frequencies on the main diagonal (79.8%) plus the percentage found on its contiguous lines reaches 95.8%
By-group validation
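The frequency validation measures can be sketched in Python on a square cross-classification table (rows: classes of the source variable, columns: classes of the benchmark). The table below is hypothetical and does not reproduce the 79.8% and 95.8% figures; the dissimilarity index is assumed here to be half the sum of absolute differences between the two marginal distributions, which the slides do not spell out.

```python
def band_share(table, band=0):
    """Share of units within `band` classes of the main diagonal (band=0: diagonal only)."""
    total = sum(sum(row) for row in table)
    near = sum(cell
               for i, row in enumerate(table)
               for j, cell in enumerate(row)
               if abs(i - j) <= band)
    return near / total


def marginals(table):
    """Row and column totals of a square table."""
    rows = [sum(row) for row in table]
    cols = [sum(row[j] for row in table) for j in range(len(table))]
    return rows, cols


def cohen_kappa(table):
    """Cohen coefficient: agreement on the diagonal corrected for chance agreement."""
    rows, cols = marginals(table)
    total = sum(rows)
    observed = band_share(table, band=0)
    expected = sum(r * c for r, c in zip(rows, cols)) / total ** 2
    return (observed - expected) / (1 - expected)


def dissimilarity_index(table):
    """Half the sum of absolute differences between the two marginal distributions."""
    rows, cols = marginals(table)
    total = sum(rows)
    return 0.5 * sum(abs(r - c) for r, c in zip(rows, cols)) / total


# Hypothetical 3-class table of 100 units.
table = [
    [40, 5, 0],
    [4, 30, 3],
    [1, 2, 15],
]

on_diagonal = band_share(table, band=0)       # main diagonal only
with_contiguous = band_share(table, band=1)   # diagonal plus contiguous lines
kappa = cohen_kappa(table)
dissimilarity = dissimilarity_index(table)
```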

Part 3 - Assessing the variable: the total cost 5/5
Micro-data validation
Correlation coefficient (Pearson):
Linear regression: TC(SS) = α + β × TC(SMEs), α ≈ 0, β ≈ 1, R² =
Point to point correspondence (through the robust regression method): 87.8%
Conclusion
Judgement on the total cost: the variable is reliable at an individual level
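The Pearson coefficient and the regression TC(SS) = α + β × TC(SMEs) can be sketched in Python as follows. This sketch uses ordinary least squares, not the robust regression method named on the slide, and the data are hypothetical; it only shows what α ≈ 0 and β ≈ 1 mean for unit-level agreement.

```python
from statistics import mean


def pearson_and_ols(x, y):
    """Return (alpha, beta, r, r_squared) for the least-squares line y = alpha + beta * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    r = sxy / (sxx * syy) ** 0.5
    return alpha, beta, r, r * r


# Hypothetical total costs: benchmark (SME survey) and source (Sector Studies).
tc_sme = [10, 20, 30, 40]
tc_ss = [11, 19, 31, 39]

alpha, beta, r, r_squared = pearson_and_ols(tc_sme, tc_ss)
```

An intercept close to zero together with a slope close to one indicates that the source reproduces the benchmark unit by unit, rather than merely being correlated with it.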

Part 3 - Summary of the overall process

Thank you for your attention
For further information: Antonio Bernardi, Fulvia Cerroni, Viviana De Giorgi