New estimation system in Danish Intrastat Mr Søren Rich, Senior Adviser, Statistics Denmark ESTP course 'Advanced issues in international trade in goods.


New estimation system in Danish Intrastat
Mr Søren Rich, Senior Adviser, Statistics Denmark
ESTP course 'Advanced issues in international trade in goods statistics', Eurostat, Luxembourg, 3 April 2014

Overview
- Background
- The old estimation system
- A need for a single new estimation system
- The new estimation system
- Experience from using the new estimation system

Background
- Good and early access to VAT data in Denmark.
- In the past, VAT data have been used extensively in the estimation of intra-EU adjustments, and a good and extensive validation of VAT/Intrastat differences at PSI level has taken place.
- However, the contents of VAT have changed in recent years, and a review of the old practice and the two-string estimation system was needed.
- New system developed 2011-2012.

The old estimation system (1)
Two systems, both using a bottom-up approach in the estimation:
- A FLASH estimation system for estimating missing trade in the aggregated statistics (latest month).
  - The system forecasts using a growth rate approach, calculating growth rates from data for PSIs in the same 4-digit NACE group.
  - It estimates both non-response and below-threshold trade.
  - It uses the face value of Intrastat; the check for partial non-response is done only manually, for the largest and most obvious cases of underreported value.
- A system for estimating missing trade in the releases of detailed statistics. The system is named MESTER (M for the Danish word for VAT, Moms, and ESTer for estimation).

The old estimation system (2)
MESTER handles each month independently and uses VAT extensively to compensate for missing Intrastat.
- It estimates both non-response and below-threshold trade.
- If Intrastat is missing, the VAT value is applied.
- If VAT is larger than Intrastat, a partial non-response adjustment is added: in arrivals the adjustment is 43 per cent of the Intrastat/VAT difference; in dispatches the correction factor is 53 per cent.
- No adjustment is made if the Intrastat value is higher than VAT.
- The correction factors are based on a survey of revision patterns in Intrastat and VAT. However, they have not been updated in recent years.
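
The MESTER rules above can be written as a small function. This is only an illustrative sketch of the rules as stated on the slide, not the production code: the function name, the argument names and the use of `None` for a missing Intrastat value are assumptions made for the example.

```python
# Correction factors quoted on the slide: share of the VAT/Intrastat
# difference that is added as a partial non-response adjustment.
CORRECTION_FACTOR = {"arrivals": 0.43, "dispatches": 0.53}


def mester_estimate(flow, vat, intrastat=None):
    """Return a MESTER-style monthly value for one PSI (hypothetical helper).

    flow      -- "arrivals" or "dispatches"
    vat       -- VAT value for the month
    intrastat -- reported Intrastat value, or None if it is missing
    """
    if intrastat is None:
        # Missing Intrastat: the VAT value is applied.
        return vat
    if vat > intrastat:
        # Partial non-response: add a fixed share of the difference.
        return intrastat + CORRECTION_FACTOR[flow] * (vat - intrastat)
    # Intrastat at or above VAT: no adjustment.
    return intrastat


# Arrivals with VAT 200 and Intrastat 100: 100 + 0.43 * (200 - 100).
example = mester_estimate("arrivals", vat=200.0, intrastat=100.0)
```

The sketch makes the weakness discussed on the following slides visible: whenever VAT exceeds Intrastat, a fixed share of the gap is added, whether or not the gap is actually caused by underreporting.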

The old estimation system (3)
MESTER needs a good validation of the differences between VAT and Intrastat to work properly.
- The prerequisite for the partial non-response practice in MESTER has been a good and extensive validation of VAT/Intrastat differences at PSI level, so that large differences had been validated and the remaining differences were caused by underreporting in Intrastat.
- But validation of VAT/Intrastat in recent years has indicated that in most cases Intrastat turns out to be correct: the differences are mainly explained by wrongful declaration of VAT, or by the VAT value including transactions that are not part of Intrastat, e.g. triangular trade, installation services, etc.
- This change could be caused by a change in VAT reporting behavior or by a gradual change in the contents of the VAT statements.

Need for a single estimation system
- Two systems with different methodologies can create revisions when the compilation of a specific reference month changes from one estimation system to the other.
- Inadequate handling of underreported trade (partial non-response) in FLASH.
- Too 'efficient' estimation of underreported trade (partial non-response) in MESTER.
- A single system with a common methodology was needed.

The new estimation system: The basic principles
- We use Intrastat if it is available, but we check for underreported Intrastat (partial non-response).
- If a PSI historically has a good and stable relation between Intrastat and VAT, we use VAT in case of missing Intrastat.
- In case of an unstable relation between Intrastat and VAT, a limited range of forecast tools is applied, based on the optimal set of input data.
All EU traders are divided into 3 groups, according to the availability of Intrastat:
1) Exempted enterprises
2) Intrastat PSIs with a short period of historical Intrastat data available (less than 13 months)
3) Intrastat PSIs with a long period of historical Intrastat data available (13 or more months)

Below threshold trade
- For exempted traders (below-threshold traders), we use the reported VAT value of the enterprise in question.
- In case of voluntary Intrastat declarations, we use the value of Intrastat.
- In case of missing VAT declarations (small enterprises can declare VAT on a quarterly or semi-annual basis), we forecast the missing VAT.
- For traders above the threshold without any historical Intrastat, mostly enterprises which have not yet become Intrastat PSIs, we also use the value from the VAT statements.

Estimation of non-response
We divide all PSIs with non-response according to the availability of Intrastat:
1) Intrastat PSIs with a short period of historical Intrastat data available (less than 13 months)
2) Intrastat PSIs with a long period of historical Intrastat data available (13 or more months)
This groups the PSIs according to the estimation method that can produce the best results:
ad 1): We use a fixed effect model.
ad 2): We use the fixed effect model, or corrected VAT in case of a stable relationship between VAT and Intrastat.
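
The choice of method for a non-responding PSI can be sketched as a simple rule. The 13-month limit comes from the slide; the names `Method`, `choose_method` and the boolean `stable_vat_relation` are assumptions for the example. The stability test itself is described on the later slides.

```python
from enum import Enum


class Method(Enum):
    FIXED_EFFECT_MODEL = "fixed effect model"
    CORRECTED_VAT = "corrected VAT"


# From the slide: 13 or more months of historical Intrastat is a long history.
LONG_HISTORY_MONTHS = 13


def choose_method(months_of_history, stable_vat_relation):
    """Pick the estimation method for a PSI with non-response (hypothetical helper)."""
    if months_of_history >= LONG_HISTORY_MONTHS and stable_vat_relation:
        # Group 2 with a stable VAT/Intrastat relationship.
        return Method.CORRECTED_VAT
    # Group 1 (short history), or group 2 without a stable relationship.
    return Method.FIXED_EFFECT_MODEL
```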

Estimation of non-response: Fixed effect model (1)
We have tested various panel estimation models and regression models as forecast methods for missing Intrastat. Our analyses have shown that a fixed effect model produces the best prediction of Intrastat. Fixed effect models are regression models where the constant term depends on the incoming independent (explanatory) variables, e.g. the dependent variable can be modelled by…

Estimation of non-response: Fixed effect model (2)
$$Y_i = \sum_{j=1}^{p} X_{ij}\,\beta_j + \varepsilon_i, \qquad i = 1, \dots, n$$
where
- $Y_i$ is the dependent variable,
- $X_{ij}$ are the independent variables (character or numeric variables),
- $\beta_1, \dots, \beta_p$ are fixed effect parameters, and
- $\varepsilon_1, \dots, \varepsilon_n$ are independent, identically normally distributed stochastic variables with mean 0 and variance $\sigma^2$.

Estimation of non-response: Fixed effect model (3)
Having tested different models, the following model is considered to provide the best description of the data, using the following set of independent variables ($X_{ij}$, $j = 1, \dots, p$):
Model: FORECAST = Level + UHO + MOMS + UHO*YM_NUM
- UHO (PSI identifier): character variable; a dummy variable is created for each value of UHO, with the value 1 for the current UHO and the value 0 for all others.
- MOMS: numeric variable, the observed VAT.
- YM_NUM: numeric time variable; here only the values 1, 2, 13 and 14 are used, i.e. the current month (14), the previous month (13) and the two corresponding months of the previous year (1 and 2).

Estimation of non-response: Fixed effect model (4)
That is, FORECAST is modelled as a function of the independent variables UHO (the PSI), MOMS (the reported VAT) and UHO*YM_NUM (the interaction of PSI and year/month). The estimation uses the current month, the previous month and the two corresponding months of the previous year.

Estimation of non-response: Fixed effect model (5)
- The estimation is done using the SAS procedure PROC MIXED.
- The fixed effect model makes the best prediction when VAT is available for the month with the missing Intrastat value and when Intrastat is available in the two corresponding months of the previous year.
- Because of less good results for Intrastat PSIs with a short period of historical Intrastat data available, we plan to look at the possibility of introducing trend-correction methods for these PSIs.
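
To make the structure of the model FORECAST = Level + UHO + MOMS + UHO*YM_NUM concrete, here is a minimal sketch in plain Python. It is not the SAS PROC MIXED set-up used in production: it fits the same kind of model by ordinary least squares, with one intercept per PSI (equivalent to Level + UHO), one common VAT coefficient (MOMS) and one time slope per PSI (UHO*YM_NUM). All function names are assumptions, and the figures are made up for illustration.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def design_row(psi, vat, ym_num, psis):
    """PSI dummies, the VAT value, and PSI dummies times the time variable."""
    dummies = [1.0 if psi == p else 0.0 for p in psis]
    return dummies + [float(vat)] + [d * ym_num for d in dummies]


def fit(observations):
    """Least-squares fit; observations are (psi, ym_num, vat, intrastat) tuples."""
    psis = sorted({obs[0] for obs in observations})
    x = [design_row(psi, vat, ym, psis) for psi, ym, vat, _ in observations]
    y = [obs[3] for obs in observations]
    k = len(x[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * value for row, value in zip(x, y)) for i in range(k)]
    return psis, solve(xtx, xty)


def predict(psis, beta, psi, ym_num, vat):
    """Forecast Intrastat for one PSI and month from the fitted parameters."""
    return sum(b * v for b, v in zip(beta, design_row(psi, vat, ym_num, psis)))


# Made-up example: PSI "A" has not reported Intrastat for the current month (14).
# The values were generated as intercept + 0.9 * VAT + slope * month,
# so the forecast for "A" in month 14 with VAT 70 should be close to 87.
history = [
    ("A", 1, 50.0, 56.0), ("A", 2, 60.0, 66.0), ("A", 13, 55.0, 72.5),
    ("B", 1, 100.0, 97.0), ("B", 2, 120.0, 117.0),
    ("B", 13, 110.0, 130.0), ("B", 14, 150.0, 168.0),
]
psis, beta = fit(history)
forecast = predict(psis, beta, "A", ym_num=14, vat=70.0)
```

The sketch assumes enough observations for the normal equations to be solvable; how PSIs are pooled into one estimation in the real system is not stated on the slides.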

Estimation of non-response: Stable relation between VAT and Intra (1)
Some PSIs have a very stable relationship between Intrastat and VAT. In these cases the VAT value can be a good proxy for a missing Intrastat value. For all PSIs with non-response where long series of historical Intrastat and VAT data are available, we test for a stable relationship. This is done using a simple linear regression model:
$$I = \alpha + \beta V + \varepsilon$$
where $I$ is the dependent variable (Intrastat value), $V$ is the independent variable (VAT value), $\alpha$ is a constant (the intercept), $\beta$ is the parameter giving the slope of the linear relation, and $\varepsilon$ is an independent, identically normally distributed stochastic variable with mean 0 and variance $\sigma^2$.

Estimation of non-response: Stable relation between VAT and Intra (2)
To detect a stable relation we rank the PSIs according to their individual value of the coefficient of determination $R^2$. In the case of simple linear regression, $R^2$ is simply the square of the sample correlation coefficient between Intrastat and VAT. $R^2$ takes a value between 0 and 1, and the greater the value, the stronger the correlation.
- In case of a high value of $R^2$ with $\beta \approx 1$ and $\alpha \approx 0$, Intrastat = VAT, and the VAT value can be applied in case of missing Intrastat.
- In case of $\beta \approx 1$ and $\alpha \neq 0$, Intrastat can be calculated as $\alpha + V$.
- In case both $\beta \neq 1$ and $\alpha \neq 0$, no stable relationship exists.

Estimation of non-response: stable relation between VAT and Intra (3)

| | $\beta \approx 1$ | $\beta \neq 1$ |
|---|---|---|
| $\alpha \approx 0$ | VAT value $V$ is used | Compare VAT value $V$ with predicted value of fixed effect model* |
| $\alpha \neq 0$ | $\alpha + V$ is used, but the estimate is compared with predicted value of fixed effect model* | Use panel estimation with fixed effect model |

* A manual check of the largest adjustments will be carried out.
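
The stability test can be sketched as a least-squares fit of Intrastat on VAT followed by a lookup in the decision table. The slides do not say what counts as a "high" $R^2$ or how close to 1 and 0 the slope and intercept must be, so the tolerances `r2_min`, `beta_tol` and `alpha_tol` below are purely hypothetical, as are the function names. Sending a PSI with low $R^2$ to the fixed effect model is also an assumption, based on the principle that forecast tools are used when the relation is unstable.

```python
def fit_line(vat, intrastat):
    """Least-squares fit of I = alpha + beta * V; returns (alpha, beta, r2).

    Assumes both series vary, otherwise a division by zero occurs.
    """
    n = len(vat)
    mean_v = sum(vat) / n
    mean_i = sum(intrastat) / n
    s_vv = sum((v - mean_v) ** 2 for v in vat)
    s_ii = sum((i - mean_i) ** 2 for i in intrastat)
    s_vi = sum((v - mean_v) * (i - mean_i) for v, i in zip(vat, intrastat))
    beta = s_vi / s_vv
    alpha = mean_i - beta * mean_v
    r2 = s_vi ** 2 / (s_vv * s_ii)  # squared sample correlation
    return alpha, beta, r2


def choose_rule(vat, intrastat, r2_min=0.9, beta_tol=0.05, alpha_tol=0.05):
    """Map the fitted line to a cell of the decision table (hypothetical tolerances)."""
    alpha, beta, r2 = fit_line(vat, intrastat)
    if r2 < r2_min:
        return "fixed effect model"
    beta_is_one = abs(beta - 1.0) <= beta_tol
    # The intercept is judged relative to the PSI's mean Intrastat value.
    alpha_is_zero = abs(alpha) <= alpha_tol * (sum(intrastat) / len(intrastat))
    if beta_is_one and alpha_is_zero:
        return "use VAT value"
    if beta_is_one:
        return "use alpha + VAT, compare with fixed effect model"
    if alpha_is_zero:
        return "compare VAT value with fixed effect model"
    return "fixed effect model"
```

For a PSI whose historical Intrastat equals its VAT month by month, `choose_rule` returns "use VAT value"; a constant gap between the two series leads to the $\alpha + V$ cell.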

Partial non-response test
All months where a PSI has reported Intrastat are checked for potential partial non-response. The declarations of each PSI are tested for outliers (small values) using this score function:
$$\text{score} = \frac{\text{monthly obs.} - Q_2}{Q_3 - Q_1}$$
where $Q_2$ is the median value of the PSI and $Q_1$ and $Q_3$ are the lower and upper quartiles at PSI level. The test is done on monthly Intrastat values and on monthly values of the Intrastat/VAT relation. In cases where both outlier tests indicate a low value, an adjustment equal to the difference between VAT and Intrastat (if VAT > Intrastat) is added to the reported value of Intrastat.
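
A minimal sketch of the two outlier tests and the resulting adjustment follows. The slide gives neither the quartile method nor the score below which a value counts as low, so the "inclusive" quartile method and the cut-off of -1.0 are assumptions, as are the function names. The sketch also assumes a positive VAT value and a non-zero interquartile range.

```python
from statistics import quantiles


def score(obs, history):
    """(obs - median) / interquartile range, computed from the PSI's monthly history."""
    q1, q2, q3 = quantiles(history, n=4, method="inclusive")
    return (obs - q2) / (q3 - q1)


def adjust_partial_non_response(intrastat, vat, intrastat_history,
                                ratio_history, cutoff=-1.0):
    """Return the (possibly adjusted) Intrastat value for one PSI and month.

    intrastat_history -- the PSI's monthly Intrastat values
    ratio_history     -- the PSI's monthly Intrastat/VAT ratios
    cutoff            -- hypothetical threshold for a low score
    """
    value_is_low = score(intrastat, intrastat_history) < cutoff
    ratio_is_low = score(intrastat / vat, ratio_history) < cutoff
    if value_is_low and ratio_is_low and vat > intrastat:
        # Add the VAT/Intrastat difference to the reported value.
        return intrastat + (vat - intrastat)
    return intrastat
```

Requiring both tests to flag the month is what separates this from the old MESTER rule: a gap between VAT and Intrastat alone does not trigger an adjustment.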

False nil declarations
Sometimes PSIs wrongly declare zero trade (nil declarations) to avoid penalties for late reporting or to avoid the burden of reporting. To adjust for this, all monthly nil declarations are matched with the corresponding VAT declarations at PSI level. In cases where the VAT or Intrastat data have not been corrected in our VAT/Intrastat Quality Control System, an adjustment of Intrastat is made if a positive value has been declared in the VAT declaration. The false nil declarations identified in the estimation system are checked afterwards in the Quality Control System.
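
The matching step can be sketched as follows. The data layout (keys of PSI identifier and month, a dictionary of VAT values, a set of already corrected cases) and the function name are assumptions for the example; the slide does not state the size of the adjustment, so the sketch only identifies the suspected false nil declarations.

```python
def find_false_nils(nil_declarations, vat_by_key, corrected_keys):
    """Return the nil declarations that look false (hypothetical helper).

    nil_declarations -- iterable of (psi, month) keys with a nil Intrastat declaration
    vat_by_key       -- dict mapping (psi, month) to the declared VAT value
    corrected_keys   -- set of (psi, month) keys already corrected in quality control
    """
    return [
        key for key in nil_declarations
        if key not in corrected_keys and vat_by_key.get(key, 0) > 0
    ]
```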

Experience gained from using the new estimation system
- The new estimation system, called ESTER (as in ESTimation), was introduced in October 2012 in the monthly production of 2012 data. The system has also been used for revising 2009-2011.
- As expected, the system has lowered intra-EU arrivals by 2 per cent and intra-EU dispatches by 1 per cent, producing a minor data break between 2008 and 2009.
- The new system has lowered the level of revisions compared with the old estimation system.

Implementation of complete data compilation system
In connection with the development of the new estimation system, Statistics Denmark has also redesigned the system for compilation of the statistics. This has resulted in:
- consolidation of the basic data extracted for the compilation
- a more coherent and user-friendly system for the data compilation
- more coherence in the data deliveries to Eurostat, Balance of Payments/National Accounts and our own dissemination
- and the One Button system!

The One Button system
"We have custom-made your keyboard so that it only includes the buttons which are strictly necessary for your job. Smart, right?"

The end
Thanks a lot for your attention!
Mr Søren Rich, sri@dst.dk