ESSnet Big Data Dissemination Workshop, Sofia


WP6: Early estimates
Boro Nikic
ESSnet Big Data Dissemination Workshop, Sofia, 23-24 February 2017

WP6: Goals
WP 6 (and 7): the aim of this pilot is to investigate multiple big data, administrative and other existing sources in order to produce early estimates for statistical purposes.
For WP 6 (combining sources), the project aims to implement the phases 'data access' and 'data handling' during the first 12 months of the project. The phases 'methodology and techniques' and 'statistical outputs' are carried out in the second SGA period. The exception to this rule is the quick win on turnover estimates.
Partners | SI | FI | NL | PL
Working days | 100 | 60 | 10 | 40
Period: 1.2.2016 - 28.2.2017

Work done in the SGA1 period (1)
- Several brainstorming meetings were organised at the Statistical Office of the Republic of Slovenia (March 2016)
- Collaboration with the WP7 team
- Detailing the path of the pilot for nowcasting the turnover indicators (April 2016)
- Investigation of sources and methodology for calculating the Consumer Confidence Indices (April 2016)
- Search for additional ideas for "quick wins" (newsfeeds, Google Trends, …; May 2016)

Work done in the SGA1 period (2)
- Joint meeting of WP6 and WP7 members in Warsaw (June 2016)
- Finalising the proposal for SGA2 (November 2016)
- Results of nowcasting experiment (Statistics Finland, January 2017)
- Results of nowcasting experiment (SURS, January 2017)

Proposed pilot for SGA2 (1)
Title of the pilot: Early estimates of economic indicators
Main economic indicators:
- Gross domestic product (GDP)
- Consumer price index (CPI)
- Retail sales
- Balance of payments
- Economic sentiment indicators
- New leading economic indicators

Proposed pilot for SGA2 (2)
Aim of the pilot:
- Investigate multiple big data and other existing sources for the purpose of early estimates of at least one of the main economic indicators (partly in SGA1)
- Create and test the methodology for producing early estimates of at least one of the main economic indicators
- Define and test quality measures that assess the quality of the sources, the statistical production and the statistical results
Multinational dimension:
- Many of the sources are available in most countries, so it is possible to test them and produce results for more than one country.
- Even if a country does not have access to any big data source, it is still possible to test methods and processes on administrative and other existing sources.

Big Data sources (2)
Big data source | Availability (SURS)
Job vacancies: ads from job portals | Yes
Traffic loops |
Social media data (Twitter, Facebook, …) | ?
Data from supermarket chains |
Mobile phone data |
Transaction data from banks |

Traffic loops in Slovenia
- Around 660 traffic loops
- 10 categories of vehicles
- Counts for each vehicle category are available at 15-minute intervals
- Data since 2005
- Sample data already at SURS
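
To show how 15-minute counts of this kind could feed a monthly indicator, here is a minimal sketch that sums them into monthly totals per vehicle category. The record layout, the loop identifier and all counts are hypothetical and are not taken from the SURS data.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records: (loop_id, interval_start, vehicle_category, count).
records = [
    ("L001", "2016-01-31 23:30", "car", 12),
    ("L001", "2016-01-31 23:45", "car", 9),
    ("L001", "2016-01-31 23:45", "truck", 2),
    ("L001", "2016-02-01 00:00", "car", 7),
]

def monthly_totals(rows):
    """Sum 15-minute counts into totals per (month, vehicle category)."""
    totals = defaultdict(int)
    for _loop_id, start, category, count in rows:
        # Label the month in the same style as the slides, e.g. 2016M01.
        month = datetime.strptime(start, "%Y-%m-%d %H:%M").strftime("%YM%m")
        totals[(month, category)] += count
    return dict(totals)

totals = monthly_totals(records)
print(totals)
```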

Existing sources (2)
SURS survey / administrative sources (monthly), with dissemination date and, where a second value is given, availability of the majority of data:
- Business tendencies: t-5
- Short-term statistics (industry, construction, services, trade): dissemination t+30-60; majority of data available t+20-30
- Foreign trade: t+40
- Building permits: dissemination t+20 (2017); majority of data available t+5
- Demography of enterprises (SBR): t+20-25
- VAT data (FURS): dissemination t+45; majority of data available t+20 (submission deadline)
- Wages
- …

Nowcasting turnover indices
- One of the pilots started in WP6, by Statistics Finland (basic proposal) and the Statistical Office of the Republic of Slovenia
- Interesting methodological suggestions for estimating early economic indicators → SURS decided to start with this idea
- Modelling is not new, but it is very often used in connection with big data sources
- Modelling is very useful for early estimates of economic indicators based on many different data sources

Nowcasting model (1)
The idea and a short example of the model came from partners at Statistics Finland. The nowcasting model consists of 2 stages:
1. Principal component analysis (PCA) is used to extract principal components from enterprise data. For each enterprise included in the model, a time series without any missing values is needed. The first few principal components are then chosen.
2. Linear regression is used: the time series of interest (e.g. GDP) is the dependent variable (Y) and the chosen principal components are the predictors (X1, …, Xn). A seasonal component and other predictors can be added.
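
The two stages can be sketched as follows. This is a minimal illustration on synthetic data, not the SURS or Statistics Finland implementation: the function name, the standardisation step and the use of NumPy are assumptions, and the regression is fitted on every period except the last one, which is then nowcasted.

```python
import numpy as np

def pca_regression_nowcast(X, y_known, n_components):
    """Nowcast the last period of a target series.

    X       : (T, N) array, one complete time series per enterprise (no gaps).
    y_known : (T-1,) array, target (e.g. GDP) for every period except the last.
    """
    # Standardise each enterprise series (an assumption of this sketch).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Stage 1: PCA via SVD; the scores are the principal-component time series.
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    # Stage 2: linear regression with intercept, fitted without the last period.
    A = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(A[:-1], y_known, rcond=None)
    # Apply the fitted coefficients to the last period's component scores.
    return float(A[-1] @ coef)

# Synthetic, noise-free demo: 32 quarters, 6 enterprises driven by 2 factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(32, 2))
X = 100 + factors @ rng.normal(size=(2, 6))
y = 9000 + factors @ np.array([150.0, -80.0])

estimate = pca_regression_nowcast(X, y[:-1], n_components=2)
print(round(estimate, 2), round(float(y[-1]), 2))
```

Because the synthetic target is an exact linear function of the two factors, the nowcast reproduces the held-out value; with real enterprise data the fit is only approximate.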

Nowcasting model (2)
Possibilities considered for the model:
- Time series of interest: GDP in constant prices (chain-linked volumes, reference year 2010) from 2008Q1 to 2015Q4.
- 8 different testing spans: the first period is always 2008Q1; the last period is 2014Q1, 2014Q2, … or 2015Q4.
- 3 different sets of enterprise data:
  D1: turnover in industry
  D2: turnover in retail trade
  D3: turnover in industry and retail trade (i.e. D1 and D2 together)

Nowcasting model (3)
5 different conditions for choosing enterprise data:
Condition | Meaning
s10 | choose only raw enterprise data that are available sooner than 10 days (i.e. by the end of the 9th day) after the end of the last period
s20 | choose only raw enterprise data that are available sooner than 20 days (i.e. by the end of the 19th day) after the end of the last period
s30 | choose only raw enterprise data that are available sooner than 30 days (i.e. by the end of the 29th day) after the end of the last period
s46 | choose only raw enterprise data that are available sooner than 46 days (i.e. by the end of the 45th day) after the end of the last period
u | choose edited enterprise data
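
The availability conditions amount to a cut-off on the day the raw data arrive. A minimal sketch, assuming a hypothetical mapping from enterprise to arrival day (day 1 is the first day after the end of the period) and assuming that condition u keeps every enterprise:

```python
# Cut-off in days for each raw-data condition from the table above.
THRESHOLDS = {"s10": 10, "s20": 20, "s30": 30, "s46": 46}

def select_enterprises(arrival_day, condition):
    """Return the enterprises whose data qualify under the given condition.

    arrival_day: dict mapping enterprise id to the day after the end of the
    period on which its raw data arrived (hypothetical structure).
    """
    if condition == "u":
        # Edited data: assumed here to cover all enterprises.
        return sorted(arrival_day)
    limit = THRESHOLDS[condition]
    # "Sooner than 10 days" means by the end of day 9, i.e. day < 10.
    return sorted(e for e, day in arrival_day.items() if day < limit)

# Made-up arrival days for five enterprises.
arrival = {"P001": 9, "P002": 10, "P003": 19, "P004": 45, "P005": 60}
for cond in ("s10", "s20", "s30", "s46", "u"):
    print(cond, select_enterprises(arrival, cond))
```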

Nowcasting model (4)
11 different conditions for choosing principal components (p. c.):
Condition | Meaning
last5 | take every p. c. whose eigenvalue's share among all eigenvalues is greater than or equal to 5%
po7 | take only as many p. c. as leave at least 7 cases (time periods) per independent variable later in the linear regression
po8 | as po7, with at least 8 cases per independent variable
po10 | as po7, with at least 10 cases per independent variable
po15 | as po7, with at least 15 cases per independent variable
po20 | as po7, with at least 20 cases per independent variable
70 | take enough p. c. to explain 70% (or a bit more) of the variability of the enterprise data
75 | take enough p. c. to explain 75% (or a bit more) of the variability of the enterprise data
80 | take enough p. c. to explain 80% (or a bit more) of the variability of the enterprise data
85 | take enough p. c. to explain 85% (or a bit more) of the variability of the enterprise data
90 | take enough p. c. to explain 90% (or a bit more) of the variability of the enterprise data
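
The three families of rules can be written as one small function. This is a sketch under assumptions: the function name and the eigenvalues are made up, and for the po rules only the principal components are counted as independent variables (any seasonal or sentiment predictor is ignored).

```python
def choose_n_components(eigenvalues, rule, n_periods):
    """Number of leading principal components to keep under a given rule."""
    ev = sorted(eigenvalues, reverse=True)
    total = sum(ev)
    rule = rule.lower()
    if rule == "last5":
        # Eigenvalue share of at least 5% of the sum of all eigenvalues.
        return sum(1 for v in ev if v * 100 >= 5 * total)
    if rule.startswith("po"):
        # At least `cases` time periods per predictor in the regression.
        cases = int(rule[2:])
        return min(len(ev), n_periods // cases)
    # Otherwise the rule is the percentage of variability to explain.
    target = int(rule)
    cumulative = 0.0
    for k, v in enumerate(ev, start=1):
        cumulative += v
        if cumulative * 100 >= target * total:
            return k
    return len(ev)

# Made-up eigenvalues that sum to 100, with 32 quarterly periods.
eigenvalues = [50.0, 20.0, 10.0, 8.0, 6.0, 4.0, 2.0]
for rule in ("last5", "po10", "70", "75"):
    print(rule, choose_n_components(eigenvalues, rule, n_periods=32))
```

With these eigenvalues the rules keep 5, 3, 2 and 3 components respectively.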

Nowcasting model (5)
- Seasonality can be added as an additional predictor or not
- A sentiment indicator can be added as an additional predictor or not
- Altogether 5280 models are made
Comments on data preparation:
- Enterprise data are prepared using SAS.
- The data sets used are a good approximation of the real state. It is impossible for us to recover the true state of a given data set at a given time in the past, but we can estimate it well.
- Since we started using e-questionnaires (2013M04 in industry, 2014M01 in retail trade), data for some enterprises are available only a few days after the end of the reference period, so we are able to get early estimates based on these data.
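
The total of 5280 follows from combining all options: 8 testing spans × 3 data sets × 5 availability conditions × 11 principal-component conditions × 2 (seasonality) × 2 (sentiment indicator). A short sketch that enumerates the grid; the labels are illustrative only:

```python
from itertools import product

# The eight testing spans all start in 2008Q1 and end in 2014Q1 ... 2015Q4.
spans = [f"2008Q1-{year}Q{q}" for year in (2014, 2015) for q in (1, 2, 3, 4)]
datasets = ["D1", "D2", "D3"]
availability = ["s10", "s20", "s30", "s46", "u"]
pc_rules = ["last5", "po7", "po8", "po10", "po15", "po20",
            "70", "75", "80", "85", "90"]
seasonality = [False, True]
sentiment = [False, True]

# One tuple per model specification.
grid = list(product(spans, datasets, availability, pc_rules,
                    seasonality, sentiment))
print(len(grid))  # 5280
```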

SURS experiment results (1)
Data: real turnover of 973 industrial enterprises in the period 2008-2015
Period | P001 | P002 | … | P973
2008M01 | 3526 | 214 | … | 66519
2008M02 | 4252 | 332 | … | 36012
2008M03 | 4111 | 411 | … | 52447
… | | | |
2015M12 | 5241 | 412 | … | 71025

SURS experiment results (2)
Data: real turnover of 973 industrial enterprises in the period 2008-2015 (quarter = average of the months in the quarter)
Period | P001 (Ent1) | P002 (Ent2) | … | P973 (Ent973)
2008Q1 | 3963 | 319 | … | 51659.33
… | | | |
2015Q4 | 5119 | 422.67 | … | 72549
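
The 2008Q1 values can be reproduced from the monthly figures for 2008M01 to 2008M03 shown on the previous slide. A minimal sketch; the function name is an assumption, and the series is assumed to start in the first month of a quarter:

```python
def quarterly_averages(monthly_values):
    """Average consecutive groups of three monthly values.

    Assumes the series starts in the first month of a quarter; an
    incomplete final quarter is dropped.
    """
    return [sum(monthly_values[i:i + 3]) / 3
            for i in range(0, len(monthly_values) - 2, 3)]

# Monthly real turnover for 2008M01-2008M03, as given in the slides.
monthly = {
    "P001": [3526, 4252, 4111],
    "P002": [214, 332, 411],
    "P973": [66519, 36012, 52447],
}
quarterly = {ent: quarterly_averages(v) for ent, v in monthly.items()}
for ent, values in quarterly.items():
    print(ent, [round(x, 2) for x in values])
```

This gives 3963, 319 and 51659.33 for 2008Q1, matching the table.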

SURS experiment results (3)
Statistic which is "nowcasted": GDP at constant prices
Period | GDP at constant prices
2008Q1 | 9359.9963
2008Q2 | 10112.15698
2008Q3 | 9901.924772
… | …
2015Q4 | 9401.752916

SURS experiment results (4)
Principal component analysis: the 8 chosen principal components explain around 80.2% of the variability of the enterprise data.
Linear regression: Y: GDP_CP; X1, ..., X8: principal components. 97% of the variability of the real GDP_CP index is explained.
Comparison of GDP_CP and estimates of GDP_CP:
- Avg/max absolute errors: 43 / 117.2
- Avg/max absolute relative errors: 0.48% / 1.34%
- 2015Q4: original value 9401.7, estimate 9458, error 56.2, relative error 0.60%
Notes: Principal component analysis is run on the data from 2008Q1 to 2015Q4. Linear regression: the model is fitted on the chosen principal components with the last period removed, so that they run from 2008Q1 to 2015Q3; the model coefficients are then used to calculate the forecast for 2015Q4.
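
The 2015Q4 error figures can be checked directly from the reported values. A short sketch, taking the unrounded GDP value from the earlier slide and the estimate as reported (9458, presumably rounded):

```python
actual = 9401.752916   # GDP at constant prices, 2015Q4 (from the slides)
estimate = 9458.0      # nowcast as reported on the slide

abs_error = abs(estimate - actual)   # absolute error
rel_error = abs_error / actual       # absolute relative error

print(f"{abs_error:.1f}")   # 56.2
print(f"{rel_error:.2%}")   # 0.60%
```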

Methods (ongoing work, SGA1 & SGA2)
- Test at least one alternative method for nowcasting of economic indicators
- Include data from multiple sources (construction, services, ...)
- Test forecasting based on available data
- Prepare an inventory of nowcasting methods

Early estimates (ongoing work)
- Inventory of current practices in other countries/institutions
- Prepare a list of possible "new leading economic indicators"

IT tools involved in nowcasting of turnover indices
[Diagram: Data preparation, Modelling, Results; Statistical production]

IT infrastructure in SGA2
- Sandbox for non-sensitive big data (e.g. traffic loops data)
- Internal IT environments for sensitive data

Thank you for your attention!
boro.nikic@gov.si