FORECASTING METHODS OF NON-STATIONARY STOCHASTIC PROCESSES THAT USE EXTERNAL CRITERIA. Igor V. Kononenko, Anton N. Repin, National Technical University "Kharkiv Polytechnic Institute".


FORECASTING METHODS OF NON-STATIONARY STOCHASTIC PROCESSES THAT USE EXTERNAL CRITERIA. Igor V. Kononenko, Anton N. Repin, National Technical University "Kharkiv Polytechnic Institute", Ukraine. The 28th Annual International Symposium on Forecasting, June 22-25, 2008, Nice, France.

INTRODUCTION
When forecasting the development of socio-economic systems, one often faces the problem of forecasting non-stationary stochastic processes from a scarce number of observations (5-30), while repeated realizations of the processes are impossible. To solve such problems, a number of methods have been suggested in which the unknown parameters of the model are estimated not at all points of the time series but at a certain subset of points, called the learning sequence. The remaining points, not included in the learning sequence and called the check sequence, are used to judge the suitability of the model for describing the time series.
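The learning/check splitting described above can be illustrated with a short sketch. This is not the authors' algorithm: the model (a least-squares trend line) and the names `fit_trend` and `check_error` are assumptions for illustration only.

```python
# Illustrative sketch: estimate a model on a learning sequence and judge it
# on a check sequence.  The trend-line model is an assumption; the methods
# in the talk work with arbitrary candidate models.

def fit_trend(points):
    """Least-squares line y = a + b*t through (t, y) pairs."""
    n = len(points)
    st = sum(t for t, _ in points)
    sy = sum(y for _, y in points)
    stt = sum(t * t for t, _ in points)
    sty = sum(t * y for t, y in points)
    b = (n * sty - st * sy) / (n * stt - st * st)
    a = (sy - b * st) / n
    return a, b

def check_error(model, points):
    """Mean absolute deviation of the model on the check sequence."""
    a, b = model
    return sum(abs(y - (a + b * t)) for t, y in points) / len(points)

# A scarce series (10 observations): learning sequence of 7 points,
# check sequence of 3 points.
series = [(t, 2.0 + 0.5 * t) for t in range(10)]
learning, check = series[:7], series[7:]
model = fit_trend(learning)
err = check_error(model, check)
```

On this noiseless toy series the fitted line reproduces the data exactly, so the check error is zero; with noisy data the check error is what discriminates between candidate models.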

PURPOSE OF WORK
The purpose of this work is to create and study an effective method for forecasting non-stationary stochastic processes in the case when observations in the base period are scarce.

H-CRITERION METHOD (I. Kononenko, 1982)
Retrospective information: the vector of values of the predicted variable; q – the number of significant variables, including the predicted variable; n – the number of points in the time base of the forecast.

H-CRITERION METHOD (2)

H-CRITERION METHOD (3)
The parameters of all formed models are estimated using the learning submatrix. For the j-th model this yields the vector of estimated parameters; a vector of weighting coefficients accounts for the error variance or for the importance of individual points when building the model.

H-CRITERION METHOD (4)
For each j-th model, the deviation from the observations is calculated at all points of past history, yielding the criterion values for the current partition.

H-CRITERION METHOD (5)
New learning and check submatrices are chosen; the number of rows in Г L is decreased by one. The process of estimating the model parameters and calculating ε 3 is repeated. Then the learning submatrix is used as the check one and the check submatrix as the learning one, ε 4 is calculated, and we continue similarly using bipartitioning. The process is stopped after a set number of iterations g. H-criterion:
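The bipartitioning loop can be sketched as follows. The trivial sample-mean model and the plain averaging of the partition errors are illustrative assumptions; the transcript does not reproduce the actual combining formula of the H-criterion.

```python
from statistics import mean

def partition_errors(series, g):
    """Check errors from g bipartitions of a series of values; each
    iteration shrinks the learning part by one point, and the roles of
    the learning and check parts are then swapped."""
    errs = []
    for i in range(g):
        split = len(series) - 1 - i            # learning part loses one row per step
        learn, check = series[:split], series[split:]
        for fit_part, test_part in ((learn, check), (check, learn)):
            m = mean(fit_part)                 # trivial model: the sample mean
            errs.append(mean(abs(y - m) for y in test_part))
    return errs

def h_like_criterion(series, g):
    """Average check error over all 2*g partitions (plain averaging is
    an assumption standing in for the actual H-criterion formula)."""
    return mean(partition_errors(series, g))
```

Comparing `h_like_criterion` values across candidate models and picking the smallest mirrors the model-selection role the H-criterion plays in the method.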

METHOD THAT USES THE BOOTSTRAP EVALUATION (I. Kononenko, 1990)
Retrospective information: the vector of values of the predicted variable; q – the number of significant variables, including the predicted variable; n – the volume of past history.

METHOD THAT USES THE BOOTSTRAP EVALUATION (2)
Testing the model, where: L – the number of the model in the set of test models; N i – the vector of independent variables; B – the vector of estimated parameters; ε i – independent errors having the same symmetric density of distribution.

METHOD THAT USES THE BOOTSTRAP EVALUATION (3)
1. We estimate the parameters of the model using the whole matrix, basing on the condition of minimizing the loss function. Determining the deviations of the model from the points of the matrix, the numbers bias i form the BIAS vector.

METHOD THAT USES THE BOOTSTRAP EVALUATION (4)
2. We divide the matrix into two submatrices: a learning submatrix and a check submatrix. The first n-1 columns of the matrix are included in the learning submatrix and the n-th column in the check one. We estimate the parameters B of the test model and calculate the deviation of the model from the statistics. Let k=1, where k is the number of the iteration that performs the bootstrap evaluation.

METHOD THAT USES THE BOOTSTRAP EVALUATION (5)
3. We perform the bootstrap evaluation, which consists in the following. We randomly (with equal probability) select numbers from the BIAS vector and add them to the values of the model; as a result we obtain "new" statistics. We divide this new matrix into learning and check submatrices as before, estimate the unknown parameters on the learning submatrix, and calculate the model deviation from the check submatrix.

METHOD THAT USES THE BOOTSTRAP EVALUATION (6)
4. If k<K-1, we set k:=k+1 and return to step 3 (where K is the number of bootstrap iterations); otherwise we proceed to step 5.
5. We evaluate the average deviation D L.
6. If L<z, we set L:=L+1 and move to step 1 (where z is the number of models in the list); otherwise we stop. The model with the minimal D L is considered to be the best one.
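Steps 1-6 can be condensed into a sketch. The residual resampling follows the description above, but the concrete model interface (`fit`, `predict`) and the single hold-out column are simplifications introduced here, not details from the slides.

```python
import random
from statistics import mean

def bootstrap_score(series, fit, predict, K=100, seed=0):
    """Sketch of the bootstrap evaluation for one test model.
    series  : list of observed values y_0..y_{n-1}
    fit     : estimates parameters from a list of values
    predict : predicts the value at time t from the parameters"""
    rng = random.Random(seed)
    learn, hold = series[:-1], series[-1]        # n-th point forms the check part
    params = fit(learn)
    bias = [y - predict(params, t) for t, y in enumerate(learn)]  # BIAS vector
    devs = []
    for _ in range(K):                           # K bootstrap iterations
        pseudo = [predict(params, t) + rng.choice(bias)
                  for t in range(len(series))]   # "new" statistics
        p = fit(pseudo[:-1])                     # re-estimate on the learning part
        devs.append(abs(hold - predict(p, len(series) - 1)))
    return mean(devs)                            # average deviation D for this model

# Trivial constant model, used only to exercise the sketch.
fit_mean = lambda ys: mean(ys)
pred_mean = lambda params, t: params
score = bootstrap_score([3.0] * 10, fit_mean, pred_mean, K=50)
```

Running `bootstrap_score` for every model in the list and keeping the one with the minimal score mirrors step 6.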

ANALYSIS OF METHODS
The methods were analyzed on a set of mathematical models with additive noise, using a fixed loss function.

ANALYSIS OF METHODS (2)
Forecast accuracy characteristics: relative PMAD, evaluated at the estimation period; PMAD, evaluated at the estimation period; PMAD, evaluated at the forecasting period; relative MSE, evaluated at the forecasting period; MSE, evaluated at the forecasting period.
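For reference, PMAD and MSE can be computed as below. The PMAD form used here (sum of absolute errors over sum of absolute actuals, in percent) is a common definition and an assumption, since the slides do not reproduce the formula.

```python
def pmad(actual, fitted):
    """Percent mean absolute deviation: 100 * sum|y - f| / sum|y| (assumed form)."""
    return 100.0 * sum(abs(y - f) for y, f in zip(actual, fitted)) \
        / sum(abs(y) for y in actual)

def mse(actual, fitted):
    """Mean squared error over the evaluation period."""
    return sum((y - f) ** 2 for y, f in zip(actual, fitted)) / len(actual)
```

For actuals [10, 10] and fitted values [9, 11], `pmad` gives 10.0 and `mse` gives 1.0.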

ANALYSIS OF METHODS (3) Average (across noise realizations):

ANALYSIS OF METHODS (4)
We compared the characteristics with the two-sample t-test, assuming the samples were drawn from normally distributed populations:
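Such a comparison can use the classical pooled two-sample t statistic. Whether the pooled or the Welch form was used is not stated in the transcript, so the pooled (equal-variance) form below is an assumption.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(x, y):
    """Pooled two-sample t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    # Pooled variance estimate under the equal-variance assumption.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1.0 / nx + 1.0 / ny))
    return t, nx + ny - 2
```

The statistic is then compared with the critical value of Student's t distribution at the chosen significance level.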

ANALYSIS OF METHODS (5) PMAD, evaluated at the forecasting period

RESULTS OF ANALYSIS
When the number of partitions increases, in the case of matrices R 1 and R 2 we observe a downward trend of PMAD, with some fluctuations in this trend that depend on the way the data are partitioned. The partition according to the cross-validation procedure, in which the check points fall inside the observation interval, produces significantly less accurate forecasts. The comparison of the efficiency of different partitions with the randomly generated matrix R 4 has shown that a reasonable choice of partition sequences permits a more accurate longer-term forecast.

RESULTS OF ANALYSIS (2)
The method that uses bootstrap evaluation produces more accurate forecasts than the cross-validation procedure. The comparison of the two suggested methods shows that the bootstrap-based method yields more accurate longer-term forecasts than the H-criterion method only in the case of a small number of partitions; otherwise, the use of the selected matrices R 1 or R 2 permits more accurate forecasts. Nevertheless, the method that uses bootstrap evaluation turned out to be more accurate than the H-criterion method when using matrix R 4.

PRODUCTION VOLUME OF BREAD AND BAKERY IN KHARKIV REGION
The mean relative error is 5.91 %.

COMBINED USE OF TWO METHODS
In real-life problems the method that uses the bootstrap evaluation may turn out to be more accurate in some cases, so it is recommended to use the given methods together. In that case every result obtained by one of the methods is assigned a weight on the basis of a priori estimates of the methods' accuracy, and the final forecast is obtained as the weighted average of the individual forecasts.
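The weighted combination can be sketched as follows. Weights inversely proportional to the a priori error estimates are one reasonable choice, an assumption rather than a rule fixed by the slides.

```python
def combine_forecasts(forecasts, error_estimates):
    """Weighted average of individual forecasts; a forecast with a smaller
    a priori error estimate receives a larger weight (assumed 1/e weighting)."""
    weights = [1.0 / e for e in error_estimates]
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, forecasts)) / total
```

With equal error estimates the combination reduces to the plain average of the individual forecasts.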

Igor V. Kononenko – Professor, Doctor of Technical Sciences, Head of the Strategic Management Department; Anton N. Repin – post-graduate student of the Strategic Management Department. STRATEGIC MANAGEMENT DEPARTMENT, NATIONAL TECHNICAL UNIVERSITY "KHARKIV POLYTECHNIC INSTITUTE", 21, Frunze St., Kharkiv, 61002, Ukraine. Phone: +38(057) Fax: +38(057)