Time series
Characteristics
– Non-independent observations (correlation structure)
– Systematic variation within a year (seasonal effects)
– Long-term increasing or decreasing level (trend)
– Irregular variation of small magnitude (noise)
Where can time series be found?
– Economic indicators: sales figures, employment statistics, stock market indices, …
– Meteorological data: precipitation, temperature, …
– Environmental monitoring: concentrations of nutrients and pollutants in air masses, rivers, marine basins, …
Time series analysis
Purpose: estimate the different parts of a time series in order to
– understand the historical pattern
– assess the current status
– make forecasts of future development
Methodologies:
Method                                      This course?
Time series regression                      Yes
Classical decomposition                     Yes
Exponential smoothing                       Yes
ARIMA modelling (Box-Jenkins)               Yes
Non-parametric tests                        No
Transfer function and intervention models   No
State space modelling                       No
Spectral domain analysis                    No
Time series regression
Let y_t = (observed) value of the time series at time point t, and assume a year is divided into L seasons.
Regression model (with linear trend):
y_t = β0 + β1·t + Σj βsj·x_j,t + ε_t
where x_j,t = 1 if y_t belongs to season j and 0 otherwise, j = 1, …, L−1, and the error terms {ε_t} are assumed to have zero mean and constant variance (σ²).
The parameters β0, β1, βs1, …, βs,L−1 are estimated by the Ordinary Least Squares (OLS) method:
(b0, b1, bs1, …, bs,L−1) = argmin Σt ( y_t − (β0 + β1·t + Σj βsj·x_j,t) )²
Advantages:
– Simple and robust method
– Easily interpreted components
– Normal-theory inference (confidence intervals, hypothesis tests) is directly applicable
Drawbacks:
– Fixed components in the model (a mathematical trend function and constant seasonal components)
– No account is taken of correlation between observations
Example: Sales figures

Month   1998    1999    2000    2001
Jan     20.33   23.58   26.09   28.43
Feb     20.96   24.61   26.66   29.92
Mar     23.06   27.28   29.61   33.44
Apr     24.48   27.69   32.12   34.56
May     25.47   29.99   34.01   34.22
Jun     28.81   30.87   32.98   38.91
Jul     30.32   32.09   36.38   41.31
Aug     29.56   34.53   35.90   38.89
Sep     30.01   30.85   36.42   40.90
Oct     26.78   30.24   34.04   38.27
Nov     23.75   27.86   31.29   32.02
Dec     24.06   24.67   28.50   29.78
Construct seasonal indicators x1, x2, …, x12:
January (1998–2001):   x1 = 1, x2 = 0, x3 = 0, …, x12 = 0
February (1998–2001):  x1 = 0, x2 = 1, x3 = 0, …, x12 = 0
etc.
December (1998–2001):  x1 = 0, x2 = 0, x3 = 0, …, x12 = 1
Use 11 of the indicators, e.g. x1–x11, in the regression model.

sales   time   x1  x2  x3  x4  x5  x6  x7  x8  x9  x10  x11  x12
20.33    1     1   0   0   0   0   0   0   0   0   0    0    0
20.96    2     0   1   0   0   0   0   0   0   0   0    0    0
23.06    3     0   0   1   0   0   0   0   0   0   0    0    0
24.48    4     0   0   0   1   0   0   0   0   0   0    0    0
…        …     …
32.02   47     0   0   0   0   0   0   0   0   0   0    1    0
29.78   48     0   0   0   0   0   0   0   0   0   0    0    1
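A minimal sketch (not part of the original slides) of fitting this seasonal-dummy regression in Python with numpy and statsmodels; the data are the sales figures above and all variable names are illustrative.

```python
# Sketch: OLS fit of sales on a linear trend and 11 monthly indicators
# (December as the reference season, as on the slide).
import numpy as np
import statsmodels.api as sm

sales = np.array([
    20.33, 20.96, 23.06, 24.48, 25.47, 28.81, 30.32, 29.56, 30.01, 26.78, 23.75, 24.06,
    23.58, 24.61, 27.28, 27.69, 29.99, 30.87, 32.09, 34.53, 30.85, 30.24, 27.86, 24.67,
    26.09, 26.66, 29.61, 32.12, 34.01, 32.98, 36.38, 35.90, 36.42, 34.04, 31.29, 28.50,
    28.43, 29.92, 33.44, 34.56, 34.22, 38.91, 41.31, 38.89, 40.90, 38.27, 32.02, 29.78,
])
n = len(sales)
time = np.arange(1, n + 1)

month = (time - 1) % 12                    # 0 = January, ..., 11 = December
dummies = np.zeros((n, 11))
for j in range(11):                        # indicators x1..x11
    dummies[:, j] = (month == j).astype(float)

X = sm.add_constant(np.column_stack([time, dummies]))
fit = sm.OLS(sales, X).fit()
print(fit.summary())                       # coefficients comparable to the Minitab output below
```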
Regression Analysis: sales versus time, x1, ...

The regression equation is
sales = 18.9 + 0.263 time + 0.750 x1 + 1.42 x2 + 3.96 x3 + 5.07 x4 + 6.01 x5 + 7.72 x6
        + 9.59 x7 + 9.02 x8 + 8.58 x9 + 6.11 x10 + 2.24 x11

Predictor   Coef      SE Coef   T       P
Constant    18.8583   0.6467    29.16   0.000
time        0.26314   0.01169   22.51   0.000
x1          0.7495    0.7791    0.96    0.343
x2          1.4164    0.7772    1.82    0.077
x3          3.9632    0.7756    5.11    0.000
x4          5.0651    0.7741    6.54    0.000
x5          6.0120    0.7728    7.78    0.000
x6          7.7188    0.7716    10.00   0.000
x7          9.5882    0.7706    12.44   0.000
x8          9.0201    0.7698    11.72   0.000
x9          8.5819    0.7692    11.16   0.000
x10         6.1063    0.7688    7.94    0.000
x11         2.2406    0.7685    2.92    0.006

S = 1.087   R-Sq = 96.6%   R-Sq(adj) = 95.5%
Analysis of Variance

Source           DF   SS         MS       F       P
Regression       12   1179.818   98.318   83.26   0.000
Residual Error   35   41.331     1.181
Total            47   1221.150

Source   DF   Seq SS
time     1    683.542
x1       1    79.515
x2       1    72.040
x3       1    16.541
x4       1    4.873
x5       1    0.204
x6       1    10.320
x7       1    63.284
x8       1    72.664
x9       1    100.570
x10      1    66.226
x11      1    10.039
Unusual Observations

Obs   time   sales    Fit      SE Fit   Residual   St Resid
12    12.0   24.060   22.016   0.583     2.044      2.23R
21    21.0   30.850   32.966   0.548    -2.116     -2.25R

R denotes an observation with a large standardized residual

Predicted Values for New Observations

New Obs   Fit      SE Fit   95.0% CI            95.0% PI
1         32.502   0.647    (31.189, 33.815)    (29.934, 35.069)

Values of Predictors for New Observations

New Obs   time   x1     x2   x3   x4   x5   x6   x7   x8   x9   x10   x11
1         49.0   1.00   0    0    0    0    0    0    0    0    0     0
What about serial correlation in data?
Positive serial correlation: values follow a smooth pattern.
Negative serial correlation: values show a “thorny” (jagged) pattern.
How can we assess it? Examine the residuals.
Residual plot from the regression analysis: Smooth or thorny?
Durbin-Watson test on residuals:
Rule of thumb: if d is below 1 or above 3, the conclusion is that the residuals (and the original data) are serially correlated. Use the shape of the residual plot (smooth or thorny) to decide whether the correlation is positive or negative.
(More thorough rules for the comparisons and for deciding between positive and negative correlation exist.)
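The statistic itself is d = Σ_{t≥2} (e_t − e_{t−1})² / Σ_t e_t², where e_t are the regression residuals; values near 2 indicate no serial correlation. A small sketch (not in the slides) of computing it directly, continuing the Python example above; the statsmodels call is included only as a cross-check.

```python
# Sketch: Durbin-Watson statistic from the residuals of the OLS fit above.
import numpy as np
from statsmodels.stats.stattools import durbin_watson

e = fit.resid                                    # residuals from the seasonal regression
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)     # sum of squared successive differences / SSE
print(round(d, 2))                               # the slides report d = 2.05 for this model
print(round(durbin_watson(e), 2))                # same statistic via statsmodels
```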
Durbin-Watson statistic = 2.05 (reported in the regression output).
The value is > 1 and < 3, so there is no significant serial correlation in the residuals!
Decompose – analyse the observed time series into its different components:
– Trend part (TR)
– Seasonal part (SN)
– Cyclical part (CL)
– Irregular part (IR)
Cyclical part: state-of-market (business cycle) variation in economic time series. In environmental series it is usually estimated together with TR.
Multiplicative model: y_t = TR_t · SN_t · CL_t · IR_t
– Suitable for economic indicators
– The level is carried by TR_t, or by TC_t = (TR·CL)_t
– SN_t, IR_t (and CL_t) act as indices
– The seasonal variation increases with the level of y_t
Additive model: y_t = TR_t + SN_t + CL_t + IR_t
– More suitable for environmental data
– Requires a constant seasonal variation
– SN_t, IR_t (and CL_t) vary around 0
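To make the distinction concrete, here is a small illustrative simulation (not from the slides; all parameter values are arbitrary) in which the same trend is combined with an additive seasonal term varying around 0 and with a multiplicative seasonal index varying around 1.

```python
# Illustrative sketch: additive vs multiplicative seasonality around the same trend.
# In the additive series the seasonal swing is constant; in the multiplicative series
# it grows with the level. All numbers are arbitrary illustration values.
import numpy as np

t = np.arange(1, 97)                                 # 8 years of monthly data
trend = 20 + 0.25 * t                                # TR_t
season_add = 3.0 * np.sin(2 * np.pi * t / 12)        # SN_t varying around 0
season_mul = 1 + 0.15 * np.sin(2 * np.pi * t / 12)   # SN_t acting as an index around 1
rng = np.random.default_rng(0)

y_additive = trend + season_add + rng.normal(0, 0.5, t.size)
y_multiplicative = trend * season_mul * (1 + rng.normal(0, 0.02, t.size))
```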
Example 1: Sales data
Example 2:
Estimation of components, working scheme
1. Seasonal adjustment / deseasonalisation: SN_t usually accounts for the largest amount of variation among the components. The time series is deseasonalised by calculating centred and weighted moving averages
M_t = ( (1/2)·y_{t−L/2} + y_{t−L/2+1} + … + y_{t+L/2−1} + (1/2)·y_{t+L/2} ) / L
where L = number of seasons within a year (L = 2 for half-year data, 4 for quarterly data and 12 for monthly data).
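A sketch (not from the slides) of this centred, weighted moving average in Python, assuming an even number of seasons L and reusing the sales series defined earlier; the ends of the series, where the window does not fit, are left as missing values.

```python
# Sketch: centred, weighted moving average M_t over L seasons (L assumed even).
import numpy as np

def centred_ma(y, L=12):
    y = np.asarray(y, dtype=float)
    half = L // 2
    weights = np.ones(L + 1)
    weights[0] = weights[-1] = 0.5          # half weight on the two outermost values
    weights /= L                            # weights now sum to 1
    m = np.full(y.size, np.nan)
    for t in range(half, y.size - half):
        m[t] = np.dot(weights, y[t - half : t + half + 1])
    return m

M = centred_ma(sales, L=12)                 # 'sales' from the regression sketch above
```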
– M_t becomes a rough estimate of (TR·CL)_t.
– Rough seasonal components are obtained by
  y_t / M_t in a multiplicative model
  y_t − M_t in an additive model
– Mean values of the rough seasonal components are calculated for each season separately, giving L means.
– The L means are adjusted to
  have an exact average of 1 (i.e. their sum equals L) in a multiplicative model
  have an exact average of 0 (i.e. their sum equals zero) in an additive model
– The final estimates of the seasonal components are set to these adjusted means and are denoted sn_1, …, sn_L.
– The time series is now deseasonalised by
  y_t / sn_t in a multiplicative model
  y_t − sn_t in an additive model
where sn_t is the one of sn_1, …, sn_L that corresponds to the season which t represents.
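A sketch (not from the slides) of these steps for the additive model, continuing the Python example above; for the multiplicative model, subtraction would be replaced by division and the centring of the means by rescaling so that they sum to L.

```python
# Sketch: seasonal indices and deseasonalisation, additive model.
# Reuses 'sales' and centred_ma() from the earlier sketches.
import numpy as np

L = 12
y = np.asarray(sales, dtype=float)
M = centred_ma(y, L)                                    # rough estimate of (TR*CL)_t
rough = y - M                                           # rough seasonal components

season = np.arange(y.size) % L                          # season 0..L-1 of each observation
means = np.array([np.nanmean(rough[season == s]) for s in range(L)])
sn = means - means.mean()                               # adjusted means, sum exactly 0
sn_t = sn[season]                                       # seasonal component matched to each t

deseasonalised = y - sn_t                               # seasonally adjusted series
```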
2. The seasonally adjusted values are used to estimate the trend component and, occasionally, the cyclical component.
If no cyclical component is present:
– Apply simple linear regression to the seasonally adjusted values. This gives estimates tr_t of a linear (or quadratic) trend component.
– The residuals from the regression fit constitute estimates ir_t of the irregular component.
If a cyclical component is present:
– Estimate the trend and cyclical components as a whole (do not split them) by
  tc_t = ( y*_{t−m} + … + y*_t + … + y*_{t+m} ) / (2m+1)
i.e. a non-weighted centred moving average of length 2m+1 calculated over the seasonally adjusted values y*_t.
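Continuing the Python sketch (again, not part of the slides), both branches of this step can be written out as follows; the moving-average length 2m+1 = 7 is only an example value.

```python
# Sketch: step 2 on the deseasonalised values from the previous sketch.
import numpy as np

t = np.arange(1, deseasonalised.size + 1)

# No cyclical component: linear trend tr_t by least squares, residuals estimate ir_t.
b1, b0 = np.polyfit(t, deseasonalised, 1)     # slope, intercept
tr = b0 + b1 * t
ir = deseasonalised - tr

# Cyclical component present: unweighted centred moving average of length 2m+1.
def unweighted_ma(x, window):
    half = window // 2
    out = np.full(x.size, np.nan)
    for i in range(half, x.size - half):
        out[i] = x[i - half : i + half + 1].mean()
    return out

tc = unweighted_ma(deseasonalised, 7)         # here 2m+1 = 7 (illustrative choice)
```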
– Common values for 2m+1: 3, 5, 7, 9, 11, 13.
– The choice of m is based on properties of the final estimate of IR_t, which is calculated as
  ir_t = y*_t / tc_t in a multiplicative model
  ir_t = y*_t − tc_t in an additive model
  where y*_t are the seasonally adjusted values.
– m is chosen so as to minimise the serial correlation and the variance of ir_t.
– 2m+1 is called the (number of) points of the moving average.
Example, cont.: Home sales data
Minitab can be used for decomposition via Stat → Time Series → Decomposition. The dialog includes a choice of model type, i.e. an option to choose between the two models (additive or multiplicative).
Time Series Decomposition

Data      Sold
Length    47.0000
NMissing  0

Trend Line Equation
Yt = 5.77613 + 4.30E-02*t

Seasonal Indices
Period   Index
1        -4.09028
2        -4.13194
3         0.909722
4        -1.09028
5         3.70139
6         0.618056
7         4.70139
8         4.70139
9        -1.96528
10        0.118056
11       -1.29861
12       -2.17361

Accuracy of Model
MAPE: 16.4122
MAD:   0.9025
MSD:   1.6902
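For readers working in Python rather than Minitab, a roughly analogous classical decomposition is available in statsmodels (a sketch, not the procedure used on the slides; the home sales values are not reproduced in this material, so the sales series from the regression example is used instead, and the resulting indices will not match the Minitab output above).

```python
# Sketch: classical decomposition of a monthly series with statsmodels.
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

series = pd.Series(sales, index=pd.date_range("1998-01", periods=len(sales), freq="MS"))
result = seasonal_decompose(series, model="additive", period=12)
print(result.seasonal[:12].round(3))    # one full year of seasonal components
print(result.trend.dropna().head())     # centred-moving-average trend estimate
```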
The deseasonalised data have been stored in a column headed DESE1. Moving averages of this column can be calculated via Stat → Time Series → Moving Average, where the length 2m+1 is chosen.
MSD should be kept as small as possible.
[Plot: TC component with 2m+1 = 3 (blue)]
By saving the residuals from the moving averages we can calculate the MSD and the serial correlation for each choice of 2m+1 (see the sketch below):

2m+1   MSD     Corr(e_t, e_t-1)
3      1.817   -0.444
5      1.577   -0.473
7      1.564   -0.424
9      1.602   -0.396
11     1.542   -0.431
13     1.612   -0.405

A 7-point or 9-point moving average seems most reasonable.
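A sketch (not from the slides) of producing such a table in Python from the deseasonalised values and unweighted_ma() defined earlier; since the actual home sales data are not reproduced here, the printed numbers will differ from the table above.

```python
# Sketch: MSD and lag-1 serial correlation of the irregular component
# for several moving-average lengths 2m+1 (additive model).
import numpy as np

for window in (3, 5, 7, 9, 11, 13):
    tc = unweighted_ma(deseasonalised, window)
    ir = deseasonalised - tc                      # irregular component estimate
    e = ir[~np.isnan(ir)]
    msd = np.mean(e ** 2)
    corr = np.corrcoef(e[1:], e[:-1])[0, 1]       # Corr(e_t, e_{t-1})
    print(f"2m+1={window:2d}  MSD={msd:.3f}  lag-1 corr={corr:.3f}")
```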
Serial correlations are simply calculated via Stat → Time Series → Lag followed by Stat → Basic Statistics → Correlation, or manually in the Session window:
MTB > lag ’RESI4’ c50
MTB > corr ’RESI4’ c50
Analysis with multiplicative model:
Time Series Decomposition

Data      Sold
Length    47.0000
NMissing  0

Trend Line Equation
Yt = 5.77613 + 4.30E-02*t

Seasonal Indices
Period   Index
1        0.425997
2        0.425278
3        1.14238
4        0.856404
5        1.52471
6        1.10138
7        1.65646
8        1.65053
9        0.670985
10       1.02048
11       0.825072
12       0.700325

Accuracy of Model
MAPE: 16.8643
MAD:   0.9057
MSD:   1.6388
additive
Classical decomposition, summary
Multiplicative model: y_t = TR_t · SN_t · CL_t · IR_t
Additive model: y_t = TR_t + SN_t + CL_t + IR_t
Deseasonalisation
Estimate the trend + cyclical component by a centred moving average:
tc_t = ( (1/2)·y_{t−L/2} + y_{t−L/2+1} + … + y_{t+L/2−1} + (1/2)·y_{t+L/2} ) / L
where L is the number of seasons (e.g. 12, 4, 2).
Filter out the seasonal and error (irregular) components:
– Multiplicative model: sn_t · ir_t = y_t / tc_t
– Additive model: sn_t + ir_t = y_t − tc_t
Calculate monthly averages for seasons m = 1, …, L:
– Multiplicative model: sn̄_m = average of y_t / tc_t over all t belonging to season m
– Additive model: sn̄_m = average of y_t − tc_t over all t belonging to season m
Normalise the monthly means:
– Multiplicative model: sn_m = sn̄_m · L / (sn̄_1 + … + sn̄_L), so that the indices sum to L
– Additive model: sn_m = sn̄_m − (sn̄_1 + … + sn̄_L)/L, so that the components sum to 0
Deseasonalise:
– Multiplicative model: y*_t = y_t / sn_t
– Additive model: y*_t = y_t − sn_t
where sn_t = sn_m for the current month (season) m.
Fit a trend function to the deseasonalised data and detrend:
– Multiplicative model: fit tr_t to y*_t and form y*_t / tr_t
– Additive model: fit tr_t to y*_t and form y*_t − tr_t
Estimate the cyclical component and separate it from the error component:
– Multiplicative model: estimate cl_t (e.g. by a centred moving average of the detrended values y*_t / tr_t) and set ir_t = (y*_t / tr_t) / cl_t
– Additive model: estimate cl_t (e.g. by a centred moving average of the detrended values y*_t − tr_t) and set ir_t = (y*_t − tr_t) − cl_t
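As a final continuation of the Python sketch (not from the slides), the last two summary steps for the additive model could look as follows; the moving-average length is again only an example value.

```python
# Sketch (additive model): cyclical component from the detrended, deseasonalised
# series via a centred moving average; the remainder estimates the irregular part.
# Reuses deseasonalised, tr and unweighted_ma() from the earlier sketches.
detrended = deseasonalised - tr              # y*_t - tr_t
cl = unweighted_ma(detrended, 7)             # cyclical component, 2m+1 = 7 (illustrative)
ir_t = detrended - cl                        # irregular component
```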