1
Data analyses 2008, Lecture 2, 16-10-2008
2
Last lecture: basic statistics, testing, linear regression parameters, skill
3
What is a time series / random data? A time series may be seen as a randomly selected finite section of an infinitely long sequence of random numbers (Storch and Zwiers). A time series is a stochastic (random) process, ordered as random samples X_t. A single time history is called a sample function or record. The simplest example is white noise (random data with no deterministic signal).
4
Stationary process. Definition: all stochastic properties are independent of time. Classification: a random process is either stationary or non-stationary; a stationary process is either ergodic or non-ergodic.
5
Figure: ensemble (collection of sample functions), with the time lag τ indicated.
6
Stationarity (2). Ensemble mean: sum all values at time t_1 across the ensemble and divide by the ensemble size (the first moment); the autocovariance is formed analogously (see the sketch below).
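The equation images of this slide are not preserved in the transcript; as a sketch, the standard ensemble definitions matching the description above are:

\mu_x(t_1) = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} x_k(t_1)

C_{xx}(t_1, t_1+\tau) = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \bigl[ x_k(t_1) - \mu_x(t_1) \bigr] \bigl[ x_k(t_1+\tau) - \mu_x(t_1+\tau) \bigr]

where x_k is the k-th sample function and N the ensemble size.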
7
Stationarity (3). If μ_x(t_1) and C_xx(t_1, t_1+τ) vary as a function of time t_1, then X_t is non-stationary. If μ_x(t_1) and C_xx(t_1, t_1+τ) do not vary as a function of t_1, then X_t is weakly stationary. Note that the autocovariance then depends only on the time lag τ, so C_xx(t_1, t_1+τ) = C_xx(τ).
8
Strongly stationary. Definition: an infinite collection of higher-order moments (the complete probability distribution function) is time invariant. Also called: stationary in the strict sense. This is usually unknown; for most analysis purposes weak stationarity is sufficient.
9
Ergodic. So far we used ensembles to calculate the mean and autocovariance, but we can also consider the k-th sample function alone and form time averages μ_x(k) and C_xx(τ, k). If μ_x(k) and C_xx(τ, k) are independent of k, the time series is ergodic.
10
Ergodic (2). We can then write μ_x(k) = μ_x and C_xx(τ, k) = C_xx(τ) (note that only a stationary process can be ergodic). Advantage: only one single sample function is needed. In practice we have to assume ergodicity; this is sometimes tested by splitting the data set into subsamples.
11
Time series: X_t = D_t + N_t, where X_t is the time series, D_t the deterministic component and N_t the stochastic component (noise). The purpose of time series analysis is to detect and describe the deterministic component.
12
Time series (2). Figure: deterministic oscillation D_t, white noise N_t, and the resulting series X_t.
13
Time series (3). Figure: quasi-oscillatory signal D_t, white noise N_t, and the resulting series X_t; the dynamical component is changed by the noise.
14
Autocorrelation (1) (ACF). Purpose: to see whether there are repetitions in a time series. Each point in time can be compared with a previous point in time, or with any set of previous points, and the similarity can be studied. The data must be available at regular time spacing!
15
Recalling lecture 1: focus on the mutual variability of pairs of properties. Covariance is the joint variation of two variables about their common mean. Now we take both variables from the same series, at times t and t+τ; for τ = 0 this reduces to the ordinary variance (because then t = t+τ).
16
Autocorrelation (2). If τ = 0 then C_xx = s², the variance. Autocovariance and autocorrelation are defined as sketched below; the subscript xx refers to the same variable, but it can be replaced by x_1 and x_2 if two variables are considered. For stationary processes the definitions depend on the lag only.
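As a sketch of the missing formulas (standard definitions for a stationary process; the slide's own notation may differ slightly):

C_{xx}(\tau) = E\bigl[ (X_t - \mu)(X_{t+\tau} - \mu) \bigr], \qquad \rho_{xx}(\tau) = \frac{C_{xx}(\tau)}{C_{xx}(0)} = \frac{C_{xx}(\tau)}{s^2}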
17
Recalling lecture 1: the correlation coefficient (r) is defined as the ratio of the covariance to the product of the standard deviations. It is a scaled quantity: 1 is a perfect correlation, 0 is no correlation, -1 is a perfect inverse correlation. The autocorrelation is the time-series analogue of the correlation coefficient.
18
Autocorrelation (2b). Properties: ρ(0) = 1, ρ(τ) = ρ(-τ), and |ρ(τ)| ≤ 1. Note that the autocorrelation function is not unique: many processes can have similar autocorrelation functions, so the autocorrelation is not invertible!
19
Autocorrelation (6) (figure: example autocorrelation functions).
20
Autocorrelation (7). In most cases the autocorrelation cannot be solved analytically; it is retrieved from a simple calculation, basically a summation over the time domain (see the sketch below).
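A minimal sketch of this calculation in Python (not from the lecture; the array and function names are illustrative):

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Sample autocorrelation r(tau) of an equally spaced series x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    a = x - x.mean()                      # anomalies about the sample mean
    denom = np.sum(a ** 2)                # lag-0 term (n times the sample variance)
    return np.array([np.sum(a[: n - tau] * a[tau:]) / denom
                     for tau in range(max_lag + 1)])

# White noise: r(0) = 1, all other lags close to zero
rng = np.random.default_rng(0)
print(sample_autocorrelation(rng.standard_normal(1000), max_lag=5))
```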
21
Cross-correlation (1). Two time series of different variables. Cross-correlation: the covariance divided by the product of the standard deviations.
22
Partial autocorrelation function: an autocorrelation function that gives the magnitude of the lag-k autocorrelation between X_t and X_{t-k}, controlling for all intervening autocorrelations. From regression analysis: X_1 partially varies because of variation in X_2 and partially because of variation in X_3.
23
Partial autocorrelation (2). In the partial-correlation notation from regression analysis, the subscript after the dot (variable 2 here) is the variable kept constant. In time series analysis we find the analogue for the intervening lags. The partial autocorrelation is used for assessing the order of stochastic models.
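In practice both functions can be obtained from a library; a sketch assuming the statsmodels package is available (the series x is a placeholder):

```python
import numpy as np
from statsmodels.tsa.stattools import acf, pacf   # assumes statsmodels is installed

rng = np.random.default_rng(1)
x = rng.standard_normal(500)       # placeholder series; use your own data here

print(acf(x, nlags=10))            # autocorrelation up to lag 10
print(pacf(x, nlags=10))           # partial autocorrelation up to lag 10
```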
24
Autoregressive models and moving average models: stochastic models, partly deterministic and partly random. Tools: the autocorrelation and the partial autocorrelation.
25
White Noise (figure). Non-stationary process because the variance increases.
26
Random walk (1). The random walk is an example of an autoregressive model: an autoregressive model of order 1 with φ_1 = 1 (see the sketch below).
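Written out (a standard sketch, with Z_t white noise of variance \sigma_Z^2; the slide's own notation may differ):

X_t = X_{t-1} + Z_t = X_0 + \sum_{i=1}^{t} Z_i, \qquad \operatorname{Var}(X_t - X_0) = t \, \sigma_Z^2

so the variance grows with time and the random walk is non-stationary.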
27
Random walk (3). Distribution of air pollution resulting from advection plus a random contribution (Storch and Zwiers, 1999).
28
Basic idea of time series analysis. Autoregressive model: the value at t depends on previous values at t-i plus some random perturbation. Moving average model: the value at t depends on the random perturbations of previous values at t-i plus some random perturbation. The aim is to see if you can learn something from a data set which looks noisy.
29
Basic formulation of autoregressive integrated moving average models. ARIMA(p,d,q) includes an autoregressive process, an integrated process and a moving average process. If d = 0 the series is stationary; if d = 1 it is non-stationary and has first to be transformed to stationarity (stationary: mean and variance constant).
30
AR(1) process. AR(1) process ~ ARIMA(1,0,0), a Markov process. General formulation: X_t = φ_0 + φ_1 X_{t-1} + Z_t, where φ_1 is the first-order autoregressive coefficient and Z_t is white noise (X_t on one side, the rest on the other side). A simulation sketch follows.
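A short simulation sketch of this formulation in Python (hypothetical parameter values, not from the lecture):

```python
import numpy as np

def simulate_ar1(phi1, n, sigma=1.0, seed=0):
    """Simulate X_t = phi1 * X_{t-1} + Z_t with Gaussian white noise Z_t."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi1 * x[t - 1] + rng.normal(scale=sigma)
    return x

red  = simulate_ar1(0.8, 1000)    # 0 < phi1 < 1: red noise (slow variations)
blue = simulate_ar1(-0.8, 1000)   # phi1 < 0: blue noise (frequent sign changes)
walk = simulate_ar1(1.0, 1000)    # phi1 = 1: random walk (non-stationary)
```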
31
AR(2) process. AR(2) ~ ARIMA(2,0,0), dependent on the previous two time steps: X_t = φ_0 + φ_1 X_{t-1} + φ_2 X_{t-2} + Z_t. In ARIMA(p,0,0), p is the order of the process.
32
MA(1) process. Autoregressive models depend on previous observations; moving average models depend on innovations. General formulation for an MA(1) model ~ ARIMA(0,0,1) (sketched below): Z_t is the innovation or shock and θ_1 is the first-order moving average coefficient.
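As a sketch of the missing formula, in the Box-Jenkins sign convention (which is consistent with the spike signs in the summary tables later in this lecture):

X_t = \mu + Z_t - \theta_1 Z_{t-1}

and, analogously, the MA(2) model on the next slide reads X_t = \mu + Z_t - \theta_1 Z_{t-1} - \theta_2 Z_{t-2}.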
33
MA(2) process ~ ARIMA(0,0,2): the current observation is a function of the mean, the current innovation and two past innovations. In ARIMA(0,0,q), q is the order of the moving average process; ARIMA(p,0,q) is an autoregressive moving average model. So the order tells us the number of previous observations (or innovations) of which the series is a significant function.
34
The (1) process in a different notation: there is no need to bother about the mean (it does not influence the autocorrelation), so the expression can be regarded as a normalized version of the time series.
35
AR process as differential equation. The time series should be weakly stationary (constant mean and variance). "Autoregressive" indicates that the process evolves by regressing past values towards the mean and then adding noise. An AR process corresponds to a discretized differential equation a_2 d²x/dt² + a_1 dx/dt + a_0 x = z(t), where a_0, a_1, a_2 are constants and z_t is the external forcing.
36
AR process as differential equation (2). Discretized version (backward in time): this yields an AR process if z_t is white noise. Exercise: write the AR(1) process as a first-order differential equation (one possible sketch follows below).
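One possible sketch for the exercise, assuming the first-order equation a_1 \, dx/dt + a_0 x = z(t) and a backward difference with time step \Delta t:

a_1 \frac{x_t - x_{t-\Delta t}}{\Delta t} + a_0 x_t = z_t
\quad\Longrightarrow\quad
x_t = \frac{a_1}{a_1 + a_0 \Delta t} \, x_{t-\Delta t} + \frac{\Delta t}{a_1 + a_0 \Delta t} \, z_t

which has the AR(1) form x_t = \phi_1 x_{t-\Delta t} + Z_t when z_t is white noise.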
37
Red noise: an autoregressive model with p = 1 and 0 < φ_1 < 1. Very common in climate research; it describes gradual changes (a discretized first-order differential equation).
38
Blue noise: an autoregressive model with p = 1 and φ_1 < 0. Characteristic: many sign changes. Not very common in climate research, with the exception of ice cores.
39
Unstable AR process: for φ_1 > 1 the process is unstable, with explosive growth of the variance; for φ_1 = 1 we obtain the random walk.
40
AR process mean and variance
41
AR process mean
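The equations of this slide are not preserved in the transcript; a sketch of the standard result, taking expectations of X_t = \phi_0 + \phi_1 X_{t-1} + Z_t for a stationary process (E[X_t] = E[X_{t-1}] = \mu, E[Z_t] = 0):

\mu = \phi_0 + \phi_1 \mu \quad\Longrightarrow\quad \mu = \frac{\phi_0}{1 - \phi_1}

so μ = 0 when φ_0 = 0.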
42
Autocorrelation AR process
43
AR process autocorrelation ρ(τ). Recalling the general form of the AR(1) process (φ_0 = 0, p = 1), sketched below:
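A sketch of the standard derivation: multiply X_t = \phi_1 X_{t-1} + Z_t by X_{t-\tau} and take expectations (Z_t is uncorrelated with past values):

C(\tau) = \phi_1 C(\tau - 1) \quad\Longrightarrow\quad \rho(\tau) = \phi_1^{\tau}, \qquad \tau \ge 0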
44
Variance of AR process
45
Variance (2)
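A sketch of the standard result, taking the variance of X_t = \phi_1 X_{t-1} + Z_t and using stationarity:

\sigma_X^2 = \phi_1^2 \sigma_X^2 + \sigma_Z^2 \quad\Longrightarrow\quad \sigma_X^2 = \frac{\sigma_Z^2}{1 - \phi_1^2}, \qquad |\phi_1| < 1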
46
AR(1) process autocorrelation: the autocorrelation for different values of φ_1. Note that positive values of φ_1 give the same autocorrelation as negative values at even time lags.
47
MA(1) processes (figure).
48
Autocorrelation of an MA process. Characteristic pattern: sharp spikes up to and including the lag of its order. Consider the MA(1) process (ARIMA(0,0,1)) and its covariance C(k), in particular C(1).
49
Autocorrelation of the MA(1) process (2). Given the autocovariance, we need the variance to obtain the autocorrelation: the variance of the MA(1) process.
50
Autocorrelation of the MA(1) process (3). Combining the autocovariance and the variance gives the autocorrelation (see the sketch below).
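A sketch of the three missing expressions, using the MA(1) form X_t = \mu + Z_t - \theta_1 Z_{t-1} assumed earlier:

C(1) = -\theta_1 \sigma_Z^2, \qquad \operatorname{Var}(X_t) = C(0) = (1 + \theta_1^2)\, \sigma_Z^2, \qquad \rho(1) = \frac{-\theta_1}{1 + \theta_1^2}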
51
Autocorrelation of the MA(1) process (4). The autocovariance for lag 2 (and beyond) vanishes, so the autocorrelation is 0! In other words, the autocorrelation spikes only up to the lag of its order, in this case 1. This implies a finite memory for the process: after the shock has passed, the autocorrelation drops to zero.
52
Summary identification of ARIMA processes (1):
White noise, ARIMA(0,0,0): autocorrelation has no spikes; partial autocorrelation has no spikes.
Random walk, ARIMA(0,1,0): autocorrelation shows slow attenuation; partial autocorrelation has a spike at the order of differencing.
53
Summary identification of ARIMA processes (2), autoregressive processes:
ARIMA(1,0,0), φ_1 > 0: autocorrelation shows exponential decay of positive spikes; partial autocorrelation has 1 positive spike at lag 1.
ARIMA(1,0,0), φ_1 < 0: autocorrelation shows oscillating decay beginning with a negative spike; partial autocorrelation has 1 negative spike at lag 1.
ARIMA(2,0,0), φ_1, φ_2 > 0: autocorrelation shows exponential decay of positive spikes; partial autocorrelation has 2 positive spikes at lags 1 and 2.
ARIMA(2,0,0), φ_1 < 0, φ_2 > 0: autocorrelation shows oscillating exponential decay; partial autocorrelation has 1 negative spike at lag 1 and 1 positive spike at lag 2.
54
Summary identification of ARIMA processes (3), moving average processes:
ARIMA(0,0,1), θ_1 > 0: autocorrelation has 1 negative spike at lag 1; partial autocorrelation shows exponential decay of negative spikes.
ARIMA(0,0,1), θ_1 < 0: autocorrelation has 1 positive spike at lag 1; partial autocorrelation shows oscillating decay of positive and negative spikes.
ARIMA(0,0,2), θ_1, θ_2 > 0: autocorrelation has 2 negative spikes at lags 1 and 2; partial autocorrelation shows exponential decay of negative spikes.
ARIMA(0,0,2), θ_1, θ_2 < 0: autocorrelation has 2 positive spikes at lags 1 and 2; partial autocorrelation shows oscillating decay of positive and negative spikes.
55
Summary identification of ARIMA processes (4), mixed processes:
ARIMA(1,0,1), φ_1 > 0, θ_1 > 0: autocorrelation shows exponential decay of positive spikes; partial autocorrelation shows exponential decay of positive spikes.
ARIMA(1,0,1), φ_1 > 0, θ_1 < 0: autocorrelation shows exponential decay of positive spikes; partial autocorrelation shows oscillating decay of positive and negative spikes.
ARIMA(1,0,1), φ_1 < 0, θ_1 > 0: autocorrelation shows oscillating decay; partial autocorrelation shows exponential decay of negative spikes.
ARIMA(1,0,1), φ_1 < 0, θ_1 < 0: autocorrelation shows oscillating decay of negative and positive spikes; partial autocorrelation shows oscillating decay of positive and negative spikes.
56
Example the other way around: assume we measured the following series (figure); can we describe it by a stochastic process?
57
Strategy for time series analysis:
-plot the data
-test for stationarity
-calculate the autocorrelation and partial autocorrelation
-identify the order (requires expertise and is somewhat subjective)
-solve recursively for the parameters
-check whether the residuals are white noise
-further analysis (forecasting, extending the series)
58
Autocorrelation & partial correlation: results from our time series. The partial autocorrelation has only two non-zero lags, one positive and one negative; the autocorrelation decays gradually with an oscillation superimposed.
59
Recursive solution of the AR parameters via the Yule-Walker equations (sketched below).
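For reference, the Yule-Walker equations for an AR(2) process in their standard form (the slide's own equations are not preserved in this transcript):

\rho(1) = \phi_1 + \phi_2 \, \rho(1), \qquad \rho(2) = \phi_1 \rho(1) + \phi_2

and more generally \rho(k) = \sum_{i=1}^{p} \phi_i \, \rho(k - i) for k > 0.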
60
Solution for φ_1 and φ_2: from the autocorrelation we arrive at the estimates φ_1 = 0.894 and φ_2 = -0.841 (the series was generated with 0.9 and -0.8). As a result the noise variance and the spectra also come out correctly.
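A sketch of this procedure in Python (synthetic data generated here for illustration, not the lecture's actual series):

```python
import numpy as np

rng = np.random.default_rng(42)
n, phi1, phi2 = 5000, 0.9, -0.8          # generating coefficients quoted above
x = np.zeros(n)
for t in range(2, n):
    x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + rng.standard_normal()

# Sample autocorrelations at lags 1 and 2
a = x - x.mean()
r1, r2 = (np.sum(a[:-k] * a[k:]) / np.sum(a ** 2) for k in (1, 2))

# Yule-Walker for AR(2): rho(1) = phi1 + phi2*rho(1), rho(2) = phi1*rho(1) + phi2
phi_hat = np.linalg.solve(np.array([[1.0, r1], [r1, 1.0]]),
                          np.array([r1, r2]))
print(phi_hat)   # close to (0.9, -0.8) for a long series
```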
61
Parameter estimation: don't bother too much, just use brute-force least-squares fitting.
62
What did we learn today? General concepts of: backward differencing, which provides the relation between a differential equation and an ARIMA process; autoregressive models of order 1 and 2; moving average models of order 1 and 2; the autocorrelation for these models; estimating the order; estimating the coefficients. Special cases: white, red and blue noise, and the random walk.