
1 Data analyses 2008 Lecture 2 16-10-2008

2 Last Lecture Basic statistics; testing; linear regression parameters; skill

3 What is a time series / random data? A time series may be seen as a randomly selected finite section of an infinitely long sequence of random numbers (Storch and Zwiers). A time series is a stochastic (random) process, ordered as random samples X_t. A single time history is called a sample function or record. The simplest example is white noise (random data with no deterministic signal).

4 Stationary process Definition: all stochastic properties are independent of time. Classification: random data are either stationary or nonstationary; stationary data are either ergodic or nonergodic.

5 Ensemble (collection of sample functions) [figure: ensemble of sample functions x_k(t), with a time lag τ between t1 and t1+τ]

6 Stationarity (2) Mean: sum all values at time t1 across the ensemble and divide by the ensemble size N (first moment): μ_x(t1) = (1/N) Σ_k x_k(t1). Autocovariance: C_xx(t1, t1+τ) = (1/N) Σ_k [x_k(t1) − μ_x(t1)][x_k(t1+τ) − μ_x(t1+τ)].

7 Stationarity (3) If μ_x(t1) and C_xx(t1, t1+τ) vary as a function of time t1, then X_t is non-stationary. If μ_x(t1) and C_xx(t1, t1+τ) do not vary as a function of t1, then X_t is weakly stationary. Note that the autocovariance then depends only on the time lag τ: C_xx(t1, t1+τ) = C_xx(τ).

8 Strongly stationary Definition: an infinite collection of higher-order moments (i.e. the complete probability distribution function) is time invariant. Also called: stationary in the strict sense. This is usually unknown; for most analysis purposes weak stationarity is sufficient.

9 Ergodic So far we used ensembles to calculate the mean and autocovariance. But we can also consider the k-th sample function on its own and take time averages: μ_x(k) and C_xx(τ, k) computed along that single record. If μ_x(k) and C_xx(τ, k) are independent of k, the time series is ergodic.

10 Ergodic (2) So we can write μ_x(k) = μ_x and C_xx(τ, k) = C_xx(τ) (note that only a stationary process can be ergodic). The advantage is that only one single sample function is needed. In practice we have to assume ergodicity; it is sometimes tested by splitting the data set into subsamples, as in the sketch below.
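
A minimal sketch of that subsample test (my own illustration, assuming NumPy; the helper name subsample_stats is hypothetical): split one record into consecutive segments and compare their means and lag-1 autocovariances.

```python
import numpy as np

def subsample_stats(x, n_sub=4, lag=1):
    """Split a record into n_sub consecutive subsamples and return the
    mean and lag-`lag` autocovariance of each, as a rough ergodicity check."""
    stats = []
    for seg in np.array_split(np.asarray(x, dtype=float), n_sub):
        mu = seg.mean()
        dev = seg - mu
        cov = np.mean(dev[:-lag] * dev[lag:])   # sample autocovariance at `lag`
        stats.append((mu, cov))
    return stats

# Example: a stationary AR(1)-like record; similar subsample statistics
# are consistent with (but do not prove) ergodicity.
rng = np.random.default_rng(1)
x = np.zeros(4000)
for t in range(1, x.size):
    x[t] = 0.7 * x[t - 1] + rng.standard_normal()
for mu, cov in subsample_stats(x):
    print(f"mean = {mu:6.3f}, autocov(1) = {cov:6.3f}")
```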

11 Time series X_t = D_t + N_t, where X_t is the time series, D_t is the deterministic component and N_t is the stochastic component (noise). The purpose of time series analysis is to detect and describe the deterministic component.
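
A small illustration of this decomposition (my own sketch, assuming NumPy), constructing a series like the ones shown in the next two figures: a deterministic oscillation plus white noise.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(500)
D = np.sin(2 * np.pi * t / 50)      # deterministic oscillation D_t
N = rng.standard_normal(t.size)     # white noise N_t
X = D + N                           # observed series X_t = D_t + N_t
```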

12 Time series (2) [figure: deterministic oscillation D_t, white noise N_t, and the resulting series X_t]

13 Time series (3) [figure: D_t, N_t and X_t for a quasi-oscillatory signal; the dynamical component is changed by the noise]

14 Autocorrelation (1) (ACF) Purpose: to see whether there are repetitions in a time series. Each point in time can be compared with a previous point in time (or with any earlier points), and their similarity can be studied. The data must be available at a regular time spacing!

15 Recalling lecture 1 Focus on the mutual variability of pairs of properties. Covariance is the joint variation of two variables about their common mean. Now apply this to a single variable compared with itself at a time lag τ (for τ = 0 the two points coincide, t = t+τ).

16 Autocorrelation (2) Autocovariance: C_xx(τ) = (1/N) Σ_t [x_t − μ_x][x_{t+τ} − μ_x]. If τ = 0 then C_xx = s², the variance. Autocorrelation: ρ(τ) = C_xx(τ) / C_xx(0). The subscript xx refers to the same variable; it can be replaced by x1 and x2 if two variables are considered. For stationary processes these quantities depend only on the lag τ.

17 Recalling lecture 1 The correlation coefficient (r) is defined as the covariance divided by the product of the standard deviations. It is a scaled quantity: 1 is a perfect correlation, 0 is no correlation, -1 is a perfect inverse correlation. The correlation coefficient is the analogue of the autocorrelation.

18 Autocorrelation (2b) Properties: ρ(0) = 1, ρ(−τ) = ρ(τ) and |ρ(τ)| ≤ 1. Note that the autocorrelation function is not unique: many processes may have a similar autocorrelation function. It is not invertible!

19 Autocorrelation (6) [figure: example series X and their autocorrelation functions]

20 Autocorrelation (7) In most cases the autocorrelation cannot be solved analytically; instead it is retrieved from a simple calculation on the data: basically five summations over the time domain (see the sketch below).
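
A minimal NumPy sketch of that calculation (my own illustration, not the lecture's code; the function name sample_autocorrelation is hypothetical): deviations from the mean, lagged cross-products, normalised by the lag-0 value.

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Sample autocorrelation r(0..max_lag), computed from summations over
    the time domain: deviations from the overall mean, lagged cross-products,
    normalised by the autocovariance at lag 0 (the variance)."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    c0 = np.sum(dev * dev) / x.size              # autocovariance at lag 0
    r = [1.0]
    for tau in range(1, max_lag + 1):
        c = np.sum(dev[:-tau] * dev[tau:]) / x.size
        r.append(c / c0)
    return np.array(r)
```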

21 Cross-correlation (1) Two time series of different variables. Cross-correlation: the cross-covariance divided by the product of the standard deviations, ρ_xy(τ) = C_xy(τ) / (σ_x σ_y).

22 Partial autocorrelation function An autocorrelation function that gives, for each lag k, the magnitude of the autocorrelation between X_t and X_{t−k} while controlling for all intervening autocorrelations. From regression analysis: X_1 varies partially because of variation in X_2 and partially because of variation in X_3.

23 Partial autocorrelation (2) Partial correlation from regression analysis: r_12.3 = (r_12 − r_13 r_23) / sqrt((1 − r_13²)(1 − r_23²)), where the index after the dot denotes the variable that is kept constant. In time series analysis we find the analogue: the partial autocorrelation, used for assessing the order of stochastic models (see the sketch below).
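
One way to estimate the partial autocorrelation, in the regression spirit of this slide (a sketch of my own, assuming NumPy; the helper name partial_autocorrelation is hypothetical): fit an AR(k) model by least squares for each lag k and keep the coefficient of x_{t−k}.

```python
import numpy as np

def partial_autocorrelation(x, max_lag):
    """Partial autocorrelation at lags 1..max_lag, estimated by fitting an
    AR(k) model by least squares for each k and keeping the last coefficient."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    pacf = []
    for k in range(1, max_lag + 1):
        # Design matrix of lagged values: columns x_{t-1}, ..., x_{t-k}.
        rows = [x[k - j - 1:len(x) - j - 1] for j in range(k)]
        A = np.column_stack(rows)
        y = x[k:]
        coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
        pacf.append(coeffs[-1])   # coefficient of x_{t-k}: the partial autocorrelation
    return np.array(pacf)
```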

24 Autoregressive models and moving average models Stochastic models: partly deterministic, partly random. Tools: the autocorrelation and the partial autocorrelation.

25 White Noise A sequence of uncorrelated random shocks z_t with constant mean and variance. (Note: the cumulative sum of white noise is non-stationary, because its variance increases with time; this leads to the random walk on the next slide.) [figure: realisation versus time t, with mean μ]

26 Random walk (1) The random walk is an example of an autoregressive model: an autoregressive model of order 1 with φ1 = 1, i.e. X_t = X_{t−1} + z_t.

27 Random walk (3) [figure: distribution of air pollution, resulting from advection plus a random contribution; Storch and Zwiers 1999]

28 Basic idea of time series analysis Autoregressive model: the value at time t depends on previous values at t−i plus some random perturbation. Moving average model: the value at time t depends on the random perturbations of previous values at t−i plus some random perturbation. The aim is to see whether you can learn something from a data set which looks noisy.

29 Basic formulation of autoregressive integrated moving average models ARIMA(p,d,q) includes an autoregressive process (order p), an integrated process (order of differencing d) and a moving average process (order q). If d = 0 the series is stationary. If d = 1 it is non-stationary and first has to be transformed to stationarity (stationary: mean and variance constant), e.g. by differencing as sketched below.
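
A small illustration of the d = 1 case (my own sketch, assuming NumPy): first differencing turns a random walk, which is non-stationary, back into stationary white noise.

```python
import numpy as np

rng = np.random.default_rng(2)
z = rng.standard_normal(1000)   # white noise
x = np.cumsum(z)                # random walk: non-stationary, variance grows with time
dx = np.diff(x)                 # first difference (d = 1): recovers the white noise z[1:]
```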

30 AR(1)-process AR(1) process ~ ARIMA(1,0,0), a Markov process. General formulation: X_t = φ1 X_{t−1} + z_t, where φ1 is the first-order autoregressive coefficient and z_t is white noise (X_t on one side, the rest on the other side).
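
A minimal simulation sketch (my own illustration, assuming NumPy and standard normal shocks; the function name simulate_ar1 is hypothetical). The sign and size of φ1 give the red noise, blue noise and random walk cases discussed on the following slides.

```python
import numpy as np

def simulate_ar1(phi1, n, seed=0):
    """Simulate X_t = phi1 * X_{t-1} + z_t with standard normal white noise z_t."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi1 * x[t - 1] + rng.standard_normal()
    return x

red = simulate_ar1(0.8, 1000)    # 0 < phi1 < 1: red noise, gradual changes
blue = simulate_ar1(-0.8, 1000)  # phi1 < 0: blue noise, many sign changes
walk = simulate_ar1(1.0, 1000)   # phi1 = 1: random walk, non-stationary
```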

31 AR(2)-process AR(2) ~ ARIMA(2,0,0): X_t = φ1 X_{t−1} + φ2 X_{t−2} + z_t (dependent on the previous two time steps). In general, for ARIMA(p,0,0), p is the order of the process.

32 MA(1)-process Autoregressive models depend on previous observations; moving average models depend on innovations. General formulation for an MA(1) model ~ ARIMA(0,0,1): X_t = μ + z_t − θ1 z_{t−1}, where z_t is the innovation or shock and θ1 is the first-order moving average coefficient.
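
A short MA(1) simulation sketch (my own illustration, assuming NumPy and the minus sign convention used above): each observation is the current shock minus θ1 times the previous shock, plus the mean.

```python
import numpy as np

rng = np.random.default_rng(3)
theta1, mu, n = 0.6, 0.0, 1000
z = rng.standard_normal(n + 1)       # innovations z_t
x = mu + z[1:] - theta1 * z[:-1]     # X_t = mu + z_t - theta1 * z_{t-1}
```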

33 MA(2)-process ARIMA(0,0,2): X_t = μ + z_t − θ1 z_{t−1} − θ2 z_{t−2}. So the current observation is a function of the mean, the current innovation and two past innovations. For ARIMA(0,0,q), q is the order of the moving average process. For ARIMA(p,0,q) we have an autoregressive moving average model. So the order tells us the number of previous observations (or innovations) of which the series is a significant function.

34 The process in a different notation Subtracting the mean, X̃_t = X_t − μ, means we need not bother about the mean (it does not influence the autocorrelation). The resulting expression can be regarded as a normalised (anomaly) version of the time series.

35 AR process as differential equation The time series should be weakly stationary (constant mean and variance). "Autoregressive" indicates that the process evolves by regressing past values towards the mean and then adding noise. An AR process is essentially a discretised differential equation, e.g. a_2 d²x/dt² + a_1 dx/dt + a_0 x = z(t), where a_0, a_1, a_2 are constants and z_t is the external forcing.

36 AR process as differential equation (2) Discretised version (backward in time): replacing the derivatives by backward differences, dx/dt ≈ (x_t − x_{t−1})/Δt and d²x/dt² ≈ (x_t − 2x_{t−1} + x_{t−2})/Δt², turns the equation into an AR process if z_t is white noise. Exercise: write the AR(1) process as a first-order differential equation.

37 Red noise Autoregressive model with p = 1 and 0 < φ1 < 1. Very common in climate research: it describes gradual changes (a discretised first-order differential equation).

38 Blue Noise Autoregressive model with p = 1 and φ1 < 0. Characteristic: many sign changes. Not very common in climate research, with the exception of ice cores.

39 AR process unstable For φ1 > 1 the process is unstable: explosive growth of the variance. For φ1 = 1 it is a random walk.

40 AR process mean and variance

41 AR process mean

42 Autocorrelation AR process

43 AR process autocorrelation (τ) Recalling the general form of the AR(1) process (φ0 = 0, p = 1): X_t = φ1 X_{t−1} + z_t.

44 Variance of AR process

45 Variance (2)
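
For reference, the standard AR(1) results behind these slides, written out as a short sketch (assuming the form X_t = φ0 + φ1 X_{t−1} + z_t with white-noise variance σ_z² and |φ1| < 1):

```latex
\begin{align*}
  \text{Mean:} \quad
    \mu &= \frac{\phi_0}{1-\phi_1} \quad (= 0 \text{ for } \phi_0 = 0)\\[4pt]
  \text{Autocovariance:} \quad
    C(\tau) &= \phi_1\, C(\tau-1) \;\Rightarrow\; C(\tau) = \phi_1^{\,\tau}\, C(0)\\[4pt]
  \text{Autocorrelation:} \quad
    \rho(\tau) &= \phi_1^{\,|\tau|}\\[4pt]
  \text{Variance:} \quad
    \sigma_x^2 = C(0) &= \frac{\sigma_z^2}{1-\phi_1^2}
\end{align*}
```

Note how the variance blows up as φ1 approaches 1, consistent with the unstable case on slide 39.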

46 AR(1) process autocorrelation [figure: autocorrelation for different values of φ1]. Note that positive and negative values of φ1 give the same autocorrelation at even time lags.

47 MA(1) processes [figure: example MA(1) realisations driven by random innovations]

48 Autocorrelation of the MA process Characteristic pattern: sharp spikes up to and including the lag of the order. Consider an MA(1) process (ARIMA(0,0,1)): X_t = μ + z_t − θ1 z_{t−1}. Covariance C(k); consider C(1): C(1) = E[(X_t − μ)(X_{t−1} − μ)] = −θ1 σ_z².

49 Autocorrelation of the MA(1) process (2) Given the autocovariance, we need the variance to obtain the autocorrelation. Variance of the MA(1): σ_x² = (1 + θ1²) σ_z².

50 Autocorrelation of the MA(1) process (3) Autocovariance: C(1) = −θ1 σ_z². Variance: C(0) = (1 + θ1²) σ_z². Autocorrelation: ρ(1) = −θ1 / (1 + θ1²).

51 Autocorrelation of the MA(1) process (4) Autocovariance for lag 2: C(2) = E[(z_t − θ1 z_{t−1})(z_{t−2} − θ1 z_{t−3})] = 0, so the autocorrelation is 0! In other words, the autocorrelation spikes only up to the lag of its order, in this case 1. This implies a finite memory for the process: after the shock, the autocorrelation drops to zero.

52 Summary identification of ARIMA models (1)
Process | Autocorrelation | Partial autocorrelation
White noise, ARIMA(0,0,0) | no spikes | no spikes
Random walk, ARIMA(0,1,0) | slow attenuation | spike at the order of differencing

53 Summary identification of ARIMA models (2): autoregressive processes
Process | Autocorrelation | Partial autocorrelation
ARIMA(1,0,0), φ1 > 0 | exponential decay of positive spikes | 1 positive spike at lag 1
ARIMA(1,0,0), φ1 < 0 | oscillating decay, begins with a negative spike | 1 negative spike at lag 1
ARIMA(2,0,0), φ1, φ2 > 0 | exponential decay of positive spikes | 2 positive spikes at lags 1 and 2
ARIMA(2,0,0), φ1 < 0, φ2 > 0 | oscillating exponential decay | 1 negative spike at lag 1, 1 positive spike at lag 2

54 Summary identification of ARIMA models (3): moving average processes
Process | Autocorrelation | Partial autocorrelation
ARIMA(0,0,1), θ1 > 0 | 1 negative spike at lag 1 | exponential decay of negative spikes
ARIMA(0,0,1), θ1 < 0 | 1 positive spike at lag 1 | oscillating decay of positive and negative spikes
ARIMA(0,0,2), θ1, θ2 > 0 | 2 negative spikes at lags 1 and 2 | exponential decay of negative spikes
ARIMA(0,0,2), θ1, θ2 < 0 | 2 positive spikes at lags 1 and 2 | oscillating decay of positive and negative spikes

55 Summary identification of ARIMA models (4): mixed processes
Process | Autocorrelation | Partial autocorrelation
ARIMA(1,0,1), φ1 > 0, θ1 > 0 | exponential decay of positive spikes | exponential decay of positive spikes
ARIMA(1,0,1), φ1 > 0, θ1 < 0 | exponential decay of positive spikes | oscillating decay of positive and negative spikes
ARIMA(1,0,1), φ1 < 0, θ1 > 0 | oscillating decay | exponential decay of negative spikes
ARIMA(1,0,1), φ1 < 0, θ1 < 0 | oscillating decay of negative and positive spikes | oscillating decay of positive and negative spikes

56 Example, the other way around Assume we measured the following series [figure]: can we describe it by a stochastic process?

57 Strategy for time series analysis
- plotting the data
- testing for stationarity
- calculating the autocorrelation and partial autocorrelation
- identifying the order (expertise and subjective)
- recursive solution of the parameters
- checking whether the residuals are white noise
- further analysis (forecasting, extending the series)
(See the workflow sketch below.)
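
A rough sketch of this strategy in Python (my own illustration, not part of the lecture; it assumes the statsmodels and matplotlib packages and a series y supplied as a NumPy array, and the function name analyse is hypothetical):

```python
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller, acf, pacf
from statsmodels.tsa.arima.model import ARIMA

def analyse(y, order=(2, 0, 0), max_lag=20):
    """Follow the strategy above: plot, test stationarity, inspect ACF/PACF,
    fit an ARIMA model of a chosen order and check the residuals."""
    plt.plot(y); plt.title("data"); plt.show()       # 1. plot the data
    pvalue = adfuller(y)[1]                          # 2. ADF test: small p-value suggests stationarity
    print("ADF p-value:", pvalue)
    print("ACF :", acf(y, nlags=max_lag))            # 3. autocorrelation
    print("PACF:", pacf(y, nlags=max_lag))           #    and partial autocorrelation
    model = ARIMA(y, order=order).fit()              # 4-5. chosen order, parameter estimation
    print(model.summary())
    print("residual ACF:", acf(model.resid, nlags=max_lag))  # 6. residuals should be white noise
    return model                                     # 7. can be used for forecasting
```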

58 Autocorrelation & partial autocorrelation Results from our time series: the partial autocorrelation has only two non-zero lags, one positive and one negative; the autocorrelation decays gradually with an oscillation superimposed.

59 Recursive solution of the AR parameters via the Yule-Walker equations For an AR(2) process: r(1) = φ1 + φ2 r(1) and r(2) = φ1 r(1) + φ2, which can be solved for φ1 and φ2 from the estimated autocorrelations (see the sketch below).
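
A small sketch of that solution (my own illustration, assuming NumPy; the function name yule_walker_ar2 is hypothetical). Feeding in r(1) and r(2) estimated from the measured series gives the parameter estimates quoted on the next slide.

```python
import numpy as np

def yule_walker_ar2(r1, r2):
    """Solve the AR(2) Yule-Walker equations
       r1 = phi1 + phi2*r1,   r2 = phi1*r1 + phi2
    for phi1 and phi2, given the estimated autocorrelations r1, r2."""
    A = np.array([[1.0, r1],
                  [r1, 1.0]])
    b = np.array([r1, r2])
    phi1, phi2 = np.linalg.solve(A, b)
    return phi1, phi2
```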

60 Solution for φ1 and φ2 From the autocorrelation we arrive at estimates of the AR parameters: φ1 = 0.894 and φ2 = -0.841 (the series was generated with 0.9 and -0.8). As a result, the noise variance and the spectra come out correctly.

61 Parameter estimation Don't bother too much: just use brute-force least-squares fitting (see the sketch below).
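
A minimal least-squares sketch for the AR(2) case (my own illustration, assuming NumPy; the function name fit_ar2_least_squares is hypothetical): regress x_t on x_{t-1} and x_{t-2}.

```python
import numpy as np

def fit_ar2_least_squares(x):
    """Brute-force least-squares estimate of phi1, phi2 for an AR(2) model."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    A = np.column_stack([x[1:-1], x[:-2]])     # columns x_{t-1}, x_{t-2}
    y = x[2:]
    (phi1, phi2), *_ = np.linalg.lstsq(A, y, rcond=None)
    return phi1, phi2
```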

62 What did we learn today?
Special cases: white, red and blue noise, and the random walk.
General concepts of:
- backward differencing, which provides the relation between differential equations and ARIMA processes
- autoregressive models of order 1 and 2
- moving average models of order 1 and 2
- the autocorrelation of these models
- estimating the order
- estimating the coefficients

