NY Times 25 November 2008.

Slides:



Advertisements
Similar presentations
FINANCIAL TIME-SERIES ECONOMETRICS SUN LIJIAN Feb 23,2001.
Advertisements

Time series modelling and statistical trends
Analysis of Sales of Food Services & Drinking Places Julianne Shan Ho-Jung Hsiao Christian Treubig Lindsey Aspel Brooks Allen Edmund Becdach.
Atmosphere & Climate Change
El Niño, La Niña and the Southern Oscillation
Time Series Analysis Autocorrelation Naive & Simple Averaging
Class #16 Monday, October 4, 2010 Class #16: Monday, October 4 Chapter 8 Oceanography and El Niño/La Niña/ENSO 1.
Long Term Temperature Variability of Santa Barbara Coutny By Courtney Keeney and Leila M.V. Carvalho.
Forecasting JY Le Boudec 1. Contents 1.What is forecasting ? 2.Linear Regression 3.Avoiding Overfitting 4.Differencing 5.ARMA models 6.Sparse ARMA models.
Cross-spectral analysis on Net Ecosystem Exchange: Dominant timescale and correlations among key ecosystem variables over the Ameriflux Harvard forest.
El Nino – Southern Oscillation - Southern Oscillation (atmosphere) - El Nino (ocean) - change in equatorial Pacific circulation - produces global climate.
NY Times 25 November Stat Nov 2008 D. R. Brillinger Chapter 14 - Examples continued Question Data Analyses Conclusions.
Time series analysis - lecture 4 Consumer Price Index - Clothing and Footwear.
(Mt/Ag/EnSc/EnSt 404/504 - Global Change) Observed Surface & Atmosphere (from IPCC WG-I, Chapter 3) Observed Changes in Surface and Atmosphere Climate.
SIO 210: ENSO conclusion Dec. 2, 2004 Interannual variability (end of this lecture + next) –Tropical Pacific: El Nino/Southern Oscillation –Southern Ocean.
BOX JENKINS METHODOLOGY
Non-continuous Relationships If the relationship between the dependent variable and an independent variable is non-continuous a slope dummy variable can.
Temperature correction of energy consumption time series Sumit Rahman, Methodology Advisory Service, Office for National Statistics.
TIME SERIES by H.V.S. DE SILVA DEPARTMENT OF MATHEMATICS
Are Exceptionally Cold Vermont Winters Returning? Dr. Jay Shafer July 1, 2015 Lyndon State College 1.
25.3 Weather Instruments Winds are always labeled from the direction from which they blow. Wind vane – most common, located on buildings; always point.
Assessing Predictability of Seasonal Precipitation for May-June-July in Kazakhstan Tony Barnston, IRI, New York, US.
Intervention models Something’s happened around t = 200.
Tutorial for solution of Assignment week 39 “A. Time series without seasonal variation Use the data in the file 'dollar.txt'. “
Introducing ITSM 2000 By: Amir Heshmaty Far. S EVERAL FUNCTION IN ITSM to analyze and display the properties of time series data to compute and display.
Objective Data  The outlined square marks the area of the study arranged in most cases in a coarse 24X24 grid.  Data from the NASA Langley Research Center.
Statistical Summary ATM 305 – 12 November Review of Primary Statistics Mean Median Mode x i - scalar quantity N - number of observations Value at.
STAT 497 LECTURE NOTE 9 DIAGNOSTIC CHECKS 1. After identifying and estimating a time series model, the goodness-of-fit of the model and validity of the.
Forecasting (prediction) limits Example Linear deterministic trend estimated by least-squares Note! The average of the numbers 1, 2, …, t is.
Seasonal ARMA forecasting and Fitting the bivariate data to GARCH John DOE.
Sea Level in month i =α 2 sin t i + α 3 cos t i + α 4 sin2 t i + α 5 cos2 t i seasonality + α 6 SOI i climate indices + α 1 + α 7 time i + (deviations.
Seasonal ARIMA FPP Chapter 8.
2/25/ lecture 121 STATS 330: Lecture 12. 2/25/ lecture 122 Diagnostics 4 Aim of today’s lecture To discuss diagnostics for independence.
An Example. The question Data Analyses Conclusions.
Time series analysis. Example Objectives of time series analysis.
Daiwen Kang 1, Rohit Mathur 2, S. Trivikrama Rao 2 1 Science and Technology Corporation 2 Atmospheric Sciences Modeling Division ARL/NOAA NERL/U.S. EPA.
Predicting Energy Consumption in Buildings using Multiple Linear Regression Introduction Linear regression is used to model energy consumption in buildings.
Complication in Climate Change
25.3 Weather Instruments Wind vane – most common, located on
El Niño / Southern Oscillation
Case study 4: Multiplicative Seasonal ARIMA Model
An Example. The question Data Analyses Conclusions.
Climate and Global Dynamics Laboratory, NCAR
Challenges of Seasonal Forecasting: El Niño, La Niña, and La Nada
Institut für Meteorologie
Air-Sea Interactions The atmosphere and ocean form a coupled system, exchanging heat, momentum and water at the interface. Emmanuel, K. A. 1986: An air-sea.
Lecture 8 ARIMA Forecasting II
Chapter 6: Autoregressive Integrated Moving Average (ARIMA) Models
Time-Series Evaluation of an Experimental
Chapter 4: Seasonal Series: Forecasting and Decomposition
ATM 305 – 16 November 2017 Lance Bosart and Philippe Papin
KUDAKWASHE KELVIN CHIKUKWA R169999D
El Niño and La Niña.
STAT 497 LECTURE NOTE 9 DIAGNOSTIC CHECKS.
Climate Changes.
El Nino and La Nina An important atmospheric variation that has an average period of three to seven years. Goes between El Nino, Neutral, and La Nina (ENSO.
EL NINO Figure (a) Average sea surface temperature departures from normal as measured by satellite. During El Niño conditions upwelling is greatly.
With joint time series data and either a bivariate model, data (X(t),Y(t); t = 0,...,T-1) or a regression/transfer function model the following R functions.
Simple Linear Regression
Modelling Climate and Water Isotope Signatures of El Niño in the Pliocene Julia Tindall and Alan Haywood AGU Fall Meeting 15th December 2015.
The Atmosphere: Part 9: Short term climate variability
The importance of prefiltering.
Global Climate Change.
Climate Changes due to Natural Processes
Vector AutoRegression models (VARs)
Linear Filters.
Section 4 Earth’s Changing Climate
BOX JENKINS (ARIMA) METHODOLOGY
IT Applications in Business Analytics
Chap 7: Seasonal ARIMA Models
Presentation transcript:

NY Times 25 November 2008

Stat 153 - 24 Nov 2008 D. R. Brillinger Chapter 14 - Examples continued Question Data Analyses Conclusions

Why? Chatfield's example 14.1 Monthly mean air temperature at Recife 1953-1962 Table 14.1 - doesn't indicate source ???? Chatfield's objective - "to describe and understand the data" Objectives here - to extend Chatfield's series - to layout analyses - to explore the data for surprises - predicted values - signal + noise? - ...

Finding the data. Google with various key words: temperature, Recife, ... "Eventually lead" to: cdiac.ornl.gov/ftp/ndp041 Carbon dioxide information analysis center! Had to discover Recife Curado station id - 3068290000 Years 1949-1988 Searched an inappropriate ste for a long time (Looked at Brasil sites too, but that didn't turn up the data)

notice -9999 replace by NA file: recifecurado The web data notice -9999 replace by NA file: recifecurado 30682900001941 274 279 268 267 260 250 246 245 256 260 262 266 30682900001942 273 268 270 270 256 247 236 233 252 260 270 265 30682900001943 270 270 273 261 253 245 236 238 247 260 264 268 30682900001944 269 272 275 263 254 247 239 232 241 256 267 273 30682900001945 278 268 278 271 256 240 236 243 251 258 268 267 30682900001946 268 277 271 259 258 247 247 247 249 257 261 266 30682900001947 269 270 268-9999 253 246 244 245 249 262 268-9999 30682900001948 273 271 270 269 255 249 243 240 248-9999 264 270 30682900001949 272 274 278 266 252 246 240 236 250 261 265 271 30682900001950 273 280 269 251 248 243 235 234 248 258 264-9999 30682900001951 269 267 275 265 254 238 233 238 247 259 263-9999 30682900001952 272 278 267 262 252 245 240 238 253 256 263 268

How to handle missing values? Interpolate? Model? ...? junk<-scan("recifecurado") junk1<-matrix(junk,ncol=48) junk2<-junk1[2:13,] # years in first row series<-c(junk2)/10 # for degrees centigrade length(series[is.na(series)]) #17 - need to understand missingness Interpolation series1<-series for(i in 2:(length(series)-1)){if(is.na(series[i]))series1[i]<-.5*series[i-1] +.5*series[i+1]} Some values did not agree with Chatfield's

plot(xaxis,series1,type="l",xlab="year",ylab="mean temp (degrees C)",las=1) title("Mean monthly temperatures 1949-88 Recife Curado") abline(h=mean(series1))

There is seasonality and variability Restricted range in mid-sixties - nonconstant mean level? ylim<-range(series1) par(mfrow=c(2,1)) plot(lowess(xaxis,series1),type="l",ylim=ylim,xlab="year",ylab="degrees C",main="Smoothed Recife series") abline(h=mean(series1)) junk20<-lowess(xaxis,series1) plot(xaxis,series1-junk20$y,type="l",xlab="year",ylab="degrees C",main="Residuals") abline(h=mean(series1-junk20$y))

par(mfrow=c(1,1)) acf(series1,las=1,xlab="lag(mo)",ylab="",main="autocorrelation recife temperatures",lag.max=50,ylim=c(-1,1))

More confirmation of period 12 Remember the interpretation of the error lines Note that nearby values are highly correlated

spectrum(series1,xlab="frequency (cycles/month)",las=1)

Note peaks at frequency 1/12 and harmonics Further confirmation of period 12 Note log scale for y-axis Note vertical line in upper right Gives uncertainty

What is the shape of the seasonal? junk4<-matrix(series1,nrow=12) junk5<-apply(junk4,1,mean) plot(junk5,type="l",las=1) abline(h=mean(junk5))

Cooler in July-Aug Southern Hemisphere Uncertainty?

Cooler in July-August. Southern hemisphere Part of a longer cycle? El Nino explanatory? After "removing" trend middle has been pulled up Need uncertainties Back to original data

Remove seasonal series2<-series1 for(i in 1:48){ for(j in 1:12){ series2[(i-1)*12+j]<-series1[(i-1)*12+j]-junk5[j] } par(mfrow=c(2,1)) plot(xaxis,series2,type="l",xlab="year",ylab="residual",main="Series after removing seasonal",las=1) abline(h=0) ylim<-range(series2) plot(xaxis,series1-mean(series1),type="l",xlab="year",ylab="degreesC",main="Mean removed series",las=1,ylim=ylim) abline(h=mean(series1-mean(series1)))

original variance 1.342 adjusted .248

par(mfrow=c(2,1)) acf(series2,lag.max=50,las=1,xlab="lag (mo)",main="Ajusted by removing monthly means",las=1) acf(diff(series1,lag=12),lag.max=50,xlab="lag (mo)",main="Order 12 differenced series")

Work remains Frequency domain analysis. par(mfrow=c(2,1)) junk9<-spec.pgram(series1,taper=0,detrend=F,demean=F,spans=5,plot=F) ylim<-range(junk9$spec) junk9<-spec.pgram(series1,taper=0,detrend=F,demean=F,spans=5,xlab="frequency (cycles/mo)",las=1,main="Original series") junk10<-spec.pgram(series2,taper=0,detrend=F,demean=F,spans=5,ylim=ylim,main="Monthly means removed",las=1)

Work remains on seasonal Residual "not" white noise

Time domain distributions

par(mfrow=c(2,2)) boxplot(series2,main="Seasonally adjusted temps") hist(series2,breaks=15,xlab="temperature",las=1,ylab="count",main="Seasonally adjusted temps") plot(density(series2),las=1,xlab="temperature",ylab="density",las=1,main="") library(MASS) Junk1<-kde2d(series2[1:575],series2[2:576]) image(Junk1) points(series2[1:575],series2[2:576])

Parametric model. SARIMA ? Thinking about prediction, consider Yt = αYt-1 + βYt-12 + Nt with some ARMA for Nt Check seasonal residuals for normality Hope to end up with white noise

Junk<-arima(series1,order=c(1,0,1),seasonal=list(order=c(1,0,1),period=12)) Call: arima(x = series1, order = c(1, 0, 1), seasonal = list(order = c(1, 0, 1), period = 12)) Coefficients: ar1 ma1 sar1 sma1 intercept 0.7604 -0.3344 0.9990 -0.9201 25.6117 s.e. 0.0610 0.0968 0.0007 0.0220 0.6100 sigma^2 estimated as 0.1771: log likelihood = -337.48, aic = 686.97

tsdiag(Junk,gof.lag=25)

Junk<-arima(series1,order=c(1,0,1),seasonal=list(order=c(1,0,1),period=12)) postscript(file="recifeplots1a.ps",paper="letter",hor=T) Junk2<-predict(Junk,n.ahead=24) Junk3<-c(series1,Junk2$pred) Junk3a<-c(rep(0,576),2*Junk2$se) Junk3b<-c(rep(0,576),-2*Junk2$se) Junk4a<-Junk3+Junk3a;Junk4b<-Junk3+Junk3b ylim<-range(Junk4a,Junk4b) par(mfrow=c(1,1)) xaxis1<-1941+(1:length(Junk3)/12) plot(xaxis1[xaxis1>1983],Junk4a[xaxis1>1983],type="l",las=1,ylim=ylim,col="red",xlab="year",ylab="degrees C",main="Data + predictions") lines(xaxis1[xaxis1>1983],Junk4b[xaxis1>1983],col="red") lines(xaxis1[xaxis1>1983],Junk3[xaxis1>1983],col="blue") lines(xaxis[xaxis>1983],series1[xaxis>1983])

Chatfield. "... we have found little of interest apart from what is evident in the time plot." He worked with years 1953 - 1962

Two series Bivariate case {Xt, Yt} - jointly distributed Linear time invariant / transfer function model nonparametric/parametric approaches

Southern Oscillation Index El Niño: global coupled ocean-atmosphere phenomenon. The Pacific ocean signatures, El Niño and La Niña are important temperature fluctuations in surface waters of the tropical Eastern Pacific Ocean

Southern Oscillation reflects monthly or seasonal fluctuations in the air pressure difference between Tahiti and Darwin www.cpc.ncep.noaa.gov/data/soi

junk<-scan("recifecurado") junk1<-matrix(junk,ncol=48) junk6<-junk1[1,] junk1<-junk1[,junk6>1950] junk2<-junk1[2:13,] series<-c(junk2)/10 length(series[is.na(series)]) #13 xaxis<-1951+(1:length(series)/12) series1<-series junk4<-matrix(series1,nrow=12) junk5<-apply(junk4,1,mean) for(i in 2:(length(series)-1)){if(is.na(series[i]))series1[i]<-.5*series[i-1]+.5*series[i+1]} series2<-series1 for(i in 1:38){ for(j in 1:12){ series2[(i-1)*12+j]<-series1[(i-1)*12+j]-junk5[j]}}

kunk<-scan("SOIa.dat") kunk1<-matrix(kunk,ncol=58); kunk6<-kunk1[1,] kunk1<-kunk1[,kunk6<1989] kunk2<-kunk1[2:13,] teries<-c(kunk2) length(teries[is.na(teries)]) #0 teries1<-teries; teries2<-teries1 postscript(file="recifeplots3.ps",paper="letter",hor=T) par(mfrow=c(2,1)) plot(xaxis,series2,type="l",las=1,xlab="year",ylab="",main="Seasonally adjusted Recife temps") plot(xaxis,teries2,type="l",las=1,xlab="year",ylab="",main="Southern Oscillation Index")

postscript(file="recifeplots2.ps",paper="letter",hor=F) par(mfrow=c(1,1)) acf(cbind(series2,teries2))

junk10<-cbind(series2,teries2) junk11<-spec.pgram(junk10,plot=F,taper=0,detrend=F,demean=F,spans=11) par(mfcol=c(2,2)) plot(junk11$freq,10**(.1*junk11$spec[,2]),log="y",main="SOIspectrum", xlab="frequency", ylab="", las=1,type="l") plot(junk11$freq,junk11$coh,main="Coherence",xlab="frequency",ylab="",las=1,ylim=c(0,1),type="l") junkh<-1-(1-.95)**(1/(.5*junk11$df-1)) abline(h=junkh) plot(junk11$freq,10**(.1*junk11$spec[,1]),log="y",main="Seasonally corrected Recife spectrum",xlab="frequency", ylab="",las=1, type="l")

SARIMAX Yt = αYt-1 + βYt-12 + γXt + Nt ar1 ma1 sar1 sma1 intercept teries1 0.7885 -0.3717 0.9996 -0.9474 25.5572 -0.0255 s.e. 0.0610 0.1014 0.0006 0.0321 0.6403 0.0149 sigma^2 estimated as 0.1792: log likelihood = -275.68, aic = 565.35 Junk1<-arima(series1,order=c(1,0,1),seasonal=list(order=c(1,0,1),period=12),xreg=teries1)

The Box-Jenkins Airline Data International airline passengers Monthly totals (thousands) January 1949-December 1960 n = 144 Brown, R. G. (1962) Smoothing, Forecasting and Prediction of Didcrete Time Series. Prentice-Hall.

postscript(file="airline.ps",paper="letter") data(AirPassengers) y<-c(AirPassengers) xaxis<-1949+c(1:length(y))/12 plot(xaxis,y,type="l",xlab="year",ylab="Passengers in thousands",main="BJ airline data",las=1) plot(xaxis,y,type="l",xlab="year",ylab="Passengers in thousands",main="BJ airline data",las=1,log="y") Y<-log(y) plot(xaxis,Y,type="l",xlab="year",ylab="log(Passengers)",main="BJ airline data",las=1)

Y1<-diff(Y) plot(xaxis[1:length(Y1)],Y1,type="l",xlab="year",ylab="diff(log(Passengers))",main="BJ airline data",las=1) abline(h=0,lty=3) Y2<-diff(Y1,12) plot(xaxis[1:length(Y2)],Y2,type="l",xlab="year",ylab="diff12(diff(log(Passengers)))",main="BJ airline data",las=1)

fit <- arima(Y, order=c(0,1,1),seasonal = list(order=c(0,1,1),period=12)) arima(x = Y, order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12)) ma1 sma1 -0.4018 -0.5569 s.e. 0.0896 0.0731 sigma^2 estimated as 0.001348: log likelihood = 244.7, aic = -483.4

junk<-spectrum(fit$resid) junkh<-10*log10(mean(10^(junk$spec/10))) abline(h=junkh,col="red") acf(fit$resid)

tsdiag(fit)

fit1<-arima(Y[1:108],order=c(0,1,1),seasonal = list(order=c(0,1,1),period=12)) pred<-predict(fit1,n.ahead=36) Y3<-Y Y3[109:144]<-pred$pred Y4<-Y Y4[109:144]<-pred$pred+2*pred$se Y5<-Y Y5[109:144]<-pred$pred-2*pred$se plot(xaxis,Y4,type="n",xlab="year",ylab="log(passengers in thousands)",main="BJ airline data",las=1) lines(xaxis,Y5,col="red") lines(xaxis,Y3,col="blue") lines(xaxis,Y) lines(xaxis,Y4,col="red")