Local Prediction of a Spatio-Temporal Process with Application to Wet Sulfate Deposition Presented by Isin OZAKSOY
Benefits of Spatial-Temporal over Spatial :
1. Larger sample size to support model estimation
2. Spatio-temporal drift estimates
3. Location-specific forecasts
4. Temporal correlation estimates
Goal of article :
1. Estimate a model of sulfate deposition that captures the major effects of location, time, and season on the mean value and covariance structure.
2. Provide original-scale process predictions and prediction standard error estimates that are both negligibly biased.
3. Develop a predictor capable of estimating the parameters of a substantive, science-based pollutant deposition model.
Data :
5039 seasonal sulfate deposition observations
160 monitoring sites
Summer 1979 – Summer 1992
Constructing seasonal deposition observations : (precipitation-weighted mean of weekly obs.) * (total precipitation of the season)
Include only observations meeting the second-highest data completeness criterion.
Note : Seasonal data are less noisy, yet the seasonal effect can still be estimated from them.
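The construction of a seasonal observation can be sketched as follows. This is a minimal illustration, not the author's code: the function name, the use of NaN to mark missing weekly observations, and the choice to take the weighted mean over valid weeks only are all assumptions made here.

```python
import numpy as np

def seasonal_deposition(weekly_conc, weekly_precip):
    """(precipitation-weighted mean of weekly obs.) * (total seasonal precipitation).

    Assumption: missing weekly observations are coded as NaN and are left out
    of the weighted mean, while their precipitation still counts toward the
    seasonal total.
    """
    conc = np.asarray(weekly_conc, dtype=float)
    precip = np.asarray(weekly_precip, dtype=float)
    valid = ~np.isnan(conc)
    valid_precip = precip[valid].sum()
    if valid_precip == 0.0:
        return float("nan")  # no usable weekly observation in this season
    weighted_mean = np.sum(conc[valid] * precip[valid]) / valid_precip
    return float(weighted_mean * precip.sum())
```

With complete weekly data the product reduces to the sum of weekly concentration times weekly precipitation; the two-factor form matters only when some weeks are missing.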
Definitions :
(x,y) : spatial coordinates of a location in spatio-temporal space
t : temporal coordinate
x = (x,y,t)' : spatio-temporal location
n : total number of spatio-temporal observations
f_C : fraction of n used for prediction
n_C = n * f_C : number of observations used to predict at x_O
Prediction Cylinder : Spatio-temporal space that holds the n_C observations used to predict the process at x_O.
NOTE : The spatial and temporal dimensions of the cylinder are defined separately.
Determining the Cylinder's n_C Observations
Step 1 :
m_T ≤ t_latest − t_earliest
m_T = t_U − t_L
t_L = max( t_earliest, t_U − m_T )
t_U = min( t_O, t_O + (m_T/2) )
n_I : number of observations within the temporal interval; m_T is large enough that n_C < n_I.
Step 2 :
Sort the n_I observations according to ||(x_O,y_O)' − (x,y)'||.
Sort the n_I observations according to | t_O − t |.
Step 3 :
The cylinder's observations are the first n_C of the sorted n_I observations.
Cylinder's radius : spatial distance between the n_C-th observation and (x_O,y_O)'.
[Figure: time axis marking t_earliest, t_L, t_O, t_U, t_latest]
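The three steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function and argument names are hypothetical, the temporal bounds t_L and t_U are passed in rather than derived, and the two sorts in Step 2 are read as "spatial distance first, temporal distance as tie-breaker".

```python
import numpy as np

def select_cylinder(xy, t, xy0, t0, t_lower, t_upper, n_c):
    """Return (indices of the n_c cylinder observations, cylinder radius)."""
    xy = np.asarray(xy, dtype=float)
    t = np.asarray(t, dtype=float)
    # Step 1: keep the n_I observations inside the temporal interval [t_L, t_U].
    in_window = np.flatnonzero((t >= t_lower) & (t <= t_upper))
    if in_window.size <= n_c:
        raise ValueError("temporal interval too short: need n_C < n_I")
    # Step 2: sort by spatial distance; ties broken by temporal distance
    # (assumed ordering of the two sort keys).
    d_space = np.linalg.norm(xy[in_window] - np.asarray(xy0, dtype=float), axis=1)
    d_time = np.abs(t[in_window] - t0)
    order = np.lexsort((d_time, d_space))  # the last key is the primary key
    # Step 3: the first n_c sorted observations form the cylinder.
    chosen = in_window[order[:n_c]]
    radius = float(d_space[order[n_c - 1]])
    return chosen, radius
```

Because every observation at one monitoring site shares the same spatial distance to the prediction location, the tie-breaking key decides which seasons at that site enter the cylinder.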
Modelling : Spatio-Temporal Drift Model
V_C : Variance-covariance matrix between the residuals at x_O and the residuals at the observation locations.
Reasons for using Spatial-Temporal Drift Model :
1. Sulfate's spatial drift exhibits an inverted bowl shape centered over the Ohio Valley.
2. Sulfate deposition exhibits strong seasonality.
3. Demonstrate the feasibility of estimating nonlinear spatio-temporal drift models with MCSTK.
Spatio-Temporal Covariance :
1. Within-Cylinder Covariance Function :
E(R_C(x)) is constant.
Cov(R_C(x_1), R_C(x_2)) = C_{S,T}( g((x_1,y_1)', (x_2,y_2)'), h(t_1,t_2) )
where
C_{S,T}(.,.) : spatio-temporal covariance function
g((x_1,y_1)', (x_2,y_2)') = ||(x_1,y_1)' − (x_2,y_2)'|| : spatial lag
h(t_1,t_2) = | t_1 − t_2 | : temporal lag
Spatio-Temporal Covariance :
2. Spatio-Temporal Semivariogram Estimation :
Consider all combinations of m_S spatial lags and m_T + 1 temporal lags.
m_S is chosen to control the pair count per semivariogram estimate (an estimate based on a small number of pairs may have high variance).
N_kl is the number of spatio-temporal observation pairs separated by spatio-temporal lag class (g_k, h_l).
In the semivariogram model, a(.) is the nugget, s(.) is the partial sill and r(.) is the range.
The model fit weights the semivariogram estimates by their number of pairs per lag class and gives more weight to small semivariogram estimates.
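The lag-class bookkeeping can be illustrated with a small sketch. It assumes the classical (Matheron) estimator, half the mean squared difference per lag class, which the slide does not name; the function name, the spatial bin edges, and the exact matching of temporal lags are likewise assumptions for illustration.

```python
import numpy as np

def st_semivariogram(xy, t, resid, space_edges, time_lags):
    """Empirical semivariogram on a grid of (spatial class k, temporal lag l).

    Returns (gamma, counts); counts[k, l] plays the role of N_kl.
    """
    xy = np.asarray(xy, dtype=float)
    t = np.asarray(t, dtype=float)
    r = np.asarray(resid, dtype=float)
    i, j = np.triu_indices(len(r), k=1)          # every unordered pair once
    g = np.linalg.norm(xy[i] - xy[j], axis=1)    # spatial lag
    h = np.abs(t[i] - t[j])                      # temporal lag
    sq_diff = (r[i] - r[j]) ** 2
    k_class = np.digitize(g, space_edges) - 1    # bin k: edges[k] <= g < edges[k+1]
    m_s = len(space_edges) - 1
    gamma = np.full((m_s, len(time_lags)), np.nan)
    counts = np.zeros((m_s, len(time_lags)), dtype=int)
    for k in range(m_s):
        for l, lag in enumerate(time_lags):
            mask = (k_class == k) & np.isclose(h, lag)
            counts[k, l] = mask.sum()
            if counts[k, l] > 0:
                gamma[k, l] = 0.5 * sq_diff[mask].mean()
    return gamma, counts
```

Inspecting `counts` before fitting a model shows which lag classes rest on few pairs, which is the reason the slide gives for controlling m_S.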
MCSTK Algorithm (Prediction at x_O)
STEP 1 : Set V_C = I and the heteroscedasticity function to 1. Compute the OLS estimate of the drift parameters and the resulting residuals. Estimate the spatio-temporal semivariogram model from these residuals and use the covariance function from the estimated semivariogram to compute the covariance estimate used in Step 2.
STEP 2 : Set V_C to the estimate from Step 1 and the heteroscedasticity function to 1. Compute the GLS estimate of the drift parameters and the resulting residuals. Estimate the spatio-temporal semivariogram model from these residuals and use the covariance function from the estimated semivariogram to compute the covariance estimate used for kriging.
STEP 3 : Use the covariance function model to predict the residual process at x_O via residual kriging.
MCSTK Algorithm (Prediction at x_O)
The residual kriging equations involve :
w : kriging weights
the Lagrange multiplier
the estimate of U_C, obtained from the covariance function in Step 2
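The GLS and residual kriging steps can be sketched with standard linear algebra. This is a generic illustration, not the MCSTK implementation: it assumes a linear drift with design matrix F (the slides also allow nonlinear drift), a covariance matrix V already computed from a fitted covariance function, and the textbook ordinary kriging system with one Lagrange multiplier. All names are hypothetical.

```python
import numpy as np

def gls_fit(F, y, V):
    """GLS drift estimate and residuals for a linear drift F @ beta."""
    Vi_F = np.linalg.solve(V, F)
    Vi_y = np.linalg.solve(V, y)
    beta = np.linalg.solve(F.T @ Vi_F, F.T @ Vi_y)
    return beta, y - F @ beta

def krige_residual(V, c0, resid, sill):
    """Ordinary kriging of the residual process at the prediction location.

    V    : covariance matrix among the cylinder residuals
    c0   : covariances between the residual at x_O and the cylinder residuals
    sill : residual variance at lag zero
    Solves [[V, 1], [1', 0]] [w; lam] = [c0; 1].
    """
    n = len(resid)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = V
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    sol = np.linalg.solve(A, np.append(c0, 1.0))
    w, lam = sol[:n], sol[n]
    pred = float(w @ resid)
    var = float(sill - w @ c0 - lam)  # ignores drift-estimation uncertainty
    return pred, var, w, lam
```

Setting V to the identity in `gls_fit` reproduces the OLS fit of Step 1, so the same function can serve both stages once the covariance estimate is available.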
Heteroscedastic Residual Variance Function Estimate
IMPORTANT : Errors must NOT be dependent !
STEP 1 : Find the n_S observation locations within the cylinder that are from the same seasonality level as, and temporally closest to, the prediction time.
STEP 2 : Sort the n_S locations by closeness of their estimated drift to the estimated drift at the prediction location.
STEP 3 : Compute the sample variance of the second-stage residuals at the first n_n of the sorted locations, and the sample variance of the second-stage residuals at all n_S locations. The first is the sample variance of the n_n residuals closest to the prediction location in terms of seasonality level, time, and estimated drift.
STEP 4 : Calculate the heteroscedasticity function at x_O from these sample variances.
MCSTK Bias Assessment and f_C Determination
BIAS ASSESSMENT : The MCSTK predictor and its estimated standard error are biased for E(Y_C(x_O)) and s_e(x_O), respectively.
Case : If V_C is known and R_C(x) is homoscedastic, the GLS drift estimate is found by minimization. Define the predicted residuals from this estimate. The kriging prediction and variance are then obtained from the residual kriging equations.
PREDICTION BIAS : For i = 1,…,n_CV, define the cross-validation residuals and the standardized cross-validation residuals.
MCSTK Bias Assessment and f_C Determination
PREDICTION BIAS :
1. Compute the mean bias as a fraction of the data mean, where the data mean is the mean of the n_CV observations.
2. Perform a t-test on the cross-validation residuals of the hypothesis that their mean is zero.
PREDICTION STANDARD ERROR ESTIMATE BIAS :
1. If the predictor is negligibly biased but the estimated standard error is biased for the true prediction standard error, then the sample standard deviation of the standardized cross-validation residuals indicates the direction and degree of the bias in the estimated standard errors.
2. If the estimated standard errors are negligibly biased, then the sample standard deviation of the standardized cross-validation residuals is close to 1.
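The cross-validation diagnostics above can be computed as in the following sketch. The sign convention (predicted minus observed), the function name, and the returned dictionary keys are assumptions made for illustration; only the t statistic is computed, so any p-value lookup is left to the reader.

```python
import numpy as np

def cv_bias_summary(observed, predicted, est_se):
    """Bias diagnostics from n_CV cross-validation predictions."""
    y = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    se = np.asarray(est_se, dtype=float)
    e = p - y                    # cross-validation residuals (assumed sign)
    z = e / se                   # standardized cross-validation residuals
    n = e.size
    return {
        # mean bias as a fraction of the data mean
        "mean_bias_fraction": float(e.mean() / y.mean()),
        # one-sample t statistic for H0: mean residual is zero
        "t_stat": float(e.mean() / (e.std(ddof=1) / np.sqrt(n))),
        # near 1 when the estimated standard errors are about right;
        # above 1 suggests they are too small, below 1 too large
        "sd_standardized": float(z.std(ddof=1)),
    }
```

Running this for each candidate f_C gives the quantities that are compared when choosing the cylinder size.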
MCSTK Bias Assessment and f_C Determination
REASON FOR BIAS IN STANDARD ERROR :
1. As the cylinder radius increases, the cylinder model for the spatial drift captures less of the increasingly complicated drift surface. The uncaptured drift ends up in the residuals, which inflates the semivariogram estimates (positive bias).
2. If the cylinder is too small, the semivariogram estimates in MCSTK are seriously negatively biased at larger lags. Since kriging is performed locally, the estimated standard errors may not be significantly biased if the kriging system's size is not too small.
NOTE : The reliability of the MCSTK estimated standard errors increases as the reliability of the semivariogram estimates increases.
f_C DETERMINATION : Cross-validation is performed over a set of f_C values, and the smallest f_C value for which the prediction bias and the standard error estimate bias are as small as possible is selected.
PREDICTING SEASONAL SULFATE DEPOSITION
CONCLUSIONS :
1. MCSTK is flexible enough to model linear and/or nonlinear spatio-temporal processes with residual covariance structure. It can therefore be used for full vs. reduced model analysis.
2. Data sets from environmental spatio-temporal processes that are first- and second-order non-stationary can be large enough that, with an appropriate cylinder size, the MCSTK predictions and standard error estimates have negligible or small bias.
3. With a separable spatio-temporal covariance function, the ill-conditioned kriging system is inevitably unstable.
4. In the time period , U.S. sulfate deposition decreased over the East and West but increased in the Southwest, the Rocky Mountains and South Texas.
5. The assumption of zero temporal covariance minimally affects the seasonal sulfate deposition P.I. for this data.
6. MCSTK is computationally intensive.