A Bayesian hierarchical modeling approach to reconstructing past climates
David Hirst, Norwegian Computing Center
Temperature data: many locations; direct measure of temperature; annual or better resolution; small (known?) error; not too many missing values; short series.
Proxy data: long series; few ("strange") locations; relationship with temperature unclear, may change over time; often coarse resolution; large (unknown) error; lots of missing values; pre-processing critical.
Current reconstruction methods: 1) Choose proxies. 2) Create matrix X of pre-processed proxy by time. 3) Create matrix Y of instrumental temperatures. 4) Relate X to Y (by PCA of one or both, then regression of X on Y or Y on X). 5) Use X to predict Y back in time.
Difficulties with existing methods: missing data; spatial association between proxies and instruments lost; PCA of proxy data dangerous; uncertainty in temperature data ignored; difficult to include proxies at different resolutions.
Consequences: underestimation of past climate variability; wrong uncertainty.
An alternative approach: regard both instruments and proxies as observations of an underlying temperature process. Model all observations, including appropriate error terms.
In general: model temperature as an underlying space-time field; model data (proxies and thermometers) as observations of this field; use an appropriate functional relationship between proxies and temperature; use appropriate error terms.
Specifically: true temperature T(t) is an AR(1) process; observations O = linear function of T plus AR(1) error E plus measurement error; for a low-resolution proxy, replace T by its mean over the appropriate period.
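One way to write this model down is sketched here. The slide names only T, O and E; every other symbol (the coefficients \rho, a_i, b_i, \phi_i, the noise terms, and the averaging window m_i) is notation assumed for illustration, as are the Gaussian distributions and the choice of a trailing window.

```latex
\begin{align}
T(t)   &= \rho\, T(t-1) + \varepsilon(t),
          & \varepsilon(t) &\sim N(0, \sigma_T^2) \\
O_i(t) &= a_i + b_i\, T(t) + E_i(t) + \delta_i(t),
          & \delta_i(t) &\sim N(0, \tau_i^2) \\
E_i(t) &= \phi_i\, E_i(t-1) + \eta_i(t),
          & \eta_i(t) &\sim N(0, \sigma_i^2) \\
\intertext{and, for a proxy $i$ that resolves only $m_i$-year means,
$T(t)$ in the second line is replaced by}
\bar{T}_i(t) &= \frac{1}{m_i} \sum_{s=0}^{m_i - 1} T(t - s).
\end{align}
```

Here i indexes an individual thermometer or proxy series, so instruments and proxies share one form and differ only in their parameter values.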
A simulation study: 50 years of thermometer data; 250 years of proxies; true temperature AR(1), coefficient = 0.95, sd = 1; 10 thermometers, small AR(1) error (coef = 0.7, sd = 0.1); 5 proxies, AR(1) error (coef = 0.7, sd = 1).
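A minimal sketch of how data with these settings could be generated follows. Several details are assumptions not stated on the slide: "sd" is read as the innovation standard deviation, every series observes temperature with unit slope and zero intercept, the thermometers cover the most recent 50 years, and no separate measurement error is added. The names `ar1` and `simulate` are hypothetical.

```python
import numpy as np


def ar1(n, coef, sd, rng):
    """Simulate a zero-mean AR(1) series; `sd` is the innovation sd (assumed)."""
    x = np.empty(n)
    # Draw the first value from the stationary distribution (no burn-in needed).
    x[0] = rng.normal(0.0, sd / np.sqrt(1.0 - coef**2))
    for t in range(1, n):
        x[t] = coef * x[t - 1] + rng.normal(0.0, sd)
    return x


def simulate(seed=0, n_years=250, n_instr_years=50, n_therm=10, n_proxy=5):
    """Return (true temperature, thermometer matrix, proxy matrix)."""
    rng = np.random.default_rng(seed)
    temp = ar1(n_years, 0.95, 1.0, rng)

    # Thermometers: small AR(1) error, observed only in the last 50 years
    # (assumption); earlier years are left as NaN.
    therm = np.full((n_years, n_therm), np.nan)
    for j in range(n_therm):
        obs = temp + ar1(n_years, 0.7, 0.1, rng)
        therm[-n_instr_years:, j] = obs[-n_instr_years:]

    # Proxies: same form, larger AR(1) error, observed for all 250 years.
    proxy = np.column_stack(
        [temp + ar1(n_years, 0.7, 1.0, rng) for _ in range(n_proxy)]
    )
    return temp, therm, proxy


temp, therm, proxy = simulate()
```

The later scenarios in the study would then correspond to changing `n_proxy`, the proxy error sd, or the temperature series passed through the same observation step.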
For comparison, a regression estimator: find the first PC of the proxies; regress the thermometer mean on the PC; predict "temperature" (actually the thermometer mean) using the regression.
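The comparison estimator can be sketched as below. This is an illustration under assumptions rather than the code used for the study: the function name `pc_regression`, the use of an SVD for the principal component, and the stand-in data are all hypothetical, and the proxies are assumed complete.

```python
import numpy as np


def pc_regression(proxy, therm_mean, calib):
    """Predict the thermometer mean from the first PC of the proxies.

    proxy      : (n_years, n_proxy) array with no missing values
    therm_mean : (n_years,) array, only used where `calib` is True
    calib      : (n_years,) boolean mask marking the instrumental period
    """
    centered = proxy - proxy.mean(axis=0)
    # The first right singular vector of the centered matrix gives the
    # loadings of the first principal component.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pc1 = centered @ vt[0]
    # Ordinary least squares of thermometer mean on PC1, calibration years only.
    slope, intercept = np.polyfit(pc1[calib], therm_mean[calib], 1)
    # Apply the fitted line to the whole proxy period.
    return intercept + slope * pc1


# Stand-in data (not the study's): 250 years, last 50 instrumental.
rng = np.random.default_rng(0)
n_years, n_calib = 250, 50
temp = 0.3 * np.cumsum(rng.normal(size=n_years))
proxy = temp[:, None] + rng.normal(size=(n_years, 5))
calib = np.arange(n_years) >= n_years - n_calib
therm_mean = np.where(calib, temp + rng.normal(scale=0.05, size=n_years), np.nan)
recon = pc_regression(proxy, therm_mean, calib)
```

Because the PC is a noisy predictor, the fitted slope is attenuated and the reconstructed series tends to vary less than the target, which is one route to the underestimated variability attributed to existing methods.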
Add uncertainty to proxies: only 2 proxies; error sd = 2.
The effect of missing data: 5 proxies, error sd = 1; 50% of proxy data missing at random.
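The missing-data scenario can be mimicked by masking entries independently, as in this sketch. The helper name `mask_at_random` and the stand-in proxy matrix are hypothetical, and independent masking of each entry is an assumed reading of "missing at random".

```python
import numpy as np


def mask_at_random(proxy, frac_missing, rng):
    """Return a copy of `proxy` with each entry set to NaN with
    probability `frac_missing`, independently of all other entries."""
    masked = proxy.astype(float).copy()
    masked[rng.random(proxy.shape) < frac_missing] = np.nan
    return masked


rng = np.random.default_rng(1)
proxy = rng.normal(size=(250, 5))      # stand-in proxy matrix, 5 proxies
masked = mask_at_random(proxy, 0.5, rng)

# Years in which all 5 proxies are present, i.e. the rows a method
# requiring a complete proxy-by-time matrix could use without imputation.
complete_rows = ~np.isnan(masked).any(axis=1)
```

With 5 proxies each missing half the time, a year is complete with probability 0.5^5, about 3%, so a complete-matrix method keeps very few years, whereas a model of individual observations can use every value that is present.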
Add a trend: only 150 years for proxies; cosine trend, cycle 50 years, amplitude 4 (first 50 years), 8 (next 50) and 12 (last 50); AR(1) model for temperature no longer correct.
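The trend described here could be constructed as follows. The slide does not give the phase, whether "amplitude" means peak or peak-to-peak, or how the trend is combined with the AR(1) series; this sketch assumes the cosine starts at a peak, that the amplitude multiplies the cosine directly, and that "first" means earliest. The name `cosine_trend` is hypothetical.

```python
import numpy as np


def cosine_trend(n_years=150, period=50, amplitudes=(4.0, 8.0, 12.0)):
    """Cosine with a fixed period whose amplitude steps up block by block."""
    t = np.arange(n_years)
    # One amplitude per equal-length block: 4 for years 0-49, 8 for 50-99, ...
    amp = np.repeat(amplitudes, n_years // len(amplitudes))
    return amp * np.cos(2.0 * np.pi * t / period)


trend = cosine_trend()
```

Adding `trend` to a simulated AR(1) temperature series (an assumed combination) gives a truth that a stationary AR(1) prior cannot represent, which is the model misspecification this scenario probes.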
Add lots of "bad" proxies: 2 proxies linearly related to temperature; 20 proxies unrelated to temperature.
Some data from China: two proxies used in Moberg et al.; closest instrumental data sets.
[Figure: series labelled Instrumental, Beijing and China]
Modelling conclusions: a flexible model which can take account of many sources of uncertainty; theoretically easy to include spatial correlations; can include proxies at different resolutions; missing data not a problem; avoids underestimation of variability if the model is correct; functional form of temperature and error series very important.
Other conclusions: impossible to work with proxies without help from appropriate scientists (preferably those who collected the data); pre-processing crucial; selection of proxies important; some assumptions impossible to verify.