Metrics, Bayes, and BOGSAT: Recognizing and Assessing Uncertainties in Earthquake Hazard Maps
Seth Stein (1), Edward M. Brooks (1), Bruce D. Spencer (2)
(1) Department of Earth & Planetary Sciences and Institute for Policy Research, Northwestern University
(2) Department of Statistics and Institute for Policy Research, Northwestern University
Hazard maps often do poorly
- 2010 map predicts probability of strong shaking in next 30 years
- But: 2011 M 9.1 Tohoku, 1995 M 7.3 Kobe & others occurred in areas mapped as low hazard
(Geller, 2011)
Due to space-time variability of earthquakes & the short record, hazard map bull’s-eyes reflect known past earthquakes and may not indicate future ones
[Figure: China GSHAP (1999) map, PGA in m/s², 10% in 50 yr (1/500 yr); label: 2008 Wenchuan]
Hazard maps are like ‘Whack-a-mole’ - you wait for the mole to come up where it went down, but it’s likely to pop up somewhere else.
NY Times 3/21/11
Issues
- Maps often do poorly
- Large uncertainties are unquantified & not communicated to users
- Revising a map after a large earthquake may or may not predict the future better
- Map performance is not assessed: no agreed criteria for doing so
- How maps perform is unknown and involves complicated and not yet understood factors
Hazard maps are hard to get right: successfully predicting future shaking depends on accuracy of four assumptions over years
- Where will large earthquakes occur?
- When will they occur?
- How large will they be?
- How strong will their shaking be?
Try to assess uncertainty via either
- sensitivity analysis of effects of poorly known parameters
- evaluation of map performance, which is hard because of the long times required to test forecasts
PREDICTED HAZARD DEPENDS GREATLY ON
- Assumed maximum magnitude of largest events
- Assumed ground motion model
- Neither is well known since large earthquakes are rare
[Figure: hazard maps, 2% in 50 yr (1/2500 yr); panel labels 180%, 275%]
(Newman et al., 2001)
PREDICTED HAZARD ALSO DEPENDS ON RECURRENCE MODEL
For New Madrid, the usual time-independent model predicts higher hazard
[Figure: panel labels 154%, 106%]
Seismic hazard uncertainty is at least a factor of 3-4
- At best partially reducible because some key parameters are poorly known, unknown, or unknowable
- Large uncertainties are not communicated to users
(Stein et al., 2012)
Options after an earthquake yields shaking larger than anticipated:
- Either regard the high shaking as a low-probability event allowed by the map
- Or, as is usually done, accept that the high shaking was not simply a low-probability event and revise the map
No formal or objective criteria are used to decide whether to change a map & how
- Done via BOGSAT (“Bunch Of Guys Sitting Around Table”)
- Challenge: a new map that better describes the past may or may not better predict the future
Deciding whether to remake a map is like deciding, after a coin has come up heads a number of times, whether to continue assuming that the coin is fair and the run is a low-probability event, or to change to a model in which the coin is assumed to be biased. Changing the model to match the past may describe the future worse.
Bayes’ Rule: how much to change depends on one’s confidence in the prior model
Revised probability model ∝ Likelihood of observations given the prior model × Prior probability model
If you were confident that the coin was fair, you would probably not change your model. If you were given the coin at a magic show, your confidence would be lower and you would be more likely to change your model.
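The coin analogy can be made concrete with a small sketch. It is only an illustration under stated assumptions: there are just two candidate models, a fair coin and a hypothetical biased coin that lands heads 90% of the time, and the prior confidences (0.99 for an ordinary coin, 0.5 for one handed over at a magic show) and the run of five heads are made-up values, not taken from the talk.

```python
def posterior_fair(prior_fair, n_heads, p_heads_if_biased=0.9):
    """Posterior probability that the coin is fair after n_heads heads in a row.

    Assumes only two models: fair (P(heads) = 0.5) and a hypothetical
    biased coin with P(heads) = p_heads_if_biased.
    """
    likelihood_fair = 0.5 ** n_heads
    likelihood_biased = p_heads_if_biased ** n_heads
    # Bayes' rule: posterior = likelihood * prior / total probability of the data
    evidence = prior_fair * likelihood_fair + (1.0 - prior_fair) * likelihood_biased
    return prior_fair * likelihood_fair / evidence


# Same observations (five heads in a row), two different prior confidences.
confident = posterior_fair(prior_fair=0.99, n_heads=5)   # ordinary coin
magic_show = posterior_fair(prior_fair=0.50, n_heads=5)  # coin from a magician

print(f"Confident prior:  P(fair | data) = {confident:.2f}")
print(f"Magic-show prior: P(fair | data) = {magic_show:.2f}")
```

With a strong prior the fair-coin model survives the run of heads, while with a weak prior the same data shift most of the probability to the biased model. This is the sense in which the decision to revise a map depends on confidence in the map that was in place before the earthquake.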
Assessing performance
How good a baseball player was Babe Ruth? The answer depends on the metric used. In some seasons Ruth led the league both in home runs and in the number of times he struck out. By one metric he did very well, and by another, very poorly.
Metrics compare map and observations
M0 – Exceedance Metric
– $M0 = |f - p|$
– Compares fraction (f) of sites exceeding map to predicted fraction of exceedances (p)
– Implicit in PSHA
M1 – Squared Misfit Metric
– $M1 = \frac{1}{N} \sum_{i=1}^{N} (s_i - x_i)^2$
– Quantifies magnitude of difference between observation and prediction
– Similar to visual comparison
[Figure: predicted fractions of sites exceeding and not exceeding the map (p%, 100-p%) compared with observed fractions (f%, 100-f%)]
Each metric gives different insight into map behavior
(Stein et al., 2015)
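The predicted fraction p can be tied to the way maps are labeled on earlier slides, for example "10% in 50 yr (1/500 yr)". The sketch below assumes that exceedances at a site follow a time-independent (Poisson) process, so that the probability of at least one exceedance in t years for a return period T is 1 - exp(-t/T). The function names are illustrative, not from the talk.

```python
import math


def exceedance_probability(window_years, return_period_years):
    """P(at least one exceedance in the window), assuming a Poisson process."""
    return 1.0 - math.exp(-window_years / return_period_years)


def return_period(prob, window_years):
    """Return period (years) that gives exceedance probability `prob` in the window."""
    return -window_years / math.log(1.0 - prob)


for prob, window in [(0.10, 50), (0.02, 50)]:
    years = return_period(prob, window)
    print(f"{prob:.0%} in {window} yr corresponds to a return period of about {years:.0f} yr")
```

Under this assumption the map labels "1/500 yr" and "1/2500 yr" are rounded return periods, and p for a given observation window follows directly from `exceedance_probability`.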
A map can be nominally successful by the fractional exceedance metric, but significantly underpredict shaking at many sites and overpredict it at others. Or it can be nominally unsuccessful by the fractional site exceedance metric, but better predict shaking at most sites.
[Figure: two example maps with sites marked as underpredicted or overpredicted; one with p=0.1, f=0.1, M0=0 and one with p=0.1, f=0.2, M0=0.1]
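The contrast between the two metrics can be reproduced with a minimal sketch. It assumes that s_i is the shaking predicted by the map at site i and x_i is the largest shaking observed there, and that a site counts as an exceedance when the observation is strictly larger than the prediction. The ten-site data sets are invented so that they reproduce the p, f, and M0 values quoted above; they are not real observations.

```python
def exceedance_metric(predicted, observed, p):
    """M0 = |f - p|, where f is the fraction of sites whose observed shaking exceeds the map."""
    exceedances = sum(1 for s, x in zip(predicted, observed) if x > s)
    f = exceedances / len(predicted)
    return abs(f - p)


def squared_misfit_metric(predicted, observed):
    """M1 = (1/N) * sum((s_i - x_i)^2) over all N sites."""
    n = len(predicted)
    return sum((s - x) ** 2 for s, x in zip(predicted, observed)) / n


p = 0.1                      # predicted fraction of sites exceeding the map
predicted = [1.0] * 10       # hypothetical map value at ten sites (arbitrary units)

# Case A: exactly one site exceeds the map (f = 0.1), but the misfits are large.
observed_a = [3.0] + [0.1] * 9
# Case B: two sites exceed the map (f = 0.2), but every site is close to the map.
observed_b = [1.1, 1.1] + [0.9] * 8

m0_a = exceedance_metric(predicted, observed_a, p)
m1_a = squared_misfit_metric(predicted, observed_a)
m0_b = exceedance_metric(predicted, observed_b, p)
m1_b = squared_misfit_metric(predicted, observed_b)

print(f"Case A: M0 = {m0_a:.2f}, M1 = {m1_a:.3f}")
print(f"Case B: M0 = {m0_b:.2f}, M1 = {m1_b:.3f}")
```

Case A is nominally perfect by M0 yet has the larger M1, while case B misses the predicted exceedance fraction but matches the shaking more closely at every site.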
The short time period since hazard maps began to be made is a challenge for assessing how well they work. Hindcasts offer long records, but are not true tests, as they compare maps to data that were available when the map was made. Still, they give useful insight.
Compare 510-year shaking record (Miyazawa and Mori, 2009) to Japanese National Hazard (JNH) maps
(Brooks et al., 2015)
Uniform & random maps
Geller (2011) argued that “all of Japan is at risk from earthquakes, and the present state of seismological science does not allow us to reliably differentiate the risk level in particular geographic areas,” so a map showing uniform hazard would be preferable.
Test: By the exceedance metric (M0), uniform and randomized maps do better, but by the squared misfit metric (M1), JNH maps perform better.
[Figure: axis labels "JNH worse", "JNH better"]
(Brooks et al., 2015)
Smoothed Maps
[Figure panels: Observed, Original, 50 km, 100 km]
How detailed should a map be?
– By exceedance metric (M0), map improves by smoothing over larger areas
– By squared misfit (M1), map performs best if smoothed over km
(Brooks et al., 2015)
Maps may be overparameterized (overfit). A high-order polynomial fits past data better than linear or quadratic models, but this more detailed model predicts the future worse than the simpler models. An intermediate level of detail may be better for hazard maps.
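The polynomial analogy can be illustrated with a short sketch. It uses synthetic data invented for this purpose: six "past" points scattered around a straight line and one "future" point on the same trend. The high-order model is taken to be the degree-5 polynomial that passes through every past point, written in Lagrange form, and the simple model is a least-squares straight line. Neither choice comes from the talk.

```python
def fit_line(xs, ys):
    """Least-squares straight line through (xs, ys); returns a prediction function."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return lambda x: slope * x + intercept


def fit_interpolating_polynomial(xs, ys):
    """Polynomial of degree len(xs) - 1 through every point (Lagrange form)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic "past" record: roughly y = 2x + 1 with made-up scatter.
past_x = [0, 1, 2, 3, 4, 5]
past_y = [1.3, 2.6, 5.4, 6.7, 9.5, 10.8]
# Synthetic "future" observation on the same trend.
future_x = [6]
future_y = [13.1]

line = fit_line(past_x, past_y)
poly = fit_interpolating_polynomial(past_x, past_y)

for name, model in [("linear", line), ("degree-5 polynomial", poly)]:
    past_err = mean_squared_error(model, past_x, past_y)
    future_err = mean_squared_error(model, future_x, future_y)
    print(f"{name}: past misfit = {past_err:.3f}, future misfit = {future_err:.3f}")
```

The polynomial reproduces the past record exactly because it also fits the scatter, and that is why it extrapolates badly. The straight line misfits the past slightly but tracks the underlying trend.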
Conclusions
- Maps often do poorly
- Large uncertainties are unquantified & not communicated to users
- Map performance is not assessed: no agreed criteria for doing so
- In some cases, by some metrics, uniform, random, or smoothed maps perform better
- Map performance involves complicated and not yet understood factors
- Better understanding of performance & uncertainties could improve the use of hazard maps in policy