Goodness of fit testing for point processes with application to ETAS models, spatial clustering, and focal mechanisms (USGS) Frederic Paik Schoenberg,

Presentation transcript:

Goodness of fit testing for point processes with application to ETAS models, spatial clustering, and focal mechanisms (USGS)
Frederic Paik Schoenberg, Ka Wong, and Robert Clements
1) Point process models and ETAS
2) Pixel-based methods
3) Numerical summaries
4) Error diagrams
5) Comparative methods, tessellation residuals
6) Residuals: rescaling, thinning, and superposition
7) Applications: using focal mechanisms in ETAS, and the Tapered Pareto Wrapped Exponential (TPWE) model

1. Some point process models in seismology.
Point process: random ($\sigma$-finite) collection of points in some space, $S$. $N(A)$ = # of points in the set $A$. $S = [0, T] \times X$.
Simple: no two points at the same time (with probability one).
Conditional intensity: $\lambda(t,x) = \lim_{\Delta t, \Delta x \to 0} E\{N([t, t+\Delta t) \times B_{x,\Delta x}) \mid H_t\} / [\Delta t \, \Delta x]$, where $H_t$ = history of $N$ for all times before $t$, and $B_{x,\Delta x}$ = ball around $x$ of size $\Delta x$.
* A simple point process is uniquely characterized by $\lambda(t,x)$. (Fishman & Snyder 1976)
Poisson process: $\lambda(t,x)$ doesn't depend on $H_t$. $N(A_1), N(A_2), \ldots, N(A_k)$ are independent for disjoint $A_i$, and each Poisson.
Stationary (homogeneous) Poisson process: $\lambda(t,x) = \mu$.

Some point process models of clustering:
Neyman-Scott process: clusters of points whose centers are formed from a stationary Poisson process. Typically each cluster consists of a fixed integer k of points which are placed uniformly and independently within a ball of radius r around each cluster's center.
Cox-Matern process: cluster sizes are random: independent and identically distributed Poisson random variables.
Thomas process: cluster sizes are Poisson, and the points in each cluster are distributed independently and isotropically according to a Gaussian distribution.
Hawkes process: parents are formed from a stationary Poisson process, and each produces a cluster of offspring points, and each of them produces a cluster of further offspring points, etc. $\lambda(t, x) = \mu + \sum_{i: t_i < t} g(t - t_i, \|x - x_i\|)$.

Aftershock activity typically follows the modified Omori law (Utsu 1971): $g(t) = K/(t+c)^p$.
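As a rough illustration of the modified Omori law, the sketch below evaluates the rate $K/(t+c)^p$ and its integral over $[0, T]$, i.e. the expected number of aftershocks up to time $T$ under this rate. The function names and the parameter values are hypothetical, chosen only for the example.

```python
import math

def omori_rate(t, K, c, p):
    """Modified Omori rate g(t) = K / (t + c)**p at time t >= 0 after a mainshock."""
    return K / (t + c) ** p

def omori_expected_count(T, K, c, p):
    """Integral of g over [0, T] (closed form, with the p == 1 case handled separately)."""
    if abs(p - 1.0) < 1e-12:
        return K * math.log((T + c) / c)
    return K * (c ** (1.0 - p) - (T + c) ** (1.0 - p)) / (p - 1.0)

# Hypothetical parameter values, for illustration only.
K, c, p = 10.0, 0.01, 1.1
for T in (1.0, 10.0, 100.0):
    print(T, omori_rate(T, K, c, p), omori_expected_count(T, K, c, p))
```

For $p > 1$ the integral converges as $T$ grows, so the expected total number of aftershocks is finite; for $p \le 1$ it does not.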

ETAS (Epidemic-Type Aftershock Sequence, Ogata 1988, 1998): $\lambda(t, x) = \mu(x) + \sum_{i: t_i < t} g(t - t_i, \|x - x_i\|, m_i)$, where $g(t, x, m)$ could be, for instance, $K \exp\{\alpha m\} (t+c)^{-p} (x^2 + d)^{-q}$.
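A minimal sketch of how the ETAS conditional intensity could be evaluated at a space-time location, using the example triggering function above. Several things here are assumptions made for illustration: the background rate is taken as a constant `mu` rather than a spatially varying $\mu(x)$, locations are planar `(x, y)` coordinates, magnitudes are taken relative to the catalog threshold, and the name `etas_intensity` and all numerical values are hypothetical.

```python
import math

def etas_intensity(t, x, y, history, mu, K, alpha, c, p, d, q):
    """Background rate plus the contribution of every earlier event.

    history: iterable of (t_i, x_i, y_i, m_i); only events with t_i < t contribute.
    """
    rate = mu  # assumption: constant background instead of mu(x)
    for t_i, x_i, y_i, m_i in history:
        if t_i < t:
            r2 = (x - x_i) ** 2 + (y - y_i) ** 2  # squared distance to the earlier event
            rate += (K * math.exp(alpha * m_i)
                     * (t - t_i + c) ** (-p)
                     * (r2 + d) ** (-q))
    return rate

# Hypothetical two-event history and parameter values.
history = [(0.0, 0.0, 0.0, 2.0), (1.5, 0.3, -0.2, 0.5)]
print(etas_intensity(2.0, 0.1, 0.0, history,
                     mu=0.05, K=0.01, alpha=1.0, c=0.01, p=1.2, d=0.02, q=1.5))
```

The cost is linear in the number of earlier events per evaluation, so evaluating the intensity at every event of a catalog is quadratic in the catalog size.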

2. Pixel-based methods. Compare $N(A_i)$ with $\int_{A_i} \lambda(t, x)\,dt\,dx$, on pixels $A_i$. (Baddeley, Turner, Møller, Hazelton, 2005) Problems: * If pixels are large, lose power. * If pixels are small, residuals are mostly ~ 0,1. * Smoothing reveals only gross features.


Pearson residuals: Normalized difference between observed and expected numbers of events within each space-time pixel. (Baddeley et al. 2005)
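A small sketch of a pixel-based residual computation. It bins points on the unit square into an `n x n` grid and normalizes observed minus expected counts by the square root of the expected count. This normalization is an assumption chosen for simplicity; the definition in Baddeley et al. (2005) is stated in terms of integrals of the intensity and may differ by a scaling. The helper names are hypothetical.

```python
import math

def pixel_counts(points, n):
    """Count points (x, y) in [0, 1) x [0, 1) on an n x n grid, flattened row by row."""
    counts = [0] * (n * n)
    for x, y in points:
        ix = min(int(x * n), n - 1)
        iy = min(int(y * n), n - 1)
        counts[iy * n + ix] += 1
    return counts

def pearson_residuals(observed, expected):
    """(observed - expected) / sqrt(expected) for each pixel; expected must be > 0."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]

# Hypothetical example: three points, model with constant rate 4 per unit area.
points = [(0.1, 0.1), (0.9, 0.9), (0.95, 0.95)]
n = 2
observed = pixel_counts(points, n)
expected = [4.0 / (n * n)] * (n * n)  # rate times pixel area
print(pearson_residuals(observed, expected))
```

With very small pixels nearly every observed count is 0 or 1, which is the non-normality problem noted for pixel-based methods.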

3. Numerical summaries. a) Likelihood statistics (LR, AIC, BIC). Log-likelihood = $\sum_i \log \lambda(t_i, x_i) - \int \lambda(t,x)\,dt\,dx$. b) Second-order statistics. * K-function, L-function (Ripley, 1977) * Weighted K-function (Baddeley, Møller and Waagepetersen 2002, Veen and Schoenberg 2005) * Other weighted 2nd-order statistics: R/S statistic, correlation integral, fractal dimension (Adelfio and Schoenberg, 2009)
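The log-likelihood formula can be sketched numerically as follows. This is an illustration under simplifying assumptions: the intensity is passed in as a function `lam(t, x, y)` that can be evaluated anywhere (for a fitted model the history is held fixed), the observation region is $[0, T]$ times the unit square, and the integral term is approximated with a midpoint rule on a regular grid. The function name and grid sizes are hypothetical.

```python
import math

def log_likelihood(points, lam, T, n_t=50, n_x=20, n_y=20):
    """Sum of log-intensities at the points minus the integrated intensity."""
    sum_log = sum(math.log(lam(t, x, y)) for t, x, y in points)
    dt, dx, dy = T / n_t, 1.0 / n_x, 1.0 / n_y
    integral = 0.0
    for i in range(n_t):
        for j in range(n_x):
            for k in range(n_y):
                # evaluate the intensity at the centre of each grid cell
                integral += lam((i + 0.5) * dt, (j + 0.5) * dx, (k + 0.5) * dy)
    integral *= dt * dx * dy
    return sum_log - integral

# Hypothetical check with a constant intensity of 2 events per unit volume.
pts = [(0.2, 0.1, 0.5), (0.5, 0.7, 0.3), (0.9, 0.4, 0.8)]
print(log_likelihood(pts, lambda t, x, y: 2.0, T=1.0))
```

For clustered models the intensity has sharp peaks just after each event, so a regular grid can be a poor approximation of the integral; a closed-form integral of the triggering function is preferable where one exists.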

Model Assessment using Weighted K-function. Usual K-function: $K(h) \sim \sum\sum_{i \ne j} I(|x_i - x_j| \le h)$ (Ripley 1979). Weight each pair of points according to the estimated intensity at the points: $\hat{K}_w(h) \sim \sum\sum_{i \ne j} w_i w_j I(|x_i - x_j| \le h)$ (Baddeley et al. 2002), where $w_i = \lambda(t_i, x_i)^{-1}$. Asymptotically normal, under certain regularity conditions (Veen and Schoenberg 2005).
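A minimal sketch of the weighted K-function for planar points, with each pair weighted by the inverse of the product of the estimated intensities at its two points. The normalization by the area of the observation window and the absence of any edge correction are assumptions of this sketch, not necessarily the estimators used in the cited papers; `weighted_k` is a hypothetical name.

```python
import math

def weighted_k(points, intensities, h, area):
    """Sum over ordered pairs i != j of w_i * w_j * 1{|x_i - x_j| <= h}, divided by area."""
    total = 0.0
    n = len(points)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1])
            if dist <= h:
                total += 1.0 / (intensities[i] * intensities[j])
    return total / area

# Hypothetical example: two points 0.5 apart, estimated intensity 2 at each.
pts = [(0.25, 0.5), (0.75, 0.5)]
print(weighted_k(pts, [2.0, 2.0], h=1.0, area=1.0))
```

With this normalization, values well above $\pi h^2$ would suggest clustering that the fitted intensity does not account for, and values well below it would suggest inhibition, ignoring edge effects.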

3. Numerical summaries (continued). c) Other test statistics (mostly vs. stationary Poisson). TTT, Khmaladze (Andersen et al. 1993); Cramér-von Mises, K-S test (Heinrich 1991); higher moment and spectral tests (Davies 1977). Problems: -- Overly simplistic. -- Stationary Poisson not a good null hypothesis (Stark 1997)

4. Error Diagrams Plot (normalized) number of alarms vs. (normalized) number of false negatives (failures to predict). (Molchan 1990; Molchan 1997; Zaliapin & Molchan 2004; Kagan 2009). Similar to ROC curves (Swets 1973). Problems: -- Must focus near axes. [consider relative to given model (Kagan 2009) ] -- Does not suggest where model fits poorly.
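One way an error diagram could be computed from a pixelized forecast is sketched below. It assumes equal-sized pixels, declares alarms in order of decreasing forecast rate, and records for each alarm level the fraction of pixels under alarm together with the fraction of events missed. The function name and the toy numbers are hypothetical.

```python
def error_diagram(rates, counts):
    """Return (fraction of pixels in alarm, fraction of events missed) pairs.

    rates:  forecast rate for each pixel
    counts: observed number of events in each pixel (at least one event in total)
    """
    order = sorted(range(len(rates)), key=lambda i: rates[i], reverse=True)
    total = sum(counts)
    n = len(rates)
    curve = [(0.0, 1.0)]  # no alarms: every event is missed
    caught = 0
    for k, i in enumerate(order, start=1):
        caught += counts[i]
        curve.append((k / n, 1.0 - caught / total))
    return curve

# Hypothetical three-pixel forecast.
print(error_diagram([0.1, 0.9, 0.5], [0, 2, 1]))
```

A forecast with no skill tracks the diagonal from (0, 1) to (1, 0) on average; a curve that drops quickly towards the horizontal axis indicates that few alarms catch most events.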

5. Comparative methods, tessellation. -- Can consider difference (for competing models) between residuals over each pixel. Problem: Hard to interpret. If difference = 3, is this because model A overestimated by 3? Or because model B underestimated by 3? Or because model A overestimated by 1 and model B underestimated by 2? Also, when aggregating over pixels, it is possible that a model will predict the correct number of earthquakes, but at the wrong locations and times. -- Better: consider difference between log-likelihoods, in each pixel. (Wong & Schoenberg 2010). Problem: pixel choice is arbitrary, and unequal # of pts per pixel…..

-- Alternative: use the Voronoi tessellation of the points as cells. Cell i = {all locations closer to point $(x_i, y_i)$ than to any other point $(x_j, y_j)$}. Now 1 point per cell. If $\lambda$ is locally constant, then cell area ~ Gamma (Hinde and Miles 1980).
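As a rough illustration of the Voronoi cells, the sketch below approximates the area of each cell on the unit square by assigning the centers of a fine grid to their nearest point. This raster approach is a stand-in chosen to keep the example dependency-free; a computational-geometry library would give exact cells. The function name and grid size are hypothetical.

```python
def voronoi_cell_areas(points, n_grid=200):
    """Approximate Voronoi cell areas within the unit square by nearest-point counting."""
    counts = [0] * len(points)
    for a in range(n_grid):
        gx = (a + 0.5) / n_grid
        for b in range(n_grid):
            gy = (b + 0.5) / n_grid
            # index of the point closest to this grid-cell centre
            nearest = min(range(len(points)),
                          key=lambda i: (points[i][0] - gx) ** 2 + (points[i][1] - gy) ** 2)
            counts[nearest] += 1
    cell = 1.0 / (n_grid * n_grid)
    return [cnt * cell for cnt in counts]

# Hypothetical example: two points splitting the unit square in half.
print(voronoi_cell_areas([(0.25, 0.5), (0.75, 0.5)], n_grid=50))
```

Each cell contains exactly one observed point, so the observed count of 1 can be compared with the model's integrated intensity over the cell, which is roughly the intensity times the cell area when the intensity is locally constant.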

6. Residuals: rescaling, thinning, superposing. Rescaling. (Meyer 1971; Berman 1984; Merzbach & Nualart 1986; Ogata 1988; Nair 1990; Schoenberg 1999; Vere-Jones and Schoenberg 2004): Suppose N is simple. Rescale one coordinate: move each point $\{t_i, x_i\}$ to $\{t_i, \int_0^{x_i} \lambda(t_i, x)\,dx\}$ [or to $\{\int_0^{t_i} \lambda(t, x_i)\,dt, x_i\}$]. Then the resulting process is stationary Poisson. Problems: * Irregular boundary, plotting. * Points in transformed space hard to interpret. * For highly clustered processes: boundary effects, loss of power.
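A minimal sketch of rescaling in the purely temporal case: each event time $t_i$ is moved to $\int_0^{t_i} \lambda(s)\,ds$. The assumptions here are that the intensity is available as a function `lam(s)` of time only (for a fitted model, with the observed history plugged in) and that a midpoint rule is accurate enough; `rescale_times` is a hypothetical name.

```python
def rescale_times(times, lam, n_steps=1000):
    """Map each time t to the integral of lam over [0, t], using a midpoint rule."""
    rescaled = []
    for t in times:
        ds = t / n_steps
        rescaled.append(sum(lam((k + 0.5) * ds) for k in range(n_steps)) * ds)
    return rescaled

# Hypothetical example: intensity growing linearly in time, lam(s) = 2 s.
times = [0.5, 1.0, 2.0, 3.0]
print(rescale_times(times, lambda s: 2.0 * s))
```

If the model is correct, the gaps between consecutive rescaled times should look like independent unit-rate exponential variables, which can be checked with standard tests.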

Thinning. (Westcott 1976): Suppose N is simple, stationary, & ergodic.

Thinning: Suppose $\inf \lambda(t_i, x_i) = b$. Keep each point $(t_i, x_i)$ with probability $b / \lambda(t_i, x_i)$. Can repeat many times --> many stationary Poisson processes (but not quite independent!)
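The thinning step can be sketched as below. The function takes the estimated intensity at each observed point and keeps the point with probability $b / \lambda(t_i, x_i)$. Defaulting `b` to the smallest intensity among the observed points is an assumption of this sketch; in practice `b` would be the infimum of the fitted intensity over the observation region. The names are hypothetical.

```python
import random

def thin(points, intensities, b=None, rng=None):
    """Return the subset of points retained by independent thinning."""
    rng = rng or random.Random()
    if b is None:
        b = min(intensities)  # assumption: infimum taken over the observed points
    return [pt for pt, lam in zip(points, intensities)
            if rng.random() < b / lam]

# Hypothetical example: the high-intensity point is the least likely to survive.
pts = [(0.1, 0.2), (0.4, 0.4), (0.8, 0.9)]
lams = [1.0, 5.0, 2.0]
print(thin(pts, lams, rng=random.Random(0)))
```

Calling `thin` repeatedly with different seeds gives many sets of thinned residuals from the same data, which are not independent of one another because they all come from the same catalog.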

Superposition. (Palm 1943): Suppose N is simple & stationary. Then $M_k$ --> stationary Poisson.

Superposition: Suppose $\sup \lambda(t, x) = c$. Superpose N with a simulated Poisson process of rate $c - \lambda(t, x)$. As with thinning, can repeat many times to generate many (non-independent) stationary Poisson processes. Problems with thinning and superposition: Thinning: Low power. If $b = \inf \lambda(t_i, x_i)$ is small, will end up with very few points. Superposition: Low power if $c = \sup \lambda(t, x)$ is large: most of the residual points will be simulated.
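A sketch of superposition on $[0, T]$ times the unit square. It simulates a homogeneous Poisson process of rate `c` and keeps each simulated point with probability $(c - \lambda)/c$, which yields a Poisson process of rate $c - \lambda(t, x)$ to add to the data. The assumptions are that `lam(t, x, y)` can be evaluated anywhere and never exceeds `c`; the function name and values are hypothetical.

```python
import random

def superpose(points, lam, c, T, rng=None):
    """Return the observed points together with simulated points of rate c - lam."""
    rng = rng or random.Random()
    simulated = []
    # On a unit-area spatial window, candidate times arrive at total rate c.
    t = rng.expovariate(c)
    while t < T:
        x, y = rng.random(), rng.random()
        if rng.random() < (c - lam(t, x, y)) / c:
            simulated.append((t, x, y))
        t += rng.expovariate(c)
    return sorted(list(points) + simulated)

# Hypothetical example: intensity that is lower for larger x, bounded by c = 50.
pts = [(0.5, 0.2, 0.3), (0.1, 0.4, 0.4)]
residual_points = superpose(pts, lambda t, x, y: 50.0 * (1.0 - x), 50.0, 1.0,
                            rng=random.Random(0))
print(len(residual_points))
```

If the model is correct, the combined points should resemble a stationary Poisson process of rate `c`. When `c` is much larger than typical values of the intensity, most of the combined points are simulated, which is the loss of power noted above.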

7. Example: using focal mechanisms in ETAS

(Kagan 1998) Motivation: Earthquakes occur primarily along faults. Aftershocks are thought to lie elliptically around previous mainshocks. Most current models such as ETAS, which are used to produce earthquake forecasts, do not take earthquake orientation into account. One may hope to improve existing models for earthquake forecasting by incorporating the orientations of previous earthquakes.

1906 SF earthquake damage (USGS) Main purpose: improved forecasts. Other purposes of ETAS * Baseline model for comparisons (Ogata and Zhuang 2006) * Comparing different zones or catalogs (Kagan et al. 2009) * Testing hypotheses about static or dynamic triggering (Hainzl and Ogata 2005) * Declustering (Zhuang et al. 2002) * Detecting anomalous seismic behavior (Ogata 2007)

ETAS: no use of focal mechanisms. Focal mechanisms summarize the principal direction of motion in an earthquake, as well as resulting stress changes and tension/pressure axes.

In ETAS (Ogata 1998), $\lambda(t,x,m) = f(m)[\mu(x) + \sum_i g(t-t_i, x-x_i, m_i)]$, where $f(m)$ is exponential, $\mu(x)$ is estimated by kernel smoothing, and the spatial triggering component, in polar coordinates, has the form $g(r, \theta) = (r^2 + d)^{-q}$. Looking at inter-event distances in Southern California, as a function of the direction $\theta_i$ of the principal axis of the prior event, suggests: $g(r, \theta; \theta_i) = g_1(r)\, g_2(\theta - \theta_i \mid r)$, where $g_1$ is the tapered Pareto distribution, and $g_2$ is the wrapped exponential.

It is well known that aftershocks occur primarily near the fault plane of their associated mainshocks (William and Frohlich 1987, Michael 1989, Kagan 1992). One way to model anisotropic clustering is to * identify clusters * fit a skewed bivariate normal distribution to aftershock locations Problems: Cluster identification is difficult and often subjective. Useless for real-time hazard estimation. Bivariate normal does not fit well.

Instead, one may incorporate focal mechanisms in ETAS by letting the triggering function g depend on the angle relative to the estimated fault plane of the previous earthquake.

Distance to next event, in relation to nodal plane of prior event (So. CA strikeslips, , M≥3.0, quality A or B, SCEDC).

TPWE MODEL. In ordinary ETAS (Ogata 1998), $\lambda(t,x,m) = f(m)[\mu(x) + \sum_i g(t-t_i, x-x_i, m_i)]$, where $f(m)$ is the magnitude density (e.g. Gutenberg-Richter), $\mu(x)$ is estimated by kernel smoothing, and the spatial triggering component, in polar coordinates, has the form $g(r, \theta) = (r^2 + d)^{-q}$. Looking at inter-event distances in Southern California and angles $\theta$ relative to the direction $\theta_i$ of the principal axis of the prior event suggests: $g(r, \theta) = g_1(r)\, g_2(\theta - \theta_i \mid r)$, where $g_1$ is the tapered Pareto distribution, and $g_2$ is the wrapped exponential.
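For reference, the two component densities could be coded as below. The slides do not give the parameterizations, so the forms here are assumptions: a tapered Pareto with lower cutoff `a`, shape `beta`, and taper scale `taper`, with survival function $(a/r)^{\beta} \exp\{(a - r)/\text{taper}\}$ for $r \ge a$, and a wrapped exponential obtained by wrapping an exponential variable with rate `rate` around the circle. The sketch also ignores any dependence of the angular density on distance.

```python
import math

def tapered_pareto_survival(r, a, beta, taper):
    """P(R > r) for the tapered Pareto; equals 1 below the cutoff a."""
    if r < a:
        return 1.0
    return (a / r) ** beta * math.exp((a - r) / taper)

def tapered_pareto_pdf(r, a, beta, taper):
    """Density: minus the derivative of the survival function (0 below the cutoff)."""
    if r < a:
        return 0.0
    return (beta / r + 1.0 / taper) * tapered_pareto_survival(r, a, beta, taper)

def wrapped_exponential_pdf(angle, rate):
    """Density on [0, 2*pi) of an exponential(rate) variable taken modulo 2*pi."""
    angle = angle % (2.0 * math.pi)
    return rate * math.exp(-rate * angle) / (1.0 - math.exp(-2.0 * math.pi * rate))

# Hypothetical parameter values, for illustration only.
print(tapered_pareto_pdf(2.0, a=1.0, beta=1.0, taper=5.0))
print(wrapped_exponential_pdf(0.3, rate=1.5))
```

The taper makes the tail decay exponentially at large distances, unlike a pure Pareto, while the wrapped exponential concentrates angular separations near zero, i.e. near the orientation of the prior event.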

[Figure panels: tapered Pareto / wrapped exp.; biv. normal (Ogata 1998); Cauchy / ellipsoidal (Kagan 1996)]

Model Assessment: Akaike Information Criterion (AIC): AIC = -2 log(likelihood) + 2p, where p is the number of fitted parameters. [Lower is better. Approx. $\chi^2$ distributed if larger model is correct, so difference of 2 is stat. sig.] Big (> 300 point) difference due to using the wrapped exponential for angular separations instead of uniform distribution. Also > 300 point difference due to tapered Pareto distribution of distances instead of Pareto.
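The AIC comparison amounts to a few lines of arithmetic, sketched below. The log-likelihood values and parameter counts are made-up placeholders used only to show the calculation; they are not results from the talk.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: -2 log(likelihood) + 2 p. Lower is better."""
    return -2.0 * log_likelihood + 2.0 * n_params

# Placeholder (log-likelihood, number of parameters) pairs, NOT real fitted values.
fits = {"model A": (-5230.0, 5), "model B": (-5050.0, 7)}
scores = {name: aic(ll, p) for name, (ll, p) in fits.items()}
best = min(scores, key=scores.get)
for name, score in scores.items():
    print(name, score, "difference from best:", score - scores[best])
```

The extra parameters of a larger model are penalized by 2 points each, so a larger model is preferred only when its log-likelihood gain exceeds the number of added parameters.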

TPWE vastly outperforms the normal model. Normal underpredicts density near origin (i.e. mainshock). [Figure panels: Cauchy/ellipse; Normal; TPWE]

Thinned residuals. [Figure panels: Data; tapered Pareto / wrapped exp.; Cauchy / ellipsoidal (Kagan 1996); biv. normal (Ogata 1998)]

[Figure panels: tapered Pareto / wrapped exp.; Cauchy / ellipsoidal]

Conclusions:
* Point process model evaluation is still an unsolved problem.
* Pixel-based methods have problems: non-normality, high variability, low power, arbitrariness in choice of pixel size.
* Comparative methods and tessellation residuals seem more promising.
* Numerical summary statistics and error diagrams provide very limited information.
* Rescaling, thinning, and superposition also have problems: can have low power, especially when the intensity is volatile.
* Focal mechanism estimates should be used to improve triggering functions in ETAS models.
* Distances appear to be approx. tapered Pareto (TP), not Pareto.
* Angular separations, relative to mainshock orientation, appear to be approximately wrapped exponential (WE).
* Bivariate normal model assigns much too little density far away from mainshocks.
* Cauchy/ellipsoidal model assigns too little density near the origin, i.e. near the mainshock.