Vertically Integrated Seismological Analysis II: Inference (S31B-1713)
Nimar S. Arora, Stuart Russell, and Erik B. Sudderth
nimar@cs.berkeley.edu, russell@cs.berkeley.edu, sudderth@cs.brown.edu

The Model

The generative model for seismic events and station arrivals:

    # SeismicEvents ~ Poisson[TIME_DURATION * EVENT_RATE];
    IsEarthQuake(e) ~ Bernoulli(.5);
    EventLocation(e)
        If IsEarthQuake(e) ~ EarthQuakeDistribution()
        Else ~ UniformEarthDistribution();
    Magnitude(e) ~ Exponential(log(10)) + MIN_MAG;
    Distance(e, s) = GeographicalDistance(EventLocation(e), SiteLocation(s));
    IsDetected(e, s) ~ Logistic[SITE_COEFFS(s)](Magnitude(e), Distance(e, s), Distance(e, s)**2);
    #Arrivals(site = s) ~ Poisson[TIME_DURATION * FALSE_RATE(s)];
    #Arrivals(event = e, site = s)
        If IsDetected(e, s) = 1
        Else = 0;
    Time(a)
        If (event(a) = null) ~ Uniform(0, TIME_DURATION)
        Else = IASPEI-TIME(EventLocation(event(a)), SiteLocation(site(a))) + TimeRes(a);
    TimeRes(a) ~ Laplace(TIMLOC(site(a)), TIMSCALE(site(a)));
    Azimuth(a)
        If (event(a) = null) ~ Uniform(0, 360)
        Else = AddAngle(GeographicalAzimuth(EventLocation(event(a)), SiteLocation(site(a))), AzRes(a));
    AzRes(a) ~ Laplace(0, AZSCALE(site(a)));
    Slow(a)
        If (event(a) = null) ~ Uniform(0, 20)
        Else = IASPEI-SLOW(EventLocation(event(a)), SiteLocation(site(a))) + SlowRes(a);
    SlowRes(a) ~ Laplace(0, SLOSCALE);

The model, combined with the actual observations of the arrivals, defines a posterior probability density p(x) over the number, type, and locations of the seismic events, where x is a possible world.

Markov Chain Monte Carlo

We use Markov chain Monte Carlo (MCMC; Gilks et al., 1996) to infer p(x): we sample from a Markov chain whose stationary distribution is p(x). To construct this chain, we design moves that transition between hypotheses. The birth and death moves create and destroy events, respectively; the switch-arrival move changes the event associated with an arrival; and the random-walk move changes the location and other parameters of an event.

MCMC Example

The proposal density is constructed by inverting the arrivals. The initial world has a number of spurious events, and the death move quickly kills off most of them. Over some iterations all the events are proposed, but their locations may not be very good; gradually, the random-walk and switch-arrival moves improve the locations of all the events. The samples collected from the Markov chain can then be used to infer the posterior density.
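The poster does not include implementation code for this loop, so the following Python sketch is only an illustration of the move-based Metropolis-Hastings scheme described above. All names here (run_chain, log_posterior, the contents of moves) are hypothetical placeholders, and each proposal function is assumed to return the log proposal ratio needed for the acceptance test.

    import math
    import random

    def run_chain(initial_world, log_posterior, moves, n_iters=100000):
        """Move-based Metropolis-Hastings over possible worlds.

        moves: list of proposal functions (e.g. birth, death, switch-arrival,
        random-walk), each mapping a world x to a pair
        (x', log q(x | x') - log q(x' | x)).
        """
        world = initial_world
        samples = []
        for _ in range(n_iters):
            propose = random.choice(moves)
            new_world, log_q_ratio = propose(world)
            # Accept with probability min(1, p(x') q(x|x') / (p(x) q(x'|x))).
            log_alpha = (log_posterior(new_world)
                         - log_posterior(world) + log_q_ratio)
            if log_alpha >= 0 or random.random() < math.exp(log_alpha):
                world = new_world
            samples.append(world)
        return samples

The retained worlds approximate draws from p(x), so posterior quantities such as the expected number of events can be estimated by averaging over them.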
Evaluation

We take LEB, the human-annotated bulletin, as ground truth, and we evaluate our system against SEL3, the current automated bulletin, using the same arrivals that are available to SEL3. Predictions are scored by computing a min-cost max-cardinality matching of the predicted events to the ground-truth events, where the cost of an edge is the distance between the predicted and the true event locations; any edge with more than 50 seconds or 5 degrees of error is excluded from the matching (a sketch of this computation appears at the end of this section). We report precision (the percentage of predicted events that are matched), recall (the percentage of true events that are matched), F1 (the harmonic mean of precision and recall), and the average cost of the matching.

Dataset

- 76 days of parametric data (i.e., arrivals marked by automated station processing) for training.
- 7 days of validation data (results below).
- 7 days of test data (not currently used).

Results

                                   F1     Precision / Recall   Error / S.D. (km)   Average Log-likelihood
    SEL3 (IDC Automated)          55.6        46.2 / 69.7           98 / 119                  -
    VISA (Best Start)             80.4        70.9 / 92.9          100 / 117              -1784
    VISA (SEL3 Start)             55.2        44.3 / 73.4          104 / 124              -1791
    VISA (Back-projection Start)  50.6        49.1 / 52.0          126 / 139              -1818

Limitations of the Model

- Based on arrivals identified by automated station processing (i.e., not based on waveforms, yet!).
- Relies only on the first P-arrival.

Analysis of Errors

- The Markov chain is not converging fast enough; we need better moves to avoid local minima.
- Automated station processing has a systematic bias toward picking arrivals late; we need to build models directly on waveforms.
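For concreteness, the gated min-cost matching and the derived precision/recall/F1 described under Evaluation could be computed as in the Python sketch below. This is an illustration under stated assumptions, not the authors' code: time_error and location_error are hypothetical helpers returning errors in seconds and degrees, and forbidden edges are assigned a prohibitively large sentinel cost so the assignment solver avoids them whenever feasible edges exist.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    FORBIDDEN = 1e9  # sentinel: large enough to dominate any feasible cost

    def match_events(pred, true, time_error, location_error):
        """Min-cost matching of predicted to true events, excluding any
        pairing with more than 50 s or 5 degrees of error."""
        cost = np.full((len(pred), len(true)), FORBIDDEN)
        for i, p in enumerate(pred):
            for j, t in enumerate(true):
                if time_error(p, t) <= 50.0 and location_error(p, t) <= 5.0:
                    cost[i, j] = location_error(p, t)
        rows, cols = linear_sum_assignment(cost)
        # Keep only the feasible (non-sentinel) edges of the assignment.
        return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < FORBIDDEN]

    def bulletin_scores(n_pred, n_true, matches):
        precision = len(matches) / max(n_pred, 1)
        recall = len(matches) / max(n_true, 1)
        f1 = 2 * precision * recall / max(precision + recall, 1e-12)
        return precision, recall, f1

Because every feasible edge costs far less than the sentinel, the minimum-cost assignment uses as many feasible edges as possible, approximating the min-cost max-cardinality matching; the average cost of the matching is the mean edge cost over the returned pairs.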