Efficient Cosmological Parameter Estimation with Hamiltonian Monte Carlo. Amir Hajian, Cosmo06 – September 25, 2006. astro-ph/0608679.

Parameter estimation. (Figure: NASA/WMAP science team; M. White 1997.)

The Problem
– Power spectrum calculation takes a long time for large l
– Likelihood evaluation takes time too
– Lengthy chains are needed, especially for:
  – curved distributions
  – non-Gaussian distributions
  – high-dimensional parameter spaces

Possible Solutions
Speed up the calculations:
– parallel computation
– power spectrum: CMBWarp, Jimenez et al (2004); Pico, Fendt & Wandelt (2006); CosmoNet, Auld et al (2006)
– likelihood
Improve the MCMC method:
– reparametrization, e.g. Verde et al (2003)
– optimized step size, e.g. Dunkley et al (2004)
– parallel chains
– use more efficient MCMC algorithms, e.g. CosmoMC, Cornish et al (2005), HMC

Traditional (Random Walk) Metropolis Algorithm: start from the current position x, with target density p(x).

Traditional (Random Walk) Metropolis Algorithm: propose a new position x*. If p(x*) > p(x), accept the step.

Traditional (Random Walk) Metropolis Algorithm: if p(x*) < p(x), accept the step with probability p(x*)/p(x); otherwise take another sample at x.

Traditional (Random Walk) Metropolis Algorithm
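
The steps above can be summarized in a short sketch. This is illustrative Python, not code from the talk; the function name `metropolis`, the target density, the step size and the chain length are arbitrary choices:

```python
import numpy as np

def metropolis(log_p, x0, n_steps=10000, step_size=0.5, seed=0):
    """Random-walk Metropolis: Gaussian proposals around the current point."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    samples, n_accept = [], 0
    logp_x = log_p(x)
    for _ in range(n_steps):
        x_star = x + step_size * rng.standard_normal(x.shape)  # proposed position
        logp_star = log_p(x_star)
        # Accept with probability min(1, p(x*)/p(x)); otherwise repeat the old sample.
        if np.log(rng.uniform()) < logp_star - logp_x:
            x, logp_x = x_star, logp_star
            n_accept += 1
        samples.append(x.copy())
    return np.array(samples), n_accept / n_steps

# Example: sample a 1-D standard Gaussian, log p(x) = -x^2/2 (up to a constant).
chain, acc_rate = metropolis(lambda x: -0.5 * np.sum(x**2), x0=[3.0])
```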

Issues with MCMC
– long burn-in time
– correlated samples
– low efficiency in high dimensions
– low acceptance rate

Hamiltonian Monte Carlo
Proposed by Duane et al, Phys. Lett. B, 1987. Used by condensed-matter physicists, particle physicists and statisticians. Uses Hamiltonian dynamics to perform large, uncorrelated jumps in parameter space.

Hamiltonian Monte Carlo: define the potential energy U(x) = −log p(x).

Hamiltonian Monte Carlo: give the system an initial momentum u. Total energy: H(x,u) = U(x) + u²/2.

Hamiltonian Monte Carlo: evolve the system for a given time with Hamiltonian dynamics, reaching a proposed point x* with energy H(x*) = U(x*) + K*, where K* is the kinetic energy at the end of the trajectory.

H is conserved, but only if the dynamics are integrated accurately.

Hamiltonian dynamics (in practice)
Discretized time steps: the leapfrog method with step size ε:
u(t+ε/2) = u(t) − (ε/2) U′(x(t))
x(t+ε) = x(t) + ε u(t+ε/2)
u(t+ε) = u(t+ε/2) − (ε/2) U′(x(t+ε))
The total energy may not remain exactly conserved, so accept the proposed position according to the Metropolis rule.
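
A minimal sketch of a single HMC update built from the leapfrog steps above (illustrative, not the talk's code; `U`, `grad_U`, the step size `eps` and the number of leapfrog steps are assumed inputs that need tuning):

```python
import numpy as np

def hmc_step(U, grad_U, x, eps=0.1, n_leapfrog=20, rng=None):
    """One Hamiltonian Monte Carlo update for a target p(x) proportional to exp(-U(x))."""
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)           # draw a fresh momentum
    x_new, u_new = x.copy(), u.copy()
    H_old = U(x) + 0.5 * np.dot(u, u)          # total energy at the start

    # Leapfrog integration of Hamilton's equations.
    u_new -= 0.5 * eps * grad_U(x_new)         # half step in momentum
    for i in range(n_leapfrog):
        x_new += eps * u_new                   # full step in position
        if i < n_leapfrog - 1:
            u_new -= eps * grad_U(x_new)       # full step in momentum
    u_new -= 0.5 * eps * grad_U(x_new)         # final half step in momentum

    # Metropolis accept/reject on the (approximately conserved) total energy.
    H_new = U(x_new) + 0.5 * np.dot(u_new, u_new)
    if np.log(rng.uniform()) < H_old - H_new:
        return x_new, True
    return x, False
```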

Extended target density: sample (x, u) from the density proportional to exp(−H(x,u)); the marginal distribution of x is p(x).
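
A one-line check, in the notation above (U(x) = −log p(x), kinetic term u²/2), that integrating the extended density over the momentum recovers the target:

```latex
e^{-H(x,u)} = e^{-U(x)}\,e^{-u^{2}/2} = p(x)\,e^{-u^{2}/2},
\qquad
\int_{-\infty}^{\infty} e^{-H(x,u)}\,\mathrm{d}u = \sqrt{2\pi}\,p(x) \;\propto\; p(x).
```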

How does it work? Assume a Gaussian distribution; the trajectories in the (x, u) phase space are then closed curves of constant H. Randomizing the momentum at the beginning of each leapfrog trajectory guarantees coverage of the whole space. (Fig. K. Hanson, 2001.)

Hamiltonian Monte Carlo

Important questions
– Are we sampling from the distribution of interest?
– Are we seeing the whole parameter space?
– How many samples do we need to estimate the parameters of interest to a desired precision?
– How efficient is our algorithm?

Convergence diagnostics. Autocorrelation of the chain: C(j) = ⟨(x_i − x̄)(x_{i+j} − x̄)⟩ / σ², which should fall off quickly with lag j for a well-mixed chain.
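
A small numpy sketch of this diagnostic (illustrative, not the talk's code; the function name and the maximum lag are arbitrary choices):

```python
import numpy as np

def autocorrelation(chain, max_lag=100):
    """Normalized autocorrelation C(j) of a 1-D chain, for lags j = 0..max_lag."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    var = x.var()
    # max_lag should be well below the chain length for the estimate to make sense.
    return np.array([np.mean(x[:len(x) - j] * x[j:]) / var for j in range(max_lag + 1)])
```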

Convergence diagnostics. Take the Fourier transform of the chain, x_i → ξ_k (via FFT), and compute its power spectrum P(k) = |ξ_k|².

Convergence diagnostics. The power spectrum P(k), averaged over chains: a flat spectrum corresponds to an ideal (uncorrelated) sampler.

Efficiency of an MCMC sequence: the ratio of the number of independent draws from the target pdf to the number of MCMC iterations required to achieve the same variance in an estimated quantity. For a Gaussian distribution, E = σ²/P₀, where P₀ = P(k=0) is the chain's power spectrum at zero wavenumber. See Dunkley et al (2004) for more details.
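
A rough numerical version of this diagnostic (a sketch under the definitions above, not the talk's code): estimate P(k) from the FFT of a single parameter's chain and compare the sample variance to P(k→0). For an independent chain the ratio is close to 1.

```python
import numpy as np

def chain_efficiency(chain):
    """Estimate E ~ sigma^2 / P(k=0) from one parameter's MCMC chain."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = len(x)
    power = np.abs(np.fft.rfft(x))**2 / n   # raw power spectrum P(k)
    # Estimate P(k -> 0) from the lowest few non-zero modes (k = 0 itself is
    # removed by mean subtraction); a crude stand-in for the spectral fit
    # used by Dunkley et al (2004).
    p0 = power[1:6].mean()
    return x.var() / p0
```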

Example: a Gaussian pdf sampled with different chains, one with low efficiency and one with better efficiency.

Example. The simplest example: a Gaussian distribution. Energy: U(x) = x²/(2σ²), so H(x,u) = x²/(2σ²) + u²/2.
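
For this simplest example, the energy and its gradient could be coded as below (a sketch; a zero-mean Gaussian of width `sigma` is assumed), ready to be passed to the `hmc_step` sketch given earlier:

```python
import numpy as np

sigma = 1.0  # width of the target Gaussian (illustrative choice)

def U(x):
    # Potential energy U(x) = -log p(x) = x^2 / (2 sigma^2), up to a constant.
    return 0.5 * np.sum(x**2) / sigma**2

def grad_U(x):
    # Gradient dU/dx = x / sigma^2.
    return np.asarray(x, dtype=float) / sigma**2
```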

Comparison: Acceptance Rate HMC ~ 100% MCMC ~ 25%

Comparison: Correlations

Comparison: distributions

Comparison: efficiency. Compare to the ~1/D behavior of the efficiency of traditional MCMC methods, where D is the number of dimensions.

Cosmological Applications

Flat 6-parameter ΛCDM model. 0th approximation: approximate the −ln(likelihood) by a simple fitting function (e.g. a quadratic form), estimating the fit parameters from an exploratory MCMC run. Evaluate the gradients from this approximation and run HMC.
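
One way this 0th approximation could look in code (a hedged sketch, not the paper's implementation; the exploratory chain, its shape and the quadratic form of the fit are assumptions):

```python
import numpy as np

def quadratic_gradient(exploratory_chain):
    """Build an approximate gradient of -ln(likelihood) from an exploratory MCMC run.

    exploratory_chain: array of shape (n_samples, n_params) from a short chain.
    """
    mean = exploratory_chain.mean(axis=0)
    cov = np.cov(exploratory_chain, rowvar=False)
    inv_cov = np.linalg.inv(cov)

    def grad_U(x):
        # Gradient of the quadratic approximation U(x) ~ (x - mean)^T C^{-1} (x - mean) / 2.
        return inv_cov @ (x - mean)

    return grad_U
```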

Result: the acceptance rate is boosted to 81% while the correlation in the chain is reduced. A good improvement, but we can do better!

Better approximation for the gradients: the likelihood routine of Pico (Fendt & Wandelt, 2006) is modified to evaluate the gradient.

Lico (the likelihood routine of Pico): cut the parameter space into pieces and fit a different function to each piece.

The Gradient

Flat 6-parameter ΛCDM model: acceptance rate 98%.

Correlation lengths

Summary
– HMC is a simple algorithm that can improve the efficiency of MCMC chains dramatically.
– HMC can easily be added to popular parameter estimation software such as CosmoMC and AnalyzeThis!
– HMC can be used along with methods for speeding up power spectrum and likelihood calculations.
– HMC is ideal for curved, non-Gaussian and hard-to-converge distributions.
– Approximations made in evaluating the gradient only reduce the acceptance rate; they do not propagate into the results of parameter estimation.
– It is easy to get a non-optimized HMC, but hard to get a wrong answer!