Efficient Cosmological Parameter Estimation with Hamiltonian Monte Carlo
Amir Hajian
Cosmo06 – September 25, 2006
Astro-ph/
Parameter estimation
[Figure: NASA/WMAP science team; after M. White 1997]
The Problem
– Power spectrum calculation takes a long time for large l
– Likelihood evaluation takes time too
– Lengthy chains are needed, especially for:
  – curved distributions
  – non-Gaussian distributions
  – high-dimensional parameter spaces
Possible Solutions
Speed up the calculations:
– Parallel computation
– Power spectrum: CMBWarp, Jimenez et al (2004); Pico, Fendt & Wandelt (2006); CosmoNet, Auld et al (2006)
– Likelihood
Improve the MCMC method:
– Reparametrization, e.g. Verde et al (2003)
– Optimized step-size, e.g. Dunkley et al (2004)
– Parallel chains
– More efficient MCMC algorithms, e.g. CosmoMC, Cornish et al (2005), HMC
Traditional (Random Walk) Metropolis Algorithm
– Start at the current position x, with density p(x)
– Propose a new position x*, with density p(x*)
– If p(x*) > p(x): accept the step
– If p(x*) < p(x): accept the step with probability p(x*)/p(x); otherwise stay at x and record it as another sample
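The steps above can be sketched in a few lines of code (a minimal 1-D illustration of the algorithm, not the implementation used for the talk; function and variable names are ours):

```python
import numpy as np

def metropolis(log_p, x0, n_steps, step_size, rng):
    """Random-walk Metropolis sampler (1-D sketch)."""
    x = x0
    lp = log_p(x)
    samples = np.empty(n_steps)
    accepted = 0
    for i in range(n_steps):
        x_star = x + step_size * rng.normal()   # symmetric random-walk proposal
        lp_star = log_p(x_star)
        # Accept with probability min(1, p(x*)/p(x)), done in log space
        if np.log(rng.uniform()) < lp_star - lp:
            x, lp = x_star, lp_star
            accepted += 1
        samples[i] = x                          # on rejection, x is recorded again
    return samples, accepted / n_steps

# Example: sample a unit Gaussian, log p(x) = -x^2/2 (up to a constant)
rng = np.random.default_rng(0)
samples, rate = metropolis(lambda x: -0.5 * x**2, 0.0, 20_000, 2.4, rng)
```

Note that a rejected proposal still produces a sample (the current point is repeated) — this is what makes the recorded chain have the correct stationary distribution.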
Issues with MCMC
– Long burn-in time
– Correlated samples
– Low efficiency in high dimensions
– Low acceptance rate
Hamiltonian Monte Carlo
– Proposed by Duane et al, Phys. Lett. B, 1987
– Used by condensed matter physicists, particle physicists and statisticians
– Uses Hamiltonian dynamics to make big, uncorrelated jumps in parameter space
Hamiltonian Monte Carlo
Define the potential energy: U(x) = -log p(x)
Hamiltonian Monte Carlo
Give the system an initial momentum u. Total energy: H(x, u) = U(x) + u²/2
Hamiltonian Monte Carlo
Evolve the system for a given time under Hamiltonian dynamics; at the new point, H(x*, u*) = U(x*) + K(u*), with kinetic energy K(u*) = u*²/2.
H is conserved — but only if the dynamics is integrated exactly.
Hamiltonian dynamics (in practice)
– Discretized time-steps: the leapfrog method
  u(t + ε/2) = u(t) − (ε/2) ∂U/∂x |_{x(t)}
  x(t + ε) = x(t) + ε u(t + ε/2)
  u(t + ε) = u(t + ε/2) − (ε/2) ∂U/∂x |_{x(t+ε)}
– With finite steps the total energy may not remain exactly conserved
– Accept the proposed position according to the Metropolis rule, with probability min(1, exp(−ΔH))
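The leapfrog update can be written compactly as follows (a minimal sketch; the harmonic-oscillator check at the end is our own test case, not from the talk):

```python
import numpy as np

def leapfrog(x, u, grad_U, eps, n_steps):
    """Leapfrog integration of Hamiltonian dynamics:
    half-step in momentum, alternating full steps, half-step in momentum."""
    u = u - 0.5 * eps * grad_U(x)          # initial half-step in momentum
    for _ in range(n_steps - 1):
        x = x + eps * u                    # full step in position
        u = u - eps * grad_U(x)            # full step in momentum
    x = x + eps * u
    u = u - 0.5 * eps * grad_U(x)          # final half-step in momentum
    return x, u

# Harmonic oscillator, U(x) = x^2/2: the energy drift stays small for small eps
U = lambda x: 0.5 * x**2
grad_U = lambda x: x
x0, u0 = 1.0, 0.5
x1, u1 = leapfrog(x0, u0, grad_U, eps=0.1, n_steps=100)
dH = (U(x1) + 0.5 * u1**2) - (U(x0) + 0.5 * u0**2)
```

Because the leapfrog is symplectic, the energy error stays bounded rather than growing, which is why ΔH (and hence the Metropolis rejection rate) remains small even over long trajectories.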
Extended Target Density
Sample from the joint density ∝ exp(−H(x, u)) = p(x) exp(−u²/2); the marginal distribution of x is p(x).
How does it work?
Assume a Gaussian distribution; the trajectories in phase space are closed orbits. Randomizing the momentum at the beginning of each trajectory guarantees coverage of the whole space. Fig. K. Hanson, 2001
Important questions
– Are we sampling from the distribution of interest?
– Are we seeing the whole parameter space?
– How many samples do we need to estimate the parameters of interest to a desired precision?
– How efficient is our algorithm?
Convergence Diagnostics
Autocorrelation of the chain: C(j) = ⟨x_i x_{i+j}⟩ − ⟨x_i⟩²
Convergence Diagnostics
Take the FFT of the chain {x_i} and form its power spectrum, P(k) = |x̃_k|².
Convergence Diagnostics
The averaged power spectrum P(k) is flat for an ideal (uncorrelated) sampler.
Efficiency of an MCMC sequence: the ratio of the number of independent draws from the target pdf to the number of MCMC iterations required to achieve the same variance in an estimated quantity. For a Gaussian distribution, E = σ_x²/P₀, where P₀ = P(k=0) is the chain's power spectrum at zero frequency: the variance of the chain mean is P₀/N, versus σ_x²/n for n independent draws, so equal variance gives E = n/N = σ_x²/P₀. See Dunkley et al (2004) for more details.
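The efficiency defined above can be estimated directly from the chain with an FFT. A minimal sketch (the normalization convention and the AR(1) test chain are ours — for an AR(1) process with coefficient ρ the true efficiency is (1−ρ)/(1+ρ)):

```python
import numpy as np

def efficiency(chain, n_low=100):
    """Estimate E = var(x) / P(0) by averaging the periodogram over the
    lowest-frequency bins to approximate P(k -> 0)."""
    x = chain - chain.mean()
    P = np.abs(np.fft.rfft(x))**2 / len(x)   # periodogram of the chain
    P0 = P[1:n_low + 1].mean()               # skip k = 0 (the mean mode)
    return x.var() / P0

# AR(1) chain, x_i = rho * x_{i-1} + noise: true efficiency (1 - rho)/(1 + rho)
rng = np.random.default_rng(1)
rho, N = 0.9, 200_000
noise = rng.normal(size=N)
chain = np.empty(N)
chain[0] = 0.0
for i in range(1, N):
    chain[i] = rho * chain[i - 1] + noise[i]
E = efficiency(chain)   # analytic value for rho = 0.9 is 0.1/1.9 ~ 0.053
```

Averaging over the lowest bins trades a little bias (the spectrum is not exactly flat near k = 0) for a large reduction in the periodogram's scatter.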
Example: a Gaussian PDF sampled with different chains — one with low efficiency, one with better efficiency.
Example
Simplest example: a Gaussian distribution with variance σ². Energy: H(x, u) = x²/(2σ²) + u²/2
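Putting the leapfrog and the Metropolis correction together for this Gaussian example (σ = 1), a complete HMC sampler is only a few lines — a minimal sketch, not the code behind the comparisons on the following slides:

```python
import numpy as np

def hmc_gaussian(n_samples, eps=0.1, n_leap=20, seed=0):
    """HMC for p(x) = N(0, 1): U(x) = x^2/2, grad U(x) = x."""
    rng = np.random.default_rng(seed)
    x = 0.0
    samples = np.empty(n_samples)
    accepted = 0
    for i in range(n_samples):
        u = rng.normal()                    # fresh momentum for each trajectory
        x_new, u_new = x, u
        # leapfrog trajectory of n_leap steps
        u_new -= 0.5 * eps * x_new
        for _ in range(n_leap - 1):
            x_new += eps * u_new
            u_new -= eps * x_new
        x_new += eps * u_new
        u_new -= 0.5 * eps * x_new
        # Metropolis correction on the total energy
        dH = (0.5 * x_new**2 + 0.5 * u_new**2) - (0.5 * x**2 + 0.5 * u**2)
        if np.log(rng.uniform()) < -dH:
            x = x_new
            accepted += 1
        samples[i] = x
    return samples, accepted / n_samples

samples, rate = hmc_gaussian(5000)
```

With a reasonable step size the energy error per trajectory is tiny, so nearly every proposal is accepted — the near-100% acceptance rate quoted on the next slide.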
Comparison: Acceptance Rate HMC ~ 100% MCMC ~ 25%
Comparison: Correlations
Comparison: distributions
Comparison: Efficiency
Compare to the 1/D scaling of the efficiency of traditional MCMC methods in D dimensions.
Cosmological Applications
Flat 6-parameter LCDM model
Zeroth approximation: approximate the −ln(likelihood) by a fitting form whose parameters are estimated from an exploratory MCMC run. Evaluate the gradients, then run HMC.
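One way to realize such a zeroth approximation (our sketch, under the assumption that −ln L is approximated by a quadratic form fitted to the exploratory chain; the helper names are ours):

```python
import numpy as np

def quadratic_grad(chain):
    """Fit -ln L ~ 0.5 (x - mu)^T C^{-1} (x - mu) to an exploratory chain
    and return the resulting analytic gradient, grad U(x) = C^{-1} (x - mu)."""
    mu = chain.mean(axis=0)
    C_inv = np.linalg.inv(np.cov(chain, rowvar=False))
    return lambda x: C_inv @ (x - mu)

# Toy "exploratory chain": correlated 2-D Gaussian draws (hypothetical numbers)
rng = np.random.default_rng(2)
true_cov = np.array([[1.0, 0.6], [0.6, 2.0]])
chain = rng.multivariate_normal([0.3, -0.1], true_cov, size=50_000)
grad_U = quadratic_grad(chain)   # feed this into the leapfrog integrator
```

Crucially, an approximate gradient only degrades the energy conservation along trajectories; the Metropolis step still targets the exact posterior.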
Result
The acceptance rate is boosted to 81% while the correlation in the chain is reduced. A good improvement, but we can do better!
Better approximation for the gradients
The likelihood routine of Pico (Fendt & Wandelt, 2006) was modified to evaluate the gradient.
Lico (likelihood routine of Pico)
Cut the parameter space into pieces and fit a different function F(x) to each piece.
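The idea can be illustrated in one dimension (a hypothetical example, not Pico's actual fitting scheme): split the domain, fit a low-order polynomial per piece, and differentiate each fit analytically to get the gradient:

```python
import numpy as np

def piecewise_fit(x, f, n_pieces, deg=3):
    """Fit a separate polynomial to each piece of the domain; return
    functions evaluating the fit and its analytic derivative."""
    edges = np.linspace(x.min(), x.max(), n_pieces + 1)
    polys = []
    for a, b in zip(edges[:-1], edges[1:]):
        m = (x >= a) & (x <= b)
        polys.append(np.poly1d(np.polyfit(x[m], f[m], deg)))

    def which(t):
        # locate the piece containing t (clamp the right endpoint)
        return min(np.searchsorted(edges, t, side="right") - 1, n_pieces - 1)

    fit = lambda t: polys[which(t)](t)
    dfit = lambda t: polys[which(t)].deriv()(t)
    return fit, dfit

# Toy "-ln L" surface: fit sin(t) on [0, 2*pi] with 4 cubic pieces
t = np.linspace(0.0, 2 * np.pi, 400)
fit, dfit = piecewise_fit(t, np.sin(t), n_pieces=4)
```

Fitting per piece keeps each polynomial low-order and accurate locally, so both the function and its derivative are cheap to evaluate inside the HMC trajectory.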
The Gradient
Flat 6-parameter LCDM model
Acceptance rate: 98%
Correlation lengths
Summary
– HMC is a simple algorithm that can improve the efficiency of MCMC chains dramatically.
– HMC can easily be added to popular parameter estimation packages such as CosmoMC and AnalyzeThis!
– HMC can be used alongside methods for speeding up power spectrum and likelihood calculations.
– HMC is ideal for curved, non-Gaussian and hard-to-converge distributions.
– Approximations made in evaluating the gradient only reduce the acceptance rate; they do not propagate into the results of parameter estimation.
– It is easy to get a non-optimized HMC, but hard to get a wrong answer!