Journées LISA-France, Meudon, May 15-16, 2006
Bayesian Parameter Estimation Techniques for LISA
Nelson Christensen, Carleton College, Northfield, Minnesota, USA



Outline of talk
- Bayesian methods, a quick review
  - Fundamentals
  - Markov chain Monte Carlo (MCMC) methods
- LISA data analysis applications
  - LISA source confusion problem
  - Time Delay Interferometry variables
  - Parameter estimation: binary inspiral signals

MCMC Collaborators
Glasgow Physics and Astronomy: Dr. Graham Woan, Dr. Martin Hendry, John Veitch
Auckland Statistics: Dr. Renate Meyer, Richard Umstätter, Christian Röver

Why Bayesian methods?
Orthodox statistical methods are concerned solely with deductions following experiments with populations:
"The trouble is that what we [statisticians] call modern statistics was developed under strong pressure on the part of biologists. As a result, there is practically nothing done by us which is directly applicable to problems of astronomy."
(Jerzy Neyman, founder of frequentist hypothesis testing.)

Why Bayesian methods?
Bayesian methods explore the joint probability space of data and hypotheses within some global model, quantifying their joint uncertainty and consistency as a scalar function p(A|B), where "|" means "given". There is only one algebra consistent with this idea (and some further, very reasonable, constraints), which leads to (amongst other things) the product rule:
p(A, B | I) = p(A | B, I) p(B | I)

Why Bayesian methods?
Bayes' theorem is the appropriate rule for updating our degree of belief (in one of several hypotheses within some world view) when we have new data:
p(H | d, I) = p(d | H, I) p(H | I) / p(d | I)
The left-hand side is the posterior; p(d | H, I) is the likelihood, p(H | I) is the prior, and p(d | I) is the evidence, or "global likelihood". We can usually calculate all these terms.
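As a concrete illustration of this update rule (not from the talk; the two hypotheses and all the numbers below are invented), Bayes' theorem for a discrete hypothesis set can be computed directly:

```python
# Toy two-hypothesis Bayes update: H0 = "noise only", H1 = "signal present".
# Prior and likelihood values are invented for illustration.
prior = {"H0": 0.9, "H1": 0.1}        # p(H|I)
likelihood = {"H0": 0.2, "H1": 0.8}   # p(d|H,I) for the observed data d

# Evidence ("global likelihood"): p(d|I) = sum over H of p(d|H,I) p(H|I)
evidence = sum(likelihood[h] * prior[h] for h in prior)

# Posterior: p(H|d,I) = p(d|H,I) p(H|I) / p(d|I)
posterior = {h: likelihood[h] * prior[h] / evidence for h in prior}
```

The division by the evidence is what makes the posterior a proper probability distribution: the values sum to one by construction.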

Marginalisation
We can also deduce the marginal probabilities. If X and Y are propositions that can take on values drawn from {x_i} and {y_j}, then
p(x_i) = Σ_j p(x_i, y_j) = Σ_j p(x_i | y_j) p(y_j),
and this gives us the probability of X when we don't care about Y. In these circumstances, Y is known as a nuisance parameter. All these relationships can be smoothly extended from discrete probabilities to probability densities, e.g.
p(x) = ∫ p(x, y) dy,
where p(y) dy is the probability that y lies in the range y to y + dy.

Markov chain Monte Carlo methods
We need to be able to evaluate marginal integrals of the form
p(a_1 | d) = ∫ p(a_1, a_2, …, a_N | d) da_2 … da_N.
The approach is to sample the space so that the density of samples reflects the posterior probability: MCMC algorithms perform random walks in the parameter space so that the probability of being in a hypervolume dV is p(a | d) dV. The random walk is a Markov chain: the transition probability of making a step depends only on the proposed location and the current location.
MCMC has a demonstrated record of success on problems with large numbers of parameters. Used by Google, WMAP, financial markets... LISA???

Metropolis-Hastings Algorithm
We want to explore p(a | d). Let the current location be a_t.
1) Choose a candidate state a' using a proposal distribution q(a' | a_t).
2) Compute the Metropolis ratio
   R = [p(a' | d) q(a_t | a')] / [p(a_t | d) q(a' | a_t)].
3) If R > 1, make the step (i.e., a_{t+1} = a'); if R < 1, make the step with probability R, otherwise set a_{t+1} = a_t, so that the location is repeated. That is, make the step with acceptance probability min(1, R).
4) Choose the next candidate based on the (new) current position…
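The algorithm above can be sketched in a few lines. This is a generic textbook implementation for a 1-D target with a symmetric Gaussian proposal (so the q-ratio in the Metropolis ratio cancels), not the code behind the results in this talk:

```python
import math
import random

def metropolis_hastings(log_p, a0, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a 1-D target density p(a).

    log_p : log of the (unnormalised) target density
    a0    : starting location
    step  : std. dev. of the symmetric Gaussian proposal
    """
    rng = random.Random(seed)
    a, chain = a0, []
    for _ in range(n_samples):
        a_prop = a + rng.gauss(0.0, step)      # candidate from the proposal
        log_r = log_p(a_prop) - log_p(a)       # Metropolis ratio (in logs)
        if log_r > 0 or rng.random() < math.exp(log_r):
            a = a_prop                         # step accepted
        chain.append(a)                        # location repeated if rejected
    return chain

# Target: standard normal, deliberately started far from the bulk of the
# probability mass; the early, pre-equilibrium part of the chain is discarded.
chain = metropolis_hastings(lambda a: -0.5 * a * a, a0=5.0, step=1.0,
                            n_samples=20000)
post_burnin = chain[2000:]
mean = sum(post_burnin) / len(post_burnin)
```

A histogram of `post_burnin` approximates the target pdf, exactly as the slide describes.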

Metropolis-Hastings Algorithm
The samples {a_t} form a Markov chain drawn from p(a), so a histogram of {a_t}, or of any of its components, approximates the (joint) pdf of those components. The form of the acceptance probability guarantees reversibility even for proposal distributions that are asymmetric. There is a burn-in period before the equilibrium distribution is reached.

Application to LISA data analysis
- Source confusion problem
- TDI variables
- Parameter estimation for signals

LISA source identification
This has implications for the analysis of LISA data, which is expected to contain many (perhaps 50,000) signals from white-dwarf binaries. The data will contain resolvable binaries and binaries that just contribute to the overall noise (either because they are faint or because their frequencies are too close together). Bayes can sort these out without having to introduce ad hoc acceptance and rejection criteria, and without needing to know the "true noise level" (whatever that means!).

Things that are not generally true
- "A time series of length T has a frequency resolution of 1/T." Frequency resolution also depends on signal-to-noise ratio: we know the period of PSR to 1e-13 Hz, but haven't been observing it for 3e5 years. In fact frequency resolution improves with signal-to-noise ratio, scaling roughly as 1/(T × snr).
- "You can subtract sources piece-wise from data." Only true if the source signals are orthogonal over the observation period.
- "Frequency confusion sets a fundamental limit for low-frequency LISA." This limit is set by parameter confusion, which includes sky location and other relevant parameters (with a precision dependent on snr).

LISA source identification
Toy (zeroth-order LISA) problem: you are given a time series of 1000 data points comprising a number of sinusoids embedded in Gaussian noise. Determine the number of sinusoids, their amplitudes, phases and frequencies, and the standard deviation of the noise. We could think of this as comparing hypotheses H_m that there are m sinusoids in the data, with m ranging from 0 to m_max. Equivalently, we could consider this a parameter-fitting problem, with m an unknown parameter within the global model. The signal is parameterised by a_m = (A_1, B_1, f_1, …, A_m, B_m, f_m), giving data
d_j = Σ_k [A_k cos(2π f_k t_j) + B_k sin(2π f_k t_j)] + n_j
and a Gaussian likelihood.

LISA source identification
With suitably chosen priors on m and a_m we can write down the full posterior pdf of the model. But this is (3m+2)-dimensional, with m ~ 100 in our toy problem, so the direct evaluation of marginal pdfs (for, say, m or σ, or to extract the pdf of a component amplitude) is unfeasible. We explore this space using a modified Markov chain Monte Carlo technique…

Reversible Jump MCMC
Trans-dimensional moves (changing m) cannot be performed in conventional MCMC; we need to make jumps from (3m+2) to (3m'+2) dimensions. Reversibility is guaranteed if the acceptance probability for an upward transition includes the Jacobian determinant of the transformation from the old parameters [and a proposal random vector r drawn from q(r)] to the new set of parameters. We use two sorts of trans-dimensional moves:
- 'split and merge', involving adjacent signals
- 'birth and death', involving single signals

Trans-dimensional split-and-merge transitions
A split transition takes the parameter subvector from a_k and splits it into two components of similar frequency but about half the amplitude.

Trans-dimensional split-and-merge transitions
A merge transition takes two parameter subvectors and merges them to their mean.
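The deterministic cores of these two moves can be sketched as follows. The exact parameterisation of the proposal randomness (the u and df arguments below) is an illustrative assumption, not necessarily the one used in the talk:

```python
def split(A, B, f, u=0.0, df=1e-4):
    """Split one sinusoid (A, B, f) into two components of similar
    frequency and about half the amplitude, as in the split transition.
    u and df stand in for the proposal random numbers (illustrative)."""
    s1 = (A / 2.0 * (1 + u), B / 2.0 * (1 + u), f - df)
    s2 = (A / 2.0 * (1 - u), B / 2.0 * (1 - u), f + df)
    return s1, s2

def merge(s1, s2):
    """Merge two sinusoid parameter triples (A, B, f) into their
    component-wise mean, as in the merge transition."""
    return tuple((x + y) / 2.0 for x, y in zip(s1, s2))
```

In a full reversible-jump implementation these maps, together with the proposal densities for u and df, would enter the acceptance probability through the Jacobian term described above.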

Initial values
A good initial choice of parameters greatly decreases the length of the 'burn-in' period needed to reach convergence (equilibrium). For simplicity we use a thresholded FFT. The threshold is set low, as it is easier to destroy bad signals than to create good ones.
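A minimal stand-in for such a thresholded FFT is sketched below (a direct O(n²) DFT for clarity; the particular threshold rule, a multiple of the mean power, is an illustrative choice):

```python
import cmath
import math

def initial_frequencies(data, threshold_factor=3.0):
    """Pick starting frequencies for the chains by thresholding the
    power spectrum of the data.  Returns the frequencies (in cycles per
    sample) of the bins whose power exceeds threshold_factor times the
    mean power.  Direct DFT for clarity; use an FFT for real work."""
    n = len(data)
    power = []
    for k in range(n // 2):
        coeff = sum(data[j] * cmath.exp(-2j * math.pi * k * j / n)
                    for j in range(n))
        power.append(abs(coeff) ** 2)
    mean_power = sum(power) / len(power)
    return [k / n for k, p in enumerate(power)
            if p > threshold_factor * mean_power]
```

With a low threshold this over-proposes candidate signals, which matches the strategy on the slide: surplus signals are cheap to kill off during the run.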

Simulations
- 1000 time samples with Gaussian noise
- 100 embedded sinusoids of the form A_m cos(2π f_m t) + B_m sin(2π f_m t)
- A's and B's chosen randomly in [-1, 1]
- f's chosen randomly in [ ]
Priors:
- A_m, B_m uniform over [-5, 5]
- f_m uniform over [ ]
- σ² has a standard vague inverse-gamma prior IG(σ²; 0.001, 0.001)
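A sketch of this simulation setup follows. The frequency range is lost in the transcript, so `f_lo` and `f_hi` below are illustrative placeholders (up to the Nyquist frequency):

```python
import math
import random

def simulate(n=1000, m=100, sigma=1.0, f_lo=0.0, f_hi=0.5, seed=1):
    """Generate the toy data set from the slide: n samples of m sinusoids
    in Gaussian noise of standard deviation sigma.  f_lo/f_hi are
    placeholders for the (elided) frequency range.  Returns the true
    (A_m, B_m, f_m) triples and the simulated time series."""
    rng = random.Random(seed)
    params = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(f_lo, f_hi))
              for _ in range(m)]
    data = []
    for t in range(n):
        s = sum(A * math.cos(2 * math.pi * f * t) +
                B * math.sin(2 * math.pi * f * t) for A, B, f in params)
        data.append(s + rng.gauss(0.0, sigma))   # signal plus noise
    return params, data

params, data = simulate()
```

The returned `params` list is the ground truth against which the MCMC reconstruction would be judged.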

Results teaser (spectral density): plots of energy density against frequency.


Strong, close signals (figure: A-f and B-f posterior scatter plots, with the 1/T scale indicated).

Signal mixing
Two signals (red and green) approaching in frequency:

Marginal pdfs for m and σ

Spectral density estimates

Joint energy/frequency posterior

Well-separated signals (~1/T)
These signals (separated by ~1 Nyquist step) can be easily distinguished and parameterized.

Closely-spaced signals
Signals can be distinguished, but parameter estimation is difficult (95% contours shown).

Source confusion: extensions to full LISA
- We have implemented an MCMC method of extracting useful information from zeroth-order LISA data under difficult conditions.
- Extension to orbital Doppler/source-location information should improve source identification. This code extension is currently being tested.
- Extension to TDI variables is straightforward. Raw Doppler measurements could also be used, with a suitable data covariance matrix.
- There is nothing special about WD-WD signals here. Similar analyses could be performed for BH mergers, EMRI sources, etc.

Time Delay Interferometry (TDI) variables
- Principal Component Analysis (PCA); see Romano and Woan, gr-qc/
- Estimate signal parameters and noise with MCMC
- All information is in the likelihood
- TDI variables fall right out

Simple Example with Correlated Noise
Data from 2 detectors:
s_1 = p + n_1 + h_1
s_2 = p + n_2 + h_2
Astrophysical signal: h_1 = 2a and h_2 = a. n_1 and n_2 are uncorrelated noise and p is common noise; all noise is zero mean, with ⟨n_1²⟩ = ⟨n_2²⟩ = σ_n² and ⟨p²⟩ = σ_p², and ⟨n_1 n_2⟩ = ⟨n_1 p⟩ = ⟨n_2 p⟩ = 0.
Likelihood: p(s_1, s_2 | a) ∝ exp[-Q/2], with Q = Σ_{ij} (s_i - h_i) C⁻¹_ij (s_j - h_j), where C is the noise covariance matrix.
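This likelihood is small enough to write out by hand; a sketch follows (the function name and calling convention are mine, and the 2x2 covariance inverse is done explicitly):

```python
import math

def log_likelihood(s1, s2, a, sigma_n2, sigma_p2):
    """Log-likelihood for the two-detector toy model above:
    s_i = p + n_i + h_i with h1 = 2a, h2 = a, common noise of variance
    sigma_p2 and independent noise of variance sigma_n2 per detector."""
    # Noise covariance C and its inverse (2x2, inverted by hand)
    c11 = c22 = sigma_p2 + sigma_n2
    c12 = sigma_p2
    det = c11 * c22 - c12 * c12
    i11 = i22 = c22 / det
    i12 = -c12 / det
    r1, r2 = s1 - 2 * a, s2 - a                 # residuals s_i - h_i
    q = r1 * r1 * i11 + 2 * r1 * r2 * i12 + r2 * r2 * i22
    return -0.5 * q - 0.5 * math.log((2 * math.pi) ** 2 * det)
```

Evaluated at the true signal amplitude the quadratic form Q reduces to pure noise, so on average the log-likelihood peaks near the true a.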

Principal Component Analysis
Find the eigenvectors of C and factorize the likelihood:
p(s_1, s_2 | a) ∝ p(s_+ | a) p(s_- | a), with s_+ = s_1 + s_2 and s_- = s_1 - s_2, where
p(s_+ | a) ∝ exp[-(s_+ - 3a)² / (8σ_p² + 4σ_n²)]
p(s_- | a) ∝ exp[-(s_- - a)² / (4σ_n²)]
For LISA σ_p² >> σ_n², so there is no loss of information in doing statistical inference only on the s_- term: TDI.
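The decorrelation step can be checked numerically: transforming the noise covariance C of (s_1, s_2) by the map to (s_+, s_-) should give a diagonal matrix with variances 4σ_p² + 2σ_n² and 2σ_n², matching the two Gaussian factors above. A small sketch (function name mine):

```python
def covariance_of_sums(sigma_n2, sigma_p2):
    """Covariance of the noise in (s+, s-) = (s1+s2, s1-s2), computed by
    transforming the original 2x2 noise covariance C as M C M^T.
    Off-diagonal terms vanish: the PCA combinations are uncorrelated."""
    # Original covariance of the (s1, s2) noise
    c = [[sigma_p2 + sigma_n2, sigma_p2],
         [sigma_p2, sigma_p2 + sigma_n2]]
    # Linear map M taking (s1, s2) to (s+, s-)
    m = [[1, 1], [1, -1]]
    # Covariance transforms as M C M^T
    mc = [[sum(m[i][k] * c[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(mc[i][k] * m[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]
```

The s_+ variance carries the (large) common-noise term 4σ_p², while s_- sees only 2σ_n², which is why inference on s_- alone loses almost nothing when σ_p² >> σ_n².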

More Realistic LISA
Data streams:
s_1 = D_3 p_2 - p_1 + n_1 + h_1
s'_1 = D_2 p_3 - p_1 + n'_1 + h'_1
…
where D is a delay operator. MCMC: everything is in the likelihood. Toy problem: a sinusoidal gravitational wave from above LISA; 3 signal parameters and 9 noise levels, with data sampled at 1 Hz. σ_p² >> σ_n² and σ_p >> h.

Trace Plots
Markov chains; the posterior PDFs are made from these.

Parameter Estimation
Posterior PDFs for signal parameters and noise levels.

TDI Variables: Summary
- TDI variables fall out of the likelihood, which matches well to the MCMC approach and simplifies the calculation for MCMC too
- Incorporate LISA complexity step by step: realistic noise and signal terms, LISA orbit, arm-length changes, etc.
- MCMC methods handle large parameter numbers; computational time grows linearly
- A long-term effort to develop a realistic LISA scenario, with good prospects for success

Parameter Estimation for Signals
- Binary inspiral as an MCMC exercise
- Numerous parameters; MCMC provides a means for estimates
- Applications for other types of signals too
- Demonstrated to work with a network of ground-based interferometers; extend this work to LISA

Interferometer Detection
Single detector, 5 parameters: m_1, m_2, effective distance d_L, phase φ_c and time t_c at coalescence. Reparameterize the masses as chirp mass m_c and η.
For multiple detectors the signals add coherently. Parameters for estimation: m_1, m_2, φ_c, t_c, actual distance d, polarization ψ, inclination angle of the orbital plane ι, and sky position (RA and dec).
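The mass reparameterisation is standard; a sketch (the transcript's second mass parameter is garbled, and is read here as the symmetric mass ratio η, an assumption):

```python
def chirp_mass_and_eta(m1, m2):
    """Reparameterise the component masses into the chirp mass
    m_c = (m1*m2)**(3/5) / (m1+m2)**(1/5) and the symmetric mass ratio
    eta = m1*m2 / (m1+m2)**2.  Both are symmetric in m1 <-> m2."""
    total = m1 + m2
    m_c = (m1 * m2) ** 0.6 / total ** 0.2
    eta = m1 * m2 / total ** 2
    return m_c, eta
```

The chirp mass controls the leading-order inspiral phase evolution, which is why this pair is better measured (and better behaved in a sampler) than (m_1, m_2) directly.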

Amplitude-Corrected Work with Inspirals
We have already developed an inspiral (ground-based interferometer) MCMC pipeline for signals that are 3.5 post-Newtonian (PN) in phase and 2.5 PN in amplitude. Time-domain templates are then FFTed into the frequency domain. MCMC provides parameter estimation and statistics. Future work will include the spins of the masses.

Likelihood for Inspiral Signals
Work in the frequency domain. The detector output z(t) is the sum of the gravitational-wave signal s(t, θ), which depends on unknown parameters θ, and the noise n(t):
z(t) = s(t, θ) + n(t)
with noise spectral density S_n(f).
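In discrete form this likelihood is essentially a one-liner. The expression below is the standard matched-filtering form implied by the slide (normalisation constant dropped; function name and calling convention are mine):

```python
def log_likelihood_fd(z_tilde, s_tilde, s_n, df):
    """Frequency-domain Gaussian log-likelihood, up to a constant:
    log L = -2 * df * sum_f |z~(f) - s~(f; theta)|^2 / S_n(f),
    for one-sided noise spectral density S_n and frequency resolution df.
    z_tilde, s_tilde: complex data and template Fourier coefficients."""
    return -2.0 * df * sum(abs(z - s) ** 2 / sn
                           for z, s, sn in zip(z_tilde, s_tilde, s_n))
```

The likelihood is maximal when the template matches the data exactly, and bins with large S_n(f) are automatically down-weighted.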

Ground-based example: 2 LIGO sites and Virgo. The code works, but optimization is still in progress.


MCMC LISA Summary
- LISA faces extremely complex data analysis challenges
- MCMC methods have a demonstrated record of success on problems with large numbers of parameters
- MCMC for source confusion: the binary background
- TDI variables: signal and noise estimation
- Parameter estimation: binary inspirals and other waveforms

Delayed Rejection
Sampling can be improved (beyond Metropolis-Hastings) if a second proposal is made following, and based on, an initial rejected proposal. The initial proposal is only rejected if this second proposal is also rejected. The acceptance probability of the second stage has to be chosen to preserve reversibility (detailed balance). Combined with trans-dimensional moves this gives the Delayed Rejection Reversible Jump Markov chain Monte Carlo method, 'DRRJMCMC': Green & Mira (2001), Biometrika.
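A one-stage delayed-rejection sampler can be sketched as follows. This follows the Green & Mira construction specialised to symmetric Gaussian proposals whose second stage depends only on the current point (so the stage-2 proposal densities cancel); it is an illustration, not the talk's code:

```python
import math
import random

def dr_metropolis(log_p, x0, step1, step2, n_samples, seed=0):
    """Metropolis sampler with one stage of delayed rejection for a 1-D
    target.  After a stage-1 rejection, a second candidate (smaller
    step) is tried with an acceptance probability chosen to preserve
    detailed balance."""
    rng = random.Random(seed)

    def alpha1(a, b):
        # Stage-1 acceptance probability min(1, p(b)/p(a))
        return math.exp(min(0.0, log_p(b) - log_p(a)))

    def q1_dens(a, b):
        # Unnormalised stage-1 proposal density q1(b | a)
        return math.exp(-(b - a) ** 2 / (2.0 * step1 ** 2))

    x, chain = x0, []
    for _ in range(n_samples):
        y1 = x + rng.gauss(0.0, step1)
        if rng.random() < alpha1(x, y1):
            x = y1                                  # accepted at stage 1
        else:
            y2 = x + rng.gauss(0.0, step2)          # delayed second try
            # Stage-2 acceptance: reverse path must reject y1 from y2 too
            num = math.exp(log_p(y2)) * q1_dens(y2, y1) * (1.0 - alpha1(y2, y1))
            den = math.exp(log_p(x)) * q1_dens(x, y1) * (1.0 - alpha1(x, y1))
            if den > 0.0 and rng.random() < min(1.0, num / den):
                x = y2                              # accepted at stage 2
        chain.append(x)
    return chain

chain = dr_metropolis(lambda a: -0.5 * a * a, x0=0.0, step1=2.5, step2=0.8,
                      n_samples=20000)
```

The (1 - alpha1) factors implement the requirement quoted above: the second-stage probability accounts for the fact that both the forward and the reverse path must have rejected the first proposal.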

Label-switching
As set up, the posterior is invariant under signal renumbering: we have not specified what we mean by 'signal 1'. Break the symmetry by ordering in frequency:
1. Fix m at the most probable number of signals, over a run containing n MCMC steps.
2. Order the nm MCMC parameter triples (A, B, f) in frequency.
3. Perform a rough density estimate to divide the samples into m blocks.
4. Perform an iterative minimum-variance cluster analysis on these blocks.
5. Merge clusters to get exactly m signals.
6. Tag the parameter triples in each cluster.
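Step 2 of this recipe is simple to express; the density-estimation and clustering steps 3-5 are not reproduced here (function name mine):

```python
def relabel_by_frequency(samples):
    """Break the label-switching symmetry by sorting each MCMC sample's
    (A, B, f) parameter triples in order of increasing frequency f.
    samples: list of MCMC draws, each a list of (A, B, f) tuples."""
    return [sorted(triples, key=lambda t: t[2]) for triples in samples]
```

After this ordering, histograms of "signal k" across draws refer to a consistently labelled component rather than a mixture of permutations.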

Application to the LISA confusion problem