Lecture 8 Source detection NASSP Masters 5003S - Computational Astronomy - 2009.

Presentation transcript:

Different sorts of model
[Diagram labels: all models; background + signal; background + many similar signals.]

b + s vs b + Σ_i s_i
s may often be assumed to be:
–slowly varying with r;
–with compact support.

Source detection
The basic idea is related to null-hypothesis (NH) testing…
–But if the sources can be assumed to be localized, we can cut the data up and test each source-sized bit at a time => sliding window.
[Figure: survival function.] Some missed jargon: the probability at the intercept is called the P-value (you can google it).

Testing the NH:
Not all tests are equally good at finding signals!
E.g. the Cash statistic is better than χ² (in circumstances where the Cash test is appropriate, e.g. the background model is a subset of the signal model).
The Cash statistic makes use of knowledge about the signal shape. In general, any statistic which does likewise (e.g. a matched filter) will also perform well.
There is an infinite variety of ‘statistics’ to choose from.

Source detection.
–If the SF probability in a given patch (the P-value) is smaller than a previously chosen cutoff, we can call this a positive detection.
BUT! Note that there is no certainty.
–Sometimes the null model will by chance give a large χ² => ‘false positives’. For given data, background and cutoff, there will be a fixed number of false positives expected in the source list. => ‘reliability’. More on this later.
–Sometimes a real source will give a small null-hypothesis χ² => ‘false negatives’, real sources which are missed. => ‘completeness’. More on this later.
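The "fixed number of false positives expected" can be made concrete with a rough sketch. It assumes (this is not stated on the slide) that the patches are statistically independent, so that under the null hypothesis each P-value is uniform on [0, 1] and the expected count is simply the number of patches times the cutoff. The function names are hypothetical, and overlapping sliding windows would violate the independence assumption.

```python
import random


def expected_false_positives(n_patches, p_cutoff):
    """Expected number of background-only patches with P < p_cutoff.

    Assumes independent patches, so null P-values are uniform on [0, 1].
    """
    return n_patches * p_cutoff


def simulate_false_positives(n_patches, p_cutoff, seed=0):
    """Draw one uniform P-value per null patch and count those below the cutoff."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_patches) if rng.random() < p_cutoff)


if __name__ == "__main__":
    n, cut = 100_000, 0.01
    print("expected:", expected_false_positives(n, cut))
    print("one simulated realisation:", simulate_false_positives(n, cut))
```

The simulated count scatters around the expected value from run to run, which is why the slide speaks of an expected number rather than an exact one.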

Problems with the NH approach:
We don’t have exact knowledge of the background.
–We have to estimate it either from separate data, in which case we need the separate data (we don’t always have the luxury), or from the same data… but this may be dominated by the source.
–Or our background model may be wrong.
Same issues as other model fitting. In particular:
–χ² has to be used with care when the noise is Poisson.

But where are the sources?
Applying some sort of NH test in a sliding window will return a new random signal, now correlated.
Finding the sources consists rather of looking for peaks in this random signal.
The simplest example is when the noise is uncorrelated and the source peaks have width = 0.

Looking for sources 1 channel at a time:
In each channel, we test the NH with N = 1.
–Since there are no fitted parameters, ν = 1 also.
–If the source occupies a single channel, this procedure is optimal.
–If, however, the source is spread over several channels (as is usual), this procedure is not efficient.
–We want a statistic which uses the maximum amount of information about the source shape.
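A minimal sketch of the one-channel-at-a-time test, assuming Gaussian noise with a known, uniform standard deviation `sigma` and a known background `b` (both are assumptions for illustration). With N = 1 and ν = 1 the χ² survival function reduces to erfc(sqrt(χ²/2)), so only the standard library is needed. The function names are hypothetical. The `yi > bi` check is an added assumption reflecting that source amplitudes are positive, since χ² alone is blind to the sign of the deviation.

```python
import math


def chi2_sf_1dof(x):
    """Survival function of chi-squared with 1 degree of freedom: P(chi2 > x)."""
    return math.erfc(math.sqrt(x / 2.0))


def detect_single_channels(y, b, sigma, p_cutoff):
    """Test each channel on its own (N = 1, nu = 1).

    Returns a list of (channel index, P-value) for positive excesses
    whose null-hypothesis P-value is below p_cutoff.
    """
    hits = []
    for i, (yi, bi) in enumerate(zip(y, b)):
        chi2 = ((yi - bi) / sigma) ** 2
        p = chi2_sf_1dof(chi2)
        if yi > bi and p < p_cutoff:
            hits.append((i, p))
    return hits


if __name__ == "__main__":
    y = [10.0, 10.5, 16.0, 9.0]
    b = [10.0, 10.0, 10.0, 10.0]
    print(detect_single_channels(y, b, sigma=1.0, p_cutoff=1e-3))
```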

A generic source-detection algorithm
We shall assume that:
–The data are ‘binned’ (e.g. CCD data).
–We have a good independent estimate of the background.
–The sources are sparsely distributed, such that we can deal with them one at a time.
–The shape of the source profile is known.
–The source position is unknown.
–The source amplitude is unknown (but > 0).

Generic source-detection algorithm:
Choose a P cutoff. The algorithm then has 3 steps:
1: Calculate a sliding-window map.
2: Find the peaks in this map.
3: For each peak, calculate the probability that it could arise by chance from the background (the null-hypothesis P-value).
Decision: P < P_cutoff? Yes => sources. No => rejects.

1: The sliding window.
[Figure: plots of the data y and of the sliding-window output U.]

1: The sliding window.
For each position of the sliding window, a single number U is calculated from the values falling within the window. The output is a map of the U values.
The intent is to:
–Raise the signal-to-noise
–Improve sensitivity
–Amplify the sources at the expense of the noise.
(These are all the same thing.)
Sliding-window processing only has value when the source has a width > 1 pixel.
Edges need special treatment.

1: Window functions
A weighted sum (= a convolution).
–Simplest with all weights = 1: “sliding box”.
–Optimum weights – a “matched filter”: for uniform Gaussian noise, w_opt = s. Trickier to optimize for Poisson noise.
Per-window null-hypothesis χ².
–With either an independent value of the background (in which case degrees of freedom = the number of pixels N_w in the window), or…
–…one fitted from the data (degrees of freedom = N_w − 1).
Likelihood (same background provisions as χ²).
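The first two window functions above can be sketched in a few lines of plain Python for 1-D data. This is an illustration under assumptions, not a reference implementation: the function names are hypothetical, the noise is taken as Gaussian with uniform `sigma`, the background `b` is an independent estimate, and only window positions lying fully inside the data are evaluated (one simple way of side-stepping the edges). The weighted sum is written as a correlation, which equals a convolution whenever the weights are symmetric.

```python
def sliding_weighted_sum(y, w):
    """U[k] = sum_j w[j] * y[k + j], for windows fully inside the data."""
    n_w = len(w)
    return [
        sum(wj * y[k + j] for j, wj in enumerate(w))
        for k in range(len(y) - n_w + 1)
    ]


def sliding_box(y, n_w):
    """Simplest window function: all weights equal to 1."""
    return sliding_weighted_sum(y, [1.0] * n_w)


def matched_filter(y, s):
    """For uniform Gaussian noise the optimum weights are the source profile s."""
    return sliding_weighted_sum(y, s)


def window_chi2(y, b, sigma, n_w):
    """Per-window null-hypothesis chi-squared with an independent background b.

    Each output value has n_w degrees of freedom.
    """
    terms = [((yi - bi) / sigma) ** 2 for yi, bi in zip(y, b)]
    return sliding_box(terms, n_w)


if __name__ == "__main__":
    y = [0, 0, 1, 2, 1, 0, 0]
    print(matched_filter(y, [1, 2, 1]))  # [1, 4, 6, 4, 1]
    print(sliding_box(y, 3))
```

Note that each output list is shorter than the input by N_w − 1 values, which is the price of ignoring the edges.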

1: Window functions
[Figure: parent function and data.]

1: Window functions
[Figure panels: parent function; matched filter, size=10; chi squared, size=100; log-likelihood, size=100.]

2: Peak finding
[Figure: Gaussian noise, convolved with a Gaussian filter.]
…don’t get the Gaussians mixed up!

2: Peak finding
How best to do it? There’s no single neat prescription.
Naive prescription:
–Pixel i is a peak pixel if y_i > every other y within a patch of pixels from i−j to i+j.
This probably looks familiar to you. But what value to choose for j? Things to avoid are:
–j too small – results in more than 1 peak per source;
–j too large – misses a close adjacent source.
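The naive prescription can be written down directly. In this sketch the function name `find_peaks` is hypothetical, and the patch is simply truncated at the ends of the array, which is one possible edge treatment among several.

```python
def find_peaks(y, j):
    """Return indices i where y[i] is strictly greater than every other
    value in the patch from i - j to i + j (truncated at the array ends)."""
    n = len(y)
    peaks = []
    for i in range(n):
        lo = max(0, i - j)
        hi = min(n, i + j + 1)
        if all(y[i] > y[k] for k in range(lo, hi) if k != i):
            peaks.append(i)
    return peaks


if __name__ == "__main__":
    # One noisy source near pixels 1-3 and a second, brighter one at pixel 7.
    y = [0, 3, 2, 3.5, 0, 0, 0, 5, 0]
    print(find_peaks(y, 1))  # [1, 3, 7]  j too small: two peaks for the first source
    print(find_peaks(y, 2))  # [3, 7]
    print(find_peaks(y, 4))  # [7]        j too large: the fainter source is missed
```

The toy data show both failure modes from the list above: the smallest patch splits one source into two peaks, and the largest patch lets the brighter neighbour suppress the fainter source.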

2: Peak finding
[Figure panels: box too small; box too large.]

3: Decision time – is it a source or not?
To calculate a P-value we need the probability distribution of peaks in the post-window map of U values (given the null hypothesis).
This is not the same as the probability distribution of the original data values…
…nor is it even the same as the probability distribution of U values.
In fact, little work seems to have been done on p_peaks. (Though there is quite a lot on the distribution of extrema – not quite the same thing.)

3: The decision
[Figure: ‘map’ vs ‘peak’ distributions for Gaussian noise. Black: all pixels; red: peaks.]

3: Cash to the rescue
A practical recipe for applying Cash to source detection goes as follows:
–Choose a window area surrounding each peak.
–Within this window, calculate L_null with model m_i = b_i (the background map values).
–Calculate L_best by fitting a model m_i = b_i + θ_1 s(r_i − θ_r). Degrees of freedom ν = 1 (the amplitude) + d (the dimensions of the spatial fit).
–The Cash statistic 2(L_best − L_null) behaves like χ² with 1+d degrees of freedom.
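The recipe above can be sketched for a 1-D window (so d = 1 and ν = 2). Several things here are assumptions made for illustration only: the Poisson log-likelihood is taken as L = Σ (y_i ln m_i − m_i) with the model-independent ln(y_i!) term dropped, the source profile s is a Gaussian of known width, and the fit is a brute-force search over caller-supplied grids of amplitude θ_1 and position θ_r rather than a proper optimizer. The function names are hypothetical. The background must be strictly positive and the trial amplitudes non-negative.

```python
import math


def poisson_loglike(y, m):
    """Poisson log-likelihood, dropping the model-independent ln(y!) term."""
    return sum(yi * math.log(mi) - mi for yi, mi in zip(y, m))


def gaussian_profile(r, width):
    """Assumed source shape s(r), with unit peak height."""
    return math.exp(-0.5 * (r / width) ** 2)


def cash_statistic(y, b, width, amplitudes, positions):
    """Return (2 * (L_best - L_null), best amplitude, best position).

    The fit is a grid search over the trial amplitudes and positions.
    If no trial model beats the null model, the result is (0.0, 0.0, None).
    """
    l_null = poisson_loglike(y, b)
    best = (l_null, 0.0, None)
    for pos in positions:
        s = [gaussian_profile(i - pos, width) for i in range(len(y))]
        for amp in amplitudes:
            m = [bi + amp * si for bi, si in zip(b, s)]
            l_trial = poisson_loglike(y, m)
            if l_trial > best[0]:
                best = (l_trial, amp, pos)
    l_best, amp, pos = best
    return 2.0 * (l_best - l_null), amp, pos


if __name__ == "__main__":
    y = [2, 1, 3, 2, 8, 14, 9, 2, 3, 1, 2]  # counts in the window
    b = [2.0] * len(y)                       # background map values
    amps = [0.5 * k for k in range(41)]      # trial amplitudes 0 .. 20
    stat, amp, pos = cash_statistic(y, b, 1.0, amps, range(len(y)))
    print(stat, amp, pos)
```

The returned statistic would then be compared against a χ² distribution with 1 + d = 2 degrees of freedom to obtain the P-value.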

3: Cash to the rescue
The only difficult point (which is a problem for every method) is to calculate the fraction of pixels which are peaks.
–Monte Carlo
–Possibly a Fourier technique?
Also, we don’t want to use the fit for final parameter values. A Mighell fit is better.
From my 2009 Cash paper.
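The Monte Carlo option can be sketched as follows, under assumptions chosen for simplicity: 1-D data, unit-variance Gaussian noise with zero background, a weighted-sum window function, and the naive peak definition with half-width j. The function name `peak_fraction` and all parameter values are hypothetical, and a real estimate would use the actual noise model and window function of the data.

```python
import random


def peak_fraction(n_pixels, weights, j, n_trials, seed=0):
    """Estimate the fraction of pixels in a null (noise-only) sliding-window
    map that satisfy the naive peak criterion with half-width j."""
    rng = random.Random(seed)
    n_w = len(weights)
    n_peaks = 0
    n_tested = 0
    for _ in range(n_trials):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_pixels)]
        # Sliding-window map, windows fully inside the data only.
        u = [
            sum(w * noise[k + i] for i, w in enumerate(weights))
            for k in range(n_pixels - n_w + 1)
        ]
        # Skip the outer j pixels so that every tested patch is complete.
        for i in range(j, len(u) - j):
            if all(u[i] > u[k] for k in range(i - j, i + j + 1) if k != i):
                n_peaks += 1
        n_tested += len(u) - 2 * j
    return n_peaks / n_tested


if __name__ == "__main__":
    print("unsmoothed:", peak_fraction(2000, [1.0], 1, 20))
    print("smoothed:  ", peak_fraction(2000, [1, 2, 3, 2, 1], 1, 20))
```

For uncorrelated noise and j = 1 the fraction should come out near 1/3, since each of three neighbouring independent values is equally likely to be the largest. Smoothing correlates neighbouring pixels and lowers the fraction.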

What is the best detection method?
From my 2009 Cash paper.