Wavefront Sensing II Richard Lane Department of Electrical and Computer Engineering University of Canterbury.

Contents Session 1 – Principles Session 2 – Performance Session 3 – Wavefront Reconstruction

Session 2: Performance. Topics: geometric wavefront sensing, take 2; the inverse problem; the astronomical setting; the basic methods.

Geometric wavefront sensing (or curvature sensing without curvature), measured in two planes (Plane 1, Plane 2) either side of the image plane. There is a trade-off: improve sensitivity (stronger signal) or improve the number of measurable modes (weaker signal).

Geometric optics model: slopes in the wavefront W(x) cause the intensity distribution to be stretched like a rubber sheet as it propagates a distance z. The aim is to map the distorted distribution back to a uniform one.

Geometric wavefront sensing, take 2: the intensity distributions recorded in Plane 1 and Plane 2 give the probability distribution for photon arrival.

Recovering the phase: treat the intensities in Plane 1 and Plane 2 as probability density functions; integrate each to form a CDF; choose a level; the difference in position at that level gives a slope estimate; combining levels gives the final slope estimate; integrate the slope to find the phase (in this example, defocus!).
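The CDF-matching step can be sketched numerically. This is a toy illustration, not the lecture's exact algorithm: the intensity profiles are stand-in Gaussians, and a wavefront slope is modelled as a simple shift of the profile between the two planes.

```python
import numpy as np

# Toy sketch of CDF-matching slope estimation: a wavefront slope shifts
# the intensity between the two measurement planes; matching the CDFs at
# a chosen level recovers that shift.  Profiles are illustrative Gaussians.
x = np.linspace(-5, 5, 2001)
true_shift = 0.4                       # displacement induced by the slope

def intensity(centre):
    # Gaussian stand-in for a measured intensity distribution
    return np.exp(-0.5 * (x - centre) ** 2)

i1 = intensity(-true_shift / 2)        # Plane 1
i2 = intensity(+true_shift / 2)        # Plane 2

def cdf(i):
    c = np.cumsum(i)
    return c / c[-1]

level = 0.5                            # chosen CDF level
x1 = np.interp(level, cdf(i1), x)      # position where the CDF crosses the level
x2 = np.interp(level, cdf(i2), x)
slope_estimate = x2 - x1               # difference gives the slope estimate
print(slope_estimate)
```

In practice several levels would be combined into the final slope estimate, and the slopes then integrated to recover the phase.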

Forward Problem

Inverse problem: performance is determined by the number of photons entering the aperture and by assumptions about the object and the turbulence.

Imaging a star

Multiple layers: Layer 1 at height h1 and Layer 2 at height h2 above the aperture plane. For wide-angle imaging we need to know the heights of the turbulent layers.

The fundamental problem: How to optimally estimate the optical effects of turbulence from a minimal set of measurements

Limiting factors. Technological: CCD read noise; design of the wavefront sensor (curvature, Shack-Hartmann, phase diversity). Fundamental: photon noise; loss of information in measurements; quality of prior knowledge.

In its raw form the inverse problem is always insoluble: there are always an infinite number of ways to explain data. The problem is to explain the data in the most reasonable way. Example: Shack-Hartmann sensing for estimating turbulence.

Example: fit a curve to known slopes. A solution requires assumptions about the nature of the turbulence: use a limited set of basis functions, or assume Kolmogorov turbulence or smoothness.

Parameter estimation. Essentially we need to find a set of unknown parameters which describe the object and/or the turbulence. The parameters can be pixels or coefficients of basis functions. The solution should not be overly sensitive to our choice of parameters; ideally the choice should be made on physical grounds.

Bayesian estimation 101. An important problem: estimate two unknowns from a single measurement of their sum. And what if you know that the sum models two people splitting the bill in a restaurant?

Possible phase functions: the Zernike basis (Zernike polynomials; low orders are smooth) or a pixel basis with pixel spacing Δ (highest representable frequency = 1/(2Δ)).

Estimation using Zernike polynomials: the measurement vector z, the interaction matrix Θ and the vector a of Zernike coefficients are related by z = Θa, the phase being a weighted sum of Zernike polynomials. The i-th column of Θ corresponds to the measurement that would occur if the phase were the i-th Zernike polynomial.

Extension to many modes. Provided the set of basis functions is complete, the answer is independent of the choice. The best functions are approximately the eigenfunctions of the covariance matrix C; these approximate the low-order Zernike polynomials, hence their use. The conventional approach is a least-squares solution estimating only the first M Zernikes, with M ≈ N/2 (N is the number of measurements).

Ordinary least squares: minimise ||z − Θa||². Not all measurements are equally noisy, hence weighted least squares: minimise (z − Θa)ᵀW(z − Θa), with W the inverse of the noise covariance.
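The difference between the two fits can be sketched numerically. This is an illustrative example, not the lecture's data: the interaction matrix, mode count and noise levels are arbitrary choices, with half the measurements made much noisier than the rest.

```python
import numpy as np

# Ordinary vs weighted least squares for z = Theta a + noise,
# where half the measurements are much noisier than the others.
rng = np.random.default_rng(0)
N, M = 40, 5                                         # measurements, modes
Theta = rng.standard_normal((N, M))                  # illustrative interaction matrix
a_true = rng.standard_normal(M)
sigma = np.where(np.arange(N) < N // 2, 0.01, 1.0)   # per-measurement noise std
W = np.diag(1.0 / sigma**2)                          # inverse noise covariance

err_ols = err_wls = 0.0
trials = 50
for _ in range(trials):
    z = Theta @ a_true + sigma * rng.standard_normal(N)
    a_ols = np.linalg.lstsq(Theta, z, rcond=None)[0]               # min ||z - Theta a||^2
    a_wls = np.linalg.solve(Theta.T @ W @ Theta, Theta.T @ W @ z)  # weighted fit
    err_ols += np.linalg.norm(a_ols - a_true) / trials
    err_wls += np.linalg.norm(a_wls - a_true) / trials

print(err_ols, err_wls)
```

Down-weighting the noisy measurements gives a visibly smaller average coefficient error.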

Conventional results: as M increases the wavefront error first decreases, then increases as M approaches N. The reason: when M = N the fit to the measurements has no residual, yet there should be one, because higher-order modes exist and affect the measurements.

Phase estimation from the centroid. Tilt and coma both produce displacement of the centroid. According to Noll, for Kolmogorov turbulence the variance of the tilt is much larger than the variance of the coma. Ideally you should still estimate a small amount of coma.

Bayesian viewpoint. The problem in the previous slide is that we are not modelling the problem correctly: assuming that the higher-order modes are zero forces errors onto the lower-order modes. We need to estimate the coefficients of all the modes as random variables.

Example of Bayesian estimation for underdetermined equations. The measurement z is a linear function of two unknowns x and y. Taking the statistical expectation, we want to minimise the expected squared error; the estimate (denoted by ^) is a linear function of z.

Minimisation of the error. Key step: rewrite the expected error in terms of the variances and covariances of x and y. The solution is a function of the covariance of the unknown parameters.
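A minimal sketch of the underdetermined example, assuming z = x + y with zero-mean unknowns and illustrative variances (the restaurant reading: x and y are the two people's shares of the bill z). The linear MMSE weight is a function of the covariances of the unknowns, exactly as the slide states.

```python
import numpy as np

# Linear MMSE estimation of x from the single measurement z = x + y.
# The optimal linear weight lam = var_x / (var_x + var_y) depends only
# on the (assumed, illustrative) covariances of the unknowns.
rng = np.random.default_rng(1)
var_x, var_y = 4.0, 1.0
n = 100_000
x = rng.normal(0, np.sqrt(var_x), n)
y = rng.normal(0, np.sqrt(var_y), n)
z = x + y

lam = var_x / (var_x + var_y)            # optimal linear weight
mmse_err = np.mean((lam * z - x) ** 2)   # Bayesian estimate error
naive_err = np.mean((0.5 * z - x) ** 2)  # "split the bill evenly"
print(mmse_err, naive_err)
```

The theoretical MMSE is var_x − var_x²/(var_x + var_y) = 0.8 here, beating the even split's 1.25: the prior covariances decide how the measurement is apportioned.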

Vector solution for the phase. Express the phase as a sum of orthogonal basis functions; the observed measurements are a linear function of the coefficients a. The reconstructor depends on the covariance of a.

Simple example for tilt, D/r0 = 4 (results from Noll and from Primot et al.).

Bayesian estimate of the wavefront: minimises a weighted data-misfit term plus a prior term involving the coefficient covariance C.

Summary of the Bayesian method. When the data are noisy you need to put more emphasis on the prior; for example, if the data are very bad, don't try to estimate a large number of modes. When done properly the result does not depend strongly on C being exact. The estimation error is predicted by the theory (the posterior covariance).

Operation of a Bayesian estimator: it minimises a data term plus a prior term. When the noise covariance D becomes very large (the data are very noisy), more weight is placed on the prior. Ultimately, as D → ∞, a → 0: for very noisy data no estimate is made.
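The behaviour can be checked with a small sketch, assuming the standard MAP reconstructor form a = (ΘᵀD⁻¹Θ + C⁻¹)⁻¹ΘᵀD⁻¹z with D = d·I; the matrix sizes and the identity prior covariance are illustrative choices, not the lecture's values.

```python
import numpy as np

# MAP reconstructor  a = (Theta^T D^-1 Theta + C^-1)^-1 Theta^T D^-1 z
# with noise covariance D = d*I.  As d grows, the estimate is pulled
# towards the prior mean (zero): for very noisy data no estimate is made.
rng = np.random.default_rng(2)
N, M = 20, 8
Theta = rng.standard_normal((N, M))   # illustrative interaction matrix
C_inv = np.eye(M)                     # illustrative prior covariance (identity)
z = rng.standard_normal(N)

def bayes_estimate(d):
    A = Theta.T @ Theta / d + C_inv
    return np.linalg.solve(A, Theta.T @ z / d)

a_low_noise = bayes_estimate(0.01)    # data trusted: close to least squares
a_high_noise = bayes_estimate(1e6)    # data distrusted: estimate -> 0
print(np.linalg.norm(a_low_noise), np.linalg.norm(a_high_noise))
```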

Bayesian examination question. You are on a game show and can select one of three doors. Behind one door is $10000; behind the others, nothing. After you select a door, the compere opens one of the other doors, revealing nothing. You are given the option to change your choice. Should you?
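The question can be settled empirically with a short Monte Carlo simulation of the game (this is the classic Monty Hall setup):

```python
import random

# Monte Carlo check of the game-show question: switching wins whenever
# the first pick was wrong, i.e. with probability 2/3.
def play(switch, rng):
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # the compere opens a door that is neither the pick nor the prize
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

rng = random.Random(0)
n = 20_000
stay = sum(play(False, rng) for _ in range(n)) / n
switch = sum(play(True, rng) for _ in range(n)) / n
print(stay, switch)   # ≈ 1/3 vs ≈ 2/3
```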

Estimating the performance limits when the problem is non-Gaussian. The preceding analysis is fine when the measurement errors can be modelled as Gaussian random variables. In many cases you need a separate analysis to work out the error in the estimate: Cramér-Rao bounds.

Cramér-Rao bound: applies to unbiased estimators. Essentially the quality of the parameter estimate is given by the curvature of the log-likelihood. It doesn't tell you how to achieve the bound.

Simple example: find the performance limit for estimating the mean of a one-dimensional Gaussian from one sample.
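For a Gaussian with known standard deviation σ, the Cramér-Rao bound on the variance of an unbiased estimate of the mean from N samples is σ²/N. A quick Monte Carlo check of the 1/N scaling using the sample mean (the values of σ and the trial count are arbitrary):

```python
import numpy as np

# Cramér-Rao bound for the mean of a 1-D Gaussian with known sigma:
# var(mu_hat) >= sigma^2 / N.  Monte Carlo check of the 1/N decay
# using the sample-mean estimator.
rng = np.random.default_rng(3)
sigma, trials = 2.0, 20_000
variances = {}
for N in (1, 4, 16):
    samples = rng.normal(0.0, sigma, (trials, N))
    variances[N] = np.var(samples.mean(axis=1))  # empirical estimator variance
    print(N, variances[N], sigma**2 / N)         # compare with the CRB
```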

Points to note. The limit is a lower bound; for one sample from the pdf it cannot always be attained. The variance decays as 1/N with more samples. For a Gaussian, the centroid of the distribution can be shown to approach the Cramér-Rao bound asymptotically.

Estimation of the location of a laser guide star: Cramér-Rao bound, compared for a small projection telescope, a large AO-corrected projection telescope, and a large uncorrected projection telescope. Key points: in the presence of saturation a focused spot may not be optimal, and you need to know the spot pattern to reach the limit.

Optimal estimation of a parameter: wavefront tilt. Important because the wavefront tilt is the dominant form of phase aberration; a small error in estimating the tilt can be larger than the full variance of a higher-order aberration.

Issues: the displacement of the centroid of an image is proportional to the average tilt (not the least-mean-square tilt) of the phase distortion. We will discuss this issue later; for the moment, concentrate on estimating the mean-square tilt.

How do you estimate the centre of a spot? The performance of the Shack-Hartmann sensor depends on how well the displacement of the spot is estimated. The displacement is usually estimated using the centroid (center-of-mass) estimator. This is the optimal estimator for the case where the spot is Gaussian distributed and the noise is Poisson.
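The centre-of-mass estimator itself is a few lines. A minimal sketch on a synthetic Gaussian spot (the grid size, spot position and width are illustrative, not instrument values):

```python
import numpy as np

# Centre-of-mass (centroid) estimator for a spot on a detector grid,
# the standard Shack-Hartmann displacement estimate.
def centroid(image):
    image = np.asarray(image, dtype=float)
    total = image.sum()
    ys, xs = np.indices(image.shape)
    return (xs * image).sum() / total, (ys * image).sum() / total

# Synthetic Gaussian spot centred at (4.3, 6.7) on a 16x16 grid
ys, xs = np.indices((16, 16))
spot = np.exp(-((xs - 4.3) ** 2 + (ys - 6.7) ** 2) / (2 * 1.5 ** 2))
cx, cy = centroid(spot)
print(cx, cy)
```

On this noiseless, well-contained spot the centroid recovers the subpixel position almost exactly; the interesting failures appear with heavy-tailed spots and noise, discussed next.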

Centroid estimation for a sinc² function

Why not use the centroid? In practice the spot intensity decays as 1/x² (the envelope of the sinc² pattern). This means that photons can still occur at points quite distant from the centre, and the estimator is divergent unless restricted to a finite region in the image plane.

Diffraction-limited spot: for a square aperture, the distribution is the separable sinc² pattern.

Photon arrival simulation
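A photon-arrival simulation of this kind can be sketched as follows: draw photon positions from a sinc² spot by rejection sampling and compare the windowed centroid for a narrow and a wide window. All parameters (photon counts, window half-widths, trial counts) are illustrative.

```python
import numpy as np

# Simulate photon arrivals from a sinc^2 spot (rejection sampling) and
# show that the windowed centroid gets noisier as the window widens --
# without a window the centroid estimator diverges.
rng = np.random.default_rng(4)

def sinc2_photons(n, half_width):
    out = np.empty(0)
    while out.size < n:
        x = rng.uniform(-half_width, half_width, 50 * n)
        # np.sinc(x) = sin(pi x)/(pi x), so np.sinc(x)**2 <= 1 is a
        # valid rejection test against a uniform envelope
        keep = rng.uniform(0, 1, x.size) < np.sinc(x) ** 2
        out = np.concatenate([out, x[keep]])
    return out[:n]

def centroid_var(half_width, photons=100, trials=300):
    ests = [sinc2_photons(photons, half_width).mean() for _ in range(trials)]
    return np.var(ests)

v_narrow = centroid_var(2.0)    # window restricted near the core
v_wide = centroid_var(50.0)     # wide window: tail photons dominate the error
print(v_narrow, v_wide)
```

The tail photons carry almost no position information but huge leverage in the centroid sum, which is why the variance grows with the window.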

Solutions (1): use a quad-cell detector and discard the photons away from the centre. The signal from the outer cells is discarded because it adds too much noise.

Solutions (2): use an optimal estimator that weights the information appropriately. Consider two measurements of an unknown parameter with different variances: a weighted sum is always a better estimator than either alone, and a nonlinear estimator is better still.
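The weighted-sum claim is easy to verify with inverse-variance weighting, the optimal linear combination of two unbiased measurements (the two noise levels here are arbitrary illustrative values):

```python
import numpy as np

# Two noisy measurements of the same parameter with different variances:
# the inverse-variance weighted sum beats either measurement alone.
rng = np.random.default_rng(5)
theta = 1.0
s1, s2 = 0.5, 2.0                 # standard deviations of the two sensors
n = 100_000
m1 = theta + rng.normal(0, s1, n)
m2 = theta + rng.normal(0, s2, n)

w1 = (1 / s1**2) / (1 / s1**2 + 1 / s2**2)   # inverse-variance weight
combined = w1 * m1 + (1 - w1) * m2

print(np.var(m1), np.var(m2), np.var(combined))
```

The combined variance, 1/(1/s1² + 1/s2²), is always below the better of the two individual variances.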

Maximum-likelihood estimation: if photons are detected at x1, x2, …, xN, the estimate is the value of the location parameter that maximizes the log-likelihood Σi log p(xi). The Cramér-Rao lower bound for the variance scales as 1/N, and for a large number of photons N the variance of the ML estimate approaches the Cramér-Rao lower bound.
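A sketch of ML spot-location estimation under stated assumptions: photons are drawn from a sinc² spot with a known shape and an unknown displacement, and the displacement is recovered by a grid search over the log-likelihood. The window, photon count and grid are illustrative choices.

```python
import numpy as np

# ML estimate of a spot displacement t from photon arrival positions:
# maximise  sum_i log I(x_i - t)  with I the sinc^2 diffraction pattern.
rng = np.random.default_rng(6)
true_t = 0.3

def draw(n):
    # rejection-sample photon positions from sinc^2 centred at true_t
    out = np.empty(0)
    while out.size < n:
        x = rng.uniform(-4, 4, 20 * n)
        keep = rng.uniform(0, 1, x.size) < np.sinc(x - true_t) ** 2
        out = np.concatenate([out, x[keep]])
    return out[:n]

photons = draw(2000)
shifts = np.linspace(-1, 1, 2001)
# small epsilon guards log(0) at the zeros of the sinc^2 pattern
loglik = [np.sum(np.log(np.sinc(photons - t) ** 2 + 1e-12)) for t in shifts]
t_hat = shifts[int(np.argmax(loglik))]
print(t_hat)
```

Unlike the centroid, this uses every photon weighted by how informative it is under the spot model, which is why it needs a model of the object.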

Centroid location by model fitting. The technique relies on finding a model of the object and is not sensitive to the size of the window (unlike the centroid). The centroid is the closed-form solution for fitting a Gaussian of variable width.

Tilt estimation in curvature sensing. The image is displaced by the atmospheric tilt; how well you can estimate it is determined by the shape of the image formed.

Tilt estimation in the curvature sensor: actual and propagated wavefronts.

Performance versus detector position for a curvature sensor

Actual wavefront sensor data: observation at Observatoire de Lyon with the SPID instrument on a 1-m telescope; 20x20 Shack-Hartmann lenslet array; exposure time 2 ms. Objects: Pollux, a point object (2500 frames), and Castor, a 3 arc-second binary (2500 frames).

Centroiding issues: accuracy required to a fraction of a pixel; sampling rate 60% of Nyquist.

Finding the model. Each lenslet of the Shack-Hartmann acts like a small telescope in which the dominant effect is tilt, so we have a large number of images of the same object, shifted before they are sampled. We need a good model of the object.

Solution approach: use blind deconvolution to find the model, within a MAP framework (Hardie et al., FLIR). Model the data-capturing process and choose the initial estimate accordingly. Prior information: Laplacian smoothness for the optics; maximum entropy for the CCD.

Typical SPID data frames: single wavefront sensor frame; long-term WSF; blow-up of a spot; movie of a spot.

Simulations. Inputs: object f = point source; optics = diffraction-limited pattern of a square aperture; CCD structure: Gaussian-like; random displacements; white Gaussian noise: … dB, 30 dB, 15 dB.

Simulation result at 15 dB noise: optics reconstruction and CCD reconstruction.

Traditional centroiding: centre of gravity of the spot image. Problems: finite pixel size; finite window size; readout noise (more pixels = more noise); bias. The problems become worse with extended objects.

Model fitting. Full blind deconvolution is computationally unreasonable; instead, fit a model estimated by blind deconvolution and use the model to determine the centroids.

Error in centroid calculation

Blind deconvolution results: optics reconstruction and CCD reconstruction.

Results from speckle image deconvolution (narrowband): binary estimated with model-fitted centroids versus binary estimated with traditional centroids.

Phase reconstructions of the binary: traditional centroiding versus model-based centroiding.

Conclusions. Bayesian approaches provide a logical framework for filling in missing data. Be sure of what you are assuming. The Cramér-Rao bound can provide a performance limit. You need to look at the whole process when deriving an algorithm.

And the answer is (ref. Stark and Woods): yes, change the door!

Subpixel displacement estimation. Wavefront sensing is based on estimating the tilts produced by atmospheric distortion, so the accuracy of displacement estimation is critical. Data from SPID: 2500 frames, undersampled by 40%. Shown: estimated CCD pixel sensitivity, estimated optics PSF, spot displacements.

Explanation of the terms Results from

The inverse problem

Alternatively

Prior information: there are an infinite number of unknowns, but a finite number of centroid measurements from the sensor. The conventional approach is to choose the basis functions and estimate M coefficients, where M < N, the number of measurements.

Using real data: binary star. 14x14 Shack-Hartmann lenslet array; exposure time 3.2 ms. Object: Castor, a binary star; intensity ratio 2.1; separation 3.1 arc seconds.

Blind deconvolution results: intensity ratio 2.4; separation 3 arc seconds.