Multiscale Likelihood Analysis and Inverse Problems in Imaging


Multiscale Likelihood Analysis and Inverse Problems in Imaging
Rob Nowak, Rice University, Electrical and Computer Engineering
www.dsp.rice.edu/~nowak
Supported by DARPA, INRIA, NSF, ARO, and ONR

Image Analysis Applications: remote sensing, brain mapping, image restoration, network tomography.

Maximum Likelihood Image Analysis: an unknown object is observed through a statistical model (physics plus prior knowledge) that links it to the measurements (data); maximizing the likelihood yields the maximum likelihood estimate.

Analysis of Gamma-Ray Bursts: the same pipeline, with a photon-counting statistical model (physics) and piecewise smoothness as prior knowledge; the maximum likelihood estimate is an estimate of the underlying intensity.

Image Deconvolution: here the statistical model captures noise & blurring (physics) together with an image model (prior knowledge).

Brain Tomography: here the statistical model captures photon counting & projection (physics) together with anatomy & physiology (prior knowledge).

Wavelet-Based Multiresolution Analysis: a high-resolution image decomposes into a mid-resolution image plus prediction errors, which in turn decomposes into a low-resolution image plus prediction errors. The prediction errors are the wavelet coefficients, and most wavelet coefficients are zero, giving a sparse representation.
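
To make this concrete, here is a minimal numpy sketch (all names hypothetical) of recursive one-level Haar analysis: each level produces a coarser signal plus prediction errors, and for a piecewise-smooth signal most prediction errors are near zero.

```python
import numpy as np

def haar_analysis_step(x):
    """One level of Haar analysis: pairwise averages (low resolution)
    and pairwise differences (prediction errors / wavelet coefficients)."""
    x = x.reshape(-1, 2)
    coarse = (x[:, 0] + x[:, 1]) / np.sqrt(2)   # scaling coefficients
    detail = (x[:, 0] - x[:, 1]) / np.sqrt(2)   # wavelet coefficients
    return coarse, detail

# A piecewise-smooth signal: most detail coefficients are near zero (sparse).
n = 256
t = np.linspace(0, 1, n)
x = np.where(t < 0.5, np.sin(2 * np.pi * t), 0.2)  # smooth, with one jump

coarse, details = x.copy(), []
while coarse.size > 1:
    coarse, d = haar_analysis_step(coarse)
    details.append(d)

all_details = np.concatenate(details)
print("fraction of near-zero wavelet coefficients:",
      np.mean(np.abs(all_details) < 0.05))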

Wavelets and Denoising: wavelet-transform the noisy image, apply a thresholding estimator ('keep' large coefficients, 'kill' small coefficients), and invert the transform to obtain the denoised reconstruction. This achieves near-minimax optimality over a range of function spaces, but extensions to non-Gaussian noise models are non-trivial.
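
A minimal sketch of the keep/kill estimator, assuming the PyWavelets package is available and using the classical universal threshold sigma * sqrt(2 log N) (the threshold choice here is illustrative, not the talk's):

```python
import numpy as np
import pywt

def wavelet_denoise(y, sigma, wavelet="haar"):
    """'Keep' large wavelet coefficients, 'kill' small ones (hard threshold)."""
    coeffs = pywt.wavedec(y, wavelet)                 # analysis
    thr = sigma * np.sqrt(2 * np.log(y.size))         # universal threshold
    coeffs[1:] = [pywt.threshold(c, thr, mode="hard")
                  for c in coeffs[1:]]                # keep/kill details
    return pywt.waverec(coeffs, wavelet)              # synthesis

rng = np.random.default_rng(0)
x = np.repeat(rng.standard_normal(16), 16)            # blocky test signal
y = x + 0.1 * rng.standard_normal(x.size)
print("noisy MSE:   ", np.mean((y - x) ** 2))
print("denoised MSE:", np.mean((wavelet_denoise(y, 0.1) - x) ** 2))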

Wavelets and Inverse Problems: y = Kx + n (linear operator K, noise n possibly non-Gaussian). Two common approaches:
- Linear reconstruction, followed by nonlinear denoising: the image is easy to model in the reconstruction domain, but the noise is not.
- Denoise the raw data, followed by linear reconstruction: the noise is easy to model in the observation domain, but the image is not.
Exception: if the linear operator K is approximately diagonalized by a wavelet transform, then both image and noise are easy to model in both domains.

Example: Image Restoration. Here K is a convolution operator, observed with noise. Deblurring in the FFT domain costs O(N log N); denoising in the DWT domain costs O(N).
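
A small sketch of the O(N log N) FFT-domain step: a Wiener-type deconvolution for a circular convolution operator K (the kernel, variances, and test signal below are illustrative assumptions).

```python
import numpy as np

def fft_deblur(y, kernel, noise_var, signal_var):
    """Wiener-type deconvolution, O(N log N): the FFT diagonalizes
    any circular convolution operator K."""
    H = np.fft.fft(kernel, n=y.size)                  # frequency response of K
    W = np.conj(H) / (np.abs(H) ** 2 + noise_var / signal_var)
    return np.real(np.fft.ifft(W * np.fft.fft(y)))

rng = np.random.default_rng(1)
x = np.repeat(rng.standard_normal(16), 16)            # blocky signal, N = 256
kernel = np.ones(8) / 8.0                             # moving-average blur
H = np.fft.fft(kernel, n=x.size)
y = np.real(np.fft.ifft(H * np.fft.fft(x))) + 0.05 * rng.standard_normal(x.size)
print("blurred MSE: ", np.mean((y - x) ** 2))
print("restored MSE:", np.mean((fft_deblur(y, kernel, 0.05**2, 1.0) - x) ** 2))
```

The complementary O(N) DWT-domain step is exactly the thresholding estimator sketched above.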

Wavelets and Inverse Problems
Fast exact methods for special operators:
- Donoho (1995): wavelet-vaguelette decomposition for scale-homogeneous operators (e.g., Radon transform)
- Rougé (1995), Kalifa & Mallat (1999): wavelet-packet decompositions for certain scale-inhomogeneous convolutions
General methods, heavy computation or approximate solutions:
- Liu & Moulin (1998), Yi & Nowak (1999), Neelamani et al. (1999), Belge et al. (2000), Jalobeanu et al. (2001), Rivaz et al. (2001)
General methods, fast exact solutions via EM:
- Nowak & Kolaczyk (1999): Poisson inverse problems
- Figueiredo & Nowak (2001): Gaussian inverse problems

Maximum Penalized Likelihood Estimation (MPLE)
$\hat{\theta} = \arg\max_{\theta} \{ L(\theta) - \mathrm{pen}(\theta) \}$, where $L(\theta)$ is the log likelihood function.
Gaussian model: $L(\theta) = -\frac{1}{2\sigma^2} \| y - K x(\theta) \|^2 + \text{const}$
Poisson model: $L(\theta) = \sum_i \big( y_i \log [K x(\theta)]_i - [K x(\theta)]_i \big) + \text{const}$
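
In code, the two log-likelihoods and the penalized objective might look like the following sketch (function names and the penalty weight are assumptions, with the complexity measure matching the coefficient-counting penalty introduced later in the talk):

```python
import numpy as np

def gaussian_loglik(y, Kx, sigma2):
    """Gaussian model: log p(y | x), up to an additive constant."""
    return -0.5 * np.sum((y - Kx) ** 2) / sigma2

def poisson_loglik(y, Kx):
    """Poisson model: log p(y | x), up to an additive constant (Kx > 0)."""
    return np.sum(y * np.log(Kx) - Kx)

def mple_objective(loglik_value, theta, pen_weight):
    """Penalized likelihood: log-likelihood minus a complexity penalty;
    here complexity counts the non-trivial multiscale coefficients."""
    return loglik_value - pen_weight * np.count_nonzero(theta)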

Image Complexity: images have structure; they are neither a totally unorganized city nor a completely random brain.

Location Uncertainty in Images
Location information is central to image analysis: When are there jumps / bursts? Where are the edges? Where are the hotspots?
From limited data we can detect & locate key "structure": "images" lie on a low-dimensional manifold, but we don't know which manifold.

Complexity Penalized Likelihood Analysis
Let θ = wavelet/multiscale coefficients. Find θ with high likelihood, but keep the model as simple as possible.
Goal: balance the trade-off between model complexity and fitting the data.
Examples: gamma-ray bursts, penalty ∝ # jumps / bumps; satellite images & brains, penalty ∝ # boundaries / edges.

Basic Approach
The multiscale maximum penalized likelihood estimator (MPLE) is easy to compute if K = I: if we could directly observe the image, we would have a simple denoising problem.
The MPLE is very difficult to compute in the indirect case (K ≠ I); no analytic solution exists in general.
Expectation-Maximization (EM) idea: introduce (unobserved) direct image data and iterate between estimating the direct-data likelihood function and maximizing this function.
The expectation step involves simple linear filtering! The maximization step involves simple denoising!

Key Ingredients
- Fast multiscale analysis and thresholding-type estimators for directly observed Gaussian or Poisson data
- Selection of the "right" unobserved data space
- Multiscale likelihood factorizations: a generalization of the conventional "energy-based" MRA to an "information-based" MRA
- Re-interpretation of the inverse problem as a combination of two imaging experiments: a fictitious direct (K = I) observation and the real observed (K ≠ I) observation

Multiscale Likelihood Factorizations
Assume the direct observation model y = x + n, with noise n possibly non-Gaussian.
Orthonormal wavelet decomposition: a multiscale decomposition into orthogonal energy components.
Multiscale likelihood factorization: $g(y \mid x) = \prod_I f\big(y_{c(I)} \mid y_I, \theta_I\big)$, a Haar-analysis factorization into independent "information" components.
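
For the Poisson/Haar case the factorization can be checked numerically: the total count is Poisson, and each child count given its parent is binomial, with one parameter per cell. A sketch, assuming SciPy is available (all names illustrative):

```python
import numpy as np
from scipy.stats import binom, poisson

def poisson_factorized_loglik(y, lam):
    """Haar-style multiscale factorization of a Poisson likelihood:
    the total count is Poisson, and each 'child given parent' count is
    binomial with success probability rho_I = lam_left / lam_parent.
    The product of these terms equals the original likelihood."""
    loglik = 0.0
    counts, intens = y.astype(int), lam.copy()
    while counts.size > 1:
        c_left, i_left = counts[0::2], intens[0::2]
        c_par = counts[0::2] + counts[1::2]
        i_par = intens[0::2] + intens[1::2]
        rho = i_left / i_par                        # one parameter per cell I
        loglik += np.sum(binom.logpmf(c_left, c_par, rho))
        counts, intens = c_par, i_par
    loglik += poisson.logpmf(counts[0], intens[0])  # root: total count
    return loglik

rng = np.random.default_rng(2)
lam = rng.uniform(1, 10, size=8)
y = rng.poisson(lam)
direct = np.sum(poisson.logpmf(y, lam))             # unfactorized likelihood
print(np.isclose(poisson_factorized_loglik(y, lam), direct))  # True
```

The printed value is True: the product of the root term and the child-given-parent terms reproduces the original likelihood exactly.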

Summary of Likelihood MRA Results
- Multiresolution analysis (MRA): a set of sufficient conditions for multiscale factorization of the likelihood
- Efficient algorithms for MPLE: linear-complexity algorithms for analogues of "denoising" methods
- Risk bounds: near-minimax risk rates for BV and Besov objects estimated from Gaussian or Poisson data

Wavelet MRA vs. Likelihood MRA
Function-space MRA: hierarchy of nested subspaces; orthonormal basis; scaling between subspaces; translations within a subspace.
Likelihood MRA: hierarchy of data partitions; statistical independence in data space; the likelihood function reproduces under summation; the conditional likelihood of child given parent is a single-parameter density.

Multiscale Data Aggregation: basic Haar multiscale analysis.
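
A minimal sketch of the aggregation itself: recursively summing adjacent pairs gives the count in every dyadic interval, the raw material for the factorization above (names illustrative):

```python
import numpy as np

def aggregate(counts):
    """Basic Haar multiscale aggregation: recursively sum adjacent pairs,
    giving the count in every dyadic interval, fine to coarse."""
    levels = [counts]
    while levels[-1].size > 1:
        c = levels[-1]
        levels.append(c[0::2] + c[1::2])
    return levels  # levels[0] = finest data, levels[-1] = total count

counts = np.array([3, 1, 4, 1, 5, 9, 2, 6])
for lvl in aggregate(counts):
    print(lvl)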

Multiscale Likelihood Factorizations: the original likelihood is rewritten as its multiscale factorization.

Examples
Gaussian case: the multiscale parameter θ_I is the Haar wavelet coefficient.
Poisson case: θ_I is a function of the Haar wavelet & scaling coefficients.

Denoising and Information Extraction
Wavelet denoising: "keep" a coefficient if it is large, "kill" it otherwise.
Multiscale information extraction: "keep" a multiscale parameter if it is statistically significant, "kill" it otherwise.

Penalized Likelihood Estimation
Algorithm: set the penalty to count the number of non-trivial multiscale coefficients. The optimization then reduces to a set of N separate generalized likelihood ratio tests (GLRTs), which can be computed in O(N) operations; this is the analog of hard thresholding.
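
In the Gaussian case the per-coefficient test has a closed form: keep θ_I only when its log-likelihood gain beats the penalty, which is exactly a hard threshold. A sketch (the penalty value is illustrative):

```python
import numpy as np

def glrt_keep_kill(w, sigma2, pen):
    """Per-coefficient generalized likelihood ratio test (Gaussian case).
    Setting theta_I = w_I gains w_I^2 / (2 sigma2) in log-likelihood but
    costs `pen`; keep the coefficient only when the gain wins. Equivalent
    to hard thresholding at sqrt(2 * sigma2 * pen); O(N) in total."""
    gain = w ** 2 / (2 * sigma2)
    return np.where(gain > pen, w, 0.0)

w = np.array([5.0, 0.3, -2.0, 0.1])
print(glrt_keep_kill(w, sigma2=1.0, pen=1.0))  # [ 5.  0. -2.  0.]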

Risk Analysis
Hellinger loss: $H^2(p_1, p_2) = \int \big( \sqrt{p_1} - \sqrt{p_2} \big)^2$.
Bound on Hellinger risk (follows from Li & Barron '99): the expected Hellinger loss of the MPLE is bounded by the best trade-off between KL distance (approximation error) and penalty (model complexity).
Choice of penalty: proportional to the number of non-trivial multiscale coefficients.

Risk Analysis – Upper Bounds
Theorem (Nowak & Kolaczyk '01): if the underlying function / intensity x belongs to BV (smoothness α = 1) or a Besov space, then the risk converges at near-minimax rates (within logarithmic factors) under the Gaussian, Poisson, and multinomial (density estimation) models; n = number of samples.

Risk Analysis – Lower Bounds
Theorem (Nowak & Kolaczyk '01). The proof technique is similar in spirit to the "method of hyperrectangles" (Donoho et al. '90), and the multiscale likelihood factorization plays a key role in deriving minimax bounds in the Poisson and multinomial cases.
Key point: the multiscale MPLE attains these rates.

Gamma-Ray Burst Analysis
Data: photon counts vs. time from the Compton Gamma-Ray Observatory, Burst and Transient Source Experiment (BATSE). One burst (tens of seconds) emits as much energy as our entire Milky Way does in one hundred years! Bursts are followed by an x-ray "after glow".

Gamma-Ray Burst Analysis (Poisson data)
If we know where the jumps are located, the risk behaves like 1/N, where N is the number of measurements. If we don't know where the jumps are located, we can still achieve a rate on the order of log N / N, while simultaneously detecting the jump locations and estimating the intensity. The optimal estimate is computed via a multiresolution sequence of GLRTs in O(N) operations.
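
One way to realize such a multiresolution sequence of tests for piecewise-constant intensities is a bottom-up dynamic program over recursive dyadic partitions. The sketch below is a simplified stand-in for the talk's estimator (the penalty value and names are illustrative); it visits each of the O(N) dyadic intervals once:

```python
import numpy as np

def best_partition(counts, pen):
    """Penalized-likelihood fit of a piecewise-constant intensity on
    recursive dyadic partitions of Poisson count data: at each dyadic
    interval, compare one constant fit (cost + pen) with splitting
    into the two best-fit halves."""
    def cost_const(c, length):
        # negative Poisson log-likelihood of a constant fit, up to
        # terms that do not depend on the partition
        lam = max(c, 1e-12) / length
        return -(c * np.log(lam) - lam * length)

    def solve(lo, hi):
        c = counts[lo:hi].sum()
        merged = cost_const(c, hi - lo) + pen
        if hi - lo == 1:
            return merged, [(lo, hi)]
        mid = (lo + hi) // 2
        cl, pl = solve(lo, mid)
        cr, pr = solve(mid, hi)
        if cl + cr < merged:
            return cl + cr, pl + pr
        return merged, [(lo, hi)]

    return solve(0, counts.size)

rng = np.random.default_rng(3)
lam = np.where(np.arange(64) < 40, 2.0, 12.0)   # one jump ("burst onset")
counts = rng.poisson(lam)
cost, intervals = best_partition(counts, pen=4.0)
print(intervals)   # a few constant pieces, with a split near the jump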

BATSE Trigger 845: piecewise-linear multiscale MPLE vs. piecewise-polynomial multiscale MPLE reconstructions.

Inverse Problems
Observation model: y = Kx + n, with noise n possibly non-Gaussian.
Goal: apply the same multiscale MPLE methodology.
Difficulty: the likelihood function does not admit a multiscale factorization due to the presence of K (for example, a convolution or projection operator couples the components).

EM Algorithm: Basic Idea

EM Algorithm
We do not actually measure the direct data z, so it must be estimated. Given the observed data y and a current estimate $\hat{\theta}^{(t)}$, define $Q(\theta; \hat{\theta}^{(t)}) = \mathrm{E}\big[\log p(z \mid \theta) \,\big|\, y, \hat{\theta}^{(t)}\big]$.
The EM algorithm alternates between:
E step: computing $Q(\theta; \hat{\theta}^{(t)})$
M step: $\hat{\theta}^{(t+1)} = \arg\max_{\theta} \{ Q(\theta; \hat{\theta}^{(t)}) - \mathrm{pen}(\theta) \}$
Monotonicity: each iteration never decreases the penalized likelihood.

Gaussian Model
Observation model: y = Kx + n, with n ~ N(0, σ²I).
Wavelet parameterization: x = Wθ, with W an orthonormal wavelet transform.
Reformulation with an unobserved "direct" image z: z = Wθ + n₁ and y = Kz + n₂, where the "inner noise" n₁ ~ N(0, α²I) is white and the "outer noise" n₂ has covariance σ²I − α²KKᵀ, so that the split reproduces the original noise.

EM Algorithm – Gaussian Case
Initialize: $\hat{\theta}^{(0)}$.
E step (linear filtering, O(N log N)): $\hat{z}^{(t)} = W\hat{\theta}^{(t)} + \frac{\alpha^2}{\sigma^2} K^T \big( y - K W \hat{\theta}^{(t)} \big)$.
M step (basic hard thresholding, O(N)): hard-threshold the wavelet coefficients of $\hat{z}^{(t)}$.
The θ-dependent part of the "complete data" log-likelihood is a linear function of z, so the E step only requires the conditional mean $\hat{z}^{(t)} = \mathrm{E}[z \mid y, \hat{\theta}^{(t)}]$.
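
A compact sketch of this loop for 1-D circular deblurring, again assuming PyWavelets; the threshold and α are illustrative, and α must satisfy α² ≤ σ²/‖K‖² for the noise split to be valid:

```python
import numpy as np
import pywt

def em_deblur(y, kernel, sigma, alpha, thr, n_iter=50):
    """EM restoration sketch (Gaussian case): the E step is a linear
    (Landweber-type) filtering step estimating the unobserved 'direct'
    image z; the M step hard-thresholds its wavelet coefficients,
    exactly as in direct-data denoising."""
    H = np.fft.fft(kernel, n=y.size)
    K = lambda v: np.real(np.fft.ifft(H * np.fft.fft(v)))
    Kt = lambda v: np.real(np.fft.ifft(np.conj(H) * np.fft.fft(v)))
    x = y.copy()                                          # initialize
    for _ in range(n_iter):
        z = x + (alpha**2 / sigma**2) * Kt(y - K(x))      # E step: O(N log N)
        coeffs = pywt.wavedec(z, "haar")                  # M step: O(N)
        coeffs[1:] = [pywt.threshold(c, thr, mode="hard") for c in coeffs[1:]]
        x = pywt.waverec(coeffs, "haar")
    return x

rng = np.random.default_rng(4)
x_true = np.repeat(rng.standard_normal(16), 16)           # blocky, N = 256
kernel = np.ones(9) / 9.0                                 # ||K|| = 1 here
H = np.fft.fft(kernel, n=x_true.size)
y = np.real(np.fft.ifft(H * np.fft.fft(x_true))) + 0.05 * rng.standard_normal(256)
x_hat = em_deblur(y, kernel, sigma=0.05, alpha=0.04, thr=0.1)
print("restored MSE:", np.mean((x_hat - x_true) ** 2))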

Example – Image Deblurring (BSNR = 40 dB): original; blurred + noise, unrestored SNR = 15 dB; restored, SNR = 21 dB.

Example – Image Deblurring: initializing the EM with an "aggressive" linear Wiener restoration; Wiener restored SNR = 18 dB; EM restored SNR = 22 dB.

Example – Satellite Image Restoration: original (© CNES); simulated observation, unrestored SNR = 18 dB; EM restored SNR = 26 dB.

Poisson Model
Observation model: y ~ Poisson(Kx).
Multiscale parameterization of the unobserved image: x = x(θ).
Complete-data likelihood: the unobserved direct counts z ~ Poisson(x) play the role of the "inner noise"; the observed counts y, obtained by randomly distributing the z counts according to K, play the role of the "outer noise".

EM Algorithm – Poisson Case
Initialize: $\hat{\theta}^{(0)}$.
E step (linear filtering, O(N log N)): compute $\hat{z}^{(t)} = \mathrm{E}[z \mid y, \hat{\theta}^{(t)}]$.
M step (N independent GLRTs, O(N)).
The "complete data" log-likelihood is a linear function of z, so the E step again reduces to this conditional mean.
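
A sketch of the Poisson loop: the E step is the familiar Richardson-Lucy-type conditional mean (valid when the columns of K sum to one), and the M step is approximated here by the penalized dyadic piecewise-constant fit, a simplified stand-in for the talk's N independent GLRTs:

```python
import numpy as np

def dyadic_fit(z, pen):
    """Piecewise-constant intensity on the best pruned dyadic partition
    (same dynamic program as in the burst-analysis sketch above)."""
    def solve(lo, hi):
        c, L = z[lo:hi].sum(), hi - lo
        lam = max(c, 1e-12) / L
        merged = -(c * np.log(lam) - c) + pen
        fit = np.full(L, lam)
        if L > 1:
            mid = (lo + hi) // 2
            cl, fl = solve(lo, mid)
            cr, fr = solve(mid, hi)
            if cl + cr < merged:
                return cl + cr, np.concatenate([fl, fr])
        return merged, fit
    return solve(0, z.size)[1]

def em_poisson(y, K, n_iter=30, pen=4.0):
    """Poisson EM sketch: the E step is linear filtering giving the
    conditional mean of the unobserved direct counts z; the M step is a
    penalized multiscale fit of the intensity to z."""
    x = np.full(K.shape[1], y.sum() / K.shape[1])         # flat initialization
    for _ in range(n_iter):
        z = x * (K.T @ (y / np.maximum(K @ x, 1e-12)))    # E step: E[z | y, x]
        x = np.maximum(dyadic_fit(z, pen), 1e-12)         # M step
    return x

rng = np.random.default_rng(5)
n = 64
lam = np.where(np.arange(n) < 24, 3.0, 15.0)              # one jump
K = np.zeros((n, n))                                      # column-stochastic blur
for i in range(n):
    for o in (-1, 0, 1):
        K[(i + o) % n, i] = 1 / 3
y = rng.poisson(K @ lam)
print(np.round(em_poisson(y, K)[::8], 1))                 # two intensity levels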

Example – Tomographic Reconstruction: Shepp-Logan phantom; its sinogram; the projection process with photon counting.

Example – Tomographic Reconstruction

Open Problems
What can we say about optimality? Examples: minimax-optimal brain reconstruction; information-theoretic restoration / segmentation in remote sensing.
Key challenges: indirect measurements (adaptation of risk bounds, uniqueness); complex image manifolds (image edges are curves).

Future Work
- determination of convergence rates and performance bounds in inverse problems
- development of practical algorithms that achieve near-optimal rates
- multiscale image segmentation schemes based on recursive partitioning
- natural image modeling and alternative representations (e.g., edgelets, curvelets)
www.dsp.rice.edu/~nowak