Statistical Methods for Image Reconstruction

Statistical Methods for Image Reconstruction. MIC 2007 Short Course, October 30, 2007, Honolulu.

Topics
Lecture I: Warming Up (Larry Zeng)
Lecture II: ML and MAP Reconstruction (Johan Nuyts)
Lecture III: X-Ray CT Iterative Reconstruction (Bruno De Man)
Round-table discussion of topics of interest

Warming Up
Larry Zeng, Ph.D., University of Utah

Problem to solve: computed tomography, where a continuous unknown image must be recovered from discrete measurements.

Analytic Algorithms
For example, FBP (filtered backprojection).
Treat the unknown image as continuous.
Point-by-point reconstruction (at arbitrary points); regular grid points are commonly chosen for display.

Iterative Algorithms
For example, ML-EM, OS-EM, ART, CG, …
Discretize the image into pixels and solve the imaging equations AX = P, where
X = unknowns (pixel values), P = projection data, A = imaging system matrix.

Example: a 2×2 image with pixel values x1, …, x4 and four projection measurements p1, …, p4, giving AX = P.

Example: projections 5, 4, 3, 2 (the row sums and column sums of the 2×2 image). Rank(A) = 3 < 4, and the system is not consistent (the row sums total 9 but the column sums total 5), so no exact solution exists.

Solving AX = P
In practice, A is not invertible (not square, or not full rank), so a generalized inverse is used: X = A†P.
The Moore-Penrose inverse A† satisfies AA†A = A and A†AA† = A†; when A has full column rank, A† = (AᵀA)⁻¹Aᵀ.
This gives the least-squares solution (Gaussian noise model).

Least-squares solution for the example: x1 = 2.25, x2 = 1.75, x3 = 1.75, x4 = 1.25. The fitted row sums and column sums both become 4 and 3, splitting the inconsistency between the measured data (5, 4, 3, 2).
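A numeric check of this example (assuming the 2×2 layout with row-sum and column-sum projections, which reproduces the slide's numbers):

```python
import numpy as np

# 2x2 image [x1 x2; x3 x4]; p1, p2 = row sums, p3, p4 = column sums (assumed layout)
A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1]], dtype=float)
p = np.array([5.0, 4.0, 3.0, 2.0])

print(np.linalg.matrix_rank(A))   # 3 < 4: rank deficient
x = np.linalg.pinv(A) @ p         # Moore-Penrose least-squares solution
print(x)                          # [2.25 1.75 1.75 1.25]
```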

The most significant motivation for using an iterative algorithm: usually, iterative algorithms give better reconstructions than analytic algorithms.

Computer Simulation Examples (Fessler)
FBP = filtered backprojection (analytic)
PWLS = penalized data-weighted least squares (iterative)
PL (or MAP) = penalized likelihood (iterative)

Why are iterative algorithms better than analytic algorithms? The system matrix A is more flexible than the assumptions built into an analytic algorithm; it is much harder for an analytic algorithm to handle some realistic situations.

Attenuation in SPECT

Attenuation in SPECT
Photon attenuation is spatially variant.
Non-uniform attenuation modeled in the iterative algorithm's system matrix A (1970s).
Uniform attenuation handled in an analytic algorithm (1980s).
Non-uniform attenuation handled in an analytic algorithm (2000s).

Distance-Dependent Blurring (figure: the response is sharp for sources close to the collimator and blurred for sources far from it).

Distance-Dependent Blurring: the system matrix A models the blurring.

Distance-Dependent Blurring: analytic treatment uses the frequency-distance principle (an approximation); in the 2-D frequency domain of the sinogram, slope ~ distance.

Truncation
(Analytic) The old FBP algorithm does not allow truncation.
(Iterative) The system matrix models only the measured data and ignores unmeasured data.
(Analytic) The new DBP (derivative backprojection) algorithm can handle some truncation situations.

Scatter (figure: a scattered photon path vs. a primary photon path).

Scatter
(Iterative) The system matrix A can model scatter using an “effective scattering source” or Monte Carlo.
(Analytic) Still no known method, other than preprocessing the data using multiple energy windows.

In modeling the physics and the imaging geometry, analytic methods lag behind iterative ones.

Analytic algorithm — “Open-loop system”: Data → Filtering → Backprojection → Image.

Iterative algorithm — “Closed-loop system”: Image → (projector A) → Projection → Compare with Data → (backprojector Aᵀ) → Update → Image.
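A minimal sketch of one pass around this loop (a generic additive update under assumed dense-matrix inputs; not the specific algorithm used in the course):

```python
import numpy as np

def closed_loop_step(A, x, p, step=0.1):
    """One iteration: forward-project, compare with data, backproject, update."""
    residual = p - A @ x                 # compare measured data with projection of image
    return x + step * (A.T @ residual)   # backproject the mismatch and update the image
```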

Main Difference Analytic algorithms do not have a projector A, but have a set of assumptions. Iterative algorithms have a projector A.

Under what conditions will the analytic algorithm outperform the iterative algorithm? When the analytic assumptions (e.g., sampling, geometry, physics) are satisfied: the system matrix A is always an approximation (e.g., the pixel model assumes uniform activity within each pixel).

Noise

Noise Considerations
Iterative: objective function (e.g., likelihood).
Iterative: regularization (very effective!).
Analytic: filtering (assumes spatially invariant noise; not as good).

Modeling noise in an iterative algorithm (hence “statistical” algorithm)
Example: a one-pixel image (i.e., the total count rate is the only unknown). Three measurements:
1100 counts in 10 s (110/s)
100 counts in 1 s (100/s)
15000 counts in 100 s (150/s)
What is the best estimate of counts per second?

m1 = 1100 counts, σ²(m1) = 1100 (Poisson), x1 = 1100/10 = 110/s, σ²(x1) = σ²(m1)/10² = 1100/100 = 11
m2 = 100 counts, σ²(m2) = 100, x2 = 100/1 = 100/s, σ²(x2) = σ²(m2)/1² = 100
m3 = 15000 counts, σ²(m3) = 15000, x3 = 15000/100 = 150/s, σ²(x3) = σ²(m3)/100² = 15000/10000 = 1.5

Objective Function: for this example, minimize Σᵢ (x − xᵢ)² / σ²(xᵢ), weighting each measurement by the inverse of its variance.
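A minimal numeric sketch of this weighted fit, using the variances computed on the previous slide:

```python
import numpy as np

x   = np.array([110.0, 100.0, 150.0])  # rate estimates (counts/s)
var = np.array([ 11.0, 100.0,   1.5])  # their variances (from the Poisson model)

w = 1.0 / var                          # weight = inverse variance
x_hat = np.sum(w * x) / np.sum(w)      # minimizer of sum_i (x - x_i)^2 / var_i
print(x_hat)                           # ~144.6 counts/s: the 100 s scan dominates
```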

Generalization: if you have N unknowns, make many more than N measurements, so that the measurements are redundant, and put more weight on the measurements that you trust more.

Geometric Illustration: two unknowns x1, x2. Each measurement = one linear equation = one line. We make 3 measurements, so we have 3 lines.

Noise-free (consistent): the three lines intersect at a single point, the solution (x1, x2).

Noisy (inconsistent): the three lines no longer meet at a single point; the errors e1, e2, e3 measure how far each line is from the chosen solution.

If you don’t have redundant measurements, the noise model (e.g., σ²) does not make any difference.

In practice, # of unknowns ≈ # of measurements. However, iterative methods still outperform analytic methods. Why?

Answer: Regularization & Constraints
Helpful constraints:
Non-negativity (xi ≥ 0)
Image support (xk = 0 for pixels k outside the support)
Bounds on values (e.g., in a transmission map)
A prior

Some common regularization methods (1) — stop early. Why stop? Doesn’t the algorithm converge? Yes, in many cases (e.g., ML-EM) it does. Even when an ML-EM reconstruction becomes noisy, its likelihood function is still increasing. But the ultimate (e.g., ML) solution is too noisy; we don’t like it.

Iterative Reconstruction: An Example (ML-EM). (Figure sequence: the reconstructed image at successive iterations.)

Stopping early is like lowpass filtering: low-frequency components converge faster than high-frequency components.

OR: Over-iterate, then filter

Regularization using Prior. What is a prior? Example: you want to estimate tomorrow’s temperature, you know today’s temperature, and you assume that tomorrow’s temperature is pretty close to today’s.

Regularization using Prior MAP algorithm = Maximum A Posteriori algorithm = Bayesian

How to select a prior? Not easy; it is an art. The objective function combines a data-matching term (with a noise model) and a prior-encouragement term built from a potential function V. Example: if V(x) = x², the prior enforces smoothness.

Edge-Preserving Prior: if V(x) = |x|, it preserves edges and reduces noise.

How does V know where the edge is? V(x) = |x| grows much more slowly than V(x) = x². Relative to the quadratic, V(x) = |x| suppresses small jumps (noise) while penalizing large jumps (edges) much less. The shape of V(x) is important.
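A tiny numeric sketch of this point (illustrative jump sizes only):

```python
# Compare how the two potentials penalize a small jump (noise) vs. a large jump (edge)
for jump in (0.1, 10.0):
    print(f"jump={jump}: quadratic={jump**2:.2f}, absolute={abs(jump):.2f}")
# quadratic: 0.01 -> 100.00 (factor 10000); absolute: 0.10 -> 10.00 (factor 100),
# so with V(x) = |x| an edge is not over-penalized relative to noise
```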

What is an ML algorithm? An algorithm that maximizes the likelihood function.

Example: Gaussian noise. The likelihood function is a conditional probability, L(X) = p(P | X). Taking the log shows that maximizing the likelihood function L(X) is equivalent to solving a least-squares problem.
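A sketch of the step this slide summarizes (standard Gaussian-likelihood algebra; the σᵢ² are the per-measurement noise variances):

```latex
L(X) = p(P \mid X) = \prod_i \frac{1}{\sqrt{2\pi\sigma_i^2}}
  \exp\!\left(-\frac{(p_i - [AX]_i)^2}{2\sigma_i^2}\right),
\qquad
\ln L(X) = -\sum_i \frac{(p_i - [AX]_i)^2}{2\sigma_i^2} + \mathrm{const}.
```

Maximizing ln L(X) therefore minimizes the weighted sum of squared residuals, i.e., a least-squares problem.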

Least-Squares = ML for Gaussian: an algorithm that minimizes the weighted sum of squared residuals is an ML algorithm for independent Gaussian noise. One can instead set up a Poisson likelihood function and find an ML algorithm for that; the well-known ML-EM algorithm is one such algorithm.
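For reference, a minimal sketch of the classical ML-EM update (a textbook form, assuming a dense system matrix A and Poisson data p; not a clinical implementation):

```python
import numpy as np

def ml_em(A, p, n_iter=50, eps=1e-12):
    """Classical ML-EM: x <- x * A^T(p / (A x)) / (A^T 1), for p ~ Poisson(Ax)."""
    x = np.ones(A.shape[1])              # strictly positive starting image
    sens = A.T @ np.ones(A.shape[0])     # sensitivity image A^T 1
    for _ in range(n_iter):
        ratio = p / np.maximum(A @ x, eps)   # compare data with forward projection
        x = x * (A.T @ ratio) / np.maximum(sens, eps)
    return x
```

The multiplicative update automatically keeps the image non-negative, one of the helpful constraints listed earlier.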

What is a MAP algorithm? MAP = maximum a posteriori: an algorithm that maximizes the likelihood function combined with a prior. For example, an algorithm that minimizes a data-fidelity term plus a prior (penalty) term is a MAP algorithm.
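A generic form of such an objective (assuming a weighted least-squares data term; U is the prior's penalty function and β its weight):

```latex
\hat{X} = \arg\min_X \; \sum_i \frac{(p_i - [AX]_i)^2}{2\sigma_i^2} \;+\; \beta\, U(X).
```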

What is ill-conditioning? Small noise in the data causes large noise in the reconstruction; the SVD makes this visible, since small singular values of A greatly amplify noise.

Regularization changes an ill-conditioned problem into a different well-conditioned problem.

Example: a two-unknown system where a small data error produces a big error in the solution.
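The slide's own numbers are not recoverable, but a toy 2×2 system (assumed values) shows the effect:

```python
import numpy as np

A = np.array([[1.00, 1.00],
              [1.00, 1.01]])          # nearly singular (assumed toy example)
x_true = np.array([1.0, 1.0])
p = A @ x_true                        # noiseless data

print(np.linalg.cond(A))              # ~400: ill-conditioned
dp = np.array([0.005, -0.005])        # tiny data perturbation
print(np.linalg.solve(A, p + dp))     # ~[2.005, 0.0]: big error in the solution
```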

Assume a prior: x1 and x2 are close. Objective function: the data-fidelity term plus a penalty β(x1 − x2)². Note: β is NOT a Lagrange multiplier.

To minimize the objective function, we set its gradient to zero and obtain a different linear problem.

MAP Solutions, noiseless data (d1 = d2 = 0):
β: 0.01, 0.1, 1, 10, 100
Cond. #: 50000, 200, 22, 5.7, …
x1: 0.978, …, 1.094
x2: 22.222, 1.178, 1.103, 1.095, …
As β grows, the condition number falls and the solution settles near x1 ≈ x2 ≈ 1.1.

MAP Solutions, noisy data (d1 = −d2 = 0.2); paired values read noiseless / noisy:
β: 0.01, 0.1, 1, 10, 100
Cond. #: 50000, 200, 22, 5.7, …
x1: 0.978 / 0.733, …, 1.094 / 1.095
x2: 22.222 / 66.667, 1.178 / 1.357, 1.103 / 1.122, 1.098 / 1.096
With small β the noise is hugely amplified; with large β the solution barely moves.

Observations
Using a prior changes the original problem: even for noiseless data, the solutions are different.
The solution is more stable when there is noise.
It is an art.

Observations: in fact, a closed-form solution exists for this PWLS (penalized weighted least-squares) problem: X = (AᵀWA + βR)⁻¹AᵀWP, where W holds the data weights and R encodes the penalty.
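A minimal sketch of that closed form, reusing the toy system above; R = [[1, −1], [−1, 1]] encodes the (x1 − x2)² penalty (all inputs here are illustrative assumptions):

```python
import numpy as np

def pwls_solve(A, W, p, R, beta):
    """Closed-form PWLS: argmin_X (P - AX)^T W (P - AX) + beta * X^T R X."""
    return np.linalg.solve(A.T @ W @ A + beta * R, A.T @ W @ p)

R = np.array([[1.0, -1.0],
              [-1.0, 1.0]])                  # X^T R X = (x1 - x2)^2
A = np.array([[1.00, 1.00],
              [1.00, 1.01]])                 # ill-conditioned toy system from before
p = A @ np.array([1.0, 1.0]) + np.array([0.005, -0.005])   # noisy data
print(pwls_solve(A, np.eye(2), p, R, beta=1.0))            # ~[1.0, 1.0]: stable
```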

Can analytic algorithms do these? Not yet; we do not even know how to correctly enforce non-negativity, the image support, etc., analytically.

On Condition Numbers: AX = P. To first order, error propagates as ‖ΔX‖/‖X‖ ≤ κ · (‖ΔP‖/‖P‖ + ‖ΔA‖/‖A‖), where κ is the condition number of A. Trade-off: introducing β reduces κ but increases ‖ΔA‖.

More on Condition Numbers: AX = P, with the same error propagation and condition number κ. Trade-offs: more accurate modeling (i.e., significantly smaller ‖ΔA‖) makes A less sparse, and A may then have a larger κ; even so, the overall error of X may be reduced (better resolution and lower noise).

Ray-Driven & Pixel-Driven Projector (A) / Backprojector (Aᵀ). Ray-driven aij: along a projection ray, calculate how much pixel i contributes to detector bin j. Usually line-length or strip-area weighting.

Line-Length Weighting: aij is the intersection length of ray j with pixel i; the projection (A) and the backprojection (Aᵀ) use the same weights aij.

Area Weighting: aij is the overlap area between the strip of ray j and pixel i; again, the projection (A) and the backprojection (Aᵀ) share the same weights.

Rotation-Based Projector/Backprojector
Very easy to implement distance-dependent blurring.
Fast.
Warping is required for convergent beams.

Pixel-Driven Backprojector: widely used in analytic algorithms; aij = 1, and interpolation is required on the detector.
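A minimal sketch of a pixel-driven backprojector (assumptions not taken from the slides: parallel-beam geometry, square n×n image, detector bins one pixel apart and centered, linear interpolation):

```python
import numpy as np

def pixel_driven_backprojection(sino, angles, n):
    """Backproject a sinogram of shape (len(angles), n_det) onto an n x n grid."""
    img = np.zeros((n, n))
    c = (n - 1) / 2.0
    ys, xs = np.mgrid[0:n, 0:n].astype(float)
    xs -= c
    ys -= c
    n_det = sino.shape[1]
    dc = (n_det - 1) / 2.0
    for k, th in enumerate(angles):
        t = xs * np.cos(th) + ys * np.sin(th) + dc   # detector coordinate of each pixel
        i0 = np.clip(np.floor(t).astype(int), 0, n_det - 2)
        w = np.clip(t - i0, 0.0, 1.0)                # linear interpolation on the detector
        img += (1 - w) * sino[k, i0] + w * sino[k, i0 + 1]
    return img * (np.pi / len(angles))
```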

If you don’t know what you are doing, do not use an “iterative algorithm backprojector” (Aᵀ), e.g., a line-length-weighted backprojector, in an analytic algorithm. Conversely, an “analytic algorithm backprojector” is not Aᵀ, and you may not want to use it in an iterative algorithm.

Some Common Terms Used in Iterative Algorithms
Pixels (voxels): easy to use, but have unrealistic high-frequency edges.

Blobs: no sharp edges, and neighboring blobs overlap (figure: voxel vs. blob profiles along x, y, z).

Natural Pixels: defined by the projection paths, so they depend on the collimator geometry; the image is expanded over natural pixels as f = Σ wi xi.

Iterative Methods Summary
Can incorporate: imaging physics, irregular imaging geometry, prior information.
But: it is an art to set things up; the algorithms are complex; computation time is long.