Research Update: Optimal 3TI MPRAGE T1 Mapping and Differentiation of Bloch Simulations
Aug 4, 2014, Jason Su

Cramér-Rao Lower Bound

A key theorem in estimation theory and statistics: the Cramér-Rao lower bound (CRLB) provides a lower bound on the variance of any estimator in the presence of noise. It can:
- Analyze the effectiveness of parameter mapping methods and protocols
- Optimize their design by finding the protocol that minimizes the CRLB and provides the most precision per unit scan time

However, it has seen limited use in MRI due to the difficulty of analytically deriving the bound for complex signal behavior.

Kingsley. Concepts Magn. Reson. 1999;11:243-276.

Theory: A Common Formulation

Assumptions:
- Unbiased estimator: a desirable property, the mean of many samples converges to the true parameter value
- Normally distributed noise: for magnitude images with adequate SNR this holds true
- Uncorrelated measurement noise: separate images in a scan protocol have independent noise

J_ij = ∂g_i(Θ; protocol) / ∂Θ_j
F = Jᵀ Σ_noise⁻¹ J
Σ_Θ ≥ F⁻¹ (as matrices)
σ_Θi² ≥ [F⁻¹]_ii

Notes: These assumptions are fairly mild and reasonable for MR in a good SNR regime. They are by no means necessary; the CRLB can analyze biased and non-normal cases as well, though the biased case is harder and often requires Monte Carlo simulation. More specifically, the standard deviation of your T1 estimate is at best the square root of the corresponding diagonal entry of the inverse Fisher matrix.

Theory: Fisher Information Matrix

J_ij(Θ) = ∂g_i(Θ; protocol) / ∂Θ_j
F = Jᵀ Σ_noise⁻¹ J

Definitions:
- g_i: the signal equation for the i-th image
- Θ: the set of parameters, e.g. T1 and M0
- J: the Jacobian of the scan protocol, i.e. the derivatives of each acquired signal equation with respect to each tissue parameter input
- Σ_noise: the noise covariance between the images; diagonal when the images are uncorrelated

Interpretation:
- J captures the sensitivity of the signal equation to changes in a parameter
- F is the Fisher matrix, representing the information captured by a protocol
- Its "invertibility", or conditioning, measures how separable the parameters are from each other: the specificity of the measurement
- This can be confounded by nuisance parameters that have little diagnostic value but must still be estimated, such as M0

Notes: This is essentially the heart of the CRLB. For a given tissue, Θ means anything that affects your signal equation: T1s, T2s, diffusion, MT, etc. You can think of the information about a parameter of interest as its projection onto the space orthogonal to the space spanned by the nuisance variables, e.g. T1 as a nuisance variable in B1 mapping. http://www.colorado.edu/isl/papers/info/node2.html

Theory: Cramér-Rao Lower Bound

Σ_Θ ≥ F⁻¹ (as matrices)
σ_Θi² ≥ [F⁻¹]_ii

The variance of an estimate of a parameter Θ_i is no better than the corresponding i-th diagonal entry of the inverse of the Fisher information matrix.
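As a concrete illustration of the chain J → F → F⁻¹, here is a minimal sketch in plain Python. It uses a simple inversion-recovery signal as a hypothetical stand-in for the MPRAGE equations, a finite-difference Jacobian in place of automatic differentiation, and assumes uncorrelated Gaussian noise of equal variance:

```python
import math

def ir_signal(TI, T1, M0):
    # Simple inversion-recovery model (illustrative stand-in for the
    # MPRAGE signal equations discussed in this talk)
    return M0 * (1.0 - 2.0 * math.exp(-TI / T1))

def jacobian(TIs, T1, M0, h=1e-3):
    # Central-difference Jacobian: one row per image, columns (T1, M0)
    J = []
    for TI in TIs:
        dT1 = (ir_signal(TI, T1 + h, M0) - ir_signal(TI, T1 - h, M0)) / (2 * h)
        dM0 = (ir_signal(TI, T1, M0 + h) - ir_signal(TI, T1, M0 - h)) / (2 * h)
        J.append([dT1, dM0])
    return J

def crlb_sigma_t1(TIs, T1, M0, sigma_noise):
    # F = J^T Sigma^-1 J with Sigma = sigma^2 * I (uncorrelated noise)
    J = jacobian(TIs, T1, M0)
    F = [[0.0, 0.0], [0.0, 0.0]]
    for row in J:
        for a in range(2):
            for b in range(2):
                F[a][b] += row[a] * row[b] / sigma_noise**2
    # Invert the 2x2 Fisher matrix; [F^-1]_00 bounds Var(T1)
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    return math.sqrt(F[1][1] / det)

sigma = crlb_sigma_t1([200.0, 800.0, 2400.0], T1=1300.0, M0=1.0, sigma_noise=0.01)
```

Because Fisher information from independent images adds, extending the protocol with another TI can only tighten the bound, which makes a handy sanity check.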

The Challenge: Computing the Jacobian

Classical computer-based differentiation methods are problematic for this task, which has previously limited the application of the CRLB:
- Analytic or symbolic differentiation: difficult by hand, inefficient by computer program, and slow for partial derivatives of multi-input/multi-output functions
- Numeric differentiation: prone to round-off errors

Automatic differentiation is an algorithm that solves all of these problems:
- Calculation times comparable to first-difference numeric differentiation
- But ~10^8 times more accurate
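The accuracy claim is easy to see with forward-mode AD's core trick, dual numbers: each value carries its derivative through the arithmetic, so there is no finite-difference step and no truncation error. A minimal, self-contained sketch (not any of the libraries discussed later; `dexp` and the toy `signal` function are illustrative):

```python
import math

class Dual:
    # Minimal forward-mode AD: a value paired with its derivative
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

    def __truediv__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val / other.val,
                    (self.dot * other.val - self.val * other.dot) / other.val**2)

def dexp(x):
    # exp with the chain rule applied to the derivative part
    return Dual(math.exp(x.val), x.dot * math.exp(x.val))

def signal(TI, T1):
    # toy signal equation: 1 - 2*exp(-TI/T1)
    return 1.0 + (-2.0) * dexp((-1.0 * TI) / T1)

# d(signal)/dT1 at T1 = 1300 for TI = 800: seed T1 with dot = 1
d = signal(Dual(800.0), Dual(1300.0, 1.0))
```

`d.val` is the signal value and `d.dot` matches the analytic derivative to machine precision, whereas a first-difference estimate carries a truncation error set by the step size.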

Optimal Experimental Design

How do we design an experiment to give us the highest-SNR measurement of a quantity? If you only had one shot to examine something for the next 20 years, how would you collect the best data possible?

Optimal Experimental Design

The effect of noise on measurements of a nonlinear function is hard to estimate:
- Often analyzed with computationally expensive Monte Carlo simulations
- Impractical for evaluating many scenarios
- Unsuitable for the problem of finding the most efficient set of measurement points

We can relax the problem: use the Cramér-Rao lower bound instead to predict our experimental precision. This is much more tractable; we just need some derivatives. This formalism has been used in many scientific fields, including dark-energy experiments in cosmology.

J_ij = ∂g_i(Θ; protocol) / ∂Θ_j
F = Jᵀ Σ_noise⁻¹ J
Σ_Θ ≥ F⁻¹ (as matrices)
σ_Θi² ≥ [F⁻¹]_ii

Atkinson and Donev. Optimum Experimental Designs. Oxford: Oxford University Press; 1992.

The Framework

We created a framework that marries the CRLB with automatic differentiation:
- It enables calculation of the unbiased CRLB for arbitrary signal equations with no more effort than implementing the equation under study
  - Including sequences that can only be described with Bloch simulations
- Python was used, with PyAutoDiff and SciPy for general function optimization

This enables the analysis and optimization of arbitrary parameter mapping experiments:
- Including highly exotic techniques like MR Fingerprinting
- Or one can design new methods by evaluating the information provided by a new pulse sequence

Git it: http://sujason.web.stanford.edu/quantitative/

A Typical Setup

- Implement a function that computes the signal(s) for your pulse sequence(s)
  - Sometimes as simple as a one-line signal equation for magnitude data, but it can also be more complex
- Create a cost function
  - We provide a one-line wrapper for your signal function to compute its derivatives
  - Use our functions to find the CRLB of the scan protocol with these derivatives
  - Weight the variances of the estimated parameters according to your interest

A Typical Setup: Optimization

- Set up constraints on your solution space
  - We provide utilities to accomplish this easily using the actual names of your function inputs
- Feed the cost function into a general solver
- Find the optimal choice of scan parameters, number of images, and fraction of time to spend on each image series
- Profit!
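A hypothetical end-to-end sketch of this workflow in plain Python, not the actual framework API: a signal function, a CRLB-based cost, and a brute-force stand-in for the general solver. The IR model, grid values, and function names are all illustrative:

```python
import itertools
import math

def ir_signal(TI, T1, M0):
    # Step 1: a one-line signal equation (illustrative toy model)
    return M0 * (1.0 - 2.0 * math.exp(-TI / T1))

def sigma_t1(TIs, T1=1300.0, M0=1.0, noise=0.01, h=1e-3):
    # Step 2: cost = CRLB sigma_T1 from a finite-difference Jacobian
    # (the framework would use automatic differentiation here)
    F = [[0.0, 0.0], [0.0, 0.0]]
    for TI in TIs:
        g = [(ir_signal(TI, T1 + h, M0) - ir_signal(TI, T1 - h, M0)) / (2 * h),
             (ir_signal(TI, T1, M0 + h) - ir_signal(TI, T1, M0 - h)) / (2 * h)]
        for a in range(2):
            for b in range(2):
                F[a][b] += g[a] * g[b] / noise**2
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    return math.sqrt(F[1][1] / det)

def optimize_protocol(ti_grid):
    # Step 3: stand-in for a general solver: exhaustive search over TI pairs
    best = min(itertools.combinations(ti_grid, 2), key=sigma_t1)
    return best, sigma_t1(best)
```

A real solver would treat the TIs as continuous variables under constraints; the exhaustive search just keeps the sketch dependency-free.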

Application: 3TI MPRAGE

Liu et al. performed a brute-force optimization of the experiment using a noise function:
- I'm not sure if the noise function is exact
- Sampled on a grid and took the minimum
- From this he drew some heuristics about what the optimal values of TI and TS should be

I reproduce this analysis using the experimental design framework:
- With greater numerical accuracy (his parameters were limited by the sampling grid)
- And extend it to more complex scenarios

3TI MPRAGE: Setup

- The MPRAGE signal equation was implemented in 3 different formulations from the literature to ensure correctness
- For unbiased CRLB analysis, I think we should use the raw signal equations, before any image combination:
  - Fisher information/CRLB describes the sensitivity of the actual images collected
  - This represents the intrinsic lower bound of the pulse sequence itself to changes in T1, rather than of a particular reconstruction implementation
- In 3TI MPRAGE, various effects are removed or modeled out in the fitting process
  - These include M0 and, perhaps more subtly, B1+ or κ
- In such analyses there are two types of parameters: those we assume a value for and those we estimate/account for

3TI MPRAGE: Setup

- Find: TIs, TS, <acquisition time fraction>
- Estimate: T1, M0, κ
- Cost: T1/σ_T1 · sqrt(TS), the coefficient of variation of T1 normalized by acquisition time, i.e. precision efficiency
- Fixed parameters, based on Liu's marmoset protocol:
  - N = 64 readouts, centric encoding
  - α = 9°, TR = 8.45 ms
- Constraints, implemented as plug-in transforms of the input variables:
  - train_length = (N - 1)·TR = 532.35 ms
  - TS ≥ 2·train_length
  - TI ≤ TS - train_length
  - TI ≥ 10 ms (or higher for multi-T1)
- Solver:
  - Brute-force search over the number of TIs/images from 2 to 5
  - Regularization = 1e-16, checked against neighboring values 0, 1e-14, 1e-12 for consistency

Single T1 = 1300 ms [plots: variable time fraction vs. equal time fraction]

Comments

- As Liu observed, the first and last TIs should be as close to the bounds as they can get
- The middle TI should be ≈ T1
- Liu stated that TS should be 4.2·T1; this analysis shows about 4.4·T1
- Let's be a bit more rigorous about these conclusions: solve the single-T1 design problem over a range of T1s

Optimal Parameters as T1 Changes

Cost Gap

Multiple T1

- To optimize over a range of T1s, we sample the range and evaluate the CRLB of a protocol at each sample
- The cost is then a single-value summary of these precisions
- It is common to minimize either the mean or the maximum (i.e. worst-case) CoV
- T1 = 1000-2000 ms (20 points)
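This scalarization can be sketched under a hypothetical toy model (a simple IR signal with a finite-difference CRLB; all names are illustrative). The `summary` argument switches between mean and worst-case CoV:

```python
import math

def ir_signal(TI, T1, M0):
    # Toy inversion-recovery model standing in for the MPRAGE equations
    return M0 * (1.0 - 2.0 * math.exp(-TI / T1))

def cov_t1(TIs, T1, noise=0.01, h=1e-3):
    # CRLB coefficient of variation sigma_T1 / T1 for one tissue T1
    F = [[0.0, 0.0], [0.0, 0.0]]
    for TI in TIs:
        g = [(ir_signal(TI, T1 + h, 1.0) - ir_signal(TI, T1 - h, 1.0)) / (2 * h),
             (ir_signal(TI, T1, 1.0 + h) - ir_signal(TI, T1, 1.0 - h)) / (2 * h)]
        for a in range(2):
            for b in range(2):
                F[a][b] += g[a] * g[b] / noise**2
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    return math.sqrt(F[1][1] / det) / T1

def multi_t1_cost(TIs, t1_grid, summary=max):
    # Scalarize the per-T1 precisions: summary=max is the worst case;
    # pass a mean function for the average-CoV criterion
    return summary([cov_t1(TIs, t1) for t1 in t1_grid])

t1_grid = [1000.0 + 1000.0 * i / 19 for i in range(20)]  # 1000-2000 ms, 20 pts
```

The worst-case criterion guards against a protocol that is excellent at one T1 but poor at the edges of the range, at some cost in average precision.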

Multi T1 = 1000-2000 ms [plots: mean vs. max cost]

Multi T1 = 1000-2000 ms, equal time [plots: mean vs. max cost]

Multi T1 = 1000-2000 ms, free FA, equal time [plots: mean vs. max cost]

Next Steps

- Multi-κ
- Multiple-α recon?
- ARLO?

Bloch Simulation

- One of the distinct advantages of automatic differentiation is that it can handle complex programs
- Bloch simulation and extended phase graph analysis are ways to analyze the MR experiment using computer programs
- More complicated mapping methods, like MR Fingerprinting, rely on simulation to describe their signal behavior
- This provides a way to extend the experimental design framework to more exotic pulse sequences
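To make "a sequence of matrix multiplies and additions" concrete, here is a minimal hard-pulse Bloch simulation sketch in plain Python (no off-resonance or spoiling; the parameter values are illustrative). Because it is ordinary arithmetic in a loop, an AD tool can trace it just like a one-line signal equation:

```python
import math

def excite(M, alpha_deg):
    # Instantaneous rotation about x by the flip angle (hard pulse)
    a = math.radians(alpha_deg)
    Mx, My, Mz = M
    return (Mx,
            My * math.cos(a) + Mz * math.sin(a),
            -My * math.sin(a) + Mz * math.cos(a))

def relax(M, dt, T1, T2, M0=1.0):
    # Free relaxation for dt: Mxy decays with T2, Mz recovers toward M0
    E1, E2 = math.exp(-dt / T1), math.exp(-dt / T2)
    return (M[0] * E2, M[1] * E2, M[2] * E1 + M0 * (1.0 - E1))

def simulate(flips_deg, TR=10.0, T1=1300.0, T2=80.0):
    # The whole simulation is just this loop of matrix-style updates,
    # sampling |Mxy| right after each excitation
    M = (0.0, 0.0, 1.0)
    signal = []
    for alpha in flips_deg:
        M = excite(M, alpha)
        signal.append(math.hypot(M[0], M[1]))
        M = relax(M, TR, T1, T2)
    return signal
```

Feeding in an arbitrary flip-angle train, as MR Fingerprinting does, requires no change to this structure.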

Libraries

- I've been using PyAutoDiff for AD
  - It provided the most seamless conversion of functions to derivatives, with zero extra code asked of the user
  - However, it is slow, which is somewhat surprising because it uses Theano, which does a JIT compile to C
  - It is even slower for programs with loops
- So I went shopping for other AD packages:
  - theano (what I've been using)
  - ad (pure Python)
  - algopy (inspired by ADOL-C?)
  - CasADi (Python front end to a C++ library)

Simple Speed Test

- Bloch simulation is essentially a sequence of matrix multiplies and additions on an input time series
- Simple vector-accumulation test
- Others I want to try: CppAD, pyadolc

Package   F (1k x 1)   dF (1k x 1k)
Python    0.848 ms     -
theano    4.34 ms      8640 ms
algopy    271 ms       -
ad        DNF          -
CasADi    0.0573 ms    8.64 ms

Implementation

- Given that there are still other AD packages out there that may be better, the Bloch simulation is implemented to be modular, so any AD library can be plugged in
- This assumes the library has the same general structure:
  - Instantiate symbolic tracer variables
  - Use the specific math functions from the library
- With this, I have theano and CasADi versions working
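A toy illustration of that structure (the registry and all names are hypothetical, not the actual module's API): the simulation code asks the active backend for its math functions, so a plain-float run and an AD-traced run can share one implementation:

```python
import math

# Hypothetical backend registry: plain Python floats here; an AD library
# would register its own tracer-aware exp/cos/sin under another key
BACKENDS = {
    "python": {"exp": math.exp, "cos": math.cos, "sin": math.sin},
}

def make_signal(backend_name):
    # Build a signal function bound to one backend's math functions
    b = BACKENDS[backend_name]
    def signal(TR, T1):
        # Uses only backend functions, so it stays traceable by AD
        return 1.0 - b["exp"](-TR / T1)
    return signal
```

Swapping backends then requires no change to the simulation code itself, only a new entry in the registry.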

Bloch Simulation Speed

For a 1000-length input Bloch simulation:
- Hargreaves's MEX = 0.13 ms
- CasADi = 1 ms
- theano = 94 ms

For a 1000x1000 Jacobian of the simulation:
- Central difference = 0.13 ms x 2000 evaluations = 260 ms (half that for forward difference)
- CasADi = 115 ms (with no loss of accuracy!)
- theano = 2 min 54 s

Should I migrate everything to CasADi?

- Or at least away from theano?
- Ease of use?
- I've been treating the Bloch code as a separate module, so the implementation can be as complex as I can take