CRLB via Automatic Differentiation: DESPOT2
Jason Su, Jul 12, 2013

Overview
Converting Matlab DESPOT simulation functions to Python, and validation
Extension of the pyautodiff package to Jacobians
Application to the CRLB in DESPOT1, DESPOT2, and DESS
Other projects:
– 7T thalamus ROI analysis
– 7T MS segmentation

Validation: SPGR & SSFP
A sanity check that I correctly ported the single-component functions.
700 SPGR values are compared between Matlab and Python
– 10 T1s from … ms, and 70 flip angles from 1-70 deg
14,000 SSFP values are compared, each for the before- and after-RF-excitation equations
All results are the same to within machine accuracy

Validation: AD/FD vs. Symbolic
A comparison of automatic and (central) finite differentiation against symbolic differentiation:
AD matches symbolic to machine precision
Finite difference doesn't come close
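To make the gap concrete, here is a minimal sketch (my own, not the talk's code) pitting a central finite difference against the analytic dS/dT1 of the SPGR signal; spgr and dspgr_dT1 are helpers written for this illustration:

import numpy as np

def spgr(T1, alpha, TR=5.0, M0=1.0):
    # Steady-state SPGR signal; alpha in radians, times in ms
    E1 = np.exp(-TR / T1)
    return M0 * np.sin(alpha) * (1 - E1) / (1 - E1 * np.cos(alpha))

def dspgr_dT1(T1, alpha, TR=5.0, M0=1.0):
    # Analytic derivative, standing in for the symbolic reference
    E1 = np.exp(-TR / T1)
    dS_dE1 = M0 * np.sin(alpha) * (np.cos(alpha) - 1) / (1 - E1 * np.cos(alpha))**2
    return dS_dE1 * (TR / T1**2) * E1

T1, alpha = 1000.0, np.deg2rad(10)
h = 1e-3 * T1  # step size; FD accuracy hinges on this choice
fd = (spgr(T1 + h, alpha) - spgr(T1 - h, alpha)) / (2 * h)
print(fd, dspgr_dT1(T1, alpha))  # central FD carries O(h^2) truncation error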

Extending pyautodiff with Jacobian
Many packages support derivatives of multi-output functions, i.e. the Jacobian; pyautodiff is not one of them
I chose pyautodiff because it does not require the use of special elementary functions: it can analyze code with no modification
It leverages a project called Theano, which can optimize Python by dynamically generating C code
Instead of special functions, it analyzes the function's inputs and outputs to produce the graph for the chain-rule derivative calculation

Extending pyautodiff with Jacobian
I extended pyautodiff to do so. Ramifications:
The Jacobian of nearly any analytic, and even programmatic, function with an arbitrary number of inputs or outputs can be computed in 1 line
Not sure yet how to handle matrix inverse; there is support in Theano, so I think pyautodiff should too, but I still need to try it
Calculation of an entire space of Jacobians can be done with a single call, e.g. a 4D space of T1s, T2s, phase cycles, and FAs
– Compatibility with NumPy broadcasting is maintained: no for loops (see the sketch below)
This makes it possible to explore an optimal protocol by evaluating the CRLB over a space of relevant tissues
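Since I can't reproduce the extended pyautodiff call itself here, the broadcasting idea is shown with a hand-written stand-in; spgr_jacobian is my own helper with the analytic derivatives spelled out:

import numpy as np

def spgr_jacobian(M0, T1, alpha, TR=5.0):
    # Columns of J: dS/dM0 and dS/dT1; all arguments broadcast together
    E1 = np.exp(-TR / T1)
    denom = 1 - E1 * np.cos(alpha)
    dS_dM0 = np.sin(alpha) * (1 - E1) / denom
    dS_dE1 = M0 * np.sin(alpha) * (np.cos(alpha) - 1) / denom**2
    dS_dT1 = dS_dE1 * (TR / T1**2) * E1
    return np.stack([dS_dM0, dS_dT1], axis=-1)

# A whole grid in one vectorized call: 100 T1s x 10 flip angles
T1 = np.linspace(200, 4000, 100)[:, None]           # ms
alpha = np.deg2rad([2, 3, 4, 5, 6, 7, 8, 9, 10, 12])
J = spgr_jacobian(1.0, T1, alpha)
print(J.shape)  # (100, 10, 2): one 10x2 Jacobian per T1, no for loops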

CRLB: Goals
With these new tools, I set out to explore all forms of single-component DESPOT with the CRLB formalism
Expected to see results as in Deoni 2004, "Determination of Optimal Angles..."
– If so, this validates the CRLB as a useful criterion for finding an optimal protocol, because it would then be an equivalent analysis

CRLB: Parameters
TR = 5 ms, M0 = 1, T1 = … ms (100 pts)
σ_S = 2e-3 (SNR ≈ 20 over the mean of the SPGR curve)
– The "ideal" protocol is scaled by sqrt(5) to match the SNR of the others
I use the protocols considered in Deoni 2004:

spgr_angles = {
    'tuned': np.array([2, 3, 4, 5, 6, 7, 8, 9, 10, 12]),
    'ideal': np.array([3, 12]),
    'ga': np.array([2, 3, 4, 5, 7, 9, 12, 14, 16, 18]),
    'weighted ga': np.array([2, 3, 4, 5, 7, 9, 11, 14, 17, 22]),
}
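For reference, the CRLB computation itself is small. A minimal sketch, assuming i.i.d. Gaussian noise of standard deviation sigma, where J is the Jacobian of the protocol's signals with respect to the parameters:

import numpy as np

def crlb(J, sigma):
    # J: (m, p) Jacobian of m measurements w.r.t. p parameters,
    # evaluated at the true parameter values
    F = J.T @ J / sigma**2             # Fisher information matrix
    return np.diag(np.linalg.inv(F))   # variance bounds; sqrt gives std bounds

# e.g. with a (10, 2) SPGR Jacobian in (M0, T1) and sigma = 2e-3,
# np.sqrt(crlb(J, 2e-3)) lower-bounds the std of any unbiased estimator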

DESPOT1-SPGR: Constant Noise
CoV (coefficient of variation) is how Lankford presents CRLB results at a given noise level
Deoni 2004 flips the scale to T1NR and normalizes it as a factor of the SNR at the Ernst angle (with equivalent NEX)

Constant SNR: Converting to H1
n = 10 is the equivalent NEX; m_p/n is a scale factor that scales the noise down if fewer flip angles are collected, e.g. m_p = 2 for "ideal", so the variance reduces by 5x

Constant SNR: Converting to H1
Suppose we normalize such that σ_S = S_E, i.e. SNR_E = 1
– Scale the noise to the Ernst-angle signal, i.e. it becomes a function of T1
This is effectively like simulating with Σ = sqrt(m_p) S_E I

argmax_α (SPGR): Ernst Angle
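The closed form behind this slide is the classic Ernst angle, α_E = arccos(exp(-TR/T1)); a quick numerical cross-check, which also yields the peak signal S_E used in the normalization above:

import numpy as np

TR, T1, M0 = 5.0, 1000.0, 1.0               # ms
E1 = np.exp(-TR / T1)
ernst = np.arccos(E1)                        # alpha_E = arccos(exp(-TR/T1))

# Numerical argmax over a dense sweep of flip angles agrees
alphas = np.deg2rad(np.linspace(0.1, 90, 10000))
S = M0 * np.sin(alphas) * (1 - E1) / (1 - E1 * np.cos(alphas))
print(np.rad2deg(ernst), np.rad2deg(alphas[np.argmax(S)]))

S_E = M0 * np.sin(ernst) * (1 - E1) / (1 - E1 * np.cos(ernst))  # peak signal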

DESPOT1-SPGR: Constant SNR

Results
Curve shapes are similar, except for "weighted ga"
– I'm not sure how to incorporate the weighting into the CRLB
– Intuitively it would be to weight Σ, but the interpretation would then equivalently be that the noise changes with flip angle; does that make sense?
– Not exactly the same: note the intersection point
There is still a scaling difference though
– Something different about the H1 calculation?

Results
Note how fundamentally different approaches arrive at a similar answer
– Deoni 2004 approximates the solution of the curve-fitting cost function with a Taylor expansion
– The CRLB computes relationships based on the Jacobian of SPGR
– Both involve squaring derivatives of SPGR, so the agreement is maybe not surprising
I think the preliminary CRLB results with constant noise are actually more realistic/useful
– Noise is independent of tissue (or at least scanner noise dominates)

CRLB: Parameters
TR = 5 ms, M0 = 1, T1 = … ms (100 pts), T2 = … ms (50 pts)
Protocols:

despot2_angles = {
    'tuned': {
        'spgr': np.array([2, 3, 4, 5, 6, 7, 8, 9, 10, 12]),
        'ssfp_before': np.array([8, 12, 18, 24, 29, 36, 48, 61, 72, 83]),
        'ssfp_after': np.array([8, 12, 18, 24, 29, 36, 48, 61, 72, 83]),
    },
    'ideal': {
        'spgr': np.array([3, 12]),
        'ssfp_before': np.array([20, 80]),
        'ssfp_after': np.array([20, 80]),
    },
    'ga': {
        'spgr': np.array([2, 3, 4, 5, 7, 9, 12, 14, 16, 18]),
        'ssfp_before': np.array([8, 13, 19, 25, 37, 45, 58, 71, 81, 87]),
        'ssfp_after': np.array([8, 13, 19, 25, 37, 45, 58, 71, 81, 87]),
    },
    'weighted ga': {
        'spgr': np.array([2, 3, 4, 5, 7, 9, 11, 14, 17, 22]),
        'ssfp_before': np.array([5, 10, 16, 24, 32, 43, 54, 66, 75, 88]),
        'ssfp_after': np.array([5, 10, 16, 24, 32, 43, 54, 66, 75, 88]),
    },
}

DESPOT2-SSFP
T1 is assumed known from DESPOT1, and the corresponding rows/columns are eliminated from J_bSSFP
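Mechanically, fixing T1 just means dropping its column from the Jacobian before forming the Fisher matrix; a small sketch (crlb_fixed is my own helper):

import numpy as np

def crlb_fixed(J, sigma, free_cols):
    # Keep only the columns for the parameters actually being estimated;
    # the fixed parameter (here T1) contributes no uncertainty
    Jf = J[:, free_cols]
    F = Jf.T @ Jf / sigma**2
    return np.diag(np.linalg.inv(F))

# e.g. with columns ordered (M0, T1, T2): crlb_fixed(J, sigma, [0, 2])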

argmax_α (bSSFP): "bErnst"

DESPOT2: Constant SNR – Tuned

DESPOT2: Constant SNR – Ideal

DESPOT2: Constant SNR – GA

DESPOT2: Constant SNR – Wtd GA

Results
Take Weighted GA with a grain of salt, as it may not be correct, as we saw with SPGR
Ideal has focused itself into a smaller cone of high T2NR
GA has the most uniform T2NR, as expected
As a side note, we can include T1 in J and try to evaluate the precision of bSSFP for estimating all 3 parameters
– F (the Fisher information matrix) is then ill-conditioned: a known property, since SSFP can't do everything by itself (see the check below)
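The ill-conditioning is easy to flag numerically; a sketch (fisher_condition is my own helper):

import numpy as np

def fisher_condition(J, sigma):
    # If bSSFP alone must encode (M0, T1, T2), F is nearly singular and
    # its condition number blows up, making the CRLB inversion meaningless
    F = J.T @ J / sigma**2
    return np.linalg.cond(F)   # values >> 1e12 flag a practically singular F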

Joint DESPOT2
Next I perform the analysis assuming a joint fit of T1, T2, and M0 with both the SPGR and bSSFP equations
The noise normalization is increased by sqrt(2) to account for the doubling of the total number of images collected

J-DESPOT2: Constant SNR – Tuned

J-DESPOT2: Constant SNR – Ideal

J-DESPOT2: Constant SNR – GA

J-DESPOT2: Constant SNR – Wtd GA

Results
Interpretation of the scale is not intuitive
– Sequences are mixed, but each is normalized to have a peak SNR of 1 at the Ernst/bErnst angle
– Does it make sense that the estimate of T1 or M0 can be better than the peak SNR of the underlying images?
With a joint fit, the tuned set appears to gain more uniform T1NR for a small price in T2NR
– GA isn't a clear winner here, but may be if we only care about a certain range

Joint DESSPOT2
Next I perform the analysis assuming a joint fit of T1, T2, and M0 with both the SPGR and dual-echo SSFP (DESS) equations
– I model DESS as the magnetization before and after RF excitation, with equations from Freeman-Hill 1971
TR is kept the same, and the same flip-angle protocols are examined
Noise is now scaled up by sqrt(3)

J-DESSPOT2: Constant SNR – Tuned

J-DESSPOT2: Constant SNR – Ideal

J-DESSPOT2: Constant SNR – GA

J-DESSPOT2: Constant SNR – Wtd GA

Results
Perhaps unsurprisingly, we lose some T1NR
– Since the SPGRs have more noise
But we gain some T2NR
Interestingly, the tuned protocol splits into 2 cones for T2 estimation

[Table: mean and std of T1, T2, and M0 precision for each protocol (tuned, ideal, GA, weighted GA) under J-DESPOT2 vs. J-DESSPOT2; numeric values not recovered]

bSSFP
Plots show signal curves at different θ
Mx is symmetric about θ = π (0.5 cycles here)
My has a crossover point where all phase-cycle curves intersect
– Its (x, y) location is defined by E1 and E2
The maximum of the magnitude over flip angle is constant with phase cycle
The signal phase is constant with phase cycle
– Can we use this to get off-resonance?
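For reference, a sketch of the bSSFP steady state behind these plots, in what I believe is the Freeman-Hill post-excitation form (the sign convention for My varies between references):

import numpy as np

def bssfp(M0, T1, T2, alpha, theta, TR=5.0):
    # (Mx, My) immediately after the RF pulse; theta is the off-resonance
    # phase accrued per TR (plus any RF phase cycling), alpha in radians
    E1, E2 = np.exp(-TR / T1), np.exp(-TR / T2)
    d = ((1 - E1 * np.cos(alpha)) * (1 - E2 * np.cos(theta))
         - E2 * (E1 - np.cos(alpha)) * (E2 - np.cos(theta)))
    Mx = M0 * (1 - E1) * np.sin(alpha) * (1 - E2 * np.cos(theta)) / d
    My = M0 * (1 - E1) * E2 * np.sin(alpha) * np.sin(theta) / d
    return Mx, My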

Next Steps
CRLB with real/imag or mag/phase?
DESPOT2-FM
– Analysis and visualization will be trickier, as we want to see precision as a function of off-resonance too
– Estimation of M0, T1, T2, θ as a function of T1, T2, θ
– Protocol optimization, including TR
A protocol that optimizes precision under B1 inhomogeneity? Assuming the B1 map is known
– Maximize ∫ T1NR(κ) dκ for a given (or range of) T1
mcDESPOT
– Protocol optimization
– Exploration of DESS and other sequences

7T MS: PVF Pipeline
The goal is to compute atrophy (parenchymal volume fraction, PVF) by generating 2 masks reliably:
– Intracranial mask
– Parenchyma mask
In MSmcDESPOT we did:
– Intracranial mask = BET mask on T1w; includes the ventricles but generally excludes the space between brain and skull -> not exactly what is stated
– Parenchyma mask = WM+GM; essentially excludes ventricular CSF

7T MS: PVF Pipeline
The new pipeline uses both T1w and T2w:
Intracranial mask = BET mask on T2w, which includes cortical CSF, then manually edited to keep only supratentorial brain
Brain mask = segment CSF with SPM8 and subtract
– We also remove everything outside the BET mask on the registered T1w
– Since CSF is dark on T1w, this is another way to cut out cortical CSF and any other stray matter outside the brain
– Requires some special handling for 7T inhomogeneity (bump up the bias correction by 100x)
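Once the two masks exist, computing the PVF itself is simple; a sketch with nibabel, where the mask filenames are hypothetical:

import nibabel as nib

# Both masks are assumed binary and on the same voxel grid
intracranial = nib.load('intracranial_mask.nii.gz').get_fdata() > 0
parenchyma = nib.load('parenchyma_mask.nii.gz').get_fdata() > 0

# Parenchymal volume fraction: tissue volume over intracranial volume;
# the voxel size cancels because the grids match
pvf = parenchyma.sum() / intracranial.sum()
print(f'PVF = {pvf:.3f}')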

Future Segmentation
Natalie suggested some ideas for attempting automatic segmentation via registration
1st we should try simple tasks, like the removal of the cerebellum done in the intracranial mask editing
I probably want to finally learn nipype for this, as it sounds like it will require integrating many packages:
– ANTS for a study-specific template to register to
– ITK for the nonlinear registration (Natalie said this is faster/better for binary masks, where we care about the outside contour and not the inside)
– FSL and NumPy for generating masks to restrict the registration to an ROI
Hope to extend it to the whole thalamus and hippocampus; nuclei would probably be in our dreams
Lesions
– Waiting for a normal population to perform standard-space z-score thresholding on FLAIR, as in MSmcDESPOT (sketched below)
– Hopefully registration is not too problematic with 7T data
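The z-score thresholding step is a simple voxelwise computation; a sketch assuming all volumes are already registered to standard space (the threshold of 3 is a placeholder):

import numpy as np

def lesion_mask(patient, normals, z_thresh=3.0):
    # patient: (X, Y, Z) FLAIR; normals: (N, X, Y, Z) from the normal
    # population, all in standard space; flags unusually bright voxels
    mu = normals.mean(axis=0)
    sd = normals.std(axis=0) + 1e-12     # guard against divide-by-zero
    z = (patient - mu) / sd
    return z > z_thresh                   # candidate lesion voxels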

7T Thalamus
Used a Gaussian KDE to approximate the distribution
– The kernel width is automatically determined by the function
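This is presumably scipy's gaussian_kde, which picks the bandwidth automatically (Scott's rule by default); a sketch with placeholder data:

import numpy as np
from scipy.stats import gaussian_kde

t1_values = np.random.normal(1400, 100, 500)   # placeholder ROI T1s (ms)
kde = gaussian_kde(t1_values)                   # bandwidth chosen automatically
grid = np.linspace(t1_values.min(), t1_values.max(), 200)
density = kde(grid)                             # smoothed density estimate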

7T Thalamus
In the future, I imagine we want to do ANOVA testing of T1 values between all the ROIs
– Or some non-parametric variant (Kruskal-Wallis, I think)
– We can then multiply that elementwise by an "adjacency matrix" that flags which nuclei are neighbors
This gives us which nuclei are separable from each other when we take their spatial location into account
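A sketch of that pairwise-test-plus-adjacency idea, with hypothetical rois and adjacency inputs (for two groups, Kruskal-Wallis reduces to a Mann-Whitney-style rank test):

import numpy as np
from scipy.stats import kruskal

def separable_neighbors(rois, adjacency, alpha=0.05):
    # rois: list of 1D arrays of T1 values, one per nucleus
    # adjacency: (n, n) boolean matrix of which nuclei touch
    n = len(rois)
    p = np.ones((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            p[i, j] = p[j, i] = kruskal(rois[i], rois[j]).pvalue
    # Keep only pairs that are both neighbors and statistically distinct
    return (p < alpha) & adjacency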