An Algorithm to Compute Independent Sets of Voxels for Parallelization of ICD-based Statistical Iterative Reconstruction
Sungsoo Ha and Klaus Mueller
Department of Computer Science, Visual Analytics and Imaging (VAI) Lab
Stony Brook University and SUNY Korea
Motivation: Statistical Iterative Reconstruction Algorithm
[Figure: reconstructed images, panels labeled FBP and SIR]
Motivation: Statistical Iterative Reconstruction Algorithm
Weighted Least Square (WLS) cost function
– Measured projection data
– x: Attenuation coefficients of the object to be reconstructed
– A: System matrix with size of …
– W: Diagonal matrix for statistical weighting
– R(x): Regularization
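For reference, the WLS cost function named on this slide is commonly written in the form below. This is a sketch of the standard penalized form, not a transcription of the slide: the symbol y for the measured projection data and the regularization weight \beta are assumptions, while x, A, W, and R(x) follow the slide's own definitions.

```latex
\hat{x} = \arg\min_{x} \; \frac{1}{2} (y - A x)^{T} W (y - A x) + \beta R(x)
```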
Motivation: Statistical Iterative Reconstruction Algorithm
Weighted Least Square (WLS) cost function
– High cost for forward & back projections
– The nature of iterative algorithms
Motivation: optimization
                    ICD-based    CG-based
Convergence rate    FAST         SLOW
Parallelization     HARD         EASY
[Figure: voxel update patterns of GCD (Fessler et al. 1997), B-ICD (Benson et al. 2010), and ABCD (Fessler et al. 2011); axes x, y, z]
Goal
Devise an algorithm
– Find voxels that are “fully” independent of each other
– No additional algorithmic & computational complexity
– More accurate (but also more complicated) pattern
– Applicable to all CT geometries
ICD-based: FAST convergence rate, HARD parallelization
CG-based: SLOW convergence rate, EASY parallelization
Independence among voxels
[Equation annotations: correction, weighting, update]
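To make the correction, weighting, and update terms concrete, here is a minimal sketch of one coordinate-descent step for a single voxel. It assumes the plain WLS data term only (regularization is left out), stores the voxel's system matrix column as a sparse dict, and uses hypothetical names (`icd_update`, `column`, `residual`, `weights`) that do not come from the slides.

```python
def icd_update(x, j, column, residual, weights):
    """One ICD step for voxel j, minimizing the WLS data term along x[j].

    column   : dict {row index: a_ij}, the non-zero entries of column j of A
    residual : list e = y - A x, kept up to date in place
    weights  : list holding the diagonal of W
    Assumption: no regularization term, so the step has a closed form.
    """
    # Weighted correlation of the column with the current residual.
    numerator = sum(weights[i] * a * residual[i] for i, a in column.items())
    # Weighted squared norm of the column.
    denominator = sum(weights[i] * a * a for i, a in column.items())
    if denominator == 0:
        return 0.0
    step = numerator / denominator
    x[j] += step
    # Only rows where column j is non-zero are read or written.
    for i, a in column.items():
        residual[i] -= a * step
    return step


x = [0.0]
residual = [1.0, 5.0, 2.0]
step = icd_update(x, 0, {0: 1.0, 2: 2.0}, residual, [1.0, 1.0, 1.0])
print(step, x, residual)
```

The update touches the residual only at rows where the voxel's column is non-zero, so two voxels whose columns share no such row can be updated at the same time without interfering.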
A single voxel update
[Figure: x-ray source, object containing voxel A, and flat detector, with the detector region related to voxel A]
Independent voxels
[Figure: x-ray source, object containing voxel A and voxel B, and flat detector, with the detector regions related to voxel A and to voxel B]
CT system matrix view
[Figure: system matrix with dimensions labeled M and N and columns for voxels A, B, and C; overlap between A & C and overlap between B & C are marked]
Independent
– A, B
Dependent
– A, C
– B, C
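The system matrix view suggests a direct test: two voxels are independent when their columns have no non-zero row in common. The sketch below illustrates this with made-up row indices for A, B, and C; the function name `are_independent` and the example data are hypothetical.

```python
def are_independent(rows_a, rows_b):
    """Return True when two voxels share no non-zero system matrix row.

    rows_a, rows_b : sets of row indices (detector measurements) where the
    column of the system matrix for each voxel is non-zero.
    """
    return rows_a.isdisjoint(rows_b)


# Illustrative supports only; real ones come from the CT system matrix.
support = {
    "A": {0, 1},
    "B": {4, 5},
    "C": {1, 2, 4},  # overlaps A at row 1 and B at row 4
}

print(are_independent(support["A"], support["B"]))
print(are_independent(support["A"], support["C"]))
print(are_independent(support["B"], support["C"]))
```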
Finding set of independent voxels
Knapsack problem:
Finding set of independent voxels
Knapsack problem: Combinatorial NP-hard problem
[Figure: example with voxels A to G]
Finding set of independent voxels
Knapsack problem: Combinatorial NP-hard problem
First-Fit Decreasing algorithm
1. Sort voxels in descending order of the number of non-zero elements in their corresponding system matrix column vector
2. Fit first with the voxel that contains the largest number of non-zero elements
3. Cull out the voxels that are dependent on the selected voxel
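The three steps can be sketched as a greedy grouping routine. This is a minimal sketch, not the authors' implementation: it assumes each voxel is represented by the set of rows where its system matrix column is non-zero, and it assumes the culled voxels are carried over to form the next group until none remain, which the slide does not spell out. The name `independent_groups` is hypothetical.

```python
def independent_groups(columns):
    """Greedily partition voxels into groups of mutually independent voxels.

    columns : dict {voxel id: set of row indices with non-zero entries}
    Returns a list of groups; voxels in a group share no non-zero row.
    """
    # Step 1: sort by number of non-zero elements, largest first.
    remaining = sorted(columns, key=lambda v: len(columns[v]), reverse=True)
    groups = []
    while remaining:
        used_rows = set()  # rows already claimed by the current group
        group = []
        deferred = []      # voxels culled from this group
        for voxel in remaining:
            rows = columns[voxel]
            if used_rows.isdisjoint(rows):
                # Step 2: the voxel fits, so select it.
                group.append(voxel)
                used_rows |= rows
            else:
                # Step 3: dependent on a selected voxel, so cull it.
                deferred.append(voxel)
        groups.append(group)
        remaining = deferred  # assumption: culled voxels seed later groups
    return groups


example = {"A": {0, 1, 2}, "B": {5, 6}, "C": {2, 5}, "D": {7}}
print(independent_groups(example))
```

Checking a candidate against the union of rows already used by the group is equivalent to checking it pairwise against every selected voxel, and keeps each pass linear in the number of remaining voxels.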
Experiment settings
Cone-beam CT geometry
– Volume: 128 x 128 x 128 (1 x 1 x 1 mm)
– Flat detector: 512 x 512 (1 x 1 mm)
– SAD: 600 mm
– SID: 1000 mm
The number of projections
– Varying from 1 to 360
– Uniformly distributed over 360 degrees
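The settings above can be captured in a small configuration object, which is handy when reproducing the experiment. This is a sketch under assumptions: SAD and SID are read as source-to-axis and source-to-image distance, the class and field names are hypothetical, and the first view is placed at 0 degrees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConeBeamGeometry:
    """Experiment settings from the slide; field names are illustrative."""
    volume_shape: tuple = (128, 128, 128)
    voxel_size_mm: float = 1.0
    detector_shape: tuple = (512, 512)
    detector_pixel_mm: float = 1.0
    sad_mm: float = 600.0   # assumed: source-to-axis distance
    sid_mm: float = 1000.0  # assumed: source-to-image (detector) distance

    def magnification(self):
        # Magnification at the rotation axis under the assumption above.
        return self.sid_mm / self.sad_mm

    def view_angles_deg(self, num_views):
        # Views uniformly distributed over 360 degrees, starting at 0.
        return [k * 360.0 / num_views for k in range(num_views)]


geometry = ConeBeamGeometry()
print(geometry.magnification())
print(geometry.view_angles_deg(4))
```

Under these assumptions a 1 mm voxel near the rotation axis projects onto roughly 1.67 mm of the detector, which gives a feel for the size of the detector region related to one voxel per view.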
Extreme case study
# views | # independent groups | Max. size of independent group | Avg. size of independent group
,18611, ,
Compared with the ABCD (Axial Block Coordinate Descent) algorithm (along z-direction: 128)
– More parallelism
– No additional complexity
Theoretical parallelism
# views | # independent groups | Max. size of independent group | Avg. size of independent group
,18611, ,
Estimated gain of GPU-accelerated OS-SIR
[Plot: axis label "Number of views / subset"]
Independence visualization
[Figure panels: (bottom), 64 (middle), 96 (top)]
Independence visualization at 360 views
[Figure panels: 32 (bottom), 96 (top)]
A clue for optimism
Independence visualization
[Figure panels: 32 (bottom), 96 (top); 1 view vs. 360 views]
Conclusion & Future work
More parallelism than existing methods
– No additional complexity
– One-time computation
– Applicable to all CT geometries
Hints for GPU implementation of SIR
Apply to an actual GPU-accelerated SIR framework
– Determine optimal computational performance
– Convergence rate
Thanks! Q&A
This research was partially supported by NSF grant IIS and the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ‘IT Consilience Creative Program (ITCCP)’ (NIPA-2013-H ) supervised by NIPA (National IT Industry Promotion Agency).