Slide 1: Competitive Evaluation & Validation of Segmentation Methods
NA-MIC: National Alliance for Medical Image Computing
Martin Styner, UNC, NA-MIC Cores 1 and 5
Slide 2: Main Activities
– DTI tractography: this afternoon, Sonia Pujol
– Segmentation algorithms
  – Competitions at MICCAI; NA-MIC is a co-sponsor
  – Among the largest MICCAI workshops
  – Continued online competition
  – 2007: caudate, liver
  – 2008: lesion, liver tumor, coronary artery
  – 2009: prostate, head & neck, cardiac LV
  – 2010: knee bones, cartilage (?)
Slide 3: Data Setup
– Open datasets with expert "ground truth" (GT)
– Three sets of data:
  1. Training data, with GT for all cases
  2. Testing data released prior to the workshop, for the proceedings
  3. Testing data released at the workshop
– The workshop test data is a hard test:
  – Several methods failed under time pressure
  – Rankings on sets 2 and 3 have differed every time so far
– Ground truth is disseminated only for the training set
– Sets 2 & 3 are fused for the online competition
– An additional STAPLE composite is built from the submissions (a simplified fusion sketch follows below)
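STAPLE estimates a consensus segmentation from multiple raters via an EM algorithm that weights each rater by estimated sensitivity and specificity. As a much simpler, hedged stand-in, the sketch below fuses binary submission masks by per-voxel majority vote; the function name, threshold, and random test masks are illustrative and are not part of the challenge tooling.

```python
import numpy as np

def majority_vote_fusion(masks, threshold=0.5):
    """Fuse binary segmentation masks by per-voxel majority vote.

    `masks` is a list of equally shaped boolean (or 0/1) arrays, one
    per submission. True STAPLE additionally estimates per-rater
    performance via EM; this is only the simple consensus baseline.
    """
    stack = np.stack([np.asarray(m, dtype=float) for m in masks])
    fraction = stack.mean(axis=0)   # per-voxel fraction of raters voting foreground
    return fraction > threshold     # consensus mask

# Illustrative usage with random masks standing in for submissions:
rng = np.random.default_rng(0)
submissions = [rng.random((16, 16, 16)) > 0.5 for _ in range(5)]
consensus = majority_vote_fusion(submissions)
```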
Slide 4: Tools for Evaluation
– Open datasets
– Publicly available evaluation tools
  – Adjusted for each application
  – Automated, unbiased evaluation
– Score: composite of multiple metrics
  – Normalized against expert variability
– Reliability/repeatability evaluation
  – Scan/rescan datasets => coefficients of variation (see the sketch below)
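For the scan/rescan reliability check, the coefficient of variation across repeated measurements of the same subject is the standard summary. A minimal sketch, assuming the inputs are segmented structure volumes from repeated scans of one subject; the example values are made up:

```python
import numpy as np

def coefficient_of_variation(volumes):
    """CV (%) of repeated volume measurements of the same structure.

    A low CV means the method reproduces nearly the same volume
    across scan/rescan acquisitions of a single subject.
    """
    v = np.asarray(volumes, dtype=float)
    return 100.0 * v.std(ddof=1) / v.mean()

# Example: hypothetical caudate volumes (mm^3) from 10 rescans of one subject
print(coefficient_of_variation([3510, 3480, 3525, 3490, 3505,
                                3470, 3515, 3500, 3485, 3520]))
```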
Slide 5: Example: Caudate Segmentation
– Caudate: part of the basal ganglia
  – Studied in schizophrenia, Parkinson's, fragile X, autism
– Datasets from UNC & BWH
  – Segmentations from 2 labs
  – Pediatric, adult & elderly scans
  – 33 training cases; test data include 10 scan/rescan acquisitions of a single subject
Slide 6: Metrics/Scores
– General metrics/scores:
  – Absolute volume difference (percent)
  – Volumetric overlap
  – Surface distance (mean/RMS/max)
– The volume metric matters for standard neuroimaging studies
– The shape metrics matter for shape analysis and parcellations
– Scores are relative to expert variability (a sketch of the metrics and score mapping follows below)
  – Intra-expert variability would score 90
  – One score per metric
  – Scores averaged per dataset
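The slide names the metrics but not their formulas. The sketch below uses common definitions (absolute volume difference in percent, volumetric overlap error from the Jaccard index, symmetric surface distances via distance transforms) plus a linear score mapping chosen so that zero error scores 100 and intra-expert error scores 90, consistent with the slide; the challenge's exact formulas may differ.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, binary_erosion

def volume_difference_pct(seg, ref):
    """Absolute volume difference, in percent of the reference volume.

    `seg` and `ref` are boolean 3D masks on the same voxel grid.
    """
    return 100.0 * abs(seg.sum() - ref.sum()) / ref.sum()

def volumetric_overlap_error(seg, ref):
    """100 * (1 - Jaccard): 0 for perfect overlap, 100 for none."""
    inter = np.logical_and(seg, ref).sum()
    union = np.logical_or(seg, ref).sum()
    return 100.0 * (1.0 - inter / union)

def surface_distances(seg, ref, spacing):
    """Symmetric surface distances (mean, RMS, max) in physical units.

    `spacing` is the voxel size per axis, e.g. (1.0, 1.0, 1.5) mm.
    """
    seg_border = seg ^ binary_erosion(seg)
    ref_border = ref ^ binary_erosion(ref)
    # Distance from each border voxel to the nearest border voxel of the other mask
    d_to_ref = distance_transform_edt(~ref_border, sampling=spacing)[seg_border]
    d_to_seg = distance_transform_edt(~seg_border, sampling=spacing)[ref_border]
    d = np.concatenate([d_to_ref, d_to_seg])
    return d.mean(), np.sqrt((d ** 2).mean()), d.max()

def score(error, expert_error):
    """Assumed linear mapping: 0 error -> 100, intra-expert error -> 90."""
    return max(0.0, 100.0 - 10.0 * error / expert_error)
```

Per the slides, one such score is computed per metric and the scores are then averaged per dataset to give the composite ranking.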
Slide 7: Results
– Tables & figures are generated automatically
– Atlas-based methods performed best
Slide 8: Online Evaluation
– Continued evaluation for new methods
– 6 new submissions in 2009
  – 2 prior to publication, needed for a favorable review
– Not all competitions have working online evaluations
Slide 9: Papers
– Workshop proceedings
  – Open publication in the Insight Journal / MIDAS
– Papers in IEEE TMI & MedIA:
  – Schaap et al., "Standardized evaluation methodology and reference database for evaluating coronary artery centerline extraction algorithms," Medical Image Analysis, 2009, vol. 13, no. 5.
  – Heimann et al., "Comparison and evaluation of methods for liver segmentation from CT datasets," IEEE Transactions on Medical Imaging, 2009, vol. 28, no. 8.
– Caudate paper with updated online evaluation in preparation
Slide 10: Discussion
– Very positive echo from the community
– Evaluation workshops are proliferating:
  – DTI tractography at MICCAI 2009
  – Lung CT registration at MICCAI 2010
– Many topics remain unaddressed
– Dataset availability is the biggest problem
– NA-MIC is a strong supporter