Validation and Evaluation of Algorithms

Vincent A. Magnotta, The University of Iowa, June 30, 2008

Software Development
In many areas of medical imaging, the generation of an algorithm is the “easy” aspect of the project.
Now that I have an algorithm, what is the next step?
  Validate the algorithm
  Evaluate reliability
  Evaluate biological relevance
These are very different and give the developer information that is useful to enhance an algorithm.

Validation
Degree of accuracy of a measuring device
Validation of medical image analysis is a major challenge
What are our standards?
  Actual structure of interest
  Another technique
  Manual raters
  Comparison with the literature

Validation Based on Actual Specimens
Figure panels: laser scanned surface, traced surface, surface distance map
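A surface distance map of this kind can be sketched as the distance from each point on one surface to the nearest point on the other. The example below is a minimal illustration in plain Python, not the method used to produce the slide: the function names and the sample points are hypothetical, surfaces are assumed to be lists of 3D points in millimeters, and the distance is one-directional and point-to-point rather than point-to-triangle.

```python
import math

def nearest_distances(surface_a, surface_b):
    """For each point on surface_a, return the distance to the closest point on surface_b."""
    if not surface_a or not surface_b:
        raise ValueError("both surfaces need at least one point")
    # Brute-force search; fine for small examples, slow for dense meshes.
    return [min(math.dist(p, q) for q in surface_b) for p in surface_a]

def summarize(distances):
    """Mean and maximum of a list of distances."""
    return {"mean": sum(distances) / len(distances), "max": max(distances)}

# Hypothetical sample points (mm): a scanned reference and a traced surface.
scanned = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
traced = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0)]

distance_map = nearest_distances(traced, scanned)
summary = summarize(distance_map)
```

Each entry of `distance_map` corresponds to one point on the traced surface, so the values could be used to color that surface.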

Doppler US and Phase Contrast MR
From Ho et al. Am. J. Roentgenol. 178 (3): 551, 2002

Manual Raters
Often we are left with manual raters in medical imaging to serve as a standard
Need to evaluate rater reliability
May be subject to rater drift and bias
Algorithms such as STAPLE have been developed to estimate the probability of a voxel being in a region-of-interest
Several metrics to evaluate reliability:
  Percent difference
  Intraclass correlation
  Border distance
  Overlap metrics: Dice, Jaccard, Spatial Overlap
  Sensitivity and Specificity
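Two of the metrics listed above, sensitivity/specificity and percent difference, can be sketched as follows. This is an illustration under stated assumptions: segmentations are flat sequences of 0/1 voxel labels, the manual rater is treated as the reference, and percent difference is taken relative to the reference volume (other conventions, such as dividing by the mean of the two volumes, are also in use). The function names are hypothetical.

```python
def sensitivity_specificity(reference, test):
    """Voxel-wise sensitivity and specificity of `test` against `reference` (0/1 labels)."""
    if len(reference) != len(test):
        raise ValueError("label sequences must have the same length")
    tp = fp = tn = fn = 0
    for r, t in zip(reference, test):
        if r and t:
            tp += 1      # voxel in both segmentations
        elif r and not t:
            fn += 1      # voxel missed by the test segmentation
        elif t:
            fp += 1      # voxel added by the test segmentation
        else:
            tn += 1      # background in both
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both structure and background voxels")
    return tp / (tp + fn), tn / (tn + fp)

def percent_difference(reference_volume, test_volume):
    """Signed volume difference as a percentage of the reference volume (assumed convention)."""
    return 100.0 * (test_volume - reference_volume) / reference_volume

# Hypothetical labels for eight voxels.
manual = [1, 1, 1, 0, 0, 0, 0, 0]
automated = [1, 1, 0, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(manual, automated)
```

Specificity depends heavily on how much background is included in the field of view, so it is usually reported together with the region over which it was computed.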

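Metrics
Intraclass Correlation Coefficient
\[ R^2 = \frac{\sigma_{subject}^2}{\sigma_{subject}^2 + \sigma_{method}^2 + \sigma_{error}^2} \]
\[ \text{Jaccard Metric} = \frac{\mathrm{Volume}(A \cap B)}{\mathrm{Volume}(A \cup B)} \]
\[ \text{Dice Metric} = \frac{2 \cdot \mathrm{Volume}(A \cap B)}{\mathrm{Volume}(A) + \mathrm{Volume}(B)} \]
\[ \text{Spatial Overlap} = \frac{\mathrm{Volume}(A \cap B)}{\mathrm{Volume}(A)} \]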
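The three overlap metrics can be computed directly from voxel counts. The sketch below assumes each segmentation is given as a collection of voxel index tuples and that all voxels have the same size, so volume is proportional to voxel count. The function name and sample data are hypothetical.

```python
def overlap_metrics(voxels_a, voxels_b):
    """Jaccard, Dice, and spatial overlap for two segmentations given as voxel index sets."""
    a, b = set(voxels_a), set(voxels_b)
    if not a or not b:
        raise ValueError("both segmentations must contain at least one voxel")
    intersection = len(a & b)
    union = len(a | b)
    return {
        "jaccard": intersection / union,
        "dice": 2 * intersection / (len(a) + len(b)),
        # Spatial overlap is asymmetric: it is normalized by A only.
        "spatial_overlap": intersection / len(a),
    }

# Hypothetical segmentations: three voxels each, two of them shared.
region_a = {(0, 0, 0), (0, 0, 1), (0, 0, 2)}
region_b = {(0, 0, 1), (0, 0, 2), (0, 0, 3)}
metrics = overlap_metrics(region_a, region_b)
```

For the same pair of regions Dice is never smaller than Jaccard, since Dice = 2J / (1 + J).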

Intraclass Correlation
Figure panels: Data Set 1, Data Set 2
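The intraclass correlation, defined as the ratio of subject variance to the sum of subject, method, and error variance, can be estimated from a subjects-by-methods table of measurements. The sketch below assumes a two-way layout with one measurement per subject per method and estimates the variance components from the ANOVA mean squares; negative estimates are clamped to zero. The function name and the sample table are hypothetical.

```python
def intraclass_correlation(measurements):
    """ICC from a list of rows (one per subject), each holding one value per method."""
    n = len(measurements)
    k = len(measurements[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need a rectangular table with at least 2 subjects and 2 methods")

    grand = sum(sum(row) for row in measurements) / (n * k)
    subject_means = [sum(row) / k for row in measurements]
    method_means = [sum(row[j] for row in measurements) / n for j in range(k)]

    # Sums of squares for the two-way layout without replication.
    ss_subject = k * sum((m - grand) ** 2 for m in subject_means)
    ss_method = n * sum((m - grand) ** 2 for m in method_means)
    ss_total = sum((v - grand) ** 2 for row in measurements for v in row)
    ss_error = max(ss_total - ss_subject - ss_method, 0.0)

    ms_subject = ss_subject / (n - 1)
    ms_method = ss_method / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Variance component estimates from the expected mean squares.
    var_subject = max((ms_subject - ms_error) / k, 0.0)
    var_method = max((ms_method - ms_error) / n, 0.0)
    var_error = ms_error

    total = var_subject + var_method + var_error
    if total == 0:
        raise ValueError("all measurements are identical; ICC is undefined")
    return var_subject / total

# Hypothetical volumes: method 2 reads exactly one unit higher than method 1.
table = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
icc = intraclass_correlation(table)
```

In the sample table the two methods are perfectly correlated, yet the ICC is below 1 because the constant offset is counted as method variance. This is the main way the ICC differs from a Pearson correlation.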

Performance of Overlap Metrics
Figure panels: Jaccard Metric, Dice Metric

Reliability
Ability to reproduce measurements within a subject across trials
Most algorithms will give the same results when run on the same image data
Typically evaluated on a scan/rescan basis
Provides an estimate of the noise introduced by the algorithm
Helps to determine the sample size required to measure a known effect size
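The link between scan/rescan noise and sample size can be sketched as follows. This is an illustration under assumptions that are not in the slides: measurement noise is estimated as the within-subject standard deviation of paired scan/rescan values, it adds in quadrature to the between-subject standard deviation, and the sample size uses the normal approximation for a two-sided comparison of two group means. All names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

def within_subject_sd(scan, rescan):
    """Within-subject SD from paired scan/rescan measurements: sqrt(mean(d^2) / 2)."""
    if len(scan) != len(rescan) or not scan:
        raise ValueError("need equal-length, non-empty scan and rescan lists")
    mean_sq_diff = sum((a - b) ** 2 for a, b in zip(scan, rescan)) / len(scan)
    return math.sqrt(mean_sq_diff / 2)

def sample_size_per_group(effect, between_sd, within_sd, alpha=0.05, power=0.80):
    """Subjects per group needed to detect a mean difference `effect` (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Biological variability and measurement noise are assumed independent.
    total_variance = between_sd ** 2 + within_sd ** 2
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * total_variance / effect ** 2)

# Hypothetical scan/rescan values for two subjects.
noise_sd = within_subject_sd([0.40, 0.50], [0.42, 0.48])

# A noiseless measurement versus one whose noise equals the biological SD.
n_noiseless = sample_size_per_group(effect=1.0, between_sd=1.0, within_sd=0.0)
n_noisy = sample_size_per_group(effect=1.0, between_sd=1.0, within_sd=1.0)
```

Under these assumptions, measurement noise equal to the biological standard deviation doubles the total variance and therefore roughly doubles the number of subjects required.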

Scan/Rescan of DTI Fiber Tract
Figure axes: FA, Dist (mm)

Evaluation
Use of digital phantoms
  Easily define cases of interest
  Can readily adjust SNR
  Usually a simplification of biological structure
  Lacks physiological noise
  Often do not model the PSF and partial volume artifacts
Does the method replicate findings reported in the literature or known from observation?
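A digital phantom with adjustable SNR can be sketched in a few lines. The example below is a deliberately simple illustration: a 2D disk of uniform intensity with additive Gaussian noise, where SNR is assumed to mean signal intensity divided by the noise standard deviation. Consistent with the limitations listed above, it models neither the PSF nor partial volume effects. The function name and parameters are hypothetical.

```python
import random

def make_disk_phantom(size, radius, signal, snr, seed=0):
    """Return (truth, image): a binary disk mask and a noisy intensity image."""
    rng = random.Random(seed)      # seeded so the phantom is reproducible
    sigma = signal / snr           # assumed definition: SNR = signal / noise SD
    center = (size - 1) / 2
    truth = [
        [1 if (x - center) ** 2 + (y - center) ** 2 <= radius ** 2 else 0
         for x in range(size)]
        for y in range(size)
    ]
    image = [
        [signal * truth[y][x] + rng.gauss(0.0, sigma) for x in range(size)]
        for y in range(size)
    ]
    return truth, image

def threshold(image, level):
    """Trivial stand-in for a segmentation algorithm under test."""
    return [[1 if value > level else 0 for value in row] for row in image]

# The known truth mask makes it possible to score the result exactly.
truth, image = make_disk_phantom(size=16, radius=5, signal=100.0, snr=1000.0)
segmented = threshold(image, level=50.0)
```

Lowering `snr` and repeating the comparison between `segmented` and `truth` shows how quickly a method degrades with noise.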

Age-Related FA Changes

Conclusions
Validation and evaluation of tools can be the most difficult part of a neuroimaging project
There exist several methods for evaluating algorithms, each with its own strengths and weaknesses
Validation determines how close we are to the actual process of interest
Reliability determines in part our ability to measure changes
In general, neuroimaging provides an index of brain volumes and function, not absolute measurements

Acknowledgements
Department of Psychiatry
Department of Radiology
Hans Johnson
Department of Radiology
Stephanie Powell
Peng Cheng
MIMX Lab
Nicole Grosland
Nicole DeVries
Ester Gassman
