Automatic Tracing of Vocal Fold Motion in High-Speed Laryngeal Video
Erik Bieging
Vocal Fold Imaging
Human vocal folds oscillate at 100 to 400 Hz during normal phonation.
High-speed digital imaging (4000 frames/sec) is used to study the motion of the vocal folds.
Automated methods are needed to extract the edge of the glottis and the glottal area.
Figure labels: Vocal Folds, Glottis.
Current Methods: Histogram Method
A threshold is applied to separate the glottis from the vocal fold tissue.
The threshold is determined from each frame's histogram.
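The slide does not say how the threshold is derived from the histogram. As a minimal sketch, the code below assumes Otsu's method (maximizing between-class variance) stands in for that step, and assumes 8-bit grayscale frames in which the glottis is the darker class. The function names `otsu_threshold` and `segment_glottis` are hypothetical.

```python
import numpy as np

def otsu_threshold(frame):
    """Gray level that maximizes between-class variance (Otsu, an assumed choice)."""
    hist = np.bincount(frame.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)               # fraction of pixels at or below each level
    w1 = 1.0 - w0                      # fraction of pixels above each level
    cum_mean = np.cumsum(prob * levels)
    total_mean = cum_mean[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    # Levels where one class is empty give 0/0; treat them as zero variance.
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(between))

def segment_glottis(frame):
    """Boolean mask of pixels at or below the per-frame threshold (dark glottis)."""
    return frame <= otsu_threshold(frame)

# Synthetic frame: bright tissue (200) with a dark slit (20).
frame = np.full((8, 8), 200, dtype=np.uint8)
frame[2:6, 3:5] = 20
glottal_area = int(segment_glottis(frame).sum())   # area in pixels
```

Because the threshold is recomputed for every frame, the glottal area is simply the pixel count of the mask for that frame.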
Current Methods: Region Growing and Active Contour
Region Growing
Seeds are started at the darkest points in the image.
Regions are grown based on the similarity between region pixels and surrounding pixels.
Active Contour
An initial region is defined using thresholding.
The edge is moved iteratively based on the image gradient and several parameters.
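The region growing idea can be sketched as follows. This covers region growing only, not the active contour. It assumes a single seed at the darkest pixel, 4-connected neighbors, and a similarity test that compares each candidate pixel with the running mean of the region; the `tolerance` parameter and the function name `grow_region` are hypothetical, since the slide does not give the similarity criterion.

```python
from collections import deque
import numpy as np

def grow_region(frame, tolerance=30.0):
    """Grow a region from the darkest pixel using 4-connected neighbors."""
    img = frame.astype(float)
    seed = np.unravel_index(np.argmin(img), img.shape)   # darkest pixel
    mask = np.zeros(img.shape, dtype=bool)
    mask[seed] = True
    total, count = img[seed], 1                          # running sum and size
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < img.shape[0] and 0 <= nc < img.shape[1]
            if inside and not mask[nr, nc]:
                # Accept the neighbor if it is close to the current region mean.
                if abs(img[nr, nc] - total / count) <= tolerance:
                    mask[nr, nc] = True
                    total += img[nr, nc]
                    count += 1
                    queue.append((nr, nc))
    return mask

# Synthetic frame: bright tissue (200) with a dark slit (20).
frame = np.full((8, 8), 200, dtype=np.uint8)
frame[2:6, 3:5] = 20
region = grow_region(frame)
```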
New Differentiation-Based Method
Each column of the image is passed through a smoothing differentiating filter.
The maximum and minimum of the derivative are taken to be the glottal edges.
A binary image is created from these edges.
Canny edge detection is applied to the binary image to smooth the edge.
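A minimal sketch of the column-wise step, under stated assumptions: the smoothing differentiating filter is taken to be a derivative-of-Gaussian kernel, each column is assumed to cross tissue, glottis, tissue in that order (so the minimum derivative marks the first edge and the maximum marks the second), and a hypothetical `min_strength` cutoff skips columns that contain no glottis. None of these details are given on the slide, and the final Canny smoothing step is omitted.

```python
import numpy as np

def smoothing_derivative_kernel(sigma=2.0, radius=6):
    """Derivative-of-Gaussian kernel (an assumed filter choice)."""
    x = np.arange(-radius, radius + 1, dtype=float)
    k = -x * np.exp(-x**2 / (2.0 * sigma**2))
    return k / np.abs(k).sum()

def column_derivative(column, kernel):
    """Smoothed derivative of one column, edge-padded to avoid border artifacts."""
    radius = len(kernel) // 2
    padded = np.pad(column.astype(float), radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def trace_glottis(frame, sigma=2.0, radius=6, min_strength=5.0):
    """Binary glottis mask built from the min/max derivative in each column."""
    kernel = smoothing_derivative_kernel(sigma, radius)
    mask = np.zeros(frame.shape, dtype=bool)
    for c in range(frame.shape[1]):
        d = column_derivative(frame[:, c], kernel)
        first = int(np.argmin(d))    # bright-to-dark transition
        second = int(np.argmax(d))   # dark-to-bright transition
        strong = -d[first] >= min_strength and d[second] >= min_strength
        if first < second and strong:
            mask[first:second, c] = True
    return mask

# Synthetic frame: bright tissue (200), dark slit (20) in rows 15-24, columns 3-6.
frame = np.full((40, 10), 200, dtype=np.uint8)
frame[15:25, 3:7] = 20
mask = trace_glottis(frame)
```

The detected edges can land one pixel to either side of the true boundary because the smoothed derivative of an ideal step has a flat two-sample peak.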
Comparison of Methods
Figure: segmentation results on high-quality data and lower-quality data. Panels: (a) Original Image, (b) Histogram, (c) Region Growing, (d) Active Contour, (e) New Method.
Results
100 frames from 10 videos were analyzed with each method.
The deviation from the manually detected glottal area was calculated.

Video    Histogram  Region Growing  Active Contour  Our Method
1        …%         197.55%         62.98%          15.53%
2        …%         26.58%          38.73%          5.71%
3        3.93%      5.50%           40.38%          1.85%
4        …%         16.31%          44.95%          4.02%
5        …%         21.58%          43.85%          18.72%
6        …%         7.31%           57.54%          14.02%
7        …%         18.21%          34.11%          3.05%
8        6.23%      4.42%           41.13%          4.75%
9        …%         47.29%          38.92%          12.97%
10       …%         10.73%          39.77%          10.60%
Average  31.49%     35.55%          44.24%          9.12%
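The slide does not give the formula for the deviation. A minimal sketch, assuming it is the mean absolute difference between automatic and manual area, expressed as a percentage of the manual area (the function name `percent_deviation` is hypothetical). The second part only re-averages the per-video values reported on this slide for the three methods whose values are all present.

```python
import numpy as np

def percent_deviation(auto_areas, manual_areas):
    """Mean absolute deviation from the manual area, in percent (assumed metric)."""
    auto = np.asarray(auto_areas, dtype=float)
    manual = np.asarray(manual_areas, dtype=float)
    return 100.0 * np.mean(np.abs(auto - manual) / manual)

# Toy example: two frames, each off by 10% in opposite directions.
example = percent_deviation([110, 90], [100, 100])

# Per-video deviations (percent) as reported on the slide.
region_growing = [197.55, 26.58, 5.50, 16.31, 21.58, 7.31, 18.21, 4.42, 47.29, 10.73]
active_contour = [62.98, 38.73, 40.38, 44.95, 43.85, 57.54, 34.11, 41.13, 38.92, 39.77]
our_method = [15.53, 5.71, 1.85, 4.02, 18.72, 14.02, 3.05, 4.75, 12.97, 10.60]

averages = {
    "Region Growing": round(float(np.mean(region_growing)), 2),
    "Active Contour": round(float(np.mean(active_contour)), 2),
    "Our Method": round(float(np.mean(our_method)), 2),
}
```

The recomputed averages agree with the Average row reported on the slide for these three methods.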
Computation Time
Average time to analyze 100 frames:
Histogram: 2.09 min
Region Growing: … min
Active Contour: … min
New method: 1.21 min