A Framework for a Fully Automatic Karyotyping System

E. Poletti, E. Grisan, A. Ruggeri
Department of Information Engineering, University of Padova, Italy

Acknowledgements
This work has been partially funded by TesiImaging S.r.l., Milan, Italy.

Correspondence
Enea Poletti, University of Padova, Dept. of Information Engineering, Via G. Gradenigo 6/a, Padova, Italy

Introduction
Karyotype analysis is a widespread procedure in cytogenetics for assessing the possible presence of genetic defects. The procedure is lengthy and repetitive, so an automatic analysis would greatly help the cytogeneticist's routine work. Still, automatic segmentation and classification of chromosomes are open issues: existing commercial software packages are far from fully automatic, and their poor performance requires human intervention to correct challenging situations. We propose a framework for a fully automatic karyotyping procedure.

Methods: Segmentation

Flowchart of the segmentation procedure:
- Original input image
- Space variant thresholding: cluster identification
- Push clusters into the queue
- Pop the first cluster and evaluate the SCM
- Single chromosome? If so, save the single chromosome
- Otherwise, geometric analysis and disentanglement
- Identify new clusters

[Figure panels: a cluster is selected for analysis; an axis is extracted and the SCM is evaluated; concave points identification (concave points are identified and used as cues for cuts and overlaps); curvature along the contour; resolution of the cluster used as example]

Concave points as cues
The local minima of the curvature of the contour (K) are the points suggesting the possible presence of touching and overlapping chromosomes.

Space variant threshold
- divide the image into a tessellation of squares
- evaluate the Otsu threshold for each square separately
- eliminate small, spurious segmented blobs
- identify the nuclei present in the image

Single Chromosome Measure (SCM)
- morphological dilation of the axis with a disk
- evaluation of the ratio of the obtained area to that of the original blob
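The first two steps of the space variant threshold (tessellation into squares, one Otsu threshold per square) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the `tile` parameter, the 8-bit grey-level range, and the assumption that chromosomes are brighter than the background are all choices made for this example. The removal of small blobs and the identification of nuclei are not covered.

```python
def otsu_threshold(pixels, levels=256):
    """Otsu threshold t for integer grey levels: values > t are foreground."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    # Fallback for a tile with a single grey level: nothing exceeds
    # levels - 1, so the whole tile is labelled background.
    best_t, best_var = levels - 1, 0.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]              # pixels with value <= t
        sum0 += t * hist[t]
        w1 = total - w0            # pixels with value > t
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2.
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def space_variant_threshold(image, tile):
    """Binarize `image` (list of rows) with one Otsu threshold per square tile."""
    height, width = len(image), len(image[0])
    mask = [[False] * width for _ in range(height)]
    for r0 in range(0, height, tile):
        for c0 in range(0, width, tile):
            rows = range(r0, min(r0 + tile, height))
            cols = range(c0, min(c0 + tile, width))
            t = otsu_threshold([image[r][c] for r in rows for c in cols])
            for r in rows:
                for c in cols:
                    mask[r][c] = image[r][c] > t
    return mask
```

Because each square gets its own threshold, an object in a dim region and an object in a bright (hyper-fluorescent) region can both be separated from their local background, which a single global threshold may fail to do.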
Dark paths: the quasi-contact area along adjacent chromosomes.

Overlaps: each pair of lines connecting disjoint pairs of minima points in K is considered.

Geometrical cuts: a candidate cut line links two points in K and lies entirely inside the cluster.

The segmentation is carried out by means of a space variant thresholding scheme, which proved to be successful even in the presence of hyper- or hypo-fluorescent regions in the image. Then a greedy approach is used to identify and resolve touching and overlapping chromosomes, based on geometric evidence and image information.

Classification
The classification step is coupled with a sequence of modules conceived to cope with routine images in which chromosomes are randomly rotated, possibly blurred, or corrupted by overlapping or by dye stains.

Features extraction
The axis estimation is carried out by a robust modified version of a vessel-tracking algorithm. Three features are derived from the axis:
- length
- density profile (64 samples)
- contour function (64 samples)
Two other geometrical features are considered:
- perimeter
- area

[Figure: axis calculation for the feature extraction]

Polarization
Chromosomes are randomly rotated, so each one has to be oriented consistently in order to comply with:
- a uniform ordering of the feature arrays
- the orientation standard adopted
Boosted alternating decision tree:
- Decision node: specifies a predicate condition based on a feature.
- Prediction node: specifies a value to add to the polarization score.

Feature pre-processing
Standardization is needed because the images differ in:
- zoom
- illumination conditions
- stage of the prometaphase (chromosomes belong to slightly different stages)

[Figure: length distribution for every class, before (top) and after (bottom) rescaling]

Classification via Neural Network
- 3-layer ANN with 131, 131, and 24 nodes respectively (131 matches the feature count: 1 + 64 + 64 + 1 + 1)
- activation functions: log-sigmoid
- training algorithm: scaled conjugate gradient
- training set: 50 karyotypes
- validation set: 20 karyotypes
- testing set: 49 karyotypes

Class Reassigning Algorithm
The human karyotype contains 22 pairs of autosomal chromosomes and 1 pair of sex chromosomes, which makes classification a constrained problem.
Linear Programming algorithm:
- rearranges the classifier output
- satisfies the above constraints
- maximizes the accuracy

Results and Discussion
The performance of the proposed methods is better than or comparable to the best of the other methods reported in the literature, providing a tool that is able to automatically analyze an image and whose results can be handed over with minimal human intervention to a classifier for automatic karyotyping.

119 cells containing a total of 5474 chromosomes were analyzed to test the segmentation algorithm. 50 of these cells were used to train the classifier, 20 to validate the training, and 49 to test the classification step.

Correctly segmented chromosomes: 94%
Correctly classified chromosomes: 96%

We have presented an algorithm able to automatically identify chromosomes in metaphase images. It performs a first segmentation step and then disentangles chromosome clusters by resolving adjacencies and overlaps separately with a greedy approach, which ensures that at each step only the best split of a blob is performed. The automatic classification step is able to deal with routine images in which chromosomes are randomly rotated, blurred, or corrupted by overlapping or by dye stains.
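The class reassigning step can be illustrated as an assignment problem, a special case of linear programming. The sketch below is an assumption-laden toy and not the authors' formulation: it gives every class `capacity` slots, takes the sum of the classifier scores as the objective to maximize, and solves the problem with the Hungarian method instead of a general LP solver. The names `solve_assignment`, `reassign_classes`, and `capacity` are hypothetical.

```python
def solve_assignment(cost):
    """Hungarian method: min-cost assignment of each row to a distinct column.

    Requires len(cost) <= len(cost[0]). Returns the column chosen for each row.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials (1-indexed)
    v = [0.0] * (m + 1)          # column potentials (1-indexed)
    p = [0] * (m + 1)            # p[j]: row currently matched to column j
    way = [0] * (m + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:           # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    columns = [0] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            columns[p[j] - 1] = j - 1
    return columns


def reassign_classes(scores, capacity=2):
    """Pick one class per chromosome, at most `capacity` per class,
    maximizing the summed classifier score (assumed objective)."""
    n_classes = len(scores[0])
    if len(scores) > n_classes * capacity:
        raise ValueError("more chromosomes than available class slots")
    # Column j stands for one slot of class j // capacity; costs are negated scores.
    cost = [[-row[j // capacity] for j in range(n_classes * capacity)]
            for row in scores]
    return [j // capacity for j in solve_assignment(cost)]
```

With four chromosomes, two classes, and scores `[[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]`, a plain arg-max puts three chromosomes in the first class; `reassign_classes` moves the least confident of them to the second class so that each class holds two. A real karyotype would need class-specific capacities (for instance for the sex chromosomes), which this sketch does not model.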