Recognition of biological cells – development
Marcin Skoczylas
Padova, 2007
To test the efficiency of the classifier and of the texture processing methods, two datasets were extracted from two different sources of cell images. Dataset source #1: new microscope, cells on mylar. Dataset source #2: old microscope, cells on plastic.
Dataset #1: new microscope, cells on mylar. 1165 learning spots, 129 testing spots. [Example spots: background, cell edge, cell inner part]
Dataset #2: old microscope, cells on plastic. 1161 learning spots, 128 testing spots. [Example spots: background, cell edge, cell inner part]
Small fragments extracted from the image were preprocessed, and texture features were computed:
1. Co-occurrence matrix features, including the Inverse Difference Moment.
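As an illustration of the co-occurrence features, here is a minimal NumPy sketch of the Inverse Difference Moment. It assumes the usual definition, the sum of P(i, j) / (1 + (i - j)^2) over a normalized, symmetric co-occurrence matrix P. The function names, the pixel offset and the number of grey levels are illustrative choices, not taken from the slides.

```python
import numpy as np

def cooccurrence_matrix(img, levels, dx=1, dy=0):
    """Normalized, symmetric grey-level co-occurrence matrix.

    img must hold integers in [0, levels); (dx, dy) is a non-negative
    pixel offset (an illustrative choice, not taken from the slides).
    """
    img = np.asarray(img)
    h, w = img.shape
    a = img[0:h - dy, 0:w - dx]   # reference pixels
    b = img[dy:h, dx:w]           # their neighbours at the given offset
    glcm = np.zeros((levels, levels), dtype=float)
    np.add.at(glcm, (a.ravel(), b.ravel()), 1.0)
    glcm += glcm.T                # count every pair in both directions
    return glcm / glcm.sum()

def inverse_difference_moment(glcm):
    """Sum of P(i, j) / (1 + (i - j)^2); close to 1 for homogeneous textures."""
    i, j = np.indices(glcm.shape)
    return float(np.sum(glcm / (1.0 + (i - j) ** 2)))
```

A perfectly uniform spot gives the maximum value of 1, while a spot in which every horizontal neighbour pair differs by one grey level gives 0.5.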
2. Wavelet features #1: Subimages were obtained by wavelet decomposition with the Daubechies 4 (DB4) wavelet. Each subimage contains the information of a specific scale and orientation. Two features were calculated for each subimage; the averaged norm of the mth channel was used as the first energy feature.
[Figures: DB4 wavelet; selected subimages obtained from wavelet decomposition]
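The subimage energy features can be sketched as follows. This is a rough NumPy-only illustration under several assumptions: "DB4" is taken to mean the 4-tap Daubechies filter, image borders are extended periodically, the first feature is the mean absolute coefficient value (an averaged l1-norm), and the second feature is the standard deviation. The slides do not give the second feature, so it is a stand-in.

```python
import numpy as np

SQRT3 = np.sqrt(3.0)
# 4-tap Daubechies low-pass filter (assumption about what "DB4" denotes).
LO = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))
# Matching high-pass (quadrature mirror) filter.
HI = np.array([LO[3], -LO[2], LO[1], -LO[0]])

def analyze_rows(x, filt):
    """Filter every row with periodic extension, then keep every second sample."""
    out = np.zeros((x.shape[0], x.shape[1] // 2))
    for k, c in enumerate(filt):
        out += c * np.roll(x, -k, axis=1)[:, ::2]
    return out

def dwt2_level(img):
    """One decomposition level; image sides must be even.

    Key letters: first = filter along rows, second = filter along columns.
    """
    x = np.asarray(img, dtype=float)
    lo_r, hi_r = analyze_rows(x, LO), analyze_rows(x, HI)
    return {
        "LL": analyze_rows(lo_r.T, LO).T,
        "LH": analyze_rows(lo_r.T, HI).T,
        "HL": analyze_rows(hi_r.T, LO).T,
        "HH": analyze_rows(hi_r.T, HI).T,
    }

def energy_features(subimage):
    """(averaged l1-norm, standard deviation); the second one is an assumption."""
    return float(np.mean(np.abs(subimage))), float(np.std(subimage))
```

Applying `dwt2_level` again to the "LL" subimage would give the next, coarser scale.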
3. Wavelet features #2: edge detector. A new operator is introduced. Discrete wavelet coefficients (DB4) are calculated from the obtained pixel-vectors. For each coefficient, the mean and standard deviation over all vectors are computed, giving two rotation-invariant feature vectors. Principal Component Analysis is used to reduce the dimensionality of these feature vectors.
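The last two steps (per-coefficient statistics, then PCA) can be sketched as below. The operator that produces the pixel-vectors is not reproduced here; the sketch simply assumes a matrix with one row of wavelet coefficients per pixel-vector, and the function names are hypothetical.

```python
import numpy as np

def mean_std_features(coeffs):
    """coeffs: array of shape (n_vectors, n_coeffs), one row per pixel-vector.

    Returns the per-coefficient mean and standard deviation. Both are
    unchanged when the rows are reordered.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    return coeffs.mean(axis=0), coeffs.std(axis=0)

def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components."""
    x = np.asarray(features, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

Because the statistics ignore the order of the rows, they stay the same whenever a rotation of the spot only reorders the pixel-vectors. That is one plausible reading of the rotation invariance mentioned above, not a statement about the actual operator.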
Example images show Continuous Wavelet Transform coefficients after processing with the proposed operator (axes: scale, shift).
Classifiers:
- Feed-forward neural network (layers: input, hidden, output)
- Support Vector Machines (input space, feature space; kernel function: RBF kernel)
n-fold cross-validation and grid search are used to obtain the γ and C parameters.
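The parameter search can be sketched as follows, assuming the standard RBF kernel K(x, y) = exp(-γ‖x − y‖²). The `cv_accuracy` callable is hypothetical: it stands for whatever routine trains the SVM on each training fold and returns the mean accuracy over the folds. The exponential grid in the usage note is a common convention, not a value taken from the slides.

```python
import itertools
import numpy as np

def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2) for every pair of rows."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sq_dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=2)
    return np.exp(-gamma * sq_dist)

def kfold_indices(n_samples, n_folds, seed=0):
    """Yield (train_idx, test_idx) pairs for n-fold cross-validation."""
    order = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(order, n_folds)
    for i in range(n_folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]

def grid_search(cv_accuracy, c_values, gamma_values):
    """Return the (C, gamma) pair with the best cross-validated accuracy.

    cv_accuracy is a hypothetical callable (C, gamma) -> mean accuracy.
    """
    return max(itertools.product(c_values, gamma_values),
               key=lambda pair: cv_accuracy(*pair))
```

A typical call would pass grids such as `2.0 ** np.arange(-5, 16, 2)` for C and `2.0 ** np.arange(-15, 4, 2)` for γ, then refine around the best pair.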
Dataset #1 features recognition accuracy.
Dataset #2 features recognition accuracy.
Recognition examples
(segmentation)
Shape classification
Segmented objects are extracted from the image and their shape features are calculated:
- height
- width
- volume: the area occupied by the object (the number of pixels inside it)
- mean pixel intensity
- major axis length: the length of the longest line that can be drawn through the object
- minor axis length: the length of the longest line through the object that is perpendicular to the major axis
- thinness ratio: roundness of the object
- equivalent diameter: the diameter of a circle with the same area as the object region
Objects from the learning images were first classified manually, and their normalized shape features were fed as input to an additional classifier.
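Several of the listed shape features can be computed from a binary object mask as in the sketch below. It assumes the common definitions: thinness ratio = 4π·area / perimeter² and equivalent diameter = sqrt(4·area / π). The perimeter is estimated coarsely by counting pixel edges that face the background. The function name and the dictionary keys are illustrative, and the major and minor axis lengths are left out for brevity.

```python
import numpy as np

def shape_features(mask, image):
    """Shape features of one segmented object.

    mask:  2-D boolean array, True inside the object.
    image: grey-level image of the same shape.
    """
    mask = np.asarray(mask, dtype=bool)
    rows, cols = np.nonzero(mask)
    area = int(mask.sum())
    # Pad with background so that objects touching the border are handled.
    padded = np.pad(mask, 1)
    # Count object pixel edges whose neighbour in each direction is background.
    perimeter = sum(
        int(np.sum(padded & ~np.roll(padded, shift, axis=axis)))
        for axis in (0, 1) for shift in (1, -1)
    )
    return {
        "height": int(rows.max() - rows.min() + 1),
        "width": int(cols.max() - cols.min() + 1),
        "area": area,
        "mean_intensity": float(np.asarray(image)[mask].mean()),
        "thinness_ratio": float(4 * np.pi * area / perimeter ** 2),
        "equivalent_diameter": float(np.sqrt(4 * area / np.pi)),
    }
```

Before such values are fed to a classifier they would be normalized, for example to zero mean and unit variance per feature over the learning set.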
(segmentation)
SUMMARY
- Implementation of a sophisticated “segments join” algorithm, of other segmentation algorithms (such as Active Contours and Graph-Based) and of additional shape features is necessary.
- The GUI is implemented in Java, so the "client" should run on any operating system that has a Java virtual machine installed.
- Data is stored in an SQL database.
- The system core is implemented in C++/MPI and most calculations are performed in parallel. However, some of the algorithms are still in Java or Matlab. This is temporary: it serves to test their reliability and to provide "ground truth" for the C++ version, which is harder to debug.
- A change in the environment (especially the zoom factor) makes it necessary to train the classifier again.
- The amount of data is very large and transfer takes time; adding Huffman/LZW compression to the client/server protocol should be considered.
- Pixel classification and segmentation took ~14 seconds on a machine with 2 Pentium III 800 MHz processors (slightly overloaded by other cluster users), using an 8-pixel step and a 28x28-pixel spot (window) size.
- At this moment the system has 73 632 lines of code.
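The 8-pixel step and 28x28-pixel window quoted above correspond to a sliding-window scan of the image. A minimal sketch of such a scan is given below; the function name is hypothetical, and the real system performs this step in C++/MPI.

```python
import numpy as np

def iter_spots(image, window=28, step=8):
    """Yield (top, left, spot) for every window that fits inside the image."""
    h, w = image.shape[:2]
    for top in range(0, h - window + 1, step):
        for left in range(0, w - window + 1, step):
            yield top, left, image[top:top + window, left:left + window]
```

Each yielded spot would then go through feature extraction and pixel classification. Since the spots are independent of each other, the scan can be split across processors by rows.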
THANK YOU
DO YOU HAVE ANY QUESTIONS?
Marcin.Skoczylas@lnl.infn.it