Classification of Breast Cancer Cells Using Artificial Neural Networks and Support Vector Machines
Emmanuel Contreras Guzman

The Motivation
Breast cancer is the second most deadly type of cancer in women worldwide.
  1.3 million women are diagnosed worldwide.
  Nearly half a million women die from this disease each year.
  Very curable if diagnosed early.
Cervical cancer is the third most deadly type of cancer in women worldwide.
  Half a million women are diagnosed.
  250,000 women die from this disease each year.
  Also very curable if diagnosed early.

The Data Set
Breast Cancer Wisconsin (Original) Data Set
  Samples collected from a minimally invasive fine-needle aspirate (FNA).
  458 benign (65.5%) and 241 malignant (34.5%) samples
  9 features (each on a scale from 1 to 10):
    Clump thickness
    Uniformity of cell size
    Uniformity of cell shape
    Marginal adhesion
    Single epithelial cell size
    Bare nuclei
    Bland chromatin
    Normal nucleoli
    Mitoses
Cervical Cancer Data
  5 features:
    Amount of cytoplasm
    Nuclei count
    Nuclei shape
    Nuclei texture
    Nuclei area

Pre-Processing
Unknown samples
  16 samples have incomplete data (missing bare nuclei value)
  Samples used for analysis: 683
Normalization
  Each feature value rescaled to the range 0 to 1

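The two pre-processing steps above can be sketched as follows. This is a minimal illustration in Python, not the author's MATLAB driver: the helper names `drop_incomplete` and `min_max_normalize` are hypothetical, the `"?"` missing-value marker is an assumption about the raw file, and the three toy rows are made up for the example.

```python
def drop_incomplete(rows, missing="?"):
    """Keep only the rows that contain no missing-value marker."""
    return [row for row in rows if missing not in row]


def min_max_normalize(rows):
    """Rescale every column to the range [0, 1] (min-max scaling)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Toy rows (hypothetical values); the second row has a missing entry.
raw = [
    ["5", "1", "1"],
    ["10", "?", "4"],
    ["1", "10", "7"],
]

complete = [[float(v) for v in row] for row in drop_incomplete(raw)]
scaled = min_max_normalize(complete)
```

Applied to the full data set, the first step would be the one that reduces 699 samples to 683; the second maps the 1 to 10 feature scale onto 0 to 1.
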
Artificial Neural Network Analysis
MATLAB driver program
Network configurations:
  1, 2, or 3 hidden layers
  Each layer with 3, 5, or 7 perceptrons
  Data split: 70% training, 15% testing, 15% validation
Training and transfer functions used:
  Scaled Conjugate Gradient (a training algorithm rather than a transfer function)
  Logistic (logsig)
  Tan-Sigmoid (tansig)
Network retrained 50 times, each time on a random sample drawn with replacement.

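As a rough illustration of the search space described above, the sketch below enumerates the hidden-layer configurations and builds a 70/15/15 split of sample indices. It is written in Python rather than the MATLAB used in the study, and it rests on assumptions: every hidden layer in a network is taken to have the same width (suggested by the `[7 7]` notation on the results slide), the helper names are hypothetical, and a plain shuffle stands in for the sampling with replacement, which is not reproduced here.

```python
import random


def hidden_layer_configs(depths=(1, 2, 3), widths=(3, 5, 7)):
    """List hidden-layer sizes, e.g. [5, 5] for two layers of 5 perceptrons."""
    return [[width] * depth for depth in depths for width in widths]


def split_indices(n_samples, train=0.70, val=0.15, seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train)
    n_val = round(n_samples * val)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],  # remainder is the test set
    )


configs = hidden_layer_configs()
train_idx, val_idx, test_idx = split_indices(683)
```

Under these assumptions there are nine layer configurations per transfer function, and each would be trained and scored repeatedly before averaging.
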
Support Vector Machine Analysis
MATLAB driver program
Parameters tuned:
  Kernels
    Radial Basis Function
    Linear
    Polynomial, degrees 2 and 3
  Box Constraint (C): support vector cost/penalty
    1e-5 to 1e5, increasing by a factor of 10
  Kernel Scale (gamma): how strongly individual examples influence the hyperplane
    1e-5 to 1e5, increasing by a factor of 10
k-way cross-validation

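The parameter sweep above amounts to a grid search. The Python sketch below only enumerates such a grid; it does not train any model and is not the author's MATLAB driver. The dictionary keys are hypothetical names, the full cross-product of kernels, box constraints, and kernel scales is an assumption, and the "auto" kernel scale mentioned on the results slide is left out.

```python
from itertools import product

# 1e-5, 1e-4, ..., 1e5: eleven values, each 10 times the previous one.
powers_of_ten = [float(f"1e{exp}") for exp in range(-5, 6)]

# (kernel name, polynomial degree or None); names are illustrative only.
kernels = [
    ("rbf", None),
    ("linear", None),
    ("polynomial", 2),
    ("polynomial", 3),
]

grid = [
    {
        "kernel": name,
        "degree": degree,
        "box_constraint": c,
        "kernel_scale": gamma,
    }
    for (name, degree), c, gamma in product(kernels, powers_of_ten, powers_of_ten)
]
```

Each entry in `grid` would then be scored with cross-validation and the best-scoring entry reported.
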
Artificial Neural Network Results
Configurations with 97.4% accuracy:
  Scaled Conjugate Gradient (SCG) without normalization: [5], [7], [7 7]
  Scaled Conjugate Gradient (SCG) with normalization: [5 5], [7]
  Logistic transfer function (logsig) without normalization: [5 5], [7]
  Logistic transfer function (logsig) with normalization: [5], [7]
  Tan-Sigmoid transfer function (tansig) without normalization: [5], [5 5], [7], [7 7]
  Tan-Sigmoid transfer function (tansig) with normalization: [7], [7 7], [7 7 7]
Sensitivity: 98%
Specificity: 96%

Support Vector Machine Results
Maximum accuracy: 94.24%
  2nd-order polynomial kernel, box constraint of 1e-1, kernel scale (gamma) of 1
  Sensitivity: 96.58%
  Specificity: 91.43%
Second most accurate configurations, accuracy: 94.13%
  Radial Basis Function kernel, box constraint of 1, kernel scale set to auto
  Polynomial kernel of degree 3, box constraint of 1e-1, kernel scale set to auto
  Sensitivity: 96.35%
  Specificity: 91.02%

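For reference, accuracy, sensitivity, and specificity follow the usual confusion-matrix definitions. The Python sketch below shows those formulas; the function name is hypothetical, treating malignant as the positive class is an assumption, and the counts in the example are invented for illustration and are not the study's results.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positive (assumed malignant) samples classified right/wrong.
    tn/fp: negative (assumed benign) samples classified right/wrong.
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
accuracy, sensitivity, specificity = classification_metrics(tp=96, fn=4, tn=180, fp=20)
```

With these invented counts, sensitivity is 96/100, specificity is 180/200, and accuracy is 276/300.
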
Conclusion
None of the ANN configurations with 3 perceptrons per layer achieved the maximum accuracy of 97.4%.
ANN configurations with three hidden layers generally did not generalize well and appeared to overfit the data.
Classification of breast cancer data using an ANN should be kept to one or two hidden layers of about 5 perceptrons in order to achieve the highest classification accuracy.
Compared with the best SVM result, the neural network was higher by about 3.2 percentage points in accuracy (97.4% vs. 94.24%), 1.4 in sensitivity (98% vs. 96.58%), and 4.6 in specificity (96% vs. 91.43%).
A neural network appears to be the better algorithm for classifying this data.

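The ANN versus SVM gaps can be checked directly from the figures reported on the two results slides. The short Python sketch below does that subtraction; the variable names are illustrative, and the values are the best ANN result and the best SVM result as reported.

```python
# Reported percentages: best ANN configurations vs. best SVM configuration.
ann = {"accuracy": 97.4, "sensitivity": 98.0, "specificity": 96.0}
svm = {"accuracy": 94.24, "sensitivity": 96.58, "specificity": 91.43}

# Difference in percentage points, rounded to two decimals.
gap = {metric: round(ann[metric] - svm[metric], 2) for metric in ann}
```

The differences are in percentage points, not relative percent change.
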
Discussion
Other ANN and SVM configurations were not analyzed; more fine-tuning of parameters is possible.
Artificial Neural Network
  Learning rates, transfer functions, more or fewer layers and perceptrons
Support Vector Machines
  More kernels
  Different cost function
Removing the “bare nuclei” feature and using all 699 samples.