Robust Lung Nodule Classification using 2.5D Convolutional Neural Network

Shiwen Shen1,2, Alex Bui1,2, William Hsu1,2
1Radiological Sciences, 2Bioengineering, University of California, Los Angeles

INTRODUCTION

The presence of lung nodules in computed tomography (CT) images is an important indicator of early-stage lung cancer, but discovering a nodule among a complex web of normal structures in a CT image is a non-trivial task, as shown in Figure 1. Researchers have actively pursued methods for computer-aided diagnosis (CAD) to assist radiologists in the interpretation process, but maintaining high sensitivity while minimizing false positives remains a challenging problem. Moreover, existing methods show a significant decrease in performance on datasets whose acquisition parameters and/or patient cohorts differ from those of the training data. Models with better transferability are therefore needed for practical clinical usage.

In this work, we developed a robust model that automatically classifies lung nodules vs. non-nodules using a hybrid ensemble of multiple deep convolutional neural networks (CNNs). The model was first built from a large, publicly available dataset and then externally validated on an independent dataset without retraining. The results indicate that the model remains robust across datasets without retraining and transfers well to independently acquired data.

Figure 1. (a) Chest CT images with annotated lung nodule in red; (b) examples of lung nodules.

DATASET

Two datasets were used in this work: the Lung Image Database Consortium image collection (LIDC-IDRI) and a UCLA dataset.

LIDC-IDRI: 893 diagnostic and lung cancer screening chest CT scans were used. Lung nodules with a diameter larger than 3 mm were manually annotated (nodule contours) by four radiologists following a two-phase image annotation process. We considered nodules with sizes between 5 and 30 mm, obtaining 6,776 nodules.

UCLA dataset: 158 diagnostic chest CT scans were collected from UCLA to establish an independent dataset. An experienced thoracic radiologist annotated 158 nodules for this dataset.

METHODS

Preprocessing.
We:
1) normalized the pixel values from the range (-1000, 500) HU to the range (0, 1);
2) used all regions marked by radiologists as nodules;
3) extracted non-nodule candidates with the methods described in [1], using multi-level thresholding and morphological operations combined with rule-based analysis;
4) extracted a 3D cube of 40 × 40 × 40 mm, which was resized to 64 × 64 × 64 voxels; and
5) stacked three patches, centered in the cube and parallel to the axial, sagittal, and coronal views, as the 2.5D input.

2.5D CNN for nodule detection.
We designed a hybrid ensemble CNN structure consisting of a VGG module [2], a residual module [3], and a dense module [4], as depicted in Figure 2.

One convolution (conv) unit in the VGG module comprises two 3×3 conv layers and one 2×2 max pooling layer. Three conv units were used in total, with 32, 64, and 128 kernels, respectively, in each of their two conv layers.

The residual module enforces learning of residual functions in relation to the layer inputs. The residual unit (RU) is the basic building block of the residual module and consists of two paths: 1) a direct path from the input, and 2) a batch normalization and convolution path. The outputs of these two paths are combined by addition. A 49-layer residual structure is used.

The dense module is made up of three densely connected convolution blocks. Each block is made up of 12 batch normalization + convolution layer combinations. Every convolution layer inside a dense block is connected to all subsequent convolution layers in the same block.

The output of each module was then linked to a fully-connected (FC) layer with 1,024 rectified linear units. This layer was then fully connected to two outputs.

Figure 2. Illustration of 2.5D deep convolution neural network architecture.

RESULTS

The scans in the LIDC dataset were randomly divided into five subsets of similar size. Three subsets were used for training, one for validation, and one for testing. The test subset contained 207 scans with 1,262 nodules and 8,281 non-nodules. The UCLA dataset, which included 158 nodules and 3,938 non-nodules, was used as an independent dataset to validate the model performance. The results are compared with a recently published work [5] (2017) in Table 1. We achieved better results than [5] across the reported metrics, and performance remained robust on the independent dataset.

Table 1. Results comparison for lung nodule classification.

Method (dataset)         AUC     ACC     SEN     SPE     CPM
Proposed (LIDC-IDRI)     0.994   0.974   0.970   0.973   0.955
Proposed (UCLA)          0.971   0.942   0.886   0.939   0.83
[5] (2017)               0.922 0.919 0.947 NA

DISCUSSION

Although appreciable research has been conducted on lung nodule detection and classification, very few studies include model validation with independent datasets. Among those that perform such external testing, most show a significant performance decrease. A nodule classification system with high transferability is therefore highly desirable for clinical use. Motivated by this, we developed a novel CNN with overfitting-control techniques. The developed system shows high classification performance as well as transferability across differently acquired datasets. Future work will involve developing a robust nodule diagnosis system using related deep learning techniques.

Acknowledgements. The project is supported in part by NSF award CCF-1436827. The authors would also like to thank Dr. Denise Aberle for her insights into this effort.

References
[1] Shen S, et al. An automated lung segmentation approach using bidirectional chain codes to improve nodule detection accuracy. Comput Biol Med. 2015.
[2] Simonyan K, et al. Very deep convolutional networks for large-scale image recognition. arXiv. 2014.
[3] He K, et al. Identity mappings in deep residual networks. ECCV. 2016.
[4] Huang G, et al. Densely connected convolutional networks. arXiv. 2016.
[5] Froz BR, et al. Lung nodule classification using artificial crawlers, directional texture and support vector machine. Expert Systems with Applications. 2017.
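Preprocessing sketch. The intensity normalization and the 2.5D patch stacking described under Preprocessing can be illustrated with a small, dependency-free Python sketch. It is not the authors' code. The function names, the `cube[z][y][x]` axis convention, and the clipping of values outside (-1000, 500) HU are assumptions made for illustration; cube extraction and resizing to 64 × 64 × 64 are not shown.

```python
def normalize_hu(value, lo=-1000.0, hi=500.0):
    """Map a Hounsfield value linearly from [lo, hi] to [0, 1].

    Values outside the window are clipped first (an assumption; the text
    only states the source and target ranges).
    """
    clipped = max(lo, min(hi, value))
    return (clipped - lo) / (hi - lo)


def orthogonal_patches(cube):
    """Return the three central planes of a cubic volume indexed cube[z][y][x].

    The planes are parallel to the axial (fixed z), sagittal (fixed x) and
    coronal (fixed y) views. Stacked, they form a 3-channel 2.5D input.
    """
    n = len(cube)
    c = n // 2  # index of the central slice along every axis
    axial = [row[:] for row in cube[c]]                                # [y][x]
    sagittal = [[cube[z][y][c] for y in range(n)] for z in range(n)]   # [z][y]
    coronal = [cube[z][c][:] for z in range(n)]                        # [z][x]
    return [axial, sagittal, coronal]


# Toy 4 x 4 x 4 volume in HU (hypothetical values, far smaller than 64^3).
n = 4
volume = [[[-1000 + 100 * (z + y + x) for x in range(n)]
           for y in range(n)] for z in range(n)]
normalized = [[[normalize_hu(v) for v in row] for row in plane]
              for plane in volume]
patches = orthogonal_patches(normalized)
print(len(patches), len(patches[0]), len(patches[0][0]))
```

Each of the three planes has the same height and width as the cube, so the stacked result can be passed to a 2D CNN as a three-channel image.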
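Architecture arithmetic sketch. The sizes implied by the VGG and dense module descriptions can be checked with a few lines of Python. This is only a back-of-the-envelope sketch under stated assumptions: the 3×3 convolutions are assumed to preserve spatial size ("same" padding, which the text does not state), the input is assumed to be the 64 × 64 patch, and the function names are hypothetical.

```python
def vgg_unit_shapes(input_size=64, kernels=(32, 64, 128), pool=2):
    """Trace (channels, height, width) after each conv unit.

    Assumes the two 3x3 convolutions in a unit keep the spatial size, so
    only the 2x2 max pooling changes it (halving height and width).
    """
    shapes = []
    size = input_size
    for k in kernels:
        size //= pool
        shapes.append((k, size, size))
    return shapes


def dense_block_links(num_layers=12):
    """Count direct links when every layer feeds all later layers in a block.

    Only links among the block's own layers are counted, not the block input.
    """
    return num_layers * (num_layers - 1) // 2


print(vgg_unit_shapes())    # [(32, 32, 32), (64, 16, 16), (128, 8, 8)]
print(dense_block_links())  # 66
```

Under these assumptions, the VGG module reduces a 64 × 64 input to 128 feature maps of 8 × 8 before the fully-connected layer, and each 12-layer dense block contains 66 direct layer-to-layer connections.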
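Metric sketch. Three of the metrics in Table 1, accuracy (ACC), sensitivity (SEN), and specificity (SPE), follow directly from the confusion counts of a binary nodule vs. non-nodule classifier. The sketch below uses standard definitions with made-up toy labels; it is not the evaluation code used for the poster, and the function name is hypothetical.

```python
def classification_metrics(labels, predictions):
    """Compute ACC, SEN and SPE for binary labels (1 = nodule, 0 = non-nodule)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in labels")
    return {
        "ACC": (tp + tn) / len(pairs),  # fraction of all candidates classified correctly
        "SEN": tp / (tp + fn),          # fraction of true nodules detected
        "SPE": tn / (tn + fp),          # fraction of non-nodules correctly rejected
    }


# Toy example (hypothetical data): 4 nodules and 6 non-nodules.
labels      = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(labels, predictions)
print(metrics)
```

Because non-nodule candidates greatly outnumber nodules in both test sets, sensitivity and specificity are more informative here than accuracy alone.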