Natural Language Processing of Knee MRI Reports

Slides:



Advertisements
Similar presentations
MLP Lyrical Analysis ● % of Unique Words ● # of Unique Words ● Average Word Length ● # of Lyrics ● # of Characters Input Feature Vectors:
Advertisements

Text Classification: An Implementation Project Prerak Sanghvi Computer Science and Engineering Department State University of New York at Buffalo.
K-means Based Unsupervised Feature Learning for Image Recognition Ling Zheng.
Machine learning Image source:
Convolutional Neural Networks for Image Processing with Applications in Mobile Robotics By, Sruthi Moola.
Hurieh Khalajzadeh Mohammad Mansouri Mohammad Teshnehlab
Building high-level features using large-scale unsupervised learning Anh Nguyen, Bay-yuan Hsu CS290D – Data Mining (Spring 2014) University of California,
Neural Text Categorizer for Exclusive Text Categorization Journal of Information Processing Systems, Vol.4, No.2, June 2008 Taeho Jo* 報告者 : 林昱志.
Neural Networks Vladimir Pleskonjić 3188/ /20 Vladimir Pleskonjić General Feedforward neural networks Inputs are numeric features Outputs are in.
Virtual Examples for Text Classification with Support Vector Machines Manabu Sassano Proceedings of the 2003 Conference on Emprical Methods in Natural.
Convolutional Restricted Boltzmann Machines for Feature Learning Mohammad Norouzi Advisor: Dr. Greg Mori Simon Fraser University 27 Nov
Computer Vision Lecture 7 Classifiers. Computer Vision, Lecture 6 Oleh Tretiak © 2005Slide 1 This Lecture Bayesian decision theory (22.1, 22.2) –General.
Rich feature hierarchies for accurate object detection and semantic segmentation 2014 IEEE Conference on Computer Vision and Pattern Recognition Ross Girshick,
Lecture 3b: CNN: Advanced Layers
Deep Learning Overview Sources: workshop-tutorial-final.pdf
Machine Learning Artificial Neural Networks MPλ ∀ Stergiou Theodoros 1.
Machine Learning Usman Roshan Dept. of Computer Science NJIT.
Usman Roshan Dept. of Computer Science NJIT
Combining Models Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya.
Sentiment analysis using deep learning methods
Hybrid Deep Learning for Reflectance Confocal Microscopy Skin Images
Deep Feedforward Networks
Action-Grounded Push Affordance Bootstrapping of Unknown Objects
Compact Bilinear Pooling
Sentence Modeling Representation of sentences is the heart of Natural Language Processing A sentence model is a representation and analysis of semantic.
Computer Science and Engineering, Seoul National University
Matt Gormley Lecture 16 October 24, 2016
Perceptrons Lirong Xia.
Classification with Perceptrons Reading:
Juweek Adolphe Zhaoyu Li Ressi Miranda Dr. Shang
Neural networks (3) Regularization Autoencoder
Machine Learning Basics
Training Techniques for Deep Neural Networks
CS 188: Artificial Intelligence
R-CNN region By Ilia Iofedov 11/11/2018 BGU, DNN course 2016.
Bird-species Recognition Using Convolutional Neural Network
Deep Face Recognition Omkar M. Parkhi Andrea Vedaldi Andrew Zisserman
Convolutional Neural Networks for sentence classification
Goodfellow: Chap 6 Deep Feedforward Networks
Face Recognition with Deep Learning Method
Vessel Extraction in X-Ray Angiograms Using Deep Learning
CS 4501: Introduction to Computer Vision Training Neural Networks II
Deep Learning Hierarchical Representations for Image Steganalysis
A Proposal Defense On Deep Residual Network For Face Recognition Presented By SAGAR MISHRA MECE
LECTURE 35: Introduction to EEG Processing
Neural Networks Geoff Hulten.
Lecture: Deep Convolutional Neural Networks
Analysis of Trained CNN (Receptive Field & Weights of Network)
John H.L. Hansen & Taufiq Al Babba Hasan
Unsupervised Pretraining for Semantic Parsing
Natural Language to SQL(nl2sql)
Word2Vec.
Image Classification & Training of Neural Networks
Neural networks (3) Regularization Autoencoder
Dennis Zhao,1 Dragomir Radev PhD1 LILY Lab
Presentation By: Eryk Helenowski PURE Mentor: Vincent Bindschaedler
Machine Learning with Clinical Data
Clinically Significant Information Extraction from Radiology Reports
Deep Learning for the Soft Cutoff Problem
Department of Computer Science Ben-Gurion University of the Negev
Introduction to Sentiment Analysis
Automatic Handwriting Generation
Introduction to Neural Networks
VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION
Extracting Why Text Segment from Web Based on Grammar-gram
Debasis Bhattacharya, JD, DBA University of Hawaii Maui College
Goodfellow: Chapter 14 Autoencoders
Tschandl P1,2, Argenziano G3, Razmara M4, Yap J4
Patterson: Chap 1 A Review of Machine Learning
Shengcong Chen, Changxing Ding, Minfeng Liu 2018
Presentation transcript:

Natural Language Processing of Knee MRI Reports Lionel Jin1 1Department of Computer Science, Yale University LILY Lab Introduction Results Natural Language Processing can be a powerful tool for processing the large quantity of radiology reports generated in the clinical setting. This project aims build a system that processes knee MRI reports produced at Yale-New Haven Hospital to diagnose knee conditions with good precision and recall. I apply different NLP tools to classify whether the reports reflect any abnormality of the Anterior Cruciate Ligament (ACL). The pipeline can be applied to other knee regions such as the Posterior Cruciate Ligament. Building a pipeline to classify knee MRI reports will be useful for labeling radiology reports available in online databases, facilitating medical research and image recognition work. It will also guide physician diagnosis. Data Exploration: 80 of the 505 reports (15.8%) have an ACL abnormality. Baseline: An accuracy of 89% was achieved, with an F1-score of 0.56 for abnormal reports and an weighted overall F1-score of 0.88. Machine Learning: The Stochastic Gradient Classifier using a L1 loss was consistently the best performing classifier, achieving an accuracy of 93%. Linear SVC also performed well when a L1 loss was used, giving an accuracy of 91%. kNN, Random Forest, and Ridge classifiers did poorly. These algorithms are known to be suboptimal when dealing with high-dimensional data. Larger training sets led to better performance, though the improvement in performance slows with larger dataset sizes. For both SGD and Linear SVC, better classification is achieved with a L1 loss compared to L2 loss. This suggests that it is beneficial that the weight on many of the features be reduced to zero. For both classifiers, using an n-gram set that includes n-grams going from 1-grams to 5-grams gives optimal performance. Deep Learning: Accuracy and F1-scores slightly lower than that obtained by SGD. A maximum accuracy of 92% was achieved with the result obtained by setting the character embedding dimensions to 128, applying a dropout of 0.5, using filters ranging from 3 to 5 words, and using 32 filters. Figure 1. Excerpt from knee MRI report 32,000 knee MRI reports - From Yale-New Haven Hospital Label 505 knee MRI reports - Filter non-knee MRIs and Label by ACL condition - Split into training (80%) and test (20%) sets Extract features with Tf-Idf statistic - Use 1- to 10-grams Train Machine Learning Classifiers - Random Forest, kNN, SGD Classifier, Linear SVC, Perceptron, Nearest Centroid Embed words as low-dimensional vectors - Use Word2Vec embeddings Train Convolutional Neural Network - Perform convolutions - Max-pool result - Add dropout and regularization - Classify using softmax layer Label more samples Figure 3. Classification results with different dataset sizes Data and Methods 32,000 knee MRI reports were acquired from Yale-New Haven Hospital. A gold standard dataset of 505 reports was constructed, with reports given one of seven labels: "Normal", "Full Thickness", "Near Full", "Partial", "Mucoid Degeneration", "Low grade strain", and "ACL graft". The training/test split was 80:20. Baseline Model: A baseline text search was trained: Filter out normal reports by searching for ACL[a-zA-z ]* (intact|unremarkable|normal) or Anterior Cruciate Ligament[a-zA-z ]* (intact|unremarkable|normal) Identify abnormal reports as those remaining that contain: ACL or Anterior Cruciate Ligament Feature Vector Construction: The reports were tokenized by computing the Term Frequency-Inverse Document Frequency statistic. Sets of n-grams where the maximum n ranged from 1 to 10 were used. Machine Learning: Classifiers including Naive Bayes, kNN, Nearest Centroid, Stochastic Gradient Descent, Linear Support Vector Machine, Random Forest, Perceptron, and Passive-Aggressive algorithm were trained on the features. Deep Learning: I also trained a Convolutional Neural Network on the data. Words in each report were embedded into low-dimensional vectors using Word2Vec embeddings. Then, a convolutional layer is added, and the output is max-pooled. Dropout and regularization is added, and a final softmax layer is used to classify the reports. Figure 4. Comparison of classifiers with L1 and L2 loss Figure 2. Project Method Conclusion I successfully built an NLP pipeline to classify knee radiology reports and achieved high accuracy, precision, and recall. I found that SGD and Linear SVC, both with L1 loss, work best amongst the different machine learning classifiers. The performance of these classifiers can be increased by using the tool to create a larger training set using a human-in-the-loop approach. The use of medical ontology, and section-specific parsing of the radiology reports may also improve performance. Figure 5. F1-score with different maximum n-gram lengths Acknowledgement Thank you Dragomir Radev for advising my research. Thank you Elliott Brown for acquiring the dataset and driving this work forward, Lawrence Staib for your helpful input, and Douglas Herrin, Daniel Oazi, and Atin Saha for your work in putting together the dataset. Table 2. CNN classification accuracy with different number of filters Table 1. Classification Results