Diagnosing heart diseases with deep neural networks

My background
Julian de Wit
Freelancer software / machine learning
Technical University Delft / TNO: software engineering
Love of biologically inspired computing
Since 2006 heavily re-interested in neural nets; last few years the neural net “revolution”
Turn academic ideas into practical apps
Medical, documents, radar, plant grading

Agenda
Diagnose heart disease challenge
Deep learning
Solution discussion
Results
Feel free to ask questions during the talk!

Challenge
Second National Data Science Bowl
Kaggle.com / Booz Allen Hamilton
This year’s challenge: automate a manual 30-minute clinical procedure
Ca. 500,000 patients/year in the USA
Estimate heart volume based on MRIs
Systole/diastole ratio is the ‘health’ predictor
750 teams, $200,000 prize money

Kaggle.com
Competition platform for ‘data scientists’
Challenges hosted for companies
Prize money and exposure
400,000+ registered ‘competitors’
Lesson: there is always someone smarter than you!
Today’s state of the art is tomorrow’s baseline!

Given: MRI’s, metadata, train-volumes Train 700, Test: 1000 patients, 300.000+ imgs Estimate volume of left ventricle Challenge
Deep learning
Image data → deep learning (CNN)
Neural networks 2.0, but don’t believe ALL the hype
Structured data → feature engineering + tree/linear models
Great when ‘perception’ data is involved
Spectacular results with image analysis
My take: ‘superhuman’, with a twist

Deep learning
Solution
‘Vanilla’ architecture, used by many teams (e.g. #2, Ghent University)
Input the slices, regress directly on the provided volumes (output: e.g. 123ml)

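A minimal sketch of such a direct regressor, assuming PyTorch; the layer sizes, slice count, and input resolution are illustrative stand-ins, not any team’s actual model:

```python
import torch
import torch.nn as nn

# Illustrative CNN that maps a stack of MRI slices straight to a volume in ml.
class VolumeRegressor(nn.Module):
    def __init__(self, n_slices=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(n_slices, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 1)  # single scalar: estimated volume (ml)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = VolumeRegressor()
pred = model(torch.randn(2, 10, 180, 180))                # two patients
loss = nn.functional.mse_loss(pred, torch.tensor([[120.0], [95.0]]))
```
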
Solution
Less publicized approach: segment the image, then integrate the estimated areas into a volume using the metadata (sketch below)
Problem: no segmentation annotations provided
Remedy: Sunnybrook data / hand labeling

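Once per-slice masks exist, the integration step is simple geometry. A sketch, assuming the pixel spacing and the distance between slices come from the DICOM metadata (the function and argument names are mine):

```python
import numpy as np

def lv_volume_ml(masks, pixel_spacing_mm, slice_spacing_mm):
    """Stack per-slice left-ventricle areas into a volume estimate.

    masks: (n_slices, H, W) binary segmentations of the short-axis slices.
    pixel_spacing_mm: (row, col) spacing, e.g. from the DICOM PixelSpacing tag.
    slice_spacing_mm: distance between adjacent slices.
    """
    area_per_pixel = pixel_spacing_mm[0] * pixel_spacing_mm[1]
    areas_mm2 = masks.reshape(len(masks), -1).sum(axis=1) * area_per_pixel
    volume_mm3 = areas_mm2.sum() * slice_spacing_mm   # sum of slice "disks"
    return volume_mm3 / 1000.0                        # 1 ml == 1000 mm^3

# e.g. 10 slices of 256x256 masks, 1.4mm pixels, 10mm between slices
vol = lv_volume_ml(np.zeros((10, 256, 256)), (1.4, 1.4), 10.0)
```
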
Solution
Segmentation: the traditional architecture is a bad fit
Every layer gives higher-level features but less spatial info (bag-of-words effect)
Per-pixel classification is possible but coarse due to the spatial loss
Cumbersome: 256 x 256 x 300,000 classifications (see the sketch below)

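To make that cost concrete, the naive route in sketch form (hypothetical classifier, illustrative patch size): crop a patch around every pixel and run one full forward pass per pixel.

```python
import torch
import torch.nn.functional as F

# classifier: any CNN returning the left-ventricle probability for the patch
# center. For one 256x256 image this loop already means 65,536 forward
# passes; across 300,000+ images it becomes ~2e10 classifications.
def segment_by_patches(classifier, image, patch=32):
    h, w = image.shape
    pad = patch // 2
    padded = F.pad(image[None, None], (pad, pad, pad, pad))
    out = torch.zeros(h, w)
    for y in range(h):
        for x in range(w):
            crop = padded[:, :, y:y + patch, x:x + patch]
            out[y, x] = classifier(crop).squeeze()   # one pass per pixel(!)
    return out
```
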
Solution
Segmentation: fully convolutional architecture + upscaling
Efficient: classifies all pixels at once
Remaining problem: the spatial bottleneck at the bottom keeps the output coarse

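A fully convolutional sketch of that idea (illustrative layer sizes): with no dense layers, one forward pass classifies every pixel, but the two poolings shrink the feature map 4x, and the final upscaling is exactly where the coarseness comes from.

```python
import torch.nn as nn

fcn = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 1, 1),   # per-pixel logit at 1/4 of the input resolution
    nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False),
)
```
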
Solution
Segmentation: U-net architecture
Skip connections give more detail in the segmentation output
The U-net author now works at DeepMind Health

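A toy one-level U-net to show the mechanism; real U-nets stack several of these levels, but the skip connection works the same way:

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.bottom = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.out = nn.Conv2d(64, 1, 1)   # 64 = 32 upsampled + 32 skipped

    def forward(self, x):
        skip = self.down(x)                        # full-resolution features
        h = self.bottom(self.pool(skip))           # coarse bottleneck
        h = torch.cat([self.up(h), skip], dim=1)   # skip brings detail back
        return self.out(h)                         # per-pixel logits
```
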
Solution
Segmentation results impressive: the machine did exactly what it was told
Confused by uncommon examples (< 1% of cases)
Remedy: active learning
Nice property: brightness == (un)certainty

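One way to read the brightness remark, assuming a per-pixel sigmoid output: values near 0 or 1 are confident, values near 0.5 are not, so the raw probability map rendered as brightness doubles as a confidence display. A tiny sketch:

```python
import torch

p = torch.rand(128, 128)                       # stand-in for sigmoid output
uncertainty = 1.0 - 2.0 * (p - 0.5).abs()      # 0 = certain, 1 = maximally unsure
doubtful = (uncertainty > 0.5).float().mean()  # fraction worth a second look
```
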
Solution
Dirty secret: MUCH data cleaning (see the sketch after this list)
Slice order
Missing slices
Out-of-bounds slices
Wrong orientation
Missing frames
BAD ground-truth volumes
Gradient boosting “calibration” procedure
Not relevant in a real setting: just rescan the MRI

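As one example of the cleanup, a slice-ordering sketch using pydicom (the SliceLocation tag is not guaranteed to be present in every series, and the gap tolerance is a made-up heuristic):

```python
import pydicom

def ordered_slices(paths, expected_gap_mm=10.0):
    """Sort DICOM slices anatomically and flag suspicious gaps."""
    dcms = sorted((pydicom.dcmread(p) for p in paths),
                  key=lambda d: float(d.SliceLocation))
    gaps = [float(b.SliceLocation) - float(a.SliceLocation)
            for a, b in zip(dcms, dcms[1:])]
    has_missing = any(g > 1.5 * expected_gap_mm for g in gaps)
    return dcms, has_missing
```
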
Results
Result: 3rd place
Only 1 model, no ensemble
Sub-10ml MAE → clinically significant
Many improvements possible:
More, cleaner training data
Expert annotations
Active learning

Results
Many other deep learning successes in medicine
Example: the retinopathy challenge
For the bulk of cases as good as expert doctors
Solutions already in use by companies

Summary
Deep learning for medical imaging

THE END....

Approach
Calibration
Use the provided volumes to calibrate
Removes systematic errors
Use a gradient booster on the residuals (sketch below)
Top 5 → top 3
Beware of overfitting

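A calibration sketch with scikit-learn: fit the booster on the residual (truth minus prediction) and add the predicted correction back. The features are hypothetical stand-ins, and the shallow trees and low learning rate reflect the overfitting warning:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.random((500, 3))                # e.g. age, sex, slice count (stand-ins)
pred = rng.random(500) * 200            # raw model volumes (ml)
truth = pred + rng.normal(0, 8, 500)    # provided volumes

booster = GradientBoostingRegressor(n_estimators=50, max_depth=2,
                                    learning_rate=0.05)
booster.fit(X, truth - pred)            # model the systematic error only

calibrated = pred + booster.predict(X)  # corrected volume estimates
```
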
Approach
Every pixel: left ventricle yes/no
Use a convolutional neural network
Sunnybrook data alone too simplistic
Train with hand-labeled segmentations
Reverse engineer how to label
Fix systematic errors with calibration against the provided volumes

Competition
Preprocessing
Use the DICOM info to make images uniform
Crop to 180x180 around the heart
Contrast stretch (sketch below)

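A sketch covering those three steps; the center crop is a crude stand-in for the actual heart localization, and the percentile values are illustrative:

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess(img, pixel_spacing_mm, crop=180):
    """Resample to 1mm pixels, center-crop, percentile contrast stretch."""
    img = zoom(img, pixel_spacing_mm)         # uniform physical pixel size
    h, w = img.shape
    y, x = (h - crop) // 2, (w - crop) // 2   # crude "around the heart"
    img = img[y:y + crop, x:x + crop]
    lo, hi = np.percentile(img, (1, 99))      # clip outliers, stretch contrast
    return np.clip((img - lo) / (hi - lo + 1e-6), 0.0, 1.0)
```
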
Hand labeling with my own tool
Big performance-limiting factor
Could not find out exactly how the labeling should be done

[Figure: classification vs. segmentation example — whole image labeled “Cat!”, per-pixel labels “Cat” / “Grass”]

Submission (CRPS)
Uncertainty based on the stdev of the error as a function of size (sketch below)
Model-provided uncertainty, but it does not account for uncertainty in the labels
Example: patient 429, an error of 89ml!
The provided label was wrong…

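The competition scored a cumulative distribution over 600 volume bins (CRPS), so a point prediction has to be widened into a CDF. A sketch where the width comes from the size-dependent error stdev of the slide (sigma_of is a hypothetical stand-in for that fitted relation):

```python
import numpy as np
from scipy.stats import norm

def submission_row(pred_ml, sigma_of):
    """Cumulative Gaussian P(volume <= v) for v = 0..599."""
    v = np.arange(600)
    return norm.cdf(v, loc=pred_ml, scale=sigma_of(pred_ml))

# wider distribution for larger hearts, per the stdev-vs-size relation
row = submission_row(123.0, sigma_of=lambda p: 4.0 + 0.03 * p)
```
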