Machine Learning Misconceptions May 3rd, 2017

Data Science Team
Levi Thatcher, PhD, Director of Data Science
Mike Mastanduno, PhD, Data Scientist
Taylor Miller, PharmD
Taylor Larsen, MS, Data Science Engineer

Additional outcome goals to select from: Reduce financial risk through improved ability to: Identify and analyze variation from targets Understand populations (conditions, risks, aims) Use trends to predict future needs Additional process aim statements to select from: Increase number of sites/providers/initiatives who receive reports of their performance Increase the number of high-service utilization patients and diagnostic groups identified (can serve as targets for improvement work) Increase the number of major drivers of utilization (hospital, ED, rehab) identified
© 2017 Health Catalyst Proprietary and Confidential

Purpose of Today’s Chat
Compare and contrast machine learning and artificial intelligence.
Discuss techniques that offer feedback into the system and when it’s necessary to retrain a model.
Give advice on how to avoid common pitfalls in machine learning implementation.
Talk about potential applications of the different classes of machine learning techniques.
Q&A

Machine Learning Definition
Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed. Such algorithms overcome following strictly static program instructions by making data-driven predictions or decisions through building a model from sample inputs.
- Wikipedia

Machine Learning Typical Use
Movie recommendations on Netflix
People you may know on Facebook
Advertising
Patient likelihood of contracting sepsis, being readmitted…
Using any tabular data source to predict a Y/N or continuous outcome

Artificial Intelligence Definition
Artificial intelligence (AI) is intelligence exhibited by machines. In computer science, the field of AI research defines itself as the study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chance of success at some goal.
- Wikipedia

These models are limited in their ability to “reason”, i.e. to carry out long chains of inferences, or optimization procedure to arrive at an answer. The number of steps in a computation is limited by the number of layers in feed-forward nets, and by the length of time a recurrent net will remember things.
- Yann LeCun, Director of Facebook AI Research

Artificial Intelligence Typical Use
Speech translation
Complex game playing
Self-driving cars
Content delivery
Radiology?

Difference Between ML and AI
It’s fuzzy.
Learning from data? No, not really.
Continuous learning from data? No, not really.
AI feels more complicated.
AI should be able to learn a skill and generalize it to another entirely different thing.
Many AI ideas get rebranded as ML as time goes on and we understand them.

Poll #1: Have you ever used machine learning or AI? 148 respondents
Yes, in my daily work – 21%
Yes, as a hobby – 17%
No, but I plan to – 52%
No, not applicable – 9%

How is machine learning used?

Poll #2: Where is your organization in terms of using machine learning in regular operations? 138 respondents
Using machine learning tools daily across many departments and use cases – 13%
Daily across a couple of use cases – 17%
Confined to a research study or two – 49%
What is machine learning? – 21%

When does a model learn? Different algorithms learn at different times
Only during training
  Logistic regression
  Random forest
  Clustering
Periodically after new data comes in
  Any of the above (but more complex implementation)
  Naïve Bayes
  Neural networks
  Deep learning
Continuously as new data comes in
  Any of the above (but still more complex implementation)
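To make the difference between learning only during training and learning as new data comes in more concrete, here is a minimal sketch that is not from the talk. It assumes a count-based naive Bayes classifier over binary categorical features; the class name `CountingNaiveBayes`, the single yes/no feature, and the toy labels are all hypothetical. Because the model stores only counts, a later call to `update` folds in new rows without retraining from scratch.

```python
import math
from collections import defaultdict


class CountingNaiveBayes:
    """Hypothetical categorical naive Bayes that keeps only counts."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        # (label, feature_index, value) -> number of times seen
        self.feature_counts = defaultdict(int)

    def update(self, rows, labels):
        """Fold new labeled rows into the existing counts."""
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for i, value in enumerate(row):
                self.feature_counts[(label, i, value)] += 1

    def predict(self, row):
        """Return the label with the highest log-probability score."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            for i, value in enumerate(row):
                seen = self.feature_counts.get((label, i, value), 0)
                # Add-one smoothing, assuming each feature has two values.
                score += math.log((seen + 1) / (count + 2))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = CountingNaiveBayes()

# Initial training batch (made-up data): one yes/no feature per patient.
model.update([("yes",), ("yes",), ("no",)], ["sick", "sick", "well"])
before = model.predict(("yes",))

# New data arrives later; the same model is updated in place.
model.update([("yes",)] * 5, ["well"] * 5)
after = model.predict(("yes",))

print(before, after)
```

Algorithms without such a simple update rule can still be refreshed periodically by refitting on the combined data, which is the "more complex implementation" route the slide mentions.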

When should a model be retrained?
After significant data turnover
If performance in production drops over time
  Seasonality
  Changing treatment methods
If new features or techniques are identified
If the use case changes
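One way to act on "performance in production drops over time" is a simple monitoring rule. The sketch below is an illustration under stated assumptions, not the presenters' method: the function name `needs_retraining`, the choice of AUC as the tracked metric, the three-period window, and the 0.05 tolerance are all hypothetical, and the scores are made up.

```python
def needs_retraining(baseline, recent_scores, tolerance=0.05, window=3):
    """Flag retraining when the mean of the last `window` scores
    falls more than `tolerance` below the baseline score."""
    if len(recent_scores) < window:
        return False  # not enough production history yet
    latest = recent_scores[-window:]
    return baseline - sum(latest) / window > tolerance


# Made-up monthly AUC values measured in production.
stable = needs_retraining(0.85, [0.84, 0.83, 0.85])
drifting = needs_retraining(0.85, [0.84, 0.78, 0.76, 0.75])

print(stable, drifting)
```

Averaging over a window rather than reacting to a single period is a design choice meant to avoid retraining on one noisy measurement; seasonal data may need a longer window.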

Pitfall 1: Poorly Defined Use Case
Leads to:
  Incorrect usage of data fields
  Unavailable data
  No adoption
Use case is always the first priority
  What is the question?
  Who are the users?
  When are they using it?
  How are they using it?

Pitfall 2: Production Environment is Different
Data might not be available
Timing of data might lead to target leakage
Predictions are made multiple times per patient
Learn how your data is populated over time
Only train with what’s available at the time of prediction
Know your use case!
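The advice to train only with what is available at the time of prediction can be sketched as a timestamp filter. This is a minimal illustration, assuming every field carries the time it was recorded; the helper `available_at`, the field names, and the values are hypothetical.

```python
from datetime import datetime


def available_at(record, prediction_time):
    """Keep only fields recorded at or before the prediction time."""
    return {
        name: value
        for name, (value, recorded_at) in record.items()
        if recorded_at <= prediction_time
    }


# Made-up patient record: field -> (value, time it was recorded).
patient = {
    "admit_heart_rate": (96, datetime(2017, 5, 1, 8, 0)),
    "first_lactate": (2.4, datetime(2017, 5, 1, 9, 30)),
    # Populated at discharge, so using it at noon on day one would leak
    # information from the future into the model.
    "discharge_disposition": ("home", datetime(2017, 5, 4, 14, 0)),
}

prediction_time = datetime(2017, 5, 1, 12, 0)
training_row = available_at(patient, prediction_time)

print(sorted(training_row))
```

Applying the same filter when building training rows and when scoring in production keeps the two environments consistent.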

Pitfall 3: Bad Performance Metrics
99% accurate, but didn’t find any sick people
Imbalanced classes
Performance changing over time
AUC or Precision-Recall
Sampling methods during model training
Monitor the correct performance metric over time
Know your use case!
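The "99% accurate, but didn’t find any sick people" case can be reproduced with a few lines of arithmetic. The sketch below assumes a made-up cohort of 1,000 patients with 10 sick (label 1) and a model that always predicts "not sick"; the helper names are hypothetical, and the AUC is computed by direct pairwise comparison rather than with a library.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if tp + fn else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative
    (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up cohort: 10 sick patients out of 1,000.
y_true = [1] * 10 + [0] * 990
always_healthy = [0] * 1000      # predicted labels
constant_scores = [0.0] * 1000   # the same model viewed as risk scores

acc = accuracy(y_true, always_healthy)
rec = recall(y_true, always_healthy)
auc = roc_auc(y_true, constant_scores)

print(acc, rec, auc)
```

Accuracy rewards the model for the 990 easy negatives, while recall and AUC show that it carries no information about who is sick.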

Pitfall 4: Poor Adoption
Do people know about it?
Is it answering a relevant question?
Is visualization done well?
Do people trust the model?
Tell people about it
Know the use case
Simple is better, shouldn’t affect workflow
Improve trust with prediction explanations or transparent models

Poll #3: What’s impeding you from moving forward with machine learning in your organization? 116 respondents
Available tools are overwhelming OR don’t know what exists – 16%
Use cases are overwhelming OR don’t know what’s possible – 28%
Don’t have or can’t afford the technical staff to implement – 23%
Adoption: clinical team isn’t interested – 9%
Other – 25%

Potential Applications: ML and EMR
Clinical
  Risk scores – readmissions, mortality
  Risk adjusted comparisons
  Replacing clinical rulesets
  Correct coding
Operational
  Staff need forecasting
  Length of stay prediction
Financial
  Propensity to pay
  Predicted procedure cost

Potential Applications: NLP or Smarter Analytics
Parsing clinical notes
  Fill in discrete text fields automatically
  Find new features that only come up in conversation
Smart retrospective analysis
  Trend analysis
  Exploration across the whole EMR
  Serve up insights automatically

Potential Applications: Image Processing
Diagnostics of pre-segmented suspicious regions
Automatic segmentation of tissue types
Diagnosis or staging of screening images
Diagnosis or staging of pathology slides

Poll #4: What’s the most valuable use for ML/AI/Big Data to your organization? 95 respondents
Parsing free-form clinical notes – 14%
Image interpretation – 5%
Clinical risk scores – 47%
Operational efficiency – 29%
These are buzz words and not worth the time. – 4%

Poll #5: If there were an FDA-approved algorithm that read mammographic images on par with a radiologist, would you use it? 90 respondents
Yes, I’d trust it completely – 16%
Yes, but only as an aid to the radiologist – 81%
No, I wouldn’t trust it – 3%

Before we end…

Questions?
