Answering Hard Healthcare Questions with Data


1 Answering Hard Healthcare Questions with Data
Fred Rahmanian, Chief Technology Officer, Geneia

2 “I know that half of my advertising doesn’t work. The problem is that I don’t know which half.”
Department store magnate John Wanamaker

3 Google’s Answer: Cost per 1,000 impressions (CPM) vs. cost per click (CPC)
Using data, Google understood each user’s behavior. Google was able to place advertisements that an individual was likely to click. They knew “which half” of their advertising was more likely to be effective, and didn’t bother with the rest.

4 Healthcare is expensive
The U.S. spends over $2.6 trillion on health care every year;
These costs include over $600 billion of unexplained variations in treatments;
Misuse of drugs and treatments, resulting in avoidable adverse effects of medical treatment, with potential savings of $52.2 billion;
Overuse of non-urgent emergency department (ED) care, with potential savings of (conservatively) $21.4 billion;
Underuse of generic anti-hypertensives, with potential savings of $3 billion;
Underuse of controller medicines in pediatric asthma, particularly inhaled corticosteroids, with projected savings of $2.5 billion;
Overuse of antibiotics for respiratory infections, with potential savings of $1.1 billion.
Source:

5 Average Treatment
For the past 60 years we’ve treated patients as some sort of average: diagnose a condition and recommend a treatment based on what worked for most people, as reflected in large clinical studies. A treatment was deemed effective or ineffective, safe or unsafe, based on gold-standard, double-blind studies that rarely took into account the differences between patients.

6 Remember Tamoxifen?
Roughly 80% effective for breast cancer patients. But now we know much more: we know that it’s 100% effective in 85% to 90% of the patients, and ineffective in the rest. It would be nice to know for which patients it’s effective 100% of the time.

7 Explosion of Data
In recent years, there has been an explosion of data in healthcare:
Clinical and health outcomes data contained in ever more prevalent electronic health records (EHRs)
Longitudinal drug and medical claims
Genomic data
Proteomic data
Metabolomic data (metabolomics is the systematic study of the unique chemical fingerprints that specific cellular processes leave behind)
Social network data
Mobile devices
Exogenous data

8 And with this
Our ability to process this data has improved drastically. We can now ask important questions, the Wanamaker questions, about what treatments work and for whom:
How to improve the health of the population
How to improve the experience of care
And, perhaps more importantly, how to do all of this while reducing the cost of care

9 Data science may be the answer
We know much of our medicine doesn't work for half the patients; we just don't know which half, like Wanamaker. The promise of data science is that if we can collect enough treatment data and use it effectively, we'll be able to develop predictive models that tell us which treatment will be more effective for which patient.

10 Healthcare Analytics
Data availability and variability in the ways we analyze it are the two factors behind this new approach to medicine. It is not enough to say that a drug is effective on most patients. Using machine learning techniques, we can group patients and then determine the differences between these groups. We can now ask for which patients a drug is effective, instead of just asking whether a drug is effective. This is possible because we are now using data that was not available before.
So is more data the answer?
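The shift from "is the drug effective?" to "for which patients is it effective?" can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records are made up, the group labels "A" and "B" stand in for whatever a clustering step might produce, and the helper names are hypothetical. It only shows how an overall response rate can hide very different rates inside subgroups.

```python
from collections import defaultdict

# Hypothetical, made-up records: (patient_group, responded_to_drug).
# The group label stands in for the output of some earlier grouping step.
records = (
    [("A", True)] * 45 + [("A", False)] * 5
    + [("B", True)] * 10 + [("B", False)] * 40
)

def response_rate(rows):
    """Fraction of rows in which the patient responded."""
    return sum(1 for _, responded in rows if responded) / len(rows)

def response_rate_by_group(rows):
    """Response rate computed separately for each group label."""
    grouped = defaultdict(list)
    for group, responded in rows:
        grouped[group].append((group, responded))
    return {group: response_rate(members) for group, members in grouped.items()}

overall = response_rate(records)
by_group = response_rate_by_group(records)

print(f"overall: {overall:.2f}")          # the "average patient" view
for group, rate in sorted(by_group.items()):
    print(f"group {group}: {rate:.2f}")   # the per-group view
```

With these invented numbers the drug looks 55% effective overall, yet it works for 90% of group A and only 20% of group B, which is the kind of difference the averaged figure cannot show.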

11 Knowledge Discovery for Survival Analysis in NSCLC: Does incorporating more data help?
Combining clinical data from disparate sources improves prediction accuracy.
S. Yu, C. Dehing-Oberije, D. De Ruysscher, K. van Beek, Y. Lievens, J. Van Meerbeeck, W. De Neve, G. Fung, B. Rao, P. Lambin, “Development, External Validation and further Improvement of a Prediction Model for Survival of Non-Small Cell Lung Cancer Patients treated with (Chemo) Radiotherapy,” ASTRO 2008

12 So is data the answer? Maybe
Peter Norvig is credited with saying, ‘Our algorithms haven’t gotten that much better. We just have more data.’ To understand what he means, we need to understand predictive modeling first.

13 Goal of a supervised learning algorithm (predictive model)
Find the best estimate of the mapping function (f) for the output variable (Y) given the input data (X):
Y = f(X) + ϵ
The mapping function is also known as the ‘target function’.
The prediction error for any machine learning algorithm can be broken down into three types of error:
Irreducible error
Variance error
Bias error
We can’t do much about irreducible error, so the goal of any model is to reduce bias and variance errors.
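For squared-error loss, the three error types correspond to the standard textbook decomposition of the expected prediction error at a point x. The slide does not state this formula; the notation is an assumption: \hat{f} is the model fitted on a random training set, the expectation is over training sets and noise, and ϵ is taken to have mean zero and variance \sigma^2, independent of the training data.

```latex
\mathbb{E}\big[(Y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The last term does not depend on the model at all, which is why only the bias and variance terms can be reduced.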

14 Why don’t some models perform well?
Typically there are two reasons why a model is not performing well (can you guess what they are?)
The model is too complicated for the size of the data. This generally shows up as high variance and leads to overfitting. You can spot high variance when training error is much lower than test error. High variance can be addressed by reducing the number of features or adding more observations.
The model is too simple to explain the data. This is due to high bias. Adding more data doesn’t help with bias, but adding more features does.
Source: Figure: 2.1
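The two failure modes can be seen by comparing training and test error on a toy problem. The sketch below is an illustration under stated assumptions, not a method from the talk: the target y = sin(x) plus Gaussian noise is invented, a predict-the-mean model stands in for "too simple", and a 1-nearest-neighbor model stands in for "too complicated". All function names are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

def make_data(n):
    """Synthetic data: y = sin(x) + Gaussian noise (an assumed toy target)."""
    xs = [random.uniform(0.0, 6.0) for _ in range(n)]
    return [(x, math.sin(x) + random.gauss(0.0, 0.3)) for x in xs]

def mse(model, data):
    """Mean squared error of a model (a function x -> prediction) on data."""
    return sum((y - model(x)) ** 2 for x, y in data) / len(data)

def fit_mean(train):
    """Too-simple model: always predicts the training mean (high bias)."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_1nn(train):
    """Too-flexible model: copies the nearest training point (high variance)."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

train, test = make_data(40), make_data(400)

results = {}
for name, fit in [("mean", fit_mean), ("1-NN", fit_1nn)]:
    model = fit(train)
    results[name] = (mse(model, train), mse(model, test))
    train_err, test_err = results[name]
    print(f"{name}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```

The 1-NN model memorizes the training set, so its training error is zero while its test error is not, which is the high-variance signature described above. The mean predictor should show similar, comparatively large errors on both sets, which is the high-bias signature.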

15 Why some models don’t perform well
To address high variance or high bias we need to add more data or more features. Features are still data.
So does this mean: More data = Better Signal (insight)?

16 Is more data better? NO
More data + sound approach = Better Signal (insight)

17 Healthcare is expensive
The U.S. spends over $2.6 trillion on health care every year;
These costs include over $600 billion of unexplained variations in treatments;
Misuse of drugs and treatments, resulting in avoidable adverse effects of medical treatment, with potential savings of $52.2 billion;
Overuse of non-urgent emergency department (ED) care, with potential savings of (conservatively) $21.4 billion;
Underuse of generic anti-hypertensives, with potential savings of $3 billion;
Underuse of controller medicines in pediatric asthma, particularly inhaled corticosteroids, with projected savings of $2.5 billion;
Overuse of antibiotics for respiratory infections, with potential savings of $1.1 billion.
Source:

18 Explosion of Healthcare Data Means Opportunity
Identify high-risk patients
Opioid dependency
COPD patients
Formulary optimization
Identify variation in treatment
Provider teaming
Identifying gaps in care
Computer-aided diagnostics
Source:

19 So how does this help Healthcare?
As it turns out, there are a lot of opportunities (some hard and some not so hard) to improve the state of healthcare with data. With the right data and the right approach we can improve the triple aim (population health, per capita cost, and experience of care). But more importantly, the only way we can arrive at personalized medicine and stop treating patients as averages is through proper use of data.

