


1 Data-driven visualization of drug interactions

2 Adverse Drug Events
- Adverse drug events (ADEs) cause almost 1 million deaths/injuries each year in the US [1]
- Some fraction of ADEs are caused by previously unknown drug-drug interactions
- Clinical trials aren’t large enough to detect many potential interactions
- The FDA, WHO, and pharmaceutical companies maintain databases of reported ADEs [2]
- You can download a sample of the FDA data from the Adverse Event Reporting System website [3]
- We can analyze the reported data to identify suspicious drug interactions
Copyright 2011 Cloudera Inc. All rights reserved

3 Challenges in Analyzing Adverse Drug Events
- Biased Sample
  - Adverse event reporting is voluntary
  - We don’t see the patients who took the drugs without incident
- Correlation != Causation
  - There are no controlled trials, and some correlations are coincidences
- Requires Advanced Statistical Modeling Skills
  - The Multi-item Gamma Poisson Shrinkage Estimator is used to score the significance of a drug interaction
  - The model is too complex to solve directly, so we use Expectation Maximization (EM) to estimate its parameters
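To make the shrinkage idea concrete, here is a minimal sketch, not the model used in the talk. It assumes a single gamma prior with fixed, made-up hyperparameters (`alpha`, `beta`), whereas the estimator named on the slide has its parameters estimated with EM. The function names are hypothetical. The point is only that shrinkage pulls ratios based on very few reports toward the prior mean, while well-supported ratios barely move.

```python
def raw_ratio(observed, expected):
    """Observed-to-expected reporting ratio with no shrinkage."""
    return observed / expected


def shrunk_ratio(observed, expected, alpha=1.0, beta=1.0):
    """Posterior mean of lam when observed ~ Poisson(lam * expected)
    and lam ~ Gamma(shape=alpha, rate=beta).

    alpha and beta are illustrative assumptions, not fitted values.
    """
    return (observed + alpha) / (expected + beta)


# Two drug pairs with the same raw ratio but very different support.
sparse = shrunk_ratio(observed=3, expected=0.1)    # only 3 reports
dense = shrunk_ratio(observed=300, expected=10.0)  # 300 reports
```

Both pairs have a raw ratio of about 30, but the sparse pair is shrunk much further toward the prior mean of 1 than the dense one.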

4 The Hard Problem: Counting
- It is a “small” data problem…
  - 250,000+ events reported to the FDA annually
- …that explodes when we consider:
  - Multi-drug, multi-symptom interactions
  - Analysis by strata (e.g., month of report, patient age, patient gender)
  - ~1 million reports => ~360 million buckets
- Analysts typically filter the data to only consider a few adverse reactions at a time…
- …but that is not the way of the data scientist
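The explosion in bucket counts can be sketched in a few lines. The snippet below is an illustration under assumed field names (`drugs`, `reactions`, `month`, `age_band`, `gender`) and toy data, not the FDA file layout: each report contributes one bucket per drug pair, per reaction, within its stratum.

```python
from collections import Counter
from itertools import combinations


def bucket_keys(report):
    """Yield one (drug_a, drug_b, reaction, stratum) key per combination."""
    stratum = (report["month"], report["age_band"], report["gender"])
    for drug_a, drug_b in combinations(sorted(set(report["drugs"])), 2):
        for reaction in sorted(set(report["reactions"])):
            yield (drug_a, drug_b, reaction, stratum)


# Toy reports; the field names and values are hypothetical.
reports = [
    {"drugs": ["A", "B", "C"], "reactions": ["nausea", "rash"],
     "month": "2010-01", "age_band": "40-59", "gender": "F"},
    {"drugs": ["A", "B"], "reactions": ["nausea"],
     "month": "2010-01", "age_band": "40-59", "gender": "F"},
]

counts = Counter(key for report in reports for key in bucket_keys(report))
```

A report listing d drugs and r reactions produces d(d-1)/2 × r keys, so the first toy report alone yields 3 × 2 = 6 buckets. Higher-order drug combinations and more strata multiply this further.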

5 Solving the Hard Problem
- MapReduce on Hadoop
  - 20 MapReduce jobs
  - Filter, aggregate, join, aggregate again
- Model the resulting data in R
- Use MapReduce to apply the model parameters to the data, score each drug-drug interaction, and then filter the data to obtain the highest scoring interactions
- Visualizing the Results
  - Even after applying a restrictive filter on the scores, we end up with 20,000+ statistically significant drug-drug-reaction triples
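The aggregate, join, score, and filter steps can be imitated on a single machine to show the data flow. This is a sketch under stated assumptions, not the 20-job Hadoop pipeline from the talk: the toy reports, the `score` function (a simple fixed-prior shrinkage ratio standing in for the model fitted in R), and the `THRESHOLD` value are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def make_report(drugs, reactions):
    return {"drugs": drugs, "reactions": reactions}


# Toy data (hypothetical): X+Y is reported with bleeding unusually often.
reports = [make_report(["X", "Y"], ["bleeding"]) for _ in range(4)] + [
    make_report(["X", "Z"], ["headache"]),
    make_report(["Y", "Z"], ["headache"]),
    make_report(["X", "Z"], ["bleeding"]),
    make_report(["W", "Z"], ["headache"]),
    make_report(["W", "Z"], ["headache"]),
    make_report(["W", "X"], ["headache"]),
]

# Aggregate: count reports per triple, per drug pair, and per reaction.
triple_counts, pair_counts, reaction_counts = Counter(), Counter(), Counter()
for report in reports:
    pairs = list(combinations(sorted(set(report["drugs"])), 2))
    reactions = set(report["reactions"])
    pair_counts.update(pairs)
    reaction_counts.update(reactions)
    triple_counts.update((a, b, rx) for a, b in pairs for rx in reactions)

total = len(reports)


def score(observed, expected, alpha=1.0, beta=1.0):
    """Stand-in score: shrunk observed/expected ratio (assumed prior)."""
    return (observed + alpha) / (expected + beta)


# Join: attach the count expected under independence, then score.
scored = {}
for (a, b, rx), observed in triple_counts.items():
    expected = pair_counts[(a, b)] * reaction_counts[rx] / total
    scored[(a, b, rx)] = score(observed, expected)

# Filter: keep only the highest scoring triples.
THRESHOLD = 1.6  # arbitrary cutoff for this toy data
top = {triple: s for triple, s in scored.items() if s >= THRESHOLD}
```

On real data each stage would be a separate MapReduce job keyed on the drug pair, the reaction, or the triple. Here plain `Counter` objects play the role of the reducers.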

6 The Drug-Drug Interaction Graph

7 HIV Medications

8 Cancer Medications

9 Exploring the Graph

10 Bridges Between Dense Clusters

11

12 Acknowledgments and References
Thanks to Josh Wills, Director of Data Science at Cloudera, for the data collection and analysis shown here.
References:
[1] ADE instances/year: http://www.ahrq.gov/qual/aderia/aderia.htm
[2] AERS reporting site: http://www.ahrq.gov/qual/aderia/aderia.htm
[3] Download ADE instance data: http://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Surveillance/AdverseDrugEffects/ucm082193.htm
Other resources:
http://www.cloudera.com/blog
http://wiki.cloudera.com/

