Big Data, Bias and Analytics – What Can Your EHR Really Tell You? ADAM WILCOX, PHD
Big Data
Source: Nature (Feb 13, 2013)
Hype Cycle for Emerging Technologies Gartner (August 2014)
Outline: Background and Experience; Big Data Introduction; Big Data – Bias Issues; Advancing Big Data; Next Steps and Conclusion
Knowledge Representation vs. Knowledge Discovery
Effect of Care Management: Outcomes
Costs/Clinic: Salary + training + admin: $92,077
Benefits/Clinic: Productivity (7 MDs): $99,986; Hospitalizations ↓: $0*
Total (benefits – cost): +$7,909
* Society would save, per clinic, $79,092 in reduced hospitalizations.
Dorr DA, Wilcox AB, et al. The effect of technology-supported, multidisease care management on the mortality and hospitalization of seniors. J Am Geriatr Soc Dec;56(12):
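The per-clinic arithmetic on this slide can be checked with a short sketch. The dollar figures are the ones shown on the slide; the variable names are illustrative only.

```python
# Per-clinic figures taken from the slide (US dollars).
cost_per_clinic = 92_077            # salary + training + admin
productivity_benefit = 99_986       # productivity gain across 7 MDs
hospitalization_benefit = 0         # the clinic itself captures none of this saving
societal_hospital_savings = 79_092  # accrues to society, not to the clinic

# Net benefit as seen by the clinic: benefits minus cost.
clinic_net = productivity_benefit + hospitalization_benefit - cost_per_clinic

print(f"Clinic net benefit: ${clinic_net:,}")
print(f"Societal hospitalization savings (not captured by clinic): ${societal_hospital_savings:,}")
```

The point of separating the two figures is that the clinic's business case rests on productivity alone, while the larger hospitalization saving falls outside the clinic's books.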
Increase in CDR View Access
Data Warehouse Tools (diagram)
Components: Integration services; Replicated databases; Virtual data warehouse; Datamarts (A, B, C)
Ad-hoc queries (questions): Research; Define
Recurring (automated) queries: Management reports; Measure
OLAP (analytics): Operational reports; Analyze
Dashboards: Point-of-care reporting; Improve
Applications: Decision support; Control
WICER
Improve Use of Information for Learning Health System: Informed strategy for healthcare transformation; Measures to support real-time process and quality improvement; Data and analytics driving research and discovery
Data Collection Methods
Columns: Raw Clinical; Matched Clinical; Matched Survey; Survey; Matched vs. Matched; Clinical vs. Survey
Rows: Age (p <<.0001); Proportion Female (p <<.0001); Proportion Hispanic (p <<.0001); Weight kg; Height cm (p <<.0001); BMI; Prevalence of Smoking (p <<.0001); Systolic; Diastolic (p <<.0001); Prevalence of Diabetes (p <<.0001)
Diabetes definition: Survey = self-report; Clinical = >1 Diabetes ICD-9 AND >1 abnormal test
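The clinical diabetes definition on this slide (more than one diabetes ICD-9 code AND more than one abnormal test) can be written as a small rule. This is a minimal sketch, not the study's actual implementation: the `PatientRecord` structure and its field names are hypothetical, and which tests count as abnormal is not stated on the slide, so the count is taken as given.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PatientRecord:
    """Hypothetical minimal record; field names are assumptions."""
    icd9_codes: List[str] = field(default_factory=list)  # one entry per coded diagnosis
    abnormal_test_count: int = 0                          # abnormal results, however defined upstream


def is_diabetes_icd9(code: str) -> bool:
    # ICD-9-CM category 250 covers diabetes mellitus.
    return code.strip().startswith("250")


def clinical_diabetes(record: PatientRecord) -> bool:
    # ">1" on the slide means strictly more than one of each.
    diabetes_codes = sum(1 for c in record.icd9_codes if is_diabetes_icd9(c))
    return diabetes_codes > 1 and record.abnormal_test_count > 1


coded_and_tested = PatientRecord(["250.00", "250.02", "401.9"], abnormal_test_count=2)
coded_once = PatientRecord(["250.00"], abnormal_test_count=3)
print(clinical_diabetes(coded_and_tested), clinical_diabetes(coded_once))
```

Requiring both codes and tests makes the clinical definition stricter than self-report, which is one reason the two sources can disagree on prevalence.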
Data Quality and Assessment Weiskopf NG, Weng C. Methods and dimensions of data quality assessment: enabling reuse for clinical research. JAMIA 2013
“New” Analytic Methods: Bootstrapping; Learning curves and over-fitting; Hypothesis generation process
Methods compared: t-tests; Non-parametric tests (Chi-square); Bootstrapping
Advantages listed: + Easy; + Robust; + Powerful; + Robust; + Powerful; + Widely implemented
Disadvantages listed: - Less common; - Not appropriate for all data types; - Less powerful; - Requires special packages or programming
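Bootstrapping needs very little machinery in practice. The sketch below computes a percentile bootstrap confidence interval for a difference in means using only the Python standard library. The function name, the parameters, and the two groups of values are illustrative assumptions; the numbers are synthetic and are not data from the talk.

```python
import random
from statistics import mean


def bootstrap_mean_diff_ci(a, b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for mean(a) - mean(b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping the original group sizes.
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return diffs[lo_idx], diffs[hi_idx]


# Synthetic illustrative values, not data from the talk.
group_a = [128, 135, 121, 140, 132, 138, 126, 131, 137, 129]
group_b = [118, 122, 115, 125, 119, 121, 117, 124, 120, 116]

low, high = bootstrap_mean_diff_ci(group_a, group_b)
print(f"95% bootstrap CI for the difference in means: ({low:.1f}, {high:.1f})")
```

An interval that excludes zero suggests a real difference between the groups, without assuming the values follow any particular distribution.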
Big Data Analytic Approaches: Sub-population analysis; Investigating surprises (often more revealing about data quality than real effects)
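One way to sketch sub-population analysis is to compute an outcome rate per subgroup and flag subgroups that differ sharply from the overall rate. Everything here is an assumption made for illustration: the record layout, the function names, the 0.15 threshold, and the synthetic clinic data.

```python
from collections import defaultdict


def subgroup_rates(records, group_key, outcome_key):
    """Return the outcome rate for each subgroup."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for rec in records:
        tally = counts[rec[group_key]]
        tally[0] += 1 if rec[outcome_key] else 0
        tally[1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}


def surprising_groups(records, group_key, outcome_key, threshold=0.15):
    """Flag subgroups whose rate differs from the overall rate by more than threshold."""
    overall = sum(1 for r in records if r[outcome_key]) / len(records)
    rates = subgroup_rates(records, group_key, outcome_key)
    return {g: r for g, r in rates.items() if abs(r - overall) > threshold}


# Synthetic records: clinic C records no diabetes at all.
records = (
    [{"clinic": "A", "diabetes": i < 2} for i in range(10)]
    + [{"clinic": "B", "diabetes": i < 3} for i in range(10)]
    + [{"clinic": "C", "diabetes": False} for _ in range(10)]
)

flagged = surprising_groups(records, "clinic", "diabetes")
print(flagged)
```

A clinic with no recorded cases is more likely a documentation or interface gap than a healthy population, which is the sense in which surprises tend to reveal data quality problems.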
Big Data: Next Steps to Make it Useful. Know the data you need; Use the data you have; Get the data you want; Adapt data to user needs; Make value accessible
Minimum Requirements to Provide Value: Secure database; Data sources; Patient-level integration – Master Patient Index*; Semantic integration – Vocabulary*; Excellent analysts
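The two starred requirements can be illustrated together. The sketch below assigns a master patient ID by exact match on normalized name and date of birth, then maps source-specific codes onto a shared concept. It is a deliberately simplified illustration: production master patient indexes typically use probabilistic matching, and the record layout, source names, and local codes here are all made up.

```python
def normalize(name: str) -> str:
    """Lower-case the name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def build_master_index(records):
    """Assign one master ID per (normalized name, date of birth) pair."""
    index = {}
    for rec in records:
        key = (normalize(rec["name"]), rec["dob"])
        index.setdefault(key, len(index) + 1)
    return index


# Hypothetical local-to-shared vocabulary map; sources and codes are made up.
LOCAL_TO_SHARED = {
    ("lab_a", "GLU"): "glucose",
    ("lab_b", "Gluc-S"): "glucose",
}


def integrate(records):
    """Group observations by master patient ID, using shared concept names."""
    index = build_master_index(records)
    merged = {}
    for rec in records:
        mpi_id = index[(normalize(rec["name"]), rec["dob"])]
        concept = LOCAL_TO_SHARED.get((rec["source"], rec["code"]), "unmapped")
        merged.setdefault(mpi_id, []).append((concept, rec["value"]))
    return merged


# Synthetic records from two hypothetical source systems.
source_records = [
    {"source": "lab_a", "name": "Jane  Doe", "dob": "1950-01-02", "code": "GLU", "value": 101},
    {"source": "lab_b", "name": "jane doe", "dob": "1950-01-02", "code": "Gluc-S", "value": 98},
    {"source": "lab_a", "name": "John Roe", "dob": "1948-07-09", "code": "GLU", "value": 110},
]

integrated = integrate(source_records)
print(integrated)
```

Without the patient-level step, the two Jane Doe results would look like two patients; without the vocabulary step, they would look like two different tests.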
Patient Data Integration
Vocabulary and Data Density
Natural Language Processing
Factors Influencing Health
Collecting Patient-Reported Outcomes: Transcribing; Patient Portals; Scanning; Tablet entry
Patient Reported Information: Tablets vs. Scanned Documents
                          Scanning   Tablets
Institutional
  Equipment cost             =          =
  Infection risk             =          =
Security
  Theft                      +          -
  Data loss                  -          +
  Patient mismatch           -          +
  Disaster recovery          +          -
Functionality
  Office workflow            -          =
  Education/training         =          =
  Data timeliness            =          +
  Branching logic            -          +
  Extensibility              -          +
Patient experience
  Preference                 =          +
  Security perception        =          -
Columns: Goal; Task; Use; User; Tool; QI Life-cycle; Cost/Instance, Instances; Required
Answer a specific question; Ad hoc query; Research; Researcher; SQL; Define; ; Defined request
Observe trends; Recurring query; Management reports; Manager; Reporting application; Measure; ; Available owner
Identify dependencies; Sub-population analysis; Operational analysis; Analyst; Analytic tools; Analyze; +++; Content expert/analyst
Assist decision making; Dashboard display; Point of care improvement; Clinical team; Registries; Improve; ++++++; Pilot site
Automate processes; Application; Decision support; Clinician/Role; EMR application; Control; ; Institutional sponsor
Physical Activity