Hong Kang, PhD (Presenter) Zhiguo Yu, PhD Yang Gong, MD, PhD

1 Hong Kang, PhD (Presenter) Zhiguo Yu, PhD Yang Gong, MD, PhD
Initializing and Growing a Database of Health Information Technology (HIT) Events by Using TF-IDF and Biterm Topic Modeling
S58: Applications of Informatics to Improve Patient Safety
Hong Kang, PhD (Presenter); Zhiguo Yu, PhD; Yang Gong, MD, PhD
Nov 7th 2017, Washington DC

2 Disclosure Speaker Hong Kang discloses that neither he nor his partner has relevant financial relationships with commercial interests. AMIA | amia.org

3 Learning Objectives
After participating in this session the learner should be better able to:
- Identify Health IT events from public resources by combining a TF-IDF model and a topic model
- Initialize and grow a Health IT event database
- Effectively learn from previous Health IT events and improve patient safety

4 Patient Safety: Pressures and Incentives
A track of patient safety study from NIH. (Liang, Miao & Gong, unpublished)
Medical error. (Makary & Daniel, 2016)

5 Deaths due to Patient Safety Event (PSE)

6 Health Information Technology (HIT)
What is HIT? (AHRQ definition)
- HIT includes hardware or software that is used to electronically create, maintain, analyze, store, or receive information, or otherwise aid in the diagnosis, cure, mitigation, treatment, or prevention of disease.
HIT & healthcare
- Positive
- Negative: 1 in 6 PSEs is related to HIT; HIT is listed among the top 10 technology-related hazards
Integrating and learning from previous HIT-related events is necessary

7 The Lack of Databases for HIT-related Events
Only ONE report was archived as HIT-related in 2015 by a PSO institute
[Figure: States working with Patient Safety Organization (PSO) by 2017]

8 Objectives
- Develop a strategy to identify HIT-related events from both structured and unstructured data
- Initialize and grow a database for HIT-related events
- Prototype an integrated reporting and shared learning platform for HIT and other types of PSE

9 Data – FDA Medical Device Event Reports
FDA Manufacturer and User Facility Device Experience (MAUDE) database
- ~6,000,000 events involving medical devices
- Has both structured and unstructured fields
- ~0.1% are HIT-related events
[Diagram: Raw Data -> Filter on structured data -> Classifier on unstructured data -> HIT; other labels: Expert Review, Evaluation, Gold Standard]
[Figure: Trend of new MAUDE reports per year (bars) and MAUDE-related publications per year (line) since 2000]

10 Method – HIT Filter on Structured Data
- 4,947,220 reports from MAUDE
- Keyword searching on Generic Name and Manufacturer Name
- Identify HIT-related events from sample reports through domain expert review
  - 6 inclusion criteria
  - 4 exclusion criteria
- Assess reviewer consistency by Cohen’s kappa
[Figure: Workflow of reviewing a report from the FDA MAUDE database]
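The keyword search on the two structured fields can be pictured with a minimal Python sketch. The keywords, the `Report` type, and the rule that an exclusion keyword overrides an inclusion keyword are all assumptions for illustration; the slides do not list the actual keywords or say how the two lists interact.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical stand-in for the two structured MAUDE fields used by the filter."""
    generic_name: str
    manufacturer_name: str


# Illustrative keywords only; the real inclusion/exclusion lists are not in the slides.
INCLUDE = {"software", "electronic health record"}
EXCLUDE = {"infusion pump"}


def passes_filter(report, include=INCLUDE, exclude=EXCLUDE):
    """Keep a report if either field matches an inclusion keyword
    and neither field matches an exclusion keyword (assumed precedence)."""
    text = f"{report.generic_name} {report.manufacturer_name}".lower()
    if any(keyword in text for keyword in exclude):
        return False
    return any(keyword in text for keyword in include)
```

Reports that pass such a filter would then go to expert review rather than being accepted outright.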

11 Method – HIT Classifier on Unstructured Data
- Term frequency (TF) – inverse document frequency (IDF)
- Biterm topic model: extracts the semantic themes (topics) from a corpus of short documents
- Classifiers: Random forest, Logistic regression, Naïve Bayes, SVM, J48, JRip
- Gold standard: HIT-related and non-HIT event reports identified by domain experts
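For readers unfamiliar with the two feature types, here is a minimal Python sketch. It uses one common TF-IDF variant (length-normalized term frequency times the natural log of N/df) and the usual definition of a biterm as an unordered pair of words co-occurring in the same short document. The slides do not state which TF-IDF weighting or tokenization was used, so both are assumptions, and the topic inference step of the biterm topic model is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (count / len(doc)) * math.log(n_docs / df[t]) for t, count in tf.items()}
        )
    return weights


def biterms(tokens):
    """All unordered word pairs in one short document (the whole document is the window)."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]
```

A term that appears in every document gets weight zero under this variant, which is why TF-IDF favors words that distinguish one report from another.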

12 Result – Keyword-based HIT Event Filter
Inclusion & exclusion keywords
- Inclusion keywords: 94 generic names, 38 manufacturer names, 132 in total
- Exclusion keywords: 21
Raw and filtered reports of MAUDE database
Year | Raw reports | Filtered reports | Reports | HCFA/Manufacturer/Distributor
2008 | 145,598 | 9,148 | 1,817 | 146
2009 | 201,996 | 9,906 | 2,640 | 214
2010 | 327,961 | 10,792 | 3,434 | 316
2011 | 414,083 | 12,597 | 2,371 | 307
2012 | 520,043 | 12,952 | 4,825 | 308
2013 | 636,145 | 12,516 | 3,551 | 313
2014 | 867,451 | 12,927 | 4,338 | 380
2015 | 965,240 | 15,762 | 17,963 | 384
2016 | 868,703 | 15,023 | 4,685 | 408
Sum | 4,947,220 | | 45,624 |

13 Result – Evaluation through Expert Review
- 50% of the filter-identified reports are HIT-related
- Up to 60,000 reports (0.9%) in MAUDE are HIT-related
- A database of HIT-related events was initialized with 3,521 high-quality reports
Expert review result on the samples from filtered reports
Year | Samples | HIT-related | Non-HIT | Unsure
2008 | 373 | 123 | 82 | 168
2009 | 459 | 165 | 114 | 180
2010 | 654 | 220 | 169 | 265
2011 | 490 | 252 | 122 | 116
2012 | 732 | 217 | 203 | 312
2013 | 614 | 193 | 178 | 243
2014 | 751 | 313 | 314 | 124
2015 | 2,111 | 1,770 | 216 | 125
2016 | 810 | 268 | 362 | 180
Sum | 6,994 | 3,521 | 1,760 | 1,713
Reviewers: N = 3
Pairwise Cohen’s kappas: 0.82, 0.85, and 0.87
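Reviewer consistency is reported here as pairwise Cohen's kappa. As a reminder of what that statistic measures, the following Python sketch computes kappa for two reviewers from the standard formula (observed agreement corrected for chance agreement). The label values are hypothetical and the function is not the authors' code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)
```

With three reviewers, the function would be applied to each of the three reviewer pairs, which matches the three values listed on the slide.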

14 Result – Contributing Factor Distribution Analysis

15 Result – Classifiers using TF-IDF
- AUC = 0.900
- F-score = 0.807
- Dataset: 289 HIT, 376 non-HIT
- TF-IDF: 1,541 words
Kang et al, AMIA 2017

16 Result – Comparison between LDA and Biterm Topic Models
[Figure: F-scores and AUC for the BTM and LDA models; values shown: 0.901, 0.838, 0.824, 0.892]
Biterm Topic Model with 70 topics performs better
Kang et al, MedInfo 2017

17 Result – Biterm Topic Modeling Improves HIT Classification
Best combination: 70 topics + TF-IDF features
- AUC = 0.920
- F-score = 0.834
Kang et al, AMIA 2017

18 Result – An Integrated Model of HIT event Identification
[Diagram: Raw Data -> Filter on structured data -> Classifier on unstructured data -> HIT]
Proportion of HIT events: 0.4~0.9% (raw data), 50% (after the filter), 97% (after the classifier)
- 0.97 precision: grow the HIT database
- Trade-off between precision and recall
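The trade-off between precision and recall can be illustrated with a small Python sketch: raising the decision threshold on a classifier's scores tends to raise precision and lower recall. The scores, labels, and thresholds below are invented for illustration and are not the study's data.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when items scoring >= threshold are predicted HIT-related.

    scores: classifier scores in [0, 1]; labels: True for HIT-related (gold standard).
    """
    predicted = [score >= threshold for score in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical scores for four reports, three of which are truly HIT-related.
example_scores = [0.9, 0.6, 0.55, 0.4]
example_labels = [True, True, False, True]
loose = precision_recall(example_scores, example_labels, 0.5)
strict = precision_recall(example_scores, example_labels, 0.8)
```

A strict threshold suits growing a database automatically, since few false positives enter, at the cost of missing true HIT-related reports.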

19 Underway – Prototype An Integrated Reporting and Shared Learning System in Healthcare Community

20 Limitations
- Difficult to provide a precise figure for the proportion of HIT-related event reports in FDA MAUDE
  - Events missed by the filter
  - Only 10% of reports were reviewed
  - Low recall of the classifier
- Discrepancies among reviewers during expert review

21 Summary & Innovation
- Developed a strategy using both structured and unstructured data to identify HIT-related events from public resources
- Initialized a database exclusively for HIT-related event reports
- Updated the estimated proportion of HIT-related events in the FDA MAUDE database to 0.9%
- Prototyped an integrated reporting and shared learning platform for HIT and other types of patient safety events

22 Acknowledgement
UTHealth SBMI
- Dr. Yang Gong’s group
- Hua Xu, PhD
- Trevor Cohen, MBChB, MD, PhD
CPRIT Fellowship
- Roberta Ness, MD, MPH
- Patricia Mullen, MLS, MPH, DrPH
- David Loose, PhD
- Pre- and post-doc fellows
UTHealth School of Nursing
- Jing Wang, PhD, MPH, RN
UTHealth Medical School
- Nnaemeka Okafor, MD, MS
University of Missouri Health
- Amy Vogelsmeier, PhD, RN, FAAN
Missouri Center for Patient Safety
- Tina Hilmas, RN, BSN
Grants
- Cancer Prevention and Research Institute of Texas Grant (# RP160015)
- Agency for Healthcare Research & Quality (1R01HS022895)
- University of Texas System Grants Program (#156374)

23 Questions
1. The Agency for Healthcare Research and Quality (AHRQ) defines a health information technology (HIT) device as hardware or software that is used to electronically create, maintain, analyze, store, or receive information to aid in the diagnosis, cure, mitigation, treatment, or prevention of disease. However, HIT was listed in the top 10 technology-related hazards because it may lead to new uncertainties and risks for patient safety by disrupting established work patterns, creating new risks in practice, and encouraging workarounds. Why is collecting data on HIT-related patient safety events (PSE) for learning purposes challenging?
- Healthcare providers do not know what HIT-related events are.
- HIT-related events are rare.
- There is a lack of HIT reporting forms or platforms.
- HIT-related events are not as important as other PSE types.

24 Answer
- Healthcare providers do not know what HIT-related events are.
- HIT-related events are rare.
- There is a lack of HIT reporting forms or platforms.
- HIT-related events are not as important as other PSE types.
Explanation: By collecting reports of adverse events and near misses in healthcare, reporting systems would enable safety specialists to analyze events, identify underlying factors, and generate actionable knowledge to mitigate risks. However, the scarcity of HIT event-exclusive databases and event reporting systems indicates the challenge of identifying HIT events from existing resources.

25 Questions
2. The FDA Manufacturer and User Facility Device Experience (MAUDE) database is updated weekly and searchable online. As of August 2017, MAUDE had more than 6 million reports, which makes it a rich and publicly accessible resource for extracting HIT-related events. An estimation performed on three-year ( ) MAUDE data shows that the proportion of HIT-related event reports is 0.1%. Considering the sharp increase in the reports received by MAUDE after 2010 (4.6 million) and the increasingly pervasive use of HIT in healthcare settings, this study updated the estimation based on MAUDE data. Which is the updated estimation proposed in this study?
- Up to 0.1%.
- Up to 0.9%.
- Up to 1.9%.
- Up to 9.1%.

26 Answer
- Up to 0.1%.
- Up to 0.9%.
- Up to 1.9%.
- Up to 9.1%.
Explanation: We developed an HIT filter based on two structured fields of the MAUDE database. The filter screens FDA MAUDE and proposes a report subset in which more than 50% of the reports are HIT events. We manually reviewed the sample reports identified by the HIT filter and updated the estimated proportion of HIT events in the MAUDE database to 0.4~0.9%.

27 AMIA is the professional home for more than 5,400 informatics professionals, representing frontline clinicians, researchers, public health experts and educators who bring meaning to data, manage information and generate new knowledge across the research and healthcare enterprise.

28 Thank you!

