A Data Reconstruction Algorithm for Temporal Clinical Expressions Zhikun Zhang, BS1,2, Chunlei Tang, PhD3,4,5, Meihan Wan, BS1,2, Joseph M. Plasek, PhD3,

Presentation transcript:

A Data Reconstruction Algorithm for Temporal Clinical Expressions

Zhikun Zhang, BS (1,2); Chunlei Tang, PhD (3,4,5); Meihan Wan, BS (1,2); Joseph M. Plasek, PhD (3); Yun Xiong, PhD (1,2); Li Zhou, MD, PhD (3,4); David W. Bates, MD, MSc (3,4,5)

Podium Abstract

Temporal expressions annotated in clinical notes pose challenges to downstream analytical activities. For example, a disease-centric knowledge graph often requires massive time aggregation operations organized around the relationships among multiple chronic diseases (e.g., chronic obstructive pulmonary disease (COPD) and heart failure).

Method

We present a novel data reconstruction algorithm that has three stages. First, it detects whether an expression has temporal intent. Second, it decomposes and rewrites the expression into non-temporal sub-expressions and temporal constraints. Finally, it clusters similar non-temporal sub-expressions by using unsupervised sentence embedding under the K-means paradigm. Experiments on a corpus of cardiology reports demonstrate that our method is feasible.

Reference

Jia Z, Abujabal A, Saha Roy R, et al. TEQUILA: Temporal Question Answering over Knowledge Bases. Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ACM, 2018:1807-1810.

Arora S, Liang Y, Ma T. A simple but tough-to-beat baseline for sentence embeddings. Proceedings of the 5th International Conference on Learning Representations.

Introduction

Temporal information is crucial for many data analytic tasks; however, temporal expressions in narrative clinical notes are challenging to process and to use in downstream analytical activities. Representing and reasoning about temporal expressions in clinical notes is critical for clinical diagnosis and treatment [1]. TimeML [1-3] is used for annotating four types of temporal expression (i.e., time stamping, relative ordering, context, duration).
However, TimeML and similar markup languages still struggle to represent implicit temporal conditions, as well as complex expressions that require joining the results from their sub-expressions. Consider the following examples:

E1: “When compared with ECG of 18-JUL-YYYY 10:41, (unconfirmed) no significant change was found. Confirmed by X MD on 7/22/YYYY 17:18.”

E2: “Subsequently, a pharmacological stress test was performed with IV adenosine infusion after which sestamibi was injected IV at peak drug effect.”

TimeML’s recognition capacity on E1 is adequate. In E2, no explicit date is mentioned, so detecting the temporal nature of E2 is the first challenge. The phrase “after which” refers to the time after an event (the IV adenosine infusion). TimeML can detect this phrase but does not properly disambiguate it to a normalized date. The temporal preposition “with” is a cue as well, and words like “subsequently” are also used in temporal contexts.

The second challenge in E2 is to judiciously decompose the temporal expression into sub-expressions. For example, E2 should be decomposed into:

E2.1 “what the patient did before a pharmacological stress test;”
E2.2 “when the patient was administered a pharmacological stress test;” and
E2.3 “when the patient was administered sestamibi.”

Author Affiliations

1 Shanghai Key Laboratory of Data Science, Shanghai, CHN
2 School of Computer Science, Fudan University, Shanghai, CHN
3 Division of General Medicine and Primary Care, Brigham and Women’s Hospital, Boston, MA, USA
4 Harvard Medical School, Boston, MA, USA
5 Clinical and Quality Analysis, Partners HealthCare System, Boston, MA, USA
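The two challenges described for E2 (noticing that a sentence is temporal even without a date, and splitting it at its temporal cues) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cue list contains only the three cues named in the text, the regular expressions and the function names `has_temporal_intent` and `split_at_cues` are hypothetical, and the naive cue-based split it produces is coarser than the E2.1 to E2.3 rewriting shown above.

```python
import re
from typing import List, Tuple

# ASSUMPTION: illustrative cue list. The text names only these three cues;
# no full lexicon is given in the source.
TEMPORAL_CUES = ["after which", "subsequently", "with"]

# Explicit timestamps in the two formats seen in E1 (year masked as YYYY).
EXPLICIT_DATE = re.compile(
    r"\b\d{1,2}-[A-Z]{3}-(?:\d{4}|YYYY)\b|\b\d{1,2}/\d{1,2}/(?:\d{4}|YYYY)\b"
)

# One capture group so that re.split keeps the matched cue in its output.
CUE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(cue) for cue in TEMPORAL_CUES) + r")\b",
    re.IGNORECASE,
)


def has_temporal_intent(sentence: str) -> bool:
    """Stage 1 (sketch): true if the sentence has an explicit date or a cue."""
    return bool(EXPLICIT_DATE.search(sentence) or CUE_PATTERN.search(sentence))


def split_at_cues(sentence: str) -> List[Tuple[str, str]]:
    """Stage 2 (sketch): return (cue, text following that cue) pairs."""
    # With one capture group, split yields [head, cue1, text1, cue2, text2, ...].
    pieces = CUE_PATTERN.split(sentence)
    result = []
    head = pieces[0].strip(" ,.")
    if head:
        result.append(("", head))  # text before the first cue has no cue
    for cue, text in zip(pieces[1::2], pieces[2::2]):
        result.append((cue.lower(), text.strip(" ,.")))
    return result


E2 = ("Subsequently, a pharmacological stress test was performed with IV "
      "adenosine infusion after which sestamibi was injected IV at peak drug effect.")

for cue, text in split_at_cues(E2):
    print(f"{cue!r}: {text}")
```

On E2 the sketch yields three fragments, one each for "subsequently", "with", and "after which". Turning those fragments into the question-like sub-expressions E2.1 to E2.3 would need a rewriting step that the source does not detail.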

Methods

We present a data reconstruction algorithm that has three stages. First, it detects whether an expression has temporal intent. Second, it decomposes and rewrites the expression into non-temporal sub-expressions and temporal constraints. Third, it clusters similar non-temporal sub-expressions by using unsupervised sentence embedding [5] under the K-means paradigm. Specifically:

- Temporal signals and relations. Our TimeML expansion annotates textual elements that denote explicit and implicit temporal expressions, including cues such as “after which” that appear when a medical event is mentioned only implicitly.
- Decomposing and rewriting expressions.
- Phrase/sentence embeddings under the K-means paradigm. Phrase embeddings are used to find similar phrases by comparing the distance between their vectors.

Materials

Our cohort consists of 15,500 COPD patients who received care within the Partners HealthCare network and died between 2011 and 2017. The clinical notes for this cohort were extracted from the Partners Research Patient Data Registry (RPDR) [4]. In this study, we extracted the ABNORMAL ECG section from free-text cardiology reports. This study was approved by the Partners Institutional Review Board (IRB).

Results

Figure 1 was reconstructed from 30,363 ECG notes in a time segment of 0-180 days before death. The reconstructed data produced by our algorithm are event data such as “duration (days) between two ECGs,” “ECG diagnosis,” “the probability at the same duration,” etc. Taking E1 as an example, the three sub-expressions are: E1.1 “When compared with ECG of 18-JUL-YYYY 10:41,” E1.2 “(unconfirmed) no significant change was found,” and E1.3 “Confirmed by X MD on 7/22/YYYY 17:18.” The duration is the time interval between E1.1 and E1.3.
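As a worked illustration of the duration between E1.1 and E1.3, the following Python sketch parses the two timestamps and subtracts them. The source masks the year as "YYYY", so the sketch substitutes a placeholder year (an assumption made only so the strings can be parsed) and further assumes that both timestamps fall in the same year and that month abbreviations are parsed in an English locale. The helper name `parse_masked` is hypothetical.

```python
from datetime import datetime

# ASSUMPTION: the real year is masked as "YYYY" in the source. This placeholder
# exists only to make the strings parseable; it is not a real value.
PLACEHOLDER_YEAR = "2001"


def parse_masked(timestamp: str, fmt: str) -> datetime:
    """Parse a timestamp whose year is masked as YYYY (hypothetical helper)."""
    return datetime.strptime(timestamp.replace("YYYY", PLACEHOLDER_YEAR), fmt)


# E1.1: "When compared with ECG of 18-JUL-YYYY 10:41"
earlier_ecg = parse_masked("18-JUL-YYYY 10:41", "%d-%b-%Y %H:%M")
# E1.3: "Confirmed by X MD on 7/22/YYYY 17:18"
confirmation = parse_masked("7/22/YYYY 17:18", "%m/%d/%Y %H:%M")

# Temporal constraint attached to the non-temporal sub-expression E1.2.
duration = confirmation - earlier_ecg
duration_days = duration.total_seconds() / 86400

print(duration)                 # 4 days, 6:37:00
print(round(duration_days, 2))  # 4.28
```

Under the same-year assumption, the interval is 4 days, 6 hours, and 37 minutes, which is the kind of "duration (days) between two ECGs" value that the reconstructed event data record.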
After calculating the similarity of non-temporal sub-expressions such as E1.2, the number of clusters of similar ECG diagnoses is easy to obtain. Comparing positive ECG diagnoses against negative controls, we can extend the time interval between two ECGs to about 20 days based on the first “no significant change.”

Conclusions and Relevance

Our data reconstruction algorithm for temporal clinical expressions, built on phrase embeddings, proved feasible and addresses several gaps in natural language processing. This is a significant step toward supporting further analytical activities, such as building knowledge graphs, which often require massive time aggregation operations.
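The clustering stage (grouping similar non-temporal sub-expressions such as E1.2) can be sketched in plain Python as follows. This is a simplified stand-in, not the method used on the poster: it swaps the unsupervised sentence embeddings for unit-length bag-of-words vectors, uses a small hand-written K-means with a deterministic farthest-point initialization, and all example phrases except "no significant change was found" are hypothetical.

```python
import math
from collections import Counter
from typing import List


def embed(phrases: List[str]) -> List[List[float]]:
    """Bag-of-words vectors scaled to unit length (stand-in for sentence embeddings)."""
    vocab = sorted({tok for p in phrases for tok in p.lower().split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for phrase in phrases:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(phrase.lower().split()).items():
            vec[index[tok]] = float(count)
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors: List[List[float]], k: int, iterations: int = 20) -> List[int]:
    """Lloyd's K-means with deterministic farthest-point initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next centroid: the vector farthest from all centroids chosen so far.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for each vector.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


phrases = [
    "no significant change was found",                      # from E1.2
    "no significant change since previous tracing",         # hypothetical
    "atrial fibrillation with rapid ventricular response",  # hypothetical
    "atrial fibrillation now present",                      # hypothetical
]
labels = kmeans(embed(phrases), k=2)
print(list(zip(labels, phrases)))
```

On this toy input the two "no significant change" phrases share one cluster and the two atrial fibrillation phrases share the other. With real sentence embeddings, paraphrases that share few words could also be grouped, which word counts cannot do.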