Generating and Using a Qualified Medical Knowledge Graph for Patient Cohort Retrieval from Big Clinical Electroencephalography (EEG) Data Sanda Harabagiu,

Slides:



Advertisements
Similar presentations
DELOS Highlights COSTANTINO THANOS ITALIAN NATIONAL RESEARCH COUNCIL.
Advertisements

Get Involved Download and use the data Take the survey Join the user group Neural Engineering Data Consortium Iyad Obeid PhD, Joseph Picone PhDTemple University,
Take the Survey! Big data needs? How could membership benefit you? Automatic Interpretation of EEGs Statistics Acknowledgements DARPA/MTO (D13AP00065)
Corpus Development EEG signal files and reports had to be manually paired, de-identified and annotated: Corpus Development EEG signal files and reports.
Software Testing and Quality Assurance
Information Retrieval in Practice
Abstract EEGs, which record electrical activity on the scalp using an array of electrodes, are routinely used in clinical settings to.
Automatic Labeling of EEGs Using Deep Learning M. Golmohammadi, A. Harati, S. Lopez I. Obeid and J. Picone Neural Engineering Data Consortium College of.
Data Processing Machine Learning Algorithm The data is processed by machine algorithms based on hidden Markov models and deep learning. They are then utilized.
Abstract The emergence of big data and deep learning is enabling the ability to automatically learn how to interpret EEGs from a big data archive. The.
Science & Technology Centers Program Center for Science of Information Bryn Mawr Howard MIT Princeton Purdue Stanford Texas A&M UC Berkeley UC San Diego.
Acknowledgements This research was also supported by the Brazil Scientific Mobility Program (BSMP) and the Institute of International Education (IIE).
Grant Number: IIS Institution of PI: Arizona State University PIs: Zoé Lacroix Title: Collaborative Research: Semantic Map of Biological Data.
Scott Duvall, Brett South, Stéphane Meystre A Hands-on Introduction to Natural Language Processing in Healthcare Annotation as a Central Task for Development.
Knowledge Representation and Indexing Using the Unified Medical Language System Kenneth Baclawski* Joseph “Jay” Cigna* Mieczyslaw M. Kokar* Peter Major.
THE TUH EEG CORPUS: The Largest Open Source Clinical EEG Corpus Iyad Obeid and Joseph Picone Neural Engineering Data Consortium Temple University Philadelphia,
THE TUH EEG CORPUS: A Big Data Resource for Automated EEG Interpretation A. Harati, S. López, I. Obeid and J. Picone Neural Engineering Data Consortium.
H Using the Open Metadata Registry (OpenMDR) to generate semantically annotated grid services Rakesh Dhaval, MS, Calixto Melean,
Why Big Data is Crucial Overall progress in the field is not commensurate with the scope of investment. The existence of massive corpora has proven to.
1 Advanced Software Architecture Muhammad Bilal Bashir PhD Scholar (Computer Science) Mohammad Ali Jinnah University.
©Ferenc Vajda 1 Semantic Grid Ferenc Vajda Computer and Automation Research Institute Hungarian Academy of Sciences.
Search Engine Architecture
LANGUAGE MODELS FOR RELEVANCE FEEDBACK Lee Won Hee.
The Goal Use computers to aid physicians in diagnosis of neurological diseases, particularly epilepsy Detect pathological events in real time Currently,
TUH EEG Corpus Data Analysis 38,437 files from the Corpus were analyzed. 3,738 of these EEGs do not contain the proper channel assignments specified in.
Cognitive Systems Foresight Language and Speech. Cognitive Systems Foresight Language and Speech How does the human system organise itself, as a neuro-biological.
1 Context-Aware Internet Sharma Chakravarthy UT Arlington December 19, 2008.
Probabilistic Latent Query Analysis for Combining Multiple Retrieval Sources Rong Yan Alexander G. Hauptmann School of Computer Science Carnegie Mellon.
Cognitive Science and Biomedical Informatics Department of Computer Sciences ALMAAREFA COLLEGES.
Big Mechanism for Processing EEG Clinical Information on Big Data Aim 1: Automatically Recognize and Time-Align Events in EEG Signals Aim 2: Automatically.
Automatic Discovery and Processing of EEG Cohorts from Clinical Records Mission: Enable comparative research by automatically uncovering clinical knowledge.
DeepDive Model Dongfang Xu Ph.D student, School of Information, University of Arizona Dec 13, 2015.
Digital Video Library Network Supervisor: Prof. Michael Lyu Student: Ma Chak Kei, Jacky.
Automated Interpretation of EEGs: Integrating Temporal and Spectral Modeling Christian Ward, Dr. Iyad Obeid and Dr. Joseph Picone Neural Engineering Data.
Abstract Automatic detection of sleep state is important to enhance the quick diagnostic of sleep conditions. The analysis of EEGs is a difficult time-consuming.
Demonstration A Python-based user interface: Waveform and spectrogram views are supported. User-configurable montages and filtering. Scrolling by time.
Market: Customer Survey: 57 clinicians from academic medical centers and community hospitals, and 44 industry professionals. Primary Customer Need: 70%
Data Analysis Generation of the corpus statistics was accomplished through the analysis of information contained in the EDF headers. Figure 4 shows some.
Abstract Automatic detection of sleep state is an important queue in accurate detection of sleep conditions. The analysis of EEGs is a difficult time-consuming.
Ethical Considerations Dr. Richard Adanu Editor-in-Chief International Journal of Gynecology and Obstetrics (IJGO)
The Neural Engineering Data Consortium Mission: To focus the research community on a progression of research questions and to generate massive data sets.
An Ontology framework for Knowledge-Assisted Semantic Video Analysis and Annotation Centre for Research and Technology Hellas/ Informatics and Telematics.
Human Language Technology Research Institute
Scalable EEG interpretation using Deep Learning and Schema Descriptors
BioSignal Analytics Inc.
Results and Discussion
The Big Data to Knowledge (BD2K)
G. Suarez, J. Soares, S. Lopez, I. Obeid and J. Picone
Search Engine Architecture
THE TUH EEG SEIZURE CORPUS
Human Language Technology Research Institute
N. Capp, E. Krome, I. Obeid and J. Picone
Names, titles & Affiliations
A Short Tutorial on Causal Network Modeling and Discovery
Names, titles & Affiliations
A Similarity Retrieval System for Multimodal Functional Brain Images
Big Data Resources for EEGs: Enabling Deep Learning Research
Names, titles & Affiliations
Names, titles & Affiliations
E. von Weltin, T. Ahsan, V. Shah, D. Jamshed, M. Golmohammadi, I
Names, titles & Affiliations
Batyr Charyyev.
Names, titles & Affiliations
Insert Title (identical to title of accepted abstract) size >=54 points, use sans serif font (Arial) John Brown1, Jane Smith2, other authors are listed.
Automatic Interpretation of EEGs for Clinical Decision Support
Human Language Technology Research Institute
A Dissertation Proposal by: Vinit Shah
Deep Residual Learning for Automatic Seizure Detection
Context-Aware Internet
Motivation The subjects/objects are correlated to each other under semantic relationships.
Presentation transcript:

Generating and Using a Qualified Medical Knowledge Graph for Patient Cohort Retrieval from Big Clinical Electroencephalography (EEG) Data Sanda Harabagiu, PhD, Travis Goodwin and Ramon Maldonado The Human Language Technology Research Institute University of Texas at Dallas Human Language Technology Research Institute Abstract Natural language processing of the narratives from EEG reports enables the recognition of EEG events and EEG activities as well as their clinical correlations, the patient state and the patient’s clinical picture. To organize these medical concepts we have designed an EEG-focused Qualified Medical Knowledge Graph (EEG-QMKG) in which concepts (e.g. EEG events, EEG activities) are qualified by their modality (e.g. factual, conditional) and polarity (e,g, positive or negative). The qualification of concepts takes into account the hedging used by physicians in EEG reports. Also, this representation events are grounded spatially (e.g. “frontocentral”) and temporally (“before” seizure) and all concepts are associated with their attributes (e.g. “small” amount of theta). Since many of the events and artifacts as well as medical concepts are mentioned multiple times in the same EEG report, co- reference was resolved automatically in order to best capture the context of medical concepts. The EEG-QMKG also includes the results of automatically processing the EEG signals interpreted in reports. By capturing automatically the semantics and section-informed cohesion from the EEG reports, we designed the EEG-QMKG as a probabilistic representation, in which edges between nodes are inferred statistically through BigData techniques, such as MapReduce. Active learning controls the quality of the knowledge captured in the EEG-QMKG. The EEG-QMKG is a central component our architecture for retrieving patient cohorts when queries such as “young patients with focal cerebral dysfunction treated with Topamax“ are being posed. Not only does the EEG-QMKG enable the retrieval of a list of ranked EEG reports and signals that satisfy the inclusion criteria set by the query, but it also informs innovative learning-to-rank techniques based on results of acceptance testing using three focus groups: expert annotators, clinicians and medical students. The learning-to-rank framework will produce optimal patient cohorts and contribute to regularization of interpretations in EEG reports. Abstract Natural language processing of the narratives from EEG reports enables the recognition of EEG events and EEG activities as well as their clinical correlations, the patient state and the patient’s clinical picture. To organize these medical concepts we have designed an EEG-focused Qualified Medical Knowledge Graph (EEG-QMKG) in which concepts (e.g. EEG events, EEG activities) are qualified by their modality (e.g. factual, conditional) and polarity (e,g, positive or negative). The qualification of concepts takes into account the hedging used by physicians in EEG reports. Also, this representation events are grounded spatially (e.g. “frontocentral”) and temporally (“before” seizure) and all concepts are associated with their attributes (e.g. “small” amount of theta). Since many of the events and artifacts as well as medical concepts are mentioned multiple times in the same EEG report, co- reference was resolved automatically in order to best capture the context of medical concepts. The EEG-QMKG also includes the results of automatically processing the EEG signals interpreted in reports. By capturing automatically the semantics and section-informed cohesion from the EEG reports, we designed the EEG-QMKG as a probabilistic representation, in which edges between nodes are inferred statistically through BigData techniques, such as MapReduce. Active learning controls the quality of the knowledge captured in the EEG-QMKG. The EEG-QMKG is a central component our architecture for retrieving patient cohorts when queries such as “young patients with focal cerebral dysfunction treated with Topamax“ are being posed. Not only does the EEG-QMKG enable the retrieval of a list of ranked EEG reports and signals that satisfy the inclusion criteria set by the query, but it also informs innovative learning-to-rank techniques based on results of acceptance testing using three focus groups: expert annotators, clinicians and medical students. The learning-to-rank framework will produce optimal patient cohorts and contribute to regularization of interpretations in EEG reports. EEG-QMKG Represented as a probabilistic graphical model through a factorized Markov network. The factors are learned by using Big Data methods relying on MapReduce. EEG-QMKG Represented as a probabilistic graphical model through a factorized Markov network. The factors are learned by using Big Data methods relying on MapReduce. Acknowledgements Research reported in this poster was supported by the National Human Genome Research Institute of the National Institutes of Health under award number 1U01HG The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The TUH EEG Corpus development was sponsored by the Defense Advanced Research Projects Agency (DARPA), Temple University’s College of Engineering and Office of Research. Acknowledgements Research reported in this poster was supported by the National Human Genome Research Institute of the National Institutes of Health under award number 1U01HG The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The TUH EEG Corpus development was sponsored by the Defense Advanced Research Projects Agency (DARPA), Temple University’s College of Engineering and Office of Research.