THE TUH EEG CORPUS: The Largest Open Source Clinical EEG Corpus Iyad Obeid and Joseph Picone Neural Engineering Data Consortium Temple University Philadelphia,


THE TUH EEG CORPUS: The Largest Open Source Clinical EEG Corpus Iyad Obeid and Joseph Picone Neural Engineering Data Consortium Temple University Philadelphia, Pennsylvania, USA

NEDC BoD Meeting: TUH EEG Corpus Overview March 12,

The Clinical Process
- A technician administers a 30-minute recording session.
- An EEG specialist (neurologist) interprets the EEG.
- An EEG report is generated with the diagnosis.
- The patient is billed once the report is coded and signed off.

TUH EEG: Bring Big Data to EEG Science
- Release 20,000+ clinical EEG recordings from Temple University Hospital ( ).
  - Includes physician reports and patient medical histories.
  - Data resides on over 1,500 CDs.
  - Data must be deidentified.
- Jointly funded by DARPA, Temple University Office of Research and College of Engineering.
- The largest corpus of its type ever released; will answer many basic science questions about EEGs.

TUH EEG at a Glance
- Number of Sessions: 22,000+
- Number of Patients: ~15,000 (one patient has 42 EEG sessions)
- Age: 16 years to 90+
- Sampling: 16-bit data sampled at 250 Hz, 256 Hz or 512 Hz
- Number of Channels: variable
  - Over 90% of the alternate channel assignments can be mapped to the configuration.
  - Number of Channels: ranges from [28, 129] (one annotation channel per EDF file)
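The sampling figures above allow a rough estimate of raw signal size per session. The following is a minimal sketch, assuming uncompressed 16-bit samples, a 30-minute session (the duration quoted for the clinical process), and every channel sampled at the same rate. The helper name `session_bytes` is hypothetical, and EDF header overhead is ignored.

```python
def session_bytes(n_channels, sample_rate_hz, minutes=30, bytes_per_sample=2):
    """Approximate raw size of one session.

    size = channels x samples per second x seconds x bytes per sample
    (16-bit data means 2 bytes per sample).
    """
    return n_channels * sample_rate_hz * minutes * 60 * bytes_per_sample


# Channel counts at the two ends of the range quoted on the slide.
for channels in (28, 129):
    size_mb = session_bytes(channels, 250) / 1e6
    print(f"{channels} channels at 250 Hz: {size_mb:.1f} MB")
```

Under these assumptions a single session falls between roughly 25 MB and 116 MB, which is only an order-of-magnitude guide since actual session lengths vary.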

Physician Reports: The Resolving Process
- Two Types of Reports:
  - Preliminary Report: contains a summary diagnosis (usually in a spreadsheet format).
  - EEG Report: the final “signed off” report that triggers billing.
- Inconsistent Report Formats:
  - The format of reporting has changed several times over the past 12 years.
- Report Databases:
  - MedQuist (MS Word .rtf)
  - Alpha (OCR’ed .pdf)
  - EPIC (text)
  - Physician’s (MS Word .doc)
  - Hardcopies (OCR’ed .pdf)

Status and Schedule
- Released 250+ sessions in January
- Released 3,000+ sessions in March 2014 for internal testing.
- Over 6,000 sessions are ready for release.
- New data keeps pouring in (24,750+ sessions online now).

Preliminary Findings
- First Attempt (5 Classes):
  - Focal epileptiform, generalized epileptiform, focal abnormal, generalized abnormal, artifacts and background.
  - Achieved over 80% sensitivity (but results were not useful to physicians).
- Second Attempt (5 Classes):
  - Spike and sharp wave, generalized periodic epileptiform discharge (GPED), periodic lateralized epileptiform discharge (PLED), seizure and background (includes eye blink)
  - Also: focal/generalized and continuous/intermittent
- Automatic Labeling:
  - Deep learning is used to identify critical EEG events that correlate with EEG reports (using unsupervised training).
  - These events are then used to train classifiers that will automatically label the data.
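The sensitivity figure quoted for the first attempt can be illustrated with a short calculation. This is a minimal sketch, not the project's scoring method: it assumes one reference label and one hypothesized label per event, the function name `per_class_sensitivity` is hypothetical, and the class abbreviations and label sequences are made-up placeholders.

```python
from collections import Counter


def per_class_sensitivity(reference, hypothesis):
    """Sensitivity (recall) per class: correctly labeled events / reference events."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    totals = Counter(reference)
    hits = Counter(r for r, h in zip(reference, hypothesis) if r == h)
    return {label: hits[label] / totals[label] for label in totals}


# Illustrative labels only; the abbreviations are placeholders for the five classes.
ref = ["SPSW", "GPED", "PLED", "SEIZ", "BCKG", "BCKG", "SPSW", "SEIZ"]
hyp = ["SPSW", "GPED", "BCKG", "SEIZ", "BCKG", "SPSW", "SPSW", "BCKG"]

for label, value in per_class_sensitivity(ref, hyp).items():
    print(f"{label}: {value:.2f}")
```

Reporting sensitivity per class rather than as a single pooled number makes it easier to see when a frequent class such as background dominates the result.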

Observations
- General:
  - This project would not have been possible without leveraging three funding sources.
  - Community interest is high, but willingness to fund is low.
- Project Specific:
  - Recovering the EEG signal data was challenging due to software incompatibilities and media problems.
  - Recovering the EEG reports is proving to be challenging due to the primitive state of the hospital record system.
  - Making the data truly useful to machine learning researchers will require additional data clean up, particularly with linking reports to specific EEG activity.

Publications
- Harati, A., Choi, S. I., Tabrizi, M., Obeid, I., Jacobson, M., & Picone, J. (2013). The Temple University Hospital EEG Corpus. Proceedings of the IEEE Global Conference on Signal and Information Processing. Austin, Texas, USA.
- Ward, C., Obeid, I., Picone, J., & Jacobson, M. (2013). Leveraging Big Data Resources for Automatic Interpretation of EEGs. Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium. New York City, New York, USA.

Planned Publications
- Journal paper in collaboration with neurologists on a statistical analysis of the data (should be a seminal paper cited by others using the data)
- IEEE Signal Processing in Medicine and Biology, Temple University, Philadelphia, Pennsylvania, December 6, 2014 (NSF-Funded).

The Temple University Hospital EEG Corpus

Synopsis: The world’s largest publicly available EEG corpus, consisting of 20,000+ EEGs from 15,000 patients, collected over 12 years. Includes physician’s diagnoses and patient medical histories. Number of channels varies from 24 to 36. Signal data distributed in an EDF format.

Impact:
- Sufficient data to support application of state-of-the-art machine learning algorithms
- Patient medical histories, particularly drug treatments, support statistical analysis of correlations between signals and treatments
- Historical archive also supports investigation of EEG changes over time for a given patient
- Enables the development of real-time monitoring

Database Overview:
- 21,000+ EEGs collected at Temple University Hospital from 2002 to 2013 (an ongoing process)
- Recordings vary from 24 to 36 channels of signal data sampled at 250 Hz
- Patients range in age from 18 to 90 with an average of 1.4 EEGs per patient
- Data includes a test report generated by a technician, an impedance report and a physician’s report; data from 2009 forward includes ICD-9 codes
- A total of 1.8 TBytes of data
- Personal information has been redacted
- Clinical history and medication history are included
- Physician notes are captured in three fields: description, impression and correlation.
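Because the signal data is distributed in EDF format, channel counts and sampling rates can be read from each file's header without loading the signals. The following is a minimal sketch based on the published EDF header layout (a 256-byte fixed block followed by 256 bytes per signal), not the tooling used to build the corpus. The function names, the synthetic header and the channel labels in the demo are hypothetical.

```python
import io

# Per-signal header fields in EDF order; each is stored for all signals in turn.
SIGNAL_FIELDS = [("label", 16), ("transducer", 80), ("dimension", 8),
                 ("phys_min", 8), ("phys_max", 8), ("dig_min", 8),
                 ("dig_max", 8), ("prefilter", 80),
                 ("samples_per_record", 8), ("reserved", 32)]


def read_edf_header(stream):
    """Return channel labels, per-channel sampling rates and duration."""
    fixed = stream.read(256)
    if len(fixed) < 256:
        raise ValueError("truncated EDF header")
    n_records = int(fixed[236:244].decode("ascii").strip())
    record_sec = float(fixed[244:252].decode("ascii").strip())
    n_signals = int(fixed[252:256].decode("ascii").strip())
    fields = {}
    for name, width in SIGNAL_FIELDS:
        raw = stream.read(width * n_signals)
        fields[name] = [raw[i * width:(i + 1) * width]
                        .decode("ascii", errors="replace").strip()
                        for i in range(n_signals)]
    # Assumes a non-zero record duration, which holds for ordinary EDF signals.
    rates = [int(s) / record_sec for s in fields["samples_per_record"]]
    return {"n_signals": n_signals, "labels": fields["label"],
            "sample_rates_hz": rates, "duration_sec": n_records * record_sec}


def _pad(value, width):
    """Left-justify a value in a fixed-width ASCII field."""
    return str(value).ljust(width).encode("ascii")


def make_demo_header(labels, samples_per_record, n_records=1800, record_sec=1):
    """Build a synthetic header (no signal data) so the sketch runs without a file."""
    ns = len(labels)
    fixed = (_pad("0", 8) + _pad("X", 80) + _pad("X", 80)
             + _pad("01.01.01", 8) + _pad("00.00.00", 8)
             + _pad(256 + 256 * ns, 8) + _pad("", 44)
             + _pad(n_records, 8) + _pad(record_sec, 8) + _pad(ns, 4))
    columns = [(labels, 16), ([""] * ns, 80), (["uV"] * ns, 8),
               ([-3200] * ns, 8), ([3200] * ns, 8), ([-32768] * ns, 8),
               ([32767] * ns, 8), ([""] * ns, 80),
               ([samples_per_record] * ns, 8), ([""] * ns, 32)]
    return fixed + b"".join(_pad(v, w) for values, w in columns for v in values)


demo = make_demo_header(["EEG FP1-REF", "EEG FP2-REF"], 250)
header = read_edf_header(io.BytesIO(demo))
print(header["n_signals"], header["labels"], header["sample_rates_hz"])
```

For a real recording, the same function could be called on a file opened in binary mode; the synthetic header is used here only so the example does not depend on corpus files.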

Automated Interpretation of EEGs

Goals: (1) To assist healthcare professionals in interpreting electroencephalography (EEG) tests, thereby improving the quality and efficiency of a physician’s diagnostic capabilities; (2) To provide a real-time alerting capability that addresses a critical gap in long-term monitoring technology.

Impact:
- Patients and technicians will receive immediate feedback rather than waiting days or weeks for results
- Physicians receive decision-making support that reduces their time spent interpreting EEGs
- Medical students can be trained with the system, and search tools make it easy to view patient histories and comparable conditions in other patients
- Uniform diagnostic techniques can be developed

Milestones:
- Develop an enhanced set of features based on temporal and spectral measures (1Q’2014)
- Statistical modeling of time-varying data sources in bioengineering using deep learning (2Q’2014)
- Label events at an accuracy of 95% measured on the held-out data from the TUH EEG Corpus (3Q’2014)
- Predict diagnoses with an F-score (a weighted harmonic mean of precision and recall) of 0.95 (4Q’2014)
- Demonstrate a clinically-relevant system and assess the impact on physician workflow (4Q’2014)
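The F-score target in the milestones can be made concrete with a short calculation. This is a minimal sketch assuming the standard F-beta definition, with beta = 1 weighting precision and recall equally; the slide does not state which weighting the project uses, the function name `f_score` is hypothetical, and the counts are made up for illustration.

```python
def f_score(true_pos, false_pos, false_neg, beta=1.0):
    """F-beta score computed from raw counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN)
    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    """
    if true_pos == 0:
        return 0.0  # no correct detections: precision and recall are both zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative counts only: balanced errors versus many missed diagnoses.
print(f"{f_score(95, 5, 5):.3f}")
print(f"{f_score(90, 10, 30):.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a score of 0.95 requires both precision and recall to be high; strong precision cannot offset weak recall.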