THE TUH EEG CORPUS: A Big Data Resource for Automated EEG Interpretation A. Harati, S. López, I. Obeid and J. Picone Neural Engineering Data Consortium.


THE TUH EEG CORPUS: A Big Data Resource for Automated EEG Interpretation
A. Harati, S. López, I. Obeid and J. Picone
Neural Engineering Data Consortium, Temple University
M. P. Jacobson, M.D. and S. Tobochnik
Department of Neurology, Lewis Katz School of Medicine, Temple University

S. Lopez: TUH EEG Corpus December 13,
The Clinical Process
• A technician administers a 30-minute recording session.
• An EEG specialist (neurologist) interprets the EEG.
• An EEG report is generated with the diagnosis.
• The patient is billed once the report is coded and signed off.

Automatic Interpretation

The TUH EEG Corpus
• Number of Sessions: 25,000+
• Number of Patients: ~15,000
• Frequent Flyer: 42 sessions
• Age Range (Years): 16 to 90+
• Sampling Rates: 250, 256 or 512 Hz
• Sampling Resolution: 16 bits
• Data Format: European Data Format (EDF)
• Number of Channels: Variable
Variations in channels and electrode labels are very real challenges:
• The number of channels ranges from 28 to 129 (one annotation channel per EDF file).
• Over 90% of the alternate channel assignments can be mapped to the standard configuration.
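The mapping of alternate channel assignments to a standard configuration can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the slides do not list the standard configuration, so the 19 electrodes of the 10-20 system stand in for it, and the label patterns (an "EEG " prefix, a "-REF", "-LE" or "-AVG" suffix) and the alias table are hypothetical examples rather than the corpus's documented conventions.

```python
import re

# Assumed stand-in for the "standard configuration": the 19 scalp
# electrodes of the 10-20 system. The slides do not spell this list out.
STANDARD_CHANNELS = [
    "FP1", "FP2", "F7", "F3", "FZ", "F4", "F8", "T3", "C3", "CZ",
    "C4", "T4", "T5", "P3", "PZ", "P4", "T6", "O1", "O2",
]

# Hypothetical alias table: newer electrode names mapped to older ones.
ALIASES = {"T7": "T3", "T8": "T4", "P7": "T5", "P8": "T6"}


def normalize_label(raw):
    """Return the standard electrode name for a raw EDF label, or None."""
    label = raw.strip().upper()
    label = re.sub(r"^EEG\s+", "", label)         # drop a leading "EEG " prefix
    label = re.sub(r"-(REF|LE|AVG)$", "", label)  # drop a trailing reference suffix
    label = ALIASES.get(label, label)
    return label if label in STANDARD_CHANNELS else None


def map_channels(raw_labels):
    """Map each standard electrode name to the index of its first match."""
    mapping = {}
    for index, raw in enumerate(raw_labels):
        name = normalize_label(raw)
        if name is not None and name not in mapping:
            mapping[name] = index
    return mapping


def coverage(raw_labels):
    """Fraction of the assumed standard configuration found in a file."""
    return len(map_channels(raw_labels)) / len(STANDARD_CHANNELS)
```

Calling `map_channels` on the label list of one file gives the channel indices to read for each standard electrode, and `coverage` gives a quick way to flag files that cannot be mapped.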

EEG Reports
Two Types of Reports:
• Preliminary Report: contains a summary diagnosis (usually in a spreadsheet format).
• EEG Report: the final “signed off” report that triggers billing.
Inconsistent Report Formats:
• The format of reporting has changed several times over the past 12 years.
Report Databases:
• MedQuist (MS Word .rtf)
• Alpha (OCR’ed .pdf)
• EPIC (text)
• Physician’s (MS Word .doc)
• Hardcopies (OCR’ed .pdf)

The TUH EEG Corpus
• The corpus is growing at a rate of about 2,750 EEGs per year.
• In 2014, more 40-minute EEGs are being administered.
• ???
• Figure: A sample EDF header.
• Data has been carefully de-identified (e.g., removal of medical record number, patient name and exact birthdate).
• “Pruned EEGs” are being used.
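To make the EDF header and the de-identification step concrete, here is a minimal sketch that reads the fixed 256-byte part of an EDF header and overwrites the patient field. The field widths follow the published EDF layout. The function names and the choice to blank the entire patient field with the placeholder "X X X X" are assumptions for illustration; the slides do not describe the procedure actually used on the corpus.

```python
# Fixed-length part of an EDF header as (field name, width in bytes).
# The widths sum to 256 bytes.
EDF_FIELDS = [
    ("version", 8), ("patient_id", 80), ("recording_id", 80),
    ("start_date", 8), ("start_time", 8), ("header_bytes", 8),
    ("reserved", 44), ("num_records", 8), ("record_duration", 8),
    ("num_signals", 4),
]
HEADER_SIZE = sum(width for _, width in EDF_FIELDS)


def parse_header(raw):
    """Split the first 256 bytes of an EDF file into stripped text fields."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("need at least %d header bytes" % HEADER_SIZE)
    fields = {}
    offset = 0
    for name, width in EDF_FIELDS:
        fields[name] = raw[offset:offset + width].decode("ascii").strip()
        offset += width
    return fields


def deidentify(raw, placeholder="X X X X"):
    """Return the header bytes with the 80-byte patient field overwritten.

    Blanking the whole field is an assumption of this sketch, not the
    documented procedure for the corpus.
    """
    start, end = 8, 88  # the patient field follows the 8-byte version field
    blank = placeholder.encode("ascii")[:end - start].ljust(end - start)
    return raw[:start] + blank + raw[end:]
```

A typical use would read the first 256 bytes of a file opened in binary mode, pass them to `deidentify`, and write the result back in place; the signal data that follows the header is untouched by this sketch.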

Manual Annotations

Two-Level Machine Learning Architecture
[Diagram labels: Feature Extraction; Sequential Modeler; Post Processor; Epoch Label; Epoch; Temporal and Spatial Context; Hidden Markov Models; Finite State Machine]
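The two stages named in the diagram can be sketched as follows. This is an illustration under assumptions, not the authors' system: the transcript only lists the block names, so the sketch assumes that the sequential modeler decodes per-epoch scores with a hidden Markov model (plain Viterbi decoding here) and that the post processor applies a simple minimum-duration rule in place of the finite state machine. Function names, parameters and the two-state setup are hypothetical.

```python
def viterbi(log_emissions, log_trans, log_start):
    """Most likely state sequence for per-epoch emission log-likelihoods.

    log_emissions[t][s] is the log-likelihood of epoch t under state s,
    log_trans[p][s] the log-probability of moving from state p to state s,
    and log_start[s] the log-probability of starting in state s.
    """
    num_states = len(log_start)
    scores = [log_start[s] + log_emissions[0][s] for s in range(num_states)]
    backpointers = []
    for frame in log_emissions[1:]:
        new_scores, pointers = [], []
        for s in range(num_states):
            best_prev = max(range(num_states),
                            key=lambda p: scores[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_scores.append(scores[best_prev] + log_trans[best_prev][s] + frame[s])
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(range(num_states), key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def suppress_short_events(labels, background=0, min_epochs=2):
    """Relabel runs of non-background labels shorter than min_epochs.

    A stand-in for the post processor: it uses temporal context only.
    """
    cleaned = list(labels)
    start = 0
    while start < len(labels):
        end = start
        while end < len(labels) and labels[end] == labels[start]:
            end += 1
        if labels[start] != background and end - start < min_epochs:
            cleaned[start:end] = [background] * (end - start)
        start = end
    return cleaned
```

In use, the output of `viterbi` for one channel would be passed to `suppress_short_events` to drop isolated single-epoch detections. Spatial context across channels, which the diagram also names, is not modeled in this sketch.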

Iterative Training

Performance

Analysis
• Talk about the difficulty of detecting spikes and the strategy used to differentiate them from periodic lateralized epileptiform discharges (PLED) and generalized periodic epileptiform discharges (GPED).

The Neural Engineering Data Consortium
Mission:
• To focus the research community on a progression of research questions and to generate massive data sets used to address those questions.
• To broaden participation by making data available to research groups who have significant expertise but lack capacity for data generation.
Impact:
• Big data resources enable application of state-of-the-art machine-learning algorithms.
• A common evaluation paradigm ensures consistent progress towards long-term research goals.
• Publicly available data and performance baselines eliminate specious claims.
• Technology can leverage advances in data collection to produce more robust solutions.
Expertise:
• Experimental design and instrumentation of bioengineering-related data collection
• Signal processing and noise reduction
• Preprocessing and preparation of data for distribution and research experimentation
• Automatic labeling, alignment and sorting of data
• Metadata extraction for enhancing machine learning applications for the data
• Statistical modeling, mining and automated interpretation of big data
To learn more, visit

Summary
• Talk about the database status (two bullets)
• Talk about the technology (two bullets)

The Temple University Hospital EEG Corpus
Synopsis:
• The world’s largest publicly available EEG corpus, consisting of 20,000+ EEGs from 15,000 patients, collected over 12 years.
• Includes physician’s diagnoses and patient medical histories.
• Number of channels varies from 24 to 36.
• Signal data is distributed in EDF format.
Impact:
• Sufficient data to support application of state-of-the-art machine learning algorithms.
• Patient medical histories, particularly drug treatments, support statistical analysis of correlations between signals and treatments.
• The historical archive also supports investigation of EEG changes over time for a given patient.
• Enables the development of real-time monitoring.
Database Overview:
• 21,000+ EEGs collected at Temple University Hospital from 2002 to 2013 (an ongoing process).
• Recordings vary from 24 to 36 channels of signal data sampled at 250 Hz.
• Patients range in age from 18 to 90, with an average of 1.4 EEGs per patient.
• Data includes a test report generated by a technician, an impedance report and a physician’s report; data from 2009 forward includes ICD-9 codes.
• A total of 1.8 TBytes of data.
• Personal information has been redacted.
• Clinical history and medication history are included.
• Physician notes are captured in three fields: description, impression and correlation.