Big Data Resources for EEGs: Enabling Deep Learning Research

Slides:



Advertisements
Similar presentations
Senior Consultant Neurologist Singapore General Hospital
Advertisements

Abstract Electrical activity in the cortex can be recorded by surface electrodes. Electro Encephalography (EEG) machine records potential difference between.
Manual Interpretation of EEGs: A Machine Learning Perspective Christian Ward, Dr. Iyad Obeid and Dr. Joseph Picone Neural Engineering Data Consortium College.
Get Involved Download and use the data Take the survey Join the user group Neural Engineering Data Consortium Iyad Obeid PhD, Joseph Picone PhDTemple University,
For Neurology Residents
Electroencephalography
Take the Survey! Big data needs? How could membership benefit you? Automatic Interpretation of EEGs Statistics Acknowledgements DARPA/MTO (D13AP00065)
Corpus Development EEG signal files and reports had to be manually paired, de-identified and annotated: Corpus Development EEG signal files and reports.
THE TUH EEG CORPUS: A Big Data Resource for Automated EEG Interpretation A. Harati, S. López, I. Obeid and J. Picone Neural Engineering Data Consortium.
Abstract EEGs, which record electrical activity on the scalp using an array of electrodes, are routinely used in clinical settings to.
Automatic Labeling of EEGs Using Deep Learning M. Golmohammadi, A. Harati, S. Lopez I. Obeid and J. Picone Neural Engineering Data Consortium College of.
Data Processing Machine Learning Algorithm The data is processed by machine algorithms based on hidden Markov models and deep learning. They are then utilized.
Analysis of Temporal Lobe Paroxysmal Events Using Independent Component Analysis Jonathan J. Halford MD Department of Neuroscience, Medical University.
Abstract The emergence of big data and deep learning is enabling the ability to automatically learn how to interpret EEGs from a big data archive. The.
Acknowledgements This research was also supported by the Brazil Scientific Mobility Program (BSMP) and the Institute of International Education (IIE).
Views The architecture was specifically changed to accommodate multiple views. The used of the QStackedWidget makes it easy to switch between the different.
THE TUH EEG CORPUS: The Largest Open Source Clinical EEG Corpus Iyad Obeid and Joseph Picone Neural Engineering Data Consortium Temple University Philadelphia,
THE TUH EEG CORPUS: A Big Data Resource for Automated EEG Interpretation A. Harati, S. López, I. Obeid and J. Picone Neural Engineering Data Consortium.
Manual Interpretation of EEGs: A Machine Learning Perspective Christian Ward, Dr. Iyad Obeid and Dr. Joseph Picone Neural Engineering Data Consortium College.
Exploration of Instantaneous Amplitude and Frequency Features for Epileptic Seizure Prediction Ning Wang and Michael R. Lyu Dept. of Computer Science and.
Data Acquisition An EEG measurement represents a difference between the voltages at two electrodes. The signal is usually displayed using a montage which.
The Goal Use computers to aid physicians in diagnosis of neurological diseases, particularly epilepsy Detect pathological events in real time Currently,
TUH EEG Corpus Data Analysis 38,437 files from the Corpus were analyzed. 3,738 of these EEGs do not contain the proper channel assignments specified in.
Big Mechanism for Processing EEG Clinical Information on Big Data Aim 1: Automatically Recognize and Time-Align Events in EEG Signals Aim 2: Automatically.
Automatic Discovery and Processing of EEG Cohorts from Clinical Records Mission: Enable comparative research by automatically uncovering clinical knowledge.
Automated Interpretation of EEGs: Integrating Temporal and Spectral Modeling Christian Ward, Dr. Iyad Obeid and Dr. Joseph Picone Neural Engineering Data.
Abstract Automatic detection of sleep state is important to enhance the quick diagnostic of sleep conditions. The analysis of EEGs is a difficult time-consuming.
Demonstration A Python-based user interface: Waveform and spectrogram views are supported. User-configurable montages and filtering. Scrolling by time.
Generating and Using a Qualified Medical Knowledge Graph for Patient Cohort Retrieval from Big Clinical Electroencephalography (EEG) Data Sanda Harabagiu,
Market: Customer Survey: 57 clinicians from academic medical centers and community hospitals, and 44 industry professionals. Primary Customer Need: 70%
Data Analysis Generation of the corpus statistics was accomplished through the analysis of information contained in the EDF headers. Figure 4 shows some.
Abstract Automatic detection of sleep state is an important queue in accurate detection of sleep conditions. The analysis of EEGs is a difficult time-consuming.
The Neural Engineering Data Consortium Mission: To focus the research community on a progression of research questions and to generate massive data sets.
Constructing Multiple Views The architecture of the window was altered to accommodate switching between waveform, spectrogram, energy, and all combinations.
Constructing Multiple Views The architecture of the window was altered to accommodate switching between waveform, spectrogram, energy, and all combinations.
Results from Mean and Variance Calculations The overall mean of the data for all features was for the REF class and for the LE class. The.
Descriptive Statistics The means for all but the C 3 features exhibit a significant difference between both classes. On the other hand, the variances for.
Human Language Technology Research Institute
Data Mining for Surveillance Applications Suspicious Event Detection
Scalable EEG interpretation using Deep Learning and Schema Descriptors
BioSignal Analytics Inc.
Automated Identification of Abnormal Adult EEG
1. EEG source cortical pyramidal cells
CLASSIFICATION OF SLEEP EVENTS IN EEG’S USING HIDDEN MARKOV MODELS
G. Suarez, J. Soares, S. Lopez, I. Obeid and J. Picone
Enhanced Visualizations for Improved Real-Time EEG Monitoring
THE TUH EEG SEIZURE CORPUS
Intraoperative Electrocorticography in Temporal Lobe Epilepsy Surgery
Automatic Sleep Stage Classification using a Neural Network Algorithm
Epileptic Seizure Prediction
Human Language Technology Research Institute
Enhanced Visualizations for Improved Real-Time EEG Monitoring
N. Capp, E. Krome, I. Obeid and J. Picone
Improving Diagnostic Accuracy in Epilepsy
Optimizing Channel Selection for Seizure Detection
EEG Recognition Using The Kaldi Speech Recognition Toolkit
To learn more, visit The Neural Engineering Data Consortium Mission: To focus the research community on a progression of research questions.
AN ANALYSIS OF TWO COMMON REFERENCE POINTS FOR EEGS
E. von Weltin, T. Ahsan, V. Shah, D. Jamshed, M. Golmohammadi, I
Automatic Interpretation of EEGs for Clinical Decision Support
feature extraction methods for EEG EVENT DETECTION
LECTURE 42: AUTOMATIC INTERPRETATION OF EEGS
Northeast Regional Epilepsy Group Christos Lambrakis M.D.
LECTURE 41: AUTOMATIC INTERPRETATION OF EEGS
Data Mining for Surveillance Applications Suspicious Event Detection
Human Language Technology Research Institute
EEG Event Classification Using Deep Learning
A Dissertation Proposal by: Vinit Shah
Deep Residual Learning for Automatic Seizure Detection
EEG Event Classification Using Deep Learning
Presentation transcript:

Big Data Resources for EEGs: Enabling Deep Learning Research L. Veloso, J. McHugh, E. von Weltin, S. Lopez, I. Obeid and J. Picone The Neural Engineering Data Consortium, Temple University College of Engineering Temple University www.nedcdata.org Abstract The Temple University Hospital Electroencephalo- graphy Corpus (TUHEEG) is the world’s largest open source EEG corpus of its kind. Several important subsets of the data that are designed to support research in specific subspecialties of EEG analysis are: TUH EEG Seizure Corpus: created for automatic seizure detection research; has been manually annotated for seizure events; events are classified by type (e.g., tonic) and subtype (e.g., ICU), and duration (e.g., routine or LTM). TUH Abnormal EEG Corpus: supports research on classification of abnormal EEGs; includes patients ranging in age from 10-100 and many challenging benign conditions. TUH EEG Slowing Corpus: developed to aid in the development of a tool that can differentiate between slowing at the end of a seizure and an independent non-seizure slowing event. TUH EEG Epilepsy Corpus: contains 436 sessions from 100 patients with epilepsy and 134 sessions from 100 patients without epilepsy. These data are open source and freely available at https://www.isip.piconepress.com/projects/tuh_eeg/ downloads/. There are more than 650 registered users of these resources, making them one of the most popular resources in the research community. The TUH EEG Corpus (v1.0.0) The master corpus: contains all EEG sessions collected at TUH from 2002 to 2015. Data collection is ongoing (increasing at a rate of 3K sessions/yr). The corpus includes a rich and diverse set of patients and medical histories: The signal data totals 977 GB, while the reports are a total of 93 MB. Over 40 unique channel configurations; 90% of the database consists of Averaged Reference (AR) and Linked Ear (LE) EEGs; 95% of the data conforms to a standard 10/20 EEG configuration. The TUH Abnormal EEG Corpus All EDF files are 15 minutes or longer Each seizure event is classified by both a student annotator and a certified neurologist Positive agreement between the two groups was 97% and negative agreement was 1% or lower. The TUH EEG Epilepsy Corpus Patients were filtered based on criteria in the reports that indicated signs of epilepsy as determined by neurologists from NIH. Reports were searched for keywords and medications that are indicative of epilepsy: Medications that normally indicate a history of epilepsy (e.g., Keppra, Levetircatam, Vimpat). Keywords that indicate seizure behavior: Search results were manually reviewed to make sure they conformed to the requirements. Data is being used to correlate seizures in EEG signals with patterns in interictal EKGs. This corpus represents one of the first efforts to subdivide TUH EEG for use in machine learning. Typical Inclusion Criteria Typical Exclusion Criteria Spike and wave No sharp wave or spike Sharp wave Single sharp wave or spike Sharp waves No focal or epileptiform Spike Left anterior temporal sharp wave Abnormal: epileptiform features, such as spike and wave discharges, are present at the vertex of the scalp. Normal: eye blink artifacts and posterior dominant rhythm (PDR) are both normal features. No. of Patients No. of Sessions Total Duration Training 2,132 2,740 1,045 hours Evaluation 253 277 103 hours Epilepsy Diagnosis No. of Patients No. of Sessions Total Duration Yes 100 436 351 hours No 134 72 hours No. of Patients No. of Sessions Total Duration 13,551 23,257 15,968 hours Introduction The data is harvested from clinical recordings collected at a large urban public hospital (TUH). All data and reports have been rigorously deidentified to ensure the data is HIPAA compliant. Each EEG session includes EEG signal data, a neurologist’s report, and annotation information required to conduct machine learning research. All EEGs are collected in a clinical setting, meaning they have non-epileptic features such as muscle artifacts and patient movements. Each file is manually annotated by a team of neuroscience students; each session is typically reviewed by at least three annotators. Annotator accuracy has been validated against board-certified clinicians (Kappa ~ 0.8). The TUH EEG Seizure Corpus (v1.2.0) A subset of TUHEEG created to support deep learning research into automatic seizure detection. Each file has been reviewed and manually annotated by a team of students for seizure events (start time, stop time, type of seizure). Each session is classified by it’s type: Inpatient, Outpatient, EMU, and ICU. Each session has a subtype: ER, OR, General, Outpatient, EMU, and the different ICUs: BURN, CICU, ICU, NICU, NSICU, PICU, RICU, and SICU. Both routine (20 min.) and long-term (24 hr.) EEGs. The TUH EEG Slowing Corpus Created to aid in the differentiation of seizure events and slowing events in machine learning, which is the most common single error modality. Contains 100 samples of seizure, independent slowing, and complex background events. The samples of slowing and complex background were collected manually, while the seizure events were taken from the TUH EEG Seizure Corpus. Each sample is 10 seconds long to facilitate simple machine learning experiments using neural networks, which prefer fixed-length patterns. Summary The TUH EEG Corpus is enabling the application of state of the art machine learning algorithms to problems such as seizure detection. The open source nature of the data (e.g., no IRB or data-sharing agreements are required) makes it accessible to a large community of researchers. There are currently over 650 registered users of these resources. Plans for 2018 include an open Kaggle- style competition on seizure detection. Future plans include: Data collection at TUH will continue for at least the next three years, growing the corpus at a rate of at least 3,000 sessions per year. Parsed medical reports that contain medical concepts, their attributes, and knowledge representations that describe how these concepts relate to one another. A digital pathology corpus of 1M images! Acknowledgements Research reported in this publication was most recently supported by the National Human Genome Research Institute of the National Institutes of Health under award number U01HG008468. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Post-ictal slowing which is observed at the termination of many seizure events. Intermittent electrographic slowing that can cause false alarms. No. of Patients No. of Sessions No. of Seizures Training 196 456 1,303 Evaluation 50 230 649 No. of Unique Patients No. of Sessions Total No. of Events 38 75 100