Deep neural networks for spike sorting: exploring options

Presentation transcript:

Deep neural networks for spike sorting: exploring options. Jeremy Magland, with Joakim Anden and Eftychios Pnevmatikakis. 29 March 2019. Center for Computational Mathematics, Flatiron Institute, Simons Foundation. Spike sorting collaborators: James Jun, Alex Barnett.

Deep neural networks for spike sorting? Overview of this presentation:
- Brief overview of the spike sorting problem and its challenges
- Description of the speech separation problem, clustering via deep embedding, and its relationship to spike sorting
- Next steps -- discussion of possible ways forward

Spike sorting overview. First, I will give a brief overview of the spike sorting problem and describe our efforts to compare and validate various spike sorting algorithms using a framework called SpikeForest.

Spike sorting: starting from an electrophysiology recording (a multi-channel timeseries), detect neural firing events, then sort them into clusters representing individual neurons.

Spike sorting as a black box. Input: an M x N array (channels x timepoints). Output: a collection of spike times and neuron labels -- which neuron fired when?

Many spike sorting packages... how to choose?

Evaluation. The input M x N array is fed to the spike sorter, producing a collection of spike times and neuron labels. These are compared against the ground truth (the true spike times and labels) to produce accuracy results.
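
As a rough illustration of what this ground-truth comparison involves, here is a minimal Python sketch that matches sorted spike times to true spike times within a small tolerance and reports the fraction matched. The function name, the tolerance value, and the metric itself are illustrative assumptions, not SpikeForest's actual implementation.

```python
import numpy as np

def match_accuracy(sorted_times, true_times, tol=30):
    """Fraction of ground-truth spikes matched by a sorted spike within `tol`
    samples (an illustrative metric, not SpikeForest's exact one)."""
    sorted_times = np.sort(np.asarray(sorted_times))
    true_times = np.sort(np.asarray(true_times))
    matched = 0
    for t in true_times:
        # find the sorted event nearest to this ground-truth event
        i = np.searchsorted(sorted_times, t)
        candidates = sorted_times[max(i - 1, 0):i + 1]
        if candidates.size and np.min(np.abs(candidates - t)) <= tol:
            matched += 1
    return matched / max(len(true_times), 1)

# Example: per-unit accuracy for one ground-truth unit vs. a sorted unit
acc = match_accuracy(sorted_times=[100, 405, 910], true_times=[102, 400, 900, 1500])
print(acc)  # 0.75 -- three of the four true spikes were recovered
```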

SpikeForest website (in progress). With: Alex Barnett, James Jun, Liz Lovero.
- Registered hundreds of ephys recordings (simulated and real) with ground-truth information
- Wrapped several algorithms (MountainSort, IronClust, Spyking Circus, Yass, KiloSort, KiloSort2, ...)
- Framework for running analyses, comparing with ground truth, and auto-caching results
- Public website for interactive exploration of results
- Reproducible, transparent, open-source

Spike sorting challenges. Next I will touch upon some of the challenges associated with spike sorting.

Spike sorting challenges:
- Presence of noise (or background signal)
- Unknown event times (detection)
- More than one unit (clustering)
- Spike waveforms reside in a high-dimensional space
- Non-Gaussian clusters (waveform variation)
- Drift
- Bursting
- Overlapping events
- Large electrode arrays (many channels)

The pure detection problem (case of a single unit), illustrated on example traces ranging from no noise, to somewhat noisy, to very noisy.
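
For the single-unit case, a minimal sketch of threshold-based detection might look like the following. The MAD-based noise estimate, the threshold of 5 noise levels, and the refractory gap are illustrative choices, not the values used in any particular sorter.

```python
import numpy as np

def detect_events(x, thresh_mads=5.0, refractory=30):
    """Threshold crossings on a single channel (negative-going spikes assumed).
    The noise level is estimated via the median absolute deviation; all
    parameters here are illustrative placeholders."""
    sigma = np.median(np.abs(x)) / 0.6745          # robust noise estimate
    crossings = np.flatnonzero(x < -thresh_mads * sigma)
    events = []
    last = -np.inf
    for i in crossings:
        if i - last > refractory:                  # keep one event per spike
            events.append(i)
            last = i
    return np.array(events)
```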

The pure clustering problem (case where event times are known): feature extraction, followed by clustering in a low-dimensional feature space.
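
As a sketch of this two-step pipeline, the snippet below uses PCA for feature extraction and k-means for the low-dimensional clustering; scikit-learn is assumed, the number of features and clusters are placeholders, and the talk's own pipeline uses ISO-SPLIT (next slide) rather than k-means.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def cluster_waveforms(waveforms, n_features=3, n_clusters=5):
    """waveforms: (n_events, n_timepoints * n_channels) array of spike snippets.
    PCA + k-means is a common baseline; n_features and n_clusters are
    illustrative values only."""
    features = PCA(n_components=n_features).fit_transform(waveforms)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
    return features, labels
```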

Clustering via ISO-SPLIT. ISO-SPLIT is our density-based method for clustering in low dimensions. Will neural nets be able to handle this aspect of spike sorting?

High-dimensional space: 50 timepoints x 8 electrode channels = 400 dimensions. Or maybe we can bypass the feature extraction step and directly utilize the full spike waveforms in this higher-dimensional space.

Non-Gaussian cluster distributions. Clusters can have non-Gaussian shapes due to variations in spike waveforms and the complex background signal (noise) distribution.

Spike waveform drift and bursting. Spike amplitude vs. time (duration ~10 minutes) for a single neuron: each point is a spike event, and color represents the results of spike sorting. Drift and bursting can cause false splitting of clusters.

Overlapping spikes. When spikes overlap in both time and space, waveform shapes may be distorted.

Large electrode arrays (high channel counts)

Speech blind source separation. Next I will describe the problem of speech separation (audio signals) and the similarities and differences between it and spike sorting.

Speech separation. Spectrograms (frequency vs. time, duration ~4 seconds) of Speaker 1, Speaker 2, and Speakers 1&2 mixed.

Speech separation procedure:
- Classify the time-frequency bins in the spectrogram of the mixture as belonging to one speaker or the other (clustering)
- Apply the resulting masks to the spectrogram
- Reconstruct the unmixed signals
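
A minimal sketch of the mask-and-reconstruct step (the last two bullets above) is shown below, using an STFT from scipy.signal. The sampling rate and window length are assumptions, and obtaining the masks themselves is the clustering step described on the following slides.

```python
import numpy as np
from scipy.signal import stft, istft

def separate_with_masks(mixture, masks, fs=8000, nperseg=256):
    """mixture: 1-D audio signal; masks: list of binary arrays with the same
    shape as the mixture's STFT. Returns one reconstructed signal per mask.
    (Sketch of the mask-and-invert step only; fs and nperseg are assumptions.)"""
    f, t, Z = stft(mixture, fs=fs, nperseg=nperseg)
    sources = []
    for mask in masks:
        _, x_hat = istft(Z * mask, fs=fs, nperseg=nperseg)
        sources.append(x_hat)
    return sources
```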

Speech separation via deep clustering. Figure panels: spectrogram (mixture of three speakers), ideal binary masks, and output binary masks (network trained on 2-speaker mixtures). Hershey, John R., et al., "Deep clustering: Discriminative embeddings for segmentation and separation," 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

Speech separation via deep clustering.
Training:
- Obtain audio recordings of mixed speech where the ideal assignment of spectrogram bins to speakers is known (WSJ collection)
- Split the spectrogram into 100-frame segments (~800 ms, roughly a single word)
- Neural network: embed each segment's bins in a high-dimensional feature space
- Optimize the separability of the known clusters in the embedding space
Speech separation:
- Start from an audio recording of mixed speech with unknown bin assignments
- Split the spectrogram into 100-frame segments (as above)
- Neural network: embed each segment's bins in the high-dimensional space
- Cluster the entire recording (the union of all bins in all segments) in the embedding space (e.g., k-means) to obtain spectrogram masks

Deep clustering -- training the embedding. A spectrogram segment (100 x M, where 100 = # time frames per segment and M = # frequencies) is mapped by a neural network to an embedding of size 100 x M x D, where D = dimension of the embedding space. The figure shows a single segment from a mixture of two speakers, the mask for the first speaker, and the bins plotted in the first three embedding dimensions (Hershey et al., ICASSP 2016).

Deep clustering -- training the embedding. The neural network maps each spectrogram segment (100 x M) to an embedding (100 x M x D). The embedding is trained to optimize the separability (or clusterability) of all the bins in the D-dimensional embedding space, based on the known speaker assignments. For example: minimize distances between points within the same cluster, and maximize distances between points in different clusters.
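
The separability objective in Hershey et al. can be written as minimizing ||V V^T - Y Y^T||_F^2, where the rows of V are the bin embeddings and the rows of Y are one-hot speaker assignments. A small numpy sketch of that loss, using the standard expansion that avoids ever forming the N x N affinity matrices, might look like this (shapes and variable names are illustrative):

```python
import numpy as np

def deep_clustering_loss(V, Y):
    """Deep clustering objective, || V V^T - Y Y^T ||_F^2, expanded so that no
    N x N affinity matrix is formed.
    V: (N, D) embeddings of the N time-frequency bins (rows typically unit-norm).
    Y: (N, C) one-hot speaker assignments for the same bins."""
    vtv = V.T @ V      # (D, D)
    vty = V.T @ Y      # (D, C)
    yty = Y.T @ Y      # (C, C)
    return (np.sum(vtv ** 2)
            - 2.0 * np.sum(vty ** 2)
            + np.sum(yty ** 2))
```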

Deep clustering -- applying the network for speech separation. Cluster all the bins of the spectrogram in the D-dimensional embedding space to obtain the masks for the speakers. The figure shows the spectrogram of a mixture of two speakers, the resulting mask for the first speaker, and the bins in the first three embedding dimensions.
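
A sketch of this inference step, with k-means from scikit-learn standing in for whatever clustering method is actually used, could look like the following; shapes and names are assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def masks_from_embeddings(embeddings, spec_shape, n_speakers=2):
    """embeddings: (n_frames * n_freqs, D) bin embeddings from the network;
    spec_shape: (n_frames, n_freqs) shape of the spectrogram being separated.
    Returns one boolean mask per speaker (a sketch of the k-means step)."""
    labels = KMeans(n_clusters=n_speakers, n_init=10).fit_predict(embeddings)
    labels = labels.reshape(spec_shape)
    return [labels == k for k in range(n_speakers)]
```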

Back to ephys. Now we'll go back to electrophysiology and spike sorting and describe the relationship between it and speech separation.

Electrophysiology (simulated): 32 channels, 200 milliseconds, 30 kHz, 40 units, firing rates 5-10 Hz.

Electrophysiology vs. speech: 32 channels, 200 milliseconds, 30 kHz.

Ephys vs. speech. The primary difficulty is that in spike sorting, the local context around a spike does not help characterize it at all, aside from the shape of the ~1 millisecond spike waveform itself. Therefore the speech separation technology cannot be transferred directly.

We need to use long time-scale associations in some way...

Possible approach. Embed each timeseries segment (M x T, where M = # electrode channels and T = # timepoints in the segment) via a neural network into an M x T x D array, where D = dimension of the embedding space. Incorporate longer time-scale information, perhaps by using nearest-neighbor segments within +/- 30 seconds.
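
Purely as a hypothetical sketch of what such a segment embedder could look like (not the authors' design), the PyTorch module below maps each (channel, timepoint) bin of an M x T segment to a D-dimensional embedding and leaves a hook for context from neighboring segments; all layer choices and sizes are assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class SegmentEmbedder(nn.Module):
    """Hypothetical sketch: embed an M-channel x T-timepoint ephys segment so
    that every (channel, timepoint) bin gets a D-dimensional embedding,
    analogous to the spectrogram bins in deep clustering. Architecture and
    sizes are illustrative assumptions only."""
    def __init__(self, n_channels=32, embed_dim=16, hidden=64):
        super().__init__()
        self.n_channels, self.embed_dim = n_channels, embed_dim
        self.encoder = nn.Sequential(
            nn.Conv1d(n_channels, hidden, kernel_size=11, padding=5),
            nn.ReLU(),
            nn.Conv1d(hidden, n_channels * embed_dim, kernel_size=11, padding=5),
        )

    def forward(self, segment, context=None):
        # segment: (batch, M, T); `context` could summarize neighboring
        # segments within +/- 30 seconds (here simply added, as a placeholder).
        x = segment if context is None else segment + context
        z = self.encoder(x)                                # (batch, M*D, T)
        b, _, t = z.shape
        z = z.view(b, self.n_channels, self.embed_dim, t)  # (batch, M, D, T)
        z = z.permute(0, 1, 3, 2)                          # (batch, M, T, D)
        return F.normalize(z, dim=-1)                      # unit-norm per bin
```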

Thank you!