Hyperdimensional Computing for Noninvasive Brain–Computer Interfaces: Blind and One-Shot Classification of EEG Error-Related Potentials
Abbas Rahimi, Pentti Kanerva, José del R. Millán, Jan M. Rabaey
EECS Department, UC Berkeley; IBI-STI, EPFL

Outline
- Architecture for Brain-Computer Interface (BCI)
- Electroencephalogram (EEG) error-related potentials
- Hyperdimensional Computing Basics
  - Mapping to hypervectors and arithmetic
  - Hyperdimensional computing examples
- Mapping EEG Electrodes to Hypervectors
- Temporal-Spatial Hyperdimensional Encoder
- Experimental Results
- Summary

General Architecture for Brain-Computer Interface (BCI)
(Figure: block diagram; 64 EEG electrodes feed a hyperdimensional computing classifier that outputs one of two classes.)
Goal: use a brain-inspired computing model, hyperdimensional computing, to understand brain signals!

EEG Error-Related Potentials
Error-related potential (ERP) as a backseat driver!
- The user monitors the performance of an external agent over which they have no control.
- The user provides no commands, but only monitors the agent's performance.
To classify EEG ERPs:
- Baseline: spatial CAR preprocessing, per-subject electrode selection, and a statistical Gaussian model [Chavarriaga et al., TNSRE'10]
- Our work: brain-inspired hyperdimensional computing
  - Less preprocessing (no CAR filter)
  - Blindly uses all electrodes (no prior domain-expert knowledge)
  - Faster learning
CAR: common average reference

Experimental Protocol of ERPs
(Figure: trial timeline at 2000 ms steps, showing a correct move at t+1 and a wrong move at t+2. Red square: target location; green square: moving cursor; dotted square: cursor location at the previous time step.)
- At each trial the cursor moves horizontally to reach the target.
- The probability of moving in the wrong direction is ~0.2.

Brain-Inspired Hyperdimensional Computing
Hyperdimensional (HD) computing [P. Kanerva, Cognitive Computation'09]:
- Emulates cognition by computing with high-dimensional vectors, as opposed to computing with numbers
- Distributes information in a high-dimensional space
- Its algebra of hypervectors leads to a powerful model of computing
Superb properties:
- General and scalable model of computing
- Well-defined set of arithmetic operations
- Fast, one-shot learning (no need for backpropagation)
- Memory-centric, with embarrassingly parallel operations
- Extremely robust against most failure mechanisms and noise
Our aim is to develop an efficient and fast learning method, based on HD computing, that blindly operates with all electrodes and with raw data.

What Are Hypervectors?
Distributed, pattern-based data representation and arithmetic, in contrast to computing with numbers!
Hypervectors are:
- high-dimensional (e.g., 10,000 dimensions)
- (pseudo)random, with i.i.d. components
- holographically distributed (i.e., not microcoded)
Hypervectors can:
- use various codings: dense or sparse, bipolar or binary
- be combined using arithmetic operations: multiplication, addition, and permutation (MAP)
- be compared for similarity using distance metrics, e.g., Hamming distance
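
A minimal sketch of this representation (Python with numpy; the dimension D = 10,000 follows the slide, but the helper names are mine and this is not the authors' released code):

```python
import numpy as np

D = 10_000
rng = np.random.default_rng(0)

def random_hypervector(d=D):
    # Dense bipolar coding: i.i.d. components drawn uniformly from {-1, +1}.
    return rng.choice([-1, 1], size=d)

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

A = random_hypervector()
B = random_hypervector()
print(cosine(A, B))  # ~0: independently drawn hypervectors are quasi-orthogonal
```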

Mapping to Hypervectors
Each "symbol" (e.g., a channel in EEG) is represented by a 10,000-D hypervector chosen at random:
N1 = [−1 +1 −1 −1 −1 +1 −1 −1 ...]
N2 = [+1 −1 +1 +1 +1 −1 +1 −1 ...]
N3 = [−1 −1 −1 +1 +1 −1 +1 −1 ...]
...
N64 = [−1 −1 +1 −1 +1 +1 +1 −1 ...]
- Every hypervector is dissimilar to the others, e.g., ⟨N1, N2⟩ ≈ 0
- This assignment is fixed throughout the computation
(Figure: the item memory (iM) maps the channel name 'Fp1' to its 10,000-D hypervector N1.)
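
As a sketch, the item memory can be as simple as a dictionary from channel names to fixed random hypervectors (reusing `random_hypervector` from the snippet above; only a few of the 64 channels are shown):

```python
# Hypothetical item memory: one fixed random hypervector per channel name.
channels = ["Fp1", "Fp2", "F3", "F4", "O2"]   # subset of the 64 electrodes
iM = {name: random_hypervector() for name in channels}

N1 = iM["Fp1"]   # this assignment never changes during the computation
```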

HD Arithmetic
- Addition (+) is good for representing sets, since the sum vector is similar to its constituent vectors: ⟨A+B, A⟩ ≈ 0.5
- Multiplication (*) is good for binding, since the product vector is dissimilar to its constituent vectors: ⟨A*B, A⟩ ≈ 0
- Permutation (ρ) makes a dissimilar vector by rotating the coordinates; it is good for representing sequences: ⟨A, ρA⟩ ≈ 0
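
For dense bipolar vectors the MAP operations reduce to elementwise sum, elementwise product, and cyclic rotation. A hedged sketch (reusing `random_hypervector` and `cosine` from above; the majority thresholding in `bundle` is one common choice, with an odd number of inputs to avoid ties):

```python
def bundle(*vs):
    # Addition (+) followed by majority thresholding back to {-1, +1}.
    return np.sign(np.sum(vs, axis=0)).astype(int)

def bind(a, b):
    # Multiplication (*): elementwise product of bipolar vectors.
    return a * b

def permute(a, k=1):
    # Permutation (rho): cyclic rotation of the coordinates, applied k times.
    return np.roll(a, k)

A, B, C = (random_hypervector() for _ in range(3))
print(cosine(bundle(A, B, C), A))   # ~0.5: the set stays similar to each member
print(cosine(bind(A, B), A))        # ~0:   binding hides its constituents
print(cosine(permute(A), A))        # ~0:   rho A is dissimilar to A
```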

Its Algebra is General: Architecture Can Be Reused
(Figure: the same item memory, MAP encoder, and associative memory pipeline instantiated twice: for hand gestures, 5-bit samples from four EMG sensors S1-S4 map to 10K-bit hypervectors and 5 classes; for languages, 8-bit letters map to 10K-bit hypervectors and 21 classes.)

Application                             n-grams      HD      Baseline
Language identification [ISLPED'16]     n = 3        96.7%   97.9%
Text categorization [DATE'16]           n = 5        94.2%   86.4%
EMG gesture recognition [ICRC'16]       n ∈ [3,5]    97.8%   89.7%
EEG brain-machine interface [BICT'17]   n ∈ [16,29]  74.5%   69.5%

Mapping an EEG Electrode to Hypervectors
- The item memory (iM) maps channel names to orthogonal hypervectors.
- The continuous item memory (CiM) maps quantities continuously to hypervectors (quantization: 100 levels).
(Figure: for electrode Fp1, the name 'Fp1' is looked up in the iM and the signal value in the CiM.)
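
One plausible CiM construction, consistent with the "2 orthogonal hypervectors" note on the next slide, interpolates between two random endpoints by swapping in progressively more components. The exact scheme is an assumption, not taken from the talk (helpers from the earlier sketches):

```python
def continuous_item_memory(levels=100, d=D):
    lo = random_hypervector(d)            # hypervector for the lowest level
    hi = random_hypervector(d)            # hypervector for the highest level
    order = rng.permutation(d)            # fixed order in which components flip
    cim = []
    for m in range(levels):
        k = round(m * d / (levels - 1))   # number of components taken from 'hi'
        v = lo.copy()
        v[order[:k]] = hi[order[:k]]
        cim.append(v)
    return cim

CiM = continuous_item_memory()
print(cosine(CiM[0], CiM[1]))    # ~0.99: neighboring levels are similar
print(cosine(CiM[0], CiM[99]))   # ~0:    extreme levels are quasi-orthogonal
```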

Temporal HD Encoder for One EEG Electrode
(Figure: the 1st electrode's signal is preprocessed (band-pass filter, mean, quantization to 100 levels); the resulting level hypervectors L1,1 ... L1,n are rotated (ρ) and combined into the temporal n-gram G1, which is bound to the name hypervector N1: R1 = N1 * G1.)
- The CiM contains 100 hypervectors for continuous mapping (spanning 2 orthogonal hypervectors)
- The iM contains 64 orthogonal hypervectors, one per electrode
- Temporal encoder: rotate (ρ) a signal level to capture its history, producing a temporal n-gram (G1)
- Bind an electrode name (e.g., N1) to its temporal n-gram: N1 * G1
- This binding produces a record R1 describing the electrode of interest
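
A sketch of the temporal encoder as read from the diagram; the exact rotation convention (newest sample unrotated, each older sample rotated once more) is my assumption, and the data below is hypothetical:

```python
def temporal_ngram(level_hvs):
    # level_hvs[0] is the newest quantized sample; rho^age encodes its position,
    # so the n-gram preserves the temporal order of the samples.
    g = np.ones(D, dtype=int)
    for age, lv in enumerate(level_hvs):
        g *= permute(lv, age)
    return g

# For electrode Fp1: n quantized levels from the CiM, bound to the name.
levels_fp1 = [CiM[12], CiM[13], CiM[11], CiM[14], CiM[12]]   # stand-in samples
G1 = temporal_ngram(levels_fp1)   # temporal n-gram (n = 5 here)
R1 = bind(iM["Fp1"], G1)          # record R1 = N1 * G1
```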

Temporal-Spatial HD Encoder
(Figure: the preprocessing and temporal encoder are replicated for all 64 electrodes, from Fp1 to O2, producing records R1 ... R64; the spatial encoder sums them into a single hypervector E.)
Generate a temporal-spatial hypervector across the 64 electrodes by addition.
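
A sketch of the full encoder under the same assumptions, bundling one record per electrode into the trial hypervector E (a raw sum, since the slide says the spatial encoder combines by addition; the data is again stand-in):

```python
def encode_trial(names, per_electrode_levels):
    # R_i = N_i * G_i for each electrode, summed into one hypervector E.
    records = [bind(n, temporal_ngram(lv))
               for n, lv in zip(names, per_electrode_levels)]
    return np.sum(records, axis=0)

# Stand-in data: 64 electrode name hypervectors, 5 quantized samples each.
names = [random_hypervector() for _ in range(64)]
levels = [[CiM[rng.integers(100)] for _ in range(5)] for _ in range(64)]
E = encode_trial(names, levels)
```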

Class Prototypes in Associative Memory
(Figure: the temporal-spatial encoder maps a trial, electrodes Fp1 ... O2, to the hypervector E; an associative memory holds the "correct" and "wrong" prototypes C and W and compares them to E by cosine similarity.)
Training update, for each trial:
  if label == "correct" and cos(C, E) < 0.5 then C += E
  if label == "wrong" and cos(W, E) < 0.5 then W += E
HD computing shares the same structure for both training and testing!
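
A sketch of training and classification with the two prototypes, directly following the update rule above (the function names are mine; the cosine guard `< 0.5` is what lets a single pass over the trials suffice):

```python
def cos_sim(a, b):
    n = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / n if n > 0 else 0.0   # empty prototypes count as dissimilar

def train(trial_hvs, labels):
    C = np.zeros(D)   # prototype for "correct"
    W = np.zeros(D)   # prototype for "wrong"
    for E, label in zip(trial_hvs, labels):
        if label == "correct" and cos_sim(C, E) < 0.5:
            C += E
        elif label == "wrong" and cos_sim(W, E) < 0.5:
            W += E
    return C, W

def classify(E, C, W):
    # Testing reuses the same encoder and the same similarity metric.
    return "correct" if cos_sim(E, C) >= cos_sim(E, W) else "wrong"
```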

Fast and One-shot Learning
Training with only 2.6% of the total non-redundant trials, HD accuracy reaches 79.3% (higher than the baseline achieves using all training trials).

Fast and One-shot Learning by 6 Subjects
The HD classifier learns faster: it needs only 0.3% of the non-redundant training trials for S6, and up to 96% for S1. On average, the HD classifier meets the target accuracy of 70% when trained with only 34% of the non-redundant training trials.

Blindly Using All Electrodes w/o Preprocessing
Compared to the baseline:
- Using the same setup: HD has 5% higher accuracy
- Using all electrodes without the CAR filter: HD has 2.2% higher accuracy

Summary
An application of HD computing to the classification of error-related potentials from EEG recordings. Classification accuracy is comparable to a baseline classifier crafted by a skilled professional:
- The HD algorithm classifies without requiring prior knowledge of which channels matter for this task; it uses all 64 channels, while the baseline selects 1 or 2 channels per subject
- The HD algorithm uses lighter preprocessing (no CAR filter)
- HD achieves this with less training data
Open-source HD code: https://github.com/abbas-rahimi/HDC-EEG-ERP

Acknowledgment
This work was supported in part by:
- Systems on Nanoscale Information fabriCs (SONIC), one of the six SRC STARnet Centers, sponsored by MARCO and DARPA
- the Intel Strategic Research Alliance (ISRA) program on Neuromorphic Architectures for Mainstream Computing