Tutorials on Deploying Neural Network Architectures Using Keras and TensorFlow
Vinit Shah, Joseph Picone and Iyad Obeid
Neural Engineering Data Consortium, Temple University

Outline
- Brief introduction to TensorFlow and Keras
- Comparison between NumPy and TF
- Examples of NN architectures
- Learning rate (LR) tricks
- Regularizers
- Model analysis

Introduction to TensorFlow
TensorFlow provides primitives for defining functions on tensors and automatically computing their derivatives. But wait, what is a tensor? Tensors are multilinear maps from vector spaces to the real numbers. TensorFlow is developed by Google and is constantly evolving to optimize performance across hardware configurations (e.g., multi-GPU training, memory allocation, CPU support). It gives us the ability to develop systems from the ground up: the TF core layers allow us to edit the properties of the fundamental neurons. TFLearn and Keras are higher-level APIs that use TensorFlow as their backend, which gives researchers a convenient way to develop models in a few lines of code.
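A minimal sketch of what "automatically computing derivatives" means in practice (TF 1.x style, matching the Graph/Session API used throughout these slides; the values are arbitrary):

```python
# Define y = x^2 and let TensorFlow derive dy/dx automatically.
import tensorflow as tf

x = tf.constant(3.0)
y = x * x
dy_dx = tf.gradients(y, x)[0]   # symbolic derivative: 2x

with tf.Session() as sess:
    print(sess.run(dy_dx))      # 6.0
```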

Introduction to TensorFlow
TensorFlow uses a dataflow graph to represent the model computation in terms of the dependencies between individual operations. The primary elements of TF are:
- Graph
- Session
- Ops
- Variable scopes
Placeholders, constants, variables and operations are added to the default graph. It is good practice to define a Graph object explicitly, which allows us to create multiple graphs/models in the same script:

```python
# initiate a new model
new_graph = tf.Graph()
with new_graph.as_default():
    # assign a constant to the graph
    new_g_const = tf.constant([1.0, 2.0])
```

Sessions encapsulate the environment in which operations are executed.
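Continuing that sketch, a Session bound to the explicit graph can evaluate the constant (a minimal, self-contained example):

```python
import tensorflow as tf

new_graph = tf.Graph()
with new_graph.as_default():
    new_g_const = tf.constant([1.0, 2.0])

# The session encapsulates the execution environment for the graph's ops.
with tf.Session(graph=new_graph) as sess:
    print(sess.run(new_g_const))   # [1. 2.]
```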

Comparison between NumPy and TF
Scopes are used to break the complexity of a model down into individual, named pieces. NumPy vs. TF commands:
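The slide's command table is not preserved in this transcript; the pairs below are a representative sketch of typical NumPy-to-TensorFlow equivalents (my selection, not necessarily the original table's):

```python
import numpy as np
import tensorflow as tf

# NumPy expression                    TensorFlow equivalent
a_np = np.zeros((2, 2));              a_tf = tf.zeros((2, 2))
s_np = np.sum(a_np, axis=1);          s_tf = tf.reduce_sum(a_tf, axis=1)
r_np = np.reshape(a_np, (4,));        r_tf = tf.reshape(a_tf, (4,))
m_np = np.dot(a_np, a_np);            m_tf = tf.matmul(a_tf, a_tf)
```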

Keras
Keras allows you to quickly build and test neural network architectures with minimal effort, so beginners should start by deploying models with Keras. The same model would take 50+ lines of code in TensorFlow, which is harder to understand and more prone to errors as well.

```python
# Keras model template
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(300, activation='relu', input_dim=100))
model.add(Dense(30, activation='relu'))
model.add(Dense(30, activation='relu'))
model.add(Dense(4, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model, iterating on the data in batches of 32 samples
# (data: shape (num_samples, 100); labels: one-hot, 4 classes)
model.fit(data, labels, epochs=10, batch_size=32)
```

Example of Time-Series Data
Temporal and spatial information is observed during EEG event classification. Events observed on adjacent channels are correlated within their locality. Raw signals carry too much information (with much higher entropy) for ML algorithms to learn from.
(Slide figure: EEG data shown across the spatial domain (electrode positioning) and the temporal domain (seconds).)
Frequency-domain features (a frame-extraction sketch follows this list):
- 10 frames per second
- 0.2 second analysis window
- 9 base features, including energy, differential energy, and 7 cepstral coefficients
- 1st and 2nd derivatives are used for most features
- Total dimension: 26
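A minimal sketch of this frame-based analysis (the 250 Hz sample rate and the log-energy feature are my assumptions; the full 26-dimensional front end is not shown):

```python
import numpy as np

fs = 250                      # assumed EEG sample rate in Hz
frame_shift = int(0.1 * fs)   # 10 frames per second
window_len = int(0.2 * fs)    # 0.2 second analysis window

def frame_energy(signal):
    """Log energy of each 0.2 s window, one value per 0.1 s frame."""
    n_frames = 1 + (len(signal) - window_len) // frame_shift
    feats = np.empty(n_frames)
    for i in range(n_frames):
        w = signal[i * frame_shift : i * frame_shift + window_len]
        feats[i] = np.log(np.sum(w ** 2) + 1e-10)
    return feats

print(frame_energy(np.random.randn(fs * 5)).shape)   # 5 s -> 49 frames
```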

NN Architectures
CNN-LSTM model (a sketch follows this list):
- EEG data is represented as an image with a 26-dimensional feature vector.
- CNN layers can learn the spatial correlation between channels.
- LSTM layers learn long-term dependencies of the signal.
- Postprocessing includes setting a threshold on hypothesis events and filtering out short sequences.
- Context information is concatenated with each feature vector.
- The model sees 1 second of data per sample and learns 7-second-long sequences using RNNs (LSTMs).
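A hedged Keras sketch of a CNN-LSTM of this kind (the 22-channel montage, filter counts, and two output classes are my assumptions, not the slide's):

```python
from keras.models import Sequential
from keras.layers import (TimeDistributed, Conv2D, MaxPooling2D,
                          Flatten, LSTM, Dense)

model = Sequential()
# Input: 7 one-second steps, each a 22-channel x 26-feature "image".
model.add(TimeDistributed(Conv2D(16, (3, 3), activation='relu'),
                          input_shape=(7, 22, 26, 1)))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(64))                           # long-term dependencies
model.add(Dense(2, activation='softmax'))     # e.g., seizure / background
model.compile(optimizer='adam', loss='categorical_crossentropy')
```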

LR Tricks
Learning rate tricks used in NN architectures include (an annealing sketch follows this list):
- Constant LR
- Annealing LR
- Snapshot ensembling
- Preconditioned LRs
- Random initialization
The error surface generated by the training data is non-convex and includes multiple local minima. The best approach is to choose features and models that produce as smooth an error surface as possible. Normalization of features plays a crucial role in deep learning in terms of convergence and speed.
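One of the tricks above, LR annealing, can be wired into Keras with a callback (a minimal sketch; the base rate and decay factor are arbitrary):

```python
from keras.callbacks import LearningRateScheduler

def anneal(epoch):
    """Exponential annealing: lr = 1e-3 * 0.9^epoch."""
    return 1e-3 * (0.9 ** epoch)

lr_schedule = LearningRateScheduler(anneal)
# model.fit(data, labels, epochs=10, callbacks=[lr_schedule])
```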

Regularizers
- Dropout layers: force a neural network layer to generalize better instead of relying on a few specific sets of neurons during training.
- Kernel initializers: initialize the weight and bias variables using a specific distribution (e.g., normal).
- Early stopping: avoid overfitting by stopping training early.
- Data augmentation: introducing noise into the data can help develop robust models.
- L1 & L2 regularization: penalize the cost function with Lasso (L1) and/or Ridge (L2) terms to avoid under-/overfitting.
(Slide figure: a steep error surface.)
Several of these can be combined in a few lines of Keras, as sketched below.
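A hedged sketch (layer sizes and hyperparameters are my assumptions) combining dropout, an explicit kernel initializer, L2 weight penalties, and early stopping:

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.regularizers import l2
from keras.callbacks import EarlyStopping

model = Sequential()
model.add(Dense(300, activation='relu', input_dim=100,
                kernel_initializer='glorot_normal',   # normal-based init
                kernel_regularizer=l2(1e-4)))         # L2 (Ridge) penalty
model.add(Dropout(0.5))   # randomly silence half the units while training
model.add(Dense(4, activation='softmax'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

stop = EarlyStopping(monitor='val_loss', patience=3)
# model.fit(data, labels, validation_split=0.1, epochs=50, callbacks=[stop])
```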

TensorBoard Analysis
The TensorBoard utility helps us analyze the network by collecting statistics related to the graph, layers, gradients, etc. It also provides projection methods such as t-SNE to visualize high-dimensional data.
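In Keras, the statistics TensorBoard consumes can be logged with the built-in callback (a minimal sketch; the log directory is arbitrary):

```python
from keras.callbacks import TensorBoard

tb = TensorBoard(log_dir='./logs', histogram_freq=1, write_graph=True)
# model.fit(data, labels, validation_split=0.1, epochs=10, callbacks=[tb])
# Then inspect with:  tensorboard --logdir=./logs
```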