Tutorials on Deploying Neural Network Architectures Using Keras and TensorFlow
Vinit Shah, Joseph Picone and Iyad Obeid
Neural Engineering Data Consortium, Temple University

Outline
- Brief introduction to TensorFlow and Keras
- Comparison between NumPy and TF
- Examples of NN architectures
- Learning rate (LR) tricks
- Regularizers
- Model analysis

Introduction to TensorFlow
TensorFlow provides primitives for defining functions on tensors and automatically computing their derivatives. But wait, what is a tensor? Tensors are multilinear maps from vector spaces to the real numbers. TensorFlow is developed by Google and is constantly evolving to optimize performance across hardware configurations (e.g., multi-GPU training, memory allocation, CPU support). It gives us the ability to develop systems from the ground up: the TF core layers allow us to edit the properties of the fundamental neurons. TFLearn and Keras are higher-level APIs that use TensorFlow as their backend, which gives researchers a convenient way to develop models in a few lines of code.
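A minimal sketch of what "automatically computing derivatives" means in practice (TF 1.x style, matching the Graph/Session API used throughout these slides; the values are arbitrary):

```python
# Define y = x^2 and let TensorFlow derive dy/dx automatically.
import tensorflow as tf

x = tf.constant(3.0)
y = x * x
dy_dx = tf.gradients(y, x)[0]   # symbolic derivative: 2x

with tf.Session() as sess:
    print(sess.run(dy_dx))      # 6.0
```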

Introduction to TensorFlow
TensorFlow uses a dataflow graph to represent the model computation in terms of the dependencies between individual operations. The primary elements of TF are:
- Graph
- Session
- Ops
- Variable scopes
Placeholders, constants, variables and operations are added to the default graph. It is good practice to define a Graph object explicitly, which allows us to create multiple graphs/models in the same script:

```python
# initiate a new model
new_graph = tf.Graph()
with new_graph.as_default():
    # assign a constant to the graph
    new_g_const = tf.constant([1.0, 2.0])
```

Sessions encapsulate the environment in which operations are executed.
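Continuing that sketch, a Session bound to the explicit graph can evaluate the constant (a minimal, self-contained example):

```python
import tensorflow as tf

new_graph = tf.Graph()
with new_graph.as_default():
    new_g_const = tf.constant([1.0, 2.0])

# The session encapsulates the execution environment for the graph's ops.
with tf.Session(graph=new_graph) as sess:
    print(sess.run(new_g_const))   # [1. 2.]
```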

Comparison between NumPy and TF
Scopes are used to break the complexity of a model down into individual, named pieces. NumPy vs. TF commands:
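The slide's command table is not preserved in this transcript; the pairs below are a representative sketch of typical NumPy-to-TensorFlow equivalents (my selection, not necessarily the original table's):

```python
import numpy as np
import tensorflow as tf

# NumPy expression                    TensorFlow equivalent
a_np = np.zeros((2, 2));              a_tf = tf.zeros((2, 2))
s_np = np.sum(a_np, axis=1);          s_tf = tf.reduce_sum(a_tf, axis=1)
r_np = np.reshape(a_np, (4,));        r_tf = tf.reshape(a_tf, (4,))
m_np = np.dot(a_np, a_np);            m_tf = tf.matmul(a_tf, a_tf)
```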

Keras
Keras allows you to quickly build and test neural network architectures with minimal effort, so beginners should start by deploying models with Keras. The same model would take 50+ lines of code in TensorFlow, which is harder to understand and more prone to errors as well.

```python
# Keras model template
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(300, activation='relu', input_dim=100))
model.add(Dense(30, activation='relu'))
model.add(Dense(30, activation='relu'))
model.add(Dense(4, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model, iterating on the data in batches of 32 samples
# (data: shape (num_samples, 100); labels: one-hot, 4 classes)
model.fit(data, labels, epochs=10, batch_size=32)
```

Example of Time-Series Data
Temporal and spatial information is observed during EEG event classification. Events observed on adjacent channels are correlated within their locality. Raw signals carry too much information (with much higher entropy) for ML algorithms to learn from.
(Slide figure: EEG data shown across the spatial domain (electrode positioning) and the temporal domain (seconds).)
Frequency-domain features (a frame-extraction sketch follows this list):
- 10 frames per second
- 0.2 second analysis window
- 9 base features, including energy, differential energy, and 7 cepstral coefficients
- 1st and 2nd derivatives are used for most features
- Total dimension: 26
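A minimal sketch of this frame-based analysis (the 250 Hz sample rate and the log-energy feature are my assumptions; the full 26-dimensional front end is not shown):

```python
import numpy as np

fs = 250                      # assumed EEG sample rate in Hz
frame_shift = int(0.1 * fs)   # 10 frames per second
window_len = int(0.2 * fs)    # 0.2 second analysis window

def frame_energy(signal):
    """Log energy of each 0.2 s window, one value per 0.1 s frame."""
    n_frames = 1 + (len(signal) - window_len) // frame_shift
    feats = np.empty(n_frames)
    for i in range(n_frames):
        w = signal[i * frame_shift : i * frame_shift + window_len]
        feats[i] = np.log(np.sum(w ** 2) + 1e-10)
    return feats

print(frame_energy(np.random.randn(fs * 5)).shape)   # 5 s -> 49 frames
```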

NN Architectures
CNN-LSTM model (a sketch follows this list):
- EEG data is represented as an image with a 26-dimensional feature vector.
- CNN layers can learn the spatial correlation between channels.
- LSTM layers learn long-term dependencies of the signal.
- Postprocessing includes setting a threshold on hypothesis events and filtering out short sequences.
- Context information is concatenated with each feature vector.
- The model sees 1 second of data per sample and learns 7-second-long sequences using RNNs (LSTMs).
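A hedged Keras sketch of a CNN-LSTM of this kind (the 22-channel montage, filter counts, and two output classes are my assumptions, not the slide's):

```python
from keras.models import Sequential
from keras.layers import (TimeDistributed, Conv2D, MaxPooling2D,
                          Flatten, LSTM, Dense)

model = Sequential()
# Input: 7 one-second steps, each a 22-channel x 26-feature "image".
model.add(TimeDistributed(Conv2D(16, (3, 3), activation='relu'),
                          input_shape=(7, 22, 26, 1)))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(64))                           # long-term dependencies
model.add(Dense(2, activation='softmax'))     # e.g., seizure / background
model.compile(optimizer='adam', loss='categorical_crossentropy')
```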

LR Tricks
Learning rate tricks used in NN architectures include (an annealing sketch follows this list):
- Constant LR
- Annealing LR
- Snapshot ensembling
- Preconditioned LRs
- Random initialization
The error surface generated by the training data is non-convex and includes multiple local minima. The best approach is to choose features and models that produce as smooth an error surface as possible. Normalization of features plays a crucial role in deep learning in terms of convergence and speed.
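One of the tricks above, LR annealing, can be wired into Keras with a callback (a minimal sketch; the base rate and decay factor are arbitrary):

```python
from keras.callbacks import LearningRateScheduler

def anneal(epoch):
    """Exponential annealing: lr = 1e-3 * 0.9^epoch."""
    return 1e-3 * (0.9 ** epoch)

lr_schedule = LearningRateScheduler(anneal)
# model.fit(data, labels, epochs=10, callbacks=[lr_schedule])
```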

Regularizers
- Dropout layers: force a neural network layer to generalize better instead of relying on a few specific sets of neurons during training.
- Kernel initializers: initialize the weight and bias variables using a specific distribution (e.g., normal).
- Early stopping: avoid overfitting by stopping training early.
- Data augmentation: introducing noise into the data can help develop robust models.
- L1 & L2 regularization: penalize the cost function with Lasso (L1) and/or Ridge (L2) terms to avoid under-/overfitting.
(Slide figure: a steep error surface.)
Several of these can be combined in a few lines of Keras, as sketched below.
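A hedged sketch (layer sizes and hyperparameters are my assumptions) combining dropout, an explicit kernel initializer, L2 weight penalties, and early stopping:

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.regularizers import l2
from keras.callbacks import EarlyStopping

model = Sequential()
model.add(Dense(300, activation='relu', input_dim=100,
                kernel_initializer='glorot_normal',   # normal-based init
                kernel_regularizer=l2(1e-4)))         # L2 (Ridge) penalty
model.add(Dropout(0.5))   # randomly silence half the units while training
model.add(Dense(4, activation='softmax'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

stop = EarlyStopping(monitor='val_loss', patience=3)
# model.fit(data, labels, validation_split=0.1, epochs=50, callbacks=[stop])
```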

TensorBoard Analysis
The TensorBoard utility helps us analyze the network by collecting statistics related to the graph, layers, gradients, etc. It also provides projection methods such as t-SNE to visualize high-dimensional data.
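In Keras, the statistics TensorBoard consumes can be logged with the built-in callback (a minimal sketch; the log directory is arbitrary):

```python
from keras.callbacks import TensorBoard

tb = TensorBoard(log_dir='./logs', histogram_freq=1, write_graph=True)
# model.fit(data, labels, validation_split=0.1, epochs=10, callbacks=[tb])
# Then inspect with:  tensorboard --logdir=./logs
```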