ICS 491 Big Data Analytics Fall 2017 Deep Learning


ICS 491 Big Data Analytics Fall 2017 Deep Learning Assoc. Prof. Lipyeow Lim Information & Computer Sciences University of Hawai`i at Manoa

Single Perceptron
x^T = [ x1 x2 … xn ], w^T = [ w1 w2 … wn ], h = a(x^T w)
If the activation function is rectified linear: a(i) = max(0, i); if sigmoid: a(i) = 1 / (1 + e^-i)
[Diagram: inputs x1 … xn, each weighted by w1 … wn, feed a single activation unit a that outputs h]
Bias can be incorporated either as an additional weight with a constant 1 input, or explicitly: h = a(x^T w + b)
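These formulas can be tried directly; the following is a minimal NumPy sketch of the slide's perceptron, where the input values, weights, and bias are made up purely for illustration:

# Minimal sketch of a single perceptron, assuming NumPy; the example
# input x, weights w, and bias b are made up for illustration.
import numpy as np

x = np.array([0.5, -1.0, 2.0])   # input vector
w = np.array([0.1, 0.4, -0.3])   # weight vector
b = 0.2                          # explicit bias term

relu = lambda i: np.maximum(0, i)          # a(i) = max(0, i)
sigmoid = lambda i: 1 / (1 + np.exp(-i))   # a(i) = 1 / (1 + e^-i)

h_relu = relu(x @ w + b)         # h = a(x^T w + b) with ReLU activation
h_sigmoid = sigmoid(x @ w + b)   # same pre-activation, sigmoid output
print(h_relu, h_sigmoid)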

Multiple Perceptrons
x^T = [ x1 x2 … xn ]
W is the n×k weight matrix whose entry w_{j,i} connects input x_i to perceptron j
Vector h = a(x^T W), where the activation function is applied to each element of the vector
[Diagram: every input x1 … xn feeds every activation unit a1 … ak, producing outputs h1 … hk]
For notational convenience, it is also possible to formulate this as h = a(W^T x)

Multi-Layer Perceptron
x^T = [ x1 x2 … xn ]; W1 is the n×k weight matrix of the first layer and W2 the weight matrix of the second layer
h1 = a(x^T W1), h2 = a(h1^T W2)
[Diagram: the inputs x feed the first layer of activation units (weights W1) to produce the hidden vector h1, which feeds the second layer (weights W2) to produce h2]
The hidden vector h1 is a compressed representation of the input.
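A minimal NumPy sketch of this two-layer forward pass; the layer sizes and random weights are illustrative, not from the slides:

# Forward pass of a two-layer perceptron, assuming NumPy and sigmoid
# activations; sizes, weights, and input are illustrative.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

n, k, m = 4, 3, 2                 # input, hidden, and output sizes (illustrative)
x  = rng.normal(size=n)           # input vector
W1 = rng.normal(size=(n, k))      # first-layer weights
W2 = rng.normal(size=(k, m))      # second-layer weights

h1 = sigmoid(x @ W1)   # h1 = a(x^T W1): compressed representation
h2 = sigmoid(h1 @ W2)  # h2 = a(h1^T W2): second-layer output
print(h1, h2)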

Recurrent Neural Networks
Model time-series data or data with sequential dependencies (e.g., text/language).
Feed the previous “state of the network” as an additional input to the current network.
Unroll the network over time.

RNN Equations
[Figure: the RNN update equations, with the weight matrices labeled as regular NN (input-to-hidden) weights, recurrent loop (hidden-to-hidden) weights, and hidden-to-output weights]
What happens when you do gradient descent and backpropagation?
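The slide's figure is not reproduced in the transcript; the sketch below assumes the standard (Elman-style) RNN formulation with tanh activations, which is what the labels above normally refer to. All sizes and values are illustrative:

# Minimal sketch of a simple RNN unrolled over time, assuming NumPy and
# tanh activations; sizes and the random input sequence are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, T = 3, 4, 2, 5

W_x = rng.normal(0, 0.1, (n_in, n_hid))   # regular NN (input-to-hidden) weights
W_h = rng.normal(0, 0.1, (n_hid, n_hid))  # recurrent loop (hidden-to-hidden) weights
W_y = rng.normal(0, 0.1, (n_hid, n_out))  # hidden-to-output weights

xs = rng.normal(size=(T, n_in))           # a toy input sequence
h = np.zeros(n_hid)                       # initial "state of the network"
for x_t in xs:
    h = np.tanh(x_t @ W_x + h @ W_h)      # h_t = a(W_x x_t + W_h h_{t-1})
    y = h @ W_y                           # y_t = W_y h_t
    print(y)
# Backpropagation through time repeatedly multiplies gradients by W_h,
# which is why they tend to vanish or explode over long sequences.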

RNN to LSTM
Persisting “memory” this way with backpropagation is problematic: gradients flowing through long chains of recurrent multiplications tend to vanish or explode.
Adapted from: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Long Short Term Memory (LSTM)

Cell States & Memory Gates

Step 1: Forget Gate Layer

Step 2: Compute new Cell State

Step 3: How much of cell state to propagate

Step 4: Compute output of this time step
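The step-by-step figures are not reproduced in the transcript; the following minimal NumPy sketch of one LSTM time step follows the standard formulation from the Understanding LSTMs post cited earlier, with the four steps marked in comments. Shapes and values are illustrative:

# Minimal sketch of a single LSTM time step, assuming NumPy and the
# standard LSTM formulation; all sizes and values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

n_in, n_hid = 3, 4
x_t    = rng.normal(size=n_in)     # input at this time step
h_prev = np.zeros(n_hid)           # previous hidden state (output)
c_prev = np.zeros(n_hid)           # previous cell state (memory)
z = np.concatenate([h_prev, x_t])  # every gate looks at [h_{t-1}, x_t]

# One weight matrix per gate / candidate (illustrative initialization, biases omitted)
W_f, W_i, W_c, W_o = (rng.normal(0, 0.1, (n_hid + n_in, n_hid)) for _ in range(4))

f_t = sigmoid(z @ W_f)              # Step 1: forget gate layer
i_t = sigmoid(z @ W_i)              # Step 2: input gate ...
c_tilde = np.tanh(z @ W_c)          #         ... and candidate cell values
c_t = f_t * c_prev + i_t * c_tilde  #         new cell state
o_t = sigmoid(z @ W_o)              # Step 3: how much of the cell state to propagate
h_t = o_t * np.tanh(c_t)            # Step 4: output of this time step
print(h_t, c_t)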

Variant: Peephole Connections

Variant: Couple Forget & Input Gates Together

Variant: Gated Recurrent Units (GRU)
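For comparison, a sketch of one GRU time step in the same style; this follows the standard GRU formulation (update and reset gates, no separate cell state) rather than anything specific to the slides, and all sizes and values are illustrative:

# Minimal sketch of one GRU time step (standard formulation), assuming NumPy.
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

n_in, n_hid = 3, 4
x_t    = rng.normal(size=n_in)
h_prev = np.zeros(n_hid)

W_z, W_r, W_h = (rng.normal(0, 0.1, (n_hid + n_in, n_hid)) for _ in range(3))

z_t = sigmoid(np.concatenate([h_prev, x_t]) @ W_z)             # update gate (plays the role of coupled forget/input)
r_t = sigmoid(np.concatenate([h_prev, x_t]) @ W_r)             # reset gate
h_tilde = np.tanh(np.concatenate([r_t * h_prev, x_t]) @ W_h)   # candidate state
h_t = (1 - z_t) * h_prev + z_t * h_tilde                       # hidden state doubles as the memory
print(h_t)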

Unsupervised Learning with Auto-encoders
Objective is to minimize reconstruction error.
[Diagram: input data → compressed representation → reconstruction]
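A minimal NumPy sketch of the autoencoder objective, showing only the forward pass (encode, decode) and the reconstruction error to be minimized; sizes, weights, and data are illustrative:

# Minimal sketch of an autoencoder forward pass and its reconstruction
# error, assuming NumPy; sizes, weights, and data are illustrative.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

X = rng.normal(size=(100, 8))          # input data (100 examples, 8 features)
W_enc = rng.normal(0, 0.1, (8, 3))     # encoder: 8 -> 3 (compressed representation)
W_dec = rng.normal(0, 0.1, (3, 8))     # decoder: 3 -> 8 (reconstruction)

code  = sigmoid(X @ W_enc)             # compressed representation
X_hat = code @ W_dec                   # reconstruction
reconstruction_error = np.mean((X - X_hat) ** 2)   # objective to minimize
print(reconstruction_error)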

Auto-encoders for initializing deep networks
Pretraining step: train a sequence of shallow autoencoders, greedily one layer at a time, using unsupervised data.
Fine-tuning step 1: train the last layer using supervised data.
Fine-tuning step 2: use backpropagation to fine-tune the entire network using supervised data.
The reason is that training very deep neural networks directly is difficult:
The magnitudes of the gradients in the lower layers and in the higher layers differ,
The landscape (curvature) of the objective function makes it hard for stochastic gradient descent to find a good local optimum,
Deep networks have many parameters, so they can memorize the training data and fail to generalize.
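A minimal sketch of the greedy layer-wise procedure described above, assuming plain NumPy, sigmoid encoders, and squared-error reconstruction loss; the layer sizes, learning rate, and toy data are all illustrative:

# Greedy layer-wise pretraining with shallow autoencoders (minimal sketch).
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=50):
    """Train a one-hidden-layer autoencoder by gradient descent; return the encoder weights."""
    n_in = X.shape[1]
    W_enc = rng.normal(0, 0.1, (n_in, n_hidden))
    W_dec = rng.normal(0, 0.1, (n_hidden, n_in))
    for _ in range(epochs):
        H = sigmoid(X @ W_enc)               # compressed representation
        err = H @ W_dec - X                  # reconstruction error
        # Gradients of 0.5 * ||reconstruction - X||^2 w.r.t. the two weight matrices
        grad_dec = H.T @ err
        grad_enc = X.T @ ((err @ W_dec.T) * H * (1 - H))
        W_dec -= lr * grad_dec / len(X)
        W_enc -= lr * grad_enc / len(X)
    return W_enc

# Pretraining step: stack encoders greedily, one layer at a time, on unlabeled data.
X = rng.normal(size=(256, 20))               # toy unsupervised data
sizes, encoders, layer_input = [12, 6], [], X
for n_hidden in sizes:
    W = train_autoencoder(layer_input, n_hidden)
    encoders.append(W)
    layer_input = sigmoid(layer_input @ W)   # feed this layer's codes to the next autoencoder

# Fine-tuning step 1 would train a supervised output layer on top of `layer_input`;
# fine-tuning step 2 would backpropagate through the whole stack using supervised data.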

Parallel Processing for Neural Networks
Model Parallelism: the neural network model is too big for one machine, so partition the model across multiple machines. What kind of co-ordination is needed?
Data Parallelism: partition the data across multiple machines.
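As one answer to the coordination question for data parallelism, here is a minimal sketch of synchronous gradient averaging on a toy linear model; the data, model, and learning rate are illustrative, and the "workers" are simulated by a loop on a single machine:

# Minimal sketch of synchronous data parallelism: each worker computes a
# gradient on its own shard of the data, and the gradients are averaged
# before one shared parameter update. Assumes NumPy and a toy linear model.
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 5)), rng.normal(size=1000)   # toy data
w = np.zeros(5)                                            # shared model parameters
shards = np.array_split(np.arange(len(X)), 4)              # partition data across 4 "machines"

for step in range(100):
    grads = []
    for idx in shards:                            # in reality, these run on separate machines
        err = X[idx] @ w - y[idx]
        grads.append(X[idx].T @ err / len(idx))   # local gradient of the squared error
    w -= 0.1 * np.mean(grads, axis=0)             # coordination: average gradients, update once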

Model + Data Parallelism