ICS 491 Big Data Analytics, Fall 2017
Deep Learning
Assoc. Prof. Lipyeow Lim
Information & Computer Sciences, University of Hawai`i at Manoa
Single Perceptron
$x^T = [\,x_1\ x_2\ \dots\ x_n\,]$, $w^T = [\,w_1\ w_2\ \dots\ w_n\,]$
$h = a(x^T w)$
If the activation function is rectified linear: $a(i) = \max(0, i)$. If sigmoid: $a(i) = 1/(1 + e^{-i})$.
[Figure: a single perceptron with inputs $x_1, \dots, x_n$, weights $w_1, \dots, w_n$, activation $a$, and output $h$.]
Bias can be incorporated either as an additional weight with a constant 1 input, or explicitly: $h = a(x^T w + b)$.
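A minimal sketch of the formulas above in NumPy (the input and weight values are made up for illustration):

```python
import numpy as np

def perceptron(x, w, b=0.0, activation="relu"):
    """Single perceptron: h = a(x^T w + b)."""
    z = x @ w + b                      # pre-activation x^T w + b
    if activation == "relu":           # rectified linear: max(0, z)
        return np.maximum(0.0, z)
    if activation == "sigmoid":        # sigmoid: 1 / (1 + e^{-z})
        return 1.0 / (1.0 + np.exp(-z))
    raise ValueError("unknown activation")

# Example with made-up values
x = np.array([1.0, -2.0, 0.5])
w = np.array([0.3, 0.1, -0.4])
print(perceptron(x, w, b=0.1, activation="sigmoid"))
```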
Multiple Perceptrons
$x^T = [\,x_1\ x_2\ \dots\ x_n\,]$
$W = \begin{bmatrix} w_{1,1} & \cdots & w_{k,1} \\ \vdots & & \vdots \\ w_{1,n} & \cdots & w_{k,n} \end{bmatrix}$
Vector $h = a(x^T W)$, where the activation function is applied to each element of the vector.
[Figure: $k$ perceptrons sharing inputs $x_1, \dots, x_n$; perceptron $j$ has weights $w_{j,1}, \dots, w_{j,n}$, activation $a_j$, and output $h_j$.]
For notational convenience, it is also possible to formulate this as $h = a(W^T x)$.
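A sketch of one such layer in NumPy (the sizes are made up; the rectified linear activation is one of the choices from the previous slide):

```python
import numpy as np

def relu(z):                      # rectified linear activation, applied elementwise
    return np.maximum(0.0, z)

def layer(x, W):
    """One layer of k perceptrons: h = a(x^T W)."""
    return relu(x @ W)

# Example with made-up sizes: n = 3 inputs, k = 2 perceptrons
x = np.array([1.0, -2.0, 0.5])
W = np.random.randn(3, 2)
print(layer(x, W))                # vector h with k = 2 elements
```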
Multiple-Layer Perceptron
$x^T = [\,x_1\ x_2\ \dots\ x_n\,]$
$W_1 = \begin{bmatrix} w_{1,1,1} & \cdots & w_{1,k,1} \\ w_{1,1,2} & \cdots & w_{1,k,2} \\ \vdots & & \vdots \\ w_{1,1,n} & \cdots & w_{1,k,n} \end{bmatrix}$
$h_1 = a(x^T W_1)$
$h_2 = a(h_1^T W_2)$
[Figure: a two-layer network; layer 1 maps $x$ through $W_1$ and activation $a$ to $h_1$, layer 2 maps $h_1$ through $W_2$ and $a$ to $h_2$. Compressed representation: $x \to W_1, a \to h_1 \to W_2, a \to h_2$.]
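A minimal two-layer forward pass in NumPy, mirroring $h_1 = a(x^T W_1)$ and $h_2 = a(h_1^T W_2)$ (the sizes and the ReLU choice are illustrative assumptions):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def mlp_forward(x, W1, W2):
    """Two-layer perceptron: h1 = a(x^T W1), h2 = a(h1^T W2)."""
    h1 = relu(x @ W1)   # first hidden layer
    h2 = relu(h1 @ W2)  # second layer (network output)
    return h2

# Example with made-up sizes: n = 4 inputs, k = 3 hidden units, 2 outputs
x = np.random.randn(4)
W1 = np.random.randn(4, 3)
W2 = np.random.randn(3, 2)
print(mlp_forward(x, W1, W2))
```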
Recurrent Neural Networks
Unroll over time.
For modeling time-series data or data with sequential dependencies (e.g., text/language).
Feed the previous "state of the network" as an additional input to the current network.
RNN Equations
[Figure: the RNN update equations, with three sets of weights labeled: the regular NN (input-to-hidden) weights, the recurrent loop (hidden-to-hidden) weights, and the hidden-to-output weights.]
What happens when you do gradient descent and backpropagation?
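The labels above presumably refer to a standard vanilla-RNN formulation such as $h_t = a(W_x^T x_t + W_h^T h_{t-1})$, $y_t = W_y^T h_t$; the symbols $W_x$, $W_h$, $W_y$ are my naming for the three weight sets, not the slide's. A minimal NumPy sketch of the unrolled forward pass:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, W_y):
    """One vanilla RNN time step.
    W_x: regular NN (input-to-hidden) weights
    W_h: recurrent loop (hidden-to-hidden) weights
    W_y: hidden-to-output weights
    """
    h_t = np.tanh(x_t @ W_x + h_prev @ W_h)   # new hidden state (tanh as the activation a)
    y_t = h_t @ W_y                           # output at this time step
    return h_t, y_t

# Unroll over a toy sequence (made-up sizes: 3 inputs, 5 hidden units, 2 outputs)
W_x, W_h, W_y = np.random.randn(3, 5), np.random.randn(5, 5), np.random.randn(5, 2)
h = np.zeros(5)
for x_t in np.random.randn(4, 3):             # sequence of length 4
    h, y = rnn_step(x_t, h, W_x, W_h, W_y)
```

Backpropagation through time multiplies gradients through the recurrent loop weights at every step, which is what makes long-range dependencies hard to learn and motivates the LSTM on the following slides.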
RNN to LSTM
Persisting "memory" this way is problematic when training with backpropagation: gradients flowing through the recurrent loop tend to vanish or explode over long sequences.
Adapted from: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Long Short Term Memory (LSTM)
Cell States & Memory Gates
Step 1: Forget Gate Layer
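In the standard formulation from the cited Colah post (the symbols below follow that post; they are not text from this slide), the forget gate layer looks at $h_{t-1}$ and $x_t$ and outputs a value between 0 and 1 for each entry of the cell state $C_{t-1}$:
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$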
Step 2: Compute new Cell State
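This step presumably covers the input gate, the candidate cell state, and the update, which in the same formulation are:
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$
$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$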
Step 3: How much of cell state to propagate
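In the same formulation, an output gate decides how much of the cell state to propagate:
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$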
Step 4: Compute output of this time step
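The output of the time step is then the gated, squashed cell state: $h_t = o_t * \tanh(C_t)$. Putting steps 1-4 together, a minimal NumPy sketch of one LSTM time step (weight shapes and variable names are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W_f, W_i, W_C, W_o, b_f, b_i, b_C, b_o):
    """One LSTM time step, following the four steps above."""
    z = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)            # Step 1: forget gate
    i_t = sigmoid(W_i @ z + b_i)            # Step 2: input gate ...
    C_tilde = np.tanh(W_C @ z + b_C)        # ... candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde      # ... new cell state
    o_t = sigmoid(W_o @ z + b_o)            # Step 3: how much of the cell state to propagate
    h_t = o_t * np.tanh(C_t)                # Step 4: output of this time step
    return h_t, C_t

# Toy run with made-up sizes: 3 inputs, 4 hidden units
n_in, n_h = 3, 4
shape = (n_h, n_h + n_in)
W_f, W_i, W_C, W_o = (np.random.randn(*shape) for _ in range(4))
b_f, b_i, b_C, b_o = (np.zeros(n_h) for _ in range(4))
h, C = np.zeros(n_h), np.zeros(n_h)
for x_t in np.random.randn(5, n_in):        # sequence of length 5
    h, C = lstm_step(x_t, h, C, W_f, W_i, W_C, W_o, b_f, b_i, b_C, b_o)
```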
Variant : Peep Hole Connections
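In the peephole variant (as described in the cited post; these equations are not text from this slide), the gate layers also look at the cell state:
$f_t = \sigma(W_f \cdot [C_{t-1}, h_{t-1}, x_t] + b_f)$
$i_t = \sigma(W_i \cdot [C_{t-1}, h_{t-1}, x_t] + b_i)$
$o_t = \sigma(W_o \cdot [C_t, h_{t-1}, x_t] + b_o)$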
Variant: couple forget & input together
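Coupling the forget and input gates means new information is written exactly where old information is forgotten; in the same formulation:
$C_t = f_t * C_{t-1} + (1 - f_t) * \tilde{C}_t$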
Variant: Gated Recurrent Units (GRU)
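The GRU merges the cell state and hidden state and uses update and reset gates; the standard equations (again following the cited post, not text from this slide) are:
$z_t = \sigma(W_z \cdot [h_{t-1}, x_t])$
$r_t = \sigma(W_r \cdot [h_{t-1}, x_t])$
$\tilde{h}_t = \tanh(W \cdot [r_t * h_{t-1}, x_t])$
$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t$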
Unsupervised Learning with Auto-encoders
Objective is to minimize reconstruction error.
[Figure: the input data is encoded into a compressed representation, which is then decoded into a reconstruction of the input.]
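A minimal sketch of the idea: a one-hidden-layer autoencoder trained by gradient descent on made-up data (the sizes, learning rate, and activations are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))              # made-up input data: 100 samples, 8 features
n, k = X.shape[1], 3                       # compressed representation has k = 3 units

W1, b1 = rng.normal(scale=0.1, size=(n, k)), np.zeros(k)   # encoder
W2, b2 = rng.normal(scale=0.1, size=(k, n)), np.zeros(n)   # decoder

lr = 0.05
for step in range(500):
    H = np.tanh(X @ W1 + b1)               # compressed representation
    X_hat = H @ W2 + b2                    # reconstruction
    err = X_hat - X
    loss = (err ** 2).mean()               # reconstruction error to minimize

    # Backpropagation of the squared-error objective
    d_out = 2 * err / err.size
    dW2, db2 = H.T @ d_out, d_out.sum(0)
    dH = d_out @ W2.T
    dZ = dH * (1 - H ** 2)                 # tanh derivative
    dW1, db1 = X.T @ dZ, dZ.sum(0)

    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= lr * g                        # gradient descent update
```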
Auto-encoder for initializing deep networks
Pretraining step: train a sequence of shallow autoencoders, greedily one layer at a time, using unsupervised data (sketched below).
Fine-tuning step 1: train the last layer using supervised data.
Fine-tuning step 2: use backpropagation to fine-tune the entire network using supervised data.
The reason is that training very deep neural networks is difficult:
- The magnitudes of gradients in the lower layers and in the higher layers are different.
- The landscape or curvature of the objective function makes it difficult for stochastic gradient descent to find a good local optimum.
- Deep networks have many parameters, so they can memorize the training data and fail to generalize.
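A schematic of the greedy pretraining recipe, reusing the kind of shallow autoencoder sketched on the previous slide (data, layer sizes, and the stopping criterion are made up; this illustrates the procedure, not the course's reference implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder(X, k, lr=0.05, steps=500):
    """Train one shallow autoencoder on X; return (encoder params, encoded data)."""
    n = X.shape[1]
    W1, b1 = rng.normal(scale=0.1, size=(n, k)), np.zeros(k)
    W2, b2 = rng.normal(scale=0.1, size=(k, n)), np.zeros(n)
    for _ in range(steps):
        H = np.tanh(X @ W1 + b1)
        err = (H @ W2 + b2) - X                 # reconstruction error
        d_out = 2 * err / err.size
        dH = d_out @ W2.T
        dZ = dH * (1 - H ** 2)
        W2 -= lr * (H.T @ d_out); b2 -= lr * d_out.sum(0)
        W1 -= lr * (X.T @ dZ);    b1 -= lr * dZ.sum(0)
    return (W1, b1), np.tanh(X @ W1 + b1)

# Made-up unsupervised data and layer sizes
X = rng.normal(size=(200, 16))
layers, H = [], X
for k in (8, 4):                                # pretraining: greedy, one layer at a time
    params, H = train_autoencoder(H, k)
    layers.append(params)

# Fine-tuning step 1 would train a supervised output layer on top of H;
# fine-tuning step 2 would backpropagate through all of `layers` using labels.
```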
Parallel Processing for Neural Networks
Model Parallelism
- The neural network model is too big for one machine.
- Partition the model across multiple machines.
- What kind of coordination is needed?
Data Parallelism
- Partition the data across multiple machines.
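A toy illustration of data parallelism, simulating workers that each compute a gradient on their partition of the data and then coordinate by averaging (a single-process sketch of the idea; a real system would run the workers on separate machines, and the linear model here is just for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 5)), rng.normal(size=1000)   # made-up data
w = np.zeros(5)                                            # shared model parameters

shards = np.array_split(np.arange(len(X)), 4)              # partition data across 4 "machines"

for step in range(100):
    grads = []
    for idx in shards:                                     # each worker: gradient on its shard
        Xi, yi = X[idx], y[idx]
        grads.append(2 * Xi.T @ (Xi @ w - yi) / len(idx))
    w -= 0.01 * np.mean(grads, axis=0)                     # coordination: average gradients, update model
```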
Model + Data Parallelism