Progress Report 2019/5/5 PHHung
Previously: An efficient ConvNet (deep learning) analytics engine
Deep learning engines with different models:
- HMAX — ASIC: 205 mW at 256x256 (Neocortical from 332, ISSCC 12)
- ConvNet — FPGA: 10 W at 500x375 (NeuFlow from NYU, LeCun, CVPR 11)
- ConvNet — ASIC: 580 mW at 500x375 (NeuFlow from Purdue & NYU, MWCAS 12)
- ConvNet — FPGA: 4 W (TeraDeep from Purdue, CVPR 14)
- Server
Given a ConvNet model
- Input: 46x46 image (RGB)
- Layer 1: spatial conv, 32 hidden nodes, 7x7 kernel (#weights: 32*7*7*3 + 32), followed by 2x2 spatial pooling
- Layer 2: spatial conv, 64 hidden nodes, 7x7 kernel (#weights: 64*7*7*32 + 64)
- Layer 3: spatial conv, 128 hidden nodes, 7x7 kernel (#weights: 128*7*7*64 + 128)
- Layer 4: fully connected, 128 hidden nodes (#weights: 128*128)
- Output: 2 outputs (person or not) (#weights: 2*128)
- Feature-map sizes: 46x46, 40x40, 20x20, 14x14, 7x7, 7x7
- Total #W = 523,328
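As a sanity check, the per-layer weight counts above can be reproduced with a few lines of Python (the layer labels are just illustrative):

```python
# Parameter count for the ConvNet described above; reproduces Total #W = 523,328.
layers = {
    "L1 conv 32 @ 7x7x3":   32 * 7 * 7 * 3 + 32,    # weights + biases
    "L2 conv 64 @ 7x7x32":  64 * 7 * 7 * 32 + 64,
    "L3 conv 128 @ 7x7x64": 128 * 7 * 7 * 64 + 128,
    "L4 fully connected":   128 * 128,
    "Output 2 x 128":       2 * 128,
}
for name, n in layers.items():
    print(f"{name}: {n}")
print("Total #W =", sum(layers.values()))   # 523328
```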
Result: Person or not?
Bottleneck
1. Too much memory: Total #W = 523,328, stored as IEEE 754 floating point (32 bits) => ~2.1 MB
2. Floating-point operations
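The memory figure follows directly from the weight count:

```python
# Weight storage if every weight is an IEEE 754 single-precision float.
total_weights = 523_328
bytes_total = total_weights * 4          # 32 bits = 4 bytes per weight
print(bytes_total / 1e6)                 # ~2.09 MB, i.e. the ~2.1 MB above
```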
Solution? Floating point -> fixed point. Unfortunately…
"Fixed-Point Feedforward Deep Neural Network Design Using Weights +1, 0, and -1" (SiPS 14)
1. Naive approach: quantize W directly
2. Quantize W directly, then run backpropagation to refine W
(illustrative figures: quantize only; quantize + BP refinement)
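A minimal sketch of the naive approach (option 1), quantizing trained weights directly to {-1, 0, +1}. The threshold rule here is an assumption for illustration; the SiPS 14 paper derives its quantization points from the trained weight statistics.

```python
import numpy as np

def quantize_ternary(w, threshold):
    # Map each weight to -1, 0, or +1 by comparing against a threshold.
    q = np.zeros_like(w)
    q[w > threshold] = 1.0
    q[w < -threshold] = -1.0
    return q

w = np.random.randn(32, 7, 7, 3).astype(np.float32)   # e.g. layer-1 weights
w_q = quantize_ternary(w, threshold=0.7 * np.abs(w).mean())  # heuristic threshold
```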
How about this!
1. Naive approach: quantize W directly (SiPS 14)
2. Quantize W directly, then run backpropagation to refine W (SiPS 14)
3. Add a quantization term to the loss function
   Loss function: $J(x) = \alpha\,|h(x)-y|^2$
   => Modified loss function: $J(x) = \alpha\,|h(x)-y|^2 + (1-\alpha)\,|w-q|^2$, where $q$ = nearest quantization bin
(illustrative figures: $\alpha = 0.9$ vs. $\alpha = 0.1$)
$$J(x) = \alpha\,|h(x)-y|^2 + (1-\alpha)\,|w-q|^2$$
$$\frac{\partial J(x)}{\partial w_j} = 2\alpha\,(h(x)-y)\,\frac{\partial h(x)}{\partial w_j} + (1-\alpha)\,\frac{\partial\,(w_j^2 - 2 w_j q + q^2)}{\partial w_j}$$
$$= 2\alpha\,(h(x)-y)\,\frac{\partial f\!\left(\sum_j w_j\, net_j\right)}{\partial w_j} + (1-\alpha)\,(2 w_j - 2q)$$
$$= 2\alpha\,(h(x)-y)\,h(x)\,(1-h(x))\, net_j + 2(1-\alpha)\,(w_j - q)$$
(assuming a sigmoid activation $f$, so $f' = f(1-f)$)
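A minimal NumPy sketch of this modified loss and its gradient, assuming a single sigmoid output unit and ternary bins {-1, 0, +1}; the names (net, h, q) mirror the slide and are illustrative, not the full model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nearest_bin(w, bins=np.array([-1.0, 0.0, 1.0])):
    # q = nearest quantization bin for each weight
    return bins[np.argmin(np.abs(w[..., None] - bins), axis=-1)]

def loss_and_grad(w, net, y, alpha=0.9):
    h = sigmoid(np.dot(w, net))              # h(x) = f(sum_j w_j * net_j)
    q = nearest_bin(w)
    J = alpha * (h - y) ** 2 + (1 - alpha) * np.sum((w - q) ** 2)
    # dJ/dw_j = 2*alpha*(h - y)*h*(1 - h)*net_j + 2*(1 - alpha)*(w_j - q_j)
    grad = 2 * alpha * (h - y) * h * (1 - h) * net + 2 * (1 - alpha) * (w - q)
    return J, grad

w = np.random.randn(8)
net = np.random.randn(8)
J, g = loss_and_grad(w, net, y=1.0, alpha=0.9)
w -= 0.1 * g                                  # one gradient-descent step
```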
Distributed Deep Networks? "Large Scale Distributed Deep Networks", NIPS 2012 (Google)
Another distributed deep network?
- Node (shallow): input layer → compressed layer for transmission → weak classifier @ node
- Server (deep): strong classifier @ server
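A sketch of this node/server split in PyTorch; the layer sizes and kernel choices are assumptions for illustration, not the actual design. The node runs a shallow net whose small "compressed" feature map is transmitted and also feeds a weak classifier, while the server stacks deeper layers for a strong classifier.

```python
import torch
import torch.nn as nn

class NodeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=7), nn.ReLU(),
            nn.Conv2d(8, 4, kernel_size=7), nn.ReLU(),   # compressed layer for transmit
        )
        self.weak_classifier = nn.Sequential(             # weak classifier @ node
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(4, 2)
        )

    def forward(self, x):
        z = self.features(x)                               # transmitted to the server
        return z, self.weak_classifier(z)

class ServerNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.deep = nn.Sequential(                         # strong classifier @ server
            nn.Conv2d(4, 64, kernel_size=7), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 2),
        )

    def forward(self, z):
        return self.deep(z)

x = torch.randn(1, 3, 46, 46)
z, weak_out = NodeNet()(x)
strong_out = ServerNet()(z)
```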
Some problems…
1. Setting classifier accuracy aside for a moment: can it actually compress the transmitted data?
2. How about the classifier accuracy @ server, especially with a compression layer in the path?
3. How about the classifier accuracy @ node? How many layers and nodes are "enough"?
Setting classifier accuracy aside: can it compress the transmitted data?
Node (shallow):
- #Input: $H \times W \times 3$ ($H$: height, $W$: width, 3 channels)
- #L1: $3 \times (H - K_{H1} + 1) \times (W - K_{W1} + 1) \times n_1$ ($K$: kernel size, $n$: number of hidden nodes per layer)
- #L2: $n_1 \times (H - K_{H1} - K_{H2} + 2) \times (W - K_{W1} - K_{W2} + 2) \times n_2$
How to choose $n$, $K$ so that #L2 < #Input? => lower $n$, bigger $K$
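A quick check of the "lower n, bigger K" rule using the counting convention above; the specific (n1, n2, K) combinations are illustrative, not proposed designs:

```python
# Compare the transmitted L2 feature volume against the raw input size.
def input_size(H, W):
    return H * W * 3

def l2_size(H, W, K1, K2, n1, n2):
    return n1 * (H - K1 - K2 + 2) * (W - K1 - K2 + 2) * n2

H = W = 46
for n1, n2, K in [(16, 8, 5), (8, 4, 7), (4, 2, 11)]:
    ratio = l2_size(H, W, K, K, n1, n2) / input_size(H, W)
    print(f"n1={n1}, n2={n2}, K={K}: L2/Input = {ratio:.2f}")
# Only the smallest n with the largest K drives the ratio below 1 (compression).
```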
How about the classifier accuracy @ server?
(Plot: accuracy as #node2 is squeezed from 64 → 16 → 8 → 4 → 2; curves labeled by #hidden node1_#hidden node2_…)
Conclusion
- A distributed model for node / server
- A quantization approach to minimize computation
- A chip (brain) for the sensor?
About Vivotek: What to do? How long? Basic neural network?