
1 Why it is Called TensorFlow; Parallelism in ANNs; Project Ideas and Discussion; Glenn Fung Presents Batch Renormalization Paper

2 ML’s Dirty Little Secret
We, in the aggregate, ran many different versions of Deep ML on our images, reporting the best test-set results we found
Would it be valid to report this in a paper?
Should now really get a FRESH set of examples and test our best model(s) on them!
Should also use more image categories

3 High-Level Lab3 Overview
Forward prop: look at the PREV layer’s activations to decide one’s own activations
Compute ‘deviations’: look at the NEXT layer’s deviations and do a weighted sum (the output layer looks at the “teacher’s answers” instead)
Calc gradients: look at the prev layer’s activations and multiply by one’s own deviations
Wgt updates: apply one’s own gradients (all four steps are sketched below)
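To make those four steps concrete, here is a minimal sketch for ONE fully connected layer with sigmoid activations. This is an illustrative assumption, not Lab3’s actual code; all names (wgts, prevActs, deviations, eta) are made up for the example.

// Minimal sketch of the four steps for ONE fully connected layer.
// 'Deviations' here means dError/dWeightedSum; eta is the learning rate.
class LayerSketch {
  double[][] wgts;   // [numPrevNodes][numCurrNodes]
  double[]   biases; // [numCurrNodes]

  // Forward prop: use the PREV layer's activations to compute one's own.
  double[] forward(double[] prevActs) {
    double[] acts = new double[biases.length];
    for (int j = 0; j < acts.length; j++) {
      double sum = biases[j];
      for (int i = 0; i < prevActs.length; i++) sum += prevActs[i] * wgts[i][j];
      acts[j] = 1.0 / (1.0 + Math.exp(-sum)); // sigmoid
    }
    return acts;
  }

  // Deviations: weighted sum of the NEXT layer's deviations,
  // times the derivative of this layer's sigmoid.
  double[] deviations(double[] acts, double[][] nextWgts, double[] nextDevs) {
    double[] devs = new double[acts.length];
    for (int j = 0; j < devs.length; j++) {
      double sum = 0.0;
      for (int k = 0; k < nextDevs.length; k++) sum += nextWgts[j][k] * nextDevs[k];
      devs[j] = sum * acts[j] * (1.0 - acts[j]);
    }
    return devs;
  }

  // Gradients and weight updates: prev layer's activations times
  // one's own deviations, scaled by the learning rate eta.
  void update(double[] prevActs, double[] devs, double eta) {
    for (int i = 0; i < prevActs.length; i++)
      for (int j = 0; j < devs.length; j++)
        wgts[i][j] -= eta * prevActs[i] * devs[j];
    for (int j = 0; j < devs.length; j++) biases[j] -= eta * devs[j];
  }
}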

4 Lab 3 Quick Questions? (More Q/A at Breaks and End)

5 What is a Tensor?
Vectors: 1D
Matrices: 2D
Tensors: 3D or more
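To make the ranks concrete, a quick Java illustration (the shapes are arbitrary):

double[]       vector  = new double[5];           // 1D: a list of 5 numbers
double[][]     matrix  = new double[5][4];        // 2D: a 5 x 4 grid
double[][][]   tensor3 = new double[5][4][3];     // 3D tensor
double[][][][] tensor4 = new double[5][4][3][2];  // 4D tensor, like the weight arrays on the next slide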

6 Why it is Called TensorFlow (one possible design)
double[][][][] wgts_convLayer1 = new double[colorsPerPixel][platesConv1][kernelSizeConv1][kernelSizeConv1];
double[][][][] wgts_convLayer2 = new double[platesPool1][platesConv2][kernelSizeConv2][kernelSizeConv2];
double[][][][] wgts_flatHUs    = new double[platesPool2][pool2_output_imageSize][pool2_output_imageSize][numberOfFlatHUs]; // pool2's plates feed the flat HUs
double[][][][] wgts_outputs    = new double[numberOfFlatHUs][1][1][numberOfOutputUnits];
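As a hedged illustration of how one of these 4D tensors might be used, here is the weighted sum for a single conv-layer-1 unit at plate p and image position (x, y). The image array and loop bounds are assumptions for the example, not Lab3 code:

// Weighted sum for ONE unit of conv layer 1, at plate p, position (x, y).
double sum = 0.0;
for (int c = 0; c < colorsPerPixel; c++)
  for (int i = 0; i < kernelSizeConv1; i++)
    for (int j = 0; j < kernelSizeConv1; j++)
      sum += image[c][x + i][y + j] * wgts_convLayer1[c][p][i][j];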

7 Matrix Multiply between Layer i (M nodes) and Layer i+1 (N nodes)
[Figure: an N x M WEIGHTS matrix multiplied by the M-vector of Layer i ACTIVATIONS yields the N-vector of weighted sums feeding Layer i+1.]
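In code, this picture is just a matrix-vector product; a minimal sketch (names assumed, not from Lab3):

// weights is N x M; acts holds Layer i's M activations.
// Returns the N weighted sums feeding Layer i+1.
double[] weightedSums(double[][] weights, double[] acts) {
  int N = weights.length, M = acts.length;
  double[] sums = new double[N];
  for (int n = 0; n < N; n++)
    for (int m = 0; m < M; m++)
      sums[n] += weights[n][m] * acts[m];
  return sums;
}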

8 Lots of Possible Parallelism
Do a Bunch of Folds, Parameter Settings, Algos, Datasets, etc (Trivial Parallelism)
Do a Batch of Examples
Activate all Nodes at Layer i (see the parallel sketch after this list)
Do all Items in One Node’s Weighted Sum
Compute ‘Deviations’ for all Nodes at Layer i
Compute all Gradients
Update all Weights
???
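As one example of exploiting this parallelism, here is a minimal sketch that activates all nodes at Layer i in parallel with Java streams; the names are assumptions, not Lab3’s. Each node writes only its own slot of acts, so the iterations are independent and safe to run concurrently.

import java.util.stream.IntStream;

double[] forwardParallel(double[][] weights, double[] prevActs) {
  int N = weights.length;
  double[] acts = new double[N];
  IntStream.range(0, N).parallel().forEach(n -> {
    double sum = 0.0;
    for (int m = 0; m < prevActs.length; m++) sum += weights[n][m] * prevActs[m];
    acts[n] = 1.0 / (1.0 + Math.exp(-sum)); // sigmoid activation
  });
  return acts;
}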

9 GPUs (http://www.nvidia.com/object/what-is-gpu-computing.html)
Neural Chips being Developed by Nvidia, Google (in Madison!), Intel, etc

10 Projects
Can use any programming language (might need to learn some Python; I used free, hrs of effort)
BE SURE TO HAVE YOUR DATA (or simulator) ASAP (ie, find, don’t create)
First get something simple working end-to-end, then add complexity as time permits
Need not be novel; one can learn a lot from reimplementing a micro version of a successful Deep ML program
Need not use the cloud, nor ‘commercial’ s/w (can run TensorFlow on your desktop/laptop)
Aim to have an experimental control

11 From Moodle’s Page for Turning in Project Reports
They should be only two pages long (use 12-pt font)
Be sure to list all project members (all should turn in the same proposal and report)
Discuss (a) the task to be addressed, (b) the data used, (c) from where you will get cpu cycles, (d) any existing s/w you plan to use, and (e) the experimental methodology planned for evaluation

12 More on Projects
Aim to do more than ‘download code and data, run in cloud’
In such cases, do interesting, extensive experimentation
Do aim to learn how to use TensorFlow, etc
CS 838 students especially should aim to write/extend some Deep ML code
Not a lot of time left in the semester though …
After spring break: some project proposal presentations and progress reports
Might have poster session(s) instead of final reports (when?)

13 Preferences for DL Packages?
MXNet (Amazon)?
Paddle (Baidu)?
Torch (Facebook)?
TensorFlow (Google)?
PowerAI (IBM)?
CNTK (Microsoft)?
Caffe (UC Berkeley)?
Theano (U. Montreal)?
Other? (Keras, a Python lib on top of Theano & TensorFlow?)

14 Preferences for Cloud S/W? Needs to offer free student accounts
Amazon? Google? IBM? Microsoft? Other?

15 Some Project Directions
Give ‘advice’ to Deep ANNs: use ‘domain knowledge’ and not just data – Knowledge-Based ANNs (the KBANN algo)
Explain what a Deep ANN learned: ‘rule extraction’ (the Trepan algo)
Generative Adversarial Networks (GANs)
Given an image, generate a description (captioning)
Deep RL; LSTMs, recurrent links
Transfer Learning (TL)
Chatbots

16 Deep Visual-Semantic Alignments for Generating Image Descriptions
"two young girls are playing with lego toy." "a young boy is holding a baseball bat."

17 Deep Reinforcement Learning: Pong from Pixels
Testbed available? Might need to use my ‘Agent World’

18 GANs (NIPS ‘16 tutorial: https://arxiv.org/abs/1701.00160)

19 Understanding What a Deep Net Learned
Implement method(s) for visualizing a trained Deep ANN
‘Rule extraction’ (code available via this page)
Use some trained networks available on the web

20 Potential Benefits of Transfer Learning
[Figure: performance on the target task vs. amount of training; the ‘with transfer’ curve shows a higher start, a steeper slope, and a higher asymptote than the ‘without transfer’ curve.]

21 On-Line Data Sets (last updated 2014?) Maybe check
Aim to use REAL data (not artificially generated – only good for debugging)
Lots of stuff here:

22 Additional Questions? Suggestions?

