
1 Why it is Called TensorFlow; Parallelism in ANNs; Project Ideas and Discussion; Glenn Fung Presents Batch Renormalization Paper

2 ML’s Dirty Little Secret
We, in the aggregate, ran many different versions of Deep ML on our images, reporting the best test-set results we found
Would it be valid to report this in a paper?
Should now really get a FRESH set of examples and test our best model(s) on them!
Should also use more image categories

3 High-Level Lab3 Overview
Forward prop: look at the PREV layer’s activations to decide one’s own activations
Compute ‘deviations’: look at the NEXT layer’s deviations and do a weighted sum (the output layer looks at the “teacher’s answers” instead)
Calc gradients: look at the prev layer’s activations and multiply by one’s own deviations
Wgt updates: apply one’s own gradients (all four steps are sketched below)
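To make those four steps concrete, here is a minimal sketch for ONE fully connected layer with sigmoid activations. This is an illustrative assumption, not Lab3’s actual code; all names (wgts, prevActs, deviations, eta) are made up for the example.

// Minimal sketch of the four steps for ONE fully connected layer.
// 'Deviations' here means dError/dWeightedSum; eta is the learning rate.
class LayerSketch {
  double[][] wgts;   // [numPrevNodes][numCurrNodes]
  double[]   biases; // [numCurrNodes]

  // Forward prop: use the PREV layer's activations to compute one's own.
  double[] forward(double[] prevActs) {
    double[] acts = new double[biases.length];
    for (int j = 0; j < acts.length; j++) {
      double sum = biases[j];
      for (int i = 0; i < prevActs.length; i++) sum += prevActs[i] * wgts[i][j];
      acts[j] = 1.0 / (1.0 + Math.exp(-sum)); // sigmoid
    }
    return acts;
  }

  // Deviations: weighted sum of the NEXT layer's deviations,
  // times the derivative of this layer's sigmoid.
  double[] deviations(double[] acts, double[][] nextWgts, double[] nextDevs) {
    double[] devs = new double[acts.length];
    for (int j = 0; j < devs.length; j++) {
      double sum = 0.0;
      for (int k = 0; k < nextDevs.length; k++) sum += nextWgts[j][k] * nextDevs[k];
      devs[j] = sum * acts[j] * (1.0 - acts[j]);
    }
    return devs;
  }

  // Gradients and weight updates: prev layer's activations times
  // one's own deviations, scaled by the learning rate eta.
  void update(double[] prevActs, double[] devs, double eta) {
    for (int i = 0; i < prevActs.length; i++)
      for (int j = 0; j < devs.length; j++)
        wgts[i][j] -= eta * prevActs[i] * devs[j];
    for (int j = 0; j < devs.length; j++) biases[j] -= eta * devs[j];
  }
}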

4 Lab 3 Quick Questions? (More Q/A at Breaks and End)

5 What is a Tensor?
Vectors: 1D
Matrices: 2D
Tensors: 3D or more
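To make the ranks concrete, a quick Java illustration (the shapes are arbitrary):

double[]       vector  = new double[5];           // 1D: a list of 5 numbers
double[][]     matrix  = new double[5][4];        // 2D: a 5 x 4 grid
double[][][]   tensor3 = new double[5][4][3];     // 3D tensor
double[][][][] tensor4 = new double[5][4][3][2];  // 4D tensor, like the weight arrays on the next slide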

6 Why it is Called TensorFlow (one possible design)
double[][][][] wgts_convLayer1 = new double[colorsPerPixel][platesConv1][kernelSizeConv1][kernelSizeConv1];
double[][][][] wgts_convLayer2 = new double[platesPool1][platesConv2][kernelSizeConv2][kernelSizeConv2];
double[][][][] wgts_flatHUs    = new double[platesPool2][pool2_output_imageSize][pool2_output_imageSize][numberOfFlatHUs]; // pool2's plates feed the flat HUs
double[][][][] wgts_outputs    = new double[numberOfFlatHUs][1][1][numberOfOutputUnits];
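As a hedged illustration of how one of these 4D tensors might be used, here is the weighted sum for a single conv-layer-1 unit at plate p and image position (x, y). The image array and loop bounds are assumptions for the example, not Lab3 code:

// Weighted sum for ONE unit of conv layer 1, at plate p, position (x, y).
double sum = 0.0;
for (int c = 0; c < colorsPerPixel; c++)
  for (int i = 0; i < kernelSizeConv1; i++)
    for (int j = 0; j < kernelSizeConv1; j++)
      sum += image[c][x + i][y + j] * wgts_convLayer1[c][p][i][j];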

7 Matrix Multiply between Layer i (M nodes) and Layer i+1 (N nodes)
[Figure: an N x M WEIGHTS matrix multiplied by the M-vector of Layer i ACTIVATIONS yields the N-vector of weighted sums feeding Layer i+1.]
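In code, this picture is just a matrix-vector product; a minimal sketch (names assumed, not from Lab3):

// weights is N x M; acts holds Layer i's M activations.
// Returns the N weighted sums feeding Layer i+1.
double[] weightedSums(double[][] weights, double[] acts) {
  int N = weights.length, M = acts.length;
  double[] sums = new double[N];
  for (int n = 0; n < N; n++)
    for (int m = 0; m < M; m++)
      sums[n] += weights[n][m] * acts[m];
  return sums;
}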

8 Lots of Possible Parallelism
Do a Bunch of Folds, Parameter Settings, Algos, Datasets, etc (Trivial Parallelism)
Do a Batch of Examples
Activate all Nodes at Layer i (see the parallel sketch after this list)
Do all Items in One Node’s Weighted Sum
Compute ‘Deviations’ for all Nodes at Layer i
Compute all Gradients
Update all Weights
???
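As one example of exploiting this parallelism, here is a minimal sketch that activates all nodes at Layer i in parallel with Java streams; the names are assumptions, not Lab3’s. Each node writes only its own slot of acts, so the iterations are independent and safe to run concurrently.

import java.util.stream.IntStream;

double[] forwardParallel(double[][] weights, double[] prevActs) {
  int N = weights.length;
  double[] acts = new double[N];
  IntStream.range(0, N).parallel().forEach(n -> {
    double sum = 0.0;
    for (int m = 0; m < prevActs.length; m++) sum += weights[n][m] * prevActs[m];
    acts[n] = 1.0 / (1.0 + Math.exp(-sum)); // sigmoid activation
  });
  return acts;
}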

9 GPUs (http://www.nvidia.com/object/what-is-gpu-computing.html)
Neural Chips being Developed by Nvidia, Google (in Madison!), Intel, etc

10 Projects
Can use any programming language (might need to learn some Python; I used free, hrs of effort)
BE SURE TO HAVE YOUR DATA (or simulator) ASAP (ie, find, don’t create)
First get something simple working end-to-end, then add complexity as time permits
Need not be novel; one can learn a lot from reimplementing a micro version of a successful Deep ML program
Need not use the cloud, nor ‘commercial’ s/w (can run TensorFlow on your desktop/laptop)
Aim to have an experimental control

11 From Moodle’s Page for Turning in Project Reports
They should be only two pages long (use 12-pt font)
Be sure to list all project members (all should turn in the same proposal and report)
Discuss (a) the task to be addressed, (b) the data used, (c) from where you will get cpu cycles, (d) any existing s/w you plan to use, and (e) the experimental methodology planned for evaluation

12 More on Projects
Aim to do more than ‘download code and data, run in cloud’
In such cases, do interesting, extensive experimentation
Do aim to learn how to use TensorFlow, etc
CS 838 students especially should aim to write/extend some Deep ML code
Not a lot of time left in the semester though …
After spring break: some project proposal presentations and progress reports
Might have poster session(s) instead of final reports (when?)

13 Preferences for DL Packages?
MXNet (Amazon)?
Paddle (Baidu)?
Torch (Facebook)?
TensorFlow (Google)?
PowerAI (IBM)?
CNTK (Microsoft)?
Caffe (UC Berkeley)?
Theano (U. Montreal)?
Other? (Keras, a Python lib on top of Theano & TensorFlow?)

14 Preferences for Cloud S/W? Needs to offer free student accounts
Amazon? Google? IBM? Microsoft? Other?

15 Some Project Directions
Give ‘advice’ to Deep ANNs: use ‘domain knowledge’ and not just data – Knowledge-Based ANNs (the KBANN algo)
Explain what a Deep ANN learned: ‘rule extraction’ (the Trepan algo)
Generative Adversarial Networks (GANs)
Given an image, generate a description (captioning)
Deep RL; LSTMs, recurrent links
Transfer Learning (TL)
Chatbots

16 Deep Visual-Semantic Alignments for Generating Image Descriptions
"two young girls are playing with lego toy." "a young boy is holding a baseball bat."

17 Deep Reinforcement Learning: Pong from Pixels
Testbed available? Might need to use my ‘Agent World’

18 GANs (NIPS ‘16 tutorial: https://arxiv.org/abs/1701.00160)

19 Understanding What a Deep Net Learned
Implement method(s) for visualizing a trained Deep ANN
‘Rule extraction’ (code available via this page)
Use some trained networks available on the web

20 Potential Benefits of Transfer Learning
[Figure: performance on the target task vs. amount of training; the ‘with transfer’ curve shows a higher start, a steeper slope, and a higher asymptote than the ‘without transfer’ curve.]

21 On-Line Data Sets (last updated 2014?) Maybe check
Aim to use REAL data (not artificially generated – only good for debugging)
Lots of stuff here:

22 Additional Questions? Suggestions?

