Presentation on theme: "Deep Learning with TensorFlow"— Presentation transcript:

1 Deep Learning with TensorFlow

2 Easy-TensorFlow Team: Aryan, Mohammad, Jahandar

3 Outline
1st Lecture (9:15 a.m. - 10:30 a.m.): Intro. to Machine Learning; Intro. to TensorFlow
2nd Lecture (10:45 a.m. - 12:00 p.m.): Neural Network
3rd Lecture (1:15 p.m. - 2:30 p.m.): Neural Network in TensorFlow & Keras; Visualization in TensorBoard
4th Lecture (2:45 p.m. - 4:00 p.m.): Convolutional Neural Network (CNN); CNNs in TensorFlow/Keras

4 Deep Learning with TensorFlow
Lecture 1: Introduction to Machine Learning & TensorFlow

5 Machine Learning Design machines that automatically learn from data and experience

6 Machine Learning restores the color of old black & white photos

7 Machine Learning Google Translate "reads" the text and replaces it with English text in real time

8 Machine Learning Generating new photos

9 Machine Learning Self-driving cars

10 Machine Learning Playing video games

11 Machine Learning Playing video games

12 Machine Learning Healthcare

13 Machine Learning Generate Music

14 Deep Learning
A class of Machine Learning algorithms: multiple cascaded processing stages, where each stage learns a representation of the data. (Oleksiy Ivakhnenko)

15 What is TensorFlow?
Created by researchers at Google.
"TensorFlow™ is an open source software library for numerical computation using data flow graphs."
"… software library for Machine Intelligence"
TensorFlow has APIs available in several languages (Python, C++, Java, etc.); the Python API is at present the most complete and the easiest to use.

16 Companies using TensorFlow

17

18 Why TensorFlow?
- Developed and maintained by Google
- Very large and active community + nice documentation
- Python API
- Multi-GPU support
- TensorBoard (a very powerful visualization tool)
- Faster model compilation than Theano-based options
- High-level APIs built on top of TensorFlow (such as Keras and TFLearn)

19 How to set it up?!
Python – programming language
Anaconda – package manager (optional; instead of installing Python directly)
TensorFlow – the deep learning library itself
IDE – software application (preferably PyCharm)
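Once everything is installed, a quick sanity check (a minimal sketch; assumes TensorFlow 1.x, the API version used throughout these slides):

import tensorflow as tf
print(tf.__version__)                      # e.g. 1.x
hello = tf.constant('Hello, TensorFlow!')
with tf.Session() as sess:
    print(sess.run(hello))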

20 Intro to TensorFlow
What is a Tensor? A multi-dimensional array:
- 0-d tensor: scalar (number)
- 1-d tensor: vector
- 2-d tensor: matrix
Importing the library: import tensorflow as tf
Key feature: the "computational graph" approach. TensorFlow separates the definition of computations from their execution:
- Part 1: building the graph, which represents the data flow of the computations
- Part 2: running a session, which executes the operations in the graph

21 Graph and Session
Graph: Nodes = operations

22 Graph and Session
Graph: Nodes = operations, Edges = Tensors

23 Graph and Session
Graph: Nodes = operations, Edges = Tensors
Session: executes the operations in the graph

24 Graph and Session
Example 1: build the graph

import tensorflow as tf
c = tf.add(2, 3, name='Add')
print(c)

TF automatically names the nodes when you don't explicitly name them.

25 Graph and Session
Example 1: build the graph

import tensorflow as tf
a = 2
b = 3
c = tf.add(a, b, name='Add')
print(c)

Output: Tensor("Add:0", shape=(), dtype=int32)
Printing c returns the Tensor object, not its value; to get the value we need to run the graph in a session. TF automatically names the nodes when you don't explicitly name them.

26 Graph and Session
Example 1: run the graph in a session

import tensorflow as tf
a = 2
b = 3
c = tf.add(a, b, name='Add')
sess = tf.Session()
print(sess.run(c))
sess.close()

Output: 5
Create a session and assign it to the variable sess so we can call it later. Within the session, evaluate the graph to fetch the value of c.

27 Graph and Session
Example 1: the same, using a with block that closes the session automatically

import tensorflow as tf
a = 2
b = 3
c = tf.add(a, b, name='Add')
with tf.Session() as sess:
    print(sess.run(c))

Output: 5

28 Graph and Session
Example 2:

import tensorflow as tf
x = 2
y = 3
add_op = tf.add(x, y, name='Add')
mul_op = tf.multiply(x, y, name='Multiply')
pow_op = tf.pow(add_op, mul_op, name='Power')
with tf.Session() as sess:
    pow_out = sess.run(pow_op)
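Fetching pow_op evaluates (2 + 3) ** (2 * 3) = 5 ** 6, so pow_out comes back as 15625.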

29 Graph and Session
Example 3:

import tensorflow as tf
x = 2
y = 3
add_op = tf.add(x, y, name='Add')
mul_op = tf.multiply(x, y, name='Multiply')
pow_op = tf.pow(add_op, mul_op, name='Power')
useless_op = tf.multiply(x, add_op, name='Useless')
with tf.Session() as sess:
    pow_out = sess.run(pow_op)

Only pow_op is fetched here, so the session never executes useless_op.

30 Graph and Session
Example 3: fetching both ops

import tensorflow as tf
x = 2
y = 3
add_op = tf.add(x, y, name='Add')
mul_op = tf.multiply(x, y, name='Multiply')
pow_op = tf.pow(add_op, mul_op, name='Power')
useless_op = tf.multiply(x, add_op, name='Useless')
with tf.Session() as sess:
    pow_out, useless_out = sess.run([pow_op, useless_op])

31 Data types
1. Constants are used to create constant values.

tf.constant(value, dtype=None, shape=None, name='Const', verify_shape=False)

Example:
s = tf.constant(2, name='scalar')
m = tf.constant([[1, 2], [3, 4]], name='matrix')

32 Data types
1. Constants
Before:

import tensorflow as tf
a = 2
b = 3
c = tf.add(a, b, name='Add')
with tf.Session() as sess:
    print(sess.run(c))

Output: 5

33 Data types
1. Constants
Now:

import tensorflow as tf
a = tf.constant(2, name='A')
b = tf.constant(3, name='B')
c = tf.add(a, b, name='Add')
with tf.Session() as sess:
    print(sess.run(c))

Output: 5

34 Data types
2. Variables are stateful nodes (ops) that output their current value, meaning they can retain their value over multiple executions of the graph.
- They can be saved and restored.
- Gradient updates will apply to all variables in the graph ⇒ network parameters (weights and biases).

tf.get_variable(name, shape=None, dtype=None, initializer=None, regularizer=None, trainable=True, collections=None, caching_device=None, partitioner=None, validate_shape=True, use_resource=None, custom_getter=None, constraint=None)

Example:
s1 = tf.get_variable(name='scalar1', initializer=2)
s2 = tf.get_variable(name='scalar2', initializer=tf.constant(2))
m = tf.get_variable('matrix', initializer=tf.constant([[0, 1], [2, 3]]))
M = tf.get_variable('big_matrix', shape=(784, 10), initializer=tf.zeros_initializer())
W = tf.get_variable('weight', shape=(784, 10), initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.01))

35 Data types
2. Variables

import tensorflow as tf
# create graph
a = tf.get_variable(name="A", initializer=tf.constant([[0, 1], [2, 3]]))
b = tf.get_variable(name="B", initializer=tf.constant([[4, 5], [6, 7]]))
c = tf.add(a, b, name="Add")
# launch the graph in a session
with tf.Session() as sess:
    # now we can run the desired operation
    print(sess.run(c))

Error: FailedPreconditionError: Attempting to use uninitialized value

36 Data types Graph Variables
import tensorflow as tf # create graph a = tf.get_variable(name="A", initializer=tf.constant([[0, 1], [2, 3]])) b = tf.get_variable(name="B", initializer=tf.constant([[4, 5], [6, 7]])) c = tf.add(a, b, name="Add") # Add an Op to initialize variables init_op = tf.global_variables_initializer() # launch the graph in a session with tf.Session() as sess: # run the variable initializer sess.run(init_op) # now we can run the desired operation print(sess.run(c)) [[ 4  6]  [ 8 10]] Graph Variables meaning that they can retain their value over multiple executions of a graph. 

37 Data types
3. Placeholders are nodes whose values are fed in at execution time.
⇒ Assemble the graph without knowing the values needed for the computation; we can supply the data later, at execution time.
⇒ Input data (in a classification task: inputs and labels)

tf.placeholder(dtype, shape=None, name=None)

Example:
a = tf.placeholder(tf.float32, shape=[5])
b = tf.placeholder(dtype=tf.float32, shape=None, name=None)
X = tf.placeholder(tf.float32, shape=[None, 784], name='input')
Y = tf.placeholder(tf.float32, shape=[None, 10], name='label')

38 Data types
3. Placeholders

import tensorflow as tf
a = tf.constant([5, 5, 5], tf.float32, name='A')
b = tf.placeholder(tf.float32, shape=[3], name='B')
c = tf.add(a, b, name="Add")
with tf.Session() as sess:
    print(sess.run(c))

Error: You must feed a value for placeholder tensor 'B' with dtype float and shape [3]

39 Data types
3. Placeholders

import tensorflow as tf
a = tf.constant([5, 5, 5], tf.float32, name='A')
b = tf.placeholder(tf.float32, shape=[3], name='B')
c = tf.add(a, b, name="Add")
with tf.Session() as sess:
    # create a dictionary:
    d = {b: [1, 2, 3]}
    # feed it to the placeholder
    print(sess.run(c, feed_dict=d))

Output: [6. 7. 8.]

40 Example
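The slide's example itself isn't reproduced in the transcript; below is a minimal sketch in the same spirit, combining the three data types in one graph (the names x, w, b and their values are illustrative, not from the slide):

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[3], name='x')                  # fed at execution time
w = tf.get_variable('w', initializer=tf.constant([2.0, 2.0, 2.0]))   # stateful, retains its value
b = tf.constant([1.0, 1.0, 1.0], name='b')                           # fixed constant
y = tf.add(tf.multiply(w, x), b, name='y')                           # element-wise w*x + b

init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    print(sess.run(y, feed_dict={x: [1.0, 2.0, 3.0]}))               # [3. 5. 7.]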

41 Deep Learning with TensorFlow
Lecture 2: Classification using a Neural Network

42 Neural Network
Input data → MODEL → Output

43 Neural Network

44 MNIST Data: 28 × 28 pixel images of handwritten digits (flattened to vectors of length 784)
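A common way to load this dataset in TensorFlow 1.x is the bundled helper module (a sketch; input_data ships with TF 1.x and is deprecated in later versions):

from tensorflow.examples.tutorials.mnist import input_data

# downloads the data on first use; one_hot=True gives labels as length-10 one-hot vectors
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
x_batch, y_batch = mnist.train.next_batch(100)
print(x_batch.shape)   # (100, 784): each 28 x 28 image arrives flattened
print(y_batch.shape)   # (100, 10)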

45 Logistic Classifier (linear classifier)
Set of N labeled inputs: D = {(X1, y1), …, (XN, yN)}
Linear model: takes the input and applies a linear function to generate its predictions (logits): WX + b = y
Input: a 28 × 28 image flattened to shape (1, 784); Weight: (784, 10); Bias: (1, 10)
SOFTMAX turns the logits into probabilities.
(Superscript: index of elements)

46 Logistic Classifier (linear classifier)
Input (Xn) → Logits (yn = WXn + b) → SOFTMAX → Probs. (S(yn))
Cross-entropy compares the predicted probabilities S(yn) with the one-hot encoded labels (Ln).
(Superscript n: index of the sample)

47 Logistic Classifier (linear classifier)
yn = WXn + b → Probs. S(yn)
Cross-entropy between S(yn) and the one-hot encoded labels Ln: D(S, L) = −Σi Li log(Si)
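A minimal sketch of this classifier in TensorFlow 1.x (the placeholder shapes follow the slides; the variable names and the use of tf.nn.softmax_cross_entropy_with_logits are my choices, not from the slides):

import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 784], name='input')
Y = tf.placeholder(tf.float32, shape=[None, 10], name='label')    # one-hot labels Ln
W = tf.get_variable('weight', shape=(784, 10), initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.01))
b = tf.get_variable('bias', initializer=tf.zeros(10))

logits = tf.matmul(X, W) + b                                       # yn = XnW + b, shape (batch, 10)
probs = tf.nn.softmax(logits)                                      # S(yn)
# average cross-entropy D(S(yn), Ln) over the batch
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=logits))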

48 Gradient Descent
Batch gradient descent: calculate the gradients for the whole dataset to perform just one update → computationally expensive; it can be very slow and is intractable for datasets that don't fit in memory.
Instead of computing this loss exactly, we are going to estimate it.
Stochastic gradient descent (SGD): feed one example at a time and update. The loss function fluctuates a lot:
- this enables SGD to jump to new and potentially better local minima,
- but it complicates convergence to the exact minimum, as SGD will keep overshooting.
When we slowly decrease the learning rate, SGD shows the same convergence behaviour as batch gradient descent.

49 Gradient Descent
Mini-batch gradient descent takes the best of both worlds: take a small batch of training samples at random, compute the loss and its derivative, and pretend that this derivative is the right direction to take (sometimes it isn't, and the loss increases). We compensate for that by running this procedure many, many times.
One challenge: finding a proper learning rate.

50 Gradient Descent
Momentum: take advantage of accumulated knowledge by keeping a running average of the gradients.
Further reading: "An overview of gradient descent optimization algorithms"
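In TensorFlow 1.x these update rules are applied through optimizer ops (a sketch; assumes a loss tensor like the one above, and the learning-rate/momentum values are illustrative):

# plain (mini-batch) gradient descent
train_op = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(loss)
# or gradient descent with momentum (keeps a running average of past gradients)
train_op = tf.train.MomentumOptimizer(learning_rate=0.1, momentum=0.9).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        # mini-batch loop; assumes the MNIST loader sketched earlier
        x_batch, y_batch = mnist.train.next_batch(100)
        sess.run(train_op, feed_dict={X: x_batch, Y: y_batch})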

51 Neural Network
Introduce nonlinearity:
- Sigmoid squashes real numbers to the range [0, 1]. But sigmoids saturate and kill gradients, and their outputs are not zero-centered, which leads to zig-zagging dynamics during gradient descent.
- ReLU (popularized by AlexNet): accelerates convergence, is a cheaper operation, and has a simpler derivative.

52 Neural Network

53 Neural Network #parameters = (784 × 200 + 200) + (200 × 10 + 10) = 159,010

54 Neural Network

55 Neural Network

56 Neural Network

X = tf.placeholder(tf.float32, shape=[None, 784], name='input')
Y = tf.placeholder(tf.float32, shape=[None, 10], name='label')
W = tf.get_variable('weight', shape=(784, 200), initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.01))
b = tf.get_variable('bias', initializer=tf.zeros(200))
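A sketch of how these pieces can be assembled into the one-hidden-layer network from the slides (200 hidden units with ReLU; the second-layer names W2/b2 and the training op are my additions, not from the slide):

h = tf.nn.relu(tf.matmul(X, W) + b)        # hidden layer: 784 -> 200

W2 = tf.get_variable('weight2', shape=(200, 10), initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.01))
b2 = tf.get_variable('bias2', initializer=tf.zeros(10))
logits = tf.matmul(h, W2) + b2             # output layer: 200 -> 10

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=logits))
train_op = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(loss)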

57 Deep Learning with TensorFlow
Lecture 3: TensorBoard

58 TensorBoard is a flashlight for our neural network's black box.
1. What does the network graph look like?
2. What is the best network configuration?
3. How does the data look in high dimensions?

59 What does the network graph look like?
Understanding the connections between nodes and layers
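To get the graph (and training curves) into TensorBoard, the usual TF 1.x pattern is to write summaries to a log directory (a sketch; the path './logs' and the loss tensor are assumptions carried over from the earlier sketches):

loss_summary = tf.summary.scalar('loss', loss)
merged = tf.summary.merge_all()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter('./logs', sess.graph)     # writes the graph definition
    summary = sess.run(merged, feed_dict={X: x_batch, Y: y_batch})
    writer.add_summary(summary, global_step=0)
    writer.close()
# then launch it with:  tensorboard --logdir=./logs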

60 What does the network graph look like?
Visualizing multiple runs simultaneously

61 How does the data look in high dimensions?
Understanding the relationships between samples

62 Deep Learning with TensorFlow
Lecture 4: Classification using a Convolutional Neural Network

63 Feed-forward Neural Network (NN)

64 NN problems: 1. Doesn't exploit the structure of the data! (e.g. no built-in translation invariance)

65 NN problems:
1. Doesn't exploit the structure of the data.
Solution: weight sharing
CNNs: NNs that share their parameters across space

66 NN problems:
2. Doesn't scale well to full images.
784 units fully connected to 500 units: #parameters = 784 × 500 = 392K !!!

67 NN problems:
2. Doesn't scale well to full images.
Solution: weight sharing + a 3D volume of neurons (sharing the same set of weights and biases)

68 Layers used to build CNNs

69 Convolution Layer
What is convolution? A function derived from two given functions by integration that expresses how the shape of one is modified by the other.
1. slide  2. multiply  3. integrate (i.e. sum)
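A tiny worked example of these three steps on a 1-D signal (plain NumPy, for illustration only; note that CNN libraries usually skip the kernel flip, i.e. they compute cross-correlation):

import numpy as np

signal = np.array([1, 2, 3, 4, 5], dtype=float)
kernel = np.array([1, 0, -1], dtype=float)

out = []
for i in range(len(signal) - len(kernel) + 1):
    window = signal[i:i + len(kernel)]          # 1. slide
    products = window * kernel[::-1]            # 2. multiply (flipped kernel = true convolution)
    out.append(products.sum())                  # 3. sum
print(out)                                            # [2.0, 2.0, 2.0]
print(np.convolve(signal, kernel, mode='valid'))      # same result: [2. 2. 2.]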

70 Convolution Layer
- Spatial dimensions: 32 × 32
- Depth: 3 feature maps (R, G, B)

71 Convolution Layer (Filter = Kernel = Patch)
Convolve the filter with the image, i.e. "slide over the image spatially, computing dot products, and summing over all."

72 Convolution Layer

73 Convolution Layer

74 Convolution Layer

75 Convolution Layer

76 Convolution Layer

77 Convolution Layer

78 Convolution Layer

79 Convolution Layer
#parameters: a fully connected layer from a 32×32×3 input (3,072 units) to a 28×28×6 output (4,704 units) needs 3,072 × 4,704 + 4,704 = 14,455,392 !!! vs. a convolutional layer with six 5×5×3 filters: 6 × (5 × 5 × 3) + 6 = 456

80 Max-Pooling Layer - To reduce the spatial dimension of feature maps

81 Max-Pooling Layer - To reduce the spatial dimension of feature maps
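A sketch of a convolution + max-pooling block in TensorFlow 1.x (filter sizes and names are illustrative; assumes MNIST-style image batches of shape [batch, 28, 28, 1]):

import tensorflow as tf

x_image = tf.placeholder(tf.float32, shape=[None, 28, 28, 1], name='image')
# six 5x5 filters over a 1-channel input; the filters and biases are the shared parameters
conv_w = tf.get_variable('conv1_w', shape=(5, 5, 1, 6), initializer=tf.truncated_normal_initializer(stddev=0.01))
conv_b = tf.get_variable('conv1_b', initializer=tf.zeros(6))

conv = tf.nn.conv2d(x_image, conv_w, strides=[1, 1, 1, 1], padding='SAME')
feature_maps = tf.nn.relu(conv + conv_b)                     # shape: [batch, 28, 28, 6]
# 2x2 max-pooling halves the spatial dimensions
pooled = tf.nn.max_pool(feature_maps, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
# pooled shape: [batch, 14, 14, 6]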

