1
Deep Learning with TensorFlow
2
Easy-TensorFlow Team: Aryan, Mohammad, Jahandar
3
Outline
1st Lecture (9:15 a.m. - 10:30 a.m.): Intro. to Machine Learning; Intro. to TensorFlow
2nd Lecture (10:45 a.m. - 12:00 p.m.): Neural Network
3rd Lecture (1:15 p.m. - 2:30 p.m.): Neural Network in TensorFlow & Keras; Visualization in TensorBoard
4th Lecture (2:45 p.m. - 4:00 p.m.): Convolutional Neural Network (CNN); CNNs in TensorFlow/Keras
4
Deep Learning with TensorFlow
Lecture 1: Introduction to Machine Learning & TensorFlow
5
Machine Learning: design machines that automatically learn from data and experience
6
Machine Learning restores the color of old black & white photos
7
Machine Learning: Google Translate "reads" the text and replaces it with text in English in real time
8
Machine Learning Generating new photos
9
Machine Learning Self-driving cars
10
Machine Learning Playing video games
11
Machine Learning Playing video games
12
Machine Learning Healthcare
13
Machine Learning Generate Music
14
Deep Learning: a class of Machine Learning algorithms
- Multiple cascaded processing stages
- Each stage learns a representation of the data
(Oleksiy Ivakhnenko)
15
What is TensorFlow?
- Created by researchers at Google
- "TensorFlow™ is an open source software library for numerical computation using data flow graphs."
- "... software library for Machine Intelligence"
- TensorFlow has APIs available in several languages (Python, C++, Java, etc.)
- The Python API is at present the most complete and the easiest to use
16
Companies using TensorFlow
18
Why TensorFlow?
- Developed and maintained by Google
- Very large and active community + nice documentation
- Python API
- Multi-GPU support
- TensorBoard (a very powerful visualization tool)
- Faster model compilation than Theano-based options
- High-level APIs built on top of TensorFlow (such as Keras and TFLearn)
19
How to set it up?!
- Python: programming language
- Anaconda: package manager (optional; instead of installing Python directly)
- TensorFlow
- IDE: software application (preferably PyCharm)
(A quick installation check is sketched below.)
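A quick way to verify the setup, assuming TensorFlow 1.x has been installed into the chosen Python/Anaconda environment (e.g. with pip install tensorflow):

import tensorflow as tf

print(tf.__version__)                                    # should print the installed version
with tf.Session() as sess:
    print(sess.run(tf.constant('Hello, TensorFlow!')))   # b'Hello, TensorFlow!'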
20
Intro to TensorFlow
What is a Tensor? A multi-dimensional array:
- 0-d tensor: scalar (number)
- 1-d tensor: vector
- 2-d tensor: matrix
Importing the library: import tensorflow as tf
Key feature: the "computational graph" approach
- Part 1: building the graph, which represents the data flow of the computations
- Part 2: running a session, which executes the operations in the graph
TensorFlow separates the definition of computations from their execution.
21
Graph and Session. Graph: nodes = operations
22
Graph and Session. Graph: nodes = operations, edges = tensors
23
Graph and Session. Graph: nodes = operations, edges = tensors. Session: executes the operations in the graph
24
Graph and Session, Example 1: Graph

import tensorflow as tf
c = tf.add(2, 3, name='Add')
print(c)

TF automatically names the nodes when you don't explicitly name them.
25
Graph and Session, Example 1: Graph

import tensorflow as tf
a = 2
b = 3
c = tf.add(a, b, name='Add')
print(c)

Output: Tensor("Add:0", shape=(), dtype=int32)

Printing c does not print 5; it only shows the Add tensor in the (not yet executed) graph. TF automatically names the nodes when you don't explicitly name them.
26
Graph and Session, Example 1: Session

import tensorflow as tf
a = 2
b = 3
c = tf.add(a, b, name='Add')
sess = tf.Session()        # create a session, assign it to variable sess so we can call it later
print(sess.run(c))         # within the session, evaluate the graph to fetch the value of c
sess.close()

Output: 5
27
Graph and Session, Example 1: using the with block, the session is closed automatically

import tensorflow as tf
a = 2
b = 3
c = tf.add(a, b, name='Add')
with tf.Session() as sess:
    print(sess.run(c))

Output: 5
28
Graph and Session, Example 2: Graph

import tensorflow as tf
x = 2
y = 3
add_op = tf.add(x, y, name='Add')
mul_op = tf.multiply(x, y, name='Multiply')
pow_op = tf.pow(add_op, mul_op, name='Power')
with tf.Session() as sess:
    pow_out = sess.run(pow_op)    # (2 + 3) ** (2 * 3) = 15625
29
Graph and Session, Example 3: Graph

import tensorflow as tf
x = 2
y = 3
add_op = tf.add(x, y, name='Add')
mul_op = tf.multiply(x, y, name='Multiply')
pow_op = tf.pow(add_op, mul_op, name='Power')
useless_op = tf.multiply(x, add_op, name='Useless')
with tf.Session() as sess:
    pow_out = sess.run(pow_op)

Since useless_op is not fetched in sess.run, TensorFlow does not execute it.
30
Graph and Session, Example 3: Graph

import tensorflow as tf
x = 2
y = 3
add_op = tf.add(x, y, name='Add')
mul_op = tf.multiply(x, y, name='Multiply')
pow_op = tf.pow(add_op, mul_op, name='Power')
useless_op = tf.multiply(x, add_op, name='Useless')
with tf.Session() as sess:
    pow_out, useless_out = sess.run([pow_op, useless_op])

To evaluate useless_op as well, fetch both operations in a single sess.run call.
31
Data types
1. Constants are used to create constant values

tf.constant(value, dtype=None, shape=None, name='Const', verify_shape=False)

Example:
s = tf.constant(2, name='scalar')
m = tf.constant([[1, 2], [3, 4]], name='matrix')
32
Data types, 1. Constants are used to create constant values

Before:
import tensorflow as tf
a = 2
b = 3
c = tf.add(a, b, name='Add')
with tf.Session() as sess:
    print(sess.run(c))

Output: 5
33
Data types, 1. Constants are used to create constant values

Now:
import tensorflow as tf
a = tf.constant(2, name='A')
b = tf.constant(3, name='B')
c = tf.add(a, b, name='Add')
with tf.Session() as sess:
    print(sess.run(c))

Output: 5
34
Data types
2. Variables are stateful nodes (= ops) which output their current value, meaning that they can retain their value over multiple executions of a graph.
- They can be saved and restored
- Gradient updates will apply to all variables in the graph
⇒ Network parameters (weights and biases)

tf.get_variable(name, shape=None, dtype=None, initializer=None, regularizer=None, trainable=True, collections=None, caching_device=None, partitioner=None, validate_shape=True, use_resource=None, custom_getter=None, constraint=None)

Example:
s1 = tf.get_variable(name='scalar1', initializer=2)
s2 = tf.get_variable(name='scalar2', initializer=tf.constant(2))
m = tf.get_variable('matrix', initializer=tf.constant([[0, 1], [2, 3]]))
M = tf.get_variable('big_matrix', shape=(784, 10), initializer=tf.zeros_initializer())
W = tf.get_variable('weight', shape=(784, 10), initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.01))
35
Data types, 2. Variables

import tensorflow as tf
# create graph
a = tf.get_variable(name="A", initializer=tf.constant([[0, 1], [2, 3]]))
b = tf.get_variable(name="B", initializer=tf.constant([[4, 5], [6, 7]]))
c = tf.add(a, b, name="Add")
# launch the graph in a session
with tf.Session() as sess:
    # now we can run the desired operation
    print(sess.run(c))

Output: FailedPreconditionError: Attempting to use uninitialized value
36
Data types, 2. Variables

import tensorflow as tf
# create graph
a = tf.get_variable(name="A", initializer=tf.constant([[0, 1], [2, 3]]))
b = tf.get_variable(name="B", initializer=tf.constant([[4, 5], [6, 7]]))
c = tf.add(a, b, name="Add")
# Add an Op to initialize variables
init_op = tf.global_variables_initializer()
# launch the graph in a session
with tf.Session() as sess:
    # run the variable initializer
    sess.run(init_op)
    # now we can run the desired operation
    print(sess.run(c))

Output:
[[ 4  6]
 [ 8 10]]
37
Data types
3. Placeholders are nodes whose value is fed in at execution time.
⇒ Assemble the graph without knowing the values needed for computation; we can supply the data later, at execution time.
⇒ Input data (in a classification task: inputs and labels)

tf.placeholder(dtype, shape=None, name=None)

a = tf.placeholder(tf.float32, shape=[5])
b = tf.placeholder(dtype=tf.float32, shape=None, name=None)
X = tf.placeholder(tf.float32, shape=[None, 784], name='input')
Y = tf.placeholder(tf.float32, shape=[None, 10], name='label')
38
Data types, 3. Placeholders

import tensorflow as tf
a = tf.constant([5, 5, 5], tf.float32, name='A')
b = tf.placeholder(tf.float32, shape=[3], name='B')
c = tf.add(a, b, name="Add")
with tf.Session() as sess:
    print(sess.run(c))

Output: Error: You must feed a value for placeholder tensor 'B' with dtype float and shape [3]
39
Data types, 3. Placeholders

import tensorflow as tf
a = tf.constant([5, 5, 5], tf.float32, name='A')
b = tf.placeholder(tf.float32, shape=[3], name='B')
c = tf.add(a, b, name="Add")
with tf.Session() as sess:
    # create a dictionary:
    d = {b: [1, 2, 3]}
    # feed it to the placeholder
    print(sess.run(c, feed_dict=d))

Output: [6. 7. 8.]
40
Example
41
Deep Learning with TensorFlow
Lecture 2: Classification using a Neural Network
42
Neural Network: Input data -> MODEL -> Output
43
Neural Network
44
MNIST Data: 28 x 28 pixel images of handwritten digits
45
Logistic Classifier (linear classifier)
Set of N labeled inputs: D = {(X1, y1), ..., (XN, yN)}   (superscript: index of elements)
Linear model: takes the input and applies a linear function to generate its predictions: WX + b = y
A 28 x 28 input image is flattened into a (1, 784) vector X; Weight: (784, 10); Bias: (1, 10)
The outputs y are the logits; SOFTMAX turns the logits into probabilities (the figure shows example logits and the resulting probabilities)
(A minimal TensorFlow sketch of this model follows below.)
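A minimal sketch of this linear model in TensorFlow 1.x; the names X, W, b and the 784/10 shapes follow the slide, while the initializers are illustrative assumptions:

import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 784], name='input')   # flattened 28x28 image
W = tf.get_variable('weight', shape=(784, 10), initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.01))
b = tf.get_variable('bias', initializer=tf.zeros(10))
logits = tf.matmul(X, W) + b      # y = WX + b (the logits)
probs = tf.nn.softmax(logits)     # SOFTMAX turns the logits into probabilities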
46
Logistic Classifier (linear classifier)
Logits: yn = W Xn + b -> SOFTMAX -> probabilities S(yn)
The probabilities S(yn) are compared with the one-hot encoded labels Ln using the cross-entropy
(superscript n: index of elements; the figure shows an input Xn, its logits, probabilities, and one-hot label)
47
Logistic Classifier (linear classifier)
yn = W Xn + b -> probabilities S(yn) -> cross-entropy against the one-hot encoded labels Ln
(the figure shows the probabilities and the one-hot label for one example)
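The comparison uses the standard cross-entropy (written out here for completeness): D(S(yn), Ln) = - sum_i Ln,i * log(S(yn)_i). In TensorFlow 1.x it is usually computed directly from the logits, for example (assuming the logits and the label placeholder Y from the earlier sketches):

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=logits))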
48
Gradient Descent
Batch gradient descent (with learning rate):
- Calculates the gradients for the whole dataset to perform just one update -> computationally expensive
- Can be very slow and is intractable for datasets that don't fit in memory
Instead of computing this loss exactly, we are going to estimate it.
Stochastic gradient descent (SGD): feed one example, compute the loss, and update. The loss function fluctuates a lot!
- This enables it to jump to new and potentially better local minima
- But it complicates convergence to the exact minimum, as SGD will keep overshooting
- When we slowly decrease the learning rate, SGD shows the same convergence behaviour as batch gradient descent
49
Gradient Descent
Mini-batch gradient descent (with learning rate): takes the best of both worlds
- Instead of computing the loss on the whole dataset, we estimate it: take a small batch of training samples at random, compute the loss and its derivative, and pretend that this derivative is the right direction to take (sometimes it is not the correct direction and increases the loss)
- We compensate for that by running this procedure many, many times
- One challenge: finding a proper learning rate
(A minimal training-loop sketch follows below.)
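A minimal mini-batch SGD training loop in TensorFlow 1.x, assuming the X, Y, and loss tensors sketched earlier; get_next_batch is a hypothetical helper that returns a random batch of inputs and one-hot labels:

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        x_batch, y_batch = get_next_batch(batch_size=100)    # small random batch of training samples
        _, loss_val = sess.run([optimizer, loss], feed_dict={X: x_batch, Y: y_batch})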
50
Gradient Descent
Momentum: take advantage of accumulated knowledge by keeping a running average of the gradients
(see "An overview of gradient descent optimization algorithms")
51
Neural Network
Introduce nonlinearity:
- The sigmoid non-linearity squashes real numbers to the range [0, 1]
- Sigmoids saturate and kill gradients
- Sigmoid outputs are not zero-centered -> zig-zagging dynamics
- ReLU (used in AlexNet): accelerates convergence, is a less expensive operation, and has a simpler derivative
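Both nonlinearities are available as ops in TensorFlow; a minimal sketch, assuming a pre-activation tensor z (e.g. z = tf.matmul(X, W) + b):

h_sigmoid = tf.nn.sigmoid(z)   # squashes values into [0, 1]
h_relu = tf.nn.relu(z)         # max(0, z)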
52
Neural Network
53
Neural Network #parameters = 784 x 200 + 200 + 200 x 10 + 10 = 159,010
54
Neural Network
55
Neural Network
56
Neural Network

X = tf.placeholder(tf.float32, shape=[None, 784], name='input')
Y = tf.placeholder(tf.float32, shape=[None, 10], name='label')
W = tf.get_variable('weight', shape=(784, 200), initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.01))
b = tf.get_variable('bias', initializer=tf.zeros(200))
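Putting the pieces together, a minimal sketch of the 784-200-10 network from these slides in TensorFlow 1.x; the hidden-layer ReLU, the second-layer names W2/b2, and the loss/optimizer choices are illustrative assumptions, not necessarily the exact code used in the lecture:

import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 784], name='input')
Y = tf.placeholder(tf.float32, shape=[None, 10], name='label')

W = tf.get_variable('weight', shape=(784, 200), initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.01))
b = tf.get_variable('bias', initializer=tf.zeros(200))
h = tf.nn.relu(tf.matmul(X, W) + b)        # hidden layer with 200 units

W2 = tf.get_variable('weight2', shape=(200, 10), initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.01))
b2 = tf.get_variable('bias2', initializer=tf.zeros(10))
logits = tf.matmul(h, W2) + b2             # output layer with 10 classes

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=logits))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)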
57
Deep Learning with TensorFlow
Lecture 3: TensorBoard
58
TensorBoard is a flashlight for our neural net's black box.
1. What does the network graph look like?
2. What is the best network configuration?
3. How does the data look in high dimensions?
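To answer the first question, the graph can be written to disk and inspected in TensorBoard. A minimal sketch, assuming TensorFlow 1.x and an arbitrary ./logs directory:

import tensorflow as tf

a = tf.constant(2, name='A')
b = tf.constant(3, name='B')
c = tf.add(a, b, name='Add')

with tf.Session() as sess:
    writer = tf.summary.FileWriter('./logs', sess.graph)   # write the graph definition
    print(sess.run(c))
    writer.close()

# then launch TensorBoard from the command line: tensorboard --logdir=./logs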
59
What does the network graph look like?
Understanding the connections between nodes and layers
60
What does the network graph look like?
Visualizing multiple runs simultaneously
61
How does the data look in high dimensions?
Understanding the relationships between samples
62
Deep Learning with TensorFlow
Lecture 4: Classification using a Convolutional Neural Network
63
Feed-forward Neural Network (NN)
64
NN problems: 1. Doesn't use the structure of the data! (no built-in translation invariance)
65
NN problems: 1. Doesn't use the structure of the data!
Solution: weight sharing (reuse the same weights W across space)
CNNs: NNs that share their parameters across space
66
NN problems: 2. Doesn't scale well to full images
784 units fully connected to 500 units: #parameters = 784 x 500 = 392K !!!
67
NN problems: 2. Doesn't scale well to full images
Solution: weight sharing + a 3D volume of neurons (sharing the same set of weights and biases)
68
Layers used to build CNNs
69
Convolution Layer
What is convolution? A function derived from two given functions by integration that expresses how the shape of one is modified by the other:
1. slide
2. multiply
3. integrate (i.e. sum)
70
Convolution Layer
- Spatial dimensions: 32 x 32
- Depth: 3 feature maps (R + G + B)
71
Convolution Layer (Filter = Kernel = Patch)
Convolve the filter with the image, i.e. "slide over the image spatially, computing dot products, and sum over all" (a minimal tf.nn.conv2d sketch follows below)
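In TensorFlow 1.x this sliding dot product is provided by tf.nn.conv2d. A minimal sketch for a 32 x 32 x 3 input and a single 5 x 5 x 3 filter; the filter count, stride, and padding are illustrative assumptions:

import tensorflow as tf

image = tf.placeholder(tf.float32, shape=[None, 32, 32, 3], name='image')   # batch of 32x32 RGB images
kernel = tf.get_variable('kernel', shape=[5, 5, 3, 1], initializer=tf.truncated_normal_initializer(stddev=0.01))   # one 5x5x3 filter
feature_map = tf.nn.conv2d(image, kernel, strides=[1, 1, 1, 1], padding='VALID')   # slide, multiply, sum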
72
Convolution Layer
73
Convolution Layer
74
Convolution Layer
75
Convolution Layer
76
Convolution Layer
77
Convolution Layer
78
Convolution Layer
79
Convolution Layer #parameters: 14,455,392 (fully connected) !!! vs. 456 (convolution, e.g. six 5x5x3 filters: 5x5x3x6 + 6 = 456)
80
Max-Pooling Layer - To reduce the spatial dimension of feature maps
81
Max-Pooling Layer - To reduce the spatial dimension of feature maps
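A minimal max-pooling sketch in TensorFlow 1.x, assuming a feature_map tensor such as the one produced by the convolution sketch above; the 2x2 window and stride are illustrative choices:

pooled = tf.nn.max_pool(feature_map, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')   # 2x2 window, stride 2: halves the spatial dimensions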