
1 Deep Learning Some slides are from Prof. Andrew Ng of Stanford.

2 Training set: Feature extraction problem

3 Object detection

4

5

6 Raw image

7 Convolution A 3x3 or 5x5 filter is slid over the image to produce a feature map.

8 Activation map or feature map

9 Learning filters A convolutional neural network learns the values of these filters on its own during the training process. The designer specifies parameters such as the number of filters, the filter size, the architecture of the network, etc. The number of filters is called the depth. The more filters we have, the more image features get extracted and the better our network becomes at recognizing patterns in unseen images.
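To make the convolution step concrete, here is a minimal NumPy sketch (not from the slides) of sliding a 3x3 filter over a small grayscale image to produce a feature map; the filter values are illustrative, a hand-made horizontal-edge detector rather than learned weights.

import numpy as np

def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` and return the resulting feature map."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out_h = (ih - kh) // stride + 1
    out_w = (iw - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            feature_map[i, j] = np.sum(patch * kernel)   # element-wise multiply, then sum
    return feature_map

image = np.random.rand(8, 8)                  # toy 8x8 grayscale image
kernel = np.array([[ 1,  1,  1],
                   [ 0,  0,  0],
                   [-1, -1, -1]])             # responds strongly to horizontal edges
print(convolve2d(image, kernel).shape)        # (6, 6) feature map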

10 Subsampling, down-sampling, or pooling
Pooling achieves dimensionality reduction. Stride is the number of pixels by which we slide our filter matrix over the input matrix.
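A minimal NumPy sketch of the pooling step described above, assuming 2x2 max pooling with a stride of 2: each output value is the maximum of one 2x2 block, so each spatial dimension is halved.

import numpy as np

def max_pool(feature_map, size=2, stride=2):
    """Down-sample a 2-D feature map by taking block-wise maxima."""
    h, w = feature_map.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    pooled = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            block = feature_map[i*stride:i*stride+size, j*stride:j*stride+size]
            pooled[i, j] = block.max()
    return pooled

fm = np.arange(16).reshape(4, 4)   # toy 4x4 feature map
print(max_pool(fm))                # 2x2 result: [[ 5.  7.] [13. 15.]]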

11 Convolutional Neural Networks

12

13 Advantages of CNN
Good for character recognition and natural images. Finds edges, corners, endpoints, and other local 2-D structures. Translation invariance: convolution and sub-sampling layers are interleaved, and sub-sampling smooths the data. The exact position of a detected feature is not important, but the relative positions can be, and these can be captured by later layers. Filter sizes of 5x5 and 4x4 are common.
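As a rough illustration of the interleaved convolution and sub-sampling layers mentioned above, here is a small tf.keras model sketch; the layer sizes and the 28x28 input are illustrative assumptions, not taken from the slides.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),                      # e.g. a 28x28 grayscale character image
    tf.keras.layers.Conv2D(8, (5, 5), activation='relu'),   # convolution layer with 5x5 filters
    tf.keras.layers.MaxPooling2D((2, 2)),                   # sub-sampling layer
    tf.keras.layers.Conv2D(16, (5, 5), activation='relu'),  # another convolution layer
    tf.keras.layers.MaxPooling2D((2, 2)),                   # another sub-sampling layer
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax'),        # e.g. 10 character classes
])
model.summary()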

14 Computer vision: Identify coffee mug

15 Why is computer vision hard?

16

17

18

19 Learning from tagged data (supervised)

20 Why deep learning? Deep Learning uses a neural network with several layers. The sequence of layers identifies features in stages like our brains seem to. On image and speech data, it often performs better than other methods

21 Building huge neural networks

22 AlexNet 2012 ImageNet computer image recognition competition
Alex Krizhevsky of the University of Toronto won. The network had 5 convolutional layers, 60 million parameters, and 650,000 neurons, was trained on 1 million training images, and took a week to train on two NVIDIA GPUs. Hidden-unit dropout was used to reduce overfitting.

23 DNN 2015 Using deep learning, Google and Microsoft both beat the best human score in the ImageNet challenge. Microsoft and the University of Science and Technology of China announced a DNN that achieved IQ test scores at the college post-graduate level. Baidu announced that a deep learning system called Deep Speech 2 had learned both English and Mandarin. Deep learning had achieved superhuman levels of perception for the challenge.

24 Deep Learning Overview
Train networks with many layers (vs. shallow nets with just a couple of layers). Multiple layers work to build an improved feature space: the first layer learns 1st-order features (e.g. edges), and the 2nd layer learns higher-order features (combinations of first-layer features, such as combinations of edges). In current models, layers often learn in an unsupervised mode and discover general features of the input space, serving multiple tasks related to the unsupervised instances (image recognition, etc.). The final layer's features are then fed into supervised layer(s), and the entire network is often subsequently tuned using supervised training of the whole net, starting from the initial weights learned in the unsupervised phase. Fully supervised versions are also possible (as in early backpropagation attempts).

25

26 Learning from tagged data

27 AI will transform the internet

28 Deep network training We have always had good algorithms for learning the weights in networks with one hidden layer, but these algorithms are not good at learning the weights for networks with many hidden layers. What's new: algorithms for training many-layer networks.

29 Handwritten digits

30 What is this unit doing?

31 Hidden layer units become self-organised feature detectors
[Weight diagram: a hidden unit with strong +ve weights to some input pixels and low/zero weights to the rest]

32 What does this unit detect?
[Weight diagram: strong +ve weights on the top-row pixels, low/zero weights elsewhere] It will send a strong signal for a horizontal line in the top row, ignoring everywhere else.

33 What does this unit detect?
[Weight diagram over input pixels 1–63: strong +ve weights vs. low/zero weights]

34 What does this unit detect?
[Weight diagram: strong +ve weights vs. low/zero weights] Strong signal for a dark area in the top-left corner.
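A small NumPy sketch of the position-specific detector these slides describe, assuming an 8x8 image: strong positive weights on the top-row pixels and zero weights elsewhere give a strong response to a line in the top row and no response to the same line anywhere else.

import numpy as np

weights = np.zeros((8, 8))
weights[0, :] = 1.0                                        # strong +ve weights on the top row only

top_line = np.zeros((8, 8)); top_line[0, :] = 1.0          # image with a stroke along the top row
bottom_line = np.zeros((8, 8)); bottom_line[7, :] = 1.0    # the same stroke along the bottom row

print(np.sum(weights * top_line))     # 8.0 -> strong signal: the feature is present
print(np.sum(weights * bottom_line))  # 0.0 -> ignored: this detector is tied to one position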

35 What features might you expect a good NN to learn, when trained with data like this?

36 Vertical lines

37 Horizontal lines

38 Small circles

39 But what about position invariance?
Our example unit detectors were tied to specific parts of the image.

40 Successive layers can learn higher-level features
1st layer: detects lines in specific positions. 2nd layer: horizontal line, vertical line, upper loop, etc.

41 What does this unit detect?
1st layer: detects lines in specific positions. 2nd layer: horizontal line, vertical line, upper loop, etc.

42 Layers in brain

43 New way to train MLP

44 Train this layer first

45 Train this layer first then this layer

46 Train this layer first then this layer then this layer

47 Train this layer first then this layer then this layer then this layer

48 Train this layer first then this layer then this layer then this layer finally this layer

49 EACH of the (non-output) layers is trained to be an auto-encoder.
Basically, it is forced to learn good features that describe what comes from the previous layer

50 Auto-encoding Unsupervised training with input = output (an identity mapping)
By making this happen with fewer units, the hidden layer units are forced to become good feature detectors. The Restricted Boltzmann Machine is an example of an auto-encoder.
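A minimal auto-encoder sketch in tf.keras (the 64-dimensional input and 32-unit bottleneck are illustrative assumptions): the network is trained with the input as its own target, and the narrow hidden layer is forced to learn a compact feature representation.

import tensorflow as tf

inputs = tf.keras.Input(shape=(64,))                              # e.g. a flattened 8x8 image
code = tf.keras.layers.Dense(32, activation='relu')(inputs)       # bottleneck: fewer units than the input
decoded = tf.keras.layers.Dense(64, activation='sigmoid')(code)   # reconstruct the input

autoencoder = tf.keras.Model(inputs, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
# autoencoder.fit(x, x, epochs=10)   # identity mapping: the targets are the inputs themselves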

51 Deep auto-encoding A deep auto-encoder often performs dimensionality reduction better than principal component analysis.

52 Stacked Auto-Encoders
Stack many (sparse) auto-encoders in succession and train them using greedy layer-wise training. Drop the decoder output layer each time.
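A rough sketch of that greedy layer-wise recipe, again using tf.keras with made-up layer sizes and random stand-in data: each auto-encoder is trained unsupervised, its decoder is dropped, and its codes become the input to the next auto-encoder.

import numpy as np
import tensorflow as tf

def train_autoencoder(data, code_size):
    inp = tf.keras.Input(shape=(data.shape[1],))
    enc = tf.keras.layers.Dense(code_size, activation='relu')
    dec = tf.keras.layers.Dense(data.shape[1], activation='sigmoid')
    model = tf.keras.Model(inp, dec(enc(inp)))
    model.compile(optimizer='adam', loss='mse')
    model.fit(data, data, epochs=5, verbose=0)   # unsupervised: input = output
    return tf.keras.Model(inp, enc(inp))         # keep the encoder, drop the decoder

x = np.random.rand(256, 64).astype('float32')    # stand-in for unlabeled training data
encoders = []
for size in (32, 16):                            # illustrative layer sizes
    encoder = train_autoencoder(x, size)
    encoders.append(encoder)
    x = encoder.predict(x, verbose=0)            # the codes feed the next auto-encoder
# A supervised output layer can now be stacked on top of the encoders and fine-tuned.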

53 Face recognition

54 Dropout – Overfit avoidance
Very common with current deep networks. For each training instance, drop each node (hidden or input) and its connections with probability p, and train the reduced network. The final net just uses all the weights, averaged (actually scaled by 1-p), as if ensembling 2^n different network substructures. The model therefore won't overfit one particular network structure; it forces regularization. Variants: DropConnect randomly drops connections instead of units; Shakeout, instead of randomly discarding units as Dropout does at the training stage, randomly chooses to enhance or reverse the contribution of each unit to the next layer; others include DropIn, Standout, etc.
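A minimal NumPy sketch of the dropout idea, using the common "inverted" variant that rescales activations by 1/(1-p) during training instead of scaling the final weights by 1-p; the effect is equivalent in expectation.

import numpy as np

def dropout(activations, p=0.5, training=True):
    """Randomly zero units with probability p at training time."""
    if not training:
        return activations                                # test time: use the full (averaged) net
    mask = (np.random.rand(*activations.shape) >= p)      # keep each unit with probability 1-p
    return activations * mask / (1.0 - p)                 # rescale so the expected value is unchanged

h = np.random.rand(4, 8)          # toy hidden-layer activations for a batch of 4 instances
print(dropout(h, p=0.5))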

55 Weaknesses of CNN Plain nets: stacking 3x3 convolution layers
A 56-layer net has higher training error and test error than a 20-layer net

56 Google’s Artificial Brain
The network was trained on 10 million randomly selected YouTube video thumbnails over the course of three days, using 16,000 computer processors, one billion connections, and 20,000 output neurons. It reached 81.7% accuracy in detecting human faces, 76.7% accuracy when identifying human body parts, 74.8% accuracy when identifying cats, and 15.8% accuracy in recognizing 20,000 object categories.

57 Residual Network
A residual is the difference between an original image and a changed image. The base information is preserved, so the network only has to treat the perturbation (the residual).
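A minimal tf.keras sketch of a residual block (layer and filter sizes are illustrative): the convolutions learn only the perturbation F(x), and the skip connection adds the input back so the base information is preserved.

import tensorflow as tf

def residual_block(x, filters=16):
    f = tf.keras.layers.Conv2D(filters, (3, 3), padding='same', activation='relu')(x)
    f = tf.keras.layers.Conv2D(filters, (3, 3), padding='same')(f)   # F(x): the learned residual
    out = tf.keras.layers.Add()([x, f])                              # output = x + F(x)
    return tf.keras.layers.Activation('relu')(out)

inputs = tf.keras.Input(shape=(32, 32, 16))     # channel count matches `filters` so the add works
outputs = residual_block(inputs)
model = tf.keras.Model(inputs, outputs)
model.summary()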

58 Residual Network Deeper ResNets have lower training error

59 Results Deep ResNets can be trained without difficulty
Deeper ResNets have lower training error, and also lower test error

60 Results 1st places in all five main tracks in “ILSVRC & COCO 2015 Competitions” ImageNet Classification ImageNet Detection ImageNet Localization COCO Detection COCO Segmentation

61

62

63 Deep net tools

64 user interface (UI)

65

66 Google’s Tensorflow Nodes represent operations
Edges represent the flow of data. Data are tensors: a tensor of rank n is represented by an n-dimensional array. TensorFlow is the flow of arrays through a computational graph.
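A tiny TensorFlow example of tensors flowing through operation nodes; tf.function traces the Python function into a computational graph whose nodes are the matmul and add operations and whose edges carry the tensors.

import tensorflow as tf

@tf.function                        # builds a computational graph from this function
def flow(x, w, b):
    return tf.matmul(x, w) + b      # nodes: matmul, add; edges: the tensors x, w, b

x = tf.constant([[1.0, 2.0]])       # rank-2 tensor: a 1x2 array
w = tf.constant([[3.0], [4.0]])     # 2x1 array
b = tf.constant([[0.5]])
print(flow(x, w, b))                # tf.Tensor([[11.5]], shape=(1, 1), dtype=float32)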

67 Deep learning libraries

68 Object detection

69 Summary Residual nets can train to a depth of 200 layers.
Deep networks naturally integrate low/mid/high level features and classifiers in an end-to-end multilayer fashion

