Deep Learning Insights and Open-ended Questions

CS 501: CS Seminar
Min Xian, Assistant Professor
Department of Computer Science, University of Idaho

Image from NVIDIA

Researchers: … Geoff Hinton, Yann LeCun, Andrew Ng, Yoshua Bengio …
[Figure: Google Trends for "Deep Learning"]

Deep Learning in Industry
Company: project; investment; description
Google: DeepMind, since 2010; $500 million
Intel: autonomous driving system; $15.3 billion
Facebook: DeepFace, 2014; -; nine-layer network, trained on 4 million faces, 97% vs. 85% (FBI)
Nvidia: GPU, CUDA, since 2009; increases the speed of deep learning systems by more than 100 times
Apple, Tesla, Baidu, …

Timeline of deep learning (1940 to 2017)
1943: Math model of NN
1947: Turing Test
1965: First functional NN with many layers
1969: Minsky's two problems: XOR and computing power
1982: Apply the backpropagation algorithm to NN
1989: LeCun, handwritten digit recognition
1995: SVM
2006: Geoff Hinton's deep belief nets
2009: Deng's improvement; Nvidia's NN-GPU; Feifei Li's ImageNet
2011: AlexNet (CNN)
2014: DeepFace; Ian's GAN
2015: AlphaGo

What is Deep Learning?
Deep Learning is about Neural Networks (NNs).
[Figure: an example of a feedforward NN, from The Mostly Complete Chart of Neural Networks by the team at the Asimov Institute]

What is Deep Learning?
Deep Learning is about neural nets:
Multiple layers of nonlinear processing units (nodes)
Learn data representations by supervised or unsupervised learning
Form a hierarchical data representation, from low-level to high-level
[Figure: an example of a shallow neural net, from The Mostly Complete Chart of Neural Networks by the team at the Asimov Institute]

Feedforward Neural Nets
Highly structured and comes in layers
A group of classifiers
Feedforward propagation
[Figure: a feedforward net with input, hidden, and output layers; the outputs are class 1 and class 2]

Feedforward Neural Nets: An Example
[Figure: a feedforward net with inputs Height, Weight, and Temperature, a hidden layer, and outputs sick and healthy]
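The sick/healthy example can be sketched as a single forward pass in plain Python. This is only an illustration: the weights in PARAMS are made-up, untrained numbers, the two hidden nodes and the softmax output are assumptions, and the input features are assumed to be already scaled to roughly the range 0 to 1.

```python
import math


def sigmoid(x):
    """Nonlinear activation used by each hidden node."""
    return 1.0 / (1.0 + math.exp(-x))


def weighted_sums(inputs, weights, biases):
    """One weighted sum per node: dot(row, inputs) + bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, params):
    """Feedforward propagation: input -> hidden -> output."""
    hidden = [sigmoid(z) for z in weighted_sums(features, params["W1"], params["b1"])]
    scores = weighted_sums(hidden, params["W2"], params["b2"])
    return softmax(scores)


# Hypothetical, untrained weights: 3 inputs -> 2 hidden nodes -> 2 classes.
PARAMS = {
    "W1": [[0.5, -0.2, 0.8], [-0.3, 0.4, 0.1]],
    "b1": [0.0, 0.1],
    "W2": [[1.0, -1.0], [-1.0, 1.0]],
    "b2": [0.0, 0.0],
}

# Assumed feature order: height, weight, temperature (already scaled).
probs = forward([0.6, 0.7, 0.9], PARAMS)
print(dict(zip(["sick", "healthy"], probs)))
```

Each hidden node behaves like a small classifier over the inputs, and the output layer combines those classifiers into class probabilities. Training would adjust the numbers in PARAMS; nothing here has been trained.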

From Shallow Nets To Deep Nets
Biological foundation: biological neural nets (100 billion neurons)
Dealing with complex patterns requires high representation capacity
It has been reported that the human brain has more than 100 billion neurons. If you want your algorithm to recognize or discover very complex patterns in data, you need to start using deep learning.

Ability to Recognize Complex Patterns
Break down complex patterns into simpler patterns
Use simple patterns as building blocks to detect complex patterns
[Figure: an example of a CNN for face recognition]

Why did it take 50 years?
The vanishing gradient problem makes it very hard to train a deep net
Backpropagation: (0.9)^100 ≈ 0
Slow training process
No high-quality big data set
No powerful computing devices
NVIDIA GPU and deep learning machine, 2009
Our machine: 8×GTX 1080, 8×8 GB memory, 8×2560 CUDA cores
ImageNet (Feifei Li, 2009)
Total number of images: 14,197,122
Number of images with bounding box annotations: 1,034,908
3000 classes
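The (0.9)^100 figure can be checked with a few lines of Python. The sketch assumes, for illustration only, that backpropagation multiplies the gradient by the same factor at every layer; in a real net the factor differs from layer to layer. The function name gradient_scale is hypothetical.

```python
def gradient_scale(per_layer_factor, depth):
    """Multiply one chain-rule factor per layer, as backpropagation does."""
    scale = 1.0
    for _ in range(depth):
        scale *= per_layer_factor
    return scale


# How much of the gradient survives after passing back through `depth` layers.
for depth in (1, 10, 50, 100):
    print(depth, gradient_scale(0.9, depth))
```

With a factor of 0.9, the gradient reaching the first layer of a 100-layer net is on the order of 0.00003 of its original size, so the early layers barely learn.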

Choice of Deep Learning Models
Convolutional Neural Net (CNN): machine vision problems, object detection (Yann LeCun)
Recursive Neural Tensor Net (RNTN): discover the hierarchical structure of data
Recurrent Neural Net (RNN): forecasting based on sequence input
Deep Belief Net (DBN): small labelled dataset, pretraining, fine-tuning
Restricted Boltzmann Machine (RBM): no vanishing gradient problem, automatically finds patterns in data by reconstructing the input (Geoff Hinton)
Autoencoder

Choice of Deep Learning Models
Applications:
Text/document analysis: RNN, RNTN
Image analysis: CNN, DBN
Image captioning: RNN, CNN
Video recognition: CNN
Self-driving: RNN, CNN
Statistic planning: RNN
Speech recognition: RNN
General guideline:
Classification: DBN, CNN
Time series analysis and forecasting: RNN

When to Use Deep Learning?
When to use it:
Right amount of data
Complex patterns
Computing infrastructure
When not to use it:
Not enough data
You have inside knowledge of the data and can design good features
You do not have the computing resources

How to get started with deep learning
Courses at the University of Idaho:
CS 404/504: Machine Learning
CS 404/504: Deep Learning
CS 470/570: Artificial Intelligence
Other resources:
Andrew Ng's Machine Learning course (Coursera)
Yoshua Bengio's book Deep Learning

Deep Learning Platforms: a set of tools and interfaces for building deep nets
Selection of deep nets: CNN, DBN, MLP, RNN, RNTN
Data preprocessing
UI
Infrastructure, GPU

Software platforms: install on your personal hardware
H2O.ai: MLP
Dato GraphLab: CNN and MLP
Full platforms: handle all technical issues
Ersatz Labs

Deep learning libraries: software libraries that are
built by a highly qualified software team
regularly maintained
open source
surrounded by a large community
Commercial-grade libraries: Deeplearning4j, Torch, Caffe, and TensorFlow
Educational or scientific research libraries: Theano, DeepMat, and TensorFlow

Deep Learning Trends and Discussion
Scales of data and computation drive the progress of deep learning
[Figure: performance vs. amount of data for deep neural nets, medium neural nets, and traditional approaches (SVM, random forest, logistic regression, etc.)]
Q1: Big data and large models: good or bad?
Q2: Is big data necessary for learning? Does human intelligence build on the learning of big data?

Overfitting and underfitting, or variance and bias
[Figure: training error and test error vs. training time, with the gap between the two curves marked]
Training set, test set
Generalization ability
Q3: How to judge if a model is overfitting or underfitting?

Human-level error?
Underfitting: compare human-level error and training error. Solution: bigger model, training longer
Overfitting: compare test error and training error. Solution: early stopping, dropout, regularization, get more data
[Figure: test error, training error, and human-level error vs. training time]
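Of the overfitting remedies listed, early stopping is the easiest to sketch. The example below assumes a list of per-epoch errors measured on held-out data (the slide plots test error; a separate validation set is the usual choice) and a hypothetical patience parameter. The error values are made up.

```python
def early_stopping_epoch(val_errors, patience=3):
    """Return the epoch whose weights should be kept.

    Scanning stops once the held-out error has not improved
    for `patience` consecutive epochs.
    """
    best_epoch = 0
    best_error = float("inf")
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_error = err
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch


# Made-up held-out error curve: improves, then starts to rise (overfitting).
val_curve = [0.40, 0.31, 0.25, 0.22, 0.23, 0.24, 0.26, 0.21]
print(early_stopping_epoch(val_curve))  # prints 3
```

With patience=3, training stops at epoch 6 and keeps the weights from epoch 3. The later dip at epoch 7 is never seen, which is the usual trade-off when choosing a small patience value.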

Overfitting and underfitting: a practical strategy
Data split: training set, validation set, test set
Training error is high? Yes: bigger model, training longer, new architecture
Validation error is high? Yes: more data, regularization, new architecture
Otherwise: done
From Andrew Ng
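The strategy reads naturally as a small decision function. In this sketch, "high" training error is taken to mean well above a target such as human-level error, and "high" validation error to mean well above training error. Both readings, the function name next_step, and the tolerance value are assumptions, not part of the original slide.

```python
def next_step(train_error, val_error, target_error, tolerance=0.02):
    """Suggest what to try next, following the two checks in order."""
    # Check 1: is training error high compared with the target?
    if train_error > target_error + tolerance:
        return "underfitting: bigger model, training longer, new architecture"
    # Check 2: is validation error high compared with training error?
    if val_error > train_error + tolerance:
        return "overfitting: more data, regularization, new architecture"
    return "done"


# Made-up error rates, with a 5% target (e.g. human-level error).
print(next_step(0.15, 0.16, target_error=0.05))
print(next_step(0.05, 0.15, target_error=0.05))
print(next_step(0.05, 0.06, target_error=0.05))
```

The order of the checks matters: fixing underfitting comes first, because the validation gap is only meaningful once the model fits the training set.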

End-to-End Learning: output much more complex results, not just numbers
Object recognition: image -> numbers: 1, 2, …, 1000
Product review -> sentiment: positive (1) or negative (-1)
Image captioning: image -> sentence
Audio -> transcript
Medical image -> cancer; tumor detection, feature extraction and selection

End-to-End Learning
[Figure: medical image to cancer, either through deep nets or through tumor detection, segmentation, feature extraction and selection]
Q4: Is end-to-end learning good for all problems?
Dataset size

Q5: Is unsupervised learning the future of AI/deep learning?
Deep learning started with unsupervised learning
Exciting and difficult: learning simple and complex concepts
Expensive to collect labeled data
Weakly supervised: large amount of unlabeled data + small set of labeled data

Questions?
Min Xian, Assistant Professor
Department of Computer Science | UI-IF TAB 309
