Machine Learning and having it deep and structured


Machine Learning and having it deep and structured: Introduction

Outline: What is Machine Learning? / Deep Learning / Structured Learning

Some tasks are very complex. You know how to write programs, but one day you are asked to write a program for speech recognition. You might try to find the common patterns in waveforms of the same utterance, e.g. several recordings of 你好 ("hello"). When you try to make such rules precise, it seems hopeless: you quickly get lost in the exceptions and special cases. It seems impossible to write a program for speech recognition by hand.

Let the machine learn by itself. Give it a large amount of audio data, e.g. recordings of 你好 ("hello"), 大家好 ("hello everyone"), 人帥真好 ("it's great to be handsome"), and it learns how to do speech recognition: you say 你好 and it answers "You said 你好". You only have to write the program for learning.

Learning ≈ Looking for a Function
Speech recognition: f(audio) = "你好"
Handwriting recognition: f(image) = "2"
Weather forecast: f(weather today) = "sunny tomorrow"
Playing video games: f(positions and number of enemies) = "jump"

Types of Learning: Supervised Learning, Reinforcement Learning, Unsupervised Learning

Supervised Learning. Training data consists of labeled pairs (x, y), where x is the function input and y is the function output, e.g. an audio clip labeled "你好", or an image of a digit labeled "2". The model is a set of hypothesis functions. Training: pick the best function f* from the set. Testing: apply f* to new inputs such as 大家好. A minimal sketch follows.
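To make "training = picking the best function f*" concrete, here is a toy sketch. The hypothesis set f_w(x) = w * x, the candidate weights, and the data are all illustrative assumptions, not the course's actual setup.

```python
# Training picks, from a hypothesis set, the function that best fits
# the labeled training data; testing applies that function to new inputs.
training_data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # labeled pairs (x, y)
candidate_ws = [0.5, 1.0, 2.0, 3.0]                   # the hypothesis set

def loss(w):
    # How badly f_w(x) = w * x fits the labeled training data (squared error).
    return sum((w * x - y) ** 2 for x, y in training_data)

w_star = min(candidate_ws, key=loss)    # training: pick the best function f*
print(f"f*(x) = {w_star} * x")          # testing: apply f* to new inputs
print(f"f*(1.5) = {w_star * 1.5}")
```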

Reinforcement Learning. Example: dialogue system. The model is a set of hypothesis functions. Input x: "How are you?" One candidate gives f1(x) = "Good Bye", and the feedback is "Bad!". The training data has no labels; the machine only knows how good f(x) is.

Reinforcement Learning. Same dialogue example: another candidate gives f2(x) = "Fine.", and the feedback is "Good!". Training: pick the best function f*. Note that "good" is not the same as "correct". Playing video games is another typical reinforcement learning task.

Reinforcement Learning, summarized. Training data: no labels; the machine only knows how good f(x) is. Training: pick the best function f*. Testing: given a new input x' = "hello", the "best" function answers y' = "hi". A toy sketch of the idea follows.
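Here is a minimal sketch of the reward-driven idea on the dialogue example. The two candidate responses, the feedback rule, and the average-the-feedback selection are all toy assumptions for illustration; real dialogue training is far more involved.

```python
# No labels are ever given, only scalar feedback on how good each response was.
import random

random.seed(0)
candidates = ["Good Bye", "Fine."]           # the candidates f1 and f2 above
totals = {c: 0.0 for c in candidates}
counts = {c: 0 for c in candidates}

def feedback(response):
    # Stand-in for the human user: "Fine." tends to get good feedback,
    # "Good Bye" tends to get bad feedback, plus a little noise.
    return (1.0 if response == "Fine." else -1.0) + random.gauss(0, 0.1)

for step in range(100):                      # try responses, observe feedback
    response = candidates[step % len(candidates)]
    totals[response] += feedback(response)
    counts[response] += 1

# Training: pick the function whose responses received the best feedback.
f_star = max(candidates, key=lambda c: totals[c] / counts[c])
print('f*("How are you?") =', f_star)        # no "correct answer" was ever given
```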

Unsupervised Learning. Training data: no labels, e.g. lots of audio without any text annotation. What can the machine do with these data? One possibility is sketched below.
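One thing you can do with unlabeled data is cluster it. This sketch fakes the "audio" as toy 2-D points (a real system would first extract acoustic feature vectors) and runs k-means with k = 2, one simple choice among many.

```python
# The machine discovers two groups without ever being told which point is which.
import random

random.seed(0)
data = ([(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)] +
        [(random.gauss(5, 1), random.gauss(5, 1)) for _ in range(50)])

centers = [data[0], data[-1]]              # naive initialization
for _ in range(10):
    clusters = [[], []]
    for x, y in data:                      # assign each point to nearest center
        d = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centers]
        clusters[d.index(min(d))].append((x, y))
    for k, pts in enumerate(clusters):     # move each center to its cluster mean
        if pts:
            centers[k] = (sum(p[0] for p in pts) / len(pts),
                          sum(p[1] for p in pts) / len(pts))

print("discovered cluster centers:", centers)
```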

What is Machine Learning? Outline What is Machine Learning? Deep Learning Structured Learning

Inspired by the human brain

Human Brains are Deep
Speaker notes: In the 2012 Google/Stanford paper "Building High-level Features Using Large Scale Unsupervised Learning", they achieved a 70% improvement in cat-detection technology :) http://static.googleusercontent.com/media/research.google.com/zh-TW//archive/unsupervised_icml2012.pdf
Google cats: https://www.youtube.com/watch?v=-rIb_Meiylw
http://www.nytimes.com/video/technology/personaltech/100000003519478/appsmart-modernize-your-meetings.html?playlistId=1194811622271
Metaphysical aside: looking at some of the research papers and seeing the "master neuron" images of cats and faces, which were not any one cat or one face, I was struck by the parallels to Plato's Theory of Forms (Platonic realism is the philosophical term usually used for realism regarding the existence of universals or abstract objects).

A Neuron for a Machine. Each neuron is a function: it computes a weighted sum of its inputs, adds a bias b, and passes the result through an activation function, e.g. the sigmoid function σ(z) = 1 / (1 + e^(−z)), giving a = σ(w · x + b).
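A minimal sketch of a single neuron as a function, directly following the formula above. The weights and bias are arbitrary illustrative values.

```python
# One neuron: weighted sum of inputs, plus a bias, through the sigmoid.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

print(neuron([1.0, -2.0], weights=[0.5, 0.3], bias=0.1))  # a = sigma(w . x + b)
```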

Deep Learning. A neural network cascades neurons into layers: input layer, hidden layers (Layer 1, Layer 2, ...), output layer. Deep learning refers to neural networks with many hidden layers. Each neuron receives its inputs either from the network input or from neurons in the previous hidden layer, and sends its output to the next hidden layer or to the output layer. Given the input, the output is computed layer by layer, as sketched below.
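A minimal sketch of the layer-by-layer computation in matrix form. The layer sizes and the randomly drawn parameters are illustrative assumptions only.

```python
# Each layer applies a weight matrix and bias to the previous layer's
# output, then the sigmoid activation, cascading from input to output.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
layer_sizes = [3, 4, 4, 2]                 # input, two hidden layers, output
params = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(layer_sizes, layer_sizes[1:])]

def forward(x):
    a = x
    for W, b in params:                    # layer by layer, input to output
        a = sigmoid(W @ a + b)
    return a

print(forward(np.array([1.0, 0.5, -1.0])))  # the network's output
```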

Deep Learning. Universality Theorem: any continuous function f can be realized by a network with one hidden layer (given enough hidden neurons). So why go deep? "I am not very surprised; give me enough neurons and I can do it too." The point is that a deep structure can realize the same function in a simpler way, with fewer neurons and fewer parameters, than a shallow structure. Reference: http://neuralnetworksanddeeplearning.com/chap4.html
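Stated a bit more formally, using a standard textbook formulation rather than the slide's own wording:

```latex
% Universality theorem (sketch): one hidden layer suffices to approximate
% any continuous function on a compact domain to any accuracy, provided
% enough hidden neurons are available.
\forall f \in C\bigl([0,1]^d\bigr),\;
\forall \varepsilon > 0,\;
\exists \text{ a one-hidden-layer network } N :\;
\sup_{x \in [0,1]^d} \bigl|\, f(x) - N(x) \,\bigr| < \varepsilon .
```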

Popular: 2006 -> better initialization (layer-wise pre-training); 2009 -> GPUs; 2011 -> breakthroughs in speech recognition; 2012 -> Google Brain (covered by the New York Times) and the ImageNet image competition.

Powerful: speech recognition (TIMIT): HW1 + HW2. (Deep neural networks on TIMIT usually use 4 to 8 layers.)

Three misunderstandings about Deep Learning 1. Deep learning works because the model is more “complex”

Misunderstanding: deep is simply more "complex", i.e. deep works better only because it uses more parameters. (Figure: a shallow network next to a deep network.)

Fat + short vs. thin + tall: which one is better? If a function can be realized by a deep structure, realizing the same function with a shallower structure is more difficult; it needs more neurons and thus more parameters. In other words, the deep network is the simpler model, and since the model is simpler, less training data is needed.

Deep Learning - Why? Toy example: sample 100,000 points as training data; the target function maps each input to 0 or 1.

Deep Learning - Why? Toy example, continued: compare networks with 1 hidden layer of 125, 500, or 2500 neurons against networks with 3 hidden layers. How many neurons are needed in each of the 3 hidden layers to achieve comparable performance? Less than 25? 25~50? 50~100? 100~200? A parameter-counting sketch follows.
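To see concretely how the counting goes, here is a sketch of counting parameters in a fully connected network. The 20-dimensional input and the particular widths are assumptions for illustration, not the numbers from the slide's experiment.

```python
# Each layer contributes (fan_in * fan_out) weights plus fan_out biases.
def num_params(layer_sizes):
    return sum(n * m + m for n, m in zip(layer_sizes, layer_sizes[1:]))

shallow = [20, 500, 1]          # 1 hidden layer with 500 neurons
deep = [20, 50, 50, 50, 1]      # 3 hidden layers with 50 neurons each

print("shallow:", num_params(shallow))  # 20*500+500 + 500*1+1 = 11001
print("deep:   ", num_params(deep))     # 1050 + 2550 + 2550 + 51 = 6201
```

With these illustrative sizes, the thin + tall network uses roughly half the parameters of the fat + short one, matching the qualitative claim on the slide.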

Deep Learning - Why? Experiments on handwritten digit classification. Deeper: using fewer parameters to achieve the same performance.

Three misunderstandings about Deep Learning 2. When you are using deep learning, you need more training data.

Size of Training Data: different numbers of training examples (100,000 / 50,000 / 20,000), comparing 1 hidden layer against 3 hidden layers.

Size of Training Data: experiments on handwritten digit classification. Deeper: using less training data to achieve the same performance.

Three misunderstandings about Deep Learning 3. You can simply get the power of deep by cascading the neurons.

Hard to get the power of deep. Can I get all the power of deep learning from this course? No; researchers still do not understand all the mysteries of deep learning.

Outline: What is Machine Learning? / Deep Learning / Structured Learning

In the real world, both X (the input domain) and Y (the output domain) can be structured objects: sequences, graph structures, tree structures, and so on, just to name a few. The following slides take human language processing and image processing as examples.

Retrieval: input is a keyword, e.g. "Machine learning"; output is a list of web pages (the search result).

Translation: input is "Machine learning and having it deep and structured" (one kind of sequence); output is "機器學習及其深層與結構化", the course title in Chinese (another kind of sequence).

Speech Recognition: input is an audio signal (one kind of sequence); output is the transcription, e.g. "大家好,歡迎大家來修機器學習及其深層與結構化" ("Hello everyone, welcome to Machine Learning and Having It Deep and Structured"), another kind of sequence. A classic approach is the HMM.

Speech Summarization: given recorded lectures, select the most informative segments to form a compact version, i.e. a summary. This can be cast as learning to rank.

Object Detection: input is an image; output is the positions of the objects in it, e.g. boxes labeled "Haruhi" and "Mikuru".

Image Segmentation: input is an image; output is a pixel-level labeling of its foreground. http://msr-waypoint.com/en-us/um/people/pkohli/papers/skh_eccv08.pdf Source of images: Nowozin, Sebastian, and Christoph H. Lampert. "Structured learning and prediction in computer vision." Foundations and Trends® in Computer Graphics and Vision 6.3–4 (2011): p. 57.

Remote Sensing: input is a remote (aerial or satellite) image; output is a ground survey map. Source of images: Nowozin, Sebastian, and Christoph H. Lampert. "Structured learning and prediction in computer vision." Foundations and Trends® in Computer Graphics and Vision 6.3–4 (2011): p. 146.

Pose Estimation: input is an image; output is the estimated human pose. What could the applications be? Source of images: http://groups.inf.ed.ac.uk/calvin/Publications/eichner-techreport10.pdf

Structured Learning. The tasks above were developed separately in the past. Recently, people have realized that there is a unified framework behind these approaches, consisting of three steps: evaluation, inference, and learning. Hopefully this new view helps us understand all of these approaches; a sketch follows.
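A sketch of the unified framework in symbols. This F(x, y)/argmax formulation follows the tutorials listed in the references; the slide itself only names the three steps.

```latex
% Evaluation: a compatibility function scores each (input, output) pair.
F : X \times Y \rightarrow \mathbb{R}
% Inference: given an input x, search the output domain for the best y.
\tilde{y} = \arg\max_{y \in Y} F(x, y)
% Learning: use the training pairs (x^i, \hat{y}^i) to find an F under
% which each correct output \hat{y}^i scores higher than the alternatives.
```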

Concluding Remarks: What is Machine Learning? / Deep Learning / Structured Learning

Reference
Deep Learning:
Neural Networks and Deep Learning: http://neuralnetworksanddeeplearning.com/
For more information: http://deeplearning.net/
Structured Learning:
Structured Learning and Prediction in Computer Vision: http://www.nowozin.net/sebastian/papers/nowozin2011structured-tutorial.pdf
Linguistic Structure Prediction: http://www.cs.cmu.edu/afs/cs/Web/People/nasmith/LSP/PUBLISHED-frontmatter.pdf

Thank you!

Powerful: inspired by the human brain. (Figure: retina -> visual cortex, compared with Layer 1 -> Layer 2 -> ... -> Layer L of a deep network; the learned features progress from pixels to edges to primitive shapes.)
http://techtalks.tv/talks/machine-learning-and-ai-via-brain-simulations/57862/
http://blog.csdn.net/visionhack/article/details/10229657
Speaker notes: In each hemisphere of our brain, humans have a primary visual cortex, also known as V1, containing 140 million neurons with tens of billions of connections between them. And yet human vision involves not just V1 but an entire series of visual cortices (V2, V3, V4, and V5) doing progressively more complex image processing. We carry in our heads a supercomputer, tuned by evolution over hundreds of millions of years, and superbly adapted to understand the visual world. Recognizing handwritten digits isn't easy; rather, we humans are stupendously, astoundingly good at making sense of what our eyes show us. But nearly all that work is done unconsciously, so we don't usually appreciate how tough a problem our visual systems solve.

Powerful: Image Recognition. (Figure: visualizations of the features learned in the 1st, 2nd, and 3rd hidden layers of a convolutional network.)
http://arxiv.org/pdf/1311.2901v3.pdf
Reference: Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision - ECCV 2014 (pp. 818-833).