Intern Report: 北邮 张安迪

Tasks
- Deep Learning by Bengio
- TensorFlow web docs
- One TensorFlow example
- Jeff Dean's talk at NIPS

Basic Theories of Deep Learning
Feedforward networks
Goal: approximate some function $f^*$. For a classifier, $y = f^*(x)$; in general, $y = f(x; \theta)$.

Basic Theories of Deep Learning
Feedforward networks
Training: gradient descent; in practice, stochastic gradient descent, often with momentum.
Difference from linear models: the cost function is non-convex. Solution: initialize $w$ and $b$ to small random values.
Cost function: cross-entropy $H(p, q) = -\sum_x p(x) \log q(x)$, i.e. the negative log-likelihood.
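A minimal NumPy sketch of an SGD-with-momentum parameter update; the function name, dictionary layout and hyper-parameter values are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def sgd_momentum_step(params, grads, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update on a minibatch gradient:
    v <- momentum * v - lr * grad ;  w <- w + v."""
    for k in params:
        velocity[k] = momentum * velocity[k] - lr * grads[k]
        params[k] = params[k] + velocity[k]
    return params, velocity

# Usage sketch: params, grads and velocity are dicts of NumPy arrays, e.g.
# params = {"W": np.zeros((10, 5)), "b": np.zeros(10)}
```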

Basic Theories of Deep Learning
Feedforward networks
Regularized cost function: cross-entropy plus a weight penalty, $H(p, q) = -\sum_x p(x) \log q(x) + \alpha \Omega(\theta)$.
Regularization:
$L^2$: $\Omega(\theta) = \frac{1}{2}\|w\|_2^2 = \frac{1}{2}\sum_i w_i^2$
$L^1$: $\Omega(\theta) = \|w\|_1 = \sum_i |w_i|$
Data augmentation (fake data, noise)
Early stopping
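The two penalties as code, assuming a flat NumPy weight vector `w` (the helper names are illustrative):

```python
import numpy as np

def l2_penalty(w):
    # L2: Omega(theta) = 1/2 * ||w||_2^2 = 1/2 * sum_i w_i^2
    return 0.5 * np.sum(w ** 2)

def l1_penalty(w):
    # L1: Omega(theta) = ||w||_1 = sum_i |w_i|
    return np.sum(np.abs(w))

# Regularized cost: total = cross_entropy + alpha * l2_penalty(w)   (or l1_penalty)
```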

Basic Theories of Deep Learning
Feedforward networks
Hidden units: ReLU, $h = g(W^{\mathsf{T}} x + b)$ with $g(z) = \max\{0, z\}$.
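A one-line sketch of such a hidden layer in NumPy (the shapes and names are assumptions):

```python
import numpy as np

def relu_layer(x, W, b):
    # h = g(W^T x + b) with g(z) = max{0, z}
    # W: (n_in, n_hidden), b: (n_hidden,), x: (n_in,)
    return np.maximum(0.0, W.T @ x + b)
```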

Basic Theories of Deep Learning
Feedforward networks
Output units:
Linear units for Gaussian output distributions
Sigmoid units for Bernoulli output distributions
Softmax units for multinoulli output distributions
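Sketches of the sigmoid and softmax output units in NumPy; the max-shift inside softmax is a standard numerical-stability trick, not something stated on the slide:

```python
import numpy as np

def sigmoid(z):
    # Bernoulli output: P(y = 1 | x)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Multinoulli output: a probability distribution over classes.
    z = z - np.max(z)          # shift for numerical stability
    e = np.exp(z)
    return e / np.sum(e)
```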

Basic Theories of Deep Learning
Feedforward networks
Back-propagation: a method for computing the gradient of the cost with respect to the parameters.
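A small sketch of back-propagation for a one-hidden-layer ReLU network with a softmax cross-entropy output, in NumPy; this is a generic illustration of the chain rule, not the specific network from the slides:

```python
import numpy as np

def forward_backward(x, y, W1, b1, W2, b2):
    """Loss and gradients for a one-hidden-layer ReLU network, class label y."""
    # forward pass
    z1 = W1 @ x + b1
    h = np.maximum(0.0, z1)            # ReLU hidden layer
    z2 = W2 @ h + b2
    p = np.exp(z2 - z2.max())
    p = p / p.sum()                    # softmax output
    loss = -np.log(p[y])               # negative log-likelihood of the true class
    # backward pass (chain rule, layer by layer)
    dz2 = p.copy(); dz2[y] -= 1.0      # d loss / d z2 for softmax cross-entropy
    dW2, db2 = np.outer(dz2, h), dz2
    dh = W2.T @ dz2
    dz1 = dh * (z1 > 0)                # ReLU gradient
    dW1, db1 = np.outer(dz1, x), dz1
    return loss, (dW1, db1, dW2, db2)
```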

Basic Theories of Deep Learning
2. Convolutional networks: neural networks that use convolution in place of general matrix multiplication.
1-D: $s(t) = (x * w)(t) = \sum_{a=-\infty}^{\infty} x(a)\, w(t - a)$
2-D: $S(i, j) = (I * K)(i, j) = \sum_m \sum_n I(i + m, j + n)\, K(m, n)$
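A direct NumPy transcription of the 2-D formula (valid padding, stride 1); this cross-correlation form is the one deep-learning libraries usually implement:

```python
import numpy as np

def conv2d(I, K):
    # S(i, j) = sum_m sum_n I(i+m, j+n) * K(m, n)
    ih, iw = I.shape
    kh, kw = K.shape
    S = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(S.shape[0]):
        for j in range(S.shape[1]):
            S[i, j] = np.sum(I[i:i + kh, j:j + kw] * K)
    return S
```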

Basic Theories of Deep Learning
Convolutional networks
Ways convolution improves a machine learning system:
Sparse interactions
Parameter sharing
Equivariant representations

Basic Theories of Deep Learning
Convolutional networks
Pooling:
Makes the representation invariant to small translations of the input; useful when we care more about whether a feature exists than exactly where it is.
Improves the computational efficiency of the network (and its memory requirements, etc.).
Essential for handling inputs of varying size: adjust the pooling stride.
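A sketch of non-overlapping 2x2 max pooling in NumPy; the window size and stride are illustrative choices:

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    # Non-overlapping max pooling: small shifts of the input barely change the output.
    h, w = x.shape
    oh, ow = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out
```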

Basic Theories of Deep Learning
Convolutional networks
Problem: without padding, the representation shrinks at every convolutional layer, so the network size shrinks too fast. Solution: zero padding.
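For example, "same" zero padding for an odd kernel size k, sketched for a single-channel NumPy input (an assumption for illustration):

```python
import numpy as np

def zero_pad_same(x, k):
    # Pad the input so a k x k kernel (odd k) preserves the spatial size.
    p = (k - 1) // 2
    return np.pad(x, ((p, p), (p, p)), mode="constant")
```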

Basic Theories of Deep Learning
Recurrent networks (RNN): a family of networks for processing sequential data.
$h^{(t)} = f(h^{(t-1)}, x^{(t)}; \theta)$, with the same $f$ and the same $\theta$ at every time step $t$.

Basic Theories of Deep Learning
Recurrent networks
Three design patterns:
Produce an output at each time step, with recurrent connections between hidden units.
Produce an output at each time step, with recurrent connections only from the output to the hidden units (teacher forcing: lacks information about the past, but is easy to train).
Produce a single output, with recurrent connections between hidden units.

Basic Theories of Deep Learning
Recurrent networks
$a^{(t)} = b + W h^{(t-1)} + U x^{(t)}$
$h^{(t)} = \mathrm{sigmoid}(a^{(t)})$
$o^{(t)} = c + V h^{(t)}$
$y^{(t)} = \mathrm{softmax}(o^{(t)})$
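The same four equations as a NumPy step function; the matrix shapes and names are assumptions:

```python
import numpy as np

def rnn_step(h_prev, x_t, U, W, V, b, c):
    a_t = b + W @ h_prev + U @ x_t        # a(t) = b + W h(t-1) + U x(t)
    h_t = 1.0 / (1.0 + np.exp(-a_t))      # h(t) = sigmoid(a(t))
    o_t = c + V @ h_t                     # o(t) = c + V h(t)
    e = np.exp(o_t - o_t.max())
    y_t = e / e.sum()                     # y(t) = softmax(o(t))
    return h_t, y_t                       # the same U, W, V, b, c are reused at every step
```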

Basic Theories of Deep Learning
Recurrent networks
Training: back-propagation through time (BPTT), i.e. back-propagation applied to the unrolled computational graph.

Basic Theories of Deep Learning
Recurrent networks
Useful models:
(1) Encoder-decoder (sequence-to-sequence) architectures: input -> encoder -> context C -> decoder -> output
(2) Recursive neural networks: depth reduced from $\tau$ to $O(\log \tau)$
(3) Long short-term memory (LSTM): a gated RNN (sketched below)
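A sketch of a single LSTM step in NumPy, assuming the common forget/input/output-gate formulation; the slide only names the model, so the details below are standard rather than taken from it:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(h_prev, c_prev, x_t, Wf, Wi, Wo, Wg, bf, bi, bo, bg):
    # Gates decide what the cell state forgets, what it writes, and what it exposes.
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(Wf @ z + bf)      # forget gate
    i = sigmoid(Wi @ z + bi)      # input gate
    o = sigmoid(Wo @ z + bo)      # output gate
    g = np.tanh(Wg @ z + bg)      # candidate update
    c_t = f * c_prev + i * g      # gated cell state
    h_t = o * np.tanh(c_t)        # new hidden state
    return h_t, c_t
```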

II. A simple model using TensorFlow
Convolutional network on MNIST handwritten digits
Training set: 60,000 images
Test set: 10,000 images
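A minimal tf.keras sketch of a small CNN on MNIST; the layer sizes, optimizer and training settings are assumptions for illustration and not the exact model from the slides:

```python
import tensorflow as tf

# Load MNIST: 60,000 training and 10,000 test images of handwritten digits.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0
x_test = x_test[..., None].astype("float32") / 255.0

# Two conv + max-pool blocks, then a fully connected layer and a softmax output.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 5, activation="relu", padding="same",
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(64, 5, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1, batch_size=64,
          validation_data=(x_test, y_test))
```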
