Learning Deep L0 Encoders


1 Learning Deep L0 Encoders
Qing Ling, Department of Automation, University of Science and Technology of China (USTC)
Joint work with Zhangyang Wang and Thomas Huang (UIUC)
The 2016 AAAI Conference on Artificial Intelligence (AAAI 2016)
The 2015 Youth Symposium of Scientific and Engineering Computation (YSSEC 2015)
2015/12/11

2 Starter: A Joke about Deep Learning
The way to do machine learning research 5 years ago:
Collect Data → Analyze Data → Design Feature → Build Model → Verify Model → Optimize Model → Evaluate Model
The way to do machine learning research now:
Collect Data → Tune Network → Collect Data → Tune Network → Collect Data → Tune Network → ...

3 Theme of This Talk
Behind the success of deep learning are difficulties in
Structure design & network initialization & parameter tuning
Incorporation of problem-level priors & interpretation
From engineering (or art) to science
Statistical bounds
Convergence analysis
Bridging deep (big data) & shallow (small data) models
Our goal: a connection between deep learning & sparse coding

4 Outline
A brief introduction to deep learning
Connection between deep learning & sparse coding
Deep L0-regularized encoder
Deep M-sparse L0 encoder
Numerical experiments
Conclusions

5 Learning Deep Representations/Features
Example: a feed-forward network
Train with BIG input & output; inference maps input to output
In the training stage:
Learn nonlinear features Fi (linear weight + nonlinear neuron)
Optimize with stochastic (sub)gradient descent
In the inference stage:
Transform the input with the learned features Fi
Fast end-to-end inference
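As a rough illustration of the feed-forward view on this slide (a minimal NumPy sketch, not the network from the paper; the ReLU neuron and the layer sizes are assumptions):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def feed_forward(x, weights):
    """One end-to-end inference pass: linear weight + nonlinear neuron per layer."""
    h = x
    for W in weights:
        h = relu(W @ h)          # learned feature F_i applied to the current activation
    return h

# Toy usage: random weights stand in for parameters learned by stochastic (sub)gradient descent.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((64, 128)), rng.standard_normal((32, 64))]
x = rng.standard_normal(128)
y = feed_forward(x, weights)     # fast inference from input to output
```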

6 Power of Being Deep: Example of ILSVRC
Human error rate: 5.1%
[Figure: ILSVRC error rates over the years; slide labels: Data, Algorithm, System]

7 Sparse Coding Revisited
Training in sparse coding:
Given (X, Y), where Y is a sparse representation of X
Learn a dictionary D such that X = DY by some approach
Inference in sparse coding:
Y = argmin_Y ||X - DY||_2^2 + r(Y)
Regularization r(Y): enforces sparsity of Y
Viewing sparse coding from the perspective of deep learning:
Training & inference are done over different architectures
Inference uses an iterative algorithm that is often slow
Not end-to-end (classification, etc.)
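For concreteness, a minimal sketch of iterative inference under one common choice of regularizer, r(Y) = lam * ||Y||_1, solved with ISTA-style proximal gradient steps (the L1 choice, step size, and problem sizes are assumptions; the slides later specialize to L0):

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_inference(X, D, lam=0.1, n_iter=100):
    """Iterative inference for Y = argmin_Y 0.5*||X - D Y||_2^2 + lam*||Y||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    Y = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ Y - X)           # gradient of the least-squares term
        Y = soft_threshold(Y - grad / L, lam / L)
    return Y

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
X = rng.standard_normal(64)
Y = ista_inference(X, D)                   # many iterations per input: slow, not end-to-end
```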

8 Connect Sparse Coding & Deep Learning
Idea: truncate the iterative algorithm for training & inference
Training & inference are done in the same architecture
Fast & end-to-end inference (add a new operator/neuron)
[Figure: iterative update with operators O1, O2, O3, unfolded & truncated up to the second iteration]
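Schematically, the unfolding idea can be sketched as follows: a fixed-point iteration Y ← O3(O1 X + O2 Y), truncated to a small number of steps, becomes a feed-forward pass whose operators O1, O2 and neuron O3 can then be trained end-to-end (the operator names, sizes, and the tanh placeholder nonlinearity below are assumptions):

```python
import numpy as np

def unfolded_encoder(X, O1, O2, neuron, n_unfold=2):
    """Truncated iteration Y_{k+1} = neuron(O1 @ X + O2 @ Y_k), viewed as a small network."""
    B = O1 @ X                         # input branch, computed once
    Y = neuron(B)                      # first unfolded step = first layer
    for _ in range(n_unfold - 1):
        Y = neuron(B + O2 @ Y)         # each further unfolded step = one more layer
    return Y

# Toy usage with random placeholder operators and a generic nonlinearity.
rng = np.random.default_rng(0)
O1 = rng.standard_normal((16, 8))
O2 = rng.standard_normal((16, 16))
Y = unfolded_encoder(rng.standard_normal(8), O1, O2, np.tanh, n_unfold=2)
```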

9 Case Study: Deep L0-Regularized Encoder
L0-regularized least squares: Y = argmin_Y ||X - DY||_2^2 + c^2 ||Y||_0
Iterative hard thresholding (IHT): Y^{k+1} = h_c(D^T X + (I - D^T D) Y^k) = h_c(D^T X + W Y^k)
[Figure: one IHT iteration as a recurrent block built from D^T, W and the neuron h_c]
Trained as a deep network; fast & end-to-end inference
[Figure: IHT unfolded into a feed-forward network: D^T, then alternating h_c and W layers]
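A minimal NumPy sketch of the truncated IHT encoder above (illustrative only; in the trained model D^T, W and the threshold c become network parameters learned from data rather than being fixed as below):

```python
import numpy as np

def hard_threshold(z, c):
    """h_c: keep entries whose magnitude exceeds c, zero out the rest."""
    return np.where(np.abs(z) > c, z, 0.0)

def deep_l0_encoder(X, D, c, n_layers=3):
    """Truncated IHT for Y = argmin_Y ||X - D Y||_2^2 + c^2 ||Y||_0,
    unfolded into n_layers feed-forward stages."""
    B = D.T @ X                               # input branch D^T X, computed once
    W = np.eye(D.shape[1]) - D.T @ D          # W = I - D^T D
    Y = hard_threshold(B, c)                  # first layer
    for _ in range(n_layers - 1):
        Y = hard_threshold(B + W @ Y, c)      # each layer is one unfolded IHT step
    return Y

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256)) / np.sqrt(64)   # roughly normalized dictionary
X = rng.standard_normal(64)
Y = deep_l0_encoder(X, D, c=0.5)
```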

10 HELU: A New Nonlinear Neuron
h_c: tolerates large values, strongly penalizes small values
HELU: compared with the logistic, sigmoid & ReLU neurons
Discontinuous & hard to train with stochastic (sub)gradients
[Figure: HELU and its smoothed variant HELU_delta]
HELU_delta: approaches HELU as delta goes to 0; delta is adjusted dynamically during training
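The exact HELU_delta parametrization is given in the AAAI 2016 paper; the sketch below uses one plausible piecewise-linear surrogate (the ramp form is an assumption) purely to illustrate a continuous approximation that tightens to h_c as delta shrinks during training:

```python
import numpy as np

def helu(z, c=1.0):
    """Hard-thresholding neuron: pass large values through, zero out small ones."""
    return np.where(np.abs(z) > c, z, 0.0)

def helu_delta(z, c=1.0, delta=0.1):
    """Continuous surrogate (assumed form): zero below c - delta, identity above c,
    and a linear ramp in between; approaches helu() as delta -> 0."""
    a = np.abs(z)
    ramp = np.sign(z) * c * (a - (c - delta)) / delta   # rises from 0 to +/- c on the ramp
    return np.where(a <= c - delta, 0.0, np.where(a >= c, z, ramp))
```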

11 Case Study: Deep M-Sparse L0 Encoder
M-sparse constrained least squares: Y = argmin_Y ||X - DY||_2^2, s.t. ||Y||_0 ≤ M
Projected gradient descent (PGD): Y^{k+1} = p_M(D^T X + (I - D^T D) Y^k) = p_M(D^T X + W Y^k)
[Figure: one PGD iteration as a recurrent block built from D^T, W and the projection p_M]
Similar training & inference as in the deep L0-regularized encoder
[Figure: PGD unfolded into a feed-forward network: D^T, then alternating p_M and W layers]
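A matching sketch for the M-sparse case: the projection p_M keeps the top-M magnitudes, and the truncated PGD steps mirror the deep L0-regularized encoder (again illustrative; the trained network learns D^T and W):

```python
import numpy as np

def project_m_sparse(z, M):
    """p_M: keep the M entries with largest absolute value, zero out the rest."""
    out = np.zeros_like(z)
    idx = np.argsort(np.abs(z))[-M:]      # indices of the top-M magnitudes
    out[idx] = z[idx]
    return out

def deep_m_sparse_encoder(X, D, M, n_layers=3):
    """Truncated PGD for Y = argmin_Y ||X - D Y||_2^2  s.t.  ||Y||_0 <= M."""
    B = D.T @ X
    W = np.eye(D.shape[1]) - D.T @ D
    Y = project_m_sparse(B, M)
    for _ in range(n_layers - 1):
        Y = project_m_sparse(B + W @ Y, M)   # each layer is one unfolded PGD step
    return Y
```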

12 Interpreting Max-M Pooling/Unpooling
p_M: keeps the coefficients with the top-M largest absolute values
This is exactly the well-known max-M pooling/unpooling operator
Explains its success in deep learning: it produces a sparse representation
[Figure: max-2 pooling followed by unpooling on a toy vector]
Comparing max-M pooling/unpooling & HELU:
Different sparsification approaches
Sparsity level: exact (max-M) or trained through samples (HELU)
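A toy sketch of the equivalence stated above: max-M pooling records the top-M coefficients and their positions, unpooling scatters them back, and the composition reproduces p_M (function names here are illustrative):

```python
import numpy as np

def max_m_pool(z, M):
    """Keep the values and positions of the top-M magnitudes (pooling)."""
    idx = np.argsort(np.abs(z))[-M:]
    return z[idx], idx

def max_m_unpool(values, idx, size):
    """Scatter the pooled values back to their positions, zeros elsewhere (unpooling)."""
    out = np.zeros(size)
    out[idx] = values
    return out

z = np.array([0.2, -1.5, 0.1, 3.0, -0.4])
vals, idx = max_m_pool(z, M=2)                 # keeps -1.5 and 3.0
z_sparse = max_m_unpool(vals, idx, z.size)     # [0, -1.5, 0, 3.0, 0] == p_M(z)
```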

13 Implementation Issues
Use (small-scale) sparse coding to initialize deep learning
Simplifies the initialization of deep learning
Makes sparse coding scalable
Training & test data follow the same distribution: no magic here

14 Numerical Experiments
MNIST dataset: 60,000 training & 10,000 test images
Both encoders outperform iterative sparse coding & existing deep networks
The L0-regularized encoder learns the regularization parameter from data
The M-sparse L0 encoder incorporates a prior on the sparsity level M

15 Concluding Remarks
Bridge deep learning & sparse coding:
Explain & exploit the structure design of deep learning
Incorporate problem-level priors & interpret neurons
Provide an effective initialization strategy for deep learning
Toward a general coding scheme covering sparse & nonsparse models?
Deep L1 encoder (LeCun et al., 2012)
Learning sparse & low-rank models (Sprechmann et al., 2015)
Laplacian regularization (Wang et al., 2015)
Design & explanation of more general encoders

16 Thank you for your attention
Qing Ling, Department of Automation, University of Science and Technology of China

