Robust Optimization and Applications in Machine Learning.

Presentation transcript:

Robust Optimization and Applications in Machine Learning

Part 4: Sparsity in Unsupervised Learning

Unsupervised learning

Sparse PCA: outline

Principal Component Analysis

PCA for visualization
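Only the slide title survives in this transcript; as an illustration of the idea, here is a minimal NumPy sketch that projects data onto the two leading eigenvectors of the sample covariance to obtain 2-D coordinates for plotting (the function name and data are my own, not from the slides):

    import numpy as np

    def pca_2d(X):
        """Project the rows of X onto the two leading principal components."""
        Xc = X - X.mean(axis=0)                   # center the data
        Sigma = np.cov(Xc, rowvar=False)          # sample covariance matrix
        eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigenvalues in ascending order
        V = eigvecs[:, ::-1][:, :2]               # two leading eigenvectors
        return Xc @ V                             # 2-D coordinates for plotting

    # Usage with random data standing in for a real dataset
    X = np.random.randn(200, 10)
    Y = pca_2d(X)                                 # shape (200, 2)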

First principal component
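The standard variational characterization, presumably what this slide states (with Sigma the covariance matrix):

    \max_x \; x^\top \Sigma x \quad \text{subject to} \quad \|x\|_2 = 1,

whose maximizer is the leading eigenvector of \Sigma and whose optimal value is \lambda_{\max}(\Sigma).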

Sparse PCA: outline

Why sparse factors?

PCA: rank-one case

Sparse PCA: rank-one case
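A common way to pose the rank-one sparse PCA problem, which I assume is the formulation on this slide, adds a cardinality constraint to the variance-maximization problem:

    \max_x \; x^\top \Sigma x \quad \text{subject to} \quad \|x\|_2 = 1, \;\; \mathbf{card}(x) \le k,

where card(x) counts the nonzero entries of x. The constraint makes the problem non-convex and NP-hard in general, which motivates the relaxation below.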

Sparse PCA: outline

SDP relaxation
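The semidefinite relaxation used in the DSPCA line of work (d'Aspremont, El Ghaoui, Jordan, Lanckriet), which this slide presumably presents, lifts x to the matrix X = x x^\top and replaces the rank and cardinality constraints by convex ones:

    \max_X \; \mathbf{Tr}(\Sigma X) \quad \text{subject to} \quad \mathbf{Tr}(X) = 1, \;\; \mathbf{1}^\top |X| \mathbf{1} \le k, \;\; X \succeq 0,

where |X| denotes the entrywise absolute value. The l1-type constraint is a convex surrogate for the cardinality constraint, via card(x) <= k  =>  ||x||_1 <= sqrt(k) ||x||_2.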

Dual problem

Sparsity and robustness
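A likely reading of this slide, consistent with the robust-optimization theme of the tutorial (the identity below is standard, though I am inferring that it is the one shown): the l1-penalized variance problem is a worst-case variance problem under componentwise-bounded uncertainty in the covariance,

    \max_{\|x\|_2 = 1} \; \min_{|U_{ij}| \le \rho} \; x^\top (\Sigma + U)\, x
      \;=\; \max_{\|x\|_2 = 1} \; x^\top \Sigma x - \rho \|x\|_1^2,

since the worst-case perturbation gives min over U of x^T U x = -rho ||x||_1^2. Sparsity of the optimal x thus emerges from robustness to uncertainty in Sigma.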

Sparse PCA decomposition?

Sparse PCA: outline

First-order algorithm

Sparse PCA: outline

PITPROPS data

PITPROPS data: numerical results

Financial example

Covariance matrix

Second factor

Gene expression data

Clustering of gene expression data

Conclusions on sparse PCA

Part 4: Sparsity in Unsupervised Learning

Sparse Gaussian networks: outline

Gaussian network problem

Correlation-based approach

Approach based on the precision matrix

Example

Relevance network vs. graphical model

Can we check this?

Sparse inverse covariance and conditional independence
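The standard fact behind this slide (assumed): for a Gaussian vector x with covariance Sigma, zeros in the precision matrix encode conditional independence,

    (\Sigma^{-1})_{ij} = 0 \;\Longleftrightarrow\; x_i \perp x_j \mid \{x_k : k \ne i, j\},

so a sparse estimate of the inverse covariance directly yields the structure of the Gaussian graphical model.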

Related work

Maximum-likelihood estimation
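With S the sample covariance and X the precision (inverse covariance) matrix, the Gaussian maximum-likelihood problem is, up to constants (standard, and presumably the slide's formulation):

    \max_{X \succ 0} \; \log\det X - \mathbf{Tr}(S X),

whose solution is X = S^{-1} whenever S is invertible; when the number of samples is small relative to the dimension, S is singular or ill-conditioned, which is the difficulty taken up on the next slide.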

Problems with ordinary MLE

MLE with cardinality penalty

Convex relaxation
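Replacing the cardinality penalty by an l1 penalty gives the convex (concave-maximization) program of Banerjee, El Ghaoui and d'Aspremont, which I take to be the problem on this slide:

    \max_{X \succ 0} \; \log\det X - \mathbf{Tr}(S X) - \lambda \sum_{i,j} |X_{ij}|,

a penalized maximum-likelihood estimate of the precision matrix; larger lambda yields a sparser estimated graph.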

Link with robustness

Properties of estimate

Algorithms: challenges

First- vs. second-order algorithms

Black- vs. grey-box first-order algorithms

Algorithms: problem structure

Nesterov’s smooth minimization algorithm

Nesterov’s method

Putting the problem in Nesterov’s format

Making the problem smooth

Optimal scheme for smooth minimization
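As a generic illustration of what "optimal scheme" means here, the following sketch implements the classical accelerated gradient updates, which attain Nesterov's O(1/k^2) rate for smooth convex minimization (an illustrative implementation with made-up example data, not the tutorial's exact scheme):

    import numpy as np

    def accelerated_gradient(grad, x0, L, iters=500):
        """Minimize a smooth convex f, given grad f and a Lipschitz constant L."""
        x = y = np.asarray(x0, dtype=float)
        t = 1.0
        for _ in range(iters):
            x_next = y - grad(y) / L                          # gradient step at the extrapolated point
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t**2)) / 2.0
            y = x_next + ((t - 1.0) / t_next) * (x_next - x)  # momentum / extrapolation step
            x, t = x_next, t_next
        return x

    # Usage: minimize the smooth quadratic 0.5 * ||A z - b||^2
    A = np.random.randn(30, 10)
    b = np.random.randn(30)
    grad = lambda z: A.T @ (A @ z - b)
    L = np.linalg.norm(A, 2) ** 2                             # Lipschitz constant of the gradient
    z_opt = accelerated_gradient(grad, np.zeros(10), L)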

Application to our problem

Dual block-coordinate descent

Properties of dual block-coordinate descent

Link with LASSO
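The connection established in the sparse inverse covariance literature (Banerjee et al.), which I take to be what this slide summarizes: each block-coordinate update of the dual reduces to a lasso-type problem. Writing W for the current covariance estimate, W_{11} for the block with one variable removed and s_{12} for the corresponding column of S, the column update solves

    \min_{\beta} \; \tfrac{1}{2}\, \beta^\top W_{11}\, \beta - s_{12}^\top \beta + \lambda \|\beta\|_1,

i.e. a lasso in which W_{11} plays the role of the Gram matrix, so efficient lasso solvers can be reused inside the block-coordinate scheme.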

Example

Inverse covariance estimates

Average error on zeros

Computing time

Classification error

Recovering structure

Part 4: summary

References