Robust Optimization and Applications in Machine Learning
Part 4: Sparsity in Unsupervised Learning
Unsupervised learning
Sparse PCA: outline
Principal Component Analysis
PCA for visualization
First principal component
Why sparse factors?
PCA: rank-one case
Sparse PCA: rank-one case
SDP relaxation
Dual problem
Sparsity and robustness
Sparse PCA decomposition?
First-order algorithm
PITPROPS data
PITPROPS data: numerical results
Financial example
Covariance matrix
Second factor
Gene expression data
Clustering of gene expression data
Conclusions on sparse PCA
Sparse Gaussian networks: outline
Gaussian network problem
Correlation-based approach
Approach based on the precision matrix
Example
Relevance network vs. graphical model
Can we check this?
Sparse inverse covariance and conditional independence
Related work
Maximum-likelihood estimation
Problems with ordinary MLE
MLE with cardinality penalty
Convex relaxation
Link with robustness
Properties of estimate
Algorithms: challenges
First- vs. second-order algorithms
Black- vs. grey-box first-order algorithms
Algorithms: problem structure
Nesterov’s smooth minimization algorithm
Nesterov’s method
Putting the problem in Nesterov’s format
Making the problem smooth
Optimal scheme for smooth minimization
Application to our problem
Dual block-coordinate descent
Properties of dual block-coordinate descent
Link with LASSO
Example
Inverse covariance estimates
Average error on zeros
Computing time
Classification error
Recovering structure
Part 4: summary
References