
An Introduction to Sparse Coding, Sparse Sensing, and Optimization
Speaker: Wei-Lun Chao
Date: Nov. 23, 2011
DISP Lab, Graduate Institute of Communication Engineering, National Taiwan University

Outline
- Introduction
- The fundamentals of optimization
- The idea of sparsity: coding vs. sensing
- The solution
- The importance of the dictionary
- Applications

Introduction

What is sparsity?
Usage:
- Compression
- Analysis
- Representation
- Fast / sparse sensing
[Figure: projection bases and reconstruction bases]

Introduction
Why do we use the Fourier transform and its modifications for image and acoustic compression?
- Differentiability (theoretical)
- Intrinsic sparsity (data-dependent)
- Human perception (human-centric)
Are there better bases for compression or representation?
- Wavelets
- How about data-dependent bases?
- How about learning?

Introduction
Optimization
- Frequently faced in algorithm design
- Used to implement your creative ideas
Issue
- Which mathematical forms, and which corresponding optimization algorithms, guarantee convergence to a local or global optimum?

The Fundamentals of Optimization

A Warming-up Question
How do you solve the following problems? (1) (2)
(a) Plot the function.
(b) Take derivatives and check where they equal 0.
[Figure: a 1-D function with local minima and a global minimum]

An Advanced Question
How about the following problems? (3) (4) (5)
(a) Plot? (b) Take the derivative and set it to 0?
Here these tricks no longer suffice: can we even take a derivative? How should we proceed?

Illustration
2-D case:

How to Solve?
Thanks to:
- The Lagrange multiplier
- Linear programming, quadratic programming, and, more recently, convex optimization
Standard form:
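The standard form itself is not reproduced in the transcript; a common way of writing it, following the usual convex-optimization convention that the deck appears to rely on, is:

```latex
\begin{aligned}
\min_{x}\quad & f_0(x) \\
\text{s.t.}\quad & f_i(x) \le 0, \qquad i = 1, \dots, m, \\
                 & h_j(x) = 0,  \qquad j = 1, \dots, p.
\end{aligned}
```

Linear and quadratic programming are the special cases in which the objective is linear or quadratic and the constraints are linear.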

Fallacy
A quadratic programming problem with constraints: given the nutrient content of each food and a person's nutrient need, find the importance (amount) of each food.
(1) Take the derivative (x)
(2) Quadratic programming (o)
(3) Sparse coding (o)
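The equation is missing from the transcript; one plausible reading of the slide's labels (a matrix of nutrient contents, a personal need vector, and non-negative food amounts; the symbol names below are assumptions) is the constrained least-squares problem

```latex
\min_{x}\ \tfrac{1}{2}\,\|Ax - b\|_2^2 \quad \text{s.t.} \quad x \ge 0,
```

where each column of A holds the nutrient content of one food, b is the personal nutrient need, and x weights the foods. Simply setting the derivative to zero ignores the constraint x >= 0, which is one reason approach (1) is marked wrong.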

The Idea of Sparsity

What is Sparsity?
Think about a problem: an underdetermined linear system b = Ax, where A has full rank and N > d (more unknowns than equations), so there are infinitely many solutions. Which one do you want?
Choose the x with the fewest nonzero components.
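Written out (the formula is not in the transcript, so this is the standard formulation that the later slides build on), the sparsest-solution problem is

```latex
\min_{x}\ \|x\|_0 \quad \text{s.t.} \quad Ax = b, \qquad A \in \mathbb{R}^{d \times N},\ N > d,
```

where the l0 "norm" ||x||_0 counts the nonzero entries of x.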

Why Sparsity?
- The more concise, the better.
- In some domains there naturally exists a sparse latent vector that controls the data we observe (e.g., MRI, music).
- In some domains, samples from the same class have the sparse property.
- The domain can be learned.
A k-sparse domain means that each b can be constructed from an x vector with at most k nonzero elements.

Sparse Sensing vs. Sparse Coding
Assume that b is k-sparse in the basis A, i.e., b = Ax with at most k nonzero entries in x.
- Sparse coding: recover the sparse x from b.
- Sparse sensing: measure only p linear projections y of b, then recover b.
Note: p is based on the sparsity of the data (on k).
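In symbols (the sensing matrix and its name Φ are assumptions, since the slide's notation is not in the transcript):

```latex
\text{Sparse coding:}\quad \hat{x} = \arg\min_{x}\ \|x\|_0 \ \ \text{s.t.}\ \ b = Ax,
\qquad
\text{Sparse sensing:}\quad y = \Phi\, b = \Phi A\, x, \quad \Phi \in \mathbb{R}^{p \times d},\ p \ll d.
```

In sensing one recovers the sparse x (and hence b) from the few measurements y, which is why the number of measurements p depends on the sparsity level k.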

Sparse Sensing

Sparse Sensing vs. Sparse Coding
Sparse sensing (compressed sensing):
- Acquiring b directly costs much time or money, so acquire the measurements y first and then recover b.
Sparse coding (sparse representation):
- We believe the data have the sparse property; otherwise sparse representation means nothing.
- x is used as a feature of b.
- x can be used to store b efficiently and to reconstruct b.

The Solution

How to Get the Sparse Solution?
There is no algorithm other than exhaustive search to solve the l0-minimization problem exactly.
However, in some situations (e.g., special forms of A), the solution of l1 minimization approaches that of l0 minimization.
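The problem referred to here is presumably the l0 problem above; its l1 relaxation (often called basis pursuit) replaces the count of nonzeros with the l1 norm and is convex:

```latex
\min_{x}\ \|x\|_1 \quad \text{s.t.} \quad Ax = b, \qquad \|x\|_1 = \sum_i |x_i|.
```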

Why l1?
Question 1: Why can l1 minimization result in a sparse solution?
Geometric intuition: the l1 ball has corners on the coordinate axes, so when it is grown until it first touches the affine solution set {x : Ax = b}, the contact point tends to lie at a corner, i.e., at a sparse x.

Why l1?
Question 2: Why does the sparse solution achieved by l1 minimization approach that of l0 minimization?
- This is a matter of mathematics (conditions on A).
- In any case, sparse representation based on l1 minimization has been widely used for pattern recognition.
- In addition, if one does not care about using the sparse solution as a representation (feature), it seems acceptable if the two solutions are not exactly the same.

Noise
Sometimes the data are observed with noise. Can we still require Ax = b exactly?
The answer seems to be negative: the exact solution is possibly not sparse.

Noise
There are several ways to overcome this. What is the difference between them?
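The formulations themselves are not in the transcript; the variants usually compared in this noisy setting (and consistent with the "equivalent forms" slide below) are

```latex
\begin{aligned}
&\text{error-constrained:} && \min_{x}\ \|x\|_1 \quad \text{s.t.}\ \ \|Ax - b\|_2 \le \epsilon, \\
&\text{sparsity-constrained:} && \min_{x}\ \|Ax - b\|_2^2 \quad \text{s.t.}\ \ \|x\|_1 \le \tau, \\
&\text{penalized (LASSO):} && \min_{x}\ \tfrac{1}{2}\,\|Ax - b\|_2^2 + \lambda\,\|x\|_1.
\end{aligned}
```

They differ in which quantity (reconstruction error or sparsity) is budgeted explicitly and which is traded off through a parameter.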

Equivalent forms
You may also see several forms of the problem. These equivalent forms are related through Lagrange multipliers.
There have been several publications on how to solve the l1-minimization problem efficiently.
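As one concrete illustration (not taken from the slides), here is a minimal iterative soft-thresholding (ISTA) sketch for the penalized form; the function names, step-size choice, and toy data are illustrative assumptions:

```python
# Minimal ISTA sketch for:  min_x  0.5 * ||A x - b||_2^2 + lam * ||x||_1
import numpy as np

def soft_threshold(v, t):
    """Entrywise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, n_iter=500):
    # Step size 1/L, where L is the largest eigenvalue of A^T A (Lipschitz constant).
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                 # gradient of the smooth term
        x = soft_threshold(x - grad / L, lam / L)  # gradient step + shrinkage
    return x

# Toy usage: recover a 5-sparse vector from 50 random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))
x_true = np.zeros(200)
x_true[rng.choice(200, 5, replace=False)] = rng.standard_normal(5)
b = A @ x_true
x_hat = ista(A, b, lam=0.05)
```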

The Importance of the Dictionary

Dictionary generation
In the preceding sections, we generally assumed that the (over-complete) basis A exists and is known.
In practice, however, we usually need to build it:
- Fixed bases: wavelet + Fourier + Haar + ……
- Learning from data
How to learn? Learning may result in over-fitting.
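A common way to write the learning problem (not spelled out in the transcript; this is the standard formulation behind methods such as MOD and K-SVD) is

```latex
\min_{A,\ \{x_i\}}\ \sum_{i} \|b_i - A x_i\|_2^2 \quad \text{s.t.} \quad \|x_i\|_0 \le k \ \ \text{for all } i,
```

typically solved by alternating between sparse coding (fix A, solve for each x_i) and dictionary update (fix the codes, update A).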

Applications

Back to the problem we have
A quadratic programming problem with constraints: given the nutrient content of each food and a person's nutrient need, find the importance (amount) of each food.
(1) Take the derivative (x)
(2) Quadratic programming (o)
(3) Sparse coding (o)

Face Recognition (1)

Face Recognition (2)

An important issue
When using sparse representation as a way of feature extraction, you may wonder: even if the sparsity property exists in the data, does the sparse feature really lead to better results? Does it carry any semantic meaning?
Successful areas:
- Face recognition
- Digit recognition
- Object recognition (with careful design): e.g., replacing K-means with sparse representation

De-noising
Learn a patch dictionary. For each patch, compute the sparse representation, then use it to reconstruct the patch.
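A minimal sketch of this pipeline, assuming scikit-learn is available; the patch size, dictionary size, and sparsity level below are illustrative choices, not values from the slides:

```python
# Patch-based denoising via a learned dictionary and sparse coding (OMP).
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import (extract_patches_2d,
                                               reconstruct_from_patches_2d)

def denoise(noisy_image, patch_size=(8, 8), n_atoms=128, n_nonzero=4):
    # 1. Extract overlapping patches and remove each patch's mean.
    patches = extract_patches_2d(noisy_image, patch_size)
    X = patches.reshape(patches.shape[0], -1)
    means = X.mean(axis=1, keepdims=True)
    X = X - means

    # 2. Learn an (over-complete) patch dictionary from the noisy image itself.
    dico = MiniBatchDictionaryLearning(n_components=n_atoms,
                                       transform_algorithm='omp',
                                       transform_n_nonzero_coefs=n_nonzero)
    dico.fit(X)

    # 3. Sparse-code each patch and rebuild it from a few atoms.
    codes = dico.transform(X)
    X_hat = codes @ dico.components_ + means

    # 4. Average the overlapping reconstructed patches back into an image.
    patches_hat = X_hat.reshape(patches.shape)
    return reconstruct_from_patches_2d(patches_hat, noisy_image.shape)
```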

Detection based on reconstruction
Learn a patch dictionary for a specific object. For each patch in the image, compute the sparse representation and use it to reconstruct the patch. Check the reconstruction error for each patch, and identify those with small error as the detected object.
The dictionary here is maybe not over-complete.
Other cases: foreground-background detection, pedestrian detection, ……
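Continuing the sketch above (again an illustration, not the slide's code): once a dictionary has been fitted on patches of the target object, detection reduces to thresholding the per-patch reconstruction error. The function name and the threshold are assumed, application-dependent choices.

```python
import numpy as np

def detect_object_patches(X, dico, threshold):
    """X: (n_patches, patch_dim) array of mean-removed patches;
    dico: a fitted dictionary learner trained on patches of the target object."""
    codes = dico.transform(X)
    X_hat = codes @ dico.components_
    errors = np.linalg.norm(X - X_hat, axis=1)  # per-patch reconstruction error
    return errors < threshold                    # small error => patch resembles the object
```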

Conclusion

What you should know
- What is the standard form of an optimization problem?
- What is sparsity?
- What are sparse coding and sparse sensing?
- What kinds of optimization methods solve them?
- Try to use it!

Thank you for listening