Graph-Based Semi-Supervised Learning. Fei Wang, Department of Statistical Science, Cornell University

Research Overview. Machine Learning: Algorithms (Optimization, Probabilistic, Information-Theoretic, …) and Applications (Computer Vision, Multimedia Analysis, Information Retrieval, Bioinformatics, …).

Research Overview. Department of Automation, Tsinghua University: Graph-Based Semi-Supervised Learning (ICML06, CVPR06, ICDM06).

Research Overview. Department of Automation, Tsinghua University: Maximum Margin Clustering (SDM08, ICML08); KDD08: Semi-Supervised Support Vector Machine; ICDM08: Maximum Margin Feature Extraction; CVPR09: Maximum Margin Feature Selection with Manifold Regularization.

Research Overview. Department of Automation, Tsinghua University / School of CIS, FIU: IJCAI09, ICDM09; DMKD, submitted.

Research Overview. ISMIR2009.

Research Overview. School of CIS, FIU; at present: Department of Statistical Science, Cornell University. Large-Scale Statistical Machine Learning; Random Projection for NMF.

Linear Neighborhood Propagation (LNP): Graph Construction

Machine Learning: Supervised Learning, Unsupervised Learning, Semi-Supervised Learning.

Data Relationships. Traditional machine learning algorithms usually make the i.i.d. assumption, yet there are relationships among data points; relationships are everywhere.

Graph Is Everywhere: Internet Graph, Friendship Graph, Protein Interaction Graph.

Graph-Based SSL. The graph nodes are the data points; the graph edges correspond to the data relationships.

Label Propagation (Zhu & Ghahramani, 2002). Initial label vector $y$: $y_i$ equals the given label if $x_i$ is labeled, and $0$ otherwise.

Label Propagation. Update rule: $f_i = y_i$ if $x_i$ is labeled; otherwise $f_i = \sum_j P_{ij} f_j$. Matrix form: $f \leftarrow P f$ with the labeled entries clamped to $y$, where $P$ is the row-normalized similarity matrix $W$.

Label Propagation. The process is guaranteed to converge; for the unlabeled points the fixed point is $f_u = (I - P_{uu})^{-1} P_{ul} y_l$.

The Construction of W (Zhu et al., ICML 2003; Zhou et al., NIPS 2004). Similarity matrix: $W_{ij} = \exp\left(-\|x_i - x_j\|^2 / (2\sigma^2)\right)$; degree matrix: $D = \operatorname{diag}(D_{11}, \ldots, D_{nn})$ with $D_{ii} = \sum_j W_{ij}$.
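
A minimal sketch of this propagation scheme in Python (not the authors' code; `X`, `y_init`, the boolean mask `labeled`, and the Gaussian similarity above are illustrative assumptions):

```python
import numpy as np

def label_propagation(X, y_init, labeled, sigma=1.0, n_iter=200):
    """Iterate f <- P f while clamping labeled points (Zhu & Ghahramani-style sketch)."""
    # Gaussian similarity matrix W (no self-loops) and P = D^{-1} W
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    P = W / W.sum(axis=1, keepdims=True)

    f = y_init.astype(float)
    for _ in range(n_iter):
        f = P @ f                      # propagate labels one step
        f[labeled] = y_init[labeled]   # clamp the labeled points to their given labels
    return f
```

At convergence, the unlabeled entries of `f` agree with the closed form $f_u = (I - P_{uu})^{-1} P_{ul} y_l$ noted above.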

Linear Neighborhood. Each data point can be linearly reconstructed from its neighborhood: the weights $w_{ij}$ minimize $\|x_i - \sum_{j: x_j \in N(x_i)} w_{ij} x_j\|^2$ subject to $\sum_j w_{ij} = 1$ and $w_{ij} \geq 0$.
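
A sketch of how such reconstruction weights could be computed for a single point: a small non-negative, sum-to-one least-squares problem, solved here with SciPy's SLSQP for clarity rather than speed (function and variable names are illustrative):

```python
import numpy as np
from scipy.optimize import minimize

def lnp_weights(x, neighbors):
    """Weights w minimizing ||x - w @ neighbors||^2 with w >= 0 and sum(w) = 1."""
    k = neighbors.shape[0]
    res = minimize(
        lambda w: np.sum((x - w @ neighbors) ** 2),        # reconstruction error
        x0=np.full(k, 1.0 / k),                            # start from uniform weights
        bounds=[(0.0, None)] * k,                          # w_j >= 0
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
        method="SLSQP",
    )
    return res.x
```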

A Toy Example

Application on Image Segmentation

Comparisons (figure panels: Original, Partially Labeled, LNP, Graph Cut, Random Walk). For more examples see "Linear Neighborhood Propagation and Its Applications", PAMI 2009.

Application on Text Classification

Poisson Propagation: Problem and Solution (formulas developed on the following slides).

Optimization Framework: local predicted label variation + predicted label smoothness.
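
The objective on this slide did not survive extraction; a typical instantiation of such a framework, with $l$ labeled points and a trade-off parameter $\lambda$ (a hedged reconstruction, not necessarily the exact form used in the talk), is

\[
\mathcal{J}(f) \;=\; \sum_{i=1}^{l} \bigl(f_i - y_i\bigr)^2 \;+\; \lambda \sum_{i,j} W_{ij}\,\bigl(f_i - f_j\bigr)^2,
\]

where the first term measures the local variation of the predicted labels from the given ones and the second enforces smoothness of the predicted labels over the graph; the smoothness term equals $2\lambda\, f^\top (D - W) f = 2\lambda\, f^\top L f$.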

Data Manifold. The high-dimensional data points are not spread everywhere in the data space; they usually (nearly) reside on a low-dimensional manifold embedded in the high-dimensional space. A manifold is a mathematical space that, on a small enough scale, resembles the Euclidean space of a certain dimension, called the dimension of the manifold (from Wikipedia).

Laplace-Beltrami Operator. The Laplace operator $\Delta$ is a second-order differential operator on Euclidean space, $\Delta f = \sum_i \partial^2 f / \partial x_i^2$, i.e. the trace of the Hessian of $f$. The Laplace-Beltrami operator is a second-order differential operator on a Riemannian manifold; it is the analog of the Laplace operator in Euclidean space.

Graph Laplacian. An operator in continuous space reduces to a matrix in discrete space, and the graph Laplacian is the discrete analog of the Laplace-Beltrami operator on a continuous manifold. Given the similarity matrix $W$ and degree matrix $D$, the graph Laplacian is $L = D - W$. Theorem: assume the data set is sampled from a continuous manifold and the neighboring points are uniformly distributed on the sphere around the center point; if $W$ is constructed by LNP, then statistically $L = I - W$ provides a discrete approximation of the Laplace-Beltrami operator.
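
Tying the pieces together, a sketch of building $L = I - W$ from LNP weights (reusing the illustrative `lnp_weights` helper above; a dense, brute-force neighbor search is used for brevity):

```python
import numpy as np

def lnp_graph_laplacian(X, k=5):
    """Form W from per-point LNP reconstruction weights and return L = I - W."""
    n = X.shape[0]
    W = np.zeros((n, n))
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared pairwise distances
    for i in range(n):
        idx = np.argsort(d2[i])[1:k + 1]       # k nearest neighbors, excluding the point itself
        W[i, idx] = lnp_weights(X[i], X[idx])  # each row sums to 1 by construction
    return np.eye(n) - W
```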

Laplace's Equation with Dirichlet Boundary Condition: $\Delta f = 0$ on the unlabeled points, with $f = y$ fixed on the labeled points. Both LNP and traditional GBSSL can be viewed as solving this boundary-value problem.

Label Field vs. Electric Field. The data graph can be viewed as the discretized form of the data manifold. There is a label field on the data manifold, and the predicted data labels are just the label potentials at their corresponding positions (the slide's figure draws the analogy with a point charge $Q$, whose electric potential at distance $r$ is $V = \frac{Q}{4\pi\varepsilon_0 r}$, with $\varepsilon_0$ the vacuum permittivity).

Poisson's Equation. Assume the charges are continuously distributed in Euclidean space with charge density $\rho$; then the electric potential $V$ satisfies Poisson's equation $\Delta V = -\rho/\varepsilon_0$, where $\Delta$ is the Laplace operator. Now consider a Riemannian space, where the Laplace operator becomes the Laplace-Beltrami operator.

Laplace's Equation vs. Poisson's Equation. Poisson's equation: the electric potential $V$ in an electric field on a Riemannian manifold with charge density $\rho$ satisfies $\Delta V = -\rho/\varepsilon_0$. Laplace's equation: generally, SSL on the data manifold solves $\Delta f = 0$, i.e. there are no label sources on the data manifold. Then where do the labels come from?

GBSSL by Solving Poisson's Equation. Assume the label sources are placed at the positions of the labeled points; then the label source distribution becomes $\rho(x) = \sum_{i\,\text{labeled}} y_i\, \delta(x - x_i)$, and the solution can be expressed through the Green's function $G$ of the Laplace-Beltrami operator: $f(x) = \sum_{i\,\text{labeled}} y_i\, G(x, x_i)$.

Discrete Green's Function. The discrete Green's function is defined as the inverse of the graph Laplacian with its zero eigen-mode discarded, i.e. $G = \sum_{k:\, \lambda_k > 0} \frac{1}{\lambda_k} u_k u_k^\top$, where $(\lambda_k, u_k)$ are the eigen-pairs of $L$. Chung & Yau. Discrete Green's Functions. J. Combinatorial Theory. 2000.

Poisson Propagation: $f = G y$, where $f$ is the predicted label vector, $G$ the discrete Green's function, and $y$ the initial label vector.
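
A sketch of this prediction step under the definitions above, using an eigendecomposition that drops the zero eigen-mode (the code assumes a symmetric $L$; for a non-symmetric LNP Laplacian a pseudo-inverse would play the same role):

```python
import numpy as np

def poisson_propagation(L, y_init, tol=1e-10):
    """Predicted labels f = G y, with G the discrete Green's function of L."""
    lam, U = np.linalg.eigh(L)                                       # eigen-pairs of symmetric L
    inv_lam = np.where(lam > tol, 1.0 / np.maximum(lam, tol), 0.0)   # drop the zero eigen-mode
    G = (U * inv_lam) @ U.T                                          # G = U diag(1/lambda) U^T
    return G @ y_init
```

Class decisions can then be read off `f`, e.g. by taking its sign in the two-class case.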

Experiments. Comparison of SVM, TSVM, GRF, PPc, LNP, and PPl on the benchmark data sets g241c, g241d, Digit, COIL, USPS, BCI, and Text (the numeric results table is not recoverable from the transcript).

Conclusions. Linear Neighborhood Propagation: construct the graph through linear reconstruction of each neighborhood. Poisson Propagation: obtain the label predictions by solving a Poisson equation rather than a Laplace equation. Efficient implementation: (1) approximating the eigen-system of the graph Laplacian; (2) algebraic multigrid.

Thank You Q&A