Published by Mollie Willingham. Modified over 10 years ago.
1
Fast SDP Relaxations of Graph Cut Clustering, Transduction, and Other Combinatorial Problems (JMLR 2006)
Tijl De Bie and Nello Cristianini
Presented by Lihan He, March 16, 2007
2
Outline
Statement of the problem
Spectral relaxation and eigenvector
SDP relaxation and Lagrange dual
Generalization: between spectral and SDP
Transduction and side information
Experiments
Conclusions
3
Statement of the problem
Given: a data set S of n data points and an affinity matrix A.
Objective: graph cut clustering, i.e. divide the data points into two sets, P and N, that partition S.
No labels: clustering.
With some labels: transduction.
4
Statement of the problem: normalized graph cut problem (NCut)
The cost function combines two terms: the cut cost, and a term measuring how well the clusters are balanced.
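For reference, the normalized cut cost is usually written as below. This is a sketch using the standard definitions from the NCut literature; the notation cut(·,·) and assoc(·,·) is an assumption and may differ from the notation on the slide.

```latex
\mathrm{NCut}(P,N)
  \;=\;
  \underbrace{\mathrm{cut}(P,N)}_{\text{cut cost}}
  \cdot
  \underbrace{\left(\frac{1}{\mathrm{assoc}(P,S)} + \frac{1}{\mathrm{assoc}(N,S)}\right)}_{\text{balance of the clusters}},
\qquad
\mathrm{cut}(P,N) = \sum_{i \in P,\; j \in N} A_{ij},
\quad
\mathrm{assoc}(P,S) = \sum_{i \in P,\; j \in S} A_{ij}.
```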
5
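Statement of the problem: normalized graph cut problem (NCut)
Introduce the unknown label vector y and rewrite the NCut problem as a combinatorial optimization problem (1).
This problem is NP-complete: no polynomial-time exact algorithm is known, and exhaustive search over label vectors scales exponentially with the number of data points.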
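To make the combinatorial form concrete, here is a minimal sketch that evaluates the NCut cost both from a partition and from a label vector, then finds the optimum by exhaustive search on a tiny graph. It assumes labels y in {-1, +1}, d = A1 (the row sums of A) and s = sum(d); with these, the cost can be written as s * y'(D - A)y / (s^2 - (d'y)^2). The function names and the toy graph are hypothetical, and the exact form of (1) on the slide may differ by notation.

```python
import itertools
import numpy as np

def ncut_from_partition(A, in_P):
    """NCut cost computed directly from the partition (in_P is a boolean mask)."""
    in_N = ~in_P
    cut = A[np.ix_(in_P, in_N)].sum()
    assoc_P = A[in_P, :].sum()
    assoc_N = A[in_N, :].sum()
    return cut * (1.0 / assoc_P + 1.0 / assoc_N)

def ncut_from_labels(A, y):
    """Same cost written in terms of a label vector y with entries +1 / -1."""
    d = A.sum(axis=1)
    s = d.sum()
    laplacian = np.diag(d) - A
    return s * (y @ laplacian @ y) / (s ** 2 - (d @ y) ** 2)

def brute_force_ncut(A):
    """Exhaustive search over labelings; only feasible for very small n."""
    n = A.shape[0]
    best_cost, best_y = np.inf, None
    # Fix y[0] = +1 so that mirrored labelings are not visited twice.
    for tail in itertools.product([1.0, -1.0], repeat=n - 1):
        y = np.array((1.0,) + tail)
        if np.all(y == 1.0):
            continue  # a single cluster is not a valid cut
        cost = ncut_from_labels(A, y)
        if cost < best_cost:
            best_cost, best_y = cost, y
    return best_cost, best_y

# Toy graph (an assumption for illustration): two triangles joined by a weak edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
A[2, 3] = A[3, 2] = 0.1

best_cost, best_y = brute_force_ncut(A)
print(best_y, best_cost)
```

The loop visits 2^(n-1) labelings, which is why relaxations are needed beyond toy sizes.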
6
Spectral Relaxation
After a change of variables, problem (1) is rewritten in an equivalent form.
Adding a constraint and dropping the combinatorial constraints on the variables, we obtain the spectral clustering relaxation (2).
7
Spectral Relaxation: eigenvector
Solve the constrained optimization problem (2) through its Lagrange dual.
Solution: the eigenvector corresponding to the second smallest generalized eigenvalue.
The second constraint is then automatically satisfied.
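The eigenvector solution can be sketched as follows. This assumes the generalized eigenproblem has the usual NCut form (D - A)v = lambda * D v with D = diag(A1), and that every point has positive degree; the function name is hypothetical. The problem is solved through the symmetric normalized Laplacian so that only NumPy is needed.

```python
import numpy as np

def spectral_ncut_vector(A):
    """Second smallest generalized eigenpair of (D - A) v = lambda * D v."""
    d = A.sum(axis=1)                      # degrees, assumed strictly positive
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    # Map the ordinary eigenvector u back to the generalized one: v = D^{-1/2} u
    v = d_inv_sqrt * eigvecs[:, 1]
    return eigvals[1], v

# Toy graph (an assumption for illustration): two triangles joined by a weak edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
A[2, 3] = A[3, 2] = 0.1

lam, v = spectral_ncut_vector(A)
labels = np.sign(v)            # simple thresholding at zero
d = A.sum(axis=1)
print(labels, lam, d @ v)
```

For a connected graph, d'v comes out as zero up to rounding because v is D-orthogonal to the constant eigenvector, which is one way to see why an orthogonality constraint of this kind need not be enforced explicitly.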
8
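SDP Relaxation
After a change of variables, problem (1) is rewritten in terms of a matrix variable, which satisfies additional constraints implied by its definition.
Adding these implied constraints and dropping the original combinatorial ones yields, after a further substitution, the SDP relaxation (3).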
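The idea behind this kind of lifting can be checked numerically. The sketch below uses the standard construction Gamma = y y' for a label vector y in {-1, +1}^n; the symbol Gamma and the test matrix M are assumptions, and the relaxation (3) on the slide involves further substitutions not shown here. It verifies the properties that remain as constraints once the rank-one requirement is dropped, and that a quadratic form in y becomes linear in Gamma.

```python
import numpy as np

y = np.array([1.0, 1.0, -1.0, 1.0, -1.0])   # any +1 / -1 labeling
Gamma = np.outer(y, y)                       # lifted matrix variable

# Properties implied by Gamma = y y' with y in {-1, +1}^n
is_psd = np.linalg.eigvalsh(Gamma).min() > -1e-9
unit_diag = np.allclose(np.diag(Gamma), 1.0)
rank = np.linalg.matrix_rank(Gamma)

# A quadratic form in y is a linear function of Gamma: y' M y = trace(M Gamma)
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
M = B + B.T                                  # arbitrary symmetric matrix
quad = y @ M @ y
lin = np.trace(M @ Gamma)

print(is_psd, unit_diag, rank, quad, lin)
```

Keeping positive semidefiniteness and the diagonal constraint while dropping the rank-one condition is what turns the combinatorial problem into a convex one.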
9
SDP Relaxation: Lagrange dual
Forming the Lagrangian of (3), we obtain the dual problem (4); strong duality holds.
The dual has n+1 variables.
10
Generalization: between spectral and SDP
A cascade of relaxations, tighter than spectral and looser than SDP.
The n constraints of the SDP relaxation are replaced by m constraints defined through a matrix W. The result is looser than SDP and has m+1 variables in place of n+1.
Designing the structure of W determines how the constraints are relaxed.
11
Generalization: between spectral and SDP
rank(W) = n: original SDP relaxation.
rank(W) = 1 (m = 1, W = d): spectral relaxation.
A relaxation is tighter than another if the column space of the matrix W used in the first contains the full column space of the W used in the second.
If d is chosen within the column space of W, then all relaxations in the cascade are tighter than the spectral relaxation.
One approach to designing W, proposed by the authors:
Sort the entries of the label vector (the 2nd eigenvector) from the spectral relaxation;
Construct a partition into m subsets of roughly equal size (about n/m points each);
Reorder the data points by this sorted order;
W is then the n × m block matrix whose k-th column (k = 1, 2, …, m) has ones in the roughly n/m rows of the k-th subset and zeros elsewhere.
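The construction of W described above can be sketched as follows. This is a minimal illustration, assuming W is the plain 0/1 block indicator matrix drawn on the slide; the function name and the example vector are hypothetical.

```python
import numpy as np

def block_indicator_W(v, m):
    """Build an n x m indicator matrix from the sorted entries of v.

    v : relaxed label vector (e.g. the 2nd eigenvector of the spectral relaxation)
    m : number of subsets; each subset gets roughly n/m points
    """
    n = len(v)
    order = np.argsort(v)                      # data points sorted by their entry in v
    W = np.zeros((n, m))
    for k, idx in enumerate(np.array_split(order, m)):
        W[idx, k] = 1.0                        # column k marks the k-th subset
    return W, order

v = np.array([0.3, -0.2, 0.9, -0.7, 0.1, -0.5])   # made-up relaxed labels
W, order = block_indicator_W(v, m=3)

# After reordering the rows by the sorted order, W shows the block structure.
W_sorted = W[order]
print(W_sorted)

# Optional variant (a general fact, not taken from the slide): scaling row i by d_i
# puts d in the column space, since every row of W contains exactly one 1.
d = np.array([2.0, 1.0, 3.0, 1.5, 2.5, 1.0])       # made-up degrees
W_scaled = d[:, None] * W
```

With m = 1 the single column covers all points, and with m = n every point gets its own column, which matches the two ends of the cascade.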
12
Transduction
Given some labels, written as the label vector $y_t$, the problem becomes transductive.
Label constraints are imposed by reparameterizing with the matrix
$$L = \begin{pmatrix} y_t & 0 \\ 0 & I \end{pmatrix}$$
where the first block row corresponds to the labeled points and the second to the unlabeled points.
Rows (columns) corresponding to oppositely labeled training points are then automatically each other's opposite;
Rows (columns) corresponding to same-labeled training points are equal to each other.
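The effect of the label matrix L can be checked numerically. The sketch below assumes the reparameterization has the form Gamma = L Gamma_hat L', with Gamma_hat a smaller symmetric positive semidefinite matrix; that form, the function name and the example labels are assumptions for illustration.

```python
import numpy as np

def label_matrix(y_t, n_test):
    """L = [[y_t, 0], [0, I]] for labeled points followed by n_test unlabeled points."""
    n_train = len(y_t)
    L = np.zeros((n_train + n_test, 1 + n_test))
    L[:n_train, 0] = y_t                   # one shared column for all labeled points
    L[n_train:, 1:] = np.eye(n_test)       # one column per unlabeled point
    return L

y_t = np.array([1.0, -1.0, 1.0])           # made-up training labels
L = label_matrix(y_t, n_test=2)

# Any symmetric PSD matrix of size (1 + n_test) serves as an example Gamma_hat.
rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
Gamma_hat = B @ B.T
Gamma = L @ Gamma_hat @ L.T                # full (n_train + n_test) square matrix

print(Gamma.shape)
print(np.allclose(Gamma[0], Gamma[2]))     # same label: equal rows
print(np.allclose(Gamma[0], -Gamma[1]))    # opposite labels: opposite rows
```

The labeled points share a single column of L, so the size of the matrix variable depends on the number of unlabeled points rather than on n.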
13
Transduction
Transductive NCut relaxation: $n_{test} + 2$ variables.
14
General constraints
An equivalence constraint between two sets of data points specifies that they belong to the same class;
An inequivalence constraint specifies that two sets of data points belong to opposite classes.
No detailed label information is provided.
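One way such constraints could be encoded is by generalizing the label matrix L: each constrained group shares one column, with opposite signs for sets that must fall in opposite classes, and every unconstrained point keeps its own column. This is a sketch under that assumption; the helper name, its input format and the reparameterization Gamma = L Gamma_hat L' are hypothetical, and the paper's exact construction may differ.

```python
import numpy as np

def constraint_matrix(n, groups):
    """Build L from equivalence / inequivalence constraints.

    groups : list of (same_side, other_side) index lists. Points in same_side must
             share a class; points in other_side must be in the opposite class.
             Leave other_side empty for a pure equivalence constraint.
    """
    constrained = set()
    columns = []
    for same_side, other_side in groups:
        col = np.zeros(n)
        for i in same_side:
            col[i] = 1.0
        for i in other_side:
            col[i] = -1.0
        columns.append(col)
        constrained.update(same_side)
        constrained.update(other_side)
    for i in range(n):                     # free points get an identity column
        if i not in constrained:
            col = np.zeros(n)
            col[i] = 1.0
            columns.append(col)
    return np.column_stack(columns)

# Points 0 and 1 are equivalent; point 2 is inequivalent to point 3; 4 and 5 are free.
L = constraint_matrix(6, [([0, 1], []), ([2], [3])])

rng = np.random.default_rng(1)
B = rng.standard_normal((L.shape[1], L.shape[1]))
Gamma = L @ (B @ B.T) @ L.T

print(L.shape)
print(np.allclose(Gamma[0], Gamma[1]), np.allclose(Gamma[2], -Gamma[3]))
```

No class label appears anywhere in the input: only the relative assignments are fixed.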
15
Experiments
1. Toy problems
Affinity matrix:
16
Experiments
2. Clustering and transduction on text
Data set: 195 articles in 4 languages, covering several topics.
Affinity matrix: 20-nearest-neighbor graph, with A(i,j) taking one of the values 1, 0.5, or 0.
Distance between two articles: cosine distance on the bag-of-words representation, which requires defining a dictionary.
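A nearest-neighbor affinity matrix with values 1, 0.5 and 0 can be built as sketched below. The rule used here is an assumption, since the conditions attached to each value are not given above: A(i,j) = 1 when i and j are each among the other's k nearest neighbors, 0.5 when this holds in one direction only, and 0 otherwise. The function names and the tiny word-count matrix are hypothetical, and k is set to 1 only to keep the example small.

```python
import numpy as np

def cosine_distance_matrix(X):
    """Pairwise cosine distance between rows of X (bag-of-words counts).

    Assumes every document contains at least one dictionary word (no zero rows).
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return 1.0 - Xn @ Xn.T

def knn_affinity(dist, k):
    """Symmetrized k-nearest-neighbor affinity with values in {0, 0.5, 1}."""
    n = dist.shape[0]
    dist = dist.copy()
    np.fill_diagonal(dist, np.inf)                 # a point is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]      # k closest points per row
    K = np.zeros((n, n))
    K[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0   # directed kNN graph
    return 0.5 * (K + K.T)                         # 1: mutual, 0.5: one-directional

# Three made-up documents over a two-word dictionary.
X = np.array([[3.0, 0.0],
              [6.0, 5.0],
              [1.0, 2.0]])
A = knn_affinity(cosine_distance_matrix(X), k=1)
print(A)
```

In this example documents 1 and 2 are mutual nearest neighbors, while document 0 points to document 1 without being chosen back, so both nonzero values occur.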
17
Experiments
2. Clustering and transduction on text: cost
[Figure: cost versus fraction of labeled data points, in two panels (by language, by topic). Curves: spectral (randomized rounding), SDP (randomized rounding), spectral (lower bound), SDP (lower bound).]
Cost: randomized rounding ≥ opt ≥ lower bound.
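The ordering of the costs follows from two facts: any rounded labeling is a feasible solution of the original problem, so its cost cannot be below the optimum, while the optimum of a relaxation cannot be above it. A minimal sketch of randomized rounding is given below. It uses Gaussian sampling from the relaxed matrix, a scheme commonly used with SDP relaxations; this is an assumption and not necessarily the exact rounding used in the experiments. The function names and the toy graph are hypothetical.

```python
import numpy as np

def ncut_cost(A, y):
    """NCut cost of a labeling y with entries +1 / -1."""
    d = A.sum(axis=1)
    s = d.sum()
    return s * (y @ (np.diag(d) - A) @ y) / (s ** 2 - (d @ y) ** 2)

def randomized_rounding(A, Gamma, n_samples=100, seed=0):
    """Sample z ~ N(0, Gamma), threshold at zero, keep the cheapest labeling."""
    rng = np.random.default_rng(seed)
    eigvals, eigvecs = np.linalg.eigh(Gamma)
    # Factor Gamma ~ root @ root.T, clipping tiny negative eigenvalues to zero.
    root = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    best_cost, best_y = np.inf, None
    for _ in range(n_samples):
        z = root @ rng.standard_normal(Gamma.shape[0])
        y = np.where(z >= 0, 1.0, -1.0)
        if abs(y.sum()) == len(y):
            continue                      # all points in one cluster: not a cut
        cost = ncut_cost(A, y)
        if cost < best_cost:
            best_cost, best_y = cost, y
    return best_cost, best_y

# Toy graph (an assumption for illustration): two triangles joined by a weak edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
A[2, 3] = A[3, 2] = 0.1

# Stand-in for a relaxed solution: the rank-one matrix of the true labeling.
y_true = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
Gamma = np.outer(y_true, y_true)

cost, y_rounded = randomized_rounding(A, Gamma)
print(y_rounded, cost)
```

When the relaxed matrix is exactly rank one, as in this stand-in, the rounding recovers the underlying labeling up to a global sign flip.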
18
Experiments
2. Clustering and transduction on text: accuracy
[Figure: accuracy versus fraction of labeled data points, in two panels (by language, by topic). Curves: spectral (randomized rounding), SDP (randomized rounding).]
19
Conclusions
Proposed a new cascade of SDP relaxations of the NP-complete normalized graph cut optimization problem;
One extreme: spectral relaxation;
The other extreme: newly proposed SDP relaxation;
Applicable to unsupervised and semi-supervised learning, and to more general constraints;
The cascade balances computational cost against accuracy.