Published by Aliza Wagner. Modified over 10 years ago.
1
Statistical Perturbation Theory for Spectral Clustering. Harrachov, 2007. A. Spence and Z. Stoyanov.
2
Plan of the Talk: A. Clustering (brief overview). B. Deterministic Perturbation Theory. C. Statistical Perturbation Theory.
3
Graph Clustering [Figure: graph on nodes 1 to 7]
4
[Figure: graph on nodes 1 to 7]
5
Graph Clustering + Perturbation [Figure: graph on nodes 1 to 7, annotated with "?"]
6
Gene Expression Data Clustering: An Application. There are over 10,000 genes expressed in any one tissue; DNA arrays typically produce very noisy data. Issues: 1. Do genes in the same cluster behave similarly? 2. Do genes in different clusters behave differently?
7
Bipartite Graphs [Figure: bipartite graph with node sets {1, 2, 3, 4} and {1, 2, 3}]
8
Matrix Form
9
A Real Data Matrix (Leukemia)
10
Spectral Clustering: General Idea. Discrete Optimisation Problem (NP-hard) is approximated by a Real Optimisation Problem (tractable). Exact: impractical. Heuristic: practical.
11
Discrete Optimisation and the SVD. Solution: singular value decomposition of the scaled matrix W. [Figure labels: Active, Inactive]
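The slide names the SVD of a scaled W as the solution of the relaxed problem but its formulas did not survive extraction. The following is a minimal sketch under a stated assumption: the scaling is taken to be the common degree normalisation D1^(-1/2) W D2^(-1/2) used in spectral co-clustering, and the two groups are read off the signs of the second singular vectors. The function name `spectral_bicluster` and the toy matrix are hypothetical, not taken from the talk.

```python
import numpy as np

def spectral_bicluster(W):
    """Split rows and columns of a nonnegative matrix W into two groups.

    Assumes every row and column of W has a positive sum.
    """
    W = np.asarray(W, dtype=float)
    d1 = W.sum(axis=1)  # row degrees
    d2 = W.sum(axis=0)  # column degrees
    # Assumed scaling (not confirmed by the slide): D1^{-1/2} W D2^{-1/2}
    W_scaled = W / np.sqrt(d1)[:, None] / np.sqrt(d2)[None, :]
    U, s, Vt = np.linalg.svd(W_scaled, full_matrices=False)
    # The leading singular pair is trivial for this scaling;
    # the second pair carries the two-way split.
    row_score = U[:, 1] / np.sqrt(d1)
    col_score = Vt[1, :] / np.sqrt(d2)
    return row_score >= 0, col_score >= 0

# Toy data: rows 0-1 pair with columns 0-1, rows 2-4 with columns 2-3.
W = np.array([
    [5.0, 4.0, 0.1, 0.2],
    [4.0, 5.0, 0.2, 0.1],
    [0.1, 0.2, 5.0, 4.0],
    [0.2, 0.1, 4.0, 5.0],
    [0.1, 0.1, 4.0, 4.0],
])
row_labels, col_labels = spectral_bicluster(W)
print(row_labels, col_labels)
```

Which group gets `True` depends on the arbitrary sign of the singular vectors, so only the grouping itself is meaningful.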
12
Clustering Algorithm: Summary [Figure labels: ACTIVE, INACTIVE]
13
Literature
14
Types of Graph Matrices
15
How we Cluster
16
Leukemia Data
17
Clustered Leukemia Data
18
Inaccuracies in the Data (Perturbation Theory)
19
Perturbation Theory (Deterministic Noise)
20
Deterministic Perturbation (Symmetric Matrix)
21
Linear Solve
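The slide's own linear system is not recoverable from the transcript. One standard way to obtain the first-order eigenpair derivatives of a symmetric matrix by a single linear solve is a bordered system, sketched below. The perturbation model A + eps*E, the function name `eigpair_derivative`, and the test matrices are assumptions made for illustration. Differentiating (A + eps*E) v = lam v at eps = 0 with v normalised gives (A - lam I) v' = -(E - lam' I) v and v^T v' = 0, which the bordered matrix solves for v' and lam' together.

```python
import numpy as np

def eigpair_derivative(A, E, k=-1):
    """First-order derivatives of the k-th eigenpair of A + eps*E at eps = 0.

    A and E are symmetric; the k-th eigenvalue of A is assumed simple.
    """
    lam_all, V = np.linalg.eigh(A)  # eigenvalues in ascending order
    lam, v = lam_all[k], V[:, k]
    n = A.shape[0]
    # Bordered matrix [[A - lam*I, v], [v^T, 0]]; nonsingular when lam is simple.
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = A - lam * np.eye(n)
    M[:n, n] = v
    M[n, :n] = v
    rhs = np.concatenate([-E @ v, [0.0]])
    sol = np.linalg.solve(M, rhs)
    dv = sol[:n]     # eigenvector derivative, orthogonal to v
    dlam = -sol[n]   # eigenvalue derivative, equals v^T E v
    return lam, v, dlam, dv

# Hypothetical test problem with a known, well-separated spectrum.
rng = np.random.default_rng(1)
n = 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0, 5.0, 8.0]) @ Q.T
A = (A + A.T) / 2
G = rng.standard_normal((n, n))
E = (G + G.T) / 2

lam, v, dlam, dv = eigpair_derivative(A, E)

# Finite-difference check against a direct eigendecomposition.
h = 1e-6
lam_h_all, V_h = np.linalg.eigh(A + h * E)
v_h = V_h[:, -1]
if v_h @ v < 0:  # fix the arbitrary sign of the eigenvector
    v_h = -v_h
fd_dlam = (lam_h_all[-1] - lam) / h
fd_dv = (v_h - v) / h
print(abs(fd_dlam - dlam), np.linalg.norm(fd_dv - dv))
```

The bordered form avoids solving with the singular matrix A - lam*I directly, and the same factorisation can be reused for many right-hand sides E v.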
22
Taylor Expansions
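The expansions on this slide were lost in extraction. As a reminder of the standard first-order result, and assuming the perturbation model $A(\varepsilon) = A + \varepsilon E$ with $A$ and $E$ symmetric (the symbols $E$, $\varepsilon$, $\lambda_i$, $v_i$ are notation chosen here, not confirmed by the source), a simple eigenvalue $\lambda$ with unit eigenvector $v$ expands as:

```latex
\begin{align*}
\lambda(\varepsilon) &= \lambda + \varepsilon\, v^{T} E v + O(\varepsilon^{2}), \\
v(\varepsilon) &= v + \varepsilon \sum_{\lambda_i \neq \lambda}
    \frac{v_i^{T} E v}{\lambda - \lambda_i}\, v_i + O(\varepsilon^{2}),
\end{align*}
```

where $(\lambda_i, v_i)$ are the remaining orthonormal eigenpairs of $A$ and $v(\varepsilon)$ is normalised so that $v(\varepsilon)^{T} v(\varepsilon) = 1$. The coefficient of $\varepsilon$ in the second line is $v'(0)$, and it is orthogonal to $v$.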
23
Rectangular Case Symmetric
24
Random Perturbations (plan) The Model Issues with the Theory A Possible Solution via Simulations? Experiments
25
The Model [Figure: graph on nodes 1 to 7]
26
Difficulties with Random Matrix Theory (RMT)
27
Deterministic Perturbation Stochastic Perturbation (simple eigenvector)
28
Deterministic Perturbation Stochastic Perturbation (simple eigenvalues)
29
PP Plot: Test for Normality (Largest Eigenvalue of a Symmetric Matrix)
30
Simulated Random Perturbation (Largest eigenvalue of a Symmetric Matrix)
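The simulation behind this slide is not available in the transcript. A small Monte Carlo experiment of the same kind can be sketched as follows, assuming a perturbation A + eps*E in which E is symmetric with independent N(0, 1) entries on and above the diagonal. The noise model, the matrix A, eps, and the sample size are all illustrative assumptions. Under this model the first-order term v^T E v is exactly normal with variance 2 - sum(v_i^4), so the simulated largest eigenvalue should be close to normal with standard deviation eps * sqrt(2 - sum(v_i^4)).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical symmetric matrix with a known, well-separated spectrum.
n = 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0, 5.0, 8.0]) @ Q.T
A = (A + A.T) / 2

lam_all, V = np.linalg.eigh(A)
lam, v = lam_all[-1], V[:, -1]  # largest eigenvalue and its eigenvector

def random_symmetric(rng, n):
    """Symmetric E with N(0, 1) entries on and above the diagonal (assumed model)."""
    G = rng.standard_normal((n, n))
    return np.triu(G) + np.triu(G, 1).T

eps = 0.01
n_samples = 2000
simulated = np.empty(n_samples)    # exact largest eigenvalue of A + eps*E
first_order = np.empty(n_samples)  # lam + eps * v^T E v
for s in range(n_samples):
    E = random_symmetric(rng, n)
    simulated[s] = np.linalg.eigvalsh(A + eps * E)[-1]
    first_order[s] = lam + eps * (v @ E @ v)

# Standard deviation predicted by the first-order (normal) approximation.
predicted_std = eps * np.sqrt(2.0 - np.sum(v**4))
print(simulated.mean(), simulated.std(), predicted_std)
```

The arrays `simulated` and `first_order` can be passed to a histogram or PP plot routine to compare the empirical distribution with the normal prediction.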
31
Deterministic Perturbation Stochastic Perturbation (simple eigenvectors)
32
Results for Laplacian Matrices
33
Functional of the Eigenvector
34
Results for $h^T v_2$
35
PP Plot of $h^T v'(0)$: Test for Normality ($h = e_j$)
36
Histogram of $h^T v'(0)$: Simulations ($h = e_j$)
37
PP Plot of Simulated v [j] ( ) (Distribution close to Normal)
38
Histogram of Simulated v [j] ( ) (Distribution close to Normal)
39
Extension to the Rectangular Case
40
Probability of “Wrong Clustering”
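The formula on this slide did not survive extraction. A plausible first-order form, stated here purely as an assumption, follows from two ingredients that appear elsewhere in the talk: clusters are read off the sign of an eigenvector component, and $e_j^{T} v'(0)$ is approximately normal. Writing $\sigma_j^{2}$ for the variance of $e_j^{T} v'(0)$ and $\Phi$ for the standard normal distribution function (both symbols are introduced here for illustration), and taking $\varepsilon > 0$:

```latex
\begin{align*}
v_j(\varepsilon) &\approx v_j + \varepsilon\, e_j^{T} v'(0),
  \qquad e_j^{T} v'(0) \sim N(0, \sigma_j^{2}), \\
P\bigl(\operatorname{sign} v_j(\varepsilon) \neq \operatorname{sign} v_j\bigr)
  &\approx \Phi\!\left( -\frac{|v_j|}{\varepsilon\, \sigma_j} \right).
\end{align*}
```

Under this approximation, components close to zero relative to $\varepsilon \sigma_j$ are the ones most likely to be assigned to the wrong cluster.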
41
Issues with Numerics
42
Efficient Simulations
43
Solution via Simulations?
44
Solution via Simulations? (Algorithm)
45
Comparing: Direct Calculation Vs. Repeated Linear Solve