Clustering with Graph-Theory Algorithms

Presentation transcript:

Clustering with Graph-Theory Algorithms. Royi Itzhak.

Elements of Graph Theory A graph G = (V,E) consists of a vertex set V and an edge set E. If G is a directed graph, each edge is an ordered pair of vertices. A bipartite graph is one in which the vertices can be divided into two groups, so that all edges join vertices in different groups.

From Points to a Graph Fix the points in the plane and determine the distance between every pair of points. This yields a distance matrix over the n-D data points, which serves as the graph representation. (The slide shows a sample distance matrix whose first rows begin 0, 1, 1.5, 2, 5, 6, 7, 9, ...)
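The distance-matrix construction can be sketched as follows (the points here are illustrative, not the slide's data):

```python
import numpy as np

# Three illustrative 2-D points (not the slide's data).
points = np.array([[0.0, 0.0],
                   [1.0, 0.0],
                   [0.0, 1.5]])

# Pairwise Euclidean distances: D[i, j] = ||points[i] - points[j]||.
diff = points[:, None, :] - points[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))
```

The result is symmetric with a zero diagonal, exactly the shape of the matrix shown on the slide.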

Minimal Cut Create a graph G(V,E) from the data. Compute all pairwise distances in the graph; the weight of each edge is the distance. Remove edges from the graph according to some threshold.
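A minimal sketch of the thresholding step (the matrix and the threshold value are illustrative, not from the slide):

```python
import numpy as np

# A small illustrative distance matrix (not the slide's data).
D = np.array([[0.0, 1.0, 5.0],
              [1.0, 0.0, 4.0],
              [5.0, 4.0, 0.0]])

threshold = 2.0
# Keep an edge (with the distance as its weight) only when the
# distance is at most the threshold; drop the rest.
A = np.where(D <= threshold, D, 0.0)
np.fill_diagonal(A, 0.0)
```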

Similarity Graph As distance decreases, similarity increases. Represent the dataset as a weighted graph G(V,E): V = {xi} is a set of n vertices representing the data points, and E = {Wij} is a set of weighted edges indicating pair-wise similarity between points. (The slide shows a six-vertex example with edge weights 0.1, 0.2, 0.8, 0.7, 0.6.)

Similarity Graph Wij represents the similarity between vertices i and j; Wij = 0 when the vertices are not similar, and Wii = 0.
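The slide does not say how the similarities are computed; one common choice (an assumption here, not stated in the deck) is a Gaussian kernel on the distances:

```python
import numpy as np

# Illustrative distance matrix (not the slide's data).
D = np.array([[0.0, 1.0, 5.0],
              [1.0, 0.0, 4.0],
              [5.0, 4.0, 0.0]])

sigma = 1.0  # kernel width, a free parameter
W = np.exp(-D ** 2 / (2 * sigma ** 2))  # similarity shrinks as distance grows
np.fill_diagonal(W, 0.0)                # Wii = 0, as on the slide
```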

Graph Partitioning Clustering can be viewed as partitioning a similarity graph. Bi-partitioning task: divide the vertices into two disjoint groups A and B, so that V = A ∪ B.

Clustering Objectives Traditional definition of a “good” clustering: points assigned to the same cluster should be highly similar, and points assigned to different clusters should be highly dissimilar. Applied to our graph representation, this means: minimize the weight of between-group connections.

Graph Cuts Express the partitioning objective as a function of the “edge cut” of the partition. Cut: the set of edges with exactly one endpoint in each group; cut(A,B) is the total weight of those edges. We want to find the minimal cut between groups: the partition achieving the minimal cut is chosen. In the six-vertex example, cut(A,B) = 0.3.
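With the six-vertex example's weights (the exact placement of the weights is reconstructed from the slide figures, so treat it as an assumption), the cut value can be computed directly:

```python
import numpy as np

# Weighted adjacency for the six-vertex example;
# weight placement reconstructed from the slide figures.
W = np.array([[0.0, 0.8, 0.6, 0.0, 0.1, 0.0],
              [0.8, 0.0, 0.8, 0.0, 0.0, 0.0],
              [0.6, 0.8, 0.0, 0.2, 0.0, 0.0],
              [0.0, 0.0, 0.2, 0.0, 0.8, 0.7],
              [0.1, 0.0, 0.0, 0.8, 0.0, 0.8],
              [0.0, 0.0, 0.0, 0.7, 0.8, 0.0]])

A, B = [0, 1, 2], [3, 4, 5]        # vertices 1-3 vs. 4-6
cut_AB = W[np.ix_(A, B)].sum()     # sum of weights crossing the partition
```

The two crossing edges (0.2 and 0.1) give cut(A,B) = 0.3, matching the slide.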

Graph Cut Criteria Criterion: minimum cut. Minimise the weight of connections between groups: min cut(A,B). Degenerate case: the minimum cut may simply isolate an outlying vertex instead of finding the optimal cut. Problem: the criterion only considers external cluster connections; it does not consider internal cluster density.

Graph Cut Criteria (continued) Criterion: Normalised-cut (Shi & Malik, ’97). Consider the connectivity between groups relative to the density of each group: normalise the association between groups by volume, Ncut(A,B) = cut(A,B)/vol(A) + cut(A,B)/vol(B), where vol(A) is the total weight of the edges originating from group A. Why use this criterion? Minimising the normalised cut is equivalent to maximising the normalised association, and it produces more balanced partitions.
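Continuing with the six-vertex weights (reconstructed from the slide figures, so an assumption), the normalised cut of A = {1,2,3}, B = {4,5,6} works out as:

```python
import numpy as np

# Reconstructed six-vertex example weights.
W = np.array([[0.0, 0.8, 0.6, 0.0, 0.1, 0.0],
              [0.8, 0.0, 0.8, 0.0, 0.0, 0.0],
              [0.6, 0.8, 0.0, 0.2, 0.0, 0.0],
              [0.0, 0.0, 0.2, 0.0, 0.8, 0.7],
              [0.1, 0.0, 0.0, 0.8, 0.0, 0.8],
              [0.0, 0.0, 0.0, 0.7, 0.8, 0.0]])

A, B = [0, 1, 2], [3, 4, 5]
cut_AB = W[np.ix_(A, B)].sum()

vol_A = W[A, :].sum()   # total weight of edges originating from A
vol_B = W[B, :].sum()
ncut = cut_AB / vol_A + cut_AB / vol_B
```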

Minimum Spanning Tree (MST, Prim's algorithm) Choose a source vertex and insert it into set A (the tree). Find the lightest edge connecting set B (the remaining vertices of the graph) to the tree (A), and move its endpoint into A. Repeat until no vertices remain in set B.
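The steps above can be sketched as a simple O(n^2) version of Prim's algorithm (a sketch, not the lecture's implementation):

```python
import numpy as np

def prim_mst(W, source=0):
    """Prim's algorithm on a weighted adjacency matrix.
    W[i, j] > 0 is an edge weight; 0 means no edge.
    Returns the MST edges as (weight, u, v) in insertion order."""
    n = len(W)
    in_tree = {source}                       # set A: the growing tree
    # best[j] = (lightest edge weight from the tree to j, tree endpoint)
    best = {j: (W[source, j], source) for j in range(n)
            if j != source and W[source, j] > 0}
    edges = []
    while len(in_tree) < n:
        # Lightest edge crossing from the tree (A) to the rest (B).
        v = min(best, key=lambda j: best[j][0])
        w, u = best.pop(v)
        in_tree.add(v)
        edges.append((w, u, v))
        for j in range(n):
            if j not in in_tree and W[v, j] > 0:
                if j not in best or W[v, j] < best[j][0]:
                    best[j] = (W[v, j], v)
    return edges
```

Because edges are recorded in insertion order, the weight sequence of `edges` is exactly the curve the next slide plots.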

Finding a clustering Plot the progress of the tree construction (each added node) as a function of the weight of the edge that was added. Each “valley” in this plot represents a cluster.
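One common way to realise this idea in code (an assumption; the slide describes it only verbally) is to cut the heaviest MST edges, so that each remaining connected component is a cluster:

```python
def mst_clusters(edges, n, k):
    """edges: list of (weight, u, v) MST edges over n vertices.
    Removes the k-1 heaviest edges and returns a cluster label per vertex."""
    keep = sorted(edges)[: n - k]   # drop the k-1 heaviest of the n-1 edges
    # Union-find over the kept edges to get connected components.
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for _, u, v in keep:
        parent[find(u)] = find(v)
    roots = [find(i) for i in range(n)]
    labels = {r: i for i, r in enumerate(dict.fromkeys(roots))}
    return [labels[r] for r in roots]
```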

Example  a b c d e h g f i 4  8 a b c d e h g f i 8 7 4 9 11 14 6 10 2 14 9 10 6 4  8 a b c d e h g f i 4 8 7 11 1 2 14 9 10 6

Example 4  8  8 a b c d e h g f i  4  8 7 a b c d e h g f i 2 4 8 11 1 2 14 9 10 6  4  8 7 a b c d e h g f i 4 8 7 11 1 2 14 9 10 6 2 4

Example 4 7  8 2 a b c d e h g f i 7 6 4 7  6 8 2 a b c d e h g f i 11 1 2 14 9 10 6 7 6 4 7  6 8 2 a b c d e h g f i 4 8 7 11 1 2 14 9 10 6 10 2

Example 4 7 10 2 8 a b c d e h g f i 1 4 7 10 1 2 8 a b c d e h g f i 11 1 2 14 9 10 6 1 4 7 10 1 2 8 a b c d e h g f i 4 8 7 11 1 2 14 9 10 6

Example 4 7 10 1 2 8 a b c d e h g f i 4 8 7 11 1 2 14 9 10 6 9

Example – 2 Spirals The dataset exhibits complex cluster shapes. K-means performs very poorly in the original space due to its bias toward dense spherical clusters. In the embedded space given by the two leading eigenvectors, the clusters are trivial to separate.

Spectral Graph Theory Possible approach: represent the similarity graph as a matrix and apply knowledge from linear algebra. The eigenvalues and eigenvectors of a matrix provide global information about its structure. Spectral graph theory analyses the “spectrum” of the matrix representing a graph. Spectrum: the eigenvectors of a graph, ordered by the magnitude (strength) of their corresponding eigenvalues.

Matrix Representations Adjacency matrix (A): an n x n matrix whose entry (i,j) is the edge weight between vertices xi and xj. For the six-vertex example (reconstructed from the slide figure):

      x1   x2   x3   x4   x5   x6
x1   0.0  0.8  0.6  0.0  0.1  0.0
x2   0.8  0.0  0.8  0.0  0.0  0.0
x3   0.6  0.8  0.0  0.2  0.0  0.0
x4   0.0  0.0  0.2  0.0  0.8  0.7
x5   0.1  0.0  0.0  0.8  0.0  0.8
x6   0.0  0.0  0.0  0.7  0.8  0.0

Important properties: symmetric matrix; eigenvalues are real; the eigenvectors span an orthogonal basis.

Matrix Representations (continued) Degree matrix (D): an n x n diagonal matrix whose entry (i,i) is the total weight of the edges incident to vertex xi. For the example: D = diag(1.5, 1.6, 1.6, 1.7, 1.7, 1.5). Important application: normalising the adjacency matrix.

Matrix Representations (continued) Laplacian matrix (L): L = D - A, an n x n symmetric matrix. For the example (reconstructed from the slide figure):

 1.5 -0.8 -0.6  0.0 -0.1  0.0
-0.8  1.6 -0.8  0.0  0.0  0.0
-0.6 -0.8  1.6 -0.2  0.0  0.0
 0.0  0.0 -0.2  1.7 -0.8 -0.7
-0.1  0.0  0.0 -0.8  1.7 -0.8
 0.0  0.0  0.0 -0.7 -0.8  1.5

Important properties: the eigenvalues are non-negative real numbers; the eigenvectors are real and orthogonal; the eigenvalues and eigenvectors provide an insight into the connectivity of the graph.
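The stated Laplacian properties can be checked numerically (the example weights are reconstructed from the slide figures, so an assumption):

```python
import numpy as np

# Reconstructed six-vertex example weights.
W = np.array([[0.0, 0.8, 0.6, 0.0, 0.1, 0.0],
              [0.8, 0.0, 0.8, 0.0, 0.0, 0.0],
              [0.6, 0.8, 0.0, 0.2, 0.0, 0.0],
              [0.0, 0.0, 0.2, 0.0, 0.8, 0.7],
              [0.1, 0.0, 0.0, 0.8, 0.0, 0.8],
              [0.0, 0.0, 0.0, 0.7, 0.8, 0.0]])

Dg = np.diag(W.sum(axis=1))          # degree matrix
L = Dg - W                           # graph Laplacian

eigenvalues = np.linalg.eigvalsh(L)  # ascending order for symmetric L
```

For a connected graph the smallest eigenvalue is 0 (with the constant eigenvector), and every row of L sums to 0.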

Another option – the normalized Laplacian Lsym = D^(-1/2) L D^(-1/2), an n x n symmetric matrix. Important properties: the eigenvectors are real and normalized; every diagonal entry equals 1; each off-diagonal entry (i,j) with i ≠ j equals -Wij / sqrt(di * dj).
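A sketch of the normalized Laplacian and its entry structure, again on the reconstructed example weights (an assumption):

```python
import numpy as np

# Reconstructed six-vertex example weights.
W = np.array([[0.0, 0.8, 0.6, 0.0, 0.1, 0.0],
              [0.8, 0.0, 0.8, 0.0, 0.0, 0.0],
              [0.6, 0.8, 0.0, 0.2, 0.0, 0.0],
              [0.0, 0.0, 0.2, 0.0, 0.8, 0.7],
              [0.1, 0.0, 0.0, 0.8, 0.0, 0.8],
              [0.0, 0.0, 0.0, 0.7, 0.8, 0.0]])

d = W.sum(axis=1)                    # vertex degrees (all positive here)
L = np.diag(d) - W                   # unnormalized Laplacian
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = D_inv_sqrt @ L @ D_inv_sqrt  # normalized Laplacian
```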

Spectral Clustering Algorithms Three basic stages: (1) Pre-processing: construct a matrix representation of the dataset. (2) Decomposition: compute the eigenvalues and eigenvectors of the matrix, and map each point to a lower-dimensional representation based on one or more eigenvectors. (3) Grouping: assign points to two or more clusters based on the new representation.
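The three stages can be sketched end-to-end for bi-partitioning, using the unnormalized Laplacian and the sign of the second eigenvector (the example weights are reconstructed from the slides, so an assumption):

```python
import numpy as np

# Reconstructed six-vertex example weights: two dense triangles
# joined by two weak edges (0.2 and 0.1).
W = np.array([[0.0, 0.8, 0.6, 0.0, 0.1, 0.0],
              [0.8, 0.0, 0.8, 0.0, 0.0, 0.0],
              [0.6, 0.8, 0.0, 0.2, 0.0, 0.0],
              [0.0, 0.0, 0.2, 0.0, 0.8, 0.7],
              [0.1, 0.0, 0.0, 0.8, 0.0, 0.8],
              [0.0, 0.0, 0.0, 0.7, 0.8, 0.0]])

# (1) Pre-processing: matrix representation (Laplacian).
L = np.diag(W.sum(axis=1)) - W

# (2) Decomposition: eigenvectors ordered by increasing eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(L)
fiedler = eigenvectors[:, 1]      # vector of the second-smallest eigenvalue

# (3) Grouping: split the 1-D embedding at 0.
labels = (fiedler > 0).astype(int)
```

The sign pattern of the second eigenvector separates the two triangles, recovering the partition with cut(A,B) = 0.3.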

Spectral Bi-partitioning Algorithm Pre-processing: build the Laplacian matrix L of the graph. Decomposition: find the eigenvalues Λ and eigenvectors X of L, then map each vertex to the corresponding component of λ2, the eigenvector of the second-smallest eigenvalue (on the slide, x1–x3 map to positive components such as 0.2, and x4–x6 to negative components such as -0.4 and -0.7).

Spectral Bi-partitioning Algorithm (continued) The slide shows the matrix whose columns are the eigenvectors of the Laplacian, matched to their corresponding eigenvalues in increasing order.

Spectral Bi-partitioning (continued) Grouping: sort the components of the reduced 1-dimensional vector and identify clusters by splitting the sorted vector in two. How to choose the splitting point? Naïve approaches: split at 0, or at the mean or median value. More expensive approaches: attempt to minimise the normalised cut criterion in 1 dimension. Splitting at 0 puts the positive points (x1, x2, x3) in cluster A and the negative points (x4, x5, x6) in cluster B.