Using Sparse Matrix Reordering Algorithms for Cluster Identification Chris Mueller Dec 9, 2004.

Presentation transcript:


Visualizing a Graph as a Matrix
Each row and column in the matrix corresponds to a node in the graph. The nodes are ordered the same in the rows and columns, so node 10 is represented by row=10 and col=10.
Each edge between two nodes (a,b) is rendered as a dot at (i,j), where i is the row for a and j is the column for b. The solid diagonal shows the identity relationship for each node.
Undirected graphs can be rendered as lower triangles, with each edge displayed so that i <= j.
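The mapping from edges to dots can be sketched in a few lines of Python. This is only an illustration, not the code behind the slides: the function names `edge_dots` and `render` and the tiny example graph are made up here. It follows the slide's stated convention of storing each undirected edge with i <= j; which visual triangle that fills depends on how the axes are drawn.

```python
def edge_dots(num_nodes, edges):
    """Return the set of (row, col) positions that get a dot.

    Nodes are assumed to be integers 0..num_nodes-1 (an assumption of
    this sketch). Every node gets a dot on the diagonal, and every
    undirected edge (a, b) is stored once with row <= col.
    """
    dots = {(v, v) for v in range(num_nodes)}
    for a, b in edges:
        i, j = min(a, b), max(a, b)
        dots.add((i, j))
    return dots


def render(num_nodes, dots):
    """Render the dot positions as text rows, '*' for a dot, '.' otherwise."""
    return [
        "".join("*" if (i, j) in dots else "." for j in range(num_nodes))
        for i in range(num_nodes)
    ]


# Hypothetical 4-node graph with edges 2-0 and 1-3.
rows = render(4, edge_dots(4, [(2, 0), (1, 3)]))
for row in rows:
    print(row)
```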

Visually Identifying Clusters
Reordering the nodes (rows/cols) can reduce the noise in the display and highlight clusters.
Dense areas in the matrix reveal potential clusters.
Some dense areas may be in the same row or column as others, suggesting a relationship.
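A small sketch of why reordering helps, using a made-up graph with two triangles whose nodes are interleaved in the original numbering. The helper names `relabel` and `render_symmetric` and the chosen ordering are assumptions for illustration; the ordering here is picked by hand, not by any of the algorithms discussed later.

```python
def relabel(edges, order):
    """Map each node to its position in `order` and rewrite the edges."""
    pos = {node: k for k, node in enumerate(order)}
    return [(pos[a], pos[b]) for a, b in edges]


def render_symmetric(num_nodes, edges):
    """Render the full symmetric dot matrix as text rows."""
    grid = [["."] * num_nodes for _ in range(num_nodes)]
    for v in range(num_nodes):
        grid[v][v] = "*"
    for a, b in edges:
        grid[a][b] = grid[b][a] = "*"
    return ["".join(row) for row in grid]


# Two hypothetical clusters, {0, 2, 4} and {1, 3, 5}, each fully connected.
edges = [(0, 2), (0, 4), (2, 4), (1, 3), (1, 5), (3, 5)]

before = render_symmetric(6, edges)
after = render_symmetric(6, relabel(edges, [0, 2, 4, 1, 3, 5]))

print("\n".join(before))
print()
print("\n".join(after))
```

In the original numbering the dots are scattered in a checkerboard pattern; after reordering, each cluster shows up as a dense block along the diagonal.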

(Some) Previous Work
The basic idea of visualizing relational data as a reordered matrix has been around since the early days of computer science. Some examples are:
– The Reorderable Matrix (Bertin, 1981)
– Block Clustering (Hartigan, 1972)
– GAP: Generalized Association Plots (Chen, 2002)
From Bertin (1981), Graphics and Graphic Information Processing.
吳漢銘 -Cluster_Lecture_ new.pdf

Sparse Matrices
Matrices are the basic data structure for most numerical computations. Sparse matrices are matrices that do not need explicit values for each element. Note that zeros may be important and cannot always be excluded from the matrix.
Sparse matrices can be stored in memory in data structures that are more compact than 2D arrays. The banded representation stores only the diagonals that have values. The bandwidth is the number of diagonals required to store the matrix. In this example, the bandwidth is 4.
[Example matrices: not recovered from the slide]
Sparse matrix reordering algorithms reorder the elements in the matrix to achieve better use of memory or computational resources. In the slide's example, swapping columns 1 and 2 reduced the bandwidth to 3, decreased the amount of storage required by 2 elements, and removed 2 empty elements.
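Since the slide's matrices did not survive, here is a hypothetical 4 x 4 matrix that behaves the same way: it needs 4 diagonals before a column swap and 3 after. The sketch uses the slide's definition of bandwidth (the number of diagonals needed to store the matrix), taken here to mean every diagonal from the lowest to the highest one that holds a non-zero. The function names and the matrix values are assumptions for illustration.

```python
def band_diagonals(matrix):
    """Count the diagonals spanned by the non-zero entries.

    The diagonal of entry (i, j) is identified by its offset j - i.
    A banded layout stores every diagonal between the smallest and
    largest offset that contains a non-zero value.
    """
    offsets = [
        j - i
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value != 0
    ]
    if not offsets:
        return 0
    return max(offsets) - min(offsets) + 1


def swap_columns(matrix, c1, c2):
    """Return a copy of the matrix with columns c1 and c2 exchanged."""
    swapped = [list(row) for row in matrix]
    for row in swapped:
        row[c1], row[c2] = row[c2], row[c1]
    return swapped


# Hypothetical example matrix (not the one from the slide).
A = [
    [2, 1, 0, 0],
    [4, 3, 5, 0],
    [6, 0, 7, 8],
    [0, 0, 9, 1],
]

B = swap_columns(A, 0, 1)  # exchange the first two columns
print(band_diagonals(A), band_diagonals(B))
```

Before the swap, the entry in the third row, first column sits two diagonals below the main diagonal; the swap moves it onto the first sub-diagonal, so one fewer diagonal has to be stored.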

Sparse Matrix Reordering Algorithms
Bandwidth Minimization: Reverse Cuthill-McKee and King's Algorithm
RCM(matrix):
    Represent the matrix as a graph
    Choose a suitable starting node
    For each node reachable from the current node:
        Output the node
        Find all unvisited neighbors
        Order them based on increasing degree
        Visit them in that order
King's algorithm is similar, but it orders based on edges out of the current cluster rather than total edges.
Minimizing Non-Zero Structure: Modified Minimum Degree
MMD(matrix):
    Represent the matrix as a graph
    Order nodes based on degree
Note that these algorithms are stochastic in the choice of starting nodes and ordering for nodes with the same degree.
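The RCM pseudocode can be turned into a minimal runnable sketch. This is not the implementation used for the results in these slides. Two details are assumptions of the sketch: the starting node of each component is the lowest-degree unvisited node, and ties are broken by node label so the output is repeatable. The final reversal of the breadth-first order is the "reverse" in Reverse Cuthill-McKee; the pseudocode above leaves it implicit. King's algorithm and MMD are not covered here.

```python
from collections import deque


def reverse_cuthill_mckee(adj):
    """Return a node ordering for an undirected graph.

    adj maps each node to the set of its neighbors. Node labels must
    be sortable, because this sketch breaks degree ties by label.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    visited = set()
    order = []
    # Assumed start rule: try nodes in order of increasing degree.
    for start in sorted(adj, key=lambda v: (degree[v], v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)  # "Output the node"
            # Unvisited neighbors, ordered by increasing degree.
            nbrs = sorted(
                (n for n in adj[node] if n not in visited),
                key=lambda n: (degree[n], n),
            )
            visited.update(nbrs)
            queue.extend(nbrs)  # "Visit them in that order"
    order.reverse()
    return order


def max_edge_span(adj, order):
    """Largest distance |pos(a) - pos(b)| over all edges under an ordering."""
    pos = {v: k for k, v in enumerate(order)}
    return max((abs(pos[a] - pos[b]) for a in adj for b in adj[a]), default=0)


# Hypothetical graph: the path 0 - 3 - 1 - 2 with scrambled labels.
adj = {0: {3}, 1: {2, 3}, 2: {1}, 3: {0, 1}}
order = reverse_cuthill_mckee(adj)
print(order, max_edge_span(adj, [0, 1, 2, 3]), max_edge_span(adj, order))
```

For the path graph, the natural labeling places the edge 0-3 three positions off the diagonal, while the reordered labeling keeps every edge next to the diagonal.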

Reordering the COG Database
Basic Protocol:
1. Filter edges based on FASTA score
    1. cmp2 is original data; cmp90 and cmp200 are filtered
2. Shuffle the data
3. For each sorted and shuffled graph
    1. Identify the connected components
    2. Apply RCM and King's algorithm to each component
    3. Apply MMD to the entire graph
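The first steps of the protocol (filter, shuffle, find components) might look roughly like the sketch below. Everything concrete in it is an assumption: the edge list and scores are invented, the rule "keep an edge when its score is at least the threshold" is a guess at how the filtering works, and reading cmp90 and cmp200 as thresholds of 90 and 200 is likewise a guess. The reordering algorithms themselves are left out.

```python
import random


def filter_edges(scored_edges, min_score):
    """Keep edges whose score meets the threshold (assumed rule)."""
    return [(a, b) for a, b, score in scored_edges if score >= min_score]


def connected_components(nodes, edges):
    """Return the connected components as sorted lists of nodes."""
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen = set()
    components = []
    for v in nodes:
        if v in seen:
            continue
        seen.add(v)
        stack = [v]
        component = []
        while stack:  # depth-first walk over one component
            u = stack.pop()
            component.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(sorted(component))
    return components


# Hypothetical scored edges: (node, node, score).
nodes = ["a", "b", "c", "d", "e"]
scored = [("a", "b", 250), ("b", "c", 95), ("c", "d", 40), ("d", "e", 210)]

edges90 = filter_edges(scored, 90)
random.Random(0).shuffle(edges90)  # step 2: shuffle the data
comps90 = connected_components(nodes, edges90)
comps200 = connected_components(nodes, filter_edges(scored, 200))
print(comps90)
print(comps200)
```

Raising the threshold removes weaker edges, so the graph breaks into more and smaller components, each of which would then be reordered separately.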

Results by the Numbers (but the pictures show sooo much more…)

Visualization Key
– Red dots are edges.
– Both axes have the nodes in the same order.
– Blue dots are the COG families for the node in column j.
– Green lines show the extent of a COG family.
– Black dots show the elements in the family.

Discussion
All algorithms worked as expected. However, the matrix ordering goals were too simple to yield good clusters.
Possible Future Work
– Extended algorithms that allow more information to be used
– Exploit features of ordering strategies to do a second pass that generates better clusters?
– Hypergraph reordering (demo of reordering by hand)