Graph Algorithms for Irregular, Unstructured Data John Feo Center for Adaptive Supercomputing Software Pacific Northwest National Laboratory July, 2010.


Analytic methods and applications

[Slide figure: application examples]
- Community thought leaders: blog analysis, community activities, Facebook users
- Connect-the-dots for national security: people, places, and actions (example entity graph with nodes such as Bus, Hayashi, Zaire, Train, Anthrax, Money, Endo)
- Semantic web
- Anomaly detection for security: N-x contingency analysis, SmartGrid

Data analytics

Sample queries:
- Allegiance switching: identify entities that switch communities
- Community structure: identify the genesis and dissipation of communities
- Phase change: identify significant changes in network structure

Traditional graph partitioning often fails:
- Topology: the interaction graph is low-diameter and has no good separators
- Irregularity: communities are not uniform in size
- Overlap: individuals are members of one or more communities

Facebook has more than 300 million active users and has grown 1000x in 3 years.
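As a minimal illustration of the first query above (an editorial sketch, not from the talk), allegiance switching can be flagged by comparing per-entity community labels across two graph snapshots. The label maps, types, and function name here are assumptions for the example; the labels could come from any community-detection pass.

#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

using EntityId = std::uint64_t;
using CommunityId = std::uint32_t;

// Return the entities whose community label changed between two snapshots.
std::vector<EntityId> allegiance_switchers(
    const std::unordered_map<EntityId, CommunityId>& before,
    const std::unordered_map<EntityId, CommunityId>& after) {
  std::vector<EntityId> switchers;
  for (const auto& [entity, old_label] : before) {
    auto it = after.find(entity);
    if (it != after.end() && it->second != old_label) {
      switchers.push_back(entity);
    }
  }
  return switchers;
}

int main() {
  // Hypothetical snapshots: entity 2 moves from community 7 to community 9.
  std::unordered_map<EntityId, CommunityId> t0 = {{1, 7}, {2, 7}, {3, 9}};
  std::unordered_map<EntityId, CommunityId> t1 = {{1, 7}, {2, 9}, {3, 9}};
  for (EntityId e : allegiance_switchers(t0, t1))
    std::cout << "entity " << e << " switched communities\n";
}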

Graphs are not grids

Graphs arising in informatics are very different from the grids used in scientific computing.

Scientific grids:
- Static or slowly evolving
- Planar
- Nearest-neighbor communication
- Work is performed per cell or node
- Work modifies local data

Graphs for data informatics:
- Dynamic
- Non-planar
- Communications are non-local and dynamic
- Work is performed by crawlers or autonomous agents
- Work modifies data in many places
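To make the contrast concrete, here is an editorial sketch (not from the slides): a structured-grid update touches neighbors at fixed offsets, while a graph stored in an assumed CSR (compressed sparse row) layout chases data-dependent indices with little locality.

#include <cstddef>
#include <vector>

// Regular 1-D grid: each cell reads its immediate neighbors at predictable strides.
void grid_smooth(const std::vector<double>& in, std::vector<double>& out) {
  for (std::size_t i = 1; i + 1 < in.size(); ++i)
    out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
}

// CSR graph: each vertex reads values at whatever vertices it happens to point to.
struct Csr {
  std::vector<std::size_t> offsets;    // offsets[v]..offsets[v+1] index into neighbors
  std::vector<std::size_t> neighbors;  // concatenated adjacency lists
};

void graph_smooth(const Csr& g, const std::vector<double>& in, std::vector<double>& out) {
  for (std::size_t v = 0; v + 1 < g.offsets.size(); ++v) {
    double sum = 0.0;
    std::size_t deg = g.offsets[v + 1] - g.offsets[v];
    for (std::size_t e = g.offsets[v]; e < g.offsets[v + 1]; ++e)
      sum += in[g.neighbors[e]];       // data-dependent, effectively random access
    out[v] = deg ? sum / deg : in[v];
  }
}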

Small-world and scale-free

In low-diameter graphs ("six degrees of separation"):
- Work explodes: a high percentage of nodes is visited within a few hops
- Difficult to partition

In scale-free graphs:
- Difficult to partition
- Work concentrates in a few nodes

[Slide figure: example network; large hubs are shown in grey]
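The "work explodes" point can be observed with a plain level-synchronous BFS that records the size of each frontier; on a low-diameter small-world graph the frontier typically covers most of the vertex set within a handful of levels. This is an illustrative sketch, reusing the assumed CSR layout from the previous example.

#include <cstddef>
#include <cstdint>
#include <vector>

struct Csr {
  std::vector<std::size_t> offsets;
  std::vector<std::uint32_t> neighbors;
};

// Run BFS from `source` and return the number of vertices in each level's frontier.
std::vector<std::size_t> bfs_frontier_sizes(const Csr& g, std::uint32_t source) {
  std::size_t n = g.offsets.size() - 1;
  std::vector<bool> visited(n, false);
  std::vector<std::uint32_t> frontier = {source};
  visited[source] = true;
  std::vector<std::size_t> sizes;
  while (!frontier.empty()) {
    sizes.push_back(frontier.size());
    std::vector<std::uint32_t> next;
    for (std::uint32_t u : frontier)
      for (std::size_t e = g.offsets[u]; e < g.offsets[u + 1]; ++e) {
        std::uint32_t v = g.neighbors[e];
        if (!visited[v]) { visited[v] = true; next.push_back(v); }
      }
    frontier.swap(next);
  }
  return sizes;  // on a scale-free graph, sizes grow explosively: 1, 30, 5000, ...
}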

Graph methods

Paths:
- Shortest path
- Betweenness
- Min/max flow

Structures:
- Spanning trees
- Connected components
- Graph isomorphism

Groups:
- Matching/coloring
- Partitioning
- Equivalence

Influential factors:
- Degree distribution (normal or scale-free)
- Planar or non-planar
- Static or dynamic
- Weighted or unweighted, and the weight distribution
- Typed or untyped edges

Consequences: load imbalance, non-planarity, concurrent inserts and deletions, and graphs that are difficult to partition.
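As one concrete instance of the "Structures" methods listed above, here is a sketch of connected components using a union-find forest over an assumed edge-list representation; the talk does not prescribe this particular algorithm.

#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

struct UnionFind {
  std::vector<std::uint32_t> parent;
  explicit UnionFind(std::size_t n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0u);  // each vertex starts as its own root
  }
  std::uint32_t find(std::uint32_t x) {
    while (parent[x] != x) { parent[x] = parent[parent[x]]; x = parent[x]; }  // path halving
    return x;
  }
  void unite(std::uint32_t a, std::uint32_t b) {
    a = find(a); b = find(b);
    if (a != b) parent[b] = a;
  }
};

// Returns one representative (component label) per vertex.
std::vector<std::uint32_t> components(
    std::size_t n,
    const std::vector<std::pair<std::uint32_t, std::uint32_t>>& edges) {
  UnionFind uf(n);
  for (auto [u, v] : edges) uf.unite(u, v);
  std::vector<std::uint32_t> label(n);
  for (std::uint32_t v = 0; v < n; ++v) label[v] = uf.find(v);
  return label;
}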

Challenges

- Problem size: a ton of bytes, not a ton of flops
- Little data locality: only parallelism is available to tolerate latencies
- Low computation-to-communication ratio: single-word accesses; threads are limited by loads and stores
- Frequent synchronization: at the level of a node, edge, or record
- Work tends to be dynamic and imbalanced: let any processor execute any thread
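The single-word-access and fine-grained-synchronization points suggest claiming work word by word rather than behind coarse locks. As a rough sketch (not the speaker's code), each newly discovered vertex below is claimed with one atomic compare-and-swap on its parent word, so any thread can process any slice of a shared frontier; the CSR layout and names are assumptions carried over from the earlier examples.

#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Csr {
  std::vector<std::size_t> offsets;
  std::vector<std::uint32_t> neighbors;
};

constexpr std::uint32_t kUnvisited = 0xFFFFFFFFu;

// Each worker walks a slice [begin, end) of the current frontier. Ownership of a
// newly discovered vertex is decided by a single compare-and-swap on its parent
// word, so no locks are taken and no worker waits on another.
void expand_slice(const Csr& g, const std::vector<std::uint32_t>& frontier,
                  std::size_t begin, std::size_t end,
                  std::vector<std::atomic<std::uint32_t>>& parent) {
  for (std::size_t i = begin; i < end; ++i) {
    std::uint32_t u = frontier[i];
    for (std::size_t e = g.offsets[u]; e < g.offsets[u + 1]; ++e) {
      std::uint32_t v = g.neighbors[e];
      std::uint32_t expected = kUnvisited;
      parent[v].compare_exchange_strong(expected, u);  // single-word claim; exactly one writer wins
    }
  }
}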

Grids, uniform, and scale-free graphs

[Slide figure: METIS partitioner output on three inputs - a USA road map, a uniform random graph, and a scale-free graph]
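The figure's message can be quantified as an edge-cut ratio: the fraction of edges whose endpoints fall in different parts of a given partition (for example, one produced by METIS). Road networks cut cleanly; scale-free graphs do not. The sketch below is illustrative only; the edge-list and part-vector inputs are assumptions.

#include <cstdint>
#include <utility>
#include <vector>

// Fraction of edges crossing between parts, given a per-vertex part assignment.
double edge_cut_ratio(
    const std::vector<std::pair<std::uint32_t, std::uint32_t>>& edges,
    const std::vector<std::uint32_t>& part) {
  if (edges.empty()) return 0.0;
  std::size_t cut = 0;
  for (auto [u, v] : edges)
    if (part[u] != part[v]) ++cut;
  return static_cast<double>(cut) / edges.size();
}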

System requirements

- Global shared memory: no simple data partitions; local storage for thread-private data
- Network support for single-word accesses: transfer multiple words when locality exists
- Multi-threaded processors: hide latency with parallelism, single-cycle context switching, multiple outstanding loads and stores per thread
- Full-and-empty bits: efficient synchronization that waits in memory
- Message-driven operations: dynamic work queues, hardware support for thread migration

The Cray XMT provides these features.
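The Cray XMT implements full/empty bits in hardware, with memory operations that wait in memory until a word's state lets them proceed. The class below is only a portable emulation sketch of that "read when full, leave empty / write when empty, leave full" behaviour, not XMT code; the method names merely echo the XMT-style readfe/writeef operations.

#include <condition_variable>
#include <mutex>

template <typename T>
class FullEmptyCell {
 public:
  // Block until the cell is empty, then store a value and mark the cell full.
  void writeef(const T& value) {
    std::unique_lock<std::mutex> lk(m_);
    cv_.wait(lk, [this] { return !full_; });
    value_ = value;
    full_ = true;
    cv_.notify_all();
  }
  // Block until the cell is full, then read the value and mark the cell empty.
  T readfe() {
    std::unique_lock<std::mutex> lk(m_);
    cv_.wait(lk, [this] { return full_; });
    full_ = false;
    cv_.notify_all();
    return value_;
  }
 private:
  std::mutex m_;
  std::condition_variable cv_;
  T value_{};
  bool full_ = false;
};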

Center for Adaptive Supercomputing Software

Driving development of next-generation massively multithreaded architectures. Sponsored by the DoD.

Summary

- The new HPC is irregular and sparse
- There are commercial and consumer applications
- If the applications are important enough, machines will be built
- HPC is too large and too diverse for "one size fits all"
- We need to build the right machines for the problems we have to solve