Massive Streaming Data Analytics: A Case Study with Clustering Coefficients. David Ediger, Karl Jiang, Jason Riedy, David A. Bader. Georgia Institute of Technology.

Presentation transcript:

Massive Streaming Data Analytics: A Case Study with Clustering Coefficients
David Ediger, Karl Jiang, Jason Riedy, David A. Bader
Georgia Institute of Technology, Atlanta, GA, USA

STINGER Data Structure
Spatio-Temporal Interaction Networks and Graphs (STING) Extensible Representation
General-purpose data structure for dynamic graphs
Efficient edge insertion/deletion (updates) with concurrent readers (analysis)

STINGER Data Structure
Array of linked lists, which may have empty slots (from deleting edges)
Additional stored info not in paper
Efficient updates
Concurrent reads (no locking)
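The "array of linked lists with empty slots" layout can be illustrated with a small single-threaded sketch. This is not the STINGER implementation: the names `EdgeBlock`, `BlockedAdjacency`, `BLOCK_SIZE`, and `EMPTY` are hypothetical, the block size is arbitrary, and the lock-free concurrent reads mentioned on the slide are not modeled. It only shows how a deletion leaves a hole that a later insertion can reuse.

```python
from typing import Iterator, List, Optional

BLOCK_SIZE = 4  # hypothetical block capacity, kept small for illustration
EMPTY = -1      # marks a slot freed by an edge deletion


class EdgeBlock:
    """One node of a per-vertex linked list holding a fixed number of edge slots."""

    def __init__(self) -> None:
        self.slots: List[int] = [EMPTY] * BLOCK_SIZE
        self.next: Optional["EdgeBlock"] = None


class BlockedAdjacency:
    """Array indexed by source vertex; each entry heads a linked list of blocks."""

    def __init__(self, num_vertices: int) -> None:
        self.heads: List[Optional[EdgeBlock]] = [None] * num_vertices

    def neighbors(self, u: int) -> Iterator[int]:
        """Yield the destinations of u's edges, skipping empty slots."""
        block = self.heads[u]
        while block is not None:
            for v in block.slots:
                if v != EMPTY:
                    yield v
            block = block.next

    def insert_edge(self, u: int, v: int) -> bool:
        """Insert edge u->v, reusing the first empty slot; False if already present."""
        free = None  # (block, index) of the first empty slot seen
        last = None
        block = self.heads[u]
        while block is not None:
            for i, w in enumerate(block.slots):
                if w == v:
                    return False
                if w == EMPTY and free is None:
                    free = (block, i)
            last = block
            block = block.next
        if free is None:
            # No hole available: append a fresh block to the list.
            new_block = EdgeBlock()
            if last is None:
                self.heads[u] = new_block
            else:
                last.next = new_block
            free = (new_block, 0)
        free[0].slots[free[1]] = v
        return True

    def delete_edge(self, u: int, v: int) -> bool:
        """Delete edge u->v by marking its slot empty; False if not found."""
        block = self.heads[u]
        while block is not None:
            for i, w in enumerate(block.slots):
                if w == v:
                    block.slots[i] = EMPTY
                    return True
            block = block.next
        return False
```

In this sketch a deletion never moves other edges, so an update touches a single slot. That property is one plausible reason such a layout suits concurrent readers, though the synchronization itself is outside the scope of the example.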

Assumptions for parallelism
Single streaming source for inserts/deletes
Changes are scattered widely
– Batches are sufficiently independent
Analysis kernels have small range
– A graph change only requires access to local portions of the graph and affects only a small portion of the output

Assumptions (continued)

Case Study: Updating Clustering Coefficients
Clustering coefficients measure the density of closed triangles
One way of determining whether a graph is a small-world graph
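The formula on this slide did not survive in the transcript. A standard definition of the local clustering coefficient of a vertex v with degree d_v that lies on T_v triangles is C_v = 2 T_v / (d_v (d_v - 1)). The sketch below, which assumes an undirected simple graph and uses the hypothetical class name `TriangleTracker`, shows why this quantity suits streaming updates: inserting or deleting an edge (u, v) changes triangle counts only at u, v, and their common neighbors. It is an illustration of the idea, not the authors' code.

```python
from collections import defaultdict


class TriangleTracker:
    """Maintains per-vertex triangle counts under edge insertions and deletions."""

    def __init__(self):
        self.adj = defaultdict(set)        # vertex -> set of neighbors
        self.triangles = defaultdict(int)  # vertex -> number of triangles it lies on

    def insert_edge(self, u, v):
        if u == v or v in self.adj[u]:
            return  # ignore self-loops and duplicate edges
        # Every common neighbor w closes exactly one new triangle (u, v, w).
        common = self.adj[u] & self.adj[v]
        self.triangles[u] += len(common)
        self.triangles[v] += len(common)
        for w in common:
            self.triangles[w] += 1
        self.adj[u].add(v)
        self.adj[v].add(u)

    def delete_edge(self, u, v):
        if v not in self.adj[u]:
            return  # edge not present
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Every common neighbor w loses exactly one triangle (u, v, w).
        common = self.adj[u] & self.adj[v]
        self.triangles[u] -= len(common)
        self.triangles[v] -= len(common)
        for w in common:
            self.triangles[w] -= 1

    def coefficient(self, v):
        """Local clustering coefficient 2*T_v / (d_v * (d_v - 1)); 0.0 if d_v < 2."""
        d = len(self.adj[v])
        if d < 2:
            return 0.0
        return 2.0 * self.triangles[v] / (d * (d - 1))
```

The cost of each update is dominated by the set intersection of the two endpoint neighborhoods, which is the step the Bloom filter slides that follow aim to approximate.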

Bloom filter
Consider an edge list represented as a bit array (1 bit per edge) => O(n) storage space
A Bloom filter is a bit array with an arbitrary, smaller number of bits
A hash function maps a vertex to a specific bit
A small number of bits means a high collision rate
To reduce false positives, use k independent hash functions to set multiple bits
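A minimal Bloom filter sketch follows, assuming vertex IDs are non-negative integers below 2^64. The class name `BloomFilter` and the choice of salted BLAKE2b from Python's `hashlib` to stand in for k independent hash functions are illustrative assumptions, not the hashing scheme used in the presentation.

```python
import hashlib


class BloomFilter:
    """Bit array of num_bits bits probed by num_hashes salted hash functions."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, vertex: int):
        """Yield the bit position chosen by each of the k hash functions."""
        data = vertex.to_bytes(8, "little")
        for i in range(self.num_hashes):
            # A distinct salt per index gives k differently keyed hashes.
            digest = hashlib.blake2b(
                data, digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, vertex: int) -> None:
        for p in self._positions(vertex):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, vertex: int) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(vertex)
        )


# Usage: summarize one vertex's neighbor list, then test membership.
neighbors_of_u = BloomFilter(num_bits=1024, num_hashes=3)
for v in (3, 17, 42):
    neighbors_of_u.add(v)
print(17 in neighbors_of_u)  # an inserted vertex is always reported present
```

A query reports a vertex as present only if all k of its bits are set, so a collision on one bit alone no longer produces a false positive. The filter never produces false negatives for vertices that were added.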

Bloom filter

Testbed
Massively multithreaded Cray XMT
– 64 Threadstorm processors, each running at 500 MHz
– Each has 128 hardware streams maintaining a thread context
– Context switches occur every cycle
– 512 GiB globally addressable shared memory (holds 2 billion vertices and 17 billion edges)
Synthetic data
– 16 million vertices, ~500 million edges