Harp: Collective Communication on Hadoop
Judy Qiu, Indiana University

Presentation transcript:

Acknowledgement

SALSA HPC Group, School of Informatics and Computing, Indiana University
 - Prof. David Crandall (Computer Vision)
 - Prof. Filippo Menczer (Complex Networks and Systems)
 - Prof. Haixu Tang (Bioinformatics)
 - Prof. Madhav Marathe (Network Science and HCI)
 - Prof. Andrew Ng (Machine Learning)
 - Bingjing Zhang, Xiaoming Gao, Stephen Wu, Thilina Gunarathne, Yuan Young, Zhenghao Gu

Machine Learning on Big Data

Extracting Knowledge with Data Analytics
 - Mahout on Hadoop
 - MLlib on Spark
 - GraphLab Toolkits
 - GraphLab Computer Vision Toolkit

The World of Big Data Tools

[Figure: tools arranged by programming model (MapReduce Model, DAG Model, Graph Model, BSP/Collective Model) and by purpose (For Iterations/Learning, For Streaming, For Query).]

Tools shown: Hadoop, MPI, Dryad/DryadLINQ, Pig/PigLatin, Spark, Shark, Spark Streaming, Storm, S4, Samza, Drill, MRQL, Hive, Tez, Giraph, Hama, GraphLab, GraphX, HaLoop, Twister, Harp, Stratosphere, Reef

Do we need 140 software packages?

Programming Runtimes

High-level programming models such as MapReduce adopt a data-centered design:
 - Computation starts from data
 - Support moving computation to data
 - Show promising results for data-intensive computing (Google, Yahoo, Amazon, Microsoft, ...)

Challenges: traditional MapReduce and classical parallel runtimes cannot run iterative algorithms efficiently.
 - Hadoop: repeated data access to HDFS; no optimization for (in-memory) data caching or (collective) intermediate data transfers
 - MPI: no natural support for fault tolerance; the programming interface is complicated

[Figure: spectrum of runtimes between "Perform Computations Efficiently" and "Achieve Higher Throughput": MPI, PVM, Hadoop MapReduce; Chapel, X10, HPF; Pig Latin, Hive; Classic Cloud: Queues, Workers; DAGMan, BOINC; Workflows, Swift, Falkon; PaaS: Worker Roles.]

Applications & Different Interconnection Patterns

(a) Map Only (Pleasingly Parallel). Pattern: Input -> map -> Output, no communication.
 - CAP3 Gene Analysis
 - Smith-Waterman Distances
 - Document conversion (PDF -> HTML)
 - Brute force searches in cryptography
 - Parametric sweeps
 - PolarGrid MATLAB data analysis

(b) Classic MapReduce. Pattern: Input -> map -> reduce.
 - High Energy Physics (HEP) Histograms
 - Distributed search
 - Distributed sorting
 - Information retrieval
 - Calculation of Pairwise Distances for sequences (BLAST)

(c) Iterative MapReduce. Pattern: Input -> map -> reduce, with iterations; collective communication.
 - Expectation maximization algorithms
 - Linear Algebra
 - Data mining, including K-means clustering
 - Deterministic Annealing Clustering
 - Multidimensional Scaling (MDS)
 - PageRank

(d) Loosely Synchronous. Pattern: Pij; MPI.
 - Many MPI scientific applications utilizing a wide variety of communication constructs, including local interactions
 - Solving Differential Equations and particle dynamics with short range forces

Domain of MapReduce and Iterative Extensions: (a) to (c). MPI: (d).

Iterative MapReduce
 - MapReduce is a programming model that instantiates the paradigm of bringing computation to data
 - Iterative MapReduce extends the MapReduce programming model to support iterative algorithms for data mining and data analysis
 - Is it possible to use the same computational tools on HPC and Cloud?
 - Enabling scientists to focus on science, not on programming distributed systems

Data Analysis Tools

MapReduce optimized for iterative computations
Twister: the speedy elephant

Abstractions:
 - In-Memory: cacheable map/reduce tasks
 - Data Flow: iterative; loop-invariant data; variable data
 - Thread: lightweight; local aggregation
 - Map-Collective: communication patterns optimized for large intermediate data transfer
 - Portability: HPC (Java), Azure Cloud (C#), Supercomputer (C++, Java)

Programming Model for Iterative MapReduce
 - Distinction between loop-invariant data and variable data (data flow vs. δ flow)
 - Cacheable map/reduce tasks (in memory)
 - Loop-invariant data loaded only once
 - Faster intermediate data transfer mechanism
 - Combine operation to collect all reduce outputs

Main program:
    while(..) { runMapReduce(..) }

Operations: Configure(), Map(Key, Value), Reduce(Key, List), Combine(Map)
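The programming model above can be sketched in a few lines of Python. This is a minimal single-process illustration, not Twister's API: the names `CachedMapTask`, `run_map_reduce`, `reduce_fn`, `combine`, and `kmeans` are hypothetical, and K-means is used only because the slides list it as an iterative MapReduce application. The point is that the loop-invariant data (the points) is loaded once into cached map tasks, while only the variable data (the centroids) flows through each iteration.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CachedMapTask:
    """Hypothetical map task that keeps its loop-invariant partition in memory."""

    def __init__(self, points):
        # Plays the role of Configure(): the partition is loaded only once.
        self.points = points

    def map(self, centroids):
        # Variable data (centroids) arrives every iteration.
        # Emits (nearest centroid index, (point, 1)) pairs.
        out = []
        for p in self.points:
            idx = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            out.append((idx, (p, 1)))
        return out


def reduce_fn(key, values):
    """Average all points assigned to one centroid."""
    dim = len(values[0][0])
    total = [0.0] * dim
    count = 0
    for vec, c in values:
        for d in range(dim):
            total[d] += vec[d]
        count += c
    return key, [t / count for t in total]


def combine(reduce_outputs, old_centroids):
    """Collect all reduce outputs into the next set of centroids."""
    new = [list(c) for c in old_centroids]
    for key, centroid in reduce_outputs:
        new[key] = centroid
    return new


def run_map_reduce(tasks, centroids):
    grouped = defaultdict(list)
    for task in tasks:
        for key, value in task.map(centroids):
            grouped[key].append(value)
    return combine([reduce_fn(k, vs) for k, vs in grouped.items()], centroids)


def kmeans(partitions, centroids, max_iter=20, tol=1e-9):
    tasks = [CachedMapTask(p) for p in partitions]  # created once, reused per iteration
    for _ in range(max_iter):  # the main program's while loop
        new = run_map_reduce(tasks, centroids)
        shift = max(sq_dist(a, b) for a, b in zip(new, centroids))
        centroids = new
        if shift < tol:
            break
    return centroids
```

Calling `kmeans` with a list of point partitions and initial centroids returns the final centroids; the map tasks are never rebuilt between iterations, which is the caching behavior the slide describes.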

High Performance Data Movement

Broadcast Comparison: Twister vs. MPI vs. Spark
 - At least a factor of 120 improvement on 125 nodes, compared with the simple broadcast algorithm
 - The new topology-aware chain broadcasting algorithm gives 20% better performance than the best C/C++ MPI methods (four times faster than Java MPJ)
 - A factor of 5 improvement over the non-optimized (for topology) pipeline-based method over 150 nodes
 - Tested on IU Polar Grid with 1 Gbps Ethernet connection
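To see why a pipelined chain can beat a simple broadcast by a factor close to the node count, the following idealized sketch may help. It is not the Twister implementation and ignores latency, topology, and contention; the function names and the cost model (every link has the same bandwidth, and a node forwards one chunk per step while receiving the next) are assumptions made for illustration.

```python
def simple_broadcast_time(n_receivers, size, bandwidth):
    """Root sends the whole payload to each receiver in turn over its single link."""
    return n_receivers * size / bandwidth


def chain_broadcast_time(n_receivers, size, bandwidth, n_chunks):
    """Pipelined chain root -> node 0 -> node 1 -> ...

    The last node receives the final chunk after (n_chunks + n_receivers - 1)
    chunk-transfer steps.
    """
    chunk_time = (size / n_chunks) / bandwidth
    return (n_chunks + n_receivers - 1) * chunk_time


def simulate_chain(n_receivers, chunks):
    """Step-by-step simulation of the chain; returns (steps, data received per node)."""
    received = [[] for _ in range(n_receivers)]
    steps = 0
    while any(len(r) < len(chunks) for r in received):
        # Walk from the tail so each chunk advances exactly one hop per step.
        for node in range(n_receivers - 1, -1, -1):
            have = len(received[node])
            upstream_has = len(chunks) if node == 0 else len(received[node - 1])
            if have < len(chunks) and upstream_has > have:
                received[node].append(chunks[have])
        steps += 1
    return steps, received
```

Under this model the ratio `simple_broadcast_time(...) / chain_broadcast_time(...)` approaches the number of receivers as the number of chunks grows, because the chain keeps every link busy at once instead of serializing all transfers on the root's link.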

Harp
 - Hadoop Plugin (on Hadoop and Hadoop 2.2.0)
 - Map-Collective Communication Model: we generalize the Map-Reduce concept to Map-Collective, noting that large collectives are a distinguishing feature of data-intensive and data mining applications.

[Figure: Parallelism Model. MapReduce Model: map tasks (M) feed reduce tasks (R) through Shuffle. Map-Collective Model: map tasks (M) exchange data through Collective Communication.]

[Figure: Architecture. Application: MapReduce Applications, Map-Collective Applications. Framework: MapReduce V2, Harp. Resource Manager: YARN.]

Hierarchical Data Abstraction and Collective Communication

Table level
 - Types: Array Table, Edge Table, Message Table, Vertex Table, KeyValue Table
 - Operations: Broadcast, Allgather, Allreduce, Regroup-(combine/reduce), Message-to-Vertex, Edge-to-Vertex

Partition level
 - Types: Array Partition, Edge Partition, Message Partition, Vertex Partition, KeyValue Partition
 - Operations: Broadcast, Send, Gather

Basic Types level
 - Types: Byte Array, Int Array, Long Array, Double Array, Struct Object; Vertices, Edges, Messages; Commutable Key-Values
 - Operations: Broadcast, Send
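The table and partition hierarchy can be illustrated with a small in-process sketch of an allreduce over tables. This is not Harp's Java API; the `Table` class, `add_partition`, `elementwise_sum`, and `allreduce` are hypothetical names, and the workers are simulated as plain objects in one process. The idea shown is that a table holds partitions keyed by ID, partitions with the same ID are merged by a combiner, and after allreduce every worker holds the same combined table.

```python
from typing import Callable, Dict, List

Partition = List[float]


class Table:
    """Hypothetical table: partitions keyed by ID, merged by a combiner on collision."""

    def __init__(self, combiner: Callable[[Partition, Partition], Partition]):
        self.partitions: Dict[int, Partition] = {}
        self.combiner = combiner

    def add_partition(self, pid: int, data: Partition) -> None:
        if pid in self.partitions:
            # Same partition ID: combine instead of overwriting.
            self.partitions[pid] = self.combiner(self.partitions[pid], data)
        else:
            self.partitions[pid] = list(data)


def elementwise_sum(a: Partition, b: Partition) -> Partition:
    return [x + y for x, y in zip(a, b)]


def allreduce(tables: List[Table]) -> None:
    """Combine every worker's table, then give each worker a copy of the result."""
    merged = Table(tables[0].combiner)
    for table in tables:
        for pid, data in table.partitions.items():
            merged.add_partition(pid, data)
    for table in tables:
        table.partitions = {pid: list(d) for pid, d in merged.partitions.items()}
```

A real runtime would move partitions over the network and could pick different algorithms for the exchange; the sketch only fixes the semantics that the application sees.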

K-means Clustering Parallel Efficiency

[Figure. Source: Shantenu Jha et al., "A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures".]

WDA-MDS Performance on Big Red II

Data Intensive Kmeans Clustering
 - Image classification: 7 million images; 512 features per image; 1 million clusters
 - 10K Map tasks
 - 64G broadcasting data (1GB data transfer per Map task node)
 - 20 TB intermediate data in shuffling
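A rough back-of-the-envelope check shows how intermediate data can reach this scale. The slide does not say how wide each value is or what each map task emits, so the sketch below assumes 4-byte values and that every map task emits one partial sum per cluster per feature; both are assumptions, and the variable names are illustrative.

```python
# Figures taken from the slide.
n_clusters = 1_000_000
n_features = 512
n_map_tasks = 10_000

# Assumption (not stated on the slide): each value is a 4-byte float.
bytes_per_value = 4

# Assumption: each map task emits a full set of per-cluster partial sums.
per_task_bytes = n_clusters * n_features * bytes_per_value
total_bytes = per_task_bytes * n_map_tasks

per_task_gb = per_task_bytes / 1e9  # decimal gigabytes
total_tb = total_bytes / 1e12       # decimal terabytes
```

Under these assumptions each map task produces about 2 GB and the job about 20.5 TB in total, which is the same order as the 20 TB of shuffle data quoted on the slide.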

Apache Open Source Project
 - Provides system authors with a centralized (pluggable) control flow
 - Embeds a user-defined system controller called the Job Driver
 - Event-driven control
 - Packages a variety of data-processing libraries (e.g., high-bandwidth shuffle, relational operators, low-latency group communication) in a reusable form
 - Covers different models such as MapReduce, query, graph processing, and stream data processing

Future Work
 - Research runtimes that will run algorithms at a much larger scale
 - Provide a data service for clustering and MDS algorithms