SALSA HPC Group, School of Informatics and Computing, Indiana University. Workshop on Petascale Data Analytics: Challenges and Opportunities, SC11

Twister: Bingjing Zhang, Richard Teng
Twister4Azure: Thilina Gunarathne
High-Performance Visualization Algorithms for Data-Intensive Analysis: Seung-Hee Bae, Jong Youl Choi

DryadLINQ CTP Evaluation: Hui Li, Yang Ruan, Yuduo Zhou
Cloud Storage, FutureGrid: Xiaoming Gao, Stephen Wu
Million Sequence Challenge: Saliya Ekanayake, Adam Hughs, Yang Ruan
Cyberinfrastructure for Remote Sensing of Ice Sheets: Jerome Mitchell

Intel’s Application Stack

(Architecture stack diagram, top to bottom:)
Applications: Kernels, Genomics, Proteomics, Information Retrieval, Polar Science, Scientific Simulation Data Analysis and Management, Dissimilarity Computation, Clustering, Multidimensional Scaling, Generative Topological Mapping; support of scientific simulations (data mining and data analysis); Security, Provenance, Portal; Services and Workflow; High Level Language
Programming Model: Cross-platform Iterative MapReduce (Collectives, Fault Tolerance, Scheduling)
Runtime / Storage: Distributed File Systems, Data Parallel File System, Object Store
Infrastructure: Virtualization; Linux HPC bare-system, Amazon Cloud, Windows Server HPC bare-system, Azure Cloud, Grid Appliance
Hardware: CPU Nodes, GPU Nodes

What are the challenges? Providing both cost effectiveness and powerful parallel programming paradigms that are capable of handling the enormous increases in dataset sizes: large-scale data analysis for data-intensive applications.

Research issues:
- portability between HPC and Cloud systems
- scaling performance
- fault tolerance
These challenges must be met for both computation and storage; if computation and storage are separated, it is not possible to bring computing to the data.

Data locality:
- its impact on performance
- the factors that affect data locality
- the maximum degree of data locality that can be achieved

Factors beyond data locality can improve performance: achieving the best data locality is not always the optimal scheduling decision. For instance, if the node storing a task's input data is overloaded, running the task there degrades performance (see the sketch below).

Task granularity and load balance: in MapReduce, task granularity is fixed. This mechanism has two drawbacks: (1) a limited degree of concurrency, and (2) load imbalance resulting from variation in task execution times.
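
A minimal sketch of the locality-versus-load trade-off described above. Everything here (class names, the 0.8 threshold) is made up for illustration, not taken from an actual framework: the scheduler prefers the node holding a task's input data, but falls back to the least-loaded node when the data-local node is overloaded.

    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;

    public class LocalityAwareSchedulerSketch {
        static final double LOAD_THRESHOLD = 0.8;  // assumed overload cutoff (made up)

        static class Node {
            final String id;
            final double load;                     // fraction of task slots busy, 0.0 to 1.0
            Node(String id, double load) { this.id = id; this.load = load; }
        }

        // Prefer the node holding the task's input data; if it is overloaded,
        // trade locality for load balance and pick the least-loaded node instead.
        static Node schedule(Node dataLocalNode, List<Node> allNodes) {
            if (dataLocalNode.load < LOAD_THRESHOLD) return dataLocalNode;
            return allNodes.stream()
                    .min(Comparator.comparingDouble(n -> n.load))
                    .orElse(dataLocalNode);
        }

        public static void main(String[] args) {
            Node a = new Node("a", 0.95), b = new Node("b", 0.30), c = new Node("c", 0.50);
            // The input data lives on "a", but "a" is overloaded, so the task goes to "b".
            System.out.println(schedule(a, Arrays.asList(a, b, c)).id);
        }
    }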

Programming Models and Tools: MapReduce in a Heterogeneous Environment (Microsoft)

Motivation: the Data Deluge, experienced in many domains, meets MapReduce (data centered, QoS) and classic parallel runtimes such as MPI (efficient and proven techniques). Goal: expand the applicability of MapReduce to more classes of applications: Map-Only, MapReduce, Iterative MapReduce, and further extensions. (Diagram: Map-Only is input -> map -> output; MapReduce is input -> map -> reduce; Iterative MapReduce adds iterations over map and reduce; MPI is shown as general communication Pij.)

- Distinction between static and variable data
- Configurable long-running (cacheable) map/reduce tasks
- Pub/sub messaging based communication/data transfers
- Broker network for facilitating communication

Main program's process space:

    configureMaps(..)
    configureReduce(..)
    while(condition){
        runMapReduce(..)    // Map() and Reduce() on worker nodes; Combine() operation returns results
        updateCondition()
    } //end while
    close()

Iterations use cacheable map/reduce tasks with data on local disk. Communications and data transfers go via the pub-sub broker network and direct TCP; workers may send <Key,Value> pairs directly. The main program may contain many MapReduce invocations or iterative MapReduce invocations.
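
To make the control flow above concrete, here is a self-contained, single-process Java sketch of the same iterative pattern applied to 1-D K-means (illustrative only, not Twister code): the points play the role of the cached static map input, and the centroids are the variable data recomputed each iteration.

    import java.util.Arrays;

    public class IterativeKMeansSketch {
        public static void main(String[] args) {
            // configureMaps(..): static data, cached across iterations (here, 1-D points).
            double[] points = {1.0, 1.2, 0.9, 8.0, 8.2, 7.9};
            // Variable data, re-sent to the map tasks on every iteration.
            double[] centroids = {0.0, 10.0};

            for (int iter = 0; iter < 20; iter++) {          // while(condition)
                double[] sums = new double[centroids.length];
                int[] counts = new int[centroids.length];
                for (double p : points) {                    // map: nearest centroid per point
                    int best = 0;
                    for (int k = 1; k < centroids.length; k++)
                        if (Math.abs(p - centroids[k]) < Math.abs(p - centroids[best])) best = k;
                    sums[best] += p;                         // emit (best, p)
                    counts[best]++;
                }
                double[] next = new double[centroids.length];
                for (int k = 0; k < centroids.length; k++)   // reduce/combine: new centroids
                    next[k] = counts[k] == 0 ? centroids[k] : sums[k] / counts[k];
                if (Arrays.equals(next, centroids)) break;   // updateCondition()
                centroids = next;
            }
            System.out.println(Arrays.toString(centroids));  // close(): final model
        }
    }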

(Twister architecture diagram: the Master Node runs the Twister Driver and the Main Program; each Worker Node runs a Twister Daemon with a worker pool, cacheable map/reduce tasks, and a local disk; a pub/sub broker network connects driver and daemons, with one broker serving several Twister daemons. Scripts perform data distribution, data collection, and partition file creation.)

Components of Twister

Azure Queues for scheduling, Tables to store meta-data and monitoring data, Blobs for input/output/intermediate data storage.

Twister4Azure uses distributed, highly scalable and highly available cloud services as its building blocks:
- Utilizes eventually-consistent, high-latency cloud services effectively to deliver performance comparable to traditional MapReduce runtimes
- Decentralized architecture with global-queue-based dynamic task scheduling (sketched below)
- Minimal management and maintenance overhead
- Supports dynamically scaling the compute resources up and down
- MapReduce fault tolerance
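
The decentralized, queue-based scheduling idea can be sketched in a few lines of self-contained Java (an illustration, not Twister4Azure code): a java.util.concurrent queue stands in for the Azure global task queue, and each worker pulls its next task whenever it becomes free.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    public class GlobalQueueSchedulingSketch {
        public static void main(String[] args) throws InterruptedException {
            // Shared global task queue; in Twister4Azure this role is played by an Azure Queue.
            BlockingQueue<String> taskQueue = new LinkedBlockingQueue<>();
            for (int i = 0; i < 8; i++) taskQueue.add("map-task-" + i);

            int workers = 3;
            ExecutorService pool = Executors.newFixedThreadPool(workers);
            for (int w = 0; w < workers; w++) {
                pool.submit(() -> {
                    String task;
                    // Decentralized pull scheduling: each worker takes its next task as soon
                    // as it is free, so faster workers naturally process more tasks.
                    while ((task = taskQueue.poll()) != null) {
                        System.out.println(Thread.currentThread().getName() + " ran " + task);
                        // With a real Azure Queue, the message is deleted only after success;
                        // if a worker fails mid-task, the message reappears after its
                        // visibility timeout and another worker re-executes it.
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }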

- Cap3 Sequence Assembly
- Smith Waterman Sequence Alignment
- BLAST Sequence Search

(Performance charts: performance with and without data caching; speedup gained using the data cache; scaling speedup with increasing number of iterations; number of executing map tasks histogram; strong scaling with 128M data points; weak scaling; task execution time histogram.)

(Performance charts: performance with and without data caching; speedup gained using the data cache; scaling speedup with increasing number of iterations; Azure instance type study; number of executing map tasks histogram; weak scaling; data size scaling; task execution time histogram.)

This demo shows real-time visualization of a multidimensional scaling (MDS) calculation. We use Twister to perform the parallel computation inside the cluster and PlotViz to display the intermediate results on the user's client computer. The process of computation and monitoring is automated by the program.

MDS projection of 100,000 protein sequences showing a few experimentally identified clusters in preliminary work with Seattle Children’s Research Institute

(Diagram: the Master Node runs the Twister Driver and Twister-MDS; the Client Node runs the MDS Monitor and PlotViz; ActiveMQ brokers link them. I. The client sends a message to start the job. II. Intermediate results are sent back to the client.)

(Diagram: Twister-MDS on the Master Node drives, on each iteration, two MapReduce computations, calculateBC and calculateStress, over the worker pools of the Twister daemons on the Worker Nodes; the pub/sub broker network delivers MDS output to the monitoring interface. A serial sketch of these two steps follows.)
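
As a serial illustration of the two steps named in the diagram, here is a sketch of a SMACOF-style MDS iteration (not the actual Twister-MDS code; the toy dissimilarity matrix is made up): each iteration computes B(X)X ("calculateBC") and then the stress ("calculateStress"). In Twister-MDS, each of these is a MapReduce computation spread over the workers.

    import java.util.Arrays;
    import java.util.Random;

    public class SmacofSketch {
        public static void main(String[] args) {
            // Toy dissimilarity matrix for 4 points (symmetric, zero diagonal); made up.
            double[][] delta = {
                {0, 1, 2, 3},
                {1, 0, 1, 2},
                {2, 1, 0, 1},
                {3, 2, 1, 0}
            };
            int n = delta.length, d = 2;
            Random rnd = new Random(42);
            double[][] x = new double[n][d];
            for (double[] row : x) for (int j = 0; j < d; j++) row[j] = rnd.nextDouble();

            for (int iter = 0; iter < 100; iter++) {
                // "calculateBC": build B(X) and apply the update X <- (1/n) * B(X) * X.
                // In Twister-MDS, rows of this product are computed by parallel map tasks.
                double[][] dist = distances(x);
                double[][] b = new double[n][n];
                for (int i = 0; i < n; i++) {
                    double diag = 0;
                    for (int j = 0; j < n; j++) {
                        if (i == j || dist[i][j] == 0) continue;
                        b[i][j] = -delta[i][j] / dist[i][j];
                        diag -= b[i][j];
                    }
                    b[i][i] = diag;
                }
                double[][] next = new double[n][d];
                for (int i = 0; i < n; i++)
                    for (int k = 0; k < n; k++)
                        for (int j = 0; j < d; j++)
                            next[i][j] += b[i][k] * x[k][j] / n;
                x = next;

                // "calculateStress": sum of squared residuals over all pairs,
                // the second MapReduce computation in Twister-MDS.
                double stress = 0;
                double[][] nd = distances(x);
                for (int i = 0; i < n; i++)
                    for (int j = i + 1; j < n; j++)
                        stress += Math.pow(delta[i][j] - nd[i][j], 2);
                if (stress < 1e-9) break;   // driver-side convergence check
                // In the demo, intermediate coordinates x would be published to PlotViz here.
            }
            System.out.println(Arrays.deepToString(x));
        }

        static double[][] distances(double[][] x) {
            int n = x.length;
            double[][] dist = new double[n][n];
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++) {
                    double s = 0;
                    for (int k = 0; k < x[i].length; k++)
                        s += (x[i][k] - x[j][k]) * (x[i][k] - x[j][k]);
                    dist[i][j] = Math.sqrt(s);
                }
            return dist;
        }
    }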

Map-Collective Model: the user program drives iterations of map and reduce, with an initial collective step and a final collective step, each performed over a network of brokers.

(Broker topology diagrams: Twister driver node, Twister daemon nodes, and ActiveMQ broker nodes, linked by broker-driver, broker-daemon, and broker-broker connections; one configuration has 7 brokers and 32 computing nodes in total, the other 5 brokers and 4 computing nodes in total.)

Twister New Architecture (diagram): the Master Node runs the Twister Driver (configure mapper, add to MemCache) and a broker; each Worker Node runs a Twister Daemon with cacheable map, reduce, and merge tasks. Broadcast data flows down a map broadcasting chain, and results flow back along a reduce collection chain.

Chain/Ring Broadcasting (between the Twister driver node and the Twister daemon nodes):

Driver sender:
- send broadcasting data
- get acknowledgement
- send next broadcasting data
- ...

Daemon sender:
- receive data from the previous daemon (or the driver)
- cache the data on the daemon
- send the data to the next daemon (waiting for its ACK)
- send an acknowledgement to the previous daemon

Chain Broadcasting Protocol (sequence diagram): the driver sends each cache block to Daemon 0 and waits for an acknowledgement before sending the next block; each daemon receives and handles the data, sends it on to the next daemon, and acknowledges its predecessor; nodes detect both the end of the daemon chain and the end of each cache block.
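
A single-process Java sketch of this protocol (an illustration, not the Twister implementation; the chain is simulated with direct method calls rather than sockets): each daemon caches a block, forwards it down the chain, waits for the downstream ACK, and only then acknowledges its predecessor, while the driver sends the next block only after the previous one is acknowledged.

    import java.util.ArrayList;
    import java.util.List;

    public class ChainBroadcastSketch {
        static class Daemon {
            final int id;
            Daemon next;                              // next daemon in the chain, null at the end
            final List<byte[]> cache = new ArrayList<>();
            Daemon(int id) { this.id = id; }

            // Receive one cache block: cache it, forward it down the chain,
            // wait for the downstream ACK, then acknowledge the sender.
            boolean receive(byte[] block) {
                cache.add(block);                     // "cache data to daemon"
                if (next != null && !next.receive(block)) {
                    return false;                     // downstream failure: no ACK
                }
                return true;                          // ACK to the previous daemon (or driver)
            }
        }

        public static void main(String[] args) {
            // Build a chain of three daemons: 0 -> 1 -> 2.
            Daemon d0 = new Daemon(0), d1 = new Daemon(1), d2 = new Daemon(2);
            d0.next = d1;
            d1.next = d2;

            // Driver sender: send a block, get the acknowledgement, send the next block.
            byte[][] blocks = { "block-A".getBytes(), "block-B".getBytes() };
            for (byte[] block : blocks) {
                boolean ack = d0.receive(block);
                System.out.println("driver got ack = " + ack);
            }
            System.out.println("daemon 2 cached " + d2.cache.size() + " blocks");
        }
    }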

Broadcasting Time Comparison

Applications & Different Interconnection Patterns

Map Only (input -> map -> output): CAP3 analysis; document conversion (PDF -> HTML); brute force searches in cryptography; parametric sweeps. Examples: CAP3 gene assembly, PolarGrid Matlab data analysis.

Classic MapReduce (input -> map -> reduce): High Energy Physics (HEP) histograms; SWG gene alignment; distributed search; distributed sorting; information retrieval. Examples: information retrieval, HEP data analysis, calculation of pairwise distances for ALU sequences.

Iterative Reductions, Twister (input -> map -> reduce, with iterations): expectation maximization algorithms; clustering; linear algebra. Examples: Kmeans, deterministic annealing clustering, multidimensional scaling (MDS).

Loosely Synchronous (MPI, Pij): many MPI scientific applications utilizing a wide variety of communication constructs, including local interactions. Examples: solving differential equations, particle dynamics with short-range forces.

The first three patterns are the domain of MapReduce and its iterative extensions; the last is the domain of MPI.

Scheduling vs. Computation of Dryad in a Heterogeneous Environment

Runtime Issues

- Development of a library of collectives to use at the Reduce phase: Broadcast and Gather are needed by current applications; discover other important ones; implement them efficiently on each platform, especially Azure
- Better software message routing with broker networks, using asynchronous I/O with communication fault tolerance
- Support nearby location of data and computing using data parallel file systems
- Clearer application fault tolerance model based on implicit synchronization points at iteration end points
- Later: investigate GPU support
- Later: runtime for data parallel languages like Sawzall, Pig Latin, LINQ

Convergence is Happening: multicore, clouds, and data-intensive paradigms are converging. Data-intensive applications involve three basic activities: capture, curation, and analysis (visualization); clouds provide the infrastructure and runtime; multicore provides parallel threading and processes.

FutureGrid: a Grid Testbed
- IU Cray operational; IU IBM (iDataPlex) completed stability test May 6
- UCSD IBM operational; UF IBM stability test completes ~May 12
- Network, NID, and PU HTC system operational
- UC IBM stability test completes ~May 27; TACC Dell awaiting delivery of components
NID: Network Impairment Device
(Map: private and public FG network.)

Dynamic Cluster Architecture: SALSAHPC Dynamic Virtual Cluster on FutureGrid (demo at SC09), demonstrating the concept of Science on Clouds on FutureGrid.
- Switchable clusters on the same hardware (~5 minutes to switch between operating systems, e.g. Linux+Xen to Windows+HPCS)
- Support for virtual clusters
- SW-G (Smith Waterman Gotoh) dissimilarity computation as a pleasingly parallel problem suitable for MapReduce-style applications
(Diagram: monitoring and control infrastructure (pub/sub broker network, summarizer, switcher, monitoring interface) over virtual/physical clusters (SW-G using Hadoop on Linux bare-system and on Linux on Xen; SW-G using DryadLINQ on Windows Server 2008 bare-system), provisioned through XCAT infrastructure on iDataplex bare-metal nodes (32 nodes).)

SALSAHPC Dynamic Virtual Cluster on FutureGrid, demo at SC09. Top: three clusters switch applications on a fixed environment, taking approximately 30 seconds. Bottom: a cluster switches between environments (Linux; Linux+Xen; Windows+HPCS), taking approximately 7 minutes. The demo illustrates the concept of Science on Clouds using a FutureGrid iDataPlex cluster.

SALSA HPC Group Indiana University