Scalable Parallel Computing on Clouds. Thilina Gunarathne. Advisor: Prof. Geoffrey Fox. Committee: Prof. Judy Qiu, Prof. Beth Plale, Prof. David Leake

Presentation transcript:

Scalable Parallel Computing on Clouds. Thilina Gunarathne. Advisor: Prof. Geoffrey Fox. Committee: Prof. Judy Qiu, Prof. Beth Plale, Prof. David Leake.

Clouds for scientific computations: no upfront cost; zero maintenance; horizontal scalability; compute, storage and other services. However: loose service guarantees; not trivial to utilize effectively.

Scalable Parallel Computing on Clouds: programming models, scalability, performance, fault tolerance, monitoring.

Pleasingly Parallel Frameworks. [Diagrams: Classic Cloud frameworks (input data set, data file, executable); MapReduce (HDFS, Map(), optional Reduce phase, results). Example application: Cap3 sequence assembly.]

MapReduce Programming Model: simple programming model; excellent, scalable fault tolerance; moves computation to the data; works very well for data-intensive pleasingly parallel applications. Ideal for data-intensive parallel applications.

MRRoles4Azure: the first pure MapReduce framework for the Azure cloud. Built on highly available and scalable Azure cloud services; hides the complexity of the cloud and its services; copes with, and effectively uses, eventually consistent, high-latency cloud services. Decentralized control avoids a single point of failure. Global-queue-based dynamic scheduling; dynamically scales up and down; typical MapReduce fault tolerance; minimal maintenance and management overhead.

MRRoles4Azure: Azure Queues for scheduling, Tables to store metadata and monitoring data, Blobs for input, output and intermediate data storage.
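The division of labour among the three storage services can be sketched in a few lines. This is only an in-process illustration, not the MRRoles4Azure code: `task_queue`, `meta_table` and `blob_store` are hypothetical stand-ins built from Python standard-library types, and the word-count map function is an assumed example workload.

```python
from collections import Counter
from queue import Queue, Empty

# In-process stand-ins (assumptions) for the three Azure storage services.
task_queue = Queue()   # role of an Azure Queue: task scheduling
meta_table = {}        # role of an Azure Table: metadata and monitoring data
blob_store = {}        # role of Blob storage: input and intermediate data


def submit_job(inputs):
    """Store each input as a 'blob' and enqueue one map task per input."""
    for name, text in inputs.items():
        blob_store[name] = text
        meta_table[name] = "queued"
        task_queue.put(name)


def worker_loop(worker_id):
    """A worker pulls tasks from the global queue until none are left."""
    done = 0
    while True:
        try:
            name = task_queue.get_nowait()
        except Empty:
            return done
        counts = Counter(blob_store[name].split())    # example map function
        blob_store[f"intermediate/{name}"] = counts   # intermediate data
        meta_table[name] = f"done by {worker_id}"     # monitoring data
        done += 1


submit_job({"part-0": "a b a", "part-1": "b c"})
completed = worker_loop("worker-0")
```

Because workers pull from a shared queue rather than being assigned work by a master, adding or removing workers does not require any central coordination in this sketch.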

MRRoles4Azure

SWG Sequence Alignment: Smith-Waterman-Gotoh (SWG) to calculate all-pairs dissimilarity. Costs less than EMR; performance comparable to Hadoop and EMR.

Data Intensive Iterative Applications: a growing class of applications – clustering, data mining, machine learning and dimension reduction applications – driven by the data deluge and emerging computation fields. [Diagram: in each iteration the smaller loop-variant data is broadcast, the computation runs over the larger loop-invariant data, and a communication and reduce/barrier step precedes the new iteration.]

Iterative MapReduce for Azure Cloud: Merge step; programming model extensions to support broadcast data; hybrid intermediate data transfer; in-memory/disk caching of static data.

Hybrid Task Scheduling: cache-aware hybrid scheduling; decentralized; fault tolerant; multiple MapReduce applications within an iteration. [Diagram labels: first iteration through queues; new iteration in Job Bulletin Board; data in cache + task metadata history; left-over tasks.]

[Figures: performance with/without data caching (speedup gained using data cache); scaling speedup with increasing number of iterations; number of executing map tasks histogram; strong scaling with 128M data points; weak scaling; task execution time histogram.] Notes: the first iteration performs the initial data fetch; overhead between iterations; scales better than Hadoop on bare metal.

Applications: bioinformatics pipeline. Gene sequences → pairwise alignment and distance calculation, O(NxN) → distance matrix → clustering (cluster indices) and multi-dimensional scaling (coordinates) → visualization (3D plot).

Multi-Dimensional Scaling: many iterations; memory and data intensive; 3 MapReduce jobs per iteration. Update rule: X_k = invV * B(X_(k-1)) * X_(k-1), computed as 2 matrix-vector multiplications termed BC and X. Each iteration: BC: calculate BX (Map, Reduce, Merge); X: calculate invV (BX) (Map, Reduce, Merge); calculate Stress (Map, Reduce, Merge); then a new iteration.
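The three jobs per iteration can be illustrated with a small sequential sketch. It assumes the update rule is the SMACOF iteration with uniform weights, in which case applying invV reduces to dividing by the number of points N; the slide does not state the weighting, so treat that as an assumption. The function names `bc_step`, `x_step` and `stress` are illustrative and do not come from Twister4Azure.

```python
import math


def distances(X):
    """Pairwise Euclidean distances of the current configuration X."""
    return [[math.dist(p, q) for q in X] for p in X]


def bc_step(delta, X):
    """BC job: compute the product B(X) X without materializing B(X)."""
    D = distances(X)
    n, dim = len(X), len(X[0])
    BX = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and D[i][j] > 0.0:
                w = delta[i][j] / D[i][j]
                for d in range(dim):
                    BX[i][d] += w * (X[i][d] - X[j][d])
    return BX


def x_step(BX):
    """X job: apply invV; with uniform weights this is a division by N."""
    n = len(BX)
    return [[v / n for v in row] for row in BX]


def stress(delta, X):
    """Stress job: squared error between target and current distances."""
    D = distances(X)
    n = len(X)
    return sum((D[i][j] - delta[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


# Target dissimilarities come from a 3-4-5 triangle; start from a poor guess.
delta = distances([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
history = [stress(delta, X)]
for _ in range(20):
    X = x_step(bc_step(delta, X))
    history.append(stress(delta, X))
```

In a MapReduce setting each of the three functions would be split by rows across map tasks, with X being the small loop-variant data and `delta` the large loop-invariant data.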

[Figures: performance with/without data caching (speedup gained using data cache); scaling speedup with increasing number of iterations; Azure instance type study; number of executing map tasks histogram; weak scaling; data size scaling; task execution time histogram.] Notes: the first iteration performs the initial data fetch; performance adjusted for sequential performance difference.

BLAST Sequence Search: scales better than Hadoop and EC2 Classic Cloud.

Current Research: collective communication primitives; exploring additional data communication and broadcasting mechanisms; fault tolerance; Twister4Cloud (Twister4Azure architecture implementations for other cloud infrastructures).

Contributions: Twister4Azure (a decentralized iterative MapReduce architecture for clouds; more natural iterative programming model extensions to the MapReduce model; leveraging eventually consistent cloud services for large-scale coordinated computations). Performance comparison of applications in clouds, VM environments and on bare metal. Exploration of the effect of data inhomogeneity on scientific MapReduce run times. Implementation of data mining and scientific applications for the Azure cloud as well as for Hadoop/DryadLINQ. GPU OpenCL implementation of iterative data analysis algorithms.

Acknowledgements: my PhD advisory committee; present and past members of the SALSA group, Indiana University; National Institutes of Health grant 5 RC2 HG; FutureGrid; Microsoft Research; Amazon AWS.

Selected Publications
1. Gunarathne, T., Wu, T.-L., Choi, J. Y., Bae, S.-H. and Qiu, J. Cloud computing paradigms for pleasingly parallel biomedical applications. Concurrency and Computation: Practice and Experience. doi: /cpe
2. Ekanayake, J.; Gunarathne, T.; Qiu, J. Cloud Technologies for Bioinformatics Applications. Parallel and Distributed Systems, IEEE Transactions on, vol. 22, no. 6, pp , June doi: /TPDS
3. Thilina Gunarathne, Bingjing Zhang, Tak-Lon Wu and Judy Qiu. Portable Parallel Programming on Cloud and HPC: Scientific Applications of Twister4Azure. In Proceedings of the Fourth IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2011), Melbourne, Australia. To appear.
4. Gunarathne, T., J. Qiu, and G. Fox. Iterative MapReduce for Azure Cloud. Cloud Computing and Its Applications, Argonne National Laboratory, Argonne, IL, 04/12-13/
5. Gunarathne, T.; Tak-Lon Wu; Qiu, J.; Fox, G. MapReduce in the Clouds for Science. Cloud Computing Technology and Science (CloudCom), 2010 IEEE Second International Conference on, pp , Nov Dec doi: /CloudCom
6. Thilina Gunarathne, Bimalee Salpitikorala, and Arun Chauhan. Optimizing OpenCL Kernels for Iterative Statistical Algorithms on GPUs. In Proceedings of the Second International Workshop on GPUs and Scientific Applications (GPUScA), Galveston Island, TX
7. Gunarathne, T., C. Herath, E. Chinthaka, and S. Marru. Experience with Adapting a WS-BPEL Runtime for eScience Workflows. The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'09), Portland, OR, ACM Press, pp. 7, 11/20/
8. Judy Qiu, Jaliya Ekanayake, Thilina Gunarathne, Jong Youl Choi, Seung-Hee Bae, Yang Ruan, Saliya Ekanayake, Stephen Wu, Scott Beason, Geoffrey Fox, Mina Rho, Haixu Tang. Data Intensive Computing for Bioinformatics. In Data Intensive Distributed Computing, Tevfik Kosar, Editor. 2011, IGI Publishers.

Questions? Thank You!

Background. Web services: Apache Axis2 committer, release manager, PMC member. Workflow: BPEL-Mora, WSO2 Mashup Server, LEAD (Linked environments. Cloud computing: Hadoop, Twister, EMR.

Broadcast Data. Loop-invariant data (static data): the traditional MR key-value pairs; comparatively large; cached between iterations. Loop-variant data (dynamic data): broadcast to all the map tasks at the beginning of each iteration; comparatively small. Map signature: Map(Key, Value, List of KeyValue-Pairs (broadcast data), …). Broadcast data can be specified even for non-iterative MR jobs.
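As an illustration of the extended map signature, here is a minimal sketch of a K-means style map function. K-means is an assumed example workload, and `kmeans_map` is a hypothetical name, not part of the Twister4Azure API: the point block is the loop-invariant input and the centroids are the loop-variant broadcast data.

```python
import math


def kmeans_map(key, value, broadcast):
    """Map(Key, Value, broadcast data): assign each point to a centroid.

    key, value -- a block id and its list of points (loop-invariant)
    broadcast  -- list of (centroid_id, centroid) pairs (loop-variant)
    """
    partial = {}  # centroid_id -> (coordinate sums, point count)
    for point in value:
        cid, _ = min(broadcast, key=lambda kv: math.dist(point, kv[1]))
        sums, count = partial.get(cid, ([0.0] * len(point), 0))
        partial[cid] = ([s + x for s, x in zip(sums, point)], count + 1)
    return list(partial.items())


centroids = [(0, (0.0, 0.0)), (1, (10.0, 10.0))]
block = [(1.0, 1.0), (9.0, 9.0), (11.0, 11.0)]
output = dict(kmeans_map("block-0", block, centroids))
```

Only `centroids` changes from one iteration to the next, which is why it is worth broadcasting while `block` stays in place.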

In-Memory Data Cache: caches the loop-invariant (static) data, i.e. data that is reused in subsequent iterations, across iterations. This avoids the cost of downloading, loading and parsing the data between iterations, giving significant speedups for data-intensive iterative MapReduce applications. Cached data can be reused by any MR application within the job.
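A minimal sketch of the caching idea follows. `DataCache` and `fetch_block` are hypothetical names used only for illustration; the real cache, its eviction behaviour and its disk fallback are not described on the slide and are not modelled here.

```python
class DataCache:
    """Keeps parsed loop-invariant data in memory between iterations."""

    def __init__(self, fetch):
        self._fetch = fetch      # hypothetical download-and-parse function
        self._entries = {}
        self.fetch_count = 0

    def get(self, key):
        if key not in self._entries:          # miss: pay the cost once
            self._entries[key] = self._fetch(key)
            self.fetch_count += 1
        return self._entries[key]


def fetch_block(key):
    """Stand-in for downloading a blob and parsing it into numbers."""
    raw = {"block-0": "1 2 3", "block-1": "4 5 6"}[key]
    return [float(x) for x in raw.split()]


cache = DataCache(fetch_block)
totals = []
for iteration in range(3):
    totals.append(sum(sum(cache.get(k)) for k in ("block-0", "block-1")))
```

Here three iterations touch two blocks each, yet only the first iteration triggers the download-and-parse step.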

Cache Aware Scheduling: map tasks need to be scheduled with cache awareness, i.e. a map task that processes data 'X' needs to be scheduled to the worker that has 'X' in its cache. Because the architecture is decentralized, nobody has a global view of the data products cached in the workers, so tasks cannot be assigned to workers centrally with cache awareness. Solution: workers pick tasks based on the data they have in their cache; new iterations are advertised on the Job Bulletin Board.
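The worker-driven selection can be sketched as follows. The task dictionaries, the shared `claimed` set and the `pick_leftover` fallback are assumptions made for this example; in particular, a real system would need an atomic way to claim a task, which a plain Python set does not provide across machines.

```python
def pick_task(cached_keys, bulletin_board, claimed):
    """Return an unclaimed task whose input is in this worker's cache."""
    for task in bulletin_board:
        if task["id"] not in claimed and task["data"] in cached_keys:
            claimed.add(task["id"])
            return task
    return None


def pick_leftover(bulletin_board, claimed):
    """Assumed fallback: take any task nobody has claimed, cached or not."""
    for task in bulletin_board:
        if task["id"] not in claimed:
            claimed.add(task["id"])
            return task
    return None


board = [{"id": i, "data": f"block-{i}"} for i in range(3)]
claimed = set()   # stands in for shared task-status metadata
worker_a = pick_task({"block-2"}, board, claimed)
worker_b = pick_task({"block-0"}, board, claimed)
nothing = pick_task({"block-0"}, board, claimed)
leftover = pick_leftover(board, claimed)
```

Each worker consults only its own cache contents and the advertised task list, so no component needs a global view of what is cached where.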

Merge Step: an extension to the MapReduce programming model to support iterative applications (Map -> Combine -> Shuffle -> Sort -> Reduce -> Merge). The Merge task receives all the Reduce outputs and the broadcast data for the current iteration. From the Merge task the user can add a new iteration or schedule a new MR job. It serves as the "loop test" in the decentralized architecture, for example based on the number of iterations or on a comparison of the results of the previous and current iterations. The output of Merge can be made the broadcast data of the next iteration.
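A toy sequential driver shows how a Merge step can act as the loop test. The workload (a damped iteration that converges to the mean of the data) and the names `map_fn`, `reduce_fn`, `merge_fn` and `run_job` are all assumptions chosen to keep the example small; they are not the Twister4Azure API.

```python
def map_fn(block, estimate):
    """Emit the partial residual sum and count for one data block."""
    return (sum(x - estimate for x in block), len(block))


def reduce_fn(partials):
    """Combine the partial sums from all map tasks."""
    return (sum(p[0] for p in partials), sum(p[1] for p in partials))


def merge_fn(reduced, estimate, iteration, max_iterations=100, tol=1e-6):
    """Loop test: return (next broadcast value, whether to iterate again)."""
    residual_sum, count = reduced
    new_estimate = estimate + 0.5 * residual_sum / count
    converged = abs(new_estimate - estimate) < tol
    return new_estimate, not converged and iteration + 1 < max_iterations


def run_job(blocks, estimate):
    """Run Map -> Reduce -> Merge until Merge declines a new iteration."""
    iteration, again = 0, True
    while again:
        reduced = reduce_fn([map_fn(b, estimate) for b in blocks])
        estimate, again = merge_fn(reduced, estimate, iteration)
        iteration += 1
    return estimate, iteration


mean, iterations = run_job([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], estimate=0.0)
```

`merge_fn` combines both loop tests named on the slide: an iteration cap and a comparison of the previous and current results. Its first return value becomes the broadcast data of the next iteration.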

Multiple Applications per Deployment: multiple MapReduce applications can be deployed in a single deployment; different MR applications can be invoked in a single job; many application invocations in a workflow are supported without redeployment.