Data-Intensive and High Performance Computing on Cloud Environments Gagan Agrawal


Outline Introduction to Cloud Computing Ongoing Projects in Cloud Computing ‣ Data-Intensive Computing Middleware Systems ‣ Resource Provisioning with Budget and Time Constraints ‣ Workflow Consolidation with Power Constraints ‣ An Elastic Cache on the Amazon Cloud Other Research Projects ‣ Heterogeneous High-Performance Computing ‣ Deep Web Integration and Mining ‣ Scientific Data Management

Utilities: Things We Can’t Live Without

Utility Costs Depend on Usage: Utility Providers supply Resources on Demand to Consumers

Utility Costs Depend on Usage: Consumers Pay Per Usage ($) to Utility Providers

Utilities of Today Haven’t Always Been Utilities A hand-pump; a horse cart: you purchase and maintain the source of power for your transportation

How Do We Currently Do Computing? Resources are co-located on site: computing resources, support personnel, and the computing consumer

Computing as a Utility Cloud “Utility” Providers: Amazon AWS, Azure, Cloudera, Google App Engine Consumers: Companies, labs, schools, et al.

Computing as a Utility Consumers send Algorithms & Data to the cloud and receive Processed Results Cloud “Utility” Providers: Amazon AWS, Azure, Cloudera, Google App Engine Consumers: Companies, labs, schools, et al.

Why Now? It has finally become cost-effective to offer computing as a service Large companies, e.g., Amazon, Microsoft, Google, Yahoo! ‣ Already have the computing personnel and infrastructure in place ‣ Decreasing costs of hardware ‣ Virtualization advancements

Example of Cost Effectiveness at the Provider

Why Now? This creates a win-win situation For the provider: ‣ They get paid to fully utilize otherwise idle hardware For the user: ‣ They save on costs ‣ Example: Amazon’s Cloud is $0.10 per machine-hour

Promises of Cloud Computing Cost Associativity ‣ Running 1 machine for 10 hours = running 10 machines for 1 hour Elasticity ‣ Cloud applications can stretch and contract their resource requirements “Infinite resources”
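The cost-associativity claim is easy to check numerically using the $0.10 per machine-hour figure quoted on the earlier "Why Now?" slide. This is a toy sketch that ignores per-hour rounding, data-transfer fees, and startup latency:

```python
RATE = 0.10  # dollars per machine-hour, the figure quoted earlier in the talk

def cost(machines, hours):
    # Under pure pay-per-use pricing, only total machine-hours matter.
    return machines * hours * RATE

# 1 machine for 10 hours costs the same as 10 machines for 1 hour.
print(cost(1, 10))   # 1.0
print(cost(10, 1))   # 1.0
```

The second configuration finishes 10x sooner for the same price, which is exactly what makes cost associativity attractive for parallel workloads.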

Research Challenges How do we exploit cost associativity and elasticity of the cloud for various applications? How do cloud providers provide adequate QoS to various applications and users? ‣ Maximize their revenue, lower their costs How do we develop effective services to support applications on cloud providers? How can we combine the use of cloud and traditional resources for various applications? ‣ (HPC) Cloud Bursting How do we effectively manage large-scale data on the cloud?

Outline Introduction to Cloud Computing Ongoing Projects in Cloud Computing ‣ Data-Intensive Computing Middleware Systems ‣ Resource Provisioning with Budget and Time Constraints ‣ Workflow Consolidation with Power Constraints ‣ An Elastic Cache on the Amazon Cloud Other Research Projects ‣ Heterogeneous High-Performance Computing ‣ Deep Web Integration and Mining ‣ Scientific Data Management

Motivation Growing need for analysis of large-scale data ‣ Scientific ‣ Commercial Data-Intensive Supercomputing (DISC) Map-Reduce has received a lot of attention ‣ Database and data mining communities ‣ High performance computing community Closely coupled with interest in cloud computing

Map-Reduce: Positives and Questions Positives: ‣ Simple API - Functional language based - Very easy to learn ‣ Support for fault-tolerance - Important for very large-scale clusters Questions: ‣ Performance? - Comparison with other approaches ‣ Suitability for different classes of applications?
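As a concrete reference point for the "simple API" claim, here is the canonical word-count example in plain Python: a sketch of the programming model only, not actual Hadoop code.

```python
from collections import defaultdict

def map_fn(document):
    # Emit (word, 1) for every word: the "map" side of word count.
    for word in document.split():
        yield (word, 1)

def reduce_fn(word, counts):
    # Sum the partial counts for one key: the "reduce" side.
    return (word, sum(counts))

def mapreduce(documents):
    # The shuffle step groups intermediate pairs by key; in Hadoop this
    # grouping involves sorting, one of the overheads discussed below.
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

print(mapreduce(["a b a", "b c"]))  # {'a': 2, 'b': 2, 'c': 1}
```

The programmer writes only `map_fn` and `reduce_fn`; the framework owns grouping, distribution, and fault tolerance.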

Class of Data-Intensive Applications Many different types of applications ‣ Data-center kind of applications - Data scans, sorting, indexing ‣ More “compute-intensive” data-intensive applications - Machine learning, data mining, NLP - Map-Reduce / Hadoop being widely used for this class ‣ Standard database operations - SIGMOD 2009 paper compares Hadoop with databases and OLAP systems What is Map-Reduce suitable for? What are the alternatives? ‣ MPI/OpenMP/Pthreads - too low level?

Our Work Proposes MATE (a Map-Reduce system with an AlternaTE API) based on Generalized Reduction ‣ Phoenix implemented Map-Reduce in shared-memory systems ‣ MATE adopted Generalized Reduction, first proposed in FREERIDE, which was developed at Ohio State ‣ Observed API similarities and subtle differences between Map-Reduce and Generalized Reduction Comparison for ‣ Data mining applications ‣ Compare performance and API ‣ Understand performance overheads Will an alternative API be better for “Map-Reduce”?

Comparing Processing Structures (October 24, 2015) The Reduction Object represents the intermediate state of the execution The reduce function is commutative and associative Sorting and grouping overheads are eliminated with the reduction function/object

Observations on Processing Structure Map-Reduce is based on a functional idea ‣ Does not maintain state This can lead to overheads of managing intermediate results between map and reduce ‣ Map could generate intermediate results of very large size The MATE API is based on a programmer-managed reduction object ‣ Not as ‘clean’ ‣ But avoids sorting of intermediate results ‣ Can also help shared-memory parallelization ‣ Helps better fault recovery
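By contrast, a generalized-reduction sketch in the FREERIDE/MATE style accumulates directly into a programmer-managed reduction object. The function names here are illustrative, not the actual MATE interface:

```python
def local_reduction(reduction_object, data_chunk):
    # Each element updates the reduction object in place; no intermediate
    # (key, value) pairs are materialized, so no sorting/grouping is needed.
    for word in data_chunk.split():
        reduction_object[word] = reduction_object.get(word, 0) + 1
    return reduction_object

def global_combine(objects):
    # Because the reduction is commutative and associative, per-thread
    # objects can be merged in any order.
    merged = {}
    for obj in objects:
        for key, value in obj.items():
            merged[key] = merged.get(key, 0) + value
    return merged

o1 = local_reduction({}, "a b a")   # e.g., thread 1's chunk
o2 = local_reduction({}, "b c")     # e.g., thread 2's chunk
print(global_combine([o1, o2]))     # {'a': 2, 'b': 2, 'c': 1}
```

The reduction object is the entire intermediate state, which is what makes the approach friendly to shared-memory parallelization and checkpoint-based fault recovery.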

Results: Data Mining (I) K-Means: 400MB dataset, 3-dim points, k = 100, on one WCI node with 8 cores [Chart: Avg. Time Per Iteration (sec) vs. # of threads]

Outline Introduction to Cloud Computing Ongoing Projects in Cloud Computing ‣ Data-Intensive Computing Middleware Systems ‣ Resource Provisioning with Budget and Time Constraints ‣ Workflow Consolidation with Power Constraints ‣ An Elastic Cache on the Amazon Cloud Other Research Projects ‣ Heterogeneous High-Performance Computing ‣ Deep Web Integration and Mining ‣ Scientific Data Management

Resource Provisioning Motivation: Adaptive Applications Earthquake modeling, coastline forecasting, medical systems Time-Critical Event Processing - Compute-intensive - Time constraints - Application-specific flexibility - Application Quality of Service (QoS)

Adaptive Applications (Cont’d) Adaptive applications perform time-critical event processing Application-specific flexibility: parameter adaptation Trade-off between application QoS and execution time HPC applications (compute-intensive) aim at maximizing performance and do not consider adaptation Deadline-driven scheduling targets workloads that are not very compute-intensive

Challenges -- Resource Budget Constraints Elastic cloud computing - Pay-as-you-go model Satisfy the application QoS with the minimum resource cost Dynamic resource provisioning - Dynamically varying application workloads - Resource budget

Background: Pricing Model Charged fees ‣ Base price: charged for the smallest amount of CPU cycles ‣ Transfer fee: charged for each CPU allocation change Two models: a Linear Pricing Model and an Exponential Pricing Model Quantities involved: the CPU cycles at the i-th allocation, the time duration at the i-th allocation, and the number of CPU cycle allocations
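The slide's formulas are not reproduced in the transcript; the following sketch shows one plausible reading of the linear pricing model, with usage charged at a base price and a flat transfer fee per allocation change. All names and numeric values are illustrative:

```python
def linear_cost(allocations, unit_price, transfer_fee):
    """Cost of a sequence of CPU allocations under a linear pricing model.

    allocations:  list of (cpu_cycles, duration) pairs, one per allocation
                  interval; each new pair implies one allocation change.
    unit_price:   price per CPU cycle per unit time (the "base price").
    transfer_fee: flat fee charged for each CPU allocation change.
    """
    usage = sum(cycles * duration for cycles, duration in allocations)
    changes = max(len(allocations) - 1, 0)  # the first allocation is not a change
    return unit_price * usage + transfer_fee * changes

# Three allocation intervals, hence two allocation changes (two transfer fees).
print(linear_cost([(4, 10), (8, 5), (2, 20)], unit_price=0.01, transfer_fee=0.5))
```

The transfer-fee term is what makes frequent resource changes expensive, which motivates the optimization in the resource model described later.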

Problem Description Adaptive applications ‣ Adaptive parameters ‣ Benefit ‣ Time constraint Cloud computing environment ‣ Resource budget ‣ Overprovisioning/underprovisioning Goal ‣ Maximize the application benefit while satisfying the time constraints and the resource budget

Approach Overview Dynamic resource provisioning combines feedback control with a resource model (with optimization) Resource Provisioning Controller ‣ Multi-input-multi-output (MIMO) feedback control model ‣ Models the relationship between adaptive parameters and performance metrics ‣ Control policy: reinforcement learning Resource Model ‣ Maps changes to the adaptive parameters to changes in CPU/memory allocations ‣ Optimization: avoid frequent resource changes

Resource Provisioning Controller ‣ Performance Metrics: satisfy time constraints and resource budget ‣ Multi-Input-Multi-Output Model: captures the relationship between adaptive parameters and performance metrics ‣ Control Policy: decides how to change the values of the adaptive parameters

Control Model Formulation -- Performance Metrics Performance metrics ‣ Processing progress: ratio between the currently obtained application benefit and the elapsed execution time ‣ Performance/cost ratio: ratio between the currently obtained application benefit and the cost of the resources that have been assigned Notation: the application benefit obtained at time step k, the elapsed execution time at time step k, and the resource cost at time step k
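Both metrics are simple ratios over quantities measured at time step k; as a sketch (the variable names are mine, not the talk's):

```python
def processing_progress(benefit_k, elapsed_time_k):
    """Benefit obtained so far divided by elapsed execution time."""
    return benefit_k / elapsed_time_k

def perf_cost_ratio(benefit_k, resource_cost_k):
    """Benefit obtained so far divided by the cost of assigned resources."""
    return benefit_k / resource_cost_k

# Illustrative values: 120 units of benefit after 60 seconds and $3 of resources.
print(processing_progress(benefit_k=120.0, elapsed_time_k=60.0))  # 2.0
print(perf_cost_ratio(benefit_k=120.0, resource_cost_k=3.0))      # 40.0
```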

Control Model Formulation -- Multi-Input-Multi-Output Model Auto-Regressive-Moving-Average with Exogenous Inputs (ARMAX) ‣ Second-order model ‣ Each exogenous input is the i-th adaptive parameter at time step k ‣ Model coefficients are updated at the end of every interval Inputs: previously observed performance metrics, plus previous and current values of the adaptive parameters
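A minimal scalar sketch of a second-order ARMAX-style one-step prediction: the next performance metric is a weighted combination of the two previous observations and the two most recent parameter values. The coefficient values below are illustrative, not from the talk, and the real model is multi-input-multi-output rather than scalar:

```python
def armax_predict(y, u, a, b):
    """Second-order ARMAX-style one-step prediction (scalar sketch).

    y: past performance-metric observations [..., y(k-1), y(k)]
    u: past/current adaptive-parameter values [..., u(k-1), u(k)]
    a: (a1, a2) autoregressive coefficients
    b: (b0, b1) exogenous-input coefficients
    Returns the predicted metric for time step k+1.
    """
    a1, a2 = a
    b0, b1 = b
    return a1 * y[-1] + a2 * y[-2] + b0 * u[-1] + b1 * u[-2]

print(armax_predict([1.0, 1.2], [0.5, 0.6], a=(0.8, -0.1), b=(0.3, 0.05)))
```

In the framework, the coefficients would be re-estimated at the end of every interval from observed data, and the control policy would then search over parameter values u to steer the predicted metric.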

Framework Design [Architecture diagram] Application, served through service deployment and a service wrapper; Resource Provisioning Controller with an Application Controller, Resource Model, Optimizer, and Performance Manager (priority assignment, status query, performance analysis); Virtualization Management (Eucalyptus, OpenNebula, ...) over Xen hypervisors hosting VMs

Outline Introduction to Cloud Computing Ongoing Projects in Cloud Computing ‣ Data-Intensive Computing Middleware Systems ‣ Resource Provisioning with Budget and Time Constraints ‣ Workflow Consolidation with Power Constraints ‣ An Elastic Cache on the Amazon Cloud Other Research Projects ‣ Heterogeneous High-Performance Computing ‣ Deep Web Integration and Mining ‣ Scientific Data Management

Workflow Consolidation: Motivation Another critical issue in cloud environments: power management ‣ HPC servers consume a lot of energy ‣ Significant adverse impact on the environment To reduce resource and energy costs ‣ Server consolidation ‣ Minimize the total power consumption and resource costs without a substantial degradation in performance

Problem Description Our target applications ‣ Workflows with DAG structure ‣ Multiple processing stages ‣ Opportunities for consolidation Research problems ‣ Combine parameter adaptation, budget constraints, and resource allocation with consolidation and power optimization ‣ Challenge: consolidation without parameter adaptation ‣ Support power-aware parameter adaptation (future work)

Contributions A power-aware consolidation framework, pSciMapper, based on hierarchical clustering and an optimization search method pSciMapper is able to reduce total power consumption by up to 56% with at most a 15% slowdown for the workflow pSciMapper incurs low overhead and is thus suitable for large-scale scientific workflows

The pSciMapper Framework Design Offline analysis: resource usage generation (time series), temporal feature extraction, and feature reduction and modeling produce temporal signatures and a knowledgebase of models Online consolidation: hierarchical clustering and an optimization search algorithm drive time-varying resource provisioning of the consolidated workloads

Outline Introduction to Cloud Computing Ongoing Projects in Cloud Computing ‣ Data-Intensive Computing Middleware Systems ‣ Resource Provisioning with Budget and Time Constraints ‣ Workflow Consolidation with Power Constraints ‣ An Elastic Cache on the Amazon Cloud Other Research Projects ‣ Heterogeneous High-Performance Computing ‣ Deep Web Integration and Mining ‣ Scientific Data Management

Motivation: Data-Intensive Services on Clouds The cloud can provide flexible storage Data-intensive services can be executed on clouds Caching is an age-old idea to accelerate services ‣ On clouds, can we exploit elasticity? A cost-sensitive elastic cache for clouds!

Problem: Query-Intensive Circumstances...

Scaling up to Handle Load The HaitiMap service invokes haitimap(29). Which proxy has the page? h(k) = k mod num_proxies, so h(29) = 29 mod 3 = 2: a HIT, and the proxy replies with data(29). (Derived Data Cache on cloud nodes.)

Scaling up to Handle Load The service infrastructure invokes haitimap(11); h(11) = 11 mod 3 = 2: a MISS, so the proxy caches data(11) and then replies with data(11).

Scaling up to Handle Load After scaling to four proxies, the same request haitimap(29) now hashes to h(29) = 29 mod 4 = 1: a MISS, even though the data was previously cached under the old hash.
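The miss in the last slide follows directly from the modulo hashing shown on the slides; a minimal sketch makes the remapping explicit:

```python
def proxy_for(key, num_proxies):
    """Modulo hashing from the slides: h(k) = k mod num_proxies."""
    return key % num_proxies

# With 3 cache proxies, key 29 lives on proxy 2 (the HIT case).
print(proxy_for(29, 3))  # 2

# After elastically adding a 4th proxy, the same key hashes to proxy 1,
# so the first lookup there is a MISS and the object must be re-cached.
print(proxy_for(29, 4))  # 1
```

This remapping of keys on every resize is precisely the cost that an elastic, cost-sensitive cache design has to manage.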

Outline Introduction to Cloud Computing Ongoing Projects in Cloud Computing ‣ Resource Provisioning with Budget and Time Constraints ‣ Workflow Consolidation with Power Constraints ‣ An Elastic Cache on the Amazon Cloud ‣ Data-Intensive Computing Middleware Systems Other Research Projects ‣ Heterogeneous High-Performance Computing ‣ Deep Web Integration and Mining ‣ Scientific Data Management

Heterogeneous High Performance Computing Heterogeneous architectures are commonplace ‣ E.g., today’s desktops and notebooks ‣ Multi-core CPU + graphics card on PCI-E A recent HPC system ‣ E.g., Tianhe-1 [5th fastest supercomputer, Nov 2009] ‣ Uses multi-core CPUs and a GPU (ATI Radeon HD 4870) on each node Multi-core CPU and GPU usage is still divided ‣ Resources may be under-utilized Can the multi-core CPU and GPU be used simultaneously for computation?

Overall System Design User input: simple C code with annotations from the application developer Compilation phase: a code generator produces multi-core middleware API code and GPU code for CUDA Run-time system key components: worker thread creation and management, mapping computation to CPU and GPU, and dynamic work distribution

Performance of K-Means (Heterogeneous - NUCS) [Performance chart; the slide annotates a 60% figure]

Outline Introduction to Cloud Computing Ongoing Projects in Cloud Computing ‣ Resource Provisioning with Budget and Time Constraints ‣ Workflow Consolidation with Power Constraints ‣ An Elastic Cache on the Amazon Cloud ‣ Data-Intensive Computing Middleware Systems Other Research Projects ‣ Heterogeneous High-Performance Computing ‣ Deep Web Integration and Mining ‣ Scientific Data Management

The Deep Web The definition of “the deep web” from Wikipedia: the deep Web refers to World Wide Web content that is not part of the surface web, which is indexed by standard search engines. Some examples: Expedia, Priceline

The Deep Web is Huge and Informative 500 times larger than the surface web 7,500 terabytes of information (19 terabytes in the surface web) 550 billion documents (1 billion in the surface web) More than 200,000 deep web sites Relevant to every domain: scientific, e-commerce, market 95 percent of the deep web is publicly accessible (though with access limitations)

How to Access Deep Web Data 1. A user issues a query through the input interfaces of deep web data sources 2. The query is translated into an SQL-style query 3. This triggers a search on the backend database 4. Answers are returned through the network Select price From Expedia Where depart=CMH and arrive=SEA and dedate="7/13/10" and redate="7/16/10"
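Step 2 above can be sketched as a small form-to-query translator. The field names (depart, arrive, dedate, redate) follow the Expedia example on the slide; the function itself and its quoting of values are illustrative, not the actual system:

```python
def form_to_sql(source, select, fields):
    """Translate deep-web form inputs into an SQL-style query string.

    source: the deep web data source (e.g., "Expedia")
    select: the attribute the user wants back (e.g., "price")
    fields: dict of form-field name -> user-entered value
    """
    where = " and ".join(f'{k}="{v}"' for k, v in fields.items())
    return f"Select {select} From {source} Where {where}"

q = form_to_sql("Expedia", "price",
                {"depart": "CMH", "arrive": "SEA",
                 "dedate": "7/13/10", "redate": "7/16/10"})
print(q)
```

A real deep web integrator cannot run this SQL directly; it must map the query back onto the source's HTML form and parse the result pages, which is what makes hidden-schema discovery (next slide) necessary.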

System Overview Hidden schema discovery Data source integration Structured SQL query Sampling the deep web Online aggregation Low-selectivity query

Summary Research in Cloud, High Performance Computing, and Data-Intensive Computing (including data mining and web mining) Currently working with 10 PhD students and 5 MS students 10 PhDs completed in the last 6 years To get involved ‣ Join 888 in Winter