A Map-Reduce System with an Alternate API for Multi-Core Environments

Presentation transcript:

A Map-Reduce System with an Alternate API for Multi-Core Environments
Wei Jiang, Vignesh T. Ravi and Gagan Agrawal
Presented by Wei Jiang

Outline
- Background
- MapReduce
- Generalized Reduction
- System Design and Implementation
- Experiments
- Related Work
- Conclusions

Background
- We have evaluated FREERIDE and Hadoop MapReduce on a set of applications
- Phoenix is one implementation of MapReduce for shared-memory systems; it is written in C and has a small code base
- We also want to make FREERIDE smaller and more lightweight

Google’s MapReduce Engine
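Since this slide only shows the engine diagram, a minimal word-count sketch of the two user-supplied functions may help recall the programming model the engine executes. Here emit_intermediate() and emit() are hypothetical hooks standing in for whatever the runtime provides; they are not MATE, Phoenix, or Hadoop calls.

#include <stddef.h>
#include <string.h>

/* Hypothetical runtime hooks, stand-ins for whatever the engine provides. */
extern void emit_intermediate(const char *key, int value);
extern void emit(const char *key, long value);

/* map: emit one (word, 1) pair per word in a writable input chunk. */
void map(char *chunk)
{
    for (char *word = strtok(chunk, " \t\n"); word != NULL;
         word = strtok(NULL, " \t\n"))
        emit_intermediate(word, 1);
}

/* reduce: sum the partial counts gathered for one word. */
void reduce(const char *word, const int *counts, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += counts[i];
    emit(word, total);
}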

Phoenix Implementation
- Based on the same principles, but targets shared-memory systems
- Consists of a simple API that is visible to application programmers
- An efficient runtime handles parallelization, resource management, and fault recovery

Phoenix Runtime

Generalized Reduction
Processing structures
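The processing structure contrasted on this slide can be summarized in a rough C sketch of the per-thread loop. The reduction_object type, choose_bucket(), and reduce_into() are illustrative names only, not the actual FREERIDE/MATE symbols.

#include <stddef.h>

typedef struct { double *cells; size_t num_cells; } reduction_object;

/* Hypothetical application-specific helpers. */
extern size_t choose_bucket(double element);
extern void   reduce_into(reduction_object *robj, size_t idx, double value);

/* Each thread folds its split of the input directly into a reduction
 * object; no intermediate (key, value) pairs are produced. */
void local_reduction(const double *split, size_t n, reduction_object *robj)
{
    for (size_t e = 0; e < n; e++) {
        size_t idx = choose_bucket(split[e]);   /* which cell this element updates */
        reduce_into(robj, idx, split[e]);       /* associative/commutative update, e.g. accumulate */
    }
}
/* A separate global combination step then merges the per-thread copies of
 * the reduction object, and a finalize step post-processes the result. */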

A Case Study: Apriori

A Case Study: Apriori (continued)
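Viewed as a generalized reduction, Apriori's candidate-counting pass is essentially the sketch below, where the reduction object holds one support counter per candidate itemset. The contains_itemset() helper and the data layout are assumptions made only for illustration.

#include <stddef.h>

/* Assumed layout: each transaction is an item-id array, each candidate
 * itemset is an item-id array of length k; counts[] is the reduction object. */
extern int contains_itemset(const int *transaction, size_t t_len,
                            const int *candidate, size_t k);

/* Local reduction for candidate counting: every transaction in this thread's
 * split increments the counters of the candidates it contains. */
void apriori_local_reduction(const int *const *transactions, const size_t *t_lens,
                             size_t num_trans,
                             const int *const *candidates, size_t k,
                             size_t num_cands, long *counts)
{
    for (size_t t = 0; t < num_trans; t++)
        for (size_t c = 0; c < num_cands; c++)
            if (contains_itemset(transactions[t], t_lens[t], candidates[c], k))
                counts[c]++;   /* accumulate into the reduction object */
}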

System Design and Implementation
- Basic dataflow of MATE (MapReduce with AlternaTE API)
- Data structures for communication between the user code and the runtime
- Three sets of functions in MATE
- Example: how to write a user application

MATE runtime dataflow
Basic one-stage dataflow
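As a rough sketch of that one-stage dataflow, with illustrative names rather than the internal MATE symbols: the scheduler splits the input, reduction workers fold their splits into the reduction object, and combination plus finalize produce the result.

#include <stddef.h>

/* Illustrative stand-ins for the runtime's internal pieces. */
extern size_t split_input(void *data, size_t size, void ***splits, size_t **split_sizes);
extern void   run_reduction_workers(void **splits, size_t *split_sizes, size_t n); /* threads call the user's reduction() */
extern void   combine_reduction_objects(void);   /* merge per-thread reduction objects */
extern void   run_finalize(void);                /* user's finalize() on the combined object */

/* One-stage dataflow: split -> parallel local reduction -> combination -> finalize. */
void mate_one_stage(void *data, size_t size)
{
    void **splits;
    size_t *split_sizes;
    size_t n = split_input(data, size, &splits, &split_sizes);
    run_reduction_workers(splits, split_sizes, n);
    combine_reduction_objects();
    run_finalize();
}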

Data structures (1): scheduler_args_t, basic fields
- Input_data: input data pointer
- Data_size: input dataset size
- Data_type: input data type
- Stage_num: computation-stage number
- Splitter: pointer to the Splitter function
- Reduction: pointer to the Reduction function
- Finalize: pointer to the Finalize function

Data structures (2): scheduler_args_t, optional fields for performance tuning
- Unit_size: number of bytes per element
- L1_cache_size: number of bytes of the L1 data cache
- Model: shared-memory parallelization model
- Num_reduction_workers: maximum number of reduction worker threads
- Num_procs: maximum number of processor cores used
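Putting the two tables together, the argument structure handed to the runtime presumably looks something like the sketch below. The field types, identifier casing, and function-pointer typedefs are inferred from the tables and from the user-defined function signatures listed later, not taken from the actual MATE header.

#include <stddef.h>

/* Function-pointer types roughly matching the user-defined functions listed
 * later in the talk (reduction_args_t is assumed to be runtime-defined). */
typedef struct reduction_args reduction_args_t;
typedef int  (*splitter_t)(void *data, int req, reduction_args_t *args);
typedef void (*reduction_t)(reduction_args_t *args);
typedef void (*finalize_t)(void *args);

typedef struct {
    /* Basic fields */
    void       *input_data;             /* input data pointer */
    size_t      data_size;              /* input dataset size */
    int         data_type;              /* input data type */
    int         stage_num;              /* computation-stage number */
    splitter_t  splitter;               /* splitter function */
    reduction_t reduction;              /* reduction function */
    finalize_t  finalize;               /* finalize function */

    /* Optional fields for performance tuning */
    size_t      unit_size;              /* bytes per element */
    size_t      L1_cache_size;          /* bytes of L1 data cache */
    int         model;                  /* shared-memory parallelization model */
    int         num_reduction_workers;  /* max reduction worker threads */
    int         num_procs;              /* max processor cores used */
} scheduler_args_t;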

Functions (1): transparent to users
- static inline void * schedule_tasks(thread_wrapper_arg_t *)
- static void * combination_worker(void *)
- static int array_splitter(void *, int, reduction_args_t *)
- void clone_reduction_object(int num)
- static inline int isCpuAvailable(unsigned long, int)

Functions (2): APIs provided by the runtime
- int mate_init(scheduler_args_t *args)
- int mate_scheduler(void *args)
- int mate_finalize(void *args)
- void reduction_object_pre_init()
- int reduction_object_alloc(int size)  (returns the object id)
- void reduction_object_post_init()
- void accumulate/maximal/minimal(int id, int offset, void *value)
- void reuse_reduction_object()
- void * get_intermediate_result(int iter, int id, int offset)

Functions (3): APIs defined by the user
- int (*splitter_t)(void *, int, reduction_args_t *)  (optional)
- void (*reduction_t)(reduction_args_t *)  (required)
- void (*combination_t)(void *)
- void (*finalize_t)(void *)
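As an example of the required reduction_t, a k-means reduction function might look roughly like the sketch below. The fields of reduction_args_t, the nearest_mean() helper, and the reduction-object layout are assumptions for illustration; only the accumulate() signature is taken from the API table above.

#include <stddef.h>

/* Assumed contents of the argument block handed to the user's reduction
 * function; the real reduction_args_t is defined by the runtime. */
typedef struct {
    double *points;      /* this split's points, 3 doubles per point */
    size_t  num_points;
    double *means;       /* current k cluster centers, 3 doubles each */
    int     k;
} reduction_args_t;

/* Runtime-provided update on the reduction object (signature from the API table). */
extern void accumulate(int id, int offset, void *value);

/* Assumed helper: index of the center closest to the 3-D point p. */
static int nearest_mean(const double *p, const double *means, int k)
{
    int best = 0;
    double best_d = 1e300;
    for (int c = 0; c < k; c++) {
        double d = 0.0;
        for (int j = 0; j < 3; j++) {
            double diff = p[j] - means[3 * c + j];
            d += diff * diff;
        }
        if (d < best_d) { best_d = d; best = c; }
    }
    return best;
}

/* Sketch of a k-means reduction_t: each point adds its coordinates and a
 * count of 1 to its nearest cluster.  Layout assumption: reduction object 0
 * stores, per cluster c, [sum_x, sum_y, sum_z, count] at offset 4*c. */
void kmeans_reduction(reduction_args_t *args)
{
    for (size_t i = 0; i < args->num_points; i++) {
        double *p = &args->points[3 * i];
        int c = nearest_mean(p, args->means, args->k);
        double one = 1.0;
        accumulate(0, 4 * c + 0, &p[0]);
        accumulate(0, 4 * c + 1, &p[1]);
        accumulate(0, 4 * c + 2, &p[2]);
        accumulate(0, 4 * c + 3, &one);
    }
}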

Implementation Considerations
- Data partitioning: splits are dynamically assigned to worker threads (see the sketch below)
- Buffer management: two temporary buffers, one for reduction objects and one for combination results
- Fault tolerance: failed tasks are re-executed; checkpointing may be a better solution
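The data-partitioning point above (dynamic assignment of splits to worker threads) can be realized with a shared split counter; the sketch below shows one such scheme under assumed names and is not the actual MATE scheduler code.

#include <stdatomic.h>
#include <stddef.h>

/* Shared split counter: each worker grabs the next unprocessed split. */
static atomic_size_t next_split = 0;

/* Assumed descriptor for one split and the hook that runs the user's reduction. */
typedef struct { void *data; size_t size; } split_t;
extern void run_reduction_on_split(const split_t *split);

/* Worker loop: dynamic assignment balances load even when splits take very
 * different amounts of time (e.g., skewed data). */
void reduction_worker(const split_t *splits, size_t num_splits)
{
    for (;;) {
        size_t i = atomic_fetch_add(&next_split, 1);
        if (i >= num_splits)
            break;
        run_reduction_on_split(&splits[i]);
    }
}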

What is in the user code?
- Implements the necessary functions, such as reduction, splitter, and finalize
- Generates the input dataset
- Sets up the fields in scheduler_args_t
- Initializes the middleware and declares the reduction object(s)
- Executes reduction tasks by calling mate_scheduler (one or more passes)
- Optionally does some finalizing work

K-means user code

int main(int argc, char **argv)
{
    scheduler_args_t args;

    parse_args(argc, argv);
    generate_points();
    generate_means();

    /* Fill in args (input data, splitter, reduction, finalize, ...) and
       initialize the middleware. */
    mate_init(&args);

    /* Declare the reduction object(s). */
    reduction_object_pre_init();
    while (needed)
        reduction_object_alloc(size);
    reduction_object_post_init();

    /* One pass of reduction tasks per k-means iteration. */
    while (not finished) {
        mate_scheduler();
        update_means();
        reuse_reduction_object();
        process_next_iteration();
    }

    mate_finalize();
    return 0;
}

Experiments: K-means
K-means: 400 MB of 3-dimensional points, k = 100, on one WCI node with 8 cores

Experiments: K-means
K-means: 400 MB of 3-dimensional points, k = 100, on one AMD node with 16 cores

Experiments: PCA
PCA: an 8000 x 1024 matrix, on one WCI node with 8 cores

Experiments: PCA
PCA: an 8000 x 1024 matrix, on one AMD node with 16 cores

Experiments: Apriori
Apriori: 1,000,000 transactions, 3% support, on one WCI node with 8 cores

Experiments: Apriori
Apriori: 1,000,000 transactions, 3% support, on one AMD node with 16 cores

Related Work
- Work that improves MapReduce's API or implementations
- Work that evaluates MapReduce across different platforms and application domains
- Academia: CGL-MapReduce, Mars, MITHRA, Phoenix, Disco, ...
- Industry: Facebook (Hive), Yahoo! (Pig Latin, Map-Reduce-Merge), Google (Sawzall), Microsoft (Dryad)

Conclusions
- MapReduce is simple and robust in expressing parallelism
- Its two-stage computation style may cause performance losses for some subclasses of data-intensive applications
- MATE provides an alternate API that is based on generalized reduction
- This variation can reduce the overheads of data management and of communication between Map and Reduce

Questions?