A Map-Reduce System with an Alternate API for Multi-Core Environments
Wei Jiang, Vignesh T. Ravi and Gagan Agrawal
Outline
- Introduction
- MapReduce
- Generalized Reduction
- System Design and Implementation
- Experiments
- Related Work
- Conclusion
Motivation
- Growing need for analysis of large-scale data
  - Scientific
  - Commercial
  - Data-Intensive Supercomputing (DISC)
- Map-Reduce has received a lot of attention
  - Database and data mining communities
  - High-performance computing community, e.g., this conference!
Map-Reduce: Positives and Questions
- Positives:
  - Simple API, based on functional languages; very easy to learn
  - Support for fault tolerance, important for very large-scale clusters
- Questions:
  - Performance? Comparison with other approaches
  - Suitability for different classes of applications?
Class of Data-Intensive Applications
- Many different types of applications
  - Data-center applications: data scans, sorting, indexing
  - More "compute-intensive" data-intensive applications: machine learning, data mining, NLP
    - Map-Reduce / Hadoop is widely used for this class
  - Standard database operations
    - A SIGMOD 2009 paper compares Hadoop with databases and OLAP systems
- What is Map-Reduce suitable for? What are the alternatives?
  - MPI/OpenMP/Pthreads: too low-level?
This Paper
- Proposes MATE (a Map-Reduce system with an AlternaTE API) based on Generalized Reduction
  - Phoenix implemented Map-Reduce for shared-memory systems
  - MATE adopts Generalized Reduction, first proposed in FREERIDE, developed at Ohio State (2001-2003)
  - Observed API similarities and subtle differences between Map-Reduce and Generalized Reduction
- Comparison for data mining applications
  - Compare performance and API
  - Understand performance overheads
- Would an alternative API be better for "Map-Reduce"?
Map-Reduce Execution
(figure: Map-Reduce execution flow)
Phoenix Implementation
- Based on the same principles as Map-Reduce, but targets shared-memory systems
- Consists of a simple API visible to application programmers
  - Users define functions such as splitter, map, and reduce
- An efficient runtime handles low-level details
  - Parallelization
  - Resource management
  - Fault detection and recovery
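To make the split/map/reduce flow concrete, here is a minimal, self-contained sketch of that structure in C: the user supplies splitter, map, and reduce steps, and a tiny driver plays the role of the runtime. The names and signatures are illustrative only, not Phoenix's actual API.

```c
/* Simplified sketch of the splitter/map/reduce flow; names and
 * signatures are illustrative, not Phoenix's real API. */
#include <stddef.h>

#define MAX_CHUNKS 8

typedef struct { const int *data; size_t len; } split_t;

/* splitter: divide the input array into fixed-size chunks */
static int toy_splitter(const int *data, size_t n, size_t chunk,
                        split_t out[], size_t max_out) {
    size_t count = 0;
    for (size_t i = 0; i < n && count < max_out; i += chunk) {
        out[count].data = data + i;
        out[count].len  = (n - i < chunk) ? (n - i) : chunk;
        count++;
    }
    return (int)count;
}

/* map: emit one partial sum per chunk (an "intermediate result") */
static long toy_map(split_t s) {
    long partial = 0;
    for (size_t i = 0; i < s.len; i++) partial += s.data[i];
    return partial;
}

/* reduce: combine the intermediate partial sums */
static long toy_reduce(const long *partials, int n) {
    long total = 0;
    for (int i = 0; i < n; i++) total += partials[i];
    return total;
}

/* driver mimicking the runtime: split, map each chunk, reduce */
long toy_mapreduce_sum(const int *data, size_t n) {
    split_t chunks[MAX_CHUNKS];
    long partials[MAX_CHUNKS];
    int nchunks = toy_splitter(data, n, 3, chunks, MAX_CHUNKS);
    for (int i = 0; i < nchunks; i++) partials[i] = toy_map(chunks[i]);
    return toy_reduce(partials, nchunks);
}
```

In the real runtime the chunks would be processed by concurrent worker threads; the sequential driver above only shows the role of each user-supplied function.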
Phoenix Runtime
(figure: Phoenix runtime architecture)
Comparing Processing Structures
- The reduction object represents the intermediate state of the execution
- The reduce function is commutative and associative
- Sorting and grouping overheads are eliminated by the reduction function/object
Observations on Processing Structure
- Map-Reduce is based on a functional idea: do not maintain state
  - This can lead to overheads in managing intermediate results between map and reduce
  - Map can generate intermediate results of very large size
- The MATE API is based on a programmer-managed reduction object
  - Not as 'clean'
  - But avoids sorting of intermediate results
  - Can also help shared-memory parallelization
  - Helps better fault recovery
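The contrast above can be sketched in a few lines of C: instead of map emitting (key, value) pairs that must be stored, sorted, and grouped, each input element folds directly into a programmer-managed reduction object, and per-thread copies are merged at the end. The structure below (a small histogram) is an illustration of the idea, not code from the paper.

```c
/* Reduction-object idea: elements update the object in place; the
 * update is commutative and associative, so any thread order works. */
#include <string.h>

typedef struct { long bins[4]; } reduction_object_t;

/* local reduction: fold one element directly into the object */
static void local_reduce(reduction_object_t *obj, int value) {
    obj->bins[value % 4] += 1;   /* no intermediate pairs allocated */
}

/* global combination: merge per-thread copies of the object */
static void combine(reduction_object_t *dst, const reduction_object_t *src) {
    for (int i = 0; i < 4; i++) dst->bins[i] += src->bins[i];
}

long histogram_bin(const int *data, size_t n, int bin) {
    /* simulate two workers, each with a private copy of the object */
    reduction_object_t a, b;
    memset(&a, 0, sizeof a);
    memset(&b, 0, sizeof b);
    size_t half = n / 2;
    for (size_t i = 0; i < half; i++) local_reduce(&a, data[i]);
    for (size_t i = half; i < n; i++) local_reduce(&b, data[i]);
    combine(&a, &b);             /* merge order would not matter */
    return a.bins[bin];
}
```

Because the per-element update is commutative and associative, no sorting or grouping of intermediate results is needed, which is exactly the overhead the slides say the reduction object eliminates.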
An Example: Apriori Pseudo-Code Using MATE
(figure: Apriori pseudo-code)
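The slide's pseudo-code figure is not reproduced here. As a rough stand-in, below is a minimal sketch of the core Apriori counting step in the generalized-reduction style: the reduction object holds candidate support counters, and each transaction increments the matching counters directly. The candidate representation and all names are assumptions for illustration, not the paper's actual code.

```c
/* Toy Apriori support counting via a reduction object; candidate
 * layout and names are assumed for illustration. */
#define NUM_CANDIDATES 2

/* candidate 2-itemsets, as pairs of item ids (illustrative) */
static const int candidates[NUM_CANDIDATES][2] = { {1, 2}, {2, 3} };

typedef struct { int support[NUM_CANDIDATES]; } apriori_robj_t;

static int contains(const int *tx, int len, int item) {
    for (int i = 0; i < len; i++)
        if (tx[i] == item) return 1;
    return 0;
}

/* local reduction: fold one transaction into the support counters */
void apriori_reduce(apriori_robj_t *obj, const int *tx, int len) {
    for (int c = 0; c < NUM_CANDIDATES; c++)
        if (contains(tx, len, candidates[c][0]) &&
            contains(tx, len, candidates[c][1]))
            obj->support[c]++;   /* accumulate, no pair emission */
}
```

Since increments commute, transactions can be folded in by many threads into private copies of the object and merged afterwards.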
The Example Again, Now with Phoenix: Apriori Pseudo-Code Using MapReduce
(figure: Apriori pseudo-code)
System Design and Implementation
- Basic dataflow of MATE (Map-Reduce with an AlternaTE API) in the Full Replication scheme
- Data structures in the MATE scheduler
  - Used to communicate between user code and the runtime
- Three sets of functions in MATE
  - Internally used functions
  - MATE API provided by the runtime
  - MATE API to be defined by the users
- Implementation considerations
MATE Runtime Dataflow
(figure: basic one-stage dataflow, Full Replication scheme)
Data Structures (I)
scheduler_args_t: basic fields

Field        Description
Input_data   Input data pointer
Data_size    Input dataset size
Data_type    Input data type
Stage_num    Computation-stage number (iteration number)
Splitter     Pointer to splitter function
Reduction    Pointer to reduction function
Finalize     Pointer to finalize function
Data Structures (II)
scheduler_args_t: optional fields for performance tuning

Field                  Description
Unit_size              Number of bytes for one element
L1_cache_size          Number of bytes of L1 data cache
Model                  Shared-memory parallelization model
Num_reduction_workers  Max number of reduction worker threads
Num_procs              Max number of processor cores used
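The two tables above can be collected into a single C struct. The field names follow the slides; the concrete types (and the opaque reduction_args_t) are assumptions made for illustration.

```c
/* scheduler_args_t as described by the tables; types are assumed. */
#include <stddef.h>

typedef struct reduction_args reduction_args_t;  /* opaque here */

typedef int  (*splitter_t)(void *, int, reduction_args_t *);
typedef void (*reduction_t)(reduction_args_t *);
typedef void (*finalize_t)(void *);

typedef struct {
    /* basic fields */
    void        *input_data;    /* input data pointer */
    size_t       data_size;     /* input dataset size */
    int          data_type;     /* input data type */
    int          stage_num;     /* computation-stage (iteration) number */
    splitter_t   splitter;      /* pointer to splitter function */
    reduction_t  reduction;     /* pointer to reduction function */
    finalize_t   finalize;      /* pointer to finalize function */

    /* optional tuning fields */
    size_t       unit_size;     /* bytes per input element */
    size_t       l1_cache_size; /* bytes of L1 data cache */
    int          model;         /* shared-memory parallelization model */
    int          num_reduction_workers; /* max reduction worker threads */
    int          num_procs;     /* max processor cores used */
} scheduler_args_t;
```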
Functions (I)
Transparent to users (R: required; O: optional)

Function                                                     R/O
static inline void * schedule_tasks(thread_wrapper_arg_t *)  R
static void * combination_worker(void *)                     R
static int array_splitter(void *, int, reduction_args_t *)   R
void clone_reduction_object(int num)                         R
static inline int isCpuAvailable(unsigned long, int)         R
Functions (II)
APIs provided by the runtime

Function                                                       R/O
int mate_init(scheduler_args_t * args)                         R
int mate_scheduler(void * args)                                R
int mate_finalize(void * args)                                 O
void reduction_object_pre_init()                               R
int reduction_object_alloc(int size) -- returns the object id  R
void reduction_object_post_init()                              R
void accumulate(int id, int offset, void * value)              O
void reuse_reduction_object()                                  O
void * get_intermediate_result(int iter, int id, int offset)   O
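To make the allocate/accumulate semantics in the table concrete, here is a tiny self-contained toy modeled on reduction_object_alloc(size) returning an object id and accumulate(id, offset, value) folding a value into one slot of that object. It mimics the shape of those calls but is a toy, not the MATE runtime.

```c
/* Toy model of reduction-object allocation and accumulation;
 * mirrors the table's call shapes, not the real runtime. */
#include <assert.h>
#include <string.h>

#define MAX_OBJECTS 4
#define MAX_SLOTS   16

static long objects[MAX_OBJECTS][MAX_SLOTS];
static int  next_id = 0;

/* allocate a zeroed reduction object; returns its id */
int toy_reduction_object_alloc(int nslots) {
    assert(next_id < MAX_OBJECTS && nslots <= MAX_SLOTS);
    memset(objects[next_id], 0, sizeof objects[next_id]);
    return next_id++;
}

/* fold a value into one slot of the object (commutative update) */
void toy_accumulate(int id, int offset, long value) {
    objects[id][offset] += value;
}

/* read back one slot (stand-in for get_intermediate_result) */
long toy_get(int id, int offset) {
    return objects[id][offset];
}
```

Because each accumulate is a commutative in-place update, the runtime is free to give each worker thread its own copy of the object (clone_reduction_object in the previous table) and merge the copies later.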
Functions (III)
APIs defined by the user

Function                                            R/O
int (*splitter_t)(void *, int, reduction_args_t *)  O
void (*reduction_t)(reduction_args_t *)             R
void (*combination_t)(void *)                       O
void (*finalize_t)(void *)                          O
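As an illustration of the required (*reduction_t) role, here is a sketch of what a user-supplied reduction function could look like for a k-means-style computation (used in the experiments later): each point updates the running sum and count of its nearest centroid inside the reduction object. This is an assumed example, not the paper's actual code, and it uses 1-D points for brevity.

```c
/* Sketch of a user-defined reduction function for 1-D k-means;
 * illustrative only, not the paper's implementation. */
#define K 2

typedef struct {
    double sum[K];   /* per-cluster coordinate sums */
    long   count[K]; /* per-cluster point counts */
} kmeans_robj_t;

/* local reduction: fold one point into its nearest centroid's stats */
void kmeans_reduce(kmeans_robj_t *obj, const double centroids[K], double x) {
    int best = 0;
    double bestd = (x - centroids[0]) * (x - centroids[0]);
    for (int k = 1; k < K; k++) {
        double d = (x - centroids[k]) * (x - centroids[k]);
        if (d < bestd) { bestd = d; best = k; }
    }
    obj->sum[best]   += x;   /* accumulate into the reduction object */
    obj->count[best] += 1;
}
```

After all points are folded in, a finalize step would compute the new centroids as sum[k] / count[k] and start the next iteration (the next computation stage).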
Implementation Considerations
- Focus on the API differences in programming models
- Data partitioning: splits are dynamically assigned to worker threads, as in Phoenix
- Buffer management: two temporary buffers
  - one for reduction objects
  - the other for combination results
- Fault tolerance: re-executes failed tasks
  - Checkpointing may be a better solution, since the reduction object maintains the computation state
Experiments Design
- For comparison, we used three data mining applications: K-Means, PCA, Apriori
- Also evaluated the single-node performance of Hadoop on K-Means and Apriori
- Experiments on two multi-core platforms
  - 8 cores on one WCI node (Intel CPUs)
  - 16 cores on one AMD node (AMD CPUs)
Results: Data Mining (I)
K-Means: 400 MB dataset, 3-dim points, k = 100, on one WCI node with 8 cores
(figure: avg. time per iteration (sec) vs. number of threads; 2.0x speedup)
Results: Data Mining (II)
K-Means: 400 MB dataset, 3-dim points, k = 100, on one AMD node with 16 cores
(figure: avg. time per iteration (sec) vs. number of threads; 3.0x speedup)
Results: Data Mining (III)
PCA: 8000 x 1024 matrix, on one WCI node with 8 cores
(figure: total time (sec) vs. number of threads; 2.0x speedup)
Results: Data Mining (IV)
PCA: 8000 x 1024 matrix, on one AMD node with 16 cores
(figure: total time (sec) vs. number of threads; 2.0x speedup)
Results: Data Mining (V)
Apriori: 1,000,000 transactions, 3% support, on one WCI node with 8 cores
(figure: avg. time per iteration (sec) vs. number of threads)
Results: Data Mining (VI)
Apriori: 1,000,000 transactions, 3% support, on one AMD node with 16 cores
(figure: avg. time per iteration (sec) vs. number of threads)
Observations
- The MATE system achieves reasonable to significant speedups for all three data mining applications
- In most cases it outperforms Phoenix and Hadoop significantly, and is only slightly better in one case
- The reduction object helps reduce the memory overhead associated with managing the large set of intermediate results between map and reduce
Related Work
- Lots of work on improving and generalizing MapReduce's API
  - Industry: Dryad/DryadLINQ from Microsoft, Sawzall from Google, Pig/Map-Reduce-Merge from Yahoo!
  - Academia: CGL-MapReduce, Mars, MITHRA, Phoenix, Disco, Hive, ...
- These address MapReduce limitations
  - The one-input, two-stage dataflow is extremely rigid
  - Only two high-level primitives
Conclusions
- MapReduce is simple and robust in expressing parallelism
- Its two-stage computation style may cause performance losses for some subclasses of data-intensive applications
- MATE provides an alternate API based on generalized reduction
- This variation can reduce the overheads of data management and communication between map and reduce