Published by Suzan Waters. Modified over 9 years ago.
1
“Evaluating MapReduce for Multi-core and Multiprocessor Systems”
Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, Christos Kozyrakis
Computer Systems Laboratory, Stanford University
Presented by JP Cafaro
2
ECE 259 / CPS 221
Introduction to MapReduce
MapReduce is a programming model created by Google to automatically parallelize and distribute code across thousands of servers.
It lets the programmer write simple functional code without worrying about the low-level parallelization under the hood.
It works by taking the input data and mapping it to intermediate key/value pairs. Disjoint portions of the input data can be processed in parallel.
The intermediate pairs are then reduced to produce the final output. The reduce step can also run in parallel.
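The map/reduce flow described on this slide can be sketched with the classic word count example. This is a minimal, sequential illustration of the programming model only; the function names (`map_words`, `group_by_key`, `reduce_counts`, `word_count`) are made up for this sketch and are not part of Google's MapReduce or of Phoenix.

```python
from collections import defaultdict


def map_words(chunk):
    """Map: emit one (word, 1) intermediate pair per word in a chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def group_by_key(pairs):
    """Collect all intermediate values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce: combine all values for one key into a single output pair."""
    return word, sum(counts)


def word_count(chunks):
    intermediate = []
    # Each chunk is a disjoint portion of the input, so in a real runtime
    # these map calls could run in parallel. Here they run one by one.
    for chunk in chunks:
        intermediate.extend(map_words(chunk))
    groups = group_by_key(intermediate)
    # Each key is reduced independently, so this step could also be parallel.
    return dict(reduce_counts(word, counts) for word, counts in groups.items())


if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog", "The end"]))
```

The programmer only writes the map and reduce functions; splitting the input, grouping by key, and scheduling are the parts a MapReduce runtime is meant to take over.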
3
Proposal and Features
Google's MapReduce targets clusters of thousands of distributed machines and relies on remote file accesses.
The researchers wanted to create a shared-memory implementation of MapReduce, called Phoenix, for commercial multi-core and multiprocessor systems.
Phoenix spawns threads dynamically, taking into account the number of cores, the hardware threads per core, system load, etc.
It also handles work stealing/load balancing, prefetching, task granularity, and fault tolerance.
It takes care of these low-level details automatically, giving the programmer a simple model that greatly improves productivity.
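As a rough illustration of the shared-memory idea on this slide, the sketch below sizes a thread pool from the number of available cores and cuts the input into more chunks than workers, so that a worker that finishes early simply picks up the next chunk. It is not Phoenix's API (Phoenix is a C runtime); `split_into_chunks`, `count_chunk`, `parallel_word_count`, and the `chunks_per_worker` parameter are assumptions made for this example. Note also that CPython threads will not speed up pure-Python work because of the global interpreter lock, so the sketch shows the structure rather than the performance.

```python
import os
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_chunks):
    """Split items into up to num_chunks contiguous, roughly equal slices."""
    num_chunks = max(1, min(num_chunks, len(items)))
    size, extra = divmod(len(items), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def count_chunk(words):
    """Map task: count the words in one chunk (all data stays in shared memory)."""
    return Counter(words)


def parallel_word_count(words, workers=None, chunks_per_worker=4):
    # Size the pool from the machine, as a stand-in for a runtime that
    # looks at core count when spawning threads (hypothetical policy).
    workers = workers or os.cpu_count() or 1
    # More chunks than workers: idle workers pull the next chunk from the
    # pool's queue, which gives a simple form of dynamic load balancing.
    chunks = split_into_chunks(words, workers * chunks_per_worker)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_chunk, chunks))
    # Reduce: merge the per-chunk counts into the final result.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


if __name__ == "__main__":
    print(parallel_word_count("a b a c b a".split(), workers=2))
```

The chunk size here plays the role of the granularity knob mentioned on the slide: smaller chunks balance load better but add more scheduling overhead.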
4
Benchmark and Results
The researchers used a number of parallelizable programs, including word count, matrix multiply, reverse index, etc.
Speedups were measured relative to sequential versions of the code.
In all cases, the MapReduce implementation outperformed the sequential version.
In some cases, the overhead introduced by Phoenix made it less efficient than a low-level implementation in Pthreads.
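For reference, speedup against a sequential baseline is usually computed as the sequential run time divided by the parallel run time. The sketch below shows that arithmetic; the helper names and the timings in the example are invented for illustration and are not results from the paper.

```python
def speedup(t_sequential, t_parallel):
    """Speedup of a parallel run relative to the sequential baseline."""
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, cores):
    """Fraction of ideal linear speedup achieved on the given number of cores."""
    return speedup(t_sequential, t_parallel) / cores


if __name__ == "__main__":
    # Hypothetical timings in seconds, chosen only to show the calculation.
    t_seq, t_par, cores = 12.0, 3.0, 8
    print(speedup(t_seq, t_par))            # 4.0
    print(efficiency(t_seq, t_par, cores))  # 0.5
```

A speedup above 1 means the parallel version beats the sequential one; comparing the speedups of two parallel versions (for example a MapReduce runtime and hand-written threads) shows how much overhead the higher-level model costs.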
5
Questions
The main question is the tradeoff between programming simplicity and performance.
The low-level Pthreads implementation did not use dynamic scheduling because of the programming complexity, even though doing so would probably have made the Phoenix implementation look less attractive from a performance standpoint.
Are we giving up too much to make programmers’ lives easier?
How many types of applications can this MapReduce implementation be used for?
Are there other programming models similar to MapReduce that could be fitted to other problem types?
6
Conclusions
MapReduce/Phoenix can be very useful for algorithms that map nicely onto this programming model, as the results show.
Programs that the model is not naturally suited for see smaller speedups. For these, the overhead introduced by Phoenix lets alternatives such as a lower-level Pthreads implementation perform better.
Overall, the model is extremely simple, and techniques such as MapReduce that automatically parallelize code are important to consider as we figure out how to write software for systems with many cores.