HAMA: An Efficient Matrix Computation with the MapReduce Framework Sangwon Seo, Edward J. Yoon, Jaehong Kim, Seongwook Jin, Jin-Soo Kim, Seungryoul Maeng IEEE CloudCom 2010 Presented by Kyung-Bin Lim, Dec 3, 2014
2 / 35 Outline Introduction Methodology Experiments Conclusion
3 / 35 Apache HAMA Easy-to-use tool for data-intensive scientific computation Massive matrix/graph computations are often its primary functionalities The fundamental design has shifted from MapReduce-based matrix computation to BSP-based graph processing A Pregel-like system running on HDFS – Uses ZooKeeper as a synchronization barrier
4 / 35 Our Focus This paper describes the earlier version 0.1 of HAMA – Latest version: 0.7.0, released in March Here we focus only on matrix computation with MapReduce Presents simple case studies
5 / 35 The HAMA Architecture We propose a distributed scientific computation framework called HAMA (based on HPMR) – Provides transparent matrix/graph primitives
6 / 35 The HAMA Architecture HAMA API: Easy-to-use Interface HAMA Core: Provides matrix/graph primitives HAMA Shell: Interactive User Console
7 / 35 Contributions of HAMA Compatibility – Takes advantage of all Hadoop features Scalability – Scalable due to compatibility Flexibility – Multiple compute engines are configurable Applicability – HAMA’s primitives can be applied to various applications
8 / 35 Outline Introduction Methodology Experiments Conclusion
9 / 35 Case Study Using a case-study approach, we introduce two basic primitives implemented with the MapReduce model on HAMA – Matrix multiplication and finding a linear solution And compare them with MPI versions of these primitives
10 / 35 Case Study Representing matrices – By default, HAMA uses HBase (a NoSQL database) HBase is modeled after Google’s Bigtable A column-oriented, semi-structured distributed database with high scalability
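The row-oriented storage idea can be sketched as follows. This is a hypothetical plain-Python stand-in (`RowStore` is not the HAMA or HBase API): one logical "HBase row" per matrix row, with sparse columns holding the nonzero entries.

```python
# Hypothetical sketch, NOT the HAMA/HBase API: a matrix stored row by row,
# mirroring how HAMA keeps one HBase row per matrix row.
class RowStore:
    def __init__(self):
        self.rows = {}  # row index -> {column index: value}

    def put(self, i, j, value):
        # Store a single matrix entry; absent entries are implicitly zero
        self.rows.setdefault(i, {})[j] = value

    def get_row(self, i):
        # Return the sparse row (empty dict if the row has no entries)
        return self.rows.get(i, {})


store = RowStore()
store.put(0, 0, 1.0)
store.put(0, 2, 3.0)
print(store.get_row(0))  # {0: 1.0, 2: 3.0}
```

Keeping only nonzero entries per row is what makes the sparse, row-at-a-time iterative multiplication on the next slides cheap.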
11 / 35 Case Study – Multiplication: Iterative Way Iterative approach (Algorithm)
12 / 35 Case Study – Multiplication: Iterative Way Simple, naïve strategy Works well with sparse matrices – Sparse matrix: most entries are 0
13–18 / 35 Multiplication: Iterative Way (step-by-step walkthrough figures)
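The iterative way walked through above can be sketched in plain Python. This is an illustrative single-process analogue, not HAMA's implementation: the "map" phase pairs each nonzero entry of a row of A with the matching row of B, and the "reduce" phase sums partial products per output cell.

```python
from collections import defaultdict

def iterative_multiply(A, B):
    """Sketch of the iterative (row-at-a-time) approach, assuming sparse
    matrices stored as {row: {col: value}}. Not HAMA's MapReduce code."""
    # "Map": emit ((i, j), a_ik * b_kj) for every nonzero a_ik
    partials = defaultdict(float)
    for i, row in A.items():
        for k, a_ik in row.items():
            for j, b_kj in B.get(k, {}).items():
                # "Reduce": sum partial products per output cell (i, j)
                partials[(i, j)] += a_ik * b_kj
    # Collect results back into sparse row form
    C = defaultdict(dict)
    for (i, j), v in partials.items():
        C[i][j] = v
    return dict(C)


A = {0: {0: 1.0, 1: 2.0}, 1: {0: 3.0}}
B = {0: {0: 4.0}, 1: {0: 5.0, 1: 6.0}}
print(iterative_multiply(A, B))  # {0: {0: 14.0, 1: 12.0}, 1: {0: 12.0}}
```

Because only nonzero entries are ever touched, the work scales with the number of nonzeros, which is why this naïve strategy suits sparse matrices.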
19 / 35 Case Study – Multiplication: Block Way Multiplication can be done using sub-matrices (blocks) Works well with dense matrices
20 / 35 Case Study – Multiplication: Block Way Block Approach – Minimize data movement (network cost)
21 / 35 Case Study – Multiplication: Block Way Block Approach (Algorithm)
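The block approach can be sketched as follows. This is a single-process NumPy analogue (not HAMA's distributed code): the matrices are partitioned into `bs` × `bs` sub-matrices and block products are accumulated as C[I,J] += A[I,K] · B[K,J], the same structure the distributed version uses to ship whole blocks instead of individual cells.

```python
import numpy as np

def block_multiply(A, B, bs):
    """Sketch of block (sub-matrix) multiplication, assuming square
    matrices whose size is divisible by the block size bs."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for I in range(0, n, bs):
        for J in range(0, n, bs):
            for K in range(0, n, bs):
                # Accumulate one block product into the output block
                C[I:I+bs, J:J+bs] += A[I:I+bs, K:K+bs] @ B[K:K+bs, J:J+bs]
    return C


A = np.arange(16.0).reshape(4, 4)
B = np.arange(16.0, 32.0).reshape(4, 4)
print(np.allclose(block_multiply(A, B, 2), A @ B))  # True
```

Moving a whole block per transfer amortizes network cost over bs² entries, which is why the block way minimizes data movement for dense matrices.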
22 / 35 Case Study – Finding Linear Solution Solve Ax = b for x – A: known square, symmetric, positive-definite matrix – b: known vector Use the Conjugate Gradient approach
23 / 35 Case Study – Finding Linear Solution Finding Linear Solution – Cramer’s rule – Conjugate Gradient Method
24 / 35 Case Study – Finding Linear Solution Cramer’s rule
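For a small system, Cramer's rule can be shown concretely. This is a 2×2 illustration only (the function name and setup are mine, not from the slides); determinant-based solving scales badly, which motivates the Conjugate Gradient method on the next slides.

```python
def cramer_2x2(a, b, c, d, e, f):
    """Cramer's rule for the system  a*x + b*y = e,  c*x + d*y = f.
    Each unknown is a ratio of determinants: illustrative only, since
    the cost explodes for large systems."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix")
    x = (e * d - b * f) / det  # replace first column with [e, f]
    y = (a * f - e * c) / det  # replace second column with [e, f]
    return x, y


# 2x + y = 5,  x + 3y = 10  ->  x = 1, y = 3
print(cramer_2x2(2, 1, 1, 3, 5, 10))  # (1.0, 3.0)
```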
25 / 35 Case Study – Finding Linear Solution Conjugate Gradient Method – Find a direction (conjugate direction) – Find a step size (Line search)
26 / 35 Case Study – Finding Linear Solution Conjugate Gradient Method (Algorithm)
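The two CG steps named above, finding a conjugate direction and a step size by line search, can be sketched with the textbook algorithm in NumPy. This is a plain single-process sketch, not HAMA's MapReduce version; it assumes A is symmetric positive-definite as the slides require.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Textbook Conjugate Gradient sketch for SPD A (not HAMA's code)."""
    x = np.zeros_like(b)
    r = b - A @ x            # residual
    p = r.copy()             # initial search (conjugate) direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)        # step size from the line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p    # next direction, conjugate w.r.t. A
        rs = rs_new
    return x


A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))  # approx [0.0909, 0.6364]
```

Each iteration needs only one matrix-vector product plus a few vector operations, which map naturally onto the matrix primitives HAMA provides.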
27 / 35 Outline Introduction Methodology Experiments Conclusion
28 / 35 Evaluations TUSCI (TU Berlin SCI) Cluster – 16 nodes, each with two Intel P4 Xeon processors and 1GB memory – Connected with an SCI (Scalable Coherent Interface) network interface in a 2D torus topology – Runs OpenCCS (an environment similar to HOD) Test sets
29 / 35 HPMR’s Enhancements Prefetching – Increases data locality Pre-shuffling – Reduces the amount of intermediate output to shuffle
30 / 35 Evaluations The comparison of average execution time and scaleup with Matrix Multiplication
31 / 35 Evaluations The comparison of average execution time and scaleup with CG
32 / 35 Evaluations The comparison of average execution time with CG, when a single node is overloaded
33 / 35 Outline Introduction Methodology Experiments Conclusion
34 / 35 Conclusion HAMA provides an easy-to-use tool for data-intensive computations – Matrix computation with MapReduce – Graph computation with BSP