1 Hadoop MapReduce performance on SSDs: The case of complex network analysis tasks
Dimitrios Katsaros (1), jointly with Marios Bakratsas (1), Pavlos Basaras (1), Leandros Tassiulas (2)
(1) Dept. of Electrical & Computer Engineering, University of Thessaly, Greece
(2) Dept. of Electrical Engineering, Yale University, USA

2 Programming in large computing clusters

3 MapReduce basics …

4 Magnetic Disks OR Solid State Disks?
Solid State Drive (SSD)
- A purely electronic device built on NAND flash memory
- No mechanical parts
Technical merits
- Low access latency
- Low power consumption
- Shock resistance
- Potentially uniform random access speed
Two remaining problems limiting wider deployment of SSDs
- Limited life span
- Random write performance §
§ SFS: Random write considered harmful in Solid State Drives, Proceedings of the USENIX Conference on File and Storage Technologies (FAST), 2012.
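The random-write concern can be probed with a small timing sketch. This is an illustration only, not the setup used in the presentation: the block size, block count, and the helper names `timed_write` and `compare_write_patterns` are assumptions made for the example. Because the writes go through the operating system's page cache, the numbers say little about the device itself unless the file is much larger than available memory.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 4096   # assumed block size in bytes
NUM_BLOCKS = 256    # deliberately small: the probe file is only 1 MiB


def timed_write(path, offsets):
    """Write one block at each offset, fsync once, and return elapsed seconds."""
    block = os.urandom(BLOCK_SIZE)
    start = time.perf_counter()
    with open(path, "r+b") as f:
        for offset in offsets:
            f.seek(offset)
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push buffered data to the device
    return time.perf_counter() - start


def compare_write_patterns(directory=None, seed=0):
    """Time the same set of block writes in sequential and in shuffled order."""
    sequential = [i * BLOCK_SIZE for i in range(NUM_BLOCKS)]
    shuffled = sequential[:]
    random.Random(seed).shuffle(shuffled)
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        path = os.path.join(tmp, "probe.bin")
        with open(path, "wb") as f:
            f.truncate(NUM_BLOCKS * BLOCK_SIZE)  # preallocate the file
        seq_time = timed_write(path, sequential)
        rnd_time = timed_write(path, shuffled)
        size = os.path.getsize(path)
    return seq_time, rnd_time, size


if __name__ == "__main__":
    seq_time, rnd_time, _ = compare_write_patterns()
    print(f"sequential: {seq_time:.4f}s  random: {rnd_time:.4f}s")
```

Passing `directory` a path on the disk under test (HDD or SSD) lets the same probe be repeated per device.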

5 Is it true that SSDs are better than MDs?
So far, (almost) every study concludes that SSDs are better, based on “benchmark” queries: Terasort, wordcount, K-means, …
What if we try different read/write patterns?
Complex network analysis offers various primitives, such as:

Task                 | Type of analysis                                         | Applications
Mutual friends       | Neighbor-based; local network (neighborhood) properties  | Recommendation queries
Connected Components | Path-based; large-scale network properties               | Reachability queries, resilience queries
Triangle counting    | Mixed (extended neighborhood & paths)                    | Clustering/communities finding queries
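To make two of these primitives concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce steps in plain Python. It is not the Hadoop code used in the study: the `run_mapreduce` helper and the toy graph `ADJACENCY` are hypothetical, and the mutual-friends job shown is the common textbook formulation that only covers pairs who are already friends.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Simulate map -> shuffle (group by key) -> reduce in memory."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    results = []
    for key in sorted(groups):
        results.extend(reducer(key, groups[key]))
    return results


# Hypothetical undirected toy graph, stored as adjacency lists.
ADJACENCY = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["a", "c"],
}


def mutual_friends_map(record):
    """Emit the user's friend set once for every edge the user takes part in."""
    user, friends = record
    for friend in friends:
        pair = tuple(sorted((user, friend)))
        yield pair, set(friends)


def mutual_friends_reduce(pair, friend_sets):
    """Each edge receives two friend sets; their intersection is the mutual friends."""
    common = set.intersection(*friend_sets)
    yield pair, sorted(common)


mutual = dict(
    run_mapreduce(ADJACENCY.items(), mutual_friends_map, mutual_friends_reduce)
)

# A mutual friend of an edge closes a triangle, and every triangle is seen
# once from each of its three edges, hence the division by 3.
triangles = sum(len(common) for common in mutual.values()) // 3

if __name__ == "__main__":
    for pair, common in mutual.items():
        print(pair, common)
    print("triangles:", triangles)
```

On a real cluster the shuffle step moves the grouped values through local disks and the network, which is where the choice of storage medium would come into play.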

6 System setup
CPU: Intel i5 4670, 3.4 GHz
RAM: 8 GB 1600MHz DDR3 (1333MHz)
Disk 1 (HDD): Western Digital Blue WD10EZEX 1TB
Disk 2 (SSD1): Samsung 840 EVO 120GB
Disk 3 (SSD2): Crucial MX GB

7 A glimpse of results

8 Take away lessons
- Although the SSD was slightly faster in many tests, in some cases the magnetic disk outperformed it
- The magnetic disk performed marginally better in the reduce phase
- Application profilers are necessary to drive the selection of storage medium

9 Thank you!

