Analyzing Yellowstone’s Network with a Raspberry Pi Cluster Lauren Patterson
Objective of the Project: Using a low-cost Raspberry Pi cluster to find the interconnect path between two nodes on Yellowstone in order to analyze the performance of jobs.
Assembling the Raspberry Pi cluster
Yellowstone Interconnect Credit: Siddhartha Ghosh
Files Used: job1_nodes.txt – gives the job ID and nodes used; ibnetdiscover.log (Discover File) – lists connections between switches; LFTS.txt – routing table for each switch
What is Hadoop? HDFS, MapReduce
HDFS
MapReduce: Input Data → Map phase → Shuffle phase → Reduce phase → Output Data
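The three phases can be illustrated in plain Python. This is a minimal sketch, not Hadoop code: the `(job ID, node)` records and the function names are hypothetical, chosen only to show how data moves from map to shuffle to reduce.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit one (key, value) pair per input record."""
    for job_id, node in records:
        yield job_id, node


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into one result (here, a count)."""
    return {key: len(values) for key, values in groups.items()}


# Hypothetical input data: (job ID, node name) records.
records = [("job1", "nodeA"), ("job1", "nodeB"), ("job2", "nodeC")]

nodes_per_job = reduce_phase(shuffle_phase(map_phase(records)))
print(nodes_per_job)  # {'job1': 2, 'job2': 1}
```

In Hadoop the map and reduce functions run on different machines and the shuffle moves data between them; here everything runs in one process.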
Pig: Apache Pig, Pig Latin, Grunt
Pig Latin Script: Created a Pig Latin script to find the path between two nodes in Yellowstone
JOIN Operations in Pig: The default, inner join returns the intersection of A and B (A ∩ B). Full, right and left outer joins return A and B with different parts nulled out (white).
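The four join types can be mimicked in Python to show which rows survive and where nulls appear. This is a simplified sketch that assumes each key occurs at most once per relation (Pig itself handles repeated keys); the `join` helper and the sample relations are hypothetical.

```python
def join(a, b, how="inner"):
    """Join two {key: value} relations; a missing side is filled with None."""
    if how == "inner":
        keys = a.keys() & b.keys()   # keys present in both A and B
    elif how == "left":
        keys = a.keys()              # every key of A
    elif how == "right":
        keys = b.keys()              # every key of B
    elif how == "full":
        keys = a.keys() | b.keys()   # every key of A or B
    else:
        raise ValueError(f"unknown join type: {how}")
    return {k: (a.get(k), b.get(k)) for k in sorted(keys)}


# Hypothetical relations keyed by switch name.
A = {"sw1": "portA", "sw2": "portB"}
B = {"sw2": "lid7", "sw3": "lid9"}

for how in ("inner", "left", "right", "full"):
    print(how, join(A, B, how))
```

The `None` entries play the role of the nulled-out (white) regions in the diagram.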
Path Finder Code Flow
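One way to picture the path finder is as a hop-by-hop walk: look up the destination in the current switch's routing table, take the listed port, and follow the link to the next device. The sketch below is an assumption about the general approach, not the project's actual script. The `links` and `lfts` dictionaries are hypothetical in-memory stand-ins for the discover file and the routing tables, and node names are used in place of real destination identifiers.

```python
def find_path(src, dst, links, lfts, max_hops=16):
    """Follow each switch's routing table from src until dst is reached."""
    path = [src]
    current = links[(src, 1)]  # assumption: every host attaches through port 1
    while current != dst:
        path.append(current)
        if len(path) > max_hops:
            raise RuntimeError("no route found within max_hops")
        out_port = lfts[current][dst]          # routing table lookup
        current = links[(current, out_port)]   # follow the cable on that port
    path.append(dst)
    return path


# Hypothetical topology: nodeA - sw1 - core - sw2 - nodeB
links = {
    ("nodeA", 1): "sw1",
    ("sw1", 1): "nodeA", ("sw1", 2): "core",
    ("core", 1): "sw1", ("core", 2): "sw2",
    ("sw2", 1): "core", ("sw2", 2): "nodeB",
    ("nodeB", 1): "sw2",
}
# Per-switch routing table: destination -> output port.
lfts = {
    "sw1": {"nodeA": 1, "nodeB": 2},
    "core": {"nodeA": 1, "nodeB": 2},
    "sw2": {"nodeA": 1, "nodeB": 2},
}

print(find_path("nodeA", "nodeB", links, lfts))
# ['nodeA', 'sw1', 'core', 'sw2', 'nodeB']
```

The `max_hops` guard is there only to stop the walk if the tables contain a routing loop.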
Results (chart; error-bar labels: ±3, ±82, ±19, ±15, ±3, ±4)
Python: Single Path Python; Parallel Python – mpi4py 1.3.1
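The slides do not say how the parallel version divides its work. A plausible sketch, assuming each MPI rank handles a round-robin share of the node pairs and rank 0 collects the results, is shown below. `pairs_for_rank` and `run_parallel` are hypothetical names, and `find_path` stands for any function that returns the path between two nodes.

```python
from itertools import combinations


def pairs_for_rank(nodes, rank, size):
    """Return the node pairs this rank should process (round-robin split)."""
    all_pairs = list(combinations(nodes, 2))
    return all_pairs[rank::size]


def run_parallel(nodes, find_path):
    """Distribute pairs over MPI ranks and gather the results on rank 0."""
    from mpi4py import MPI  # imported here so the split logic works without MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    local = [(a, b, find_path(a, b)) for a, b in pairs_for_rank(nodes, rank, size)]
    gathered = comm.gather(local, root=0)  # list of per-rank lists on rank 0
    if rank == 0:
        return [item for chunk in gathered for item in chunk]
    return None


# The split can be checked without MPI: two ranks share six pairs evenly.
nodes = ["n1", "n2", "n3", "n4"]
print(pairs_for_rank(nodes, 0, 2))
print(pairs_for_rank(nodes, 1, 2))
```

`run_parallel` would be called from a script launched with an MPI launcher such as `mpirun`; run without one, there is a single rank and it processes every pair.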
(Chart; error-bar labels: ±0.02, ±0.07, ±0.006, ±0.11, ±0.004, ±0.11)
(Chart; error-bar labels: ±18, ±4, ±20, ±2, ±7, ±4, ±1, ±2, ±0.5)
What Do All Of These Have In Common? Raspberry Pi, Hadoop, Pig, Python
Acknowledgments Richard Loft Karina Hauser Stephanie Barr Bruce Chittenden Amogh Simha Raghu Raj Prasanna Kumar
Questions?