Transparent and Flexible Network Management for Big Data Processing in the Cloud
Anupam Das, Curtis Yu, Cristian Lumezanu, Yueping Zhang, Vishal Singh, Guofei Jiang
Data processing Network
Schedule computation
Schedule communication 33% of average job running time
FlowComb: network management framework for Big Data processing
1. What is the traffic demand?
2. Which path to choose?
3. How to change the path?
Demand prediction
Use application semantics information to effectively and transparently infer network transfers (possibly before they start)
Demand prediction
Agents on Hadoop nodes analyze Hadoop logs, query nodes, and predict data transfers.
Agent on each Hadoop node:
- Parses TaskTracker logs to identify reducers and the size of map output
- Parses JobTracker logs to identify finished mappers
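The log-driven prediction above can be sketched roughly as follows. This is a minimal illustration, not FlowComb's agent: the `MAP_DONE` and `REDUCER_START` line formats are invented stand-ins (real Hadoop TaskTracker and JobTracker logs look different), the names `PredictedTransfer` and `predict_transfers` are hypothetical, and the even split of map output across reducers is an assumption made only to keep the example short.

```python
import re
from dataclasses import dataclass

# Hypothetical log formats, invented for this sketch. They are NOT the
# real Hadoop TaskTracker/JobTracker formats.
MAP_DONE = re.compile(r"MAP_DONE task=\S+ host=(?P<host>\S+) output_bytes=(?P<size>\d+)")
REDUCER_START = re.compile(r"REDUCER_START task=\S+ host=(?P<host>\S+)")


@dataclass(frozen=True)
class PredictedTransfer:
    src: str         # host that ran the finished mapper
    dst: str         # host running a reducer
    size_bytes: int  # predicted bytes to move from src to dst


def predict_transfers(log_lines):
    """Predict mapper-to-reducer transfers from parsed log lines."""
    finished_maps = []   # (host, map output size in bytes)
    reducer_hosts = []
    for line in log_lines:
        m = MAP_DONE.search(line)
        if m:
            finished_maps.append((m["host"], int(m["size"])))
            continue
        r = REDUCER_START.search(line)
        if r:
            reducer_hosts.append(r["host"])

    transfers = []
    if not reducer_hosts:
        return transfers
    for map_host, size in finished_maps:
        # Assumption: map output is split evenly across all reducers.
        share = size // len(reducer_hosts)
        for red_host in reducer_hosts:
            if red_host != map_host:  # a local copy needs no network transfer
                transfers.append(PredictedTransfer(map_host, red_host, share))
    return transfers
```

Because a transfer is derived as soon as a mapper is seen to finish, the prediction can be available before the reducer actually starts fetching the data.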
Flow scheduling
Reroute flows on paths with sufficient available bandwidth
Flow scheduling
Where? Centralized decision engine
Which flows? FIFO
Reroute? If congestion on default path
Which path? First with available bandwidth
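The four scheduling decisions above can be sketched as a small centralized function. This is an illustrative sketch under stated assumptions, not the FlowComb implementation: the names `path_fits` and `schedule` are hypothetical, "congestion" is modeled simply as a link having less available bandwidth than the flow demands, and reserving bandwidth on the chosen path is an added assumption.

```python
from collections import deque


def path_fits(path, demand, available):
    """True if every link on the path has at least `demand` Mbps free."""
    return all(available[link] >= demand for link in path)


def schedule(flows, paths, available):
    """Assign a path to each flow in FIFO order.

    flows:     iterable of (flow_id, (src, dst), demand_mbps), in arrival order
    paths:     dict (src, dst) -> list of paths; the first entry is the default
               path, and each path is a list of link names
    available: dict link -> free bandwidth in Mbps (updated in place)
    """
    queue = deque(flows)  # FIFO: oldest predicted flow is handled first
    decisions = {}
    while queue:
        flow_id, endpoints, demand = queue.popleft()
        default, *alternatives = paths[endpoints]
        chosen = default
        # Reroute only if the default path is congested.
        if not path_fits(default, demand, available):
            for alt in alternatives:
                if path_fits(alt, demand, available):
                    chosen = alt  # first path with available bandwidth
                    break
        # Assumption: reserve the demand so later flows see updated capacity.
        for link in chosen:
            available[link] = max(0, available[link] - demand)
        decisions[flow_id] = chosen
    return decisions
```

If no alternative has enough bandwidth, this sketch leaves the flow on its default path; other fallback policies are possible.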
Flow control
Use OpenFlow to install new forwarding rules in the network and enforce the new paths
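To make the idea of enforcing a path with forwarding rules concrete, the sketch below turns a chosen path into one rule per switch. It is only an illustration: the function name `rules_for_path` and the dictionary layout are hypothetical and do not correspond to the OpenFlow wire format or to any particular controller's API. A real deployment would hand equivalent match and output-port information to its OpenFlow controller.

```python
def rules_for_path(src_ip, dst_ip, hops, priority=100):
    """Build one forwarding rule per switch along a path.

    hops: list of (switch_id, out_port) pairs, ordered from source to destination.
    Returns plain dictionaries (a hypothetical layout for illustration only).
    """
    return [
        {
            "switch": switch_id,
            # Match the flow by its endpoints.
            "match": {"ipv4_src": src_ip, "ipv4_dst": dst_ip},
            # Forward matching packets out of the port toward the next hop.
            "action": {"output": out_port},
            "priority": priority,
        }
        for switch_id, out_port in hops
    ]
```

For example, `rules_for_path("10.0.0.1", "10.0.0.2", [("s1", 2), ("s2", 3)])` yields two rules, one for switch `s1` forwarding to port 2 and one for switch `s2` forwarding to port 3.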
System Architecture
Components: Hadoop cluster (master, slaves), PFS, FlowComb agent, FlowComb middleware, OpenFlow controller
1. Analyze Hadoop logs
2. Extract flow information
3. Schedule upcoming flows
4. Set up flow paths
5. Install routing rules
Experiments
Does the network matter?
[Table: Link capacity (Mbps) vs. Avg. processing time (min); slowdown factors x1.3, x1.7, x3.7; residual values: 2567]
4 times slower
Can FlowComb predict transfers? 28% of transfers detected before they start (and 56% before they end)
How quickly can FlowComb change paths?
[Figure: 10% / 70% / 20%]
60% before transfer midpoint
Can FlowComb reduce processing time? 36% faster than Hadoop without FlowComb (and 28% faster than Hadoop with ECMP)
FlowComb
Network management platform for Big Data processing that
- is transparent to applications and quick and accurate in detecting their demand
- uses application semantics to detect data transfers (sometimes before they even start)
Testbed
OpenFlow network Controller
Hadoop sort performance
[Figure: FlowComb vs. baseline; axes: Time (s), Avg utilization (MBps)]