Download presentation
Presentation is loading. Please wait.
Published byGiles Harmon Modified over 8 years ago
1
Aalo Efficient Coflow Scheduling Without Prior Knowledge Mosharaf Chowdhury, Ion Stoica UC Berkeley
2
Communication is Crucial Performance As SSD-based and in-memory systems proliferate, the network is likely to become the primary bottleneck 1. Based on a month-long trace with 320,000 jobs and 150 Million tasks, collected from a 3000-machine Facebook production MapReduce cluster. Facebook jobs spend ~ 25% of runtime on average in intermediate comm. 1 Reduce Stage Map Stage
3
Flow-Based Solutions GPS RED WFQCSFQ ECNXCP D 2 TCP DCTCP PDQD3D3 FCP DeTailpFabric 2005201020151980s1990s2000s RCP Per-Flow FairnessFlow Completion Time Independent flows cannot capture the collective communication patterns (e.g., shuffle) common in data-parallel applications
4
Cof low Communication abstraction for data- parallel applications to express their performance goals 1.Minimize completion times, 2.Meet deadlines, or 3.Perform fair allocation
5
Benefits of Link 1 Link 2 Inter-Coflow Scheduling
6
Benefits of time 2 4 6 2 4 6 2 4 6 Coflow1 comp. time = 5 Coflow2 comp. time = 6 Coflow1 comp. time = 5 Coflow2 comp. time = 6 Fair Sharing Smallest-Flow FirstSmallest-Coflow First Coflow1 comp. time = 3 Coflow2 comp. time = 6 L1 L2 L1 L2 L1 L2 Link 1 Link 2 3 Units Coflow 1 6 Units Coflow 2 2 Units Inter-Coflow Scheduling Benefits increases with the number of coexisting coflows
7
Varys Efficiently schedules coflows leveraging complete and future information 1 1. Efficient Coflow Scheduling with Varys, SIGCOMM’2014. 1.The size of each flow, 2.The total number of flows, and 3.The endpoints of individual flows
8
1.The size of each flow, 2.The total number of flows, and 3.The endpoints of individual flows Pipelining between stages Speculative executions Task failures and restarts Varys Efficiently schedules coflows leveraging complete and future information
9
1.The size of each flow, 2.The total number of flows, and 3.The endpoints of individual flows Pipelining between stages Speculative executions Task failures and restarts Aalo Efficiently schedules coflows without complete and future information
10
Coflow Scheduling Minimize Avg. Comp. TimeFlows on a Single Link With complete knowledgeSmallest-Flow-First Without complete knowledge Least-Attained Service (LAS)
11
Coflow Scheduling Minimize Avg. Comp. TimeFlows on a Single LinkCoflows in the Entire Network With complete knowledgeSmallest-Flow-FirstVarys 1, Smallest-Coflow-First 1 Without complete knowledge Least-Attained Service (LAS) ? 1. Efficient Coflow Scheduling with Varys, SIGCOMM’2014.
12
Coflow Scheduling Minimize Avg. Comp. TimeFlows on a Single LinkCoflows in the Entire Network With complete knowledgeSmallest-Flow-FirstVarys 1, Smallest-Coflow-First 1 Without complete knowledge Least-Attained Service (LAS) ? LAS: prioritize flow that has sent the least amount of data 1. Efficient Coflow Scheduling with Varys, SIGCOMM’2014.
13
Coflow-Aware LAS (CLAS) Prioritize coflow that has sent the least total number of bytes The more a coflow has sent, the lower its priority Smaller coflows finish faster
14
Coflow-Aware LAS (CLAS) Prioritize coflow that has sent the least total number of bytes The more a coflow has sent, the lower its priority Smaller coflows finish faster Challenges (also shared by LAS) Can lead to starvation Suboptimal for similar size coflows
15
Suboptimal for Similar Coflows Reduces to fair sharing Doesn’t minimize average completion time FIFO works well for similar coflows Optimal when cflows are identical Coflow 1Coflow 2 time 2 4 6 Coflow1 comp. time = 6 Coflow2 comp. time = 6 time 2 4 6 Coflow1 comp. time = 3 Coflow2 comp. time = 6
16
Between a “Rock” and a “Hard Place” Prioritize across dissimilar coflows FIFO schedule similar coflows
17
Discretized Coflow-Aware LAS (D- CLAS) Priority discretization Change priority when total # of bytes sent exceeds predefined thresholds Scheduling policies FIFO within the same queue Prioritization across queue Weighted sharing across queues Guarantees starvation avoidance FIFO … Highest- Priority Queue Lowest- Priority Queue QKQK Q2Q2 Q1Q1
18
How to Discretize Priorities? … A E A E K-1 Exponentially spaced thresholds: A×E i A, E : constants 1 ≤ i ≤ K : threshold constant K : number of the queues A E 2 0 ∞ Highest- Priority Queue Lowest- Priority Queue QKQK Q2Q2 Q1Q1 FIFO
19
Computing Total # of Bytes Sent D-CLAS requires to know total # of bytes sent over all flows of a coflow Distributed computation over small time scales challenging
20
Computing Total # of Bytes Sent D-CLAS requires to know total # of bytes sent over all flows of a coflow Distributed computation over small time scales challenging How much do we loose if we don’t compute total # of bytes sent? D-LAS: make decisions based on total number of bytes sent locally
21
D-LAS Far From Optimal! time 2 4 6 2 4 6 Coflow1 comp. time = 6 Coflow2 comp. time = 6 D-LAS (decision on # of bytes sent locally) D-CLAS Coflow1 comp. time = 3 Coflow2 comp. time = 6 L1 L2 L1 L2 Link 1 Link 2 3 Units Coflow 1 6 Units Coflow 2 2 Units
22
1.Implement D-CLAS using a centralized architecture 2.Expose a non-blocking coflow API Aalo Efficiently schedules coflows without complete and future information
23
Worker milliseconds Worker D-CLAS Sender 1 Sender 2 μsμs Network Interface Timescale Local/Global Scheduling Coordinator Aalo Architecture
24
Details Non-blocking: when a new coflow arrives at an output port Put its flow(s) in lowest priority queue and schedule them immediately No need to sync all flows of a coflow as in Varys
25
Details Non-blocking: when a new coflow arrives at an output port Put its flow(s) in lowest priority queue and schedule them immediately No need to sync all flows of a coflow as in Varys Compute total number of bytes sent Workers send info about active coflows periodically Coordinator computes total # of bytes sent, and relay this info back to workers Workers use this info to move coflows across queues Minimal overhead for small flows
26
1.Can it approach clairvoyant solutions? 2.Can it scale gracefully? YES Evaluation A 3000-machine trace- driven simulation matched against a 100-machine EC2 deployment
27
On Par with Clairvoyant Approaches [EC2] Varys Per-Flow 1.93X1.18X 0.89X0.91X Comm. Improv.Job Improv.
28
Performance Breakdown [EC2] Improvements for small coflows by avoiding coordination Performance loss for medium coflows by mischeduling them Similar for large coflows because they are in slow-moving queues Aalo Varys
29
What About Scalability? [EC2]
30
Aalo Efficiently schedules coflows without complete information Makes coflows practical in presence of failures and DAGs Improved performance over flow-based approaches Provides a simple, non-blocking API https://github.com/coflow Mosharaf Chowdhury – mosharaf@umich.edu
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.