Published by Robert Archibald Beasley. Modified over 5 years ago.
1
Job-aware Scheduling in Eagle: Divide and Stick to Your Probes
Pamela Delgado, Diego Didona, Florin Dinu, Willy Zwaenepoel
2
I. Data-center scheduling
[Diagram: jobs 1 … N, each a set of tasks, a scheduler, and a cluster]
Notes: The context of this presentation is data-center scheduling.
Introduction Eagle: Divide Eagle: Stick to Your Probes Evaluation Conclusion
3
I. Data-center scheduling challenges
Heterogeneous workloads: short vs. long tasks
Problem: head-of-line blocking (short behind long)
[Diagram: a queue holding Short, Long, Short, Short tasks]
Notes: In data-center scheduling we face some challenges. Workloads combine tasks that have a long execution time with tasks that have a short execution time. For the purpose of this talk, a job that has short tasks is called a short job.
4
I. Data-center scheduling challenges
Scheduler-induced stragglers
Problem: non-job-aware scheduling
Large scale: tens of thousands of tasks/second … tens of thousands …
[Diagram: task 1 … task n of a job on a time axis across the cluster; task x finishes last and sets the job completion time]
Notes: When one task finishes later than the others, job completion time suffers. Schedulers schedule at the task level, which leads to non-job-aware scheduling. Scale is meant both in terms of cluster size and in terms of load.
5
II. Eagle Contributions
Divide: novel technique to avoid head-of-line blocking
Stick to Your Probes: decentralized job-awareness
Hybrid scheduler: both techniques sit on top of a hybrid scheduler to get the necessary scalability
Notes: So what is hybrid scheduling? Hybrid means a mix of centralized and distributed scheduling. How does it work?
6
I. Hybrid scheduling: long centralized
[Diagram: a centralized scheduler places long tasks (L) on cluster nodes]
7
I. Hybrid scheduling: short distributed
[Diagram: several distributed schedulers send probes for a short task (s) to cluster nodes, some of which hold long tasks (L)]
Notes: not use late binding
8
II.1. Problem: Head-of-line blocking
Short behind long
High likelihood (long = many resources)
[Diagram: a long task at the head of the queue with three short tasks behind it]
Notes: A short task is enqueued behind a long task that is either in the queue or running.
9
II.1. Rationale for Divide
Expected completion time of a task grows linearly with the variance of task execution times*
DIVIDE by execution time
[Diagram: long tasks and short tasks separated into two groups]
*Pollaczek-Khinchine formula: Queueing Systems, Volume 1: Theory. L. Kleinrock, 1975
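The rationale above can be checked numerically. The sketch below is an illustration only, assuming a single M/G/1 queue and a hypothetical workload (90% one-second tasks, 10% hundred-second tasks); neither the numbers nor the function names come from the talk. It uses the Pollaczek-Khinchine mean waiting time, W = λ·E[S²] / (2(1 − ρ)), with E[S²] = Var[S] + E[S]².

```python
def pk_mean_wait(arrival_rate, mean_service, var_service):
    """Mean time a task waits in an M/G/1 queue (Pollaczek-Khinchine)."""
    utilization = arrival_rate * mean_service
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    second_moment = var_service + mean_service ** 2
    return arrival_rate * second_moment / (2 * (1 - utilization))


def mixture_moments(classes):
    """Mean and variance of a mix of (probability, fixed duration) task classes."""
    mean = sum(p * d for p, d in classes)
    second_moment = sum(p * d * d for p, d in classes)
    return mean, second_moment - mean ** 2


# Hypothetical workload: 90% short tasks (1 s), 10% long tasks (100 s).
mixed_mean, mixed_var = mixture_moments([(0.9, 1.0), (0.1, 100.0)])
# The same queue if it only ever sees the short tasks.
short_mean, short_var = mixture_moments([(1.0, 1.0)])

# Compare both queues at the same utilization (assumed: 50%).
target_utilization = 0.5
mixed_wait = pk_mean_wait(target_utilization / mixed_mean, mixed_mean, mixed_var)
short_wait = pk_mean_wait(target_utilization / short_mean, short_mean, short_var)

print(f"mixed queue:      mean wait {mixed_wait:.1f} s")
print(f"short-only queue: mean wait {short_wait:.1f} s")
```

At equal utilization the mixed queue makes every task, including the short ones, wait far longer, which is the motivation for dividing tasks by execution time.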
10
II.1. Dynamic division
[Diagram: nodes divided dynamically between long tasks and short tasks]
11
II.1. Eagle – Divide
IDEA: Dynamic partitioning
Succinct State Sharing
* Centralized: send bitmap of nodes with long tasks
* Distributed: based on the bitmap, avoid nodes with long tasks
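A minimal sketch of the bitmap idea, assuming nodes are numbered from 0 and the bitmap is a plain integer with one bit per node. The function names and the uniform random choice of probe targets are illustrative assumptions, not Eagle's actual implementation.

```python
import random


def build_long_bitmap(nodes_with_long_tasks):
    """Centralized side: set bit i if node i currently holds a long task."""
    bitmap = 0
    for node in nodes_with_long_tasks:
        bitmap |= 1 << node
    return bitmap


def has_long_task(bitmap, node):
    """Distributed side: test one node's bit."""
    return (bitmap >> node) & 1 == 1


def pick_probe_targets(bitmap, num_nodes, num_probes, rng):
    """Choose probe targets only among nodes the bitmap marks as free of long tasks."""
    candidates = [n for n in range(num_nodes) if not has_long_task(bitmap, n)]
    if len(candidates) <= num_probes:
        return candidates
    return rng.sample(candidates, num_probes)


# Hypothetical 8-node cluster where nodes 1, 3 and 6 run long tasks.
bitmap = build_long_bitmap([1, 3, 6])
targets = pick_probe_targets(bitmap, num_nodes=8, num_probes=2, rng=random.Random(0))
print(bin(bitmap), targets)
```

One bit per node keeps the shared state small, so it can be sent to every distributed scheduler without burdening the centralized one.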
12
II.1. Eagle – Divide
[Diagram: a centralized scheduler and several distributed schedulers; probes that reach nodes holding long tasks (L) are rejected and rescheduled]
13
II.1. Eagle – Divide
No head-of-line blocking
Dynamic: mitigate resource wastage
Scalable: no burden on centralized scheduler
Succinct: bitmap
Notes: Because it is dynamic, we mitigate resource wastage.
14
II.2. Problem: stragglers
[Diagram: a distributed scheduler with task 1 and task 2 sends probes to nodes n1 to n4; a task is still waiting to execute while another node is free]
Notes: Completely distributed schedulers, as in Hawk, Sparrow and Tarcil, send random probes to nodes.
15
II.2. Rationale
Expected completion time of a job is proportional to the number of jobs in the system*
Better to finish one job entirely than to execute many jobs partially
[Diagram: Job 1 … Job N, each made up of tasks]
*Little's formula: A proof for the queuing formula: L = λW. J.D.C. Little, 1961
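A small worked example of this rationale, under stated assumptions: Little's formula L = λW is rearranged to W = L / λ, and the toy schedule uses one node, two hypothetical jobs A and B, and unit-length tasks. None of these numbers come from the talk.

```python
def mean_time_in_system(mean_jobs_in_system, arrival_rate):
    """Little's formula L = lambda * W, solved for W."""
    if arrival_rate <= 0:
        raise ValueError("arrival rate must be positive")
    return mean_jobs_in_system / arrival_rate


def mean_job_completion(schedule):
    """Mean job completion time on one node.

    `schedule` lists the job id of each unit-length task in execution order;
    a job completes when its last task finishes.
    """
    finish = {}
    for slot, job in enumerate(schedule, start=1):
        finish[job] = slot
    return sum(finish.values()) / len(finish)


# Same total work, two orderings.
interleaved = ["A", "B", "A", "B"]    # both jobs executed partially
one_at_a_time = ["A", "A", "B", "B"]  # job A finished entirely first

print(mean_job_completion(interleaved))    # A ends at 3, B at 4
print(mean_job_completion(one_at_a_time))  # A ends at 2, B at 4
print(mean_time_in_system(mean_jobs_in_system=20, arrival_rate=4))
```

Finishing job A first removes it from the system earlier without delaying job B, so the mean completion time drops.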
16
II.2. Eagle - Stick to Your Probes
IDEA: Get a job out of the system ASAP
Sticky Batch Probing
* Probe STICKS to a node.
* Probe can execute more tasks.
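The effect of a sticky probe can be sketched with a toy model. It assumes one probe per node, that a task always goes to whichever probe becomes free first, and that a non-sticky probe runs exactly one task. The function name and all numbers are hypothetical; this is not Eagle's implementation.

```python
import heapq


def job_completion_time(node_free_times, task_durations, sticky):
    """Completion time of one job whose probes sit on the given nodes.

    node_free_times[i] is when the probe on node i reaches the head of
    that node's queue. A sticky probe stays on its node and keeps pulling
    tasks of the same job; a non-sticky probe runs a single task.
    """
    probes = [(free_at, node) for node, free_at in enumerate(node_free_times)]
    heapq.heapify(probes)
    completion = 0.0
    for duration in task_durations:
        if not probes:
            raise ValueError("more tasks than probes and probes are not sticky")
        free_at, node = heapq.heappop(probes)
        finish = free_at + duration
        completion = max(completion, finish)
        if sticky:
            # The probe sticks: the node asks the job for another task.
            heapq.heappush(probes, (finish, node))
    return completion


# Hypothetical job: three 1 s tasks, probes on two idle nodes and one busy node.
free_times = [0.0, 0.0, 10.0]
tasks = [1.0, 1.0, 1.0]
print(job_completion_time(free_times, tasks, sticky=False))
print(job_completion_time(free_times, tasks, sticky=True))
```

Without stickiness the third task waits for the busy node and becomes a straggler; with stickiness an idle node picks it up and the job leaves the system early.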
17
II.2. Eagle - Stick to Your Probes
[Diagram: a distributed scheduler with task 1 and task 2 sends a probe to one of the nodes n1 to n4; the probe STICKS there]
18
II.2. Eagle – Stick to Your Probes
Job-awareness
Straggler mitigation
Decentralized
Notes: end on a high note
19
II. Eagle – Recap
Divide
* dynamically divide nodes for short/long tasks
Stick to your probes
* probe sticks to the node
* able to execute more tasks
Hybrid scheduler
* Queue reorder: Shortest Remaining Processing Time (SRPT)
Notes: Related work has shown the advantages of queue reordering.
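SRPT queue reordering can be sketched as a priority queue keyed on remaining processing time. The class below is a generic illustration with hypothetical names; the talk does not say how Eagle estimates remaining time or whether it bounds reordering to avoid starvation.

```python
import heapq
import itertools


class SrptQueue:
    """Node queue that always serves the entry with the least remaining time."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so entries with equal remaining time keep arrival order.
        self._arrival = itertools.count()

    def push(self, task_id, remaining_time):
        heapq.heappush(self._heap, (remaining_time, next(self._arrival), task_id))

    def pop(self):
        remaining_time, _, task_id = heapq.heappop(self._heap)
        return task_id, remaining_time

    def __len__(self):
        return len(self._heap)


queue = SrptQueue()
queue.push("long-1", 50.0)
queue.push("short-1", 1.0)
queue.push("short-2", 3.0)

order = [queue.pop()[0] for _ in range(len(queue))]
print(order)  # short tasks are served before the long one
```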
20
III. Evaluation - simulation
Event-driven simulator
Google trace: half a million jobs
15,000 nodes
Measure: job running time
Report: 50th, 90th and 99th percentiles for short jobs
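For reference, the reported percentiles could be computed from a list of job running times as in the sketch below. It assumes the nearest-rank definition of a percentile; the talk does not state which definition the evaluation uses, and the sample values are made up.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("need at least one value")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical short-job running times in seconds.
running_times = [4, 1, 9, 2, 7, 3, 10, 5, 8, 6]
report = {p: percentile(running_times, p) for p in (50, 90, 99)}
print(report)
```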
21
III.A. Hawk
Hybrid scheduler
Work stealing
* free nodes steal tasks from other nodes
* tries to avoid head-of-line blocking
Notes: This does not really avoid head-of-line blocking, as we will see.
22
III.A. Eagle vs Hawk
[Plot: short job running times; lower is better]
Better across the board
Notes: We show only short jobs because long jobs are scheduled in the same Least Work Left (LWL) fashion in both systems.
23
III.A. Eagle vs Hawk
Why are we better?
[Table: Eagle vs Hawk. Rows: avoids head-of-line blocking, job-aware scheduler, queue reordering. Surviving cell values: none, some]
Partitioning + stealing do not get rid of all short-behind-long cases
Stealing is randomized
24
III.B. State-of-the-art (SOTA)
[Apollo+] Schedule all jobs in Least Work Left (LWL) fashion
[Apollo+] Distributed: waiting times updated at heartbeat interval (Google: 3 s)
[Yaq-d*] Queue reordering: SRPT
+ Apollo: Scalable and coordinated scheduling for cloud-scale computing. E. Boutin et al. OSDI'14
* Efficient queue management for cluster scheduling. J. Rasley et al. EuroSys'16
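Least Work Left placement can be sketched as follows, assuming the scheduler knows the exact queued work on every node. This is a generic illustration with hypothetical names, not Apollo's code; in the distributed setting above the waiting times are only refreshed at each heartbeat, so the values a scheduler sees may be stale.

```python
def pick_least_work_left(work_left_by_node):
    """Return the node with the smallest amount of queued work (first one on ties)."""
    return min(work_left_by_node, key=work_left_by_node.get)


def assign_lwl(task_durations, node_ids):
    """Place tasks one by one on the node with the least work left."""
    work_left = {node: 0.0 for node in node_ids}
    placement = []
    for duration in task_durations:
        node = pick_least_work_left(work_left)
        placement.append(node)
        work_left[node] += duration
    return placement, work_left


# Hypothetical example: one 5 s task followed by three 1 s tasks on two nodes.
placement, work_left = assign_lwl([5.0, 1.0, 1.0, 1.0], ["n1", "n2"])
print(placement)
print(work_left)
```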
25
III.B. Eagle vs SOTA
[Plot: short job running times at lower and higher loads; lower is better]
Better across the board
Better at higher loads
The same at lower loads
26
III.B. Eagle vs SOTA
Why are we better?
Eagle: more flexible task assignment
SOTA: task assigned to one node
SOTA heartbeats: stale information
SOTA: concurrent scheduling
27
III. Evaluation - Implementation
Spark plug-in
100-node cluster
Subset of Google trace
Measure: job running time
Report: 50th, 90th and 99th percentiles for short jobs
Compare to Hawk
Notes: The other system was not available to us.
28
III. Evaluation - Implementation
[Plot: subset of Google trace; lower is better]
Eagle works well in a real cluster
Better at higher loads
The same at lower loads
29
IV. Conclusion
Eagle: new techniques
Succinct State Sharing (Divide)
* No head-of-line blocking
Sticky Batch Probing (Stick to Your Probes)
* Job-aware
Notes: Two new techniques improve the scheduling of data-parallel jobs in data centers. SSS dynamically divides nodes into long/short partitions. SBP: a probe sticks until the job is done.