Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing
Presented by 심윤석 (72150263)
INDEX
- Background
- Goals & Challenges of Apollo
- Apollo Framework
- Evaluation
- Conclusion
Background
- SCOPE: a SQL-like scripting language; a SCOPE script compiles into a job
- Job: a DAG (directed acyclic graph) of stages
- Stage: a set of parallel tasks
Goals & Challenges
Goal: minimize job latency & maximize cluster utilization
Challenges:
- Scale
- Heterogeneous workloads
- Maximizing resource utilization
Goals & Challenges: Scale
- Jobs process GBs to PBs of data
- 100,000 scheduling requests/sec at peak
- Clusters contain over 20,000 servers
- Clusters run up to 170,000 tasks in parallel
Goals & Challenges: Heterogeneous Workloads
- Execution times from seconds (short) to hours (long)
- Mix of I/O-bound and CPU-bound tasks
- Varied resource requirements (e.g., memory, cores)
- Tension: data locality matters for long tasks, scheduling latency for short tasks
Goals & Challenges: Maximize Utilization
- Workload fluctuates regularly, especially CPU utilization
Apollo Framework
Apollo Framework: Distributed and Coordinated Scheduler
Apollo Framework: Estimation-Based Scheduling
Apollo Framework: Wait-Time Updates
Apollo Framework: Wait-Time Matrix
- Represents server load: for each server, the expected wait time for each resource quantity
- Lightweight summary that exposes future resource availability
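As an illustrative sketch (the shape, resource keys, and units here are assumptions, not Apollo's actual format), a wait-time matrix can be modeled as a per-server map from a requested resource quantity to the expected wait:

```python
# Hypothetical wait-time matrix for one server: expected wait time in
# seconds, indexed by requested (cpu_cores, memory_gb).
wait_times = {
    (1, 4): 0.0,    # a small task could start immediately
    (2, 8): 1.5,
    (4, 16): 6.0,   # larger requests must wait for resources to free up
}

def expected_wait(cores, mem_gb):
    """Look up the expected wait for a resource request; combinations
    not published in the matrix are treated as unavailable."""
    return wait_times.get((cores, mem_gb), float("inf"))

print(expected_wait(2, 8))   # 1.5
print(expected_wait(8, 32))  # inf
```

Because the matrix is just a small table per server, it is cheap to update and broadcast, which is what makes it "lightweight".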
Apollo Framework: Estimation-Based Scheduling
- Goal: minimize task completion time; tasks are matched to servers with a stable-matching algorithm
- Estimated task completion time:
  E = I + W + R
  where I = initialization time, W = wait time, R = runtime
- Final estimate includes server failure cost:
  C = P_succ · E + K · (1 − P_succ) · E
  where P_succ = success probability, K = server failure penalty
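With these definitions, picking a server reduces to minimizing C over the candidates. A minimal sketch (the server names, timing values, and penalty K below are illustrative assumptions, not Apollo's tuning):

```python
def completion_estimate(init, wait, runtime, p_succ, penalty_k=3.0):
    """C = P_succ * E + K * (1 - P_succ) * E, where E = I + W + R."""
    e = init + wait + runtime
    return p_succ * e + penalty_k * (1.0 - p_succ) * e

def pick_server(candidates):
    """candidates: {server: (I, W, R, P_succ)}; return the server
    minimizing the final estimated completion time C."""
    return min(candidates, key=lambda s: completion_estimate(*candidates[s]))

candidates = {
    # Remote server: almost no queueing, but data must be copied first,
    # so initialization and runtime are longer.
    "remote": (2.0, 0.5, 10.0, 0.99),
    # Local server: data is resident, but queued tasks mean a longer wait.
    "local": (0.1, 4.0, 6.0, 0.99),
}
print(pick_server(candidates))  # "local": locality wins despite the wait
```

This illustrates why a single metric like C is useful: it trades off wait time against data locality and failure risk in one number, instead of using separate heuristics for each.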
Apollo Framework: Distributed and Coordinated Scheduler
- One scheduler per job
- Each scheduler makes independent decisions based on (shared) global status
- Conflicts between schedulers can occur
Apollo Framework: Correcting Conflicts (Correction Mechanisms)
- Re-evaluate prior scheduling decisions as new information arrives
- Duplicate scheduling: when confidence in an estimate is low or a much better estimate appears elsewhere
- Randomization: scatter completion times so schedulers avoid repeated conflicts
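The two mechanisms can be sketched as follows (the margin and jitter values are assumptions for illustration, not Apollo's parameters):

```python
import random

def should_duplicate(current_estimate, best_alternative, margin=0.5):
    """Duplicate-scheduling check (sketch): when re-evaluating a prior
    decision, launch a second copy of the task only if a fresh estimate
    on the best alternative server beats the current one by a wide
    margin, so ordinary estimate noise doesn't trigger wasted duplicates."""
    return best_alternative < (1.0 - margin) * current_estimate

def perturbed(estimate, jitter=0.05):
    """Randomization (sketch): add a small random perturbation to an
    estimate so many schedulers acting on the same cluster snapshot
    don't all converge on the same server at once."""
    return estimate * (1.0 + random.uniform(-jitter, jitter))
```

The margin keeps the correction mechanism cheap: duplicates are the exception, fired only when the original decision looks clearly wrong in hindsight.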
Apollo Framework: Opportunistic Scheduling
- Opportunistic tasks consume only idle resources, maximizing utilization
- Dispatched with randomized scheduling for fairness
- Can be preempted by regular tasks
- Can be upgraded to regular tasks
- Run only when regular tasks leave resources unused
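The priority relationship above can be sketched with a toy server queue (illustrative only; not Apollo's actual data structures, and upgrade-to-regular is omitted for brevity):

```python
class ServerSlots:
    """Toy model of one server's task slots: regular tasks take
    priority, opportunistic tasks only soak up idle capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.running = []  # list of (task_name, is_regular)

    def submit(self, task, regular):
        """Try to place a task; returns True if it starts running."""
        if len(self.running) < self.capacity:
            # An idle slot may be used by either kind of task.
            self.running.append((task, regular))
            return True
        if regular:
            # A regular task may preempt a running opportunistic task.
            for i, (_, is_reg) in enumerate(self.running):
                if not is_reg:
                    self.running[i] = (task, True)
                    return True
        # Opportunistic tasks never preempt: idle resources only.
        return False
```

On a full server, a new opportunistic task is rejected, while a new regular task evicts an opportunistic one, which is exactly the "only consume idle resources / can be preempted" pair of rules.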
Evaluation
- Apollo at scale
- Scheduling quality
- Evaluating completion-time estimates
- Correction effectiveness
- Stable-matching efficiency
Evaluation: Apollo at Scale
- Runs 170,000 tasks in parallel
- Tracks 14,000,000 pending tasks
- Well utilized on weekdays (90% median CPU utilization)
Evaluation: Scheduling Quality
- 80% of recurring jobs got faster
- Significantly improved wait times
- Performance close to an oracle scheduler (no scheduling latency, conflicts, failures, …)
Evaluation: Evaluating Completion-Time Estimates
Evaluation: Correction Effectiveness
- 82% success rate
- < 0.5% trigger rate
Evaluation: Stable-Matching Efficiency
Conclusion
- Minimizes job latency: loosely coordinated distributed schedulers deliver high-quality scheduling
- Maximizes cluster utilization: opportunistic scheduling soaks up idle resources
References
- Boutin et al., "Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing," OSDI '14. https://www.usenix.org/conference/osdi14/technical-sessions/presentation/boutin
- OSDI '14 presentation slides: https://www.usenix.org/sites/default/files/conference/protected-files/osdi14_slides_boutin.pdf