Download presentation
Presentation is loading. Please wait.
Published byJodie Daniel Modified over 8 years ago
1
Large-scale cluster management at Google with Borg By: Abhishek Verma, Luis Pedrosa, Madhukar Korupolu, David Oppenheimer, Eric Tune, John Wilkes Presented by: Xianghan Pei EECS 582 – W161
2
Outline Introduction & Motivation The User Perspective Borg Architecture Scalability & Availability Utilization Isolation Lessons Future and Related Work EECS 582 – W162
3
Introduction & Motivation Introduction Borg: An OS of Cluster(Datacenter) Motivation Hide the details(programmer focus on App) Provide resource sharing Provide high reliability and availability for Cluster EECS 582 – W163 The high-level architecture of Borg
4
The User Perspective Workload production(prod) VS non-production(non-prod) Cells Heterogeneous, around 10 k machines in median Jobs and Tasks EECS 582 – W164 > borgcfg.../hello_world_webserver.borg up
5
The User Perspective Allocs Reserved set of resources Priority, Quota, and Admission Control Job has a priority (preempting) Quota is used to decide which jobs to admit for scheduling Naming and Monitoring 50.jfoo.ubar.cc.borg.google.com Monitoring health of the task and thousands of performance metrics EECS 582 – W165
6
Borg Architecture Borgmaster Main Borgmaster process & Scheduler Five replicas Borglet Manage and monitor tasks and resource Borgmaster polls Borglet every few seconds EECS 582 – W166 The high-level architecture of Borg(A Cell)
7
Borg Architecture Scheduling feasibility checking: find machines Scoring: pick one machines User preferences & build-in criteria E-PVM VS best-fit Tradeoff EECS 582 – W167 The high-level architecture of Borg(A Cell)
8
Scalability Separate scheduler Separate threads to poll the Borglets Partition functions across the five replicas Score caching Equivalence classes Relaxed randomization EECS 582 – W168 The high-level architecture of Borg(A Cell)
9
Availability System Borgmaster: Five replicas Borglet: Borgmaster Take checkpoint Job Reschedules evicted tasks Spreading tasks across failure domian … EECS 582 – W169
10
Utilization Cell Sharing Segregating prod and non-prod work into different cells would need more machines EECS 582 – W1610
11
Utilization Cell Size Subdividing cells into smaller ones would require more machines EECS 582 – W1611
12
Utilization Fine-grained resource requests EECS 582 – W1612 No bucket sizes fit most of the tasks wellBucketing resource would need more machines
13
Utilization Resource reclamation estimate how many resources a task will use and reclaim the rest for work Kill non-prod if not available EECS 582 – W1613
14
Utilization Resource reclamation Choose medium EECS 582 – W1614
15
Isolation Security isolation chroot jail as the primary security isolation mechanism Performance isolation High-priority LV task(prod) get best treatment compressible resources: reclaimed non-compressible resources: kill or remove task EECS 582 – W1615
16
Lessons Bad Jobs are restrictive as the only grouping mechanism for tasks One IP address per machine complicates things Optimizing for power users at the expense of casual ones Good Allocs are useful Cluster management is more than task management Introspection is vital The master is the kernel of a distributed system EECS 582 – W1616
17
Future Work Kubernetes Evolved from Borg An open-source system for automating deployment, operations, and scaling of containerized applications Pods: groups of containers Labels Replica controller Services EECS 582 – W1617
18
Relative Work Apache Mesos YARN Tupperware EECS 582 – W1618
19
Questions? EECS 582 – W1619
20
References Borg Paper Borg Slides at InfoQBorg Slides EECS 582 – W1620
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.