Running Multiple Schedulers in Kubernetes


Running Multiple Schedulers in Kubernetes
Xiaoning Ding, Principal Architect, Huawei

Agenda
- Default Scheduling in Kubernetes
- Why Multiple Schedulers
- How It Works in Kubernetes
- Improvements in Huawei HOSS
- Comparison and Summary
- Q & A

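Default Scheduling in Kubernetes
[Diagram: incoming pods, API Server, ETCD, Scheduler, and a Kubelet on each of Node 1, Node 2, ..., Node n]
1. Incoming pods arrive at the API Server.
2. The API Server persists them in ETCD.
3. Scheduler: watch pods with nodeName = "", then schedule and update nodeName (binding).
4. Kubelet: watch pods with nodeName = $nodeName, then launch pods.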

Default Scheduling in Kubernetes (Cont.)
Predicates: the scheduling rules that filter out unqualified nodes.
- PodFitsResources
- PodFitsPorts
- MatchNodeSelector
- …
Priorities: the scheduling rules that rank the remaining nodes according to preferences.
- ImageLocalityPriority
- BalancedResourceAllocation
- LeastRequestedPriority
- …
Scheduling policy: a combination of predicates and priorities.
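
The filter-then-rank flow described on this slide can be sketched in a few lines of Python. This is an illustration only, not the Kubernetes implementation: the Node and Pod classes, the schedule function, and the scoring formula are assumptions made for the example, and the functions only borrow the names of the predicates and priorities listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: float
    free_mem_gb: float
    labels: dict = field(default_factory=dict)


@dataclass
class Pod:
    name: str
    cpu: float
    mem_gb: float
    node_selector: dict = field(default_factory=dict)


# Predicates: return True if the node is acceptable for the pod.
def pod_fits_resources(pod, node):
    return pod.cpu <= node.free_cpu and pod.mem_gb <= node.free_mem_gb


def match_node_selector(pod, node):
    return all(node.labels.get(k) == v for k, v in pod.node_selector.items())


# Priority: higher score is better. Simplified stand-in (assumption):
# prefer the node with the most resources left after placement.
def least_requested(pod, node):
    return (node.free_cpu - pod.cpu) + (node.free_mem_gb - pod.mem_gb)


def schedule(pod, nodes, predicates, priorities):
    """Apply a scheduling policy: filter with predicates, rank with priorities."""
    feasible = [n for n in nodes if all(p(pod, n) for p in predicates)]
    if not feasible:
        return None  # no qualified node; the pod stays unscheduled
    return max(feasible, key=lambda n: sum(pr(pod, n) for pr in priorities))
```

A policy is then just the pair of lists passed to schedule, for example schedule(pod, nodes, [pod_fits_resources, match_node_selector], [least_requested]).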

Why Multiple Schedulers
Scenario: diverse workloads on a shared cluster
Better flexibility and maintainability
- Easier to try out new schedulers
- Easier to develop, maintain and evolve
Better availability and scalability
- Add, update or remove schedulers without downtime
- Ability to scale out schedulers at the process level

How It Works in Kubernetes: Job Dispatch
[Diagram: incoming pods, API Server, ETCD, Scheduler 1 and Scheduler 2, and a Kubelet on each of Node 1, Node 2, ..., Node n]
Pod annotation: scheduler.alpha.kubernetes.io/name = "Scheduler1"
Each scheduler is configured with its scheduler name.
1. Incoming pods arrive at the API Server.
2. The API Server persists them in ETCD.
3. Schedulers: watch pods with nodeName = "", check the scheduler name annotation and drop the pod if it does not match, then schedule and bind.
4. Kubelet: watch pods with nodeName = $nodeName, then launch pods.
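
A minimal sketch of the dispatch check in step 3, assuming pods are plain dictionaries with "name", "nodeName" and "annotations" keys. The function name pods_for_scheduler is hypothetical; only the annotation key is taken from the slide. How pods without the annotation are handled is not covered by the slide and is left out here.

```python
SCHEDULER_ANNOTATION = "scheduler.alpha.kubernetes.io/name"


def pods_for_scheduler(pods, scheduler_name):
    """Return the names of unbound pods addressed to this scheduler."""
    selected = []
    for pod in pods:
        if pod.get("nodeName", ""):
            continue  # already bound to a node, nothing to schedule
        wanted = pod.get("annotations", {}).get(SCHEDULER_ANNOTATION)
        if wanted != scheduler_name:
            continue  # annotation does not match: drop, another scheduler owns it
        selected.append(pod["name"])
    return selected
```

Each scheduler process would run the same check with its own configured name, so every unbound pod is picked up by at most one scheduler name.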

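How It Works in Kubernetes: Conflict Detection
[Diagram: incoming pods P1 and P2, API Server, ETCD, Scheduler 1, Scheduler 2, and the Kubelet on Node 1]
Incoming pods P1 and P2: Request.memory = 2 G
Node 1: Available.memory = 3 G
1. Incoming pods arrive at the API Server.
2. The API Server persists them in ETCD.
3. Scheduler 1 sets P1.nodeName = "node 1"; Scheduler 2 sets P2.nodeName = "node 1".
4. Kubelet: watch pods with nodeName = $nodeName, re-run all predicates, then launch pods.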
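
The numbers on this slide can be worked through with a small sketch. It assumes that each of P1 and P2 requests 2 G, that memory is the only predicate re-run, and that the kubelet handles the pods in binding order; kubelet_admit is a hypothetical name, not a Kubernetes function.

```python
def kubelet_admit(available_mem_gb, bound_pods):
    """Re-check memory for each bound pod, in order.

    bound_pods is a list of (pod_name, requested_mem_gb) tuples.
    Returns (admitted_names, rejected_names).
    """
    admitted, rejected = [], []
    for name, requested in bound_pods:
        if requested <= available_mem_gb:
            available_mem_gb -= requested  # the pod is launched and holds memory
            admitted.append(name)
        else:
            rejected.append(name)  # conflict is only detected here, on the node
    return admitted, rejected


# Both schedulers saw 3 G free on node 1 and bound a 2 G pod to it.
result = kubelet_admit(3.0, [("P1", 2.0), ("P2", 2.0)])
# result == (["P1"], ["P2"])
```

Because each scheduler decides against its own view of the node, the second binding looks valid until the kubelet re-runs the check.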

HOSS: What We Want to Improve
Job dispatch
- Instance-level load balancing
- Dynamic scheduling policy
Conflict resolution
- Early conflict detection and rescheduling
- Flexible conflict criteria
- Conflict resolution policies

HOSS: Job Dispatch
[Diagram: incoming pods, API Server, ETCD, Scheduler Controller, Scheduler Type 1 and Scheduler Type 2 with their instances (instance1 to instance4), and the Kubelet on a Node]
Pod annotations:
- scheduler.alpha.kubernetes.io/type = "Type1"
- scheduler.alpha.kubernetes.io/policy = "PolicyA"
The Scheduler Controller is configured with the scheduler types and instance names.
1. Incoming pods arrive at the API Server.
2. The API Server persists them in ETCD.
3. Scheduler Controller: assign scheduler instances dynamically by updating the "name" annotation.
4. Scheduler instances: watch and schedule pods based on the specified scheduling policy.
5. Kubelet: watch and launch pods.
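
Step 3 can be illustrated with a short sketch. The slide does not say how the Scheduler Controller balances load across instances, so round-robin is used here purely as an assumption; the SchedulerController class below is a hypothetical stand-in, and only the annotation keys come from the slides.

```python
import itertools

TYPE_ANNOTATION = "scheduler.alpha.kubernetes.io/type"
NAME_ANNOTATION = "scheduler.alpha.kubernetes.io/name"


class SchedulerController:
    """Assigns each pod to one instance of the scheduler type it asks for."""

    def __init__(self, instances_by_type):
        # One endless round-robin iterator per scheduler type (assumed policy).
        self._cycles = {
            sched_type: itertools.cycle(names)
            for sched_type, names in instances_by_type.items()
        }

    def assign(self, pod):
        sched_type = pod["annotations"][TYPE_ANNOTATION]
        # Writing the "name" annotation is what hands the pod to an instance.
        pod["annotations"][NAME_ANNOTATION] = next(self._cycles[sched_type])
        return pod
```

With a controller built from {"Type1": ["instance1", "instance2"]}, three consecutive Type1 pods would be assigned to instance1, instance2 and instance1 again.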

HOSS: Conflict Detection
[Diagram: incoming pods, API Server, ETCD, Conflict Resolver, Schedulers, and the Kubelet on Node 1]
- API Server: receives incoming pods (1) and persists them in ETCD (2).
- Schedulers: watch and schedule pods; re-schedule the current pod if binding failed.
- Conflict Resolver: perform conflict checks based on the specified criteria.
- Kubelet: watch pods with nodeName = $nodeName, re-run all predicates, then launch pods.

HOSS: Multi-level Conflict Criteria
- Strong: based on the node resource versioning mechanism. A version mismatch is a conflict.
- Weak: based on node resource quantity. Only resource insufficiency is a conflict.
- Customizable: based on a Boolean expression of node properties and pod properties. A negative evaluation result is a conflict.
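
The three criteria can be contrasted with a small sketch. The NodeState class and the function names are hypothetical, and memory stands in for "node resource quantity"; the slides do not show how HOSS represents any of this.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    resource_version: int  # bumped whenever the node's resources change
    free_mem_gb: float


def strong_conflict(version_seen_by_scheduler, node):
    """Strong: any change since the scheduler looked at the node is a conflict."""
    return version_seen_by_scheduler != node.resource_version


def weak_conflict(requested_mem_gb, node):
    """Weak: only a real shortage of the requested resource is a conflict."""
    return requested_mem_gb > node.free_mem_gb


def custom_conflict(expression, pod, node):
    """Customizable: a Boolean expression over pod and node; False means conflict."""
    return not expression(pod, node)
```

For a node at version 7 with 3 G free, a scheduler that decided at version 6 conflicts under the strong criterion even if its 2 G pod still fits, while the weak criterion accepts the same binding.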

HOSS: Conflict Resolution Policies
- Pod-priority-based conflict resolution
- Scheduler-priority-based conflict resolution
- Batch conflict resolution
- Group-based conflict resolution

HOSS: All Schedulers
[Diagram components: HOSS Multi-scheduler Framework, ETCD, Kubernetes Native Scheduler, Firmament Scheduler, Tarcil Scheduler, Mesos, Yarn]

Comparison: Multi-Schedulers in Mesos
[Diagram: Framework 1 and Framework 2 (each with a scheduler and tasks), the Mesos Master with its Allocation Module, and executors on Node 1 (Task1, Task2) and Node 2 (Task)]
1. <node1, 4cpu, 4gb, …>
2. <task 1, node1, 2cpu, 1gb, …> <task 2, node1, 1cpu, 2gb, …>
3. <framework1, task 1, 2cpu, 1gb, …> <framework2, task 2, 1cpu, 2gb, …>

Summary
Systems compared: Kubernetes, Mesos*, HOSS
- Concurrency Control: Optimistic / Pessimistic
- Scheduler's Resource View: Shared global state / Partial state
- Job Dispatch: Multi-type / N/A / Multi-instance / Dynamic policy
- Conflict Detection: Late detection / Early detection
- Conflict Model: Coarse-grained model / Fine-grained model
* Without the ongoing "Optimistic Offer" improvement (MESOS-1607)

References
- https://github.com/kubernetes/kubernetes/blob/master/docs/devel/scheduler.md
- https://github.com/kubernetes/kubernetes/blob/master/docs/devel/scheduler_algorithm.md
- https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/multiple-schedulers.md
- Firmament: http://www.firmament.io/blog/scheduler-architectures.html
- Tarcil: http://web.stanford.edu/~cdel/2014.insubmission.tarcil.pdf
- Omega: http://research.google.com/pubs/pub41684.html
- Borg, Omega and Kubernetes: http://research.google.com/pubs/pub44843.html
- Mesos optimistic offer: https://issues.apache.org/jira/browse/MESOS-1607
- An interview about different multi-scheduler architectures: https://kismatic.com/company/qa-with-malte-schwarzkopf-on-distributed-systems-orchestration-in-the-modern-data-center/

Thank You