SDN controller scalability issue
Fundamental issue: the speed gap between the data plane and the control plane.
[Figure: an SDN controller above several switch OS / switch HW pairs; switch hardware forwards at 10-100 Gbps while the control path up to the controller runs at only about 10-100 Mbps.]
SDN controller scalability issue
The data plane can overwhelm the control plane by design, in two ways:
1. Stress controller resources
2. Stress the control channel
The SDN controller therefore has a fundamental scalability issue.
Solution 1: Increase controller capacity -- distributed controllers
Flat structure with multiple controllers: Onix (OSDI'10), ONOS (HotSDN'14).
Solution 2: Reduce traffic to the controller -- hierarchical controllers
A root controller sits above local controllers placed close to the data plane.
Hierarchical controller design: Kandoo (HotSDN'12).
Solution 2: Reduce traffic to the controller -- offload control to the switch
Offload part of the control plane to the switch itself: DIFANE (SIGCOMM'10), DevoFlow (SIGCOMM'11).
Onix
Onix's view of network components:
Physical infrastructure: switches, routers, etc.
Connectivity infrastructure: channels for messages.
Onix: a distributed system running the controller.
Control logic: network management applications running on top of Onix.
Onix architecture
Onix NIB
Holds a collection of network entities.
Can be viewed as a centralized graph with a notification mechanism.
Updates to the NIB are asynchronous.
Onix NIB API
Query: find entities.
Create/destroy: create and remove entities.
Access attributes: inspect and modify entities.
Notification: receive updates about changes.
Synchronize: wait for updates to be exported to network elements and controllers.
Configuration: configure how state is imported to and exported from the NIB.
Pull: ask for entities to be imported on demand.
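As a rough Python sketch (not the actual Onix code, which is C++; all names here are illustrative), the NIB API surface above could be pictured as an interface like this:

# Hypothetical sketch of the Onix NIB API surface (names are illustrative).
from typing import Callable, Iterable

class NIB:
    def query(self, entity_type: str, **attrs) -> Iterable["Entity"]:
        """Find entities matching the given type and attribute filter."""
        raise NotImplementedError

    def create(self, entity_type: str, **attrs) -> "Entity":
        """Create a new entity node in the network graph."""
        raise NotImplementedError

    def destroy(self, entity) -> None:
        """Remove an entity from the graph."""
        raise NotImplementedError

    def register_notification(self, entity, callback: Callable) -> None:
        """Invoke callback when the entity changes (updates are asynchronous)."""
        raise NotImplementedError

    def synchronize(self, entity) -> None:
        """Wait until pending updates reach the switches and other controllers."""
        raise NotImplementedError

    def configure_import_export(self, policy) -> None:
        """Control how state is imported to and exported from the NIB."""
        raise NotImplementedError

    def pull(self, entity) -> None:
        """Ask for an entity's state to be imported on demand."""
        raise NotImplementedError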
Onix abstraction
Global view: observe and control a centralized network view (the NIB), which contains all physical network elements.
Flow: the first packet and subsequent packets with the same header are treated in the same way.
Switch: flow tables of <header: counters, actions>.
Event-based operation: controller operations are triggered by events from the switches/routers or by applications.
Onix API
The global view is represented as a network graph; nodes represent physical network entities.
Developers program over the network graph:
Write flow entries
List ports
Register for updates
...
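A hedged usage sketch of "programming over the network graph", built on a hypothetical NIB handle like the one sketched earlier; the entity attributes and callback signature are assumptions, not the real Onix API:

# Illustrative control-logic fragment over the network graph (hypothetical API).
def on_link_down(nib, link):
    """React to a link-state change delivered through the notification mechanism."""
    for switch in nib.query("switch", connected_to=link):
        ports = switch.attributes.get("ports", [])            # list ports
        if not ports:
            continue
        entry = {"match": {"in_port": ports[0]},               # write a flow entry
                 "actions": ["drop"]}
        switch.attributes.setdefault("flow_table", []).append(entry)
        nib.synchronize(switch)                                # push state to the element

def install(nib):
    for link in nib.query("link"):
        nib.register_notification(link, lambda l=link: on_link_down(nib, l))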
Network Information Base
The NIB is the focal point of the system:
State for applications to access.
External state changes are imported into it.
Local state changes are exported from it.
Onix scalability
A single physical controller won't work for a large network:
The NIB will overrun the memory of one server.
The CPU and bandwidth of one server will not be enough.
Onix solution: partition, aggregation, and consistency.
Partition: each Onix instance may have connections to only a subset of the network elements, and the network control logic can be configured to keep only a subset of the NIB in memory.
Onix scalability
Partition, aggregation, and consistency (continued).
Aggregation: each Onix instance can be configured to expose a subset of the elements in its NIB as a single aggregate element (reduced fidelity) to another Onix instance.
Consistency and durability: the control logic dictates the consistency requirements of the network state it manages.
Two storage options: replicated transactional (SQL) storage and a one-hop memory-based DHT.
The control logic resolves conflicts when necessary.
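A toy illustration of aggregation (all names and fields are hypothetical, not the real Onix data model): one instance summarizes its whole partition into a single coarse node before exporting it to a peer instance.

# Toy sketch: expose a subset of the local NIB as one aggregate element
# with reduced fidelity (hypothetical data model).
def aggregate_partition(local_entities, aggregate_name):
    """Summarize many local switches/links into one logical node."""
    total_capacity = sum(e.get("capacity_gbps", 0) for e in local_entities)
    return {
        "type": "aggregate-switch",
        "name": aggregate_name,
        "members": len(local_entities),
        "capacity_gbps": total_capacity,   # coarse view exported to peer instances
    }

local = [{"capacity_gbps": 40}, {"capacity_gbps": 40}, {"capacity_gbps": 10}]
print(aggregate_partition(local, "pod-3"))   # one node instead of three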
Onix reliability
Network element and link failures: the control logic reconfigures to deal with such failures.
Management connectivity infrastructure failures: assumed reliable (remember the Google B4 issue?).
Onix failures: distributed coordination facilities provide failover.
Onix Summary
Onix provides state distribution capability.
The developers of management applications still have to understand the scalability implications of their design.
One of the earlier SDN controllers: the controller functionality and the application functionality are not clearly partitioned.
Distributed SDN controllers: research issues?
Distributed SDN research issues
Network abstraction for distributed SDN:
Need a concrete understanding of the network abstractions in current systems.
Exploit existing distributed-systems techniques to address distributed network abstraction issues: consistency, usability, synchronization, fault tolerance, etc.
Adapt distributed-systems techniques specifically to the SDN controller; no need to reinvent the wheel.
ONOS: Towards an Open, Distributed SDN OS
Earlier NOSes dodged the distributed-systems issues; earlier distributed NOSes tried to reinvent the wheel.
ONOS is a second-generation distributed NOS that separates distributed-systems issues from network management issues.
We already know how to distribute and maintain information in a distributed manner, and many systems are available; a distributed NOS can use these existing distributed information systems and focus on network management issues.
Distributed system building blocks
Distributed storage systems: Cassandra, RAMCloud (in-memory storage)
Distributed graph database: Titan
Distributed event notification: Hazelcast
Distributed coordination service: ZooKeeper
Distributed system data structures and algorithms: distributed hash tables (DHTs), consensus algorithms, failure detectors, checkpointing, transactions
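One of the listed building blocks, a distributed hash table, can be sketched with consistent hashing in a few lines; this is purely illustrative and is not ONOS code (ONOS builds on off-the-shelf systems such as Hazelcast instead of reimplementing them).

# Minimal consistent-hashing sketch for a one-hop DHT (illustrative only).
import hashlib
from bisect import bisect

def _h(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=64):
        self._ring = sorted((_h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._keys = [k for k, _ in self._ring]

    def owner(self, key: str) -> str:
        """Return the controller instance responsible for this key."""
        idx = bisect(self._keys, _h(key)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["onos-1", "onos-2", "onos-3"])
print(ring.owner("switch:of:0000000000000042"))   # which instance owns this switch's state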
ONOS architecture
ONOS abstraction: global network view
ONOS summary
Uses existing distributed-system infrastructure.
Focuses on making it efficient for known distributed-system applications, e.g., how to maintain, look up, and update the topology effectively.
Kandoo: A framework for efficient and scalable offloading of control applications
Local Apps
Where to run the local apps
Kandoo
An example: Elephant flow rerouting
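A hedged sketch of the elephant-flow rerouting example in Kandoo terms: a local app next to each switch watches flow byte counts and only notifies the root controller when a flow crosses a threshold; the root app, which has the network-wide view, then computes a reroute. The threshold value, helper functions, and messaging layer below are hypothetical.

# Kandoo-style split (illustrative): local app filters events, root app reroutes.
ELEPHANT_BYTES = 10 * 1024 * 1024   # hypothetical threshold

def local_app_detect(flow_stats, notify_root):
    """Runs on the local controller; sees every flow, reports only elephants."""
    for flow, byte_count in flow_stats.items():
        if byte_count > ELEPHANT_BYTES:
            notify_root({"event": "elephant", "flow": flow, "bytes": byte_count})

def root_app_reroute(event, topology):
    """Runs on the root controller; has the network-wide view."""
    if event["event"] == "elephant":
        path = topology.least_loaded_path(event["flow"])   # hypothetical helper
        topology.install_path(event["flow"], path)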
Kandoo variations
Kandoo summary
Two levels of controllers.
Deals with the scalability issue by moving software closer to the data plane.
Future: a generalized hierarchy, filling the gap between local and non-local apps; finding the right scope is quite challenging.
DevoFlow
DevoFlow: Scaling flow management for high-performance networks, SIGCOMM 2011.
OpenFlow is good, but fine-grained per-flow management creates too much overhead: flow setup and statistics collection.
DevoFlow: a new paradigm that reduces control overhead while still providing fine-grained control for important flows.
Dilemma
Control dilemma:
Role of the controller is visibility and management capability; however, per-flow setup is too costly.
Flow-match wildcards (existing hardware) or hash-based matching: much less load, but no effective control.
Statistics-gathering dilemma:
Pull-based mechanism (counters of all flows): full visibility, but demands high bandwidth.
Wildcard counter aggregation: far fewer entries, but loses track of elephant flows.
DevoFlow aims to strike a balance in between.
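A back-of-envelope illustration of why per-flow setup and pull-based statistics stress the control channel; every number below is assumed for illustration only and is not a measurement from the DevoFlow paper.

# Illustrative arithmetic only; assumed numbers, not DevoFlow measurements.
new_flows_per_sec = 10_000        # assumed flow arrival rate at one switch
setup_msg_bytes   = 128           # assumed packet-in + flow-mod size per flow
stats_entries     = 50_000        # assumed flow-table entries polled
stats_entry_bytes = 88            # assumed bytes per counter record
poll_interval_s   = 1

setup_bw = new_flows_per_sec * setup_msg_bytes * 8 / 1e6                    # Mbps
stats_bw = stats_entries * stats_entry_bytes * 8 / poll_interval_s / 1e6    # Mbps
print(f"flow-setup control traffic ~{setup_bw:.1f} Mbps, "
      f"stats pulling ~{stats_bw:.1f} Mbps")   # both compete for a slow control channel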
Main Concept of DevoFlow
Devolve most flow control to the switches: use the default wildcard match.
Maintain partial visibility: keep track of significant flows.
Default vs. special actions:
Security-sensitive flows: categorically inspect.
Normal flows: may evolve into, or cover other flows that become, security-sensitive or significant flows.
Significant flows: special attention.
Collect statistics by sampling, triggering, and approximating.
Design Principles of DevoFlow
Try to stay in the data plane, by default.
Provide enough visibility, especially for significant and security-sensitive flows; otherwise, aggregate or approximate statistics.
Maintain the simplicity of switches.
Mechanisms
Control: rule cloning, local actions.
Statistics-gathering: sampling, triggers and reports, approximate counters.
Rule Cloning: identify elephant flows
The ASIC clones a wildcard rule as an exact-match rule for new microflows.
The cloned rule can carry its own timeout, and its output port can be chosen by probability.
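A hedged software sketch of rule cloning (the real mechanism lives in the switch ASIC; the table structures and CLONE flag representation below are illustrative): when a packet hits a wildcard rule flagged for cloning, the switch installs an exact-match rule for that microflow so later packets and per-flow counters stay in the data plane.

# Illustrative rule-cloning logic (software model of an ASIC feature).
def lookup_or_clone(exact_table, wildcard_rules, pkt_header):
    """Return the rule for this packet, cloning a wildcard rule on first match."""
    key = (pkt_header["src"], pkt_header["dst"], pkt_header["dport"])
    if key in exact_table:
        return exact_table[key]                       # microflow already tracked
    for rule in wildcard_rules:
        if rule["match"](pkt_header):
            if rule.get("clone"):                     # clone flag set by the controller
                exact_table[key] = {"actions": rule["actions"], "bytes": 0}
                return exact_table[key]               # per-flow counters from now on
            return rule
    return None                                       # table miss: send to controller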
Local Actions
Rapid re-routing: fallback paths are predefined, so the switch recovers almost immediately.
Multipath support: output port chosen from a probability distribution, adjusted by link capacity or load.
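A sketch of the two local actions with hypothetical rule structures: multipath forwarding picks an output port from a probability distribution the controller pre-installs, and rapid re-routing falls back to a pre-installed backup port when the chosen link is down.

# Illustrative local actions: probabilistic multipath + pre-installed fallback.
import random

def choose_port(multipath_ports, weights):
    """Pick an output port; weights can reflect link capacity or load."""
    return random.choices(multipath_ports, weights=weights, k=1)[0]

def output_port(rule, link_up):
    primary = choose_port(rule["ports"], rule["weights"])
    if link_up(primary):
        return primary
    return rule["fallback_port"]     # re-route locally, no controller round trip

rule = {"ports": [1, 2, 3], "weights": [40, 40, 10], "fallback_port": 4}
print(output_port(rule, link_up=lambda p: p != 2))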
Statistics-Gathering
Sampling: packet headers are sent to the controller with 1/1000 probability.
Triggers and reports: set a threshold per rule; when it is exceeded, enable flow setup at the controller.
Approximate counters: maintain a list of the top-k largest flows.
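A hedged sketch of the three statistics mechanisms; the trigger threshold, data structures, and reporting channel are assumptions for illustration (only the 1/1000 sampling probability comes from the slide).

# Illustrative statistics-gathering: sampling, a per-rule trigger, top-k counters.
import heapq
import random

SAMPLE_PROB = 1 / 1000            # send ~1 in 1000 packet headers to the controller
TRIGGER_BYTES = 1_000_000         # assumed per-rule threshold

def on_packet(header, nbytes, rule, send_to_controller):
    if random.random() < SAMPLE_PROB:
        send_to_controller(("sample", header))
    rule["bytes"] += nbytes
    if rule["bytes"] >= TRIGGER_BYTES and not rule.get("reported"):
        rule["reported"] = True
        send_to_controller(("trigger", rule["id"]))   # now worth a flow setup

def top_k_flows(counters, k=10):
    """Approximate 'largest flows' report built from per-flow byte counters."""
    return heapq.nlargest(k, counters.items(), key=lambda kv: kv[1])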
DevoFlow Summary
Per-flow control imposes too much overhead.
Balance between overheads and network visibility: effective traffic engineering and network management with switches of limited resources (flow entries, control-plane bandwidth, hardware capability, power consumption).