
OpenFlow Switch Limitations

Background: Current Applications
Traffic engineering (performance)
– Fine-grained rules and short time scales
– Coarse-grained rules and long time scales
Middlebox provisioning (performance + security)
– Fine-grained rules and long time scales
Network services
– Load balancer: fine-grained / short time scales
– Firewall: fine-grained / long time scales
Cloud services
– Fine-grained / long time scales

Background: Switch Design
[Figure: switch architecture — TCAM and hash table in the data plane (~250 Gbps), the switch CPU+memory above them, and the network controller; the internal and control channels are labeled 13 Mbps and 35 Mbps, orders of magnitude slower than the data plane]

OpenFlow Background: Flow Table Entries
OpenFlow rules match on 14 header fields
– Usually stored in TCAM (which is much smaller)
– Generally 1K-10K entries
Conventional switches
– 100K-1000K entries
– Match on only 1-2 fields
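To make the contrast concrete, here is a hedged sketch (it assumes the Ryu OpenFlow library and OpenFlow 1.3; the field values are invented): a multi-field OpenFlow match, which needs TCAM-style wildcard matching, next to the one-field exact match a conventional switch resolves with a hash table.

    # Illustrative sketch (assumes the Ryu framework, OpenFlow 1.3).
    # An OpenFlow match can name many header fields at once; unspecified
    # fields are wildcarded, which is why such rules end up in TCAM.
    from ryu.ofproto.ofproto_v1_3_parser import OFPMatch

    # Multi-field OpenFlow-style match: needs TCAM wildcard matching.
    of_match = OFPMatch(in_port=1,
                        eth_type=0x0800,      # IPv4
                        ipv4_src='10.0.0.1',
                        ipv4_dst='10.0.0.2',
                        ip_proto=6,           # TCP
                        tcp_dst=80)

    # Conventional L2 forwarding: one field, exact match, so a plain
    # hash table suffices and can hold 100K+ entries cheaply.
    l2_table = {'aa:bb:cc:dd:ee:01': 1, 'aa:bb:cc:dd:ee:02': 2}
    out_port = l2_table.get('aa:bb:cc:dd:ee:02')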

Background: Switch Design TCAM Switch CPU+Mem Hash Table Network Controller Network Controller 13Mbs 35Mbs 250GB

OpenFlow Background: Network Events
Packet_In (flow-table miss: packet matches no rule)
– Asynchronous, from switch to controller
Flow_mod (insert flow-table entries)
– Asynchronous, from controller to switch
Flow_timeout (flow was removed due to timeout)
– Asynchronous, from switch to controller
Get flow statistics (information about current flows)
– Synchronous, between switch and controller
– The controller sends a request; the switch replies
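Below is a minimal sketch of a controller handling these events, assuming the Ryu framework and OpenFlow 1.3; the match, the flood action, and the priority are illustrative choices, while the 10-second timeout mirrors the examples that follow.

    # Minimal reactive-controller sketch (assumes the Ryu framework and
    # OpenFlow 1.3; the match and flood action are illustrative).
    from ryu.base import app_manager
    from ryu.controller import ofp_event
    from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
    from ryu.ofproto import ofproto_v1_3

    class ReactiveApp(app_manager.RyuApp):
        OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

        @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
        def packet_in_handler(self, ev):
            msg = ev.msg            # the Packet_In the switch sent on a miss
            dp = msg.datapath
            ofp, parser = dp.ofproto, dp.ofproto_parser

            # Illustrative match: future packets arriving on the same port.
            match = parser.OFPMatch(in_port=msg.match['in_port'])
            actions = [parser.OFPActionOutput(ofp.OFPP_FLOOD)]
            inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS,
                                                 actions)]

            # Flow_mod: install the rule with the slides' 10-second timeout
            # and ask for a Flow_timeout (flow-removed) when it expires.
            dp.send_msg(parser.OFPFlowMod(
                datapath=dp, priority=1, match=match, instructions=inst,
                idle_timeout=10, flags=ofp.OFPFF_SEND_FLOW_REM))
            # (Re-sending the buffered packet via Packet_Out is omitted.)

Note the division of labor: the switch only reports the miss; every forwarding decision lives in this handler, which is exactly why the controller ends up on the critical path of flow setup.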

Background: Switch Design
[Figure: a packet "From: Theo, To: Bruce" arrives at a switch whose flow table is empty]
1. Check the flow table; if there is no match, inform the CPU
2. The CPU creates a Packet_In event and sends it to the controller
3. The controller runs code to process the event

Background: Switch Design
[Figure: the controller replies with a rule — "From: theo, to: bruce, send on port 1; Timeout: 10 secs, count: 0"]
4. The controller creates a flow entry and sends it in a flow_mod message
5. The CPU processes the flow_mod and inserts the rule into the TCAM

Background: Switch Design
[Figure: same as above — the buffered packet is forwarded and the rule's counter increments to 1]

Background: Switch Design
[Figure: another "From: Theo, To: Bruce" packet arrives; the TCAM holds "From: theo, to: bruce, send on port 1; Timeout: 10 secs, count: 1"]
1. Check the flow table
2. A matching rule is found
3. Forward the packet
4. Update the counter

Background: Switch Design
[Figure: a packet "From: Theo, To: John" arrives; the TCAM only holds the theo-to-bruce rule]
1. Check the flow table
2. No matching rule... now we must talk to the controller again

Background: Switch Design
[Figure: the controller has instead installed a wildcard rule — "From: theo, to: ***, send on port 1; Timeout: 10 secs, count: 1"]
1. Check the flow table
2. The wildcard rule matches
3. Forward the packet
4. Update the counter

Background: Switch Design
[Figure: a packet "From: Theo, To: Cathy" arrives and also matches the wildcard rule]
1. Check the flow table
2. The wildcard rule matches
3. Forward the packet
4. Update the counter

Background: Switch Design
Problems with wildcards
– Too general
– Can't see the details of individual flows
– Hard to do anything fine-grained
[Figure: the switch now holds only the wildcard rule "From: theo, to: ***, send on port 1"]

Background: Switch Design
Doing fine-grained things — think Hedera
– Find all elephant flows
– Put elephant flows on different paths
How to do this?
– The controller sends a get-stats request
– The switch responds with all of its flow statistics
– The controller goes through each entry to find the elephant flows
– The controller installs special paths for them
[Figure: per-flow rules — theo-to-bruce (timeout 1 sec, count 1K) and theo-to-john (timeout 10 secs, count 1), both sending on port 1]

Background: Switch Design
[Figure: same as above, after rerouting — the theo-to-bruce elephant flow now goes out port 3 while theo-to-john stays on port 1]
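A hedged sketch of that polling loop, again assuming Ryu and OpenFlow 1.3; the 10 MB elephant threshold and the 5-second interval are illustrative assumptions, not Hedera's actual parameters.

    # Hedera-style polling sketch (assumes Ryu, OpenFlow 1.3; threshold
    # and interval are illustrative assumptions).
    from ryu.base import app_manager
    from ryu.controller import ofp_event
    from ryu.controller.handler import (MAIN_DISPATCHER, DEAD_DISPATCHER,
                                        set_ev_cls)
    from ryu.lib import hub

    ELEPHANT_BYTES = 10 * 1024 * 1024   # assumed elephant-flow threshold

    class StatsPoller(app_manager.RyuApp):
        def __init__(self, *args, **kwargs):
            super(StatsPoller, self).__init__(*args, **kwargs)
            self.datapaths = {}
            hub.spawn(self._poll_loop)

        @set_ev_cls(ofp_event.EventOFPStateChange,
                    [MAIN_DISPATCHER, DEAD_DISPATCHER])
        def _state_change(self, ev):
            # Track connected switches so the poll loop can reach them.
            if ev.state == MAIN_DISPATCHER:
                self.datapaths[ev.datapath.id] = ev.datapath
            else:
                self.datapaths.pop(ev.datapath.id, None)

        def _poll_loop(self):
            while True:
                for dp in self.datapaths.values():
                    # Get-stats request: the switch replies with ALL entries.
                    dp.send_msg(dp.ofproto_parser.OFPFlowStatsRequest(dp))
                hub.sleep(5)   # assumed 5-second polling interval

        @set_ev_cls(ofp_event.EventOFPFlowStatsReply, MAIN_DISPATCHER)
        def _stats_reply(self, ev):
            for stat in ev.msg.body:
                if stat.byte_count > ELEPHANT_BYTES:
                    # Elephant found: here the controller would install a
                    # special path (flow_mod) to reroute it.
                    self.logger.info('elephant flow: %s (%d bytes)',
                                     stat.match, stat.byte_count)

Every poll returns every flow entry from every switch, which is exactly the cost the next slide quantifies.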

Problems with Switches
The TCAM is very small, so it can only support a small number of rules
– Only ~1K rules per switch, while end hosts generate many more flows
Having the controller install an entry for each flow increases latency
– It takes about 10 ms to install a new rule, so the flow must wait
– Rules can be installed at ~13 Mbps, but traffic arrives at ~250 Gbps
Having the controller collect stats for all flows takes a lot of resources
– For about 1K flow entries, you need about ___ MB
– If you request stats every 5 seconds, the total comes to ___


Getting Around the TCAM Limitation
Cloud-centric solutions
– Use placement tricks
Data-center-centric solutions
– Use an overlay of virtual switches (plus placement tricks)
General technique: DIFANE
– Use detour routing

DIFANE

DIFANE creates a hierarchy of switches
– Authority switches
  - Lots of memory
  - Collectively store all the rules
– Local switches
  - Small amount of memory
  - Store a few rules
For unknown flows, route the traffic through an authority switch

Packet Redirection and Rule Caching
[Figure: the first packet is redirected from the ingress switch to an authority switch, which forwards it toward the egress switch and sends feedback that caches the matching rule at the ingress switch; following packets hit the cached rule and are forwarded directly]

Three Sets of Rules in TCAM

    Type            | Priority | Field 1 | Field 2 | Action                         | Timeout
    Cache rules     | 210      | 00**    | 111*    | Forward to Switch B            | 10 sec
    Cache rules     | ...      | ...     | **      | Drop                           | 10 sec
    Authority rules | 110      | 00**    | 001*    | Forward; trigger cache manager | Infinity
    Authority rules | ...      | ...     | ***     | Drop; trigger cache manager    | ...
    Partition rules | 15       | 0***    | 000*    | Redirect to auth. switch       | ...
    Partition rules | 14       | ...     | ...     | ...                            | ...

Cache rules: in ingress switches, reactively installed by authority switches.
Authority rules: in authority switches, proactively installed by the controller.
Partition rules: in every switch, proactively installed by the controller.
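A self-contained sketch of how the three rule sets coexist in one priority-ordered lookup (plain Python; the bit patterns echo the table above, the rest is invented): cache rules win when present, authority rules forward and trigger caching, and the low-priority partition rules catch every remaining packet by redirecting it, so misses stay in the data plane.

    # Illustrative sketch of DIFANE's three rule sets sharing one TCAM.
    # Rules are (priority, match_prefix, action); the highest priority
    # wins, mirroring the table above. Values are illustrative.

    def matches(prefix, bits):
        """TCAM-style match: '*' in the prefix matches any bit."""
        return all(p in ('*', b) for p, b in zip(prefix, bits))

    tcam = [
        (210, '00**111*', 'forward to Switch B'),           # cache rule
        (110, '00**001*', 'forward + trigger cache mgr'),   # authority rule
        (15,  '0***000*', 'redirect to authority switch'),  # partition rule
    ]

    def lookup(header_bits):
        # Scan in priority order; the partition rule is the fallback that
        # keeps misses in the data plane instead of at the controller.
        for prio, prefix, action in sorted(tcam, reverse=True):
            if matches(prefix, header_bits):
                return action
        return 'drop'

    print(lookup('00001110'))  # hits the cache rule
    print(lookup('00000001'))  # miss -> redirected to an authority switch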

Stage 1
The controller proactively generates the rules and distributes them to the authority switches.

Partition and Distribute the Flow Rules
[Figure: the controller partitions the flow space (accept/reject regions) among Authority Switches A, B, and C, and distributes the partition information to the ingress and egress switches]

Stage 2
The authority switches keep packets in the data plane at all times and reactively cache rules.

Packet Redirection and Rule Caching
[Figure: the same redirection diagram — the first packet detours via the authority switch, feedback caches the rule at the ingress switch, and following packets hit the cached rule and are forwarded]
A slightly longer path in the data plane is still faster than going through the control plane.

Bin-Packing/Overlay

Virtual Switch
A virtual switch has more memory than a hardware switch
– So you can install many more rules in virtual switches
Create an overlay between the virtual switches
– Install the fine-grained rules in the virtual switches
– Install normal OSPF rules in the hardware
– Everything can be implemented in the virtual switch
Has the usual overlay drawbacks.

Bin-Packing in Data Centers
Insight: traffic flows between particular servers
– If communicating servers are placed together, their rules only need to be installed in one switch (see the sketch below)
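A minimal sketch of that placement insight, with made-up servers and flows (a toy greedy heuristic, not a published bin-packing algorithm): co-locate communicating servers on the same rack so each flow's fine-grained rule lands on a single top-of-rack switch.

    # Illustrative bin-packing sketch (made-up data): co-locate
    # communicating servers so each flow's rule sits in one ToR switch.
    RACK_SIZE = 2
    flows = [('web1', 'db1'), ('web2', 'db2'), ('web1', 'cache1')]

    racks = []                     # each rack is a set of servers

    def place(server):
        for rack in racks:
            if server in rack:
                return rack
        return None

    for src, dst in flows:
        target = place(src) or place(dst)
        if target is None or len(target) >= RACK_SIZE:
            target = set()
            racks.append(target)
        for s in (src, dst):
            if place(s) is None and len(target) < RACK_SIZE:
                target.add(s)
        # Servers that still don't fit spill to another rack (toy model).

    # A flow needs rules in only one switch iff both ends share a rack.
    for src, dst in flows:
        shared = place(src) is place(dst)
        print(src, dst, 'one switch' if shared else 'multiple switches')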

Getting Around CPU Limitations
Prevent the controller from being in the flow-creation loop
– Create clone rules
Prevent the controller from being in decision loops
– Create forwarding groups

Clone Rules
Insert a special wildcard rule; when a packet arrives, the switch creates a micro-flow rule itself
– The micro-flow rule inherits all properties of the wildcard rule
[Figure: a packet "From: Theo, To: Bruce" arrives at a switch holding the clonable wildcard rule "From: theo, to: ***, send on port 1; Timeout: 10 secs"]

Clone Rules
[Figure: same as above — the switch has locally created the exact micro-flow rule "From: theo, to: Bruce, send on port 1; Timeout: 10 secs" alongside the wildcard rule]
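A plain-Python sketch of rule cloning (a DevoFlow-style mechanism; the table layout and field names are invented): on a wildcard hit, the switch mints an exact-match micro-flow entry that inherits the parent's action and timeout, without ever contacting the controller.

    # Illustrative clone-rule sketch (plain Python; fields are made up).
    # A wildcard rule marked clone=True spawns an exact-match micro-flow
    # entry on first hit, inheriting the parent's action and timeout --
    # the controller is never involved.
    flow_table = {}      # exact-match micro-flows: (src, dst) -> rule
    wildcard_rules = [
        {'src': 'theo', 'dst': '*', 'port': 1, 'timeout': 10, 'clone': True},
    ]

    def handle_packet(src, dst):
        rule = flow_table.get((src, dst))
        if rule is None:
            for wc in wildcard_rules:
                if wc['src'] in ('*', src) and wc['dst'] in ('*', dst):
                    # Inherit the wildcard rule's action and timeout.
                    rule = {'port': wc['port'], 'timeout': wc['timeout'],
                            'count': 0}
                    if wc['clone']:
                        flow_table[(src, dst)] = rule   # mint micro-flow
                    break
        if rule is None:
            return 'packet_in to controller'   # a true table miss
        rule['count'] += 1
        return 'forward on port %d' % rule['port']

    print(handle_packet('theo', 'bruce'))   # clones a micro-flow, forwards
    print(handle_packet('theo', 'bruce'))   # hits the cloned micro-flow

The payoff is per-flow counters (fine-grained visibility) while the control channel only ever carried one wildcard rule.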

Forwarding Groups
What happens when there's a failure?
– Port 1 goes down?
– The switch must inform the controller and wait
Instead, have backup ports
– Each rule also states a backup port
[Figure: the wildcard rule and its cloned micro-flow rule, both sending on port 1 with no backup]

Forwarding Groups
[Figure: the same rules, now with a backup — e.g. "From: theo, to: ***, send on port 1, backup: 2; Timeout: 10 secs"]
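In standard OpenFlow (1.1 and later), this backup-port idea maps onto a fast-failover group. A hedged Ryu/OpenFlow 1.3 sketch, assuming an already-connected datapath dp; ports 1 and 2 follow the slide, and the match is invented.

    # Fast-failover group sketch (assumes Ryu, OpenFlow 1.3, a connected
    # datapath `dp`). The switch itself fails over from port 1 to port 2
    # when port 1 goes down -- no controller round trip.
    ofp, parser = dp.ofproto, dp.ofproto_parser

    buckets = [
        # Primary: used while watch_port 1 is up.
        parser.OFPBucket(watch_port=1,
                         actions=[parser.OFPActionOutput(1)]),
        # Backup: used automatically if port 1 fails.
        parser.OFPBucket(watch_port=2,
                         actions=[parser.OFPActionOutput(2)]),
    ]
    dp.send_msg(parser.OFPGroupMod(dp, ofp.OFPGC_ADD, ofp.OFPGT_FF,
                                   group_id=50, buckets=buckets))

    # Point the flow rule at the group instead of a fixed port.
    match = parser.OFPMatch(eth_type=0x0800, ipv4_src='10.0.0.1')
    inst = [parser.OFPInstructionActions(
        ofp.OFPIT_APPLY_ACTIONS, [parser.OFPActionGroup(group_id=50)])]
    dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=1,
                                  match=match, instructions=inst))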

How do I do load balancing?
– Something like ECMP?
– Or server load balancing?
Currently:
– The controller installs rules for each flow, so it can load-balance at install time
– The controller can also get stats and rebalance later
[Figure: the wildcard rule and a cloned micro-flow rule, both sending on port 1]

Forwarding Groups
Instead, have port groups
– Each rule specifies a group of ports to send on
When a micro-flow rule is created:
– The switch can assign ports to micro-flow rules in a round-robin manner, or based on probability
[Figure: the wildcard rule now lists "send on port 1,2,4"; the micro-flow rule is on port 1]

Forwarding Groups
[Figure: the wildcard rule now carries weights — "send on port 1 (10%), 2 (90%)" — and the cloned theo-to-Bruce micro-flow rule was assigned port 2]
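A tiny sketch of the port-assignment step (plain Python; the 10%/90% weights come from the slide, everything else is invented): each newly cloned micro-flow draws its output port once, so a single flow stays on one path while the aggregate traffic splits 10/90.

    # Weighted port assignment for cloned micro-flows (plain Python
    # sketch; weights from the slide). Each micro-flow picks its port
    # once at creation time, so its packets stay on one path.
    import random

    PORT_GROUP = [1, 2]
    WEIGHTS = [0.10, 0.90]

    micro_flows = {}

    def assign_port(src, dst):
        if (src, dst) not in micro_flows:
            port, = random.choices(PORT_GROUP, weights=WEIGHTS)
            micro_flows[(src, dst)] = port
        return micro_flows[(src, dst)]

    print(assign_port('theo', 'bruce'))  # e.g. 2 (with 90% probability)
    print(assign_port('theo', 'bruce'))  # same port: one path per flow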

Getting Around CPU Limitations
Prevent the controller from polling switches
– Introduce triggers: each rule has a trigger and sends its stats to the controller when a threshold is reached
– E.g., if more than 20 packets match a flow, report it
Benefits of triggers:
– Reduces the number of entries being returned
– Limits the amount of network traffic
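A minimal trigger sketch in plain Python (the 20-packet threshold is from the slide; the report callback stands in for an asynchronous switch-to-controller message):

    # Trigger sketch (plain Python). Instead of the controller polling
    # for all stats, each rule carries a threshold; the switch pushes a
    # report for just that rule when the threshold is crossed.
    THRESHOLD_PKTS = 20   # from the slide: "if over 20 pkts match flow"

    class TriggeredRule:
        def __init__(self, match, report):
            self.match = match
            self.count = 0
            self.fired = False
            self.report = report  # stand-in for switch->controller message

        def on_packet(self):
            self.count += 1
            if not self.fired and self.count > THRESHOLD_PKTS:
                self.fired = True
                self.report(self.match, self.count)  # one entry, pushed once

    rule = TriggeredRule(('theo', 'bruce'),
                         lambda m, c: print('report to controller:', m, c))
    for _ in range(25):
        rule.on_packet()   # fires exactly once, at packet 21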

Summary
Switches have several limitations
– TCAM space
– Switch CPU
Interesting ways to reduce these limitations
– Place more responsibility in the switch
  - Introduce triggers
  - Have the switch create micro-flow rules from general rules