Programmable Switches

Programmable Switches Lecture 13, Computer Networks (198:552)

SDN router data plane
The data plane implements per-packet decisions on behalf of the control & management planes:
Forward packets at high speed
Manage contention for switch/link resources
[Figure: a processor runs (part of) the control plane; the switching fabric and the network interfaces make up the data plane]

Life of a packet: RMT architecture
Many modern switches share a similar architecture: FlexPipe, XPliant, Tofino, …
Pipelined packet processing with a 1 GHz clock

What data plane policies might we need?
Parsing. Ex: turn the raw bits 0x0a000104fe into IP header fields: address 10.0.1.4 and protocol 254
Stateless lookups. Ex: send all packets with protocol 254 through port 5
Stateful processing. Ex: if the # of packets sent from any IP in 10.0/16 exceeds 500, drop
Traffic management. Ex: packets from 10.0/16 have high priority unless their rate > 10 Kb/s
Buffer management. Ex: restrict all traffic outside of 10.0/16 to 80% of the switch buffer

Programmability
Allow network designers/operators to specify all of the above
Needs hardware design and language design
Software packet processing could incorporate all of these features
However: limited throughput, low port density, high power
Key Q: Can we achieve programmability with high performance?

Programmability: Topics today
1: Packet parsing
2: Flexible stateless processing
3: Flexible stateful processing
4, if we have time: complex policies without perf penalties

(1) Packet parsing: Need to generalize
In the beginning, OpenFlow was simple: Match-Action
A single rule table on a fixed set of fields (12 fields in OF 1.0)
Then it needed new encapsulation formats, different versions of protocols, and additional measurement-related headers
The number of match fields ballooned to 41 in the OF 1.4 specification!
With multiple stages of heterogeneous tables

(1) Parsing abstractions
Goal: can we make transforming bits to headers more flexible?
A parser state machine where each state may emit headers
[Parse graph with nodes: Ethernet, IP, TCP, UDP, custom protocol, payload]
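
As a concrete illustration (not from the slides): a minimal P4_14-style parser sketch for an Ethernet → IPv4 → TCP/UDP slice of such a parse graph, assuming ethernet/ipv4/tcp/udp header instances have been declared (the Ethernet declaration appears on a later slide):

    parser start {
        return parse_ethernet;
    }
    parser parse_ethernet {
        extract(ethernet);
        return select(latest.ethType) {
            0x0800  : parse_ipv4;
            default : ingress;
        }
    }
    parser parse_ipv4 {
        extract(ipv4);
        return select(latest.protocol) {
            6       : parse_tcp;   // TCP
            17      : parse_udp;   // UDP
            default : ingress;
        }
    }
    parser parse_tcp { extract(tcp); return ingress; }
    parser parse_udp { extract(udp); return ingress; }

Each parser state extracts (emits) a header and selects the next state based on a field of the header just parsed; supporting a custom protocol is just another state and transition.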

(1) Parsing implementation in hardware
Use a TCAM to store the state machine transitions & header bit locations
Extract fields into the packet header vector in a separate action RAM
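
For intuition (a simplified, hypothetical encoding, not the exact hardware format): each TCAM entry matches on (current parser state, a window of packet bytes), and the corresponding RAM entry gives (next state, which bytes to extract into the header vector), e.g.:

    (state ETHERNET, ethType = 0x0800)  ->  next state IPV4, extract bytes 0-13 as ethernet
    (state IPV4,     protocol = 6)      ->  next state TCP,  extract bytes 14-33 as ipv4
    (state IPV4,     protocol = 17)     ->  next state UDP,  extract bytes 14-33 as ipv4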

(2) How are the parsed headers used?
Headers are carried through the rest of the pipeline
To be used in general-purpose match-action tables

(2) Abstractions for stateless processing
Goal: specify a set of tables & the control flow between them
Actions: more general than OpenFlow 1.0 forward/drop/count
Copy, add, remove headers
Arithmetic, logical, and bit-vector operations!
Set metadata on the packet header for control flow between tables
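
For example, a P4_14-style compound action combining several of these primitives (a sketch with illustrative names; it assumes a vlan_tag header and a user-defined meta metadata instance):

    action pop_vlan_and_route(port) {
        remove_header(vlan_tag);                             // remove a header
        add_to_field(ipv4.ttl, -1);                          // arithmetic on a header field
        modify_field(standard_metadata.egress_spec, port);   // pick the output port
        modify_field(meta.l3_routed, 1);                     // metadata used for control flow later
    }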

(2) Table dependency graph (TDG)

(2) Match-action table implementation
Mental model: Match and Action units supplied with the Packet Header Vector (PHV)
Each pipeline stage accesses its own local memory

(2) Match-action table implementation
Hardware realization: separately configurable memory blocks
SRAM blocks: exact match, action memory, statistics
TCAM blocks: ternary match
Match RAM blocks also contain pointers to action memory and instructions
[Figure: the PHV enters the stage, drives the match blocks, and exits after the actions are applied]

(1,2) Parse & pipeline specification with P4
High-level goals:
Allow reconfiguring packet processing in the field
Protocol independent
Target independent
Declarative: specify the parse graph and the TDG
Headers, parsing, metadata
Tables, actions, control flow
P4 separates table configuration from table population

(1,2) Parse & pipeline specification with P4
Header and state machine spec:

    header_type ethernet_t {
        fields {
            dstMac  : 48;
            srcMac  : 48;
            ethType : 16;
        }
    }
    header ethernet_t ethernet;

    parser start {
        extract(ethernet);
        return ingress;
    }

(1,2) Parse & pipeline specification with P4
Actions:

    action _drop() {
        drop();
    }
    action fwd(dport) {
        modify_field(standard_metadata.egress_spec, dport);
    }

Rule table:

    table forward {
        reads {
            ethernet.dstMac : exact;
        }
        actions {
            fwd;
            _drop;
        }
        size : 200;
    }

Control flow:

    control ingress {
        apply(forward);
    }
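
Population then happens at run time, when the controller installs concrete rules into the forward table. As a hedged example (the MAC address and port are made up), with the bmv2 reference switch's simple_switch_CLI a rule "dstMac 00:00:00:00:00:01 → fwd(dport=1)" would be added roughly as:

    table_add forward fwd 00:00:00:00:00:01 => 1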

(3) Flexible stateful processing
What if the action depends on previously seen (other) packets?
Example: send every 100th packet to a measurement server
Other examples: flowlet switching, DNS TTL change tracking, XCP, …
Actions in a single match-action table aren't expressive enough
Example: if (pkt.field1 + pkt.field2 == 10) { counter++; }
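
For context, RMT-style targets do expose limited per-stage state through registers. A rough P4_14 sketch of the "every 100th packet" example (all names are illustrative, and the clone-to-measurement-server step is omitted because mirroring primitives are target specific):

    register pkt_counter {
        width : 32;
        instance_count : 1;
    }

    header_type counter_meta_t {
        fields { count : 32; }
    }
    metadata counter_meta_t counter_meta;

    action bump_counter() {
        register_read(counter_meta.count, pkt_counter, 0);
        add_to_field(counter_meta.count, 1);
        register_write(pkt_counter, 0, counter_meta.count);
    }
    action reset_counter() {
        register_write(pkt_counter, 0, 0);
        // cloning the packet to the measurement server would go here
    }

    table bump   { actions { bump_counter; } }       // default action (set at run time): bump_counter
    table sample {
        reads   { counter_meta.count : exact; }      // populated with the single key 100
        actions { reset_counter; }
    }

    control ingress {
        apply(bump);
        apply(sample);
    }

Note that the read-modify-write in bump_counter must complete within a single pipeline stage to stay atomic across back-to-back packets, which is exactly the constraint discussed on the next slides; and the predicate in the second example (a sum of two packet fields) cannot even be used as a table match key without first computing it into metadata.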

(3) An example: “Flowlet” load balancing
Consider the time of arrival of the current packet and of the last packet of the same flow
If the current packet arrives at least 1 ms after the last packet did, consider rerouting it to balance load
Else, keep the packet on the same route as the last packet
Q: why might you want to do this?
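
A rough P4_14 sketch of the state this needs (hedged: it assumes the target exposes an ingress timestamp, written here as intrinsic_metadata.ingress_global_timestamp as on bmv2, and that a flow hash has already been computed into flowlet_meta.index; all names are illustrative):

    register last_seen {
        width : 48;
        instance_count : 1024;      // one slot per flow-hash bucket
    }

    header_type flowlet_meta_t {
        fields {
            index : 10;             // flow-hash bucket of this packet
            last  : 48;             // arrival time of the previous packet of the flow
            gap   : 48;             // time since that packet
        }
    }
    metadata flowlet_meta_t flowlet_meta;

    action read_gap() {
        register_read(flowlet_meta.last, last_seen, flowlet_meta.index);
        subtract(flowlet_meta.gap,
                 intrinsic_metadata.ingress_global_timestamp,
                 flowlet_meta.last);
        register_write(last_seen, flowlet_meta.index,
                       intrinsic_metadata.ingress_global_timestamp);
    }

The control flow would then compare flowlet_meta.gap against the 1 ms threshold and apply either a table that picks a fresh next hop or one that reuses the next hop stored for this flow.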

(3) Abstraction: Packet transaction
A piece of code, along with state, that runs to completion on each packet before processing the next [Domino'16]
Why is this challenging to implement on switch hardware? Hint: the switch is clocked at 1 GHz!
(1) The switch must process a new packet every 1 ns, but the transaction code may not run completely in one pipeline stage
(2) Reads and writes to state must happen in the same pipeline stage, which needs an atomic operation in hardware

(3) Insight #1: Stateful atoms
The atoms constitute the switch's action instruction set; each atom runs in under 1 ns
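
For example, one of the simplest atoms in the Domino/Banzai machine model is a read-add-write (RAW) unit: within a single clock cycle it reads a state variable, adds a value derived from the current packet, and writes the result back. Richer atoms add a predicate (predicated RAW), which is what a guarded update like the counter example above requires.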

(3) Insight #2: Pipeline the stateless actions
if (pkt.field1 + pkt.field2 == 10) { counter++; }
Have a compiler do this analysis for us!
Stateless operations (whose results depend only on the current packet) can execute over multiple stages
Only the stateful operation must run atomically in one pipeline stage
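
Concretely (a sketch of the kind of split the compiler would produce, not Domino's exact output): an earlier stage computes a temporary such as tmp = pkt.field1 + pkt.field2 using only the current packet's fields, and a later stage then executes the stateful part, "if (tmp == 10) counter++", as a single predicated read-add-write atom.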

(4) Implementing complex policies
What if you have a very large P4 (or Domino) program?
Ex: too many logical tables in the TDG
Ex: logical table keys are too wide
Sharing memory across stages leads to a paucity of physical tables
Ex: too many (stateless) actions per logical table
Sharing compute across stages leads to a paucity of physical tables
Solution in the RMT architecture: re-circulation

(4) Re-circulation “extends” the pipeline
Recirculate packets from the end of the pipeline back to the ingress
But throughput drops by 2x!

(4) Decouple packet compute & memory access!
Allow packet processing to run to completion on separate physical processors [dRMT'17]
Aggregate per-stage memory clusters into a shared memory pool
A crossbar enables all processors to access each memory
Schedule instructions on each core to avoid contention

(4) RMT: compute and memory access
[Figure: parser → Stage 1 … Stage N → deparser → queues; each stage has Match and Action units and its own memory cluster, exchanging keys and results; packet headers flow in and out]
As background, I'll briefly review how programmable switches are architected in hardware. For concreteness, I'll focus on the RMT architecture because it is representative of many commercial products today. In the RMT architecture, a parser turns bytes from the wire into a bag of packet headers. In each pipeline stage, a programmable match unit extracts the relevant part of the packet header as a key to look up in the match-action table. It then sends this to the memory cluster, which performs the lookup and returns a result. This result is used by a programmable action unit to transform the packet headers appropriately. This process then repeats itself.

(4) dRMT: Memory disaggregation
[Figure: the same parser → Stage 1 … Stage N → deparser → queues pipeline, but with the memory clusters pulled out of the stages]
First, we disaggregate memory. We replace the stage-local memory in the pipeline with a shared array of memory clusters accessible from any stage through a crossbar. Now, the stages in aggregate have access to all the memory in the system, because the memory doesn't belong to any one stage in particular.

(4) dRMT: Compute disaggregation
[Figure: parser → distributor → match-action processors 1 … N → deparser → queues, with packets 1 … N spread across the processors and a shared pool of memory clusters]
Next, we disaggregate compute. We replace each of the pipeline stages, which is rigidly forced to always execute matches followed by actions, with a match-action processor that can execute matches and actions in any order that respects program dependencies. Now, once packets have been parsed, they are distributed by a distributor to one of the processors in round-robin order: the first packet goes to processor 1, the second to processor 2, and so on. Once a processor receives a packet, it is responsible for carrying out all operations on that packet. Unlike the pipeline, packets don't move around between processors. Consider packet 2 on processor 2: over the duration of this packet, the processor might access tables in different memory clusters.