Fred Kuhns Washington University Applied Research Laboratory

Presentation transcript:

Multi-Service Router System Architecture: A Platform for Networking Research
Fred Kuhns, Washington University Applied Research Laboratory
http://www.arl.wustl.edu/arl/project/msr

Presentation Overview
Overview of project and status
System architecture
IP forwarding in the MSR
Control protocol

Motivating Example
[Figure: session establishment. Labels: open_session (type, params); determine best e2e path; get app. spec. & plugin code; code servers; add destination host.]
Gigabit links
Traffic isolation
Security and DoS
Rapid prototyping
Experimental protocols
Resource reservations and guarantees
Embedded applications and active processing

Network Connection Scenarios
[Figure: hosts on LAN A and LAN B and networks Net A through Net G attached to ports 1 and 2 of AR and CR routers.]
Ethernet and ATM links
Internet and intranet
Flow-specific resource/processing requirements
Flow aggregation and filtering

MSR Project Goals
Develop an open, high-performance and extensible IP routing platform supporting per-flow resource allocation and active processing, for use in networking research:
  HW and SW port-level processing resources: SPC, FPX or both
  Configuration discovery at initialization time
  Special-case processing in software on an embedded module at each port (SPC)
  Optimized packet processing in hardware (FPX): IP forwarding and advanced packet scheduling
  Active processing in hardware (FPX) or software (SPC)
  Support prototyping of new functions in software (SPC) before migrating to the FPX

MSR Project Goals
Gigabit link speeds, independent of specific link technologies
Create a framework for experimenting with new protocols for traditional routing or resource management (QoS)
Simple, intuitive, extensible and robust software frameworks:
  Router control and resource management
  Support conventional routing protocols such as OSPF
Leverage existing and legacy efforts:
  Leverage existing code and libraries
  Build on the Gigabit Kits program, the Extreme Networking project and the Programmable Networks vision

Presentation Divided into 3 Core Topics
Hardware architecture and performance: the high-performance forwarding path and core interconnect. Hardware components: WUGS, APIC, SPC and FPX.
Top-level functional requirements: capture the management and control operations. In the MSR, most top-level functions are implemented on a centralized control processor (CP).
Port-level functional requirements: the IP forwarding path, resource allocations, active processing and port-level control functions, plus statistics and monitoring. The SPC and FPX implement the port-level processing functions.

Hardware Components
[Figure: Control Processor and switch fabric (ATM Switch Core); each port has an Input Port Processor (IPP), an OPP, an FPX, an SPC and a TI. Layer labels: ATM Switch Core, Field Programmable Port Extenders, Embedded Processors, Transmission Interfaces.]
Control Processor: global coordination and control; routing protocols; builds routing tables and other information needed by SPCs; first-level code server
Smart Port Card (SPC): Pentium, cache, North Bridge, 64 MB, system FPGA, APIC
Field Programmable Port Extender (FPX): Network Interface Device, Reprogrammable Application Device, SDRAM 128 MB, SRAM 4 MB

Top-Level Framework
Resource control:
  Managing device bandwidth, memory and processor resources
  Assigning effective link speeds and the internal speedup factor for distributed queuing
  Downloading plugins and binding them to application flows
  Allocating buffers to ingress or egress processing paths
  Defining and instantiating classifier filters and binding them to per-flow or flow-aggregate port-level resources
Routing and signaling:
  Extensible routing framework based on Zebra
  Forwarding table management and distribution (SPC/FPX)
  Resource reservation: authentication and admission control
  Enhanced OSPF to carry link bandwidth availability

Top-Level Framework
System initialization:
  Configuration discovery, initialization and default parameters
System management:
  Visualization: the GUI tool
  Performance and status monitoring
  Firewall functions
  Command protocol interface for managing ports; it has the flavor of SNMP (set, get and, soon to be added, events)

Port-Level Functional Requirements
Control, management and monitoring interface:
  Command protocol interface in the SPC kernel; identifies the module and the module-specific command
  Similar functionality available in the FPX
Per-flow or flow-aggregate resource allocation:
  Packet classification: exact and general match filters
  Distributed queuing and virtual output queues
  Packet scheduler framework with support for FCFS or weighted fair queuing, using Queue State DRR at output ports
  Flow-based resource allocation and a lightweight flow setup protocol
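
As a rough illustration of the deficit round robin idea behind the output scheduler, the sketch below implements plain DRR. It is not the Queue State DRR variant named on the slide, whose details are not given here. The function name, the quantum and the packet sizes are illustrative assumptions.

```python
from collections import deque


def drr_schedule(queues, quantum):
    """Plain deficit round robin over per-flow queues of packet sizes (bytes).

    Returns the transmission order as (flow index, packet size) pairs.
    This is textbook DRR, not the MSR's QSDRR implementation.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    backlog = [deque(q) for q in queues]
    deficit = [0] * len(backlog)
    order = []
    while any(backlog):
        for flow, q in enumerate(backlog):
            if not q:
                continue
            deficit[flow] += quantum            # one quantum of credit per visit
            while q and q[0] <= deficit[flow]:  # send while the head packet fits
                size = q.popleft()
                deficit[flow] -= size
                order.append((flow, size))
            if not q:
                deficit[flow] = 0               # an emptied flow keeps no credit
    return order
```

A flow with small packets may send several per round while a flow with large packets sends one, so that long-run throughput tracks the quantum rather than the packet count.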

Port-Level Functional Requirements
IP forwarding:
  IP best-effort forwarding with a fast IP lookup algorithm (FIPL) implemented in the FPX and SPC. Reference: "Scalable IP Lookup for Programmable Routers", David E. Taylor, John W. Lockwood, Todd Sproull, Jonathan S. Turner and David B. Parlour, Proceedings of IEEE Infocom 2002, 6/02
  Exact match and general match filters with pinned routes
Active processing support:
  Plugin environment
  Monitor and control resource usage
  Dynamic plugin download and execution

Status: Complete
Core architectural design and implementation complete: software modules and hardware components
Wave demo
Core software module design and implementation complete: general filter, exact match classifier, FIPL, QSDRR, distributed queuing, active processing, virtual interfaces, port command protocol
Testing of the IP forwarding function in the FPX complete
Initial DQ testing complete; the design is currently evolving
Queue State DRR initial testing

Status: Current Effort
Analysis and enhancement of the distributed queuing algorithm and its implementation
Implementation of the lightweight flow service
Implementation of the Network Access Service
Implementation of core FPX processing modules
Packet scheduling:
  Validate weights with QSDRR; reserved and datagram traffic; test its impact on TCP
  Implement a finish-time version of weighted fair queuing

Status: Current Effort
"Evolve" the classifier interface for binding flows to plugins and resources
Create a set of aggregated datagram classes
Implement the LFS protocol in software:
  Perform LFS processing in software on the SPC
  Status reporting in response to explicit requests
  Status reporting to management agents for monitoring and accounting (event notification using the command protocol interface to the local CP)
MSR Manager and Route Manager on the CP:
  Enhance OSPF for reporting resource availability
  Actively manage various databases and resources

Advanced Services
Extreme Networking (http://www.arl.wustl.edu/arl/projects/extreme)
Lightweight Flow Setup Service (under development):
  One-way unicast flow with reserved bandwidth, soft state
  Flow identification in the SPC; returns a virtual output queue
  Framework for managing BW allocations and unfulfilled requests
  Stable rate (firm) and transient rate (soft) reservations
Network Access Service (NAS) (under development):
  Provides controlled access to LFS
  Registration/authentication of hosts and users
  Resource usage data collection for monitoring and accounting

Other EN Extensions to Basic MSR
Per-source aggregate queues based on source prefix
Super-scalable packet scheduling: approximate radix sort with compensation (timing wheels)
Enhanced, per-flow distributed queuing
Reserved Tree Service (RTS):
  Configured, semi-private network infrastructure
  Reserved bandwidth, separate queues
  Paced upstream forwarding with source-based queues

Presentation Overview
Overview of project and status
System architecture
IP forwarding in the MSR
Control protocol

MSR Hardware Components
[Figure: a Control Processor attached to the switch fabric (ATM Switch Core); six ports, each with a Port Processor (PP) and a Line Card (LC).]

Port Processors: SPC and/or FPX
[Figure: Control Processor and switch fabric (ATM Switch Core); each of six ports has an IPP/OPP pair, an FPX and an SPC as port processors, and a Line Card (LC).]

Example SPC and FPX Design
The shim contains the results of the classification step.
[Figure: SPC and FPX block diagram. Labels: DQ module, active processing, IP classifier, APIC, NID, flow control, shim, X.1, Z.2.]

MultiService Router - Overview
CP - Control Processor
RA - Route Agents
MM - MSR Manager
PP - Port Processor (SPC/FPX)
PE - Processing Environment
DQ - Distributed Queuing
DRR - Deficit Round Robin
FP - Forwarding Path
[Figure: MSR block diagram. CP labels: signaling agents, flexsig, RSVP, routing, flexroutd, OSPF, MM (configure, resource), RA, GBNSC (switch & ports), local interface. NOC labels: net manager app and GUI, NCMO/Jammer. Port processors around the WUGS each show a PE with plugins and MSR control, and an FP with classify/lookup, DQ and DRR.]

Top-Level Components
MultiService Router (MSR)
Control Processor (CP): system monitoring and control.
  MSR Manager (MM): router configuration; signaling protocol; forwarding database and classifier rule set management; system monitoring; port-level resource management and control; local admission control; discovers the hardware configuration at startup.
  Routing Agent (RA): local instance of routing protocols; communicates with remote entities; sends route updates to the RM. Currently Zebra based.
  WUGS switch controller (GBNSC): used for monitoring functions; sends control cells to the WUGS to read statistics and configuration information.
Port Processor (PP): port-level resource management.
  Forwarding Path (FP): modules/components performing the core IP forwarding and port-level control operations: route lookup, flow classification, filtering, distributed queuing and fair output queuing.
  Processing Environment (PE): infrastructure supporting active processing of IP datagrams. Application flows are bound to a software processing module (or to a hardware module in the FPX).

Top-Level Components
Network Operations Center (NOC): GUI interface
Network management application:
  Active metric collection
  Passive monitoring of DQ
  Display formats include format, temporal resolution, processing overhead
  Metric and display evaluation
  Active management not implemented
Supports MSR testing:
  Test/demo configuration and setup
  Identify meaningful metrics and display architecture
Display and "manage" MSR configuration:
  Interface to init the MSR and change per-port attributes
  Reset the MSR
  Set runtime parameters

The Control Processor
[Figure: CP software stack. Labels: MSR Manager (configuration, operational control); policy manager; resource; IP routing; flexsig; MSR abstraction layer; MSR wrappers; GBNSC config; switch & APIC control/cell libs; FPX control logic/cell libs; SPC cmd/msg libs; native ATM library; INET API; TCP, UDP, IP; native ATM "raw"; device (APIC) driver (common code).]

Control Processor Tasks
The top-level framework currently supports:
  Discovery and initialization
  Data collection for the management tool
  Test and demo environment support
  Routing support: OSPF (standard routing protocols)
  Local resource management and global signaling
  Monitoring resource usage
  Active processing support (plugins)

Requirements - RM
Resource identification and initialization
Create and distribute routing tables to port processors, constructed using OSPF and a QoS routing protocol
Distributed queuing (DQ) management: reserves output link BW and sets policies
Allocation of resources within individual ports:
  Static allocation: configuration script or admin commands
  Dynamic allocation or re-allocation:
    out-of-band allocation by manager
    out-of-band allocation by signaling agents
    in-band as needed by particular services

Distributed Routing Table
[Figure: labels: admin; VC space and BW (admission control); OSPF, OSPF#, ... route tables; RSVP and EN flow table; IP resource management; merge tables; distribution to Port 1, Port 2, Port 3, ..., Port N.]

Routing Support
Context: programmable networks
Focus: software component architecture
Issues:
  Building, maintaining and distributing route tables
  Delivery of updates (LSAs from neighbors) to the CP
  Format of route table entries
  Support for logical interfaces (sub-interfaces)
CP component interactions (APIs):
  Routing: Zebra, OSPFd and msr_route_manager
  Signaling and resource manager
Assumptions: all routing neighbors are directly attached

Presentation Overview
Overview of project and status
System architecture
IP forwarding in the MSR
Control protocol

MSR 0.5 - Basic IP Forwarding
Core functions implemented in the basic MSR system.
Control Processor:
  System monitoring
  System configuration
Port-level software (SPC):
  Simple and Fast IP Lookup (FIPL) and table management
  APIC driver (the engine)
  MSR memory management (buffers)
  High-priority periodic callback mechanism
  Distributed queuing
  General classifier and an active processing environment

Phase 0.5 - Basic IP Forwarding
[Figure: CP and four SPC/FPX ports around the WUGS, with one connected IP entity (IP router) per port; control traffic and a loopback are marked. Distributed queuing is not shown.]

Distributed Queuing
Goals:
  Maintain high output link utilization
  Avoid switch congestion
  Avoid output queue underflow
Mechanisms:
  Virtual output queuing (VOQ) at the inputs
  Broadcast VOQ and output backlogs every D sec
  Each PP i recalculates rate(i,j) every D sec

DQ - Cell Format
Broadcast DQ summary cells every D sec. Fields:
  Src port: sending port number (0-7)
  Overall Rate: total aggregate rate (BW) allocated to this port for the port-to-switch connection (currently not used)
  Output Queue Length: bytes queued in the output port's output queue
  VOQ X Queue Length: number of bytes queued in the src port's VOQ for output port X
Cell layout (VCI = DQVC; the figure is marked 32): cell header; Src port; Overall Rate; Output Queue Length; padding; VOQ 0 Queue Length through VOQ 7 Queue Length.
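
The summary cell can be sketched with Python's struct module. The slide does not give field widths, so the layout here is an assumption: twelve big-endian 32-bit words (src port, overall rate, output queue length, padding, eight VOQ lengths), which happens to fill one 48-byte ATM cell payload. The function names are hypothetical.

```python
import struct

NUM_PORTS = 8

# Assumed layout: 4 header words + 8 VOQ words, each an unsigned 32-bit
# big-endian integer, for a total of 48 bytes (one ATM cell payload).
DQ_CELL = struct.Struct(">4I8I")


def pack_dq_summary(src_port, overall_rate, out_queue_len, voq_lens):
    """Build the payload of a DQ summary cell (the cell header is not included)."""
    if not 0 <= src_port < NUM_PORTS:
        raise ValueError("src_port must be in 0-7")
    if len(voq_lens) != NUM_PORTS:
        raise ValueError("expected one VOQ length per output port")
    return DQ_CELL.pack(src_port, overall_rate, out_queue_len, 0, *voq_lens)


def unpack_dq_summary(payload):
    """Decode a payload produced by pack_dq_summary."""
    words = DQ_CELL.unpack(payload)
    return {
        "src_port": words[0],
        "overall_rate": words[1],      # currently not used, per the slide
        "out_queue_len": words[2],
        "voq_lens": list(words[4:]),   # words[3] is padding
    }
```

Each port would build one such payload per period D and read the payloads broadcast by every port, including its own.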

MSR Router: Distributed Queuing
At each port, DQ runs every D sec:
  Determine the per-output-port queue depth.
  Create the DQ summary cell for this port and broadcast the cell to all input ports (including self).
  DQ summary cells wait in a queue for the start of the next cycle.
  Read all summary cells (including own) and calculate the output rate for each VOQ.
  DQ updates the packet scheduler to pace each VOQ according to its backlog share.
[Figure: ports P0 through P7 (SPC/FPX) and the CP around the WUGS, attached to networks 192.168.200.X through 192.168.207.X via next/previous hops; DQ runs at every port.]

Distributed Queuing Algorithm
Goal: avoid switch congestion and output queue underflow.
Let hi(i,j) be input i's share of the input-side backlog to output j.
  Switch congestion can be avoided by sending from input i to output j at rate ≤ L·S·hi(i,j), where L is the external link rate and S is the switch speedup.
Let lo(i,j) be input i's share of the total backlog for output j.
  Underflow of the queue at output j can be avoided by sending from input i to output j at rate ≥ L·lo(i,j).
  This works if L·(lo(i,1) + ··· + lo(i,n)) ≤ L·S for all i.
Let wt(i,j) be the ratio of lo(i,j) to lo(i,1) + ··· + lo(i,n).
Let rate(i,j) = L·S·min{wt(i,j), hi(i,j)}.
Note: the algorithm avoids congestion and avoids underflow for large enough S. What is the smallest value of S for which underflow cannot occur?
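
A minimal sketch of the rate calculation follows, under two stated assumptions: the final rate is taken to be L·S·min(wt(i,j), hi(i,j)), and "total backlog for output j" is taken to mean all VOQ backlog destined for j plus output j's own queue. The function name and the backlog numbers are illustrative.

```python
def dq_rates(voq, outq, link_rate, speedup):
    """Compute rate(i, j) for every input i and output j.

    voq[i][j] : bytes queued at input i for output j
    outq[j]   : bytes queued in output j's own queue
    Assumes rate(i, j) = L * S * min(wt(i, j), hi(i, j)).
    """
    n = len(voq)
    in_backlog = [sum(voq[i][j] for i in range(n)) for j in range(n)]

    def share(part, whole):
        return part / whole if whole else 0.0

    # hi: share of the input-side backlog; lo: share of the total backlog.
    hi = [[share(voq[i][j], in_backlog[j]) for j in range(n)] for i in range(n)]
    lo = [[share(voq[i][j], in_backlog[j] + outq[j]) for j in range(n)]
          for i in range(n)]

    rates = []
    for i in range(n):
        lo_sum = sum(lo[i])
        row = []
        for j in range(n):
            wt = share(lo[i][j], lo_sum)   # lo(i,j) normalised over input i
            row.append(link_rate * speedup * min(wt, hi[i][j]))
        rates.append(row)
    return rates
```

Because every rate is at most L·S·hi(i,j) and the hi shares for one output sum to at most 1, the total rate into any output never exceeds L·S.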

MSR IP Data Path - An Example
[Figure: ports P0 through P7 (SPC/FPX) and the CP around the WUGS; networks 192.168.200.X through 192.168.207.X, one per port, with next/previous hops such as 192.168.200.2, 192.168.202.2, 192.168.203.2 and 192.168.204.2; IP forwarding and DQ are marked at the ports.]

MSR Version 1.0 - Enhanced
Control Processor:
  Configuration discovery
  Enhanced download with broadcast
  Command protocol implementation
Port-level software (SPC):
  Virtual interface support and shim processing
  Dynamic update of the FIPL table
  MSR memory management enhancements
  Distributed queuing
  Exact match classifier
  Embedded debug facility

Virtual Interfaces on ATM
No PVC, no traffic to/from the MSR.
[Figure: CP, routers (R) and a host attached through an ATM switch (2xx) to MSR ports 0 through 4 (SPC/FPX); external VCs 50 and 51 and internal VCs 40 through 44, with lookup on input and out processing on output.]

Internal MSR Connections (SPC only)
CP software layers:
  Sockets: communication endpoints (config, ospf, ...)
  IP layer (udp/tcp): routes packets to/from sockets
  Driver: routes packets between interface and net layer (raw atm)
  Virtual interfaces VP0 through VP3 on VCs 50 through 53: the CP can have up to 4 virtual interfaces, one for each sub-port. Only one will be used.
Notes: SPC port loopback not shown; IP addresses are bound to virtual interfaces only; only the IP forwarding path is shown.
[Figure: CP (RM, RA) on port 0 and SPC ports 1 through 4 around the WUGS, each with lookup and out processing, a control VC 63, external VCs 50 through 53 and internal VCs 40 through 44.]

Packet Routing, SPC and FPX
Ingress: traffic from the previous-hop router or end system arrives on inbound VC = SPI + 50, 0 <= SPI <= 3. The FPX adds the shim and runs FIPL; packets cross the WUGS on VCs 40 ... 47 (out port + 40).
Egress: packets arrive from the WUGS on VCs 40 ... 47 (in port + 40). After shim processing the shim is removed and the packet is sent on outbound VC = SPI + 50, 0 <= SPI <= 3, to next-hop routers (point-to-point connections) or end stations.
IP eval (IP processing for the FPX): broadcast and multicast destination addresses; IP options; packet not recognized.
SPC blocks: shim demux, FIPL, IP processing, plugins, shim update. The SPC and FPX are connected by FPX_VCI.
Current VCI support: 1) 8 ports (PN); 2) 4 sub-ports (SP).
Although the link interface can support up to 16 inbound VCs, it uses one for Ethernet or four for ATM. Ethernet uses only one VCI.
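
The VC numbering on this slide reduces to two fixed offsets, which the following sketch captures. The bases 40 and 50 and the ranges (8 ports, 4 sub-ports) come from the slide; the function and constant names are hypothetical.

```python
INTERNAL_VC_BASE = 40   # VC across the WUGS = port number + 40
EXTERNAL_VC_BASE = 50   # VC on the external link = SPI + 50
NUM_PORTS = 8
NUM_SUBPORTS = 4


def internal_vc(port):
    """VC used across the switch for a given port (40 ... 47)."""
    if not 0 <= port < NUM_PORTS:
        raise ValueError("port must be in 0-7")
    return INTERNAL_VC_BASE + port


def external_vc(spi):
    """Inbound/outbound link VC for a sub-port identifier (50 ... 53)."""
    if not 0 <= spi < NUM_SUBPORTS:
        raise ValueError("SPI must be in 0-3")
    return EXTERNAL_VC_BASE + spi


def spi_from_external_vc(vc):
    """Recover the sub-port identifier from a link VC."""
    spi = vc - EXTERNAL_VC_BASE
    if not 0 <= spi < NUM_SUBPORTS:
        raise ValueError("VC is not an external sub-port VC")
    return spi
```

Keeping the mapping fixed means no per-connection VC table is needed on the forwarding path.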

Using Virtual Interfaces
[Table: virtual interfaces 192.168.200.1/24 through 192.168.208.1/24, with columns VIN, IP/Mask, VC, Port and Sub.]
Input port (shim processing): look up the IP route; insert the shim; send to the output port.
Output port: reassemble the frame; get the out VIN from the shim; remove the shim; send to the next hop/host.
[Figure: CP (VP0, VC 50) on port 0 and SPC/FPX ports 1 through 4 around the WUGS, with external VCs 50 through 52 and internal VCs 40 through 44.]

Packet Forwarding
Port 0, sub-ports 0-3:
  VIN 0: 192.168.200.1/24, ext. link VC 50
  VIN 1: 192.168.201.1/24, ext. link VC 51
  VIN 2: 192.168.202.1/24, ext. link VC 52
  VIN 3: 192.168.203.1/24, ext. link VC 53
Port 5, sub-ports 0-3:
  VIN 20: 192.168.220.1/24, ext. link VC 50
  VIN 21: 192.168.221.1/24, ext. link VC 51
  VIN 22: 192.168.222.1/24, ext. link VC 52
  VIN 23: 192.168.223.1/24, ext. link VC 53
IP packet: src 192.168.220.5, dst 192.168.200.1, sport 5050, dport 89, data.
[Figure: CP (RM, RA; zebra, OSPF, flexroutd; configure, resource, signaling, routing, discover switch & ports, interfaces) and port processors P0 through P7 around the WUGS; VCs 50 through 53.]

Forwarding IP Traffic
The driver reads the header and performs the route lookup, which returns fwd_key:
  fwd_key = {Stream ID, Out VIN}
  SID = reserved session ID, local only
  VIN = {Port (10 bits), SubPort (6 bits)}
Lookup of the destination in the table (192.168.200.1 -> fwd_key): 192.168.200.1/24 matches, Out VIN 0.
Insert the shim; update the AAL5 trailer and IP header.
Calculate the internal VC from the output VIN's port number (VC = 40).
Packet: src 192.168.220.5, dst 192.168.200.1, sport 5050, dport 89, data, with shim.
[Figure: CP (RM, RA; zebra, OSPF, flexroutd) and port processors P0 through P7 around the WUGS; the packet enters at P5, where the IP lookup is performed; VCs 50 through 53.]
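
The lookup step can be sketched as follows. This is a simple linear longest-prefix match, not FIPL, and the VIN packing (port number in the upper 10 bits, sub-port in the lower 6) is an assumption about bit order. The packed value therefore differs from the VIN index numbers shown on the slides (for example VIN 20 for port 5, sub-port 0). All names are hypothetical; the two routes are the interface networks from the example.

```python
import ipaddress


def make_vin(port, spi):
    """Pack {Port (10 bits), SubPort (6 bits)} into 16 bits (assumed bit order)."""
    if not (0 <= port < 1 << 10 and 0 <= spi < 1 << 6):
        raise ValueError("port or sub-port out of range")
    return (port << 6) | spi


def split_vin(vin):
    """Return (port, sub-port) from a packed VIN."""
    return vin >> 6, vin & 0x3F


ROUTES = [
    (ipaddress.ip_network("192.168.200.0/24"), make_vin(0, 0)),
    (ipaddress.ip_network("192.168.220.0/24"), make_vin(5, 0)),
]


def lookup(dst, routes=ROUTES, sid=0):
    """Linear longest-prefix match; returns fwd_key = (SID, out VIN) or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for net, vin in routes:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, vin)
    return None if best is None else (sid, best[1])
```

For the example packet, `lookup("192.168.200.1")` yields an out VIN on port 0, so the internal VC is 40 + 0 = 40, matching the slide.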

Internal IP Packet Format
Shim (8 bytes)
IP datagram:
  IP header: Version, H-length, TOS, Total length; Identification, Flags, Fragment offset; TTL, Protocol, Header checksum; Source Address; Destination Address; IP Options (??)
  IP data (transport header and transport data)
AAL5 padding (0 - 40 bytes)
AAL5 trailer: CPCS-UU (0); CPCS-UU (0); Length (IP packet + LLC/SNAP); CRC (APIC calculates and sets)

IntraPort Shim: Field Definitions
Layout (two 32-bit words, bit markers 31 and 15): Flags, Not Used, Stream Identifier; Input VIN, Output VIN.
Flag bits: AF, NR, OP, UK, X.
Virtual Interface Number format: PN (10 bits), SPI (6 bits).
Flags: used by the SPC to demultiplex incoming packets. The FPX sets flags to indicate the reason for sending a packet to the SPC. Flags may also be used to implement flow control.
  AF: active flow.
  NR: no route in table.
  OP: IP options present (correct version but incorrect header size).
  UK: unknown packet type (incorrect version, for example).
Stream Identifier (SID): identifier for reserved traffic; a locally unique label. Not used between ports. The FPX fills it in for reserved flows.
Input VIN: the physical port and sub-port the packet arrived on. PN is the physical port number and SPI is the sub-port identifier. There is a fixed map from SPI to VCI: VCI = Base VC + SPI. The FPX sets these values. Not used between ports.
Output VIN: output port and sub-port. The FPX sets this if the route lookup succeeds. If the SPC performs the lookup for the FPX, then the SPC fills it in. The SPC may also modify this value in order to re-route a packet.
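
To make the field list concrete, here is a sketch of packing and parsing the 8-byte shim. The exact widths and bit positions are not stated on the slide, so the layout is an assumption: one flags byte, one unused byte, a 16-bit stream identifier, then 16-bit input and output VINs, all big-endian. The flag bit values and function names are hypothetical.

```python
import struct

# Assumed layout: flags (1 byte), unused (1 byte), SID, input VIN, output VIN.
SHIM = struct.Struct(">BxHHH")

# Hypothetical bit assignments for the flags named on the slide.
FLAG_AF = 0x80  # active flow
FLAG_NR = 0x40  # no route in table
FLAG_OP = 0x20  # IP options present
FLAG_UK = 0x10  # unknown packet type

FLAG_NAMES = {FLAG_AF: "AF", FLAG_NR: "NR", FLAG_OP: "OP", FLAG_UK: "UK"}


def pack_shim(flags, sid, in_vin, out_vin):
    """Build the 8-byte intra-port shim."""
    return SHIM.pack(flags, sid, in_vin, out_vin)


def unpack_shim(data):
    """Return (flags, sid, in_vin, out_vin) from the first 8 bytes of data."""
    return SHIM.unpack(data[:SHIM.size])


def reasons_for_spc(flags):
    """List the flag names that explain why the FPX handed a packet to the SPC."""
    return [name for bit, name in FLAG_NAMES.items() if flags & bit]
```

With this layout the SPC can demultiplex on the first byte alone before touching the rest of the packet.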

InterPort Shim: Field Definitions
Layout (bit markers 31 and 15): Flags, Not Used; Input VIN, Output VIN.
Virtual Interface Number format: PN (10 bits), SPI (6 bits).
Used to convey forwarding information to the output port. Currently only the output SPI is necessary for forwarding.
Flags: TBD.
Input VIN: same as in the IntraPort shim. Set by the ingress FPX, or by the SPC when the FPX is not used.
Output VIN: same as in the IntraPort shim.

FIPL Table Entry Formats (Currently in Flux)
FPX version of the FIPL table entry (36 bits): fields Output VIN, Stream Identifier, A and TBD; bit markers 15, 16, 31, 35.
SPC version of the FIPL table entry (32 bits): fields Output VIN and Stream Identifier; bit markers 15, 16, 31.
Virtual Interface Number format: PN (10 bits), SPI (6 bits); bit markers 15, 5.

Forwarding IP Traffic
At port 0, the driver extracts the shim and determines the destination VIN.
The output VIN is converted to output VC = 50.
The driver removes the shim, updates the AAL5 trailer and sends the packet on VC 50 (in this case the packet goes to the CP).
[Figure: a route advertisement (src 192.168.220.5, dst 192.168.200.1, sport 5050, dport 89) travels from P5 through the WUGS with a shim (to VIN 0) to P0, where shim processing takes place, and on to the CP over VC 50; a second labelled packet has src 192.168.100.5, dst 192.168.214.1.]

Example: Processing Route Updates
The kernel delivers the packet to the socket layer.
The packet is read by the application; assume the application is OSPFd.
The sender's IP address is mapped to an interface ID.
OSPFd and Zebra associate the received advertisement with an MSR input VIN.
[Figure: CP (RM, RA; zebra, OSPF, flexroutd) with socket and interfaces; port processors P0 through P7 around the WUGS; VCs 50 through 53.]

Mapping Packets to Interfaces
Since the CP receives all packets on one logical interface, an issue arises as to how route advertisements can be mapped to the correct MSR interface.
Assume that all peer routers are "directly connected". In other words, their sending address will be in the same network as the associated MSR interface (VIN). In this case the route daemon can directly map an advertisement to the correct local interface.
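
Under the directly-connected assumption, mapping an advertisement to its interface is a subnet membership test on the sender's address. The sketch below uses Python's ipaddress module; the interface table reuses the VIN numbers and addresses from the Packet Forwarding slide, and the function name is hypothetical.

```python
import ipaddress

# VIN index -> interface address, as listed for ports 0 and 5.
INTERFACES = {
    0: ipaddress.ip_interface("192.168.200.1/24"),
    1: ipaddress.ip_interface("192.168.201.1/24"),
    2: ipaddress.ip_interface("192.168.202.1/24"),
    3: ipaddress.ip_interface("192.168.203.1/24"),
    20: ipaddress.ip_interface("192.168.220.1/24"),
    21: ipaddress.ip_interface("192.168.221.1/24"),
    22: ipaddress.ip_interface("192.168.222.1/24"),
    23: ipaddress.ip_interface("192.168.223.1/24"),
}


def vin_for_sender(src, interfaces=INTERFACES):
    """Return the VIN whose network contains the sender, or None if no match."""
    addr = ipaddress.ip_address(src)
    for vin, iface in interfaces.items():
        if addr in iface.network:
            return vin
    return None
```

A sender that is not on any attached network returns None, which is exactly the case the directly-connected assumption rules out.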

Note on Sockets, IP and Interfaces
What if a packet arrives on an interface with a different address? This should never happen, but what if it does?
Example: a packet sent to the CP arrives at port 7, a different port. The CP kernel will still send the packet to the socket bound to 192.168.200.1.
If all neighbor routers are directly attached, then it doesn't matter: we can distinguish by looking at the sending IP address.
[Figure: CP (RM, RA; zebra, OSPF, flexroutd) with socket and interfaces; port processors P0 through P7 around the WUGS; VCs 52 through 55; packet with src 192.168.100.5, dst 192.168.205.1, sport 5050, dport 89, data.]

Example: Processing Route Updates
OSPFd notifies Zebra of any route updates.
Zebra provides a common framework for interacting with the system.
Zebra passes route updates to the MSR Route Manager.
[Figure: CP (RM, RA; zebra, OSPF, flexroutd) with virtual interfaces; port processors P0 through P7 around the WUGS; VCs 52 through 55.]

Route Distribution and Port Updates
Add the destination vector and output port/VC.
"Broadcast" the update to the port processors.
[Figure: CP (RM, RA; zebra, OSPF, flexroutd) with virtual interfaces, and port processors around the WUGS; example destination 192.168.205.1.]

Presentation Overview
Overview of project and status
System architecture
IP forwarding in the MSR
Command protocol and debugging messages: sendcmd() and the CP-to-PP control path

MSR Command Overview
The CP may send commands to individual ports (SPCs) using the command protocol. Use the sendcmd utility.
A specific command type designates the SPC kernel module that is to receive the command message.
A sub-command may also be specified to further identify the particular operation to be performed.
Command protocol description:
  The Control Processor sends a command message to a specific port and expects to receive a reply message indicating either Success or Failure. This is termed a Command Cycle.
  There is also the notion of a Command Transaction, which may include one or more command cycles. A command transaction is terminated when the target (port) responds with a reply message containing an EOF.
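
The cycle/transaction distinction can be illustrated with a small simulation. Nothing here reproduces the real sendcmd utility or its message formats: the Reply class, the FakePort stand-in and run_transaction are all hypothetical, and the port is modelled as answering each cycle with one chunk of output.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    status: str       # "Success" or "Failure"
    eof: bool         # True on the reply that ends the transaction
    data: str = ""


class FakePort:
    """Stand-in for a port: returns one chunk per command cycle."""

    def __init__(self, chunks):
        self.chunks = list(chunks)

    def command_cycle(self, cmd, subcmd):
        chunk = self.chunks.pop(0)
        return Reply("Success", eof=not self.chunks, data=chunk)


def run_transaction(port, cmd, subcmd, max_cycles=16):
    """Repeat command cycles until a reply carries EOF; return all replies."""
    replies = []
    for _ in range(max_cycles):
        reply = port.command_cycle(cmd, subcmd)   # one command cycle
        replies.append(reply)
        if reply.eof:
            return replies
    raise RuntimeError("transaction did not terminate")
```

A single-cycle command is simply a transaction whose first reply already carries EOF.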

Predefined Top-Level Commands
policy - manage the MSR policy object (see description)
port_init - set the local port number, enable DQ
fipl - send route updates to a port (see the fipmgr utility)
rp_pcu - send commands to the plugin control unit
rp_inst - send a message to a plugin instance
rp_class - send a message to the plugin base class
apic - send commands to the APIC driver
stats - report port-level statistics
cfy - classifier interface
set_debug/get_debug - set/get debug flags and mask
port_up, port_down, dq and perf - Not Implemented

Policy Object Commands
The policy object is referenced at runtime, so modifications will generally have an immediate effect.
get_XXX returns the current value; set_XXX sets it to a new value.
get_gen/set_gen - enable/disable the general classifier
get_iprt/set_fipl/set_simple - specify which route lookup algorithm to use: fipl or simple linear lookup
get_dflags/set_dflags - specify where to send debug messages: disable, print to local console or send to the CP
get_drr/set_drr - enable/disable DRR (Not Implemented)
get_dq/set_dq - enable/disable DQ (Not Implemented)
get_flags/set_flags - return or set all policy control flags

Plugin Framework Enhancements
Integrated with the command framework:
  Send command cells to the PCU: create instance, free instance, bind instance to filter, unbind instance
  Send command cells to particular plugin instances
  Send command cells to the plugin base class
Instance access to: plugin class, instance id, filter id
PCU reports describing any loaded classes, instances and filters

MSR RP_PCU Command
rp_pcu command sub-commands:
addfltr/remfltr - add/remove a packet filter at gate x
flist - port prints the current gate x filter list
bind/unbind - bind/unbind an instance to a filter/gate combination
create - create a plugin instance
free - remove a plugin instance
clist - port prints the current plugin class list
ilist - port prints the current instance list
load/unload - load a plugin (not implemented)
null - no-op; can be used as a ping operation

APIC Command
apic command - useful for debugging and validating behavior.
info - a verbose printing of low-level APIC register, descriptor and buffer state
resume - resume a specified suspended channel
desc - print information about a particular descriptor and any associated buffer
pace - change the per-VC pacing parameter
gpace - change the global pacing parameter

Classifier Command
cfy command - currently needed to perform necessary maintenance.
tbl_flush - flush (i.e. free) idle entries in the classifier table
rt_flush - flush (remove) cached routes in the classifier table entries
info - print a list of the active classifier table entries

Statistics Command
stats command - a potentially useful command set, but currently only nominally supported.
get_all - print all statistics using the DEBUG facility
reset - reset all counters

MSR DEBUG Command
Debug facility: mechanisms for sending text messages; defines a category and level.
Integrated with the command framework.
Dynamic setting of the debug mask: affects which messages are sent, based on the category and level masks (set_debug or get_debug command).
Valid debug categories/modules: apic, ipfwd, ingress, egress, iprx, iptx, mem, dq, stats, ctl, conf, kern, natm, pcu, plugin, classify, perf, atmrx, atmtx, buff.
Valid debug levels: 0 - 255; predefined: verbose, warning, error, critical.
Interface in the MSR kernel:
  MSR_DEBUG((MSR_DEBUG_<category>|MSR_DEBUG_LEVEL_<level>, "fmt", args));
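
One way such a mask could work is sketched below, purely as an illustration. The slide does not say how the category and level masks are encoded, so this sketch assumes one bit per category and a minimum-level threshold. The numeric values given to the predefined levels and the DebugFilter class are hypothetical.

```python
CATEGORIES = [
    "apic", "ipfwd", "ingress", "egress", "iprx", "iptx", "mem", "dq",
    "stats", "ctl", "conf", "kern", "natm", "pcu", "plugin", "classify",
    "perf", "atmrx", "atmtx", "buff",
]
CATEGORY_BIT = {name: 1 << i for i, name in enumerate(CATEGORIES)}

# Hypothetical values: the slide names these levels but gives no numbers.
LEVELS = {"verbose": 0, "warning": 64, "error": 128, "critical": 255}


class DebugFilter:
    """Decides whether a message of a given category and level is sent."""

    def __init__(self):
        self.category_mask = 0
        self.min_level = 255

    def set_debug(self, categories, min_level):
        """Enable the given categories at or above min_level (0-255)."""
        if not 0 <= min_level <= 255:
            raise ValueError("level must be in 0-255")
        self.category_mask = 0
        for name in categories:
            self.category_mask |= CATEGORY_BIT[name]
        self.min_level = min_level

    def get_debug(self):
        return self.category_mask, self.min_level

    def should_send(self, category, level):
        enabled = bool(self.category_mask & CATEGORY_BIT[category])
        return enabled and level >= self.min_level
```

Changing the mask at runtime then changes which messages are emitted without touching the call sites.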

Example: Sending Cmd to Port
sendcmd(): create plugin instance; port id = 0, PluginID = 200.
The msr_ctl command cell is sent to the port, which looks up the sub-command, performs the function call and then reports the results.
reply(): plugin instance created; Status, Instance ID.
The command completion status is reported to the application.
[Figure: ports P0 through P7 (SPC/FPX) and the CP around the WUGS, attached to networks 192.168.200.X through 192.168.207.X via next/previous hops.]