Washington WASHINGTON UNIVERSITY IN ST LOUIS (SPC) Port-Level Processing: the MSR Kernel Fred Kuhns.


1 Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu http://www.arl.wustl.edu/~fredk (SPC) Port-Level Processing: the MSR Kernel Fred Kuhns Washington University Applied Research Laboratory

2 2 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Overview Introduction to hardware environment APIC core processing and buffer management Overview of SPC kernel software architecture and processing steps Plugin environment and filters Command Facility

3 3 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Control Processor Switch Fabric ATM Switch Core Port Processors FPX SPC LC IPPOPP FPX SPC LC IPPOPP FPX SPC LC IPPOPP FPX SPC LC IPPOPP FPX SPC LC IPPOPP FPX SPC LC IPPOPP Line Cards (link interfaces) Port Processors: SPC and/or FPX

4 Using Both an FPX and an SPC
[Diagram labels: APIC, IP Classifier, DQ Module, NID, X.1, Z.2, shim, Active processing, SPC, FPX, Flow Control]
The shim contains the results of the classification step.

5 5 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Focus on SPC as Port Processor Control Processor Switch Fabric... Flow/Route Lookup Dist. Q. Ctl. Output Port Proc. Flow Lookup Input Port Proc. Flow/Route Lookup Dist. Q. Ctl. Flow Lookup SPC

6 6 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 The SPC: an Embedded Processor Switch Interface Link Interface Serial Ports APIC CPU Module PCI Bus System FPGA DRAM

7 7 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Typical Pentium PC Architecture CPU North- Bridge CacheDRAM SouthBridge (PIIX3) (PIC, PIT, …) PCI Bus ISA Bus PCI Devices ISA Devices BIOS Super-IO BIOS RTC Uarts Kbd/Mse Floppy Parallel... Addr/DataCtrl Addr/Data/Ctrl Intr NMI INIT

8 8 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 System FPGA Intel Embedded Module SPC Hardware Architecture CPU North- Bridge CacheDRAM PCI Bus APIC Addr/DataCtrl Addr/Data/Ctrl Intr NMI INIT PITPICRTC’ BIOS ROM UART1 Interface UART2 Interface UART1 UART2 Link Interface Switch Interface

9 9 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 SPC Components APIC - PCI Bus Master Pentium Embedded Module –166 MHz MMX Pentium Processor L1 Cache: 16KB Data, 16KB Code L2 cache: 512 KB –NorthBridge 33 MHz, 32 bit PCI Bus PCI Bus Master System FPGA - PCI Bus Slave –Xilinx XC4020XL-1 FPGA –20K Equivalent Gates, ~ 75% used

10 10 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 SPC Components (continued) Memory –EDO DRAM –64MB (Max for current design) –SO DIMM Switch Interface - 1 Gb Utopia Link Interface - 1 Gb Utopia UART –Two Serial Ports NetBSD system console TTY port

11 11 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Overview Introduction to hardware environment APIC core processing and buffer management Overview of SPC kernel software architecture and processing steps Plugin environment and filters Command Facility

12 APIC Descriptors
The APIC uses a data structure called a descriptor to describe available buffers and their status.
The hardware and software follow a well-defined protocol for jointly managing the descriptors.
The APIC controls one or more free descriptor chains. Each chain represents the buffers available for receive on a predefined set (one or more) of Rx channels.

13 APIC Descriptors and Buffers
Descriptor fields (bit ruler 0, 8, 16, 24, 31):
–Match/Checksum and flags (V I S O E C L X T Y)
–BufLen (buffer length or amount left unused) and NextDesc (index into Desc Table)
–BufAddrLo, BufAddrHi (physical address of data buffer)
Flags: O - Read Only, E - EOF, C - CRC OK, T - Type, Y - Valid Bits
Buffer sizing:
–Frame must be a multiple of 48 B. Buffers are 2048 B.
–Max frame size = 2016 B, or 42 cells.
–Reserve 8 B for the shim and 8 B for the trailer, so the IP datagram MTU must be 2000 B.
–At the output port, a max 2016 B frame is received at an offset of 8 bytes in the buffer. At most 2024 B of the buffer are used; the 24 B at the end of the buffer are not used.
AAL5 frame contents: Shim; LLC (AA.AA.03), OUI (00.00.00), Type (08.00); IP header (Version, H-len, TOS, Total length, Identification, flags, Fragment offset, TTL, protocol, Header checksum, Source Address, Destination Address, Options ??); IP data (transport header and transport data); AAL5 padding (0 - 40 bytes); CPCS-UU (0); Length (IP packet + LLC/SNAP); CRC
MSR Buffer (2048 B): shim offset (not used on egress), shim, IP datagram, AAL5 padding and trailer, 24 bytes not used
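
The buffer arithmetic on this slide can be checked with a few lines of C. This is only a sketch: the constant and function names are hypothetical, and it assumes a frame is made of the shim, the IP datagram, padding and the 8-byte trailer, which is what the slide's numbers suggest.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes taken from the slide; the names are illustrative only. */
enum {
    MSR_BUF_SIZE  = 2048, /* bytes per MSR buffer */
    ATM_CELL_DATA = 48,   /* payload bytes per ATM cell */
    SHIM_SIZE     = 8,    /* reserved for the shim */
    AAL5_TRAILER  = 8,    /* reserved for the AAL5 trailer */
    MAX_FRAME     = 2016, /* 42 cells */
    EGRESS_OFFSET = 8     /* receive offset used at the output port */
};

/* Frame length for an IP datagram of ip_len bytes: shim + datagram +
 * trailer, rounded up to a whole number of cells. Returns 0 when the
 * result would not fit in the maximum frame size. */
static size_t msr_frame_len(size_t ip_len)
{
    assert(MAX_FRAME % ATM_CELL_DATA == 0);
    size_t raw   = SHIM_SIZE + ip_len + AAL5_TRAILER;
    size_t cells = (raw + ATM_CELL_DATA - 1) / ATM_CELL_DATA;
    size_t frame = cells * ATM_CELL_DATA;
    return frame <= (size_t)MAX_FRAME ? frame : 0;
}

/* Bytes left unused at the end of a buffer on egress. */
static size_t msr_egress_slack(void)
{
    return (size_t)(MSR_BUF_SIZE - (EGRESS_OFFSET + MAX_FRAME));
}
```

With these definitions, a 2000 B datagram yields exactly the 2016 B maximum frame, and the egress slack comes out to the 24 unused bytes quoted on the slide.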

14 Descriptor Notes
Flag bits (V I S O E C L X T Y):
–V = Volatile Buffer
–I = Interrupt/Notify on Read
–S = SAM Enable
–O = Read Only
–E = End of Frame
–C = CRC OK, RX
–L = Loss Priority (CLP of last cell), RX
–X = Congestion indication from last cell's PTI, RX
–T = BufType: 0 -> Data; 1 -> RM; 2 -> segment OAM; 3 -> end-2-end OAM
–Y = Sync: 0 -> Done, Valid Link; 1 -> Done, InValid Link; 2 -> Not Ready; 3 -> Ready
Possible values for First Word:
–CAFE0083 = Tx, EoF, Ready (Driver)
–CAFE0080 = Tx, EoF, DoneValidLink (APIC)
–CAFE0002 = Tx, NotReady
–CAFE0003 = Rx, Ready, No Interrupt on Read
–CAFE0403 = Rx, Ready, Interrupt on Read
–XXXX00C0 = Rx, EoF, CRC OK, DoneValidLink
–XXXX0040 = Rx, CRC OK, DoneValidLink
–XXXX00C1 = Rx, EoF, CRC OK, DoneInValidLink
–XXXX0041 = Rx, CRC OK, DoneInValidLink
Descriptor words: Match | Flags; Size | Next; Low32Addr. BufAddrLo/BufAddrHi hold the physical address of the data buffer; NextDesc is an index into the Desc Table.
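
The example first-word values can be decoded with a handful of masks. The sketch below is not taken from the APIC documentation: the bit positions of E, C, I and Y are inferred from the example values on the slide (for instance CAFE0403 = Rx, Ready, Interrupt on Read), and the positions of the remaining flags are assumed from the V I S O E C L X T Y ordering. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Flag masks for the low 16 bits of the first descriptor word.
 * Inferred from the slide's examples, not from an APIC manual. */
#define DESC_Y_MASK  0x0003u /* sync state, 2 bits */
#define DESC_T_MASK  0x000Cu /* buffer type, 2 bits (assumed position) */
#define DESC_T_SHIFT 2
#define DESC_X       0x0010u /* congestion indication (assumed position) */
#define DESC_L       0x0020u /* loss priority (assumed position) */
#define DESC_C       0x0040u /* CRC OK */
#define DESC_E       0x0080u /* end of frame */
#define DESC_O       0x0100u /* read only (assumed position) */
#define DESC_S       0x0200u /* SAM enable (assumed position) */
#define DESC_I       0x0400u /* interrupt/notify on read */
#define DESC_V       0x0800u /* volatile buffer (assumed position) */

enum desc_sync {
    DESC_DONE_VALID_LINK   = 0,
    DESC_DONE_INVALID_LINK = 1,
    DESC_NOT_READY         = 2,
    DESC_READY             = 3
};

static enum desc_sync desc_sync_state(uint32_t word0)
{
    return (enum desc_sync)(word0 & DESC_Y_MASK);
}

static unsigned desc_buf_type(uint32_t word0)
{
    return (word0 & DESC_T_MASK) >> DESC_T_SHIFT;
}

/* Upper 16 bits: match value (0xCAFE when written by the driver)
 * or the checksum reported on receive. */
static uint16_t desc_match(uint32_t word0)
{
    return (uint16_t)(word0 >> 16);
}

/* True when the APIC is done with the buffer (Y is 0 or 1) and the
 * buffer ends a frame whose CRC was good. */
static int desc_rx_frame_ok(uint32_t word0)
{
    assert(desc_sync_state(word0) <= DESC_READY);
    return desc_sync_state(word0) <= DESC_DONE_INVALID_LINK
        && (word0 & (DESC_E | DESC_C)) == (DESC_E | DESC_C);
}
```

A driver would typically check `desc_rx_frame_ok()` on the descriptor at the head of a receive queue before handing the buffer to IP processing.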

15 15 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 APIC Descriptors 0xCAFE- - BufAddrLo BufAddrHi 2016NextDesc VIS O0O0 E0E0 C0C0 LX T 00 Y 11 0xCAFE- - BufAddrLo BufAddrHi 2016NextDesc VIS O0O0 E0E0 C0C0 LX T 00 Y 11 buffer Pool X Chain Head Free Descriptor chain used by APIC during receive,each descriptor contains the physical address of an available buffer.

16 16 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Descriptors on a Receive Queue checksum- - BufAddrLo BufAddrHi 0NextDesc VIS O0O0 E0E0 C0C0 LX T 00 Y 00 checksum- - BufAddrLo BufAddrHi 1016NextDesc ??? VIS O0O0 E1E1 C1C1 LX T 00 Y 00 VC 101’ Queue

17 17 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 RX Descriptor to Buffer Mapping Match/Checksum- - BufAddrLo BufAddrHi BufLenNextDesc VISOECLXTYMatch/Checksum- - BufAddrLo BufAddrHi BufLenNextDesc VISOECLXTYMatch/Checksum- - BufAddrLo BufAddrHi BufLenNextDesc VISOECLXTYMatch/Checksum- - BufAddrLo BufAddrHi BufLenNextDesc VISOECLXTYMatch/Checksum- - BufAddrLo BufAddrHi BufLenNextDesc VISOECLXTY Buffer j 0 j+1 1 j+2 j+3 j+N N 2KB Descriptors Buffers (replace Mbufs)

18 18 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Descriptor Layout aal5rx_start aal5tx_start aal5rx_end aal5_count aal0rx_start aal0rx_end aal0tx_start aal0tx_end aal0_count RX channel 0, aal0_count_vci RX channel 1, aal0_count_vci TX channel 0, aal0_count_vci TX channel 1, aal0_count_vci local_count Index Starting address := desc_area unallocated msr_descr_count RX/TX Shared IP Packet Buffers RX - Cell Buffers TX - Cell Buffers *aal5_pool *aal0_pool aal0_count local_start local_end Invalid Descriptor

19 19 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Descriptor & Buffer Relationships Rx Tx Descriptor Table (DT) current rx offset Descriptors MSR Buffers (MB) Buffers notification processing ATM hdr conn status current desc resume pacing Tx processing conn status current desc Rx channel Global registers APIC port 0 port 1 port 2 Rx desc bound (same offset) to specific buffer Tx Offset same as rx offset TX desc allocated dynamically and bound to the RX desc and buffer MSR Buffer Headers Buf Hdrs same as buf offset

20 20 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Receiving a Packet Rx Tx Descriptor Table (DT) indx DT base MSR Buffers (MB) indx MB base notification processing ATM hdr conn status current desc resume pacing Tx processing conn status current desc Rx channel Global registers APIC port 0 port 1 port 2 Driver and IP code cell 2) APIC writes Rx’ed AAL5 frame to buffer referenced by new Rx desc. 1) AAL5 frame is received: APIC allocates and reads desc from RX pool. Then the previous Rx desc is written back (updated).

21 21 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Completing the Receive Rx Tx Descriptor Table (DT) indx DT base IP hdr IP data MSR Buffers (MB) indx MB base notification processing ATM hdr conn status current desc resume pacing Tx processing conn status current desc Rx channel Global registers APIC port 0 port 1 port 2 Driver and IP code 3) Last: Assert Interrupt 1) APIC writes (updates) current desc. 2) APIC updates notification register APIC disables interrupts on Rx channel

22 22 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Sending Packet Rx Tx Descriptor Table (DT) indx DT base IP hdr IP data MSR Buffers (MB) indx MB base notification processing ATM hdr conn status current desc resume pacing Tx processing conn status current desc Rx channel Global registers APIC port 0 port 1 port 2 IP Lookup Table Driver and IP code 2) a) write to current desc’s next index b) Write to resume Tx channel register cell 3) APIC sends (reads) packet and interrupts when done 1) allocate Tx desc and bind to Rx desc and buffer

23 23 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Overview Introduction to hardware environment APIC core processing and buffer management Overview of SPC kernel software architecture and processing steps Plugin environment and filters Command Facility

24 24 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Broadcast Report plugin Plugin Plugin Environment Exact Match General Match Classifier Route Lookup (FIPL, Simple) interrupt... SP 1 SP 2 SP N Commands DQ Reports VOQ 0 VOQ 1... VOQ 7 APIC TX Qs: DQ Adjusts VOQ Pacing Sub Port 1 Sub Port 0 Sub Port 2 Sub Port 3 Paced APIC TX queues DRR Service... handler(): send budget per interval Ingress/ Egress ? CP command processor and debug message SPC Software Architecture commands command reply debug messages periodic callback interrupt (D  sec) Read DQ Report Cells APIC Ingress Egress IP processing insert/process shim APIC Specific Driver Code handler(): read cells, set pacing, broadcast report DQ Service

25 25 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 SPC Data Path - Simplified View... DQ/ In Queuing plugin Plugin Plugin Environment... DRR/ Out Queuing Flow Classifier/ (channel map) Route Lookup (Shim, FIPL, Simple, cache)... Frame/Buffer and IP Processing Ingress/ Egress ? NM Filter

26 26 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 DQ Reports Distributed Queuing callback: read cells, set pacing, broadcast report VOQ 0 VOQ 1... VOQ 7 APIC TX Qs: DQ Adjusts VOQ Pacing SPC Input (Ingress) Processing periodic callback interrupt (D  sec) Read DQ Report Cells IP Processing: Insert InterPort Shim APIC Specific Driver Code APIC Flow Classifier/ (channel map) Route Lookup (FIPL, Simple) NM Filter PCU Framework X.1Z.1 W.1 Manage X.2Y.1 Z.2 Local Resource Manager and PCU Interface Plugin Environment IP Options Replace IntraShim with InterShim. Update trailer and IP header SP 1 SP 2 SP 4 SP 3 interrupt APIC Broadcast Report

27 Input (Ingress) Processing
[Same diagram as slide 26, annotated with the processing steps]
–APIC RX: the AAL5 frame (IP dgram, padding, trailer) is written into an MSR Buffer (2KB) at the Rx offset, leaving room for the shim.
–Insert Shim. Intra/Inter Port Shim fields: Flags, Stream Identifier, Input VIN, Output VIN, Not Used. Intra Port Shim Flags: AFNROPUK X. VIN Format: PN (10 bits), SPI (6 bits).
–General Match Filter: Linear search using the 5-tuple {src_addr, dst_addr, src_port, dst_port, proto}. A match maps a flow to one or more plugin instances (filter 1 ... filter 10, each with instance slots i1 ... i5). Search, then invoke the instance handler.
–Exact Match Classifier: a hash of the IP header fields (Source Address, Destination Address, Source Port, Destination Port) selects an entry in the Flow Table. Hash field widths and offsets are configurable: msr/msr_classify.h. The route is cached in the flow entry. If none, call ip lookup (fipl/simple).
–Set input and output VIN in the Shim, calculate the aal5 length, decrement the ip ttl, calculate the IP header checksum. Place in APIC TX queue.
–Frame layout: shim (8 Bytes); IP header (Version, H-length, TOS, Total length, Identification, Flags, Fragment offset, TTL, Protocol, Header checksum, Source Address, Destination Address); IP data (transport header and transport data); AAL5 padding (0 - 40 bytes); CPCS-UU (0); Length (IP packet + LLC/SNAP); CRC (APIC calculates and sets)
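
The "decrement ip ttl, calculate IP header checksum" step can be sketched as follows. This is a generic IPv4 header fix-up working on raw header bytes (TTL at byte 8, checksum at bytes 10 and 11), not the MSR kernel's own code; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum over the IPv4 header (RFC 1071 style).
 * hdr points at the header, hlen is its length in bytes. */
static uint16_t ip_hdr_cksum(const uint8_t *hdr, size_t hlen)
{
    uint32_t sum = 0;
    assert(hlen >= 20 && hlen % 4 == 0);
    for (size_t i = 0; i < hlen; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    while (sum >> 16)                 /* fold carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Decrement the TTL and rewrite the header checksum in place.
 * Returns 0 on success, -1 if the TTL would expire. */
static int ip_forward_fixup(uint8_t *hdr)
{
    size_t hlen = (size_t)(hdr[0] & 0x0F) * 4; /* IHL in 32-bit words */
    if (hdr[8] <= 1)
        return -1;
    hdr[8]--;                         /* TTL */
    hdr[10] = 0;                      /* checksum field is zero while summing */
    hdr[11] = 0;
    uint16_t ck = ip_hdr_cksum(hdr, hlen);
    hdr[10] = (uint8_t)(ck >> 8);
    hdr[11] = (uint8_t)(ck & 0xFF);
    return 0;
}
```

Running `ip_hdr_cksum()` over a header whose checksum field is already filled in returns 0 when the header is valid, which gives a cheap sanity check after the fix-up.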

28 28 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 interrupt Classifier IP processing: process shim APIC Specific Driver Code... SP 1 SP 2 SP N APIC Sub Port 1 Sub Port 0 Sub Port 2 Sub Port 3 Paced APIC TX queues DRR Service... handler: send budget per flow APIC periodic callback interrupt (D  sec) PCU Framework X.1Z.1 W.1 Manage X.2Y.1 Z.2 Local Resource Manager and PCU Interface Plugin Environment IP Options Flow Classifier/ (channel map) NM Filter Determine Out VC Remove Shim update AAL5 trailer and IP header Output Port (Egress) Processing DQ report Tx queue lengths

29 Output Port (Egress) Processing
[Same diagram as slide 28, annotated with the processing steps]
–APIC RX: the AAL5 frame is written into an MSR Buffer (2KB): Rx offset, shim, IP, trailer, padding.
–Verify the Shim and adjust buffer and header references.
–General and Exact match classifiers are the same as on ingress, except that the route is obtained from the output VIN in the Shim.
–Remove Shim for TX: adjust buffer, update trailer, update ip hdr (TX offset, shim, IP, trailer, padding).
–Place in the DRR queue for this flow (referenced by the flow entry).
–Every D  sec the DRR handler is executed. It sends up to MAX bytes per period (minus backlog), sharing the available BW among the active flows.
–APIC output channels are paced such that their sum is the effective link bandwidth.

30 30 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 What about Ethernet? Host 1 Host 2 Host N MSR Router Ethernet Switch Router

31 31 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 GigE Link Interface - Egress ARP Table (M Entries) MACIP IP 1 MAC 1 IP M MAC M... NH Table (4 entries) IPVC VC 1 IP 1 VC 4 IP 4... 65 = SP 1 66 = SP 2... 64+N = SP N to NH 64 = SP 0 to ES if VC != 64, Lookup VC in NH table returns IP used for ARP lookup (support N = 4) if VC = 64, Lookup IP destination address in packet header IP Header data AAL5 trailer IP Header data Ethernet Add Ethernet header using destination address from ARP table. Add our Ethernet source address. Maintain ARP table by snooping, sending ARPs and responding to ARP broadcasts. Software creates NH table at boot time. From FPX/SPC To Next Hop or Endstation In development

32 32 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 GigE Link Interface - Ingress ARP Table (M Entries) MACIP IP 1 MAC 1 IP M MAC M... 64 = SP 0 to FPX/SPC IP Header data AAL5 trailer IP Header data Ethernet From Next Hop or Endstation To FPX/SPC If source MAC in table then verify else add If broadcast and ARP, process ARP else if broadcast and IP broadcast goto Deliver else if multicast and IP multicast goto Deliver else if not our destination MAC address drop else if IP unicast Deliver Remove Ethernet Header Encapsulate in AAL5 frame Send to switch on default VC (VC = 64) In development
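
The ingress decision chain on this slide can be written as a small classification function. The sketch below is an illustration only: the names are hypothetical, the source-MAC table update is omitted, only the limited broadcast address 255.255.255.255 is treated as an IP broadcast, and frames the slide does not mention (for example unicast ARP) are dropped.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum ingress_action { ING_DROP, ING_PROCESS_ARP, ING_DELIVER };

#define ETYPE_IPV4 0x0800u
#define ETYPE_ARP  0x0806u

static int mac_is_broadcast(const uint8_t *m)
{
    static const uint8_t bcast[6] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
    return memcmp(m, bcast, 6) == 0;
}

/* Group bit set in the first octet, but not the broadcast address. */
static int mac_is_multicast(const uint8_t *m)
{
    return (m[0] & 0x01) && !mac_is_broadcast(m);
}

/* IPv4 addresses are in host byte order here. */
static int ip_is_multicast(uint32_t a) { return (a >> 28) == 0xEu; }
static int ip_is_broadcast(uint32_t a) { return a == 0xFFFFFFFFu; }

/* Decide what to do with a received Ethernet frame. ING_DELIVER means:
 * remove the Ethernet header, encapsulate in an AAL5 frame and send to
 * the switch on the default VC (VC = 64). */
static enum ingress_action gige_ingress(const uint8_t dst_mac[6],
                                        const uint8_t our_mac[6],
                                        uint16_t ethertype, uint32_t dst_ip)
{
    assert(dst_mac != NULL && our_mac != NULL);
    if (mac_is_broadcast(dst_mac)) {
        if (ethertype == ETYPE_ARP)
            return ING_PROCESS_ARP;
        if (ethertype == ETYPE_IPV4 && ip_is_broadcast(dst_ip))
            return ING_DELIVER;
        return ING_DROP;
    }
    if (mac_is_multicast(dst_mac))
        return (ethertype == ETYPE_IPV4 && ip_is_multicast(dst_ip))
                   ? ING_DELIVER : ING_DROP;
    if (memcmp(dst_mac, our_mac, 6) != 0)
        return ING_DROP;              /* not our destination MAC address */
    return ethertype == ETYPE_IPV4 ? ING_DELIVER : ING_DROP;
}
```

Returning an action code rather than acting directly keeps the decision logic separate from the ARP and AAL5 encapsulation paths.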

33 33 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Overview Introduction to hardware environment APIC core processing and buffer management Overview of SPC kernel software architecture and processing steps Plugin environment and filters Command Facility

34 Packet Classification & Plugins
Classification provides an opportunity to bind flows to registered plugin instances.
General classifier - Network Management
–classification using the 5-tuple: prefix match on address, exact match on port and proto; 0 is a wildcard for all fields
–input and output ports
–filters added/removed via the command facility

35 Flow Bound to a Plugin
[Data path diagram as on slide 25: Frame/Buffer and IP Processing, NM Filter, Flow Classifier/(channel map), Route Lookup (Shim, FIPL, Simple, cache), Plugin Environment, DQ/In Queuing, DRR/Out Queuing]
instance->handle_packet(instance, packet, flags): call the packet handler for the bound instance with a pointer to the IP packet (struct ip *) inside the AAL5 frame.
handle_packet(inst, pkt, flags) { /* Plugin may read and/or modify content but not delete it unless COPY. On return the framework forwards packet */ ... return; }
AAL5 Frame: Shim; IP header (Version, H-len, TOS, Total length, Identification, flags, Fragment offset, TTL, protocol, Header checksum, Source Address, Destination Address, Options ??); IP data (transport header and transport data); AAL5 padding (0 - 40 bytes); CPCS-UU (0); Length (IP packet + LLC/SNAP); CRC
General Match Classifier: Linear search of {src_addr, dst_addr, src_port, dst_port, proto} over Rule 1 ... Rule 10, each with instance slots i1 ... i5. Search, invoke the instance handler, then send the packet to the exact match classifier. General Classifier options: {First, Last, All}. Rule Actions: {Deny, Permit, Active}. Rule flags {All, Copy, Stop}.
Exact Match Classifier: Hash {src_addr, dst_addr, src_port, dst_port}, then linear search for the flow spec in the Flow Table. Flow entry to plugin has a one-to-one relationship (e.g. Instance 1 {Active}). Exact Match Classifier options: None. Rule Actions: {Deny, Permit, Active, Reserve}. Rule flags {Pinned, Idle, Remove}.
Exact Match: active processing is the same as for general match. The AAL5 length and IP header checksum are calculated by the framework so the plugin does not have to perform these operations.

36 36 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Search Rule 1 Rule 2 Rule 3 Rule 4 Rule 5 Rule 6 Rule 7 Rule 8 Rule 9 Rule 10 i1i2i3i4i5 i1i2i3i4i5 i1i2i3i4i5 i1i2i3i4i5 i1i2i3i4i5 i1i2i3i4i5 i1i2i3i4i5 i1i2i3i4i5 i1i2i3i4i5 i1i2i3i4i5 i1i2i3i4i5 Invoke instance handler General Match Classifier: Linear search of {src_addr, dst_addr, src_port, dst_port, proto} General Classifier options: {First, Last, All} Rule Actions: {Deny, Permit, Active}. Rule flags {All, Copy, Stop} General Match Classifier Notes
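
A first-match version of the general match search can be sketched as below. The rule layout and all names are hypothetical; the sketch assumes addresses in host byte order, prefix match on addresses, exact match on ports and protocol, and 0 as the wildcard for every field, as described for the general classifier. The Last and All options and the rule actions are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct five_tuple {
    uint32_t src, dst;        /* IPv4 addresses, host byte order */
    uint16_t sport, dport;
    uint8_t  proto;
};

struct gm_rule {
    uint32_t src, dst;        /* address prefixes; 0 = wildcard */
    uint8_t  src_len, dst_len;/* prefix lengths in bits */
    uint16_t sport, dport;    /* 0 = wildcard */
    uint8_t  proto;           /* 0 = wildcard */
};

static int prefix_match(uint32_t addr, uint32_t prefix, uint8_t len)
{
    if (prefix == 0 || len == 0)
        return 1;                         /* wildcard */
    assert(len <= 32);
    /* A shift by 32 is undefined in C, so handle /32 separately. */
    uint32_t mask = (len == 32) ? 0xFFFFFFFFu : ~(0xFFFFFFFFu >> len);
    return (addr & mask) == (prefix & mask);
}

static int gm_rule_matches(const struct gm_rule *r, const struct five_tuple *p)
{
    return prefix_match(p->src, r->src, r->src_len)
        && prefix_match(p->dst, r->dst, r->dst_len)
        && (r->sport == 0 || r->sport == p->sport)
        && (r->dport == 0 || r->dport == p->dport)
        && (r->proto == 0 || r->proto == p->proto);
}

/* Linear search, "First" option: index of the first matching rule,
 * or -1 when no rule matches. */
static int gm_first_match(const struct gm_rule *rules, size_t n,
                          const struct five_tuple *p)
{
    for (size_t i = 0; i < n; i++)
        if (gm_rule_matches(&rules[i], p))
            return (int)i;
    return -1;
}
```

A linear search is reasonable here because the table is small (ten rules on the slide) and rules are added or removed only through the command facility.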

37 37 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Flow Table flow hash Instance 1 {Active} Flow entry to plugin has a one-to-one relationship General Match Classifier: Linear search of - {src_addr, dst_addr, src_port, dst_port, proto}. Exact Match Classifier options: None. Rule Actions: {Deny, Permit, Active, Reserve}. Rule flags {Pinned, Idle, Remove} Exact Match Classifier Notes
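
The exact match lookup (hash, then linear search for the flow spec) can be sketched as a chained hash table. Everything here is an assumption made for illustration: the bucket count, the XOR-fold hash and the structure names are not from msr/msr_classify.h, where the real hash field widths and offsets are configured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLOW_BUCKETS 256u      /* hypothetical table size, power of two */

struct flow_key {
    uint32_t src, dst;
    uint16_t sport, dport;
    uint8_t  proto;
};

struct flow_entry {
    struct flow_key    key;
    void              *instance; /* bound plugin instance, at most one */
    struct flow_entry *next;     /* chain of entries in the same bucket */
};

struct flow_table {
    struct flow_entry *bucket[FLOW_BUCKETS];
};

/* Hash over {src_addr, dst_addr, src_port, dst_port}. */
static unsigned flow_hash(const struct flow_key *k)
{
    uint32_t h = k->src ^ k->dst ^ (((uint32_t)k->sport << 16) | k->dport);
    h ^= h >> 16;
    h ^= h >> 8;
    return h & (FLOW_BUCKETS - 1u);
}

static int flow_key_eq(const struct flow_key *a, const struct flow_key *b)
{
    return a->src == b->src && a->dst == b->dst &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

static void flow_insert(struct flow_table *t, struct flow_entry *e)
{
    assert(t != NULL && e != NULL);
    unsigned b = flow_hash(&e->key);
    e->next = t->bucket[b];
    t->bucket[b] = e;
}

/* Hash to a bucket, then search its chain for the exact flow spec. */
static struct flow_entry *flow_lookup(const struct flow_table *t,
                                      const struct flow_key *k)
{
    for (struct flow_entry *e = t->bucket[flow_hash(k)]; e != NULL; e = e->next)
        if (flow_key_eq(&e->key, k))
            return e;
    return NULL;
}
```

Keeping a single `instance` pointer per entry mirrors the one-to-one relationship between a flow entry and a plugin instance.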

38 38 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Instance 1 {Active} Instance 2 {Active, All} Instance 1 {Deny} Rule N General/Exact Match Classifier Class A “plugin x” Class B “plugin y” Class C “plugin z” Rule P Instance 1 {Active} Plugin instance maps to at most one rule/filter. General classifier: rule maps to at most 5 instances. Exact match classifier: rule maps to at most 1 instance. Active Processing Environment

39 39 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Creating an Instance Class A classid = 100 inst_t *create_instance(class_t *, inst_id) Instance of Class A - (Base Class extended by Developer) class_t*class inst_t*next inst_idid fid_tbound_fid void (*handle_packet) (inst_t *, ip_t *, flag32_t); void (*bind_instance) (inst_t *); void (*unbind_instance) (inst_t *); void (*free_instance) (inst_t *); int (*handle_msg) (inst_t *, buf_t *, flag8_t, seq_t, len_t *)... create class instance Return reference to instance create_instance() Called by PCU framework in response to receiving command. struct my_inst { inst_t base; subclass defs };

40 Plugin Class Specific Interface
All plugins belong to a class. At run time a class (i.e. plugin) must be instantiated before it can be referenced.
The plugin is passed its instance pointer as the first argument (like C++).
The developer may extend the base class (struct rp_instance) to include additional fields which are local to each instance.
The plugin developer must implement the following methods:
–void (*handle_packet)(struct rp_instance *, struct ip *, u_int32_t);
–void (*bind_instance)(struct rp_instance *);
–void (*unbind_instance)(struct rp_instance *);
–void (*free_instance)(struct rp_instance *);
–int (*handle_msg)(struct rp_instance *, void *, u_int8_t, u_int8_t, u_int8_t);
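
Extending the base class might look like the sketch below: a plugin that counts the packets it sees. The struct rp_instance shown here is a simplified stand-in holding only the five method pointers listed above (the real structure has more fields), fixed-width types from stdint.h replace the BSD u_int*_t names, and the counter plugin with all its function names is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct ip;                          /* IPv4 header, used only via pointer */

/* Simplified stand-in for the base class named on the slide. */
struct rp_instance {
    void (*handle_packet)(struct rp_instance *, struct ip *, uint32_t);
    void (*bind_instance)(struct rp_instance *);
    void (*unbind_instance)(struct rp_instance *);
    void (*free_instance)(struct rp_instance *);
    int  (*handle_msg)(struct rp_instance *, void *, uint8_t, uint8_t, uint8_t);
};

/* Derived class: the base must be the first member so that a pointer
 * to the instance can be converted to and from the base pointer. */
struct counter_inst {
    struct rp_instance base;
    unsigned long packets;          /* per-instance state */
    int bound;
};

static void counter_handle_packet(struct rp_instance *self, struct ip *pkt,
                                  uint32_t flags)
{
    struct counter_inst *ci = (struct counter_inst *)self;
    assert(self != NULL);
    (void)pkt; (void)flags;         /* packet is left unmodified */
    ci->packets++;
}

static void counter_bind(struct rp_instance *self)
{
    ((struct counter_inst *)self)->bound = 1;
}

static void counter_unbind(struct rp_instance *self)
{
    ((struct counter_inst *)self)->bound = 0;
}

static void counter_free(struct rp_instance *self)
{
    free(self);
}

static int counter_handle_msg(struct rp_instance *self, void *buf,
                              uint8_t a, uint8_t b, uint8_t c)
{
    (void)self; (void)buf; (void)a; (void)b; (void)c;
    return 0;                       /* no messages supported in this sketch */
}

/* Allocate an instance and fill in its method table. */
static struct rp_instance *counter_create(void)
{
    struct counter_inst *ci = calloc(1, sizeof *ci);
    if (ci == NULL)
        return NULL;
    ci->base.handle_packet   = counter_handle_packet;
    ci->base.bind_instance   = counter_bind;
    ci->base.unbind_instance = counter_unbind;
    ci->base.free_instance   = counter_free;
    ci->base.handle_msg      = counter_handle_msg;
    return &ci->base;
}
```

The framework only ever sees the `struct rp_instance *`, so the per-instance fields stay private to the plugin.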

41 41 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Plugin Framework Enhancements Integrated with Command framework –send command cells to PCU: create instance, free instance, bind instance to filter, unbind instance –Send command cells to particular plugin instances –Send command cells to plugin base class Enhanced interface to address limitation noticed in crossbow: –instance access to: plugin class, instance id, filter id –pcu reports describing any loaded classes, instances and filters

42 42 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Overview Introduction to hardware environment APIC core processing and buffer management Overview of SPC kernel software architecture and processing steps Plugin environment and filters Command Facility

43 43 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Command Facility Highlights Overview High level description - Application Layer MSR Command Interface Overview Cell format and field definitions Example

44 Definitions
Session: Open connection between the CP and a specific SPC. Intended to represent open connections and command state.
Transaction: Represents a complete command. A transaction terminates when either an EOF is received by the CP or an error occurs.
EOF: End of File is returned to the CP when the last bit of command data is returned, in response to a Cancel message, or when an error occurs.

45 45 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Overview - Cmd Interface on CP Synchronous Request/Response protocol Timeout can be specified as well as the number of retries - Per session option –Essentially provides a reliable service –Issue: if no reply, cmd/reply msg lost in port, channel or CP. Retries may be a bad thing. Address - MSR Port and Command – Message destination - Callback function within the Port’s kernel (implements command)

46 46 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Command Interface on CP Types of messages: –New Command, Get Next set of reply data Command, Cancel Command –Error Reply, EOF Reply, Continued Reply Message Identifiers - Only requires a sequence number initialized to 0 for each New Command: –One sending entity on CP, –One outstanding command for each port, –Ports send exactly one reply msg per command msg, –Command must fit within one cell, –Replies may span multiple cells.

47 47 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Command Interface on Port Callback function registered with MSR kernel and called under 3 cases: –New Command Flags = Command; Sequence = 0; Length = valid bytes in buffer; Buffer = application data –Next Command Flags = Command | Next; Sequence = previous+1; Length = valid bytes in buffer; Buffer = application data –Cancel Command Flags = Command | Cancel; Sequence = previous+1; Length = 0; Buffer contains no valid data

48 48 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Command Interface on Port Callback function must: –Read from/Write to supplied buffer –Set length = Bytes written to buffer (in/out param) –Indicate if an error occurred (return -1) –Whether more data exists (return 0 => EOF, return > 0 => Not EOF, return ERROR | EOF) Framework: –generates reply message using same Command value and Sequence number. –sets flags indicating status (EOF, Error etc)
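
A port-side callback following these rules might look like the sketch below. It returns a 100-byte report in chunks of at most 44 bytes and picks the chunk from the sequence number, so a retried request gets the same data again. The argument list follows the callback signature given for the kernel API later in the deck; the flag bit values, the report buffer and the function name are placeholders, since the real definitions live in $SYS/msr/msr_ctl.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSR_DATA_MAX 44     /* data bytes per command cell */

/* Placeholder flag bits; the real values are not given on the slides. */
#define F_CMD    0x01u
#define F_NEXT   0x02u
#define F_CANCEL 0x04u
#define F_RETRY  0x08u

static uint8_t report[100]; /* hypothetical data returned by the command */

/* Returns > 0 when more data exists, 0 on EOF, -1 on error.
 * *dlen is set to the number of bytes written to buf. */
static int report_cb(void *buf, uint8_t flags, uint8_t seq, uint8_t *dlen)
{
    assert(buf != NULL && dlen != NULL);
    *dlen = 0;
    if (!(flags & F_CMD))
        return -1;                       /* not a command message */
    if (flags & F_CANCEL)
        return 0;                        /* close the transaction: EOF */

    /* The chunk is selected by sequence number alone, so NEXT and
     * RETRY requests need no extra state in the callback. */
    size_t off = (size_t)seq * MSR_DATA_MAX;
    if (off >= sizeof report)
        return -1;
    size_t n = sizeof report - off;
    if (n > MSR_DATA_MAX)
        n = MSR_DATA_MAX;
    memcpy(buf, report + off, n);
    *dlen = (uint8_t)n;
    return (off + n < sizeof report) ? 1 : 0;
}
```

Deriving the reply purely from the sequence number is one way to make retries harmless; a callback with side effects would instead need to check the RETRY flag explicitly.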

49 Failure Modes
Library support for lost messages:
–if (timeout > 0, retries > 0), then the CP API library will re-send with the RETRY flag set.
–if (timeout > 0, retries = 0 or all retries failed), then the API library returns an error to the application.
–if (timeout = 0 - No Timeout), then the send operation blocks indefinitely.
Lost Command message:
–if (timeout > 0 and retries > 0), the CP resends the command with the same sequence number but with the RETRY flag set. The command buffer and flags are passed to the callback fn.

50 50 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Failure Modes Lost Reply message, –if no retries, Any issues? –if retries then CP resends New Command - Port knows this is a duplicate command (RETRY flag). Application responsible for handling retries. If an issue can use unique message ids. Extreme case use a history (last reply message). Next Command - Port receives Command w/Sequence > 0, w/RETRY flag. Passed to application which chooses the correct course of action. The intent is to ensure there are no holes in the reply data received by the CP. Cancel message - same as Next command.

51 51 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Possible Enhancements Support asynchronous messaging: –Multiple outstanding commands per port –Asynchronous I/O on CP –Speed up boot process and dynamic configuration –Facilitates implementing port monitoring (ping or heartbeat) for fault detection and recovery. –two methods for reporting results: upcall - function registered by application is called when results arrive poll - application periodically polls library for results. Support Broadcast and/or Multicast

52 52 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 MSR Command Layer Simple messaging facility optimized for MSR. Command message (CP sends): –Sent by CP to a specific MSR port (unicast) –Must fit within one AAL0 cell. –Message header, includes: protocol version Command Sequence number flags –Application data follows header –Library implements Request/Reply protocol.

53 53 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 MSR Command Layer Reply Message (Port sends): –Port must send reply message in response to a Command message. –Reply message Header: version and sequence number: same as command msg. Includes application data and flags indicating if command was successful and if more data exists (EOF). –Application registers command specific callback function at port. –Callback function must conform to specified interface.

54 MSR Command Overview
Command Protocol description
–The Control Processor sends command messages to a specific port and expects to receive a reply message indicating either Success or Failure. This is termed a Command Cycle.
–A Command Transaction may include one or more command cycles. A command transaction is terminated when the target (port) responds with a reply msg containing an EOF.

55 MSR Command Overview
Command Protocol description, continued
–CP processing of a Reply msg depends on the EOF flag: If EOF is set then no further reply data is available and the command transaction is closed. If EOF is not set then there is remaining data and the command transaction is still open.
–If data remains (Not EOF), then the CP must follow with either a Next or a Cancel command message. The sequence number indicates the “chunk” of data to be returned. The Command indicates the message’s destination. sequence number = previous + 1
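
On the CP side, a command transaction is a loop of command cycles that ends at EOF. The sketch below shows that loop against a hypothetical transport function; it does not use the real library API, and the reply structure, the function-pointer type and the in-memory fake port (included only so the loop can be exercised) are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSR_DATA_MAX 44

struct reply {
    uint8_t seq, len;
    int     eof, error;
    uint8_t data[MSR_DATA_MAX];
};

/* Hypothetical transport: performs one command cycle (send one command
 * cell, wait for one reply). Returns 0 if a reply was received. */
typedef int (*cycle_fn)(void *port, uint8_t cmd, uint8_t seq, int next,
                        struct reply *out);

/* Run one command transaction and collect the reply data in dst.
 * Returns the number of bytes received, or -1 on any error. */
static long run_transaction(cycle_fn cycle, void *port, uint8_t cmd,
                            uint8_t *dst, size_t cap)
{
    size_t  total = 0;
    uint8_t seq = 0;                  /* a new command starts at 0 */
    assert(cycle != NULL && dst != NULL);
    for (;;) {
        struct reply r;
        if (cycle(port, cmd, seq, seq > 0, &r) != 0 || r.error)
            return -1;
        if (r.seq != seq || r.len > MSR_DATA_MAX || total + r.len > cap)
            return -1;
        memcpy(dst + total, r.data, r.len);
        total += r.len;
        if (r.eof)
            return (long)total;       /* transaction closed */
        if (seq == UINT8_MAX)
            return -1;                /* 8-bit sequence space exhausted */
        seq++;                        /* sequence number = previous + 1 */
    }
}

/* In-memory stand-in for a port, used only to exercise the loop. */
struct fake_port { const uint8_t *data; size_t size; };

static int fake_cycle(void *port, uint8_t cmd, uint8_t seq, int next,
                      struct reply *out)
{
    struct fake_port *p = port;
    size_t off = (size_t)seq * MSR_DATA_MAX;
    (void)cmd; (void)next;
    memset(out, 0, sizeof *out);
    out->seq = seq;
    if (off > p->size) { out->error = 1; out->eof = 1; return 0; }
    size_t n = p->size - off;
    if (n > MSR_DATA_MAX) n = MSR_DATA_MAX;
    memcpy(out->data, p->data + off, n);
    out->len = (uint8_t)n;
    out->eof = (off + n == p->size);
    return 0;
}
```

Checking that the reply carries the expected sequence number is what guarantees there are no holes in the data collected by the CP.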

56 APIC Cell Format: MSR Command Message
ATM/APIC header fields (bit ruler 0, 8, 16, 24): cid, ld, cc, pout, xxx, pin; cl, gfc, vpi, vci, pti
MSR Command header fields: ver, length, command/status, sequence number, flags
The cell payload contains the MSR Command. The Command header is 4 Bytes, leaving 44 Bytes for sub-commands and data.

57 ATM/APIC Header
pin (Ports-In) - Port the cell arrived on.
–Tx: not used (set to 000b)
–Rx: port the cell arrived on (is the below correct?): 001 Port 0, 010 Port 1, 100 Port 2, etc.
pout (Ports-Out) - Set of output ports.
–Tx: Command library sets: 001 Fiber/Link, 010 Ribbon/Switch, 011 Both, 101 Loopback MV0, 110 Loopback MV1
–Rx: Set by VCXT, see pin above.

58 ATM/APIC Cell Format
cc (Control Cell Indicator) - Not used, set to 0b.
ld (Low Delay) - Not used, set to 0b.
–Should we use low delay?
cid (Connection Identifier) - Set to vci value.
gfc (Generic Flow Control) - Set to 0000b.

59 ATM/APIC Cell Format
vpi (Virtual Path Identifier) - Set to 0x0.
vci (Virtual Circuit Identifier) - Equal to cid.
–See presentation on MSR configurations for a complete list of VCI assignments.
pti (Payload Type) - Set to 000b (data cell).
cl (Cell Loss Priority) - Set to 0b (High Priority).

60 MSR Command Header
Header fields: ver, length, command/status, sequence number, flags
Version (2 bits) - Protocol version. Allows for at most 4 versions. Current version set to 0.
–The field width was a trade-off with the length field.
Length (6 bits) - Number of valid data bytes.
–0 <= Length <= 44, so 6 bits are sufficient.
–This field is indirectly set by the application or command implementation. The CP library and kernel interfaces allow applications to pass a buffer pointer and indicate the number of valid data bytes.

61 MSR Command Header
Header fields: ver, length, command/status, sequence number, flags
Command/Status (8 bits) - CP inserts the command value, SPC/port inserts status information.
–Valid Commands are listed in $SYS/msr/msr_ctl.h, also see $MSR/utils/command/*.{c,h}
–The Library API on the CP accepts Command as an argument. The implementation in the kernel is an array of function pointers that uses Command as the index.
–Reply msg Status indicates success or an error code (Upcall, ATM, Cmd Invalid, Cmd Not Implemented, or Other Cmd Error).

62 MSR Command Header
Header fields: ver, length, command/status, sequence number, flags
Sequence Number (8 bits) - Is of primary use by the applications.
–When a command message is first sent, sequence = 0.
–If the reply does not include an EOF flag, then the CP increments sequence by one for each subsequent command message.
–When EOF is received the Command Transaction is complete and the sequence number is reset to 0.

63 MSR Command Header
Header fields: ver, length, command/status, sequence number, flags
Flags (8 bits) - Bit field, valid flags are:
–Invalid - flag = 0, should not occur
–CMD - cell contains a valid command from the CP
–REPLY - cell contains a reply from the Port
–ERROR - Reply only, error processing on the Port
–EOF - No reply data remains, end of cmd transaction
–NEXT - get next set of reply data
–CANCEL - cancel current cmd transaction
–RETRY - set if the CP resends a command after it was lost

64 CP Library API
Library API for applications on the CP:
–int sendcmd(int sid, int cmd, char *data, int flags, int *dlen)
–sid = session id
–cmd = Command to execute on port
–data = buffer pointer
–flags = RETRY (reply timeout), CANCEL (cancel current command), NEXT (get next set of reply data)

65 65 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 MSR Kernel API MSR kernel interface $SYS/msr/msr_ctl.{h,c} Callback function signature: –msr_ctl_ (void *buf, u_int8_t flags, u_int8_t seq, u_int8_t *dlen) –buf = command buffer w/application data, –flags = CMD, NEXT, RETRY or CANCEL, –seq = sequence number indicating reply data set, and –dlen is input/output parameter, data length in bytes.

66 66 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Kernel State Diagram Command Closed NextRetry Idle Retry Command Proto Error Command EOF Proto Error Cancel

67 67 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 CP Library State Diagram Wait (for reply) Closed Next Retry Idle Protocol Error Command EOF Protocol Error Open Session Result of a timeout

68 68 Washington WASHINGTON UNIVERSITY IN ST LOUIS Fred Kuhns - 1/8/02 Example Sending Cmd to Port CP Next/Prev Hop Next/Prev Hop Next/Prev Hop Next/Prev Hop Next/Prev Hop Next/Prev Hop Next/Prev Hop wugs P0 P1 P2 P3 P4 P5 P6 P7 192.168.200.X 192.168.201.X 192.168.202.X 192.168.203.X 192.168.204.X 192.168.205.X 192.168.206.X 192.168.207.X SPC/FPX DQ 192.168.203.2 192.168.202.2 sendcmd(); create plugin instance: port id = 0, PluginID = 200 cmd data cell hdr msr_ctl reply(); plugin instance created: Status, Instance ID Report command completion status to application. Lookup sub-command perform function call then report results

