Download presentation
Presentation is loading. Please wait.
Published byJaeden Mandeville Modified over 9 years ago
1
IXP: The Bump in the Wire IXP: The Bump in the Wire INF5062: Programming Asymmetric Multi-Core Processors 22 April 2015
2
Background and Motivation: Network traffic increase
3
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Uses conventional, shared hardware (e.g., a PC) Software −runs the entire system −allocates memory −controls I/O devices −performs all protocol processing First generation network systems: Software-Based Network System
4
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Review of General Data Path on Conventional Computer Hardware Architectures communication system application user space kernel space sending: communication system application receiving: communication system application forwarding: Checksumming Copying Fragmentation Interrupts
5
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Question: Which is growing faster? −network bandwidth −processing power Note: if network bandwidth is growing faster −CPU may be the bottleneck −need special-purpose hardware Note: if processing power is growing faster −no problems with processing −network/busses will be bottlenecks
6
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Growth Of Technologies Mbps year Engineering rule: 1GHz general purpose CPU = 1Gbps network data rate Thus, software running on a general-purpose processor is insufficient to handle high-speed networks because the aggregate packet rate exceeds the capabilities of the CPU
7
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Network Processors: The Idea in a Nutshell Many designs through many generations (varying amount of HW & SW) Include support for protocol processing and I/O on one chip −General-purpose processor(s) for control tasks −Special-purpose processor(s) for packet processing and table lookup −Include functional units for tasks such as checksum computation, hashing, … Call the result a network processor
8
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Network Processors: Main Idea Traditional system: - slow - resource demanding - shared with other operations Network processors: - a computer within the computer - special, programmable hardware - offloads host resources
9
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Explosion of Commercial Products 11 990 2000: network processors transformed from interesting curiosity to mainstream product −r−reduction in both overall costs and time to market −2−2002: over 30 vendors with a vide range of architectures −e−e.g., Multi-Chip Pipeline (Agere) Augmented RISC Processor (Alchemy) Embedded Processor Plus Coprocessors (Applied Micro Circuit Corporation) Pipeline of Homogeneous Processors (Cisco) Pipeline of Heterogeneous Processors (EZchip) Configurable Instruction Set Processors (Cognigine) Extensive And Diverse Processors (IBM) Flexible RISC Plus Coprocessors (Motorola) Internet Exchange Processor (Intel) …
10
Intel I II IXP1200 / 2400: A Short Overview
11
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXA: Internet Exchange Architecture IXA is a broad term to describe the Intel network architecture (HW & SW, control- & data plane) IXP: Internet Exchange Processor −processor that implements IXA −IXP1200 is the first IXP chip (4 versions) −IXP2xxx has now replaced the first version IXP1200 basic features −1 embedded 232 MHz StrongARM −6 packet 232 MHz µengines −onboard memory −4 x 100 Mbps Ethernet ports −multiple, independent busses −low-speed serial interface −interfaces for external memory and I/O busses −… IXP2400 basic features −1 embedded 600 MHz XScale −8 packet 600 MHz µengines −3 x 1 Gbps Ethernet ports −…
12
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP1200 Architecture Serial line: - connects to the RISC - intended for control and management - rate 38 Kbps PCI bus: - allow IXP to connect to I/O devices - enable use of host CPU - rate 2.2 Gbps IX (Intel eXchange) bus: - enable higher rates compared to PCI - form fast path (IXP and high-speed interfaces) - interface to other IXP cards - 4.4 Gbps SRAM bus: - shared bus (several external units) - usually control rather than data - rate 3.71 Gbps SDRAM bus: - provide access to external SDRAM memory used to store packets - can also pass addresses, control/store operations, etc. - rate 7.42 Gbps
13
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP1200 Architecture RISC processor: - StrongARM running Linux - control, higher layer protocols and exceptions - 232 MHz Microengines: - low-level devices with limited set of instructions - transfers between memory devices - packet processing - 232 MHz Access units: - coordinate access to external units Scratchpad: - on-chip memory - used for IPC and synchronization
14
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP1200 Processor Hierarchy General-Purpose Processor: - used for control and management - running general applications RISC processor: - chip configuration interface (serial line) - control, higher layer protocols and exceptions I/O processors (microengines): - transfers between memory devices - packet processing Coprocessors: - real-time clock and timers - IX bus controller - hashing unit -... Physical interface processors: - implement layer 1 & 2 processing
15
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP1200 Memory Hierarchy
16
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP1200 Memory Hierarchy Different memory types… …are organized into different addressable data units (words or longwords) …have different access times …connected to different busses Therefore, to achieve optimal performance, programmers must understand the organization and allocate items from the appropriate type
17
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP1200 IXP2400 SRAM FLASH MEMORY MAPPED I/O DRAM SRAM access SDRAM access SCRATCH memory PCI access IX access Embedded RISK CPU (StrongARM) PCI bus IX bus DRAM bus SRAM bus microengine 2 microengine 1 microengine 5 microengine 4 microengine 3 microengine 6 multiple independent internal buses IXP1200
18
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP2400 IXP2400 Architecture microengine 8 SRAM coprocessor FLASH DRAM SRAM access SDRAM access SCRATCH memory PCI access MSF access Embedded RISK CPU (XScale) PCI bus receive bus DRAM bus SRAM bus microengine 2 microengine 1 microengine 5 microengine 4 microengine 3 multiple independent internal buses slowport access … transmit bus RISC processor: - StrongArm XScale - 233 MHz 600 MHz Microengines - 6 8 - 233 MHz 600 MHz Media Switch Fabric - forms fast path for transfers - interconnect for several IXP2xxx Receive/transmit buses - shared bus separate busses Slowport - shared inteface to external units - used for FlashRom during bootstrap Coprocessors - hash unit - 4 timers - general purpose I/O pins - external JTAG connections (in-circuit tests) - several bulk cyphers (IXP2850 only) - checksum (IXP2850 only) - …
19
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP2400 Architecture Memory −generally more of everything −generally larger gap between CPUs and memory access in terms of cycles −local memory on each microengine saving temporary results private per packet processor small (2560 bytes) low latency (one cycle) accessed through special registers
20
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP2400 Basic Packet Processing microengine 8 SRAM coprocessor FLASH DRAM SRAM access SDRAM access SCRATCH memory PCI access MSF access Embedded RISK CPU (XScale) PCI bus receive bus DRAM bus SRAM bus microengine 2 microengine 1 microengine 5 microengine 4 microengine 3 multiple independent internal buses slowport access … transmit bus
21
Using IXP2400
22
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Intel IXP2400 Hardware Reference Manual
23
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Programming Model RX microblock TX microblock XScale microengines input ports output ports core component process microblock Scratch rings head tail metadata queue data buffers linked list logical mapping
24
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Programming Model RX microblock TX microblock XScale microengines input ports output ports core component process microblock Hardware contexts Threads
25
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Framework uclo −Microengine loader −Necessary to load your microengine code into the microengines at runtime hal −Hardware abstraction layer −Mapping of physical memory into XScale processes’ virtual address space −Functions starting with hal ossl −Operating system service layer −Limited abstraction from hardware specifics −Functions starting with ix_ rm −Resource manager −Layered on top of uclo and ossl −Memory and resource management all memory types and their features IPC, counters, hash −Functions starting with ix_
26
Bump in the Wire
27
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Bump in the Wire Internet Count web packets, count ICMP packets
28
Packet Headers and Encapsulation
29
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Ethernet describes content of ethernet frame, e.g., 0x0800 indicates an IP datagram, 0x0806 indicates an ARP packet 48 bit address configured to an interface on the NIC on the receiver 48 bit address configured to an interface on the NIC on the sender 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Destination Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Source address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Frame type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
30
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Internet Protocol version 4 (IPv4) 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Version| IHL |Type of Service| Total Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Identification |Flags| Fragment Offset | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Time to Live | Protocol | Header Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Destination Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options | Padding | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ indicates the format of the internet header, i.e., version 4 length of the internet header in 32 bit words, and thus points to the beginning of the data (minimum value of 5) datagram length (octets) including header and data - allows the length of 65,535 octets identifying value to aid assembly of fragments first zero, fragments allowed and last fragment indicate where this fragment belongs in datagram checksum on the header only – TCP, UDP over payload. Since some header fields change(TTL), this is recomputed and verified at each point indicates used transport layer protocol disable a packet to circulate forever,decrease value by at least 1 in each node – discarded if 0 32-bit address fields. May be configured differently from small to large networks options may extend the header – indicated by IHL. If the options do not end on a 32-bit boundary, the remaining fields are padded in the padding field (0’s) indication of the abstract parameters of the quality of service desired – somehow treat high precedence traffic as more important – tradeoff between low-delay, high-reliability, and high-throughput – NOT used, bits now reused for differential services code point
31
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Internet Control Message Protocol (ICMPv4) type-specific arbitrary length data Type of the control msg, including echo request (8) and echo reply (0) Checksum for the ICMP header only 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Code | header checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | data | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Refinement 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 8 | 0 | header checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | identifier | sequence number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | data | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ICMP Echo Request Optional identifier, chosen by sender, echoed by receiver Optional sequence number, chosen by sender, echoed by receiver
32
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo UDP specifies the total length of the UDP datagram in octets contains a 1’s complement checksum over UDP packet and an IP pseudo header with source and destination address port to identify the sending application port to identify receiving application 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Port | Destination Port | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Length | Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
33
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo TCP port to identify the sending application port to identify receiving application contains a 1’s complement checksum over UDP packet and an IP pseudo header with source and destination address sequence number for data in payload acknowledgement for data received options may extend the header. If the options do not end on a 32-bit boundary, the remaining fields are padded in the padding field (0’s) header length in 32 bit units pointer to urgent data in segment code bits: urgent, ack, push, reset, syn, fin 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Port | Destination Port | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Sequence Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Acknowledgment Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Header| |U|A|P|R|S|F| | | length| Reserved |R|C|S|S|Y|I| Window | | | |G|K|H|T|N|N| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Checksum | Urgent Pointer | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options | Padding | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ receiver’s buffer size for additional data
34
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Encapsulation
35
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Identifying Web Packets These are the header fields you need for the web bumper: Ethernet type 0x800 IP type 6 TCP port 80
36
Lab Setup
37
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Lab Setup IXP lab switch Internet Student labswitch … ssh connection
38
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Lab Setup - Addresses IXP lab switch …
39
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP lab switch … IXP lab Lab Setup - Addresses … switch 192.168.2.11 192.168.2.50 192.168.2.51 192.168.2.52 192.168.2.55 129.240.66.50 129.240.66.51 129.240.66.52 129.240.66.55 … …
40
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Lab Setup – Data Path IXP lab switch … IXP2400 PCI IO hub memory hub CPU memory hub interface system bus RAM interface web bumper (counting web packets and forwarding all packets from one interface to another) 192.168.1.1 192.168.1.5 SSH connection to IFI: 129.240.66.50
41
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Lab Setup – Data Path IXP lab switch … IXP2400 IO hub memory hub CPU memory web bumper (counting web packets and forwarding all packets from one interface to another) 192.168.1.1 192.168.1.5 SSH connection to IFI: 129.240.66.50 192.168.2.50 192.168.2.11 SSH connection to IFI: 129.240.66.48
42
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo IXP 2400 The Web Bumper web bumper (counting web packets and forwarding all packets from one interface to another) web bumper wwbump (core) output port input port XScale microengines rx block processing encompasses all operations performed as packets arrive tx block processing encompasses all operations applied as packets depart rx microblock tx microblock wwbump (microblock) the wwbump core components checks a packet forwarded by the wwbump microblock count ping packet - add 1 to icmp counter send back to wwbump microblock the wwbump microblock checks all packets from rx block: if it is a ping or web packet: if web packet, add 1 to web counter and forward to tx block if ping packet, forward to wwbump core component if neither, forward to tx block The wwbump microblock forwards all packets from the wwbump core component to the tx block
43
INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Starting and Stopping OO n the host machine −L−Location of the example: / root/ixa/wwpingbump −R−Rebooting the IXP card: m ake reset −I−Installing the example: m ake install −T−Telnet to the card: telnet 192.168.1.5 minicom OO n the card −T−To start the example:. /wwbump −T−To stop the example: C TRL-C LL et’s look at an example...
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.