1 Shyamal Pandya Implementation of Network Processor Packet Filtering and Parameterization for Higher Performance Network Processors 1 Implementation of IXP1200 Network Processor Packet Filtering Software and Parameterization for Higher Performance Network Processors Shyamal H. Pandya

2 Agenda
Introduction and Goal of the Thesis
Brief description of IXP1200 Network Processor and the ENP-2505 ESB Software Environment
Packet Filter Design
Implementation
Tests, Results and Parameterization
Conclusion

3 Introduction
Network Processors
  – A class of programmable processors designed for network applications
  – A flexible and efficient alternative to ASICs and general-purpose processors
  – Employ several architectural features to achieve their design goals:
    – A number of processing elements
    – Intelligent and fast memory units and buses
    – Instruction set architecture specifically tailored for packet processing operations
  – Examples: Intel IXP1200, IBM PowerNP series, Vitesse IQ2200

4 IXP1200
Belongs to the IXP family of Network Processors from Intel (IXP1200, IXP2400, IXP2800)
Major Components
  – Intel StrongARM core processor
  – Six programmable RISC microengines
    – 4 hardware contexts per microengine
    – Instruction set tailored to suit network applications
  – Memory Units
    – 32-bit SRAM unit supporting up to 8 MB
    – 64-bit SDRAM unit supporting up to 256 MB
    – 8 KB of 32-bit Scratchpad Memory

5 Goal
Network processors are targeted towards network applications, e.g. routers, VoIP, intrusion detection, packet filtering.
These applications are characterized by the need to process packets at extremely fast rates to keep up with the speed of network traffic.
Goal: to investigate the programmability of the IXP1200 through the design and implementation of a packet filter.
Linux IP Tables: the Linux packet filtering framework, chosen as the basis of our packet filter.
Parameterization: based on the experience with the packet filter implementation on the IXP1200, the architectural enhancements of the IXP2400, a higher-performance network processor of the same family, are analyzed to estimate their benefits.

6 IXP1200 in more Detail

7 IXP1200 in Operation

8 ENP-2505 ESB
Based on IXP1200
Pluggable in a PCI slot of a host computer
Supports four 10/100 Mbps Ethernet ports
8 MB SRAM, 256 MB SDRAM
StrongARM core processor and microengines operate at 232 MHz
8 MB of flash memory that holds a RAM disk
Figure: ENP-2505 and Host Setup

9 Programming Model
The ACE framework: a software framework for designing applications that consist of isolated software components performing well-defined tasks
  – An ACE encapsulates the tasks or modules performing independent packet processing functions
  – An ACE has one or more input targets and one or more output targets
  – Packets arrive at the input targets, are processed within the ACE and are transmitted through one of its output targets
  – An ACE can be bound to another by binding its output target to the other's input target
  – An application comprises several ACEs bound to each other
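The binding idea can be illustrated with a small Python model. This is only a sketch: the class `Ace`, its methods and the packet dictionaries are hypothetical names chosen for illustration and are not the ACE framework API from the SDK.

```python
from typing import Callable, Dict, List, Optional

Packet = dict  # hypothetical packet representation


class Ace:
    """Toy ACE with one input target and named output targets."""

    def __init__(self, name: str, process: Callable[[Packet], Optional[str]]):
        self.name = name
        # process() returns the name of an output target, or None to consume the packet
        self.process = process
        self.outputs: Dict[str, "Ace"] = {}
        self.log: List[Packet] = []  # packets seen at the input target

    def bind(self, output_target: str, other: "Ace") -> None:
        """Bind one of this ACE's output targets to another ACE's input target."""
        self.outputs[output_target] = other

    def receive(self, packet: Packet) -> None:
        """Input target: process the packet, then send it through the chosen output target."""
        self.log.append(packet)
        target = self.process(packet)
        if target is not None and target in self.outputs:
            self.outputs[target].receive(packet)


# An application is several ACEs bound to each other.
ingress = Ace("ingress", lambda p: "ip" if p.get("ethertype") == 0x0800 else None)
forwarder = Ace("forwarder", lambda p: "out")
egress = Ace("egress", lambda p: None)

ingress.bind("ip", forwarder)
forwarder.bind("out", egress)

ingress.receive({"ethertype": 0x0800, "dst": "10.0.0.2"})  # IPv4: travels to egress
ingress.receive({"ethertype": 0x0806})                     # not IPv4: consumed at ingress
```

In this toy model a packet only moves between components through a bound target, which is what keeps the components isolated from each other.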

10 Example ACE Application (Packet Forwarder)

11 MicroACE
An extension to the ACE model: part of the ACE is implemented on the core processor, the other part on the microengines
The microblock performs fast-path packet processing
The core component is a conventional ACE that manages the microblock
The MicroACE model can be exploited to divide the tasks between the microengines and the core processor
Figure: Forwarding Application using MicroACEs

12 Packet Filter Design
IP Tables
  – Packet filtering infrastructure for the Linux OS
  – A set of modules that maintain tables of rules
  – A rule contains a specification, in terms of values that fields of a header must match, and a target (ACCEPT/DROP)
  – Tables correspond to the kind of manipulation a packet undergoes, e.g. filter table, NAT table
  – A table contains a number of chains, each traversed at a particular point in the packet's path, e.g. INPUT, OUTPUT, FORWARD
  – Extensibility: each rule has, at a minimum, specs for IP header matching. More examination can be specified by adding match structures, e.g. a tcp_match structure has specifications for matching packet TCP headers.
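The table, chain and rule layout described here might be modelled roughly as follows. The field names (`src`, `dst`, `protocol`, `dst_port_min`, `dst_port_max`) are assumptions made for this sketch; they are not the actual Linux or IXP1200 structure definitions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IpSpec:
    """Mandatory IP header spec of a rule; None means 'match any'."""
    src: Optional[str] = None
    dst: Optional[str] = None
    protocol: Optional[int] = None


@dataclass
class TcpMatch:
    """Optional match extension for TCP headers (hypothetical fields)."""
    dst_port_min: int = 0
    dst_port_max: int = 65535


@dataclass
class Rule:
    ip: IpSpec
    target: str                                   # "ACCEPT" or "DROP"
    matches: List[TcpMatch] = field(default_factory=list)


@dataclass
class Table:
    name: str
    chains: Dict[str, List[Rule]] = field(default_factory=dict)


# The filter table with its three chains.
filter_table = Table("filter", {"INPUT": [], "OUTPUT": [], "FORWARD": []})

# A rule that drops TCP (protocol 6) traffic from 10.0.0.1 to port 80.
rule = Rule(IpSpec(src="10.0.0.1", protocol=6), "DROP", [TcpMatch(80, 80)])
filter_table.chains["FORWARD"].append(rule)
```

A rule without any entry in `matches` is examined on its IP header spec alone, which mirrors the "at a minimum" wording above.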

13 Packet Filter Design - Data Structures

14 Packet Filter Design - Algorithm
For each rule in the chain of interest
  – Match the packet IP header against the specs in the rule. If the match succeeds, look for other match structures in the rule.
  – Match the packet against each match structure found in the rule. If the packet satisfies all matches, the packet has successfully matched the rule.
  – For a successful match, look at the target of the rule
    – If the target is ACCEPT, let the packet pass
    – If the target is DROP, drop the packet and free its resources
  – For an unsuccessful match, go to the next rule and repeat the process
The last rule matches all packets. Its target is the default policy.
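A minimal Python sketch of this traversal, assuming rules are represented as tuples of predicates (an illustrative encoding, not the thesis microcode):

```python
from typing import Callable, Dict, List, Tuple

Packet = Dict[str, object]
Match = Callable[[Packet], bool]
# A rule is (ip_header_match, extra_match_structures, target).
Rule = Tuple[Match, List[Match], str]


def filter_packet(chain: List[Rule], packet: Packet) -> str:
    """Return the target of the first rule whose IP spec and extra matches all succeed."""
    for ip_match, extra_matches, target in chain:
        if not ip_match(packet):
            continue                      # IP header mismatch: go to the next rule
        if all(m(packet) for m in extra_matches):
            return target                 # every match structure satisfied
    raise ValueError("chain has no catch-all rule")


forward_chain: List[Rule] = [
    # Drop packets from 10.0.0.1 to TCP port 80.
    (lambda p: p["src"] == "10.0.0.1", [lambda p: p.get("dport") == 80], "DROP"),
    # Last rule: matches all packets, its target is the default policy.
    (lambda p: True, [], "ACCEPT"),
]
```

For example, `filter_packet(forward_chain, {"src": "10.0.0.1", "dport": 80})` returns `"DROP"`, while the same source with `dport` 22 falls through to the default policy and returns `"ACCEPT"`.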

15 Implementation
Task division between the core processor and the microengines
  – Data Plane (Microengines): ingress, filtering, forwarding, egress
  – Control Plane (Core): filter table, route table management
  – Management Plane (Core): user interface, deployment
Chains - INPUT, OUTPUT, FORWARD
  – INPUT and OUTPUT chains are traversed infrequently
  – FORWARD chain is used most frequently, hence implemented on microengines
Software Components
  – Ingress, Egress, Forwarder MicroACEs and Stack ACE: provided as part of the SDK
  – PacketFilter MicroACE: designed and implemented as part of the thesis

16 Implementation
Figure: Application Design in terms of MicroACEs

17 Implementation
User Interface - iptables command
  – Used to manipulate the filter table by adding, deleting, inserting and replacing rules
  – An executable and libraries implement the user interface
  – Algorithm
    – Parse the command line and validate all the options and arguments
    – Obtain a local copy of the filter table by making a cross-call to the PacketFilter core component
    – Modify the local copy according to the command
    – Make a cross-call to the PacketFilter core component to replace the old filter table with the new one, passing the modified filter table as argument
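The copy, modify, replace sequence can be sketched in Python as below. Only the name `do_replace` comes from the slides; the class `PacketFilterCore`, the method `get_table` and the helper `iptables_append` are hypothetical stand-ins, and ordinary method calls stand in for cross-calls.

```python
import copy

VALID_CHAINS = ("INPUT", "OUTPUT", "FORWARD")


class PacketFilterCore:
    """Stand-in for the PacketFilter core component that owns the live filter table."""

    def __init__(self):
        self._table = {chain: [] for chain in VALID_CHAINS}

    def get_table(self):
        """Hypothetical cross-call: hand the caller a private copy of the table."""
        return copy.deepcopy(self._table)

    def do_replace(self, new_table):
        """Replace the current filter table with the one passed as argument."""
        self._table = new_table


def iptables_append(core, chain, rule):
    """Append a rule to a chain following the four steps of the algorithm."""
    if chain not in VALID_CHAINS:               # 1. validate the arguments
        raise ValueError(f"unknown chain: {chain}")
    table = core.get_table()                    # 2. obtain a local copy
    table[chain].append(rule)                   # 3. modify the local copy
    core.do_replace(table)                      # 4. swap in the modified table


core = PacketFilterCore()
iptables_append(core, "FORWARD", {"src": "10.0.0.1", "target": "DROP"})
```

Working on a copy means the live table is only ever changed by a single whole-table replacement.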

18 Implementation
PacketFilter Core Component
  – Initialization
    – Control data structures, filter table allocation in SRAM, patching the filter table address into the microcode
  – Cross-call Interface
    – Function do_replace, used by the user interface to replace the current filter table with a new filter table in the SRAM

19 Implementation
Microcode
  – Each microengine can run more than one microblock
  – Flow of control is governed by a dispatch loop running on each enabled microengine
Figure: Microblock partitioning across microengines

20 Implementation
Dispatch Loops - Microengine 0
  – Initialize the Ingress and PacketFilter microblocks
  – In an infinite loop, do the following
    – Call the Ingress microblock
    – If a packet has arrived, call the PacketFilter microblock; else if there is an exception, queue the packet for the Ingress core component; else continue from the beginning of the loop
    – If the PacketFilter microblock returns ACCEPT, queue the packet for Microengine 2, which runs the Forwarder
    – If the PacketFilter microblock returns DROP, drop the packet
  – Every SA_CONSUME_NUM times around the loop, poll the Core to ME packet queue for packets from core components. If there is a packet, determine its source (Ingress core or PacketFilter core) and call the corresponding microblock
  – SA_CONSUME_NUM: tunable parameter that controls how often the Core to ME packet queue is accessed in memory
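The control flow of this dispatch loop can be simulated in a few lines of Python. This is a sketch under several assumptions: the loop is bounded instead of infinite, `SA_CONSUME_NUM` is set to an arbitrary value of 4, the queues are plain Python containers, and the core queue is polled even on idle iterations. The function and parameter names are hypothetical.

```python
from collections import deque

SA_CONSUME_NUM = 4  # assumed value; the slides only say it is tunable


def dispatch_loop_me0(rx_events, core_to_me, packet_filter, iterations):
    """Simulate `iterations` passes of the Microengine 0 dispatch loop."""
    out = {"to_forwarder": [], "to_ingress_core": [], "dropped": [], "from_core": []}
    for i in range(1, iterations + 1):
        # Ingress microblock: yields ("packet", p), ("exception", p) or nothing.
        event = rx_events.popleft() if rx_events else None
        if event is not None:
            kind, packet = event
            if kind == "exception":
                out["to_ingress_core"].append(packet)
            elif packet_filter(packet) == "ACCEPT":
                out["to_forwarder"].append(packet)   # queue for Microengine 2
            else:
                out["dropped"].append(packet)
        # Poll the Core to ME queue only every SA_CONSUME_NUM iterations.
        if i % SA_CONSUME_NUM == 0 and core_to_me:
            out["from_core"].append(core_to_me.popleft())
    return out


rx = deque([
    ("packet", {"id": 1, "allowed": True}),
    ("exception", {"id": 2}),
    ("packet", {"id": 3, "allowed": False}),
])
core_queue = deque([{"id": 99}])
result = dispatch_loop_me0(
    rx, core_queue, lambda p: "ACCEPT" if p["allowed"] else "DROP", iterations=4
)
```

A larger `SA_CONSUME_NUM` means fewer memory accesses for the core queue per received packet, at the cost of packets from the core waiting longer.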

21 Implementation
Dispatch Loops - Microengine 2
  – Initialize the Forwarder microblock
  – In an infinite loop, do the following
    – Poll the packet queue from Microengine 0 to see if there is a packet. If a packet is available, call the Forwarder microblock; else continue from the beginning
    – If the Forwarder microblock returns success, queue the packet for Microengine 5 to be scheduled for output; else if it returns an exception, queue the packet for the core component; else drop the packet
  – Every SA_CONSUME_NUM times around the loop, poll the Core to ME packet buffer, and if there is a packet from the core component, call the microblock

22 Implementation
Dispatch Loops - Microengine 5
  – Initialize the Egress microblock
  – 4 output queues hold the packets for each output port
  – Context 0 polls the 4 output queues in a round-robin manner
  – Contexts 1-3 fill up the TFIFO with data from the current packet to be transmitted
PacketFilter Microblock macros
  – PacketFilter() - main macro
  – ip_packet_match() - called from PacketFilter()
  – ipt_tcp_match() - TCP extension to core packet filtering code
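The round-robin polling done by context 0 might look like the following Python sketch. The function name `round_robin_poll` and the use of `deque` objects as output queues are assumptions for illustration.

```python
from collections import deque


def round_robin_poll(queues, start):
    """Scan the output queues once, beginning at index `start`.

    Returns (port, packet, next_start). When every queue is empty the
    result is (None, None, start) so the next scan begins at the same port.
    """
    n = len(queues)
    for offset in range(n):
        port = (start + offset) % n
        if queues[port]:
            return port, queues[port].popleft(), (port + 1) % n
    return None, None, start


# Four output queues, one per output port.
output_queues = [deque(), deque(["a"]), deque(["b", "c"]), deque()]
first = round_robin_poll(output_queues, 0)
second = round_robin_poll(output_queues, first[2])
third = round_robin_poll(output_queues, second[2])
fourth = round_robin_poll(output_queues, third[2])
```

Restarting each scan just after the port served last keeps a busy port from starving the others.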

23 Implementation - Microengine Re-tasking
Triggered when the first rule specifying TCP match specs is added to the table
Implementation
  – The core component sends inter-thread signals to all threads of microengine 0
  – Each time around the dispatch loop, each thread checks for a signal
  – If a signal is present, the thread stops its execution and sends an interrupt to the StrongARM
  – Interrupt handler: once an interrupt has been received from each of the 4 threads of microengine 0, it wakes up the process sleeping on the interrupt (the PacketFilter core component)
  – The core component disables microengine 0, reloads it with a new image containing the ipt_tcp_match() macro and re-enables the microengine
This design ensures that microengines are not interrupted while processing a packet, thus preventing packet loss
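The handshake can be modelled as a small state machine in Python. Everything here is a simplified sketch: the class names, the image labels `"ip_only"` and `"ip_plus_tcp"`, and the direct method call standing in for the interrupt are hypothetical.

```python
NUM_THREADS = 4  # hardware contexts of microengine 0


class Microengine0:
    def __init__(self):
        self.image = "ip_only"                  # hypothetical image label
        self.enabled = True
        self.signalled = [False] * NUM_THREADS
        self.stopped = [False] * NUM_THREADS


class CoreComponent:
    def __init__(self, me):
        self.me = me
        self.interrupts = 0

    def request_retask(self):
        """Send an inter-thread signal to every thread of microengine 0."""
        self.me.signalled = [True] * NUM_THREADS

    def on_interrupt(self):
        """Interrupt handler: act only once all threads have reported in."""
        self.interrupts += 1
        if self.interrupts == NUM_THREADS:
            self._reload()

    def _reload(self):
        self.me.enabled = False                 # disable the microengine
        self.me.image = "ip_plus_tcp"           # image with ipt_tcp_match()
        self.me.signalled = [False] * NUM_THREADS
        self.me.stopped = [False] * NUM_THREADS
        self.interrupts = 0
        self.me.enabled = True                  # enable it again


def dispatch_iteration(me, core, tid):
    """One pass of the dispatch loop for thread `tid`.

    The signal is checked only at the top of the loop, so a thread never
    stops while it is in the middle of processing a packet.
    Returns True when the thread went on to process a packet.
    """
    if me.stopped[tid]:
        return False
    if me.signalled[tid]:
        me.stopped[tid] = True
        core.on_interrupt()
        return False
    return True


me0 = Microengine0()
core = CoreComponent(me0)
core.request_retask()
for tid in range(3):
    dispatch_iteration(me0, core, tid)
image_after_three = me0.image                   # still the old image
dispatch_iteration(me0, core, 3)                # fourth interrupt triggers the reload
```

Counting the interrupts is what guarantees that the reload happens only after every thread has reached a safe point between packets.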

24 Tests and Results
Test setup
Packets sent from the host machine to the notebook
Libnet library used to build packets
Host machine runs tcpdump and the Windows laptop runs Ethereal

25 Tests and Results
Experiment 1 - Code size
Experiment 2 - Packet filtering operations
  – Various commands to add and delete rules from the filter table
  – Packet filtering operations were confirmed correct by observing packet transmission and reception with tcpdump and Ethereal

26 Tests and Results
Experiment 3 - Performance penalty due to task partitioning across microengines
Experiment 4 - Microengine Re-tasking
  – Command to add a rule with TCP match specs to the filter table
  – Microengine 0 was re-tasked successfully and packet filtering operations continued

27 Parameterization
IXP2400 Network Processor
  – Higher-performance network processor of the same family, with significant architectural enhancements
Microstore (4 KB vs. 16 KB)
  – 1K instructions limit: tasks had to be split across 2 microengines
  – 4K instructions: the split is not necessary, so the performance penalty is avoided

28 Parameterization
  – Total number of microwords for Ingress+PacketFilter+Forwarder = 1156
  – Extra instruction store space can be used for other components: UDP match, limit match, NAT, connection tracking
Number of Microengines and Contexts
  – IXP1200 serves 8 ports with 16 contexts for input and 8 contexts for output to forward packets
  – The number of contexts per microengine is doubled, so each microengine can serve 4 ports for the input process (2 contexts per port as in IXP1200)
  – With 5 microengines for input and 3 for output, the number of ports serviced could be 20
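The figures on this slide and the microstore limits quoted before it can be checked with a few lines of Python arithmetic. The sketch assumes that 1K and 4K instructions mean 1024 and 4096 microwords; the variable names are illustrative.

```python
# Microstore: does Ingress + PacketFilter + Forwarder fit on one microengine?
MICROSTORE_1K = 1024          # assumed: "1K instructions" = 1024 microwords
MICROSTORE_4K = 4096          # assumed: "4K instructions" = 4096 microwords
TOTAL_MICROWORDS = 1156       # figure from the slide

fits_in_1k = TOTAL_MICROWORDS <= MICROSTORE_1K      # False: tasks must be split
fits_in_4k = TOTAL_MICROWORDS <= MICROSTORE_4K      # True: one microengine suffices
spare_microwords = MICROSTORE_4K - TOTAL_MICROWORDS

# Ports: 2 input contexts per port, as on the IXP1200.
CONTEXTS_PER_INPUT_PORT = 2
ixp1200_input_ports = 16 // CONTEXTS_PER_INPUT_PORT            # 16 input contexts

contexts_per_me_doubled = 2 * 4                                 # 4 contexts, doubled
ports_per_input_me = contexts_per_me_doubled // CONTEXTS_PER_INPUT_PORT
ports_with_5_input_mes = 5 * ports_per_input_me
```

Under these assumptions 1156 microwords exceed the 1K store by 132 words, which is why the tasks had to be split, and they leave 2940 words free in a 4K store.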

29 Parameterization
Next neighbor register set
  – Data sharing is very fast, avoiding memory accesses
  – Task partitioning between microengines relies on packet queues. Inter-microengine data communication requires SRAM accesses, which carries a performance penalty
  – IXP2400: packet queues are avoided; buffer handles are shared through next neighbor registers. Performance penalty avoided.
Memory
  – ENP-2505 has 48 MB DRAM and 3 MB SRAM accessible to microengines
  – SRAM could accommodate 9K rules of average size. Thus memory was sufficient for the PacketFilter application
  – The increase in memory in the IXP2400 could benefit simultaneous execution of many memory-hungry applications

30 Conclusions
Successfully implemented the packet filter core code and the TCP header match extension
Had to split filtering and forwarding across 2 microengines due to instruction store size limits
The MicroACE software framework was ideal for the design of the packet filter
Microengine re-tasking was complicated by the lack of a smooth interface to microengine signals and interrupt handling
Future work: investigating simultaneous operation of more than one application, and adding more IP Tables extensions to the packet filter
Future work: incorporating an interface to inter-thread signals and call-backs into the MicroACE framework

