Intel ® IXP2XXX Network Processor Architecture and Programming Prof. Laxmi Bhuyan Computer Science UC Riverside.

Presentation transcript:

IXP2400 Block Diagram
[Figure: eight MEv2 microengines (MEv2 1 to 8); Intel® XScale™ core with 32K instruction cache and 32K data cache; Rbuf (128B) and Tbuf (128B); hash unit (48/64/128); 16 KB scratchpad; QDR SRAM channels 1 and 2 with enqueue/dequeue (E/D Q); DDR DRAM; 64-bit, 66 MHz PCI; 32-bit receive and 32-bit transmit media interface (SPI-3 or CSIX) behind a gasket; CSRs: Fast_wr, UART, Timers, GPIO, BootROM/Slow Port.]

IXP2400 Bandwidths
- 600 MHz operation, 4.8+ GOPs
- 2.5 Gb/s full-duplex media interface: POS-PHY, Utopia, CSIX-L1
- 2.4 GB/s DDR memory bandwidth at 300 MT/s
- 1.6 GB/s QDR memory bandwidth with 200 MHz QDRII devices
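The 4.8 GOPs figure is what 8 microengines at 600 MHz deliver together. A rough sketch of what that means per packet at the 2.5 Gb/s line rate follows. The 64-byte packet size is an assumption chosen for illustration, not a number from the slides, and the function name is hypothetical.

```python
def instruction_budget(num_engines, clock_hz, line_rate_bps, packet_bytes):
    """Return (packets per second, instructions available per packet)."""
    ops_per_second = num_engines * clock_hz
    packets_per_second = line_rate_bps / (packet_bytes * 8)
    return packets_per_second, ops_per_second / packets_per_second


# Slide figures: 8 microengines, 600 MHz, 2.5 Gb/s media interface.
# Assumption (not from the slides): back-to-back 64-byte packets.
pps, budget = instruction_budget(8, 600e6, 2.5e9, 64)
print(f"{pps / 1e6:.2f} Mpps, about {budget:.0f} instructions per packet")
```

Larger packets raise the per-packet budget proportionally, so the small-packet case is the one that stresses the microengines.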

IXP2400 Resources Summary
Half-duplex OC-48 / 2.5 Gb/s network processor
- (8) multi-threaded microengines
- Intel® XScale™ core
- Media / switch fabric interface
- PCI interface
- 2 QDR SRAM interface controllers
- 1 DDR SDRAM interface controller
- 8-bit asynchronous port: Flash and CPU bus
Additional integrated features
- Hardware hash unit
- 16 KByte scratchpad memory
- Serial UART port
- 8 general-purpose I/O pins
- Four 32-bit timers
- JTAG support

IXP2400 Full-Duplex OC-48 System Implementation
[Figure: an IXF6048 framer (1x OC-48 or 4x OC-12) feeds an IXP2400 ingress processor and is fed by an IXP2400 egress processor; both connect through a gasket to the switch fabric. Each IXP2400 has DDR SDRAM packet memory and QDR SRAM for queues and tables. A TCAM classification accelerator and a host CPU (IOP or IA) are also attached.]
Ingress processor
- SAR'ing
- Classification
- Metering
- Policing
- Initial congestion management
Egress processor
- Traffic shaping
- Flexible choices: DiffServ, TM 4.0, …

IXP2400 Chaining
Glueless interface between IXP2400 devices using CSIX-L1
[Figure: IXP2400 processors chained over 2.5 Gb/s CSIX-L1 links, each with DDR packet memory and QDR SRAM for queues and tables; a control plane processor on PCI 64/66; a 2.5 Gb/s SPI-3 interface.]

IXP2800 Block Diagram
[Figure: sixteen MEv2 microengines (MEv2 1 to 16); Intel® XScale™ core with 32K instruction cache and 32K data cache; Rbuf (128B) and Tbuf (128B); hash unit (48/64/128); 16 KB scratchpad; QDR SRAM channels 1 to 4 with enqueue/dequeue (E/D Q); RDRAM channels 1 to 3 (striped); 64-bit, 66 MHz PCI; 16-bit receive and 16-bit transmit media interface (SPI-4 or CSIX) behind a gasket; CSRs: Fast_wr, UART, Timers, GPIO, BootROM/Slow Port.]

IXP2800 Bandwidths
- 1.4 GHz operation, 20+ GOPs
- 10 Gb/s full-duplex media interface: SPI-4.2, CSIX-L1
- 1.9 GB/s QDR SRAM memory bandwidth per channel
- 2.1 GB/s RDRAM memory bandwidth per channel
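The per-channel figures can be combined with the channel counts from the IXP2800 resources summary (4 QDR SRAM, 3 RDRAM) to estimate headroom. The sketch below assumes that each packet crosses DRAM twice (written once on receive, read once on transmit); that traffic model and the function names are illustrative assumptions, not something the slides state.

```python
def aggregate_gbytes(channels, gbytes_per_channel):
    """Total bandwidth in GB/s across identical channels."""
    return channels * gbytes_per_channel


def dram_headroom(dram_gbytes, line_rate_gbits, passes=2):
    """Ratio of DRAM bandwidth to the packet traffic it must carry.

    Assumption: every packet byte crosses DRAM `passes` times.
    """
    packet_traffic_gbytes = passes * line_rate_gbits / 8
    return dram_gbytes / packet_traffic_gbytes


sram_total = aggregate_gbytes(4, 1.9)   # 4 QDR SRAM channels
dram_total = aggregate_gbytes(3, 2.1)   # 3 RDRAM channels
headroom = dram_headroom(dram_total, 10)
print(f"SRAM {sram_total:.1f} GB/s, DRAM {dram_total:.1f} GB/s, "
      f"DRAM headroom {headroom:.2f}x at 10 Gb/s")
```

A headroom above 1 only says the raw bandwidth is sufficient under this simple model; bank conflicts and read/write turnaround are ignored.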

IXP2800 Resources Summary
Half-duplex OC-192 / 10 Gb/s network processor
- (16) multi-threaded microengines
- Intel® XScale™ core
- Media / switch fabric interface
- PCI interface
- 4 QDR SRAM interface controllers
- 3 Rambus* DRAM interface controllers
- 8-bit asynchronous port: Flash and CPU bus
Additional integrated features
- Hardware hash unit for generating 48-, 64-, or 128-bit adaptive polynomial hash keys
- 16 KByte scratchpad memory
- Serial UART port for debug
- 8 general-purpose I/O pins
- Four 32-bit timers
- JTAG support
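The slides describe the hash unit only by its output widths. As a generic illustration of a polynomial hash, the sketch below takes the remainder of the input under polynomial division over GF(2). The polynomial is an arbitrary placeholder and the function names are hypothetical; this is not the hardware's algorithm or its polynomial.

```python
def gf2_remainder(value, poly):
    """Remainder of `value` divided by `poly`, both read as GF(2) polynomials."""
    degree = poly.bit_length() - 1
    while value.bit_length() > degree:
        # Align the divisor with the top set bit of value and cancel that bit.
        value ^= poly << (value.bit_length() - 1 - degree)
    return value


def hash_key(data, width=48, poly_low_bits=0x2B):
    """Reduce `data` (bytes) to a key of at most `width` bits.

    Assumption: poly_low_bits is an arbitrary illustrative value and has
    not been checked for irreducibility.
    """
    poly = (1 << width) | poly_low_bits
    return gf2_remainder(int.from_bytes(data, "big"), poly)


key = hash_key(b"\xc0\xa8\x00\x01\xc0\xa8\x00\x02\x00\x50")
print(f"48-bit key: {key:#014x}")
```

Calling `hash_key(data, width=64)` or `width=128` gives the other two key sizes named on the slide.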

IXP2800 and IXP2400 Comparison
- Performance: IXP2400: dual-chip full-duplex OC-48. IXP2800: dual-chip full-duplex OC-192.
- Number of microengines: IXP2400: 8 (MEv2). IXP2800: 16 (MEv2).
- Media interface: IXP2400: separate 32-bit Tx and Rx, configurable to SPI-3, UTOPIA 3 or CSIX_L1. IXP2800: separate 16-bit Tx and Rx, configurable to SPI-4 P2 or CSIX_L1.
- SRAM memory: IXP2400: 2 channels QDR (or co-processor). IXP2800: 4 channels QDR (or co-processor).
- DRAM memory: IXP2400: 1 channel DDR DRAM, 150 MHz, up to 2 GB. IXP2800: 3 channels RDRAM, 800/1066 MHz, up to 2 GB.
- Frequency: IXP2400: 600/400 MHz. IXP2800: 1.4 GHz / 1.0 GHz / 650 MHz.

MicroEngine v2
[Figure: microengine block diagram. Control store (4K/8K instructions); two banks of 128 GPRs; local memory (640 words) with LM Addr 0 and LM Addr 1; 128 next-neighbor registers (from previous neighbor, to next neighbor); 128 S transfer-in and 128 D transfer-in registers fed by the S-Push and D-Push buses; 128 S transfer-out and 128 D transfer-out registers driving the S-Pull and D-Pull buses; 32-bit execution data path (add, shift, logical, multiply, find first bit) with A_Operand, B_Operand and ALU_Out; CAM with 16 entries (tags 0 to 15, status and LRU logic, lock); CRC unit with CRC remainder; pseudo-random number generator; timers and timestamp; other local CSRs.]

Microengine v2 Features – Part 1
Clock rates
- IXP2400: 600/400 MHz
- IXP2800: 1.4 GHz / 1.0 GHz / 650 MHz
Control store
- IXP2400: 4K instruction store
- IXP2800: 8K instruction store
Configurable to 4 or 8 threads
- Each thread has its own program counter, registers, signal and wakeup events
- Generalized thread signaling (15 signals per thread)
Local storage options
- 256 GPRs
- 256 transfer registers
- 128 next-neighbor registers
- 640 32-bit words of local memory
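A small software model can make the per-thread state concrete. The sketch below assumes the 256 GPRs are split evenly among the configured threads, that signals are numbered 1 to 15, and that a thread wakes only when all awaited signals have arrived; these details, and all names, are illustrative assumptions rather than documented hardware behavior.

```python
from dataclasses import dataclass, field

TOTAL_GPRS = 256     # from the slide
NUM_SIGNALS = 15     # from the slide; numbering 1..15 is an assumption


@dataclass
class ThreadContext:
    pc: int = 0                                  # per-thread program counter
    gprs: list = field(default_factory=list)     # this thread's GPR share
    pending_signals: int = 0                     # bit n set = signal n arrived
    wakeup_mask: int = 0                         # signals being waited on

    def wait_for(self, *numbers):
        self.wakeup_mask = 0
        for n in numbers:
            self.wakeup_mask |= 1 << n

    def signal(self, number):
        if not 1 <= number <= NUM_SIGNALS:
            raise ValueError("signal number out of range")
        self.pending_signals |= 1 << number

    def is_ready(self):
        # Assumption: wake only when every awaited signal has arrived.
        return (self.pending_signals & self.wakeup_mask) == self.wakeup_mask


def make_contexts(threads):
    if threads not in (4, 8):
        raise ValueError("the slide lists only 4- or 8-thread configurations")
    share = TOTAL_GPRS // threads
    return [ThreadContext(gprs=[0] * share) for _ in range(threads)]


contexts = make_contexts(8)
t0 = contexts[0]
t0.wait_for(1, 2)        # e.g. waiting on two outstanding memory operations
t0.signal(1)
print(len(t0.gprs), t0.is_ready())
t0.signal(2)
print(t0.is_ready())
```

With 4 threads each context gets twice as many GPRs, which is the trade-off behind the 4-or-8 configuration choice.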

Microengine v2 Features – Part 2
CAM (Content Addressable Memory)
- Performs parallel lookup on 16 32-bit entries
- Reports a 9-bit lookup result
  - 4 state bits (software controlled, no impact on hardware)
  - Hit: entry number that hit; Miss: LRU entry
  - 4-bit index of the CAM entry (hit) or the LRU entry (miss)
- Improves usage of multiple threads on the same data
CRC hardware
- IXP2400: provides CRC_16, CRC_32
- IXP2800: provides CRC_16, CRC_32, iSCSI, CRC_10 and CRC_5
- Accelerates CRC computation for ATM AAL/SAR, ATM OAM and storage applications
Multiply hardware
- Supports 8x24, 16x16 and 32x32
- Accelerates metering in QoS algorithms: DiffServ, MPLS
Pseudo-random number generation
- Accelerates RED, WRED algorithms
64-bit timestamp and 16-bit profile count
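The CAM behavior described above (hit returns the matching entry, miss returns the LRU entry, plus 4 software-controlled state bits) can be modeled in a few lines. This is a software sketch only: the bit layout chosen for the 9-bit result and the class and function names are assumptions for illustration, not the hardware encoding.

```python
def pack_result(state, hit, index):
    """Assumed layout: [8:5] state bits, [4] hit flag, [3:0] entry index."""
    return ((state & 0xF) << 5) | (int(hit) << 4) | (index & 0xF)


def unpack_result(result):
    return (result >> 5) & 0xF, bool((result >> 4) & 1), result & 0xF


class Cam:
    """Software model of a 16-entry CAM with LRU replacement."""

    def __init__(self, entries=16):
        self.tags = [None] * entries
        self.state = [0] * entries
        self.lru = list(range(entries))   # front = least recently used

    def _touch(self, index):
        self.lru.remove(index)
        self.lru.append(index)            # back = most recently used

    def lookup(self, tag):
        if tag in self.tags:
            index = self.tags.index(tag)
            self._touch(index)
            return pack_result(self.state[index], True, index)
        # Miss: report the LRU entry so the caller knows which one to reuse.
        return pack_result(0, False, self.lru[0])

    def write(self, index, tag, state=0):
        self.tags[index] = tag
        self.state[index] = state & 0xF
        self._touch(index)


cam = Cam()
state, hit, entry = unpack_result(cam.lookup(0xC0A80001))
if not hit:
    cam.write(entry, 0xC0A80001, state=3)   # load the entry the miss pointed at
print(unpack_result(cam.lookup(0xC0A80001)))
```

This is the pattern that lets several threads share cached data: a thread that misses loads the entry reported as LRU, and later threads looking up the same tag get a hit and the state bits.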

Intel® XScale™ Core Overview
- High-performance, low-power, 32-bit embedded RISC processor
- Clock rate: IXP2400 [value missing] MHz; IXP2800 [value missing]/500/325 MHz
- 32 Kbyte instruction cache
- 32 Kbyte data cache
- 2 Kbyte mini-data cache
- Write buffer
- Memory management unit

IXA Software Framework
[Figure: software stack in two parts, plus external processors.]
XScale™ core programming model
- Control plane protocol stacks
- Control Plane PDK
- Core components
- Core component infrastructure library
- Resource manager library
- OSSL
Microengine programming model
- Microblocks
- Microblock infrastructure library
- Protocol library
- Utility library
- Hardware abstraction library

IXA Software Framework - Goals
- Accelerate software development for the IXP family of network processors
- Provide a simple and consistent infrastructure to write networking applications
- Enable reuse of components across applications
- Improve portability of code across the IXP family

Microengine Programming Model
[Figure: a microblock group consists of a source, several microblocks and a sink, tied together by a dispatch loop. The microblocks sit on top of the microblock infrastructure library, the protocol library, the utility library and the hardware abstraction library.]

Microblock Programming Model
Data plane libraries
- Libraries for commonly used functions
Microblock infrastructure library
- Used by the microblocks and the DL to manage packet meta data and DL variables
Microblocks
- Enable development of modular code building blocks
- Define the data flow model, common data structures, state sharing between code blocks, etc.
- Ensure consistency and improve reuse across the different reference applications
Dispatch Loop (DL)
- The glue code that binds microblocks together to form a microblock group

Microblocks
A combined set of macros/functions that performs a data plane network processing function
- Each microblock performs a major function on a packet
  - 5-tuple classification, IPv4 forwarding, NAT
- Written independently of each other
  - Reusable across applications
- Use the infrastructure library
  - Access and modify packet meta data and DL variables
- Use data plane libraries
  - Hardware abstraction and code reusability
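The relationship between microblocks and the dispatch loop can be sketched in ordinary code. Each microblock below is an independent function that touches only the packet meta data and the DL variables, and the loop is the glue that chains them. All names (the `next_block` variable, the block names, the route table) are hypothetical; this is not the IXA SDK API.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    headers: dict
    meta: dict = field(default_factory=dict)    # packet meta data


def classify(pkt, dl):
    """5-tuple classification: record the flow key, hand off to forwarding."""
    keys = ("src", "dst", "sport", "dport", "proto")
    pkt.meta["flow"] = tuple(pkt.headers[k] for k in keys)
    dl["next_block"] = "ipv4_forward"


def ipv4_forward(pkt, dl):
    """Decrement TTL and pick an output port from the (hypothetical) route table."""
    ttl = pkt.headers["ttl"] - 1
    if ttl <= 0:
        dl["next_block"] = "drop"
        return
    pkt.headers["ttl"] = ttl
    pkt.meta["out_port"] = dl["routes"].get(pkt.headers["dst"], 0)
    dl["next_block"] = "sink"


MICROBLOCKS = {"classify": classify, "ipv4_forward": ipv4_forward}


def dispatch_loop(source, routes):
    """Glue code: run each packet from the source through the microblock group."""
    sent, dropped = [], []
    for pkt in source:
        dl = {"next_block": "classify", "routes": routes}   # DL variables
        while dl["next_block"] in MICROBLOCKS:
            MICROBLOCKS[dl["next_block"]](pkt, dl)
        (sent if dl["next_block"] == "sink" else dropped).append(pkt)
    return sent, dropped


def make_packet(ttl):
    return Packet({"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 1024,
                   "dport": 80, "proto": 6, "ttl": ttl})


sent, dropped = dispatch_loop([make_packet(64), make_packet(1)],
                              routes={"10.0.0.2": 3})
print(len(sent), "sent,", len(dropped), "dropped")
```

Because the blocks communicate only through the shared variables, a NAT block could be inserted by registering one more function and changing which `next_block` value its predecessor sets.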

Microblock Architecture
[Figure: a microblock group bounded by a dispatch loop. Driver microblocks (Rx source and Tx sink) surround the packet processing microblocks (Class, IPv4, Encap). Packet meta data, the IP header and the DL variables are held in local memory (LM) and GPRs. The group sits on the microblock infrastructure library, the protocol library, the utility library and the hardware abstraction library.]