Published by Flora Weaver. Modified over 9 years ago.
1
CSE 58x: Networking Practicum ● Instructor: Wu-chang Feng ● TA: Francis Chang
2
About the course ● Prerequisite: CSE 524 or the equivalent ● Implementation-focused course – Intel's IXA network processor platform ● Contents – Brief lecture material on network processors and the IXP – 5 weeks of designed laboratories – 3 weeks of final projects
3
Modern router architectures ● Split into a fast path and a slow path ● Control plane (slow path) – High-complexity functions – Route table management – Network control and configuration – Exception handling ● Data plane (fast path) – Low-complexity functions – Fast-path forwarding
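The fast-path/slow-path split can be sketched as a dispatch function. This is a minimal illustration, not code from the course: the `Packet` fields, the `FORWARDING_TABLE` contents, and the punt reasons are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dst: str                   # destination address, simplified to a string key
    ttl: int                   # IP time-to-live
    has_options: bool = False  # True if the IP header carries options


# Hypothetical forwarding table: destination -> output port.
FORWARDING_TABLE = {"10.0.0.1": 1, "10.0.0.2": 2}


def slow_path(pkt, reason):
    """Control plane: exception handling (here it only reports the reason)."""
    return ("slow", reason)


def fast_path(pkt):
    """Data plane: forward the common case, punt everything else."""
    if pkt.has_options:
        return slow_path(pkt, "ip-options")
    if pkt.ttl <= 1:
        return slow_path(pkt, "ttl-expired")
    port = FORWARDING_TABLE.get(pkt.dst)
    if port is None:
        return slow_path(pkt, "no-route")
    pkt.ttl -= 1  # simple header modification done on the fast path
    return ("fast", port)
```

The point of the design is that the fast path contains only cheap checks and one table lookup; anything unusual is handed to the slower, more capable control plane.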
4
Router functions ● RFC 1812 plus... – Error detection and correction – Traffic measurement and policing – Frame and protocol demultiplexing – Address lookup and packet forwarding – Segmentation, fragmentation, reassembly – Packet classification – Traffic shaping – Timing and scheduling – Queuing – Security
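Of the functions listed, address lookup is the one most often illustrated in code: IP forwarding selects the longest matching prefix. A minimal sketch using a linear scan follows. The route entries and next-hop names are hypothetical, and a real data plane would use a trie or a CAM rather than scanning every entry.

```python
import ipaddress


def build_table(routes):
    """Convert (prefix string, next hop) pairs into (network, next hop) pairs."""
    return [(ipaddress.ip_network(prefix), hop) for prefix, hop in routes]


def lookup(table, addr):
    """Return the next hop of the longest prefix containing addr, or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net, hop in table:
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None


# Hypothetical routes for illustration only.
table = build_table([
    ("0.0.0.0/0", "default"),
    ("10.0.0.0/8", "hop-A"),
    ("10.1.0.0/16", "hop-B"),
])
```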
5
Design choices for network products ● General purpose processors ● Embedded RISC processors ● Network processors ● Field-programmable gate arrays (FPGAs) ● Application-specific integrated circuits (ASICs)
6
General purpose processors (GPP) ● Programmable ● Mature development environment ● Typically used to implement control plane ● Too slow to run data plane effectively – Sequential execution – CPU and network speeds: 50x increase over last decade – Memory latencies: 2x decrease over last decade ● Gigabit Ethernet: 333-nanosecond per-packet budget ● Cache miss: ~150-200 nanoseconds
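The per-packet budget is simple arithmetic: packet size in bits divided by line rate. The sketch below computes it for a given packet size and counts how many cache misses fit in a budget. The packet sizes and the 20 bytes of preamble plus inter-frame gap are assumptions for illustration; the slides do not state which packet size yields the 333 ns figure.

```python
def packet_time_ns(size_bytes, rate_bps, overhead_bytes=0):
    """Time on the wire for one packet, in nanoseconds."""
    bits = (size_bytes + overhead_bytes) * 8
    return bits * 1e9 / rate_bps


def misses_in_budget(budget_ns, miss_ns):
    """Whole number of cache misses that fit in a per-packet time budget."""
    return int(budget_ns // miss_ns)


GIGABIT = 1e9  # bits per second

# Minimum Ethernet frame (64 bytes) plus preamble and inter-frame gap (20 bytes).
min_frame_ns = packet_time_ns(64, GIGABIT, overhead_bytes=20)

# A 40-byte IP packet with framing overhead ignored (assumption).
small_ip_ns = packet_time_ns(40, GIGABIT)

# With the slide's numbers, only one or two cache misses fit per packet.
best_case = misses_in_budget(333, 150)
worst_case = misses_in_budget(333, 200)
```

Whatever packet size is assumed, the conclusion is the same: a processor that stalls on every cache miss can afford only a handful of memory references per packet at gigabit rates.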
7
Embedded RISC processors (ERP) ● Same as GPP, but – Slower – Cheaper – Smaller (require less board space) – Designed specifically for network applications ● Typically used for control plane functions
8
Application-specific integrated circuits (ASIC) ● Custom hardware ● Long time to market ● Expensive ● Difficult to develop and simulate ● Not programmable ● Not reusable ● But, the fastest of the bunch ● Suitable for data plane
9
Field Programmable Gate Arrays (FPGA) ● Flexible re-programmable hardware ● Less dense and slower than ASICs ● Cheaper than ASICs ● Good for providing fast custom functionality ● Suitable for data plane
10
Network processors ● The speed of ASICs/FPGAs ● The programmability and cost of GPPs/ERPs ● Flexible ● Re-usable components ● Lower cost ● Suitable for data plane
11
Network processors ● Common features – Small, fast, on-chip instruction stores (no caching) – Custom network-specific instruction set programmed at assembler level ● What instructions are needed for NPs? Open question. ● Minimality, Generality – Multiple processing elements – Multiple thread contexts per element – Multiple memory interfaces to mask latency – Fast on-chip memory (headers) and slow off-chip memory (payloads) – No OS, hardware-based scheduling and thread switching
12
Why network processors? ● The propaganda ● Take the current vertical network device market ● Commoditize horizontal slices of it ● PC market – Initially, an IBM custom vertical – Now, a commodity market with Intel providing the chip-set ● Network device market – Draw your own conclusions
13
Network processing approaches ● [Figure: approaches placed on axes of programming/development ease versus speed: ASIC, network processor, FPGA, GPP, embedded RISC processor]
14
Network processor architectures ● Packet path – Store and forward ● Packet payload completely stored in and forwarded from off-chip memory ● Allows for large packet buffers ● Re-ordering problems with multiple processing elements ● Intel IXP, Motorola C5 – Cut-through ● Packet held in an on-chip FIFO and forwarded through directly ● Small packet buffers ● Built-in packet ordering ● AMCC
15
Network processor architectures ● Processing architecture – Parallel ● Each element independently performs entire processing function ● Packet re-ordering problems ● Larger instruction store needed per element – Pipelined ● Each element performs one part of larger processing function ● Communicates result to next processing element in pipeline ● Smaller code space ● Packet ordering retained ● Deterministic behavior (no memory thrashing) – Hybrid
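The ordering difference between the parallel and pipelined organizations can be seen in a toy timing model. This is a sketch under stated assumptions: all packets are already queued at time 0, packets are dispatched to parallel elements round-robin, and each pipeline stage takes an equal share of a packet's processing time. None of these details come from the slides.

```python
def parallel_output_order(proc_times, n_elements):
    """Each element runs the whole function on its packets; packets leave when done."""
    free_at = [0.0] * n_elements
    finished = []
    for seq, t in enumerate(proc_times):
        elem = seq % n_elements          # round-robin dispatch (assumption)
        done = free_at[elem] + t
        free_at[elem] = done
        finished.append((done, seq))
    finished.sort()
    return [seq for _, seq in finished]


def pipeline_output_order(proc_times, n_stages):
    """Each packet passes through every stage in FIFO order."""
    stage_free = [0.0] * n_stages
    finished = []
    for seq, t in enumerate(proc_times):
        per_stage = t / n_stages         # equal split across stages (assumption)
        ready = 0.0
        for s in range(n_stages):
            start = max(ready, stage_free[s])  # wait for the stage to be free
            ready = start + per_stage
            stage_free[s] = ready
        finished.append((ready, seq))
    finished.sort()
    return [seq for _, seq in finished]


# One slow packet followed by three fast ones.
times = [10, 1, 1, 1]
parallel = parallel_output_order(times, 2)
pipelined = pipeline_output_order(times, 2)
```

In the parallel case the fast packets overtake the slow one on the other element, which is the re-ordering problem named on the slide; in the pipeline no packet can pass the one ahead of it.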
16
Network processor architectures ● Processing hierarchy – ASICs – Embedded RISC processors – Specialized co-processors – See figure 13.7 in book
17
Network processor architectures ● Memory hierarchy – Small on-chip memory ● Control/Instruction store ● Registers ● Cache ● RAM – Large off-chip memory ● Cache ● Static RAM ● Dynamic RAM
18
Network processor architectures ● Internal interconnect – Bus – Cross-bar – FIFO – Transfer registers
19
Network processor architectures ● Concurrency – Hardware support for multiple thread contexts – Operating system support for multiple thread contexts – Pre-emptiveness – Migration support
20
Increasing network processor performance ● Processing hierarchy – Increase clock speed – Increase elements ● Memory hierarchy – Increase size – Decrease latency – Pipelining – Add hierarchies – Add memory bandwidth (parallel stores) – Add functional memory (CAMs)
21
Focus of this class... ● Network processors – Intel IXA
22
IXP 1200 features ● One embedded RISC processor (StrongARM) – Runs control plane (Linux) ● 6 programmable packet processors (μ-engines) – Run data plane (μ-engine assembler or μ-engine C) ● Central hash unit ● Multiple bus interconnects – IXBus (4.4Gbps) to overcome PCI's 2.2Gbps limit ● Small on-board memory ● Serial interface for control ● External interfaces for memory
24
IXP12xx μ-engine
25
IXP2xxx μ-engine
26
μ-engine functions ● Packet ingress from physical layer interface ● Checksum verification ● Header processing and classification ● Packet buffering in memory ● Table lookup and forwarding ● Header modification ● Checksum computation ● Packet egress to physical layer interface
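Checksum verification, header modification, and checksum computation fit together in a few lines. The sketch below uses the standard Internet checksum (the one's-complement sum of 16-bit words) on an IPv4 header. It is written in Python for readability rather than in μ-engine code, the sample header bytes are illustrative, and `decrement_ttl` assumes a TTL greater than zero.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement of the one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"                  # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def verify(header: bytes) -> bool:
    """A header with a correct checksum field sums to zero."""
    return internet_checksum(header) == 0


def decrement_ttl(header: bytes) -> bytes:
    """Header modification: decrement TTL, then recompute the checksum."""
    h = bytearray(header)
    h[8] -= 1                            # TTL is byte 8 of the IPv4 header
    h[10:12] = b"\x00\x00"               # zero the checksum field first
    h[10:12] = internet_checksum(bytes(h)).to_bytes(2, "big")
    return bytes(h)


# Sample 20-byte IPv4 header (illustrative), checksum field holds 0xb861.
sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
forwarded = decrement_ttl(sample)
```

A full recomputation is shown for clarity; an incremental update of the checksum is a common optimization when only one field changes.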
27
μ-engine characteristics ● Programmable microcontroller – Custom RISC instruction set – Private 2048-instruction store per μ-engine (loaded by StrongARM) – 5-stage execution pipeline ● Hardware support for 4 threads and context switching – Each μ-engine has 4 hardware contexts (to mask memory latency)
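Why multiple hardware contexts mask memory latency can be shown with an idealized model. Assume each thread alternates between a burst of computation and a memory reference, and that switching contexts costs nothing; both are simplifying assumptions, and the cycle counts below are made up for illustration.

```python
def engine_utilization(compute_cycles, memory_cycles, n_contexts):
    """Fraction of time the engine does useful work.

    While one context waits memory_cycles for memory, the other contexts
    can each run their compute_cycles burst. The engine saturates once the
    combined bursts cover the whole compute-plus-wait period.
    """
    period = compute_cycles + memory_cycles
    return min(1.0, n_contexts * compute_cycles / period)


# Hypothetical workload: 10 cycles of work per 30-cycle memory reference.
one_context = engine_utilization(10, 30, 1)
two_contexts = engine_utilization(10, 30, 2)
four_contexts = engine_utilization(10, 30, 4)
```

With these numbers a single context keeps the engine busy a quarter of the time, while four contexts keep it fully busy. A longer memory latency would need more contexts to hide.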
28
μ-engine characteristics ● 128 general purpose registers – Can be partitioned or shared – Absolute or context-relative ● 128 transfer registers – Staging registers for memory transfers – 4 blocks of 32 registers ● SDRAM or SRAM ● Read or Write ● Local Control and Status Registers (CSRs) – USTORE instructions, CTX, etc. (p. 315)
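The register counts above imply a simple partition when registers are divided evenly among the 4 hardware contexts. The sketch below works through that arithmetic. The linear mapping from a context-relative index to an absolute index is an assumption made for illustration, not the documented hardware layout, and the block names are hypothetical labels for the four SDRAM/SRAM, read/write combinations.

```python
N_CONTEXTS = 4
N_GPRS = 128
GPRS_PER_CONTEXT = N_GPRS // N_CONTEXTS        # 32 registers per context

# Four transfer-register blocks of 32: {SDRAM, SRAM} x {read, write}.
XFER_BLOCKS = ("sdram_read", "sdram_write", "sram_read", "sram_write")
XFER_PER_BLOCK = 32
XFER_PER_CONTEXT_PER_BLOCK = XFER_PER_BLOCK // N_CONTEXTS  # 8 per context


def gpr_absolute(ctx, rel):
    """Map a context-relative register index to an absolute one.

    Assumed layout: context 0 owns registers 0-31, context 1 owns 32-63, etc.
    """
    if not 0 <= ctx < N_CONTEXTS:
        raise ValueError("context out of range")
    if not 0 <= rel < GPRS_PER_CONTEXT:
        raise ValueError("relative register index out of range")
    return ctx * GPRS_PER_CONTEXT + rel
```

Context-relative addressing lets the same code run in every thread without the threads overwriting each other's registers, while absolute addressing allows registers to be shared.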
29
μ-engine characteristics ● FBI unit – Scratchpad memory – Hash unit – FBI CSRs – IXBus control – IXBus FIFOs ● Transmit and Receive FIFOs to external line cards
30
μ-engine opcodes ● ALU instructions – ALU, ALU_SHF, DBL_SHIFT ● Branch/Jump instructions – BR, BR=0, BR!=0, BR_BSET, BR=BYTE, BR=CTX, BR_INP_STATE, BR_!SIGNAL, JUMP, RTN, etc. ● Reference instructions – CSR, FAST_WR, LOCAL_CSR_RD, R_FIFO_RD, PCI_DMA, SCRATCH, SDRAM, SRAM, T_FIFO_WR, etc. ● Local register instructions – FIND_BST, IMMED, LD_FIELD, LOAD_ADDR, LOAD_BSET_RESULT1, etc.
31
μ-engine functions ● Miscellaneous – CTX_ARB – NOP – HASH1_48, HASH1_64, etc.
32
1. Packet received on physical interface (MAC) 2. Ready-bus sequencer polls MAC for mpacket; updates receive-ready upon a full mpacket 3. μ-engine polls for receive-ready 4. μ-engine instructs FBI to move mpacket from MAC to RFIFO 5. μ-engine moves mpacket directly from RFIFO to SDRAM 6. Repeat 1-5 until full packet received 7. μ-engine or StrongARM processing 8. Packet header read from SDRAM or RFIFO into μ-engine and classified (via SRAM tables) 9. Packet headers modified 10. mpackets sent to interface 11. Poll for space on MAC; update transmit-ready if room for mpacket 12. mpackets transferred to MAC
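The receive side of the steps above (1 through 6) amounts to reassembling a packet from fixed-size mpackets. The sketch below models that loop in Python. The 64-byte mpacket size, the start-of-packet and end-of-packet flags, and the use of a `bytearray` to stand in for the SDRAM packet buffer are assumptions for illustration.

```python
MPACKET_SIZE = 64  # bytes per mpacket (assumed)


def split_into_mpackets(packet: bytes):
    """Model the MAC side: cut a packet into mpackets with SOP/EOP flags."""
    chunks = [packet[i:i + MPACKET_SIZE]
              for i in range(0, len(packet), MPACKET_SIZE)]
    last = len(chunks) - 1
    return [{"sop": i == 0, "eop": i == last, "data": chunk}
            for i, chunk in enumerate(chunks)]


def receive(mpackets):
    """Copy mpackets into a buffer until end-of-packet is seen.

    Returns the full packet, or None if the final mpacket never arrived.
    """
    buffer = bytearray()                 # stands in for the SDRAM buffer
    for mp in mpackets:
        if mp["sop"]:
            buffer.clear()               # a new packet starts here
        buffer += mp["data"]             # RFIFO -> SDRAM copy (step 5)
        if mp["eop"]:
            return bytes(buffer)         # full packet received (step 6)
    return None


pkt = bytes(range(150))                  # 150-byte test packet
mps = split_into_mpackets(pkt)           # 64 + 64 + 22 bytes
```

The transmit side (steps 10 through 12) is the mirror image: the stored packet is cut back into mpackets and handed to the MAC as space becomes available.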
33
Programming the IXP ● This course focuses on steps 7, 8, and 9 ● 2 programming frameworks – Command-line, IXA Active Computing Engine (ACE) framework – Graphical microengine C development environment
34
Programming the IXP ● Command-line, IXA Active Computing Engine (ACE) framework – Re-usable function blocks chained together to build an application (Chapters 22-24) – New functions implemented as new blocks in chain ● Core ACEs (StrongARM) – Written in C ● Microblock ACEs (microengines) – Written in assembler
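The idea of chaining re-usable function blocks can be sketched without any SDK. The code below is not the IXA ACE API; `make_chain` and the three blocks are hypothetical names invented for this illustration, and packets are modeled as plain dictionaries.

```python
def make_chain(*blocks):
    """Compose function blocks; a block returns None to drop the packet."""
    def run(packet):
        for block in blocks:
            packet = block(packet)
            if packet is None:
                return None              # dropped, later blocks are skipped
        return packet
    return run


# Hypothetical blocks, each taking and returning a packet dictionary.
def drop_expired(pkt):
    return pkt if pkt["ttl"] > 1 else None


def decrement_ttl(pkt):
    return {**pkt, "ttl": pkt["ttl"] - 1}


def tag_port(pkt):
    return {**pkt, "out_port": 1}


# Adding a new function means adding a new block to the chain.
app = make_chain(drop_expired, decrement_ttl, tag_port)
```

Because each block has the same interface, blocks can be reordered, replaced, or inserted without touching the others, which is the property the slide attributes to the ACE framework.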
36
Programming the IXP ● Graphical microengine C development environment – Monolithic microengine C code (cannot be used on IXP1200 hardware) – Demos forthcoming