1
1 OR Project Group II: Packet Buffer Proposal
Da Chuang, Isaac Keslassy, Sundar Iyer, Greg Watson, Nick McKeown, Mark Horowitz
E-mail: stchuang@stanford.edu
Optical Router Project: http://klamath.stanford.edu/or/
2
2 Outline
Load-Balancing Background
Mis-sequencing Problem
Datapath Architecture
  First stage - Segmentation
  Second stage - Main Buffering
  Third stage - Reassembly
3
3 100Tb/s router
[Figure: Electronic Linecard #1 through Electronic Linecard #625, each with line termination, IP packet processing, and packet buffering, attached at 160 Gb/s to the 160Gb/s switch fabric. Arbitration uses Request/Grant.]
100Tb/s = 625 * 160Gb/s
4
4 Load-Balanced Switch
[Figure: external inputs 1..N feed a load-balancing cyclic shift, which feeds internal inputs 1..N; a switching cyclic shift then connects the internal inputs to external outputs 1..N.]
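A minimal sketch of a cyclic shift, assuming the common formulation in which, during time slot t, port i of one stage is connected to port (i + t) mod N of the next. The slide does not give the shift formula, so the function name and the indexing are illustrative assumptions.

```python
def cyclic_shift(port: int, t: int, n: int) -> int:
    """Port of the next stage that `port` is connected to in time slot t.

    Assumed formulation: a fixed rotation that advances by one port per slot.
    """
    return (port + t) % n


def ports_visited(port: int, n: int) -> list:
    """Next-stage ports that `port` is connected to over n consecutive slots."""
    return [cyclic_shift(port, t, n) for t in range(n)]
```

Under this assumption every slot is a permutation (no two inputs share a port), and over N consecutive slots each input is connected to every next-stage port exactly once.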
5
5 160 Gbps Linecard
[Figure: linecard datapath with an input block, an intermediate input block, and an output block, with links at rate R. Labels: lookup/processing, segmentation, fixed-size packets, load-balancing, VOQs 1..N, switching, reassembly.]
6
6 Outline
Load-Balancing Background
Mis-sequencing Problem
Datapath Architecture
  First stage - Segmentation
  Second stage - Main Buffering
  Third stage - Reassembly
7
7 Problem: Unbounded Mis-sequencing
[Figure: external inputs 1..N, internal inputs 1..N, and external outputs 1..N, connected by a spanning set of permutations; cells labeled 1 and 2 are shown in the switch.]
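A toy illustration of why spreading cells over middle linecards can reorder them. The model is an assumption made only for this sketch: one cell is sent per time slot, and a cell sent in slot t to a middle linecard that already holds k cells departs at time t + k. The function name is hypothetical.

```python
def departure_order(cells, backlog):
    """Return sequence numbers in the order cells leave the middle stage.

    cells   -- list of (sequence_number, middle_linecard), sent one per slot
    backlog -- dict mapping middle_linecard -> cells already queued there
    Toy model: a cell sent in slot t to a linecard with backlog k
    departs at time t + k.
    """
    timed = [(t + backlog[mid], seq) for t, (seq, mid) in enumerate(cells)]
    return [seq for _, seq in sorted(timed)]
```

With unequal backlogs, for example `departure_order([(0, 0), (1, 1)], {0: 5, 1: 0})`, the second cell overtakes the first; with equal backlogs the order is preserved. Because the backlog difference is not bounded in this model, neither is the reordering.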
8
8 Preventing Mis-sequencing
Uniform Frame Spreading:
Group cells by frames of N cells each (frame building)
Spread each frame across all middle linecards
Each middle stage receives the same type of packets => has the same queue occupancy state
[Figure: frames of N cells spread across middle stages 1..N.]
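The two steps of Uniform Frame Spreading can be sketched as follows. This is a simplified illustration, not the project's implementation: the function names are hypothetical, and holding back an incomplete frame until it fills is an assumption of the sketch.

```python
from collections import deque


def build_frames(cells, n):
    """Group cells into full frames of n cells; return (frames, leftover).

    Leftover cells wait until a full frame is available (an assumption here).
    """
    full = len(cells) // n
    frames = [cells[k * n:(k + 1) * n] for k in range(full)]
    return frames, cells[full * n:]


def spread_frame(frame, middle_queues):
    """Send cell k of the frame to middle linecard k."""
    if len(frame) != len(middle_queues):
        raise ValueError("a frame must hold exactly one cell per middle linecard")
    for k, cell in enumerate(frame):
        middle_queues[k].append(cell)
```

Since every frame adds exactly one cell to each middle queue, all middle queues for a given flow stay at the same occupancy, which is the property the slide relies on.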
9
9 Outline
Load-Balancing Background
Mis-sequencing Problem
Datapath Architecture
  First stage - Segmentation
  Second stage - Main Buffering
  Third stage - Reassembly
10
10 Three stages on a linecard
1st stage: Segmentation/Frame Building (queues 1..N)
2nd stage: Main Buffering (queues 1..N)
3rd stage: Reassembly (queues 1..N)
[Figure: the stages are connected by links labeled R and R/N.]
11
11 Technology Assumptions in 2005
DRAM Technology: Access Time ~ 40 ns; Size ~ 1 Gbits; Memory Bandwidth ~ 16 Gbps (16 data pins)
On-chip SRAM Technology: Access Time ~ 2.5 ns; Size ~ 64 Mbits
Serial Link Technology: Bandwidth ~ 10 Gb/s; >100 serial links per chip
12
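12 First Stage
Segmentation: variable-size packets arriving at rate R are cut into 128-byte cells (queues 1..N).
Each cell is carried as 16-byte slices (bytes 0-15, 16-31, ..., 112-127), each at rate R/8.
Frame Building: one block per 16-byte slice (queues 1..N), with input and output at rate R/8.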
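A small sketch of the segmentation step: cutting a variable-size packet into 128-byte cells and each cell into eight 16-byte slices. Zero-padding the last cell is an assumption of this sketch (the slide does not say how a partial cell is filled), and the function names are hypothetical.

```python
CELL_BYTES = 128
SLICE_BYTES = 16


def segment(packet: bytes, cell_bytes: int = CELL_BYTES) -> list:
    """Split a packet into fixed-size cells, zero-padding the last one."""
    cells = []
    for off in range(0, len(packet), cell_bytes):
        chunk = packet[off:off + cell_bytes]
        cells.append(chunk.ljust(cell_bytes, b"\x00"))
    return cells


def slice_cell(cell: bytes, slice_bytes: int = SLICE_BYTES) -> list:
    """Split one cell into equal slices, one per frame building chip."""
    return [cell[o:o + slice_bytes] for o in range(0, len(cell), slice_bytes)]


def slice_ranges(cell_bytes: int = CELL_BYTES, slice_bytes: int = SLICE_BYTES) -> list:
    """Inclusive (first_byte, last_byte) range covered by each slice."""
    return [(o, o + slice_bytes - 1) for o in range(0, cell_bytes, slice_bytes)]
```

For a 1500-byte packet, `segment` yields 12 cells, and `slice_ranges()` runs from (0, 15) to (112, 127).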
13
13 Segmentation Chip (1st stage)
[Figure: segmentation of variable-size packets at rate R into 128-byte cells (queues 1..N), leaving as 16-byte slices (bytes 0-15, 16-31, ..., 112-127) at rate R/8 each.]
Incoming: 16x10 Gb/s
Outgoing: 8x2x10 Gb/s
On-chip Memory: N x 1500 bytes = 7.2 Mbits 3.2ns SRAM
14
14 Frame Building Chip (1st stage)
[Figure: frame building on one 16-byte slice (bytes 0-15) with queues 1..N, input and output at rate R/8.]
Incoming: 2x10 Gb/s
Outgoing: 2x10 Gb/s
On-chip Memory: N^2 x 16 bytes = 48 Mbits 3.2ns SRAM
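The on-chip memory figures for the segmentation and frame building chips can be checked with a few lines of arithmetic. This sketch assumes N = 625 (the linecard count from the router slide) and reads "Mbits" as 2^20 bits, which is the interpretation that comes closest to the quoted 7.2 and 48 Mbits; both readings are assumptions.

```python
N = 625          # number of linecards (from the 100Tb/s router slide)
MBIT = 2 ** 20   # assumed meaning of "Mbits" on the slides


def segmentation_sram_bits(n: int = N, max_packet_bytes: int = 1500) -> int:
    """N x 1500 bytes: one maximum-size packet per queue."""
    return n * max_packet_bytes * 8


def frame_building_sram_bits(n: int = N, slice_bytes: int = 16) -> int:
    """N^2 x 16 bytes: N queues, each holding up to N slices of 16 bytes."""
    return n * n * slice_bytes * 8
```

These evaluate to 7,500,000 bits (about 7.15 x 2^20) and 50,000,000 bits (about 47.7 x 2^20).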
15
15 Three stages on a linecard
1st stage: Segmentation/Frame Building (queues 1..N)
2nd stage: Main Buffering (queues 1..N)
3rd stage: Reassembly (queues 1..N)
[Figure: the stages are connected by links labeled R and R/N.]
16
16 Packet Buffering Problem
Packet buffers for a 160Gb/s router linecard
[Figure: a buffer manager in front of a 40Gbits buffer memory.]
Write Rate, R: one 128B packet every 6.4ns
Read Rate, R: one 128B packet every 6.4ns
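The timing on this slide follows directly from the line rate. A short sketch of the arithmetic, with hypothetical function names:

```python
def cell_time_ns(cell_bytes: int = 128, rate_gbps: float = 160) -> float:
    """Time to receive one cell: bits divided by Gbit/s gives nanoseconds."""
    return cell_bytes * 8 / rate_gbps


def buffer_duration_s(buffer_gbits: float = 40, rate_gbps: float = 160) -> float:
    """How long the buffer takes to fill (or drain) at the full line rate."""
    return buffer_gbits / rate_gbps
```

With the slide's numbers, a 128B packet takes 6.4 ns at 160 Gb/s, and a 40 Gbit buffer corresponds to 0.25 s of traffic at line rate.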
17
17 Memory Technology
Use SRAM?
+ Fast enough random access time, but
- Too low density to store 40Gbits of data.
Use DRAM?
+ High density means we can store data, but
- Can’t meet random access time.
18
18 Can’t we just use lots of DRAMs in parallel?
[Figure: a buffer manager stripes each 1280B block across parallel buffer memories in 128B pieces: bytes 0-127, 128-255, ..., 1152-1279.]
Write Rate, R: one 128B packet every 6.4ns
Read Rate, R: one 128B packet every 6.4ns
Read/write 1280B every 32ns
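A sketch of the striping arithmetic behind the "1280B every 32ns" figure. It assumes that one block write and one block read must both fit in the time it takes a block's worth of cells to arrive, which is where the factor of 2 comes from; the function names are hypothetical.

```python
def stripe_ranges(block_bytes: int = 1280, stripe_bytes: int = 128) -> list:
    """Inclusive byte range of the block held by each parallel memory."""
    return [(o, o + stripe_bytes - 1) for o in range(0, block_bytes, stripe_bytes)]


def block_access_interval_ns(block_bytes: int = 1280,
                             cell_bytes: int = 128,
                             cell_time_ns: float = 6.4) -> float:
    """Time between successive block accesses to the striped memory.

    One write and one read share each block-arrival period, so the
    interval is half the time needed to receive a full block.
    """
    cells_per_block = block_bytes / cell_bytes
    return cells_per_block * cell_time_ns / 2
```

Ten 128B cells arrive in 64 ns, so the memory must complete one 1280B access every 32 ns.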
19
19 Works fine if there is only one FIFO
[Figure: the buffer manager collects 128B packets into 1280B blocks, writes and reads whole blocks across the buffer memories (bytes 0-127, 128-255, ..., 1152-1279), and releases 128B packets.]
Write Rate, R: one 128B packet every 6.4ns
Read Rate, R: one 128B packet every 6.4ns
20
20 In practice, buffer holds many FIFOs
[Figure: queues 1, 2, ..., Q in the buffer manager hold partly filled blocks (1280B, 320B, ?B) in front of the striped buffer memories (bytes 0-127, 128-255, ..., 1152-1279).]
e.g. In an IP Router, Q might be 200. In an ATM switch, Q might be 10^6.
Write Rate, R: one 128B packet every 6.4ns
Read Rate, R: one 128B packet every 6.4ns
How can we write multiple packets into different queues?
21
21 Hybrid Memory Hierarchy
[Figure: arriving packets enter at rate R and departing packets leave at rate R; an arbiter or scheduler issues requests. Queues 1..Q are shown in each memory.]
Small tail SRAM: cache for FIFO tails
Large DRAM memory holds the body of FIFOs (writing b bytes, reading b bytes)
Small head SRAM: cache for FIFO heads
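The hierarchy on this slide can be sketched for a single FIFO as below. This is a deliberately simplified model, not the scheme analyzed in [Iyer]: the class name is hypothetical, the head cache is refilled on demand only when it runs empty, and cells bypass DRAM when no full block is waiting there. What it does show is that DRAM is only ever touched in blocks of b cells.

```python
from collections import deque


class HybridFifo:
    """One FIFO split across a tail SRAM cache, DRAM blocks, and a head SRAM cache."""

    def __init__(self, b: int):
        self.b = b                # cells per DRAM block
        self.tail = deque()       # small SRAM cache for the FIFO tail
        self.dram = deque()       # DRAM body: each entry is a block of b cells
        self.head = deque()       # small SRAM cache for the FIFO head
        self.dram_accesses = 0    # counts block writes and block reads

    def write(self, cell):
        self.tail.append(cell)
        if len(self.tail) == self.b:
            # A full block has accumulated: move it to DRAM in one wide write.
            self.dram.append([self.tail.popleft() for _ in range(self.b)])
            self.dram_accesses += 1

    def read(self):
        if not self.head:
            if self.dram:
                # Refill the head cache with one wide DRAM read.
                self.head.extend(self.dram.popleft())
                self.dram_accesses += 1
            elif self.tail:
                # Nothing in DRAM: the oldest cell is still in the tail cache.
                self.head.append(self.tail.popleft())
            else:
                raise IndexError("read from empty FIFO")
        return self.head.popleft()
```

Writing ten cells with b = 4 and reading them back returns them in order while touching DRAM only four times (two block writes, two block reads).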
22
22 SRAM/DRAM results
How much SRAM buffering, given:
DRAM Trc = 40ns
Write and read a 128-byte cell every 6.4ns
Let Q = 625, b = 2*40ns/6.4ns = 12.5
Two Options [Iyer]
Zero Latency: Qb[2+lnQ] = 61k cells = 66 Mbits
Some Latency: Q(b-1) = 7.5k cells = 7.5 Mbits
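The two sizing formulas can be evaluated directly. The sketch below takes the expressions exactly as written on the slide and treats b as a number of cells; the function names are hypothetical. Direct evaluation with Q = 625 and b = 12.5 gives roughly 65.9k cells for the zero-latency option and about 7.2k cells for the some-latency option, so the slide's 61k and 7.5k figures appear to be rounded or computed with slightly different parameters.

```python
import math


def block_size_cells(t_rc_ns: float = 40.0, cell_time_ns: float = 6.4) -> float:
    """b = 2 * Trc / cell time: cells arriving during one write plus one read."""
    return 2 * t_rc_ns / cell_time_ns


def sram_cells_zero_latency(q: int, b: float) -> float:
    """Slide formula Q * b * (2 + ln Q)."""
    return q * b * (2 + math.log(q))


def sram_cells_some_latency(q: int, b: float) -> float:
    """Slide formula Q * (b - 1)."""
    return q * (b - 1)
```

Multiplying a cell count by 128 * 8 bits converts it to an SRAM size in bits.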
23
23 Outline
Load-Balancing Background
Mis-sequencing Problem
Datapath Architecture
  First stage - Segmentation
  Second stage - Main Buffering
  Third stage - Reassembly
24
24 Problem Statement
[Figure: a queue manager in front of a 40 Gb DRAM, with 160 Gb/s in and out.]
Write Rate, R: one 128B cell every 6.4ns
Read Rate, R: one 128B cell every 6.4ns
25
25 Second Stage
[Figure: main buffering, one block per 16-byte slice (bytes 0-15, 16-31, ..., 112-127), each with queues 1..N; links are labeled R/8 and R/N.]
26
26 Queue Manager Chip (2nd stage)
[Figure: main buffering on one 16-byte slice (bytes 0-15) with queues 1..N and 5 x 1Gb DRAM attached; links are labeled R/8, R/N, and R/4.]
Incoming: 2x10 Gb/s
Outgoing: 2x10 Gb/s
35 pins/DRAM x 5 DRAMs = 175 pins
SRAM/DRAM Memory: Q(b-1) = 2.8 Mbits 3.2ns SRAM
SRAM linked list = 1 Mbit 3.2ns SRAM
27
27 Outline
Load-Balancing Background
Mis-sequencing Problem
Datapath Architecture
  First stage - Segmentation
  Second stage - Main Buffering
  Third stage - Reassembly
28
28 Three stages on a linecard
1st stage: Segmentation/Frame Building (queues 1..N)
2nd stage: Main Buffering (queues 1..N)
3rd stage: Reassembly (queues 1..N)
[Figure: the stages are connected by links labeled R and R/N.]
29
29 Third stage
[Figure: reassembly of 16-byte slices (bytes 0-15, 16-31, ..., 112-127), arriving at rate R/8 each, into variable-size packets at rate R (queues 1..N).]
Incoming: 8x2x10 Gb/s
Outgoing: 16x10 Gb/s
On-chip Memory: N x 1500 bytes = 7.2 Mbits 3.2ns SRAM
30
30 Linecard Datapath Requirements
1st stage: 1 segmentation chip, 8 frame building chips
2nd stage: 8 queue manager chips, 40 1 Gb DRAMs
3rd stage: 1 reassembly chip
Total chip count: 18 ASIC chips, 40 1 Gb DRAMs
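The totals on this slide can be tallied from the per-stage counts and the per-chip DRAM figures on the queue manager slide (5 DRAMs of 1 Gb and 35 pins each). The constant and function names below are hypothetical.

```python
ASICS_PER_STAGE = {
    "segmentation": 1,     # 1st stage
    "frame building": 8,   # 1st stage, one per 16-byte slice
    "queue manager": 8,    # 2nd stage, one per 16-byte slice
    "reassembly": 1,       # 3rd stage
}
DRAMS_PER_QUEUE_MANAGER = 5
DRAM_GBITS = 1
PINS_PER_DRAM = 35


def totals() -> dict:
    """Chip, DRAM, buffer, and DRAM-pin totals for one linecard."""
    asics = sum(ASICS_PER_STAGE.values())
    drams = ASICS_PER_STAGE["queue manager"] * DRAMS_PER_QUEUE_MANAGER
    return {
        "asics": asics,
        "drams": drams,
        "buffer_gbits": drams * DRAM_GBITS,
        "dram_pins_per_queue_manager": DRAMS_PER_QUEUE_MANAGER * PINS_PER_DRAM,
    }
```

The tally gives 18 ASICs and 40 DRAMs, and the 40 x 1 Gb DRAMs add up to the 40 Gbit buffer stated in the packet buffering problem.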