Published by Brice Hodges. Modified over 8 years ago.
1
Block-Based Packet Buffer with Deterministic Packet Departures
Hao Wang and Bill Lin
University of California, San Diego
HPSR 2010, Dallas
2
HPSR, June 13-16, 2010
Packet Buffer in Routers
[Figure: scheduler and packet buffers, with input ("in") and output ("out") ports]
Input linecards have 8 ns (40 bytes @ 40 Gb/s) to read and write a packet.
Routers need to store packets to deal with congestion.
– Bandwidth × RTT = 40 Gb/s × 250 ms = 10 Gb of buffering.
– Too big to store in SRAM, hence the need to use DRAM.
Problem: DRAM access time is ~40 ns. Roughly 10x speed difference.
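The figures on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant names are not from the talk, and splitting each slot between one write and one read is an assumed reading of the "roughly 10x" figure, not something the slide states.

```python
# Sanity check of the slide's numbers. All names are illustrative.
LINE_RATE_GBPS = 40        # 40 Gb/s is 40 bits per nanosecond
PACKET_BYTES = 40
RTT_MS = 250
DRAM_ACCESS_NS = 40

# Time to receive one 40-byte packet at line rate.
slot_ns = PACKET_BYTES * 8 / LINE_RATE_GBPS            # 8.0 ns

# Bandwidth x RTT buffering rule.
buffer_gbit = LINE_RATE_GBPS * RTT_MS / 1000           # 10.0 Gb

# Assumption: each slot must fit one write and one read, so each access
# gets half a slot. That is one way to arrive at a roughly 10x gap.
per_access_ns = slot_ns / 2                            # 4.0 ns
speed_gap = DRAM_ACCESS_NS / per_access_ns             # 10.0

print(slot_ns, buffer_gbit, speed_gap)
```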
3
Parallel and Interleaved DRAM
[Figure: packets (P) interleaved across DRAM banks]
Assume the DRAM-to-SRAM access latency ratio is 3.
4
Problems with Parallelism
[Figure: packets 1 to 14 stored across the DRAM banks]
Access patterns may create problems.
Packets 3, 6, 9 and 11 can be accessed one after another: interleaved read requests can be issued and the packets read out at line rate.
5
Problems with Parallelism
[Figure: packets 1 to 14 stored across the DRAM banks]
But accessing packets 2 and 3, or 10 and 11, in succession is problematic.
This is an example of a packet access conflict.
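A small simulation makes the conflict concrete. This is a minimal sketch under stated assumptions: the function name and the packet-to-bank placement are hypothetical (the slide's exact layout is not recoverable from the transcript), and the placement was chosen only so that the outcome matches what the slides describe.

```python
def find_conflicts(bank_of, order, b):
    """Issue one read per time slot in the given order.

    bank_of maps packet id -> bank index; a bank stays busy for b slots
    after a read is issued to it. Returns the (slot, packet) pairs whose
    bank was still busy when the read was due.
    """
    busy_until = {}
    conflicts = []
    for slot, pkt in enumerate(order):
        bank = bank_of[pkt]
        if busy_until.get(bank, 0) > slot:
            conflicts.append((slot, pkt))
        else:
            busy_until[bank] = slot + b
    return conflicts

# Hypothetical placement: 2 and 3 share a bank, as do 10 and 11.
bank_of = {2: 0, 3: 0, 6: 1, 9: 2, 10: 3, 11: 3}
b = 3  # DRAM-to-SRAM latency ratio assumed on the earlier slide

print(find_conflicts(bank_of, [3, 6, 9, 11], b))  # []
print(find_conflicts(bank_of, [2, 3], b))         # [(1, 3)]
print(find_conflicts(bank_of, [10, 11], b))       # [(1, 11)]
```

Back-to-back reads only succeed when consecutive packets sit in different banks, which is why placement has to be planned at write time.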
6
Use Packet Departure Time
In a wide class of routers (e.g., crossbar routers), packet departures are determined by the scheduler on the fly.
– Packet buffers that cater to these routers exist but are complex.
In other high-performance routers, such as Switch-Memory-Switch and Load-Balanced Routers, the packet departure time can be calculated when the packet is inserted into the buffer.
7
Solution
Use the known departure times of the packets to schedule them to different DRAM banks so that no conflicts occur at arrival or departure.
8
Packet Buffer Abstraction
Fixed-size packets; time is slotted (example: 40 Gb/s with 40-byte packets => 8 ns slots).
The buffer may contain an arbitrarily large number of logical queues, but with deterministic access.
Single-write, single-read, time-deterministic packet buffer model.
9
Packet Buffer Architecture
Interleaved memory architecture with multiple slower DRAM banks.
– K slower DRAM banks.
– A single memory read or write operation takes b time slots to complete.
– A frame is b consecutive time slots.
Each bank is segmented into several sections.
A memory block is a collection of sections.
10
Proposed Architecture
[Figure: block diagram with labels: arriving packets, reservation table, DRAM banks D1, D2, ..., DK, memory block, bypass buffer, departure reorder buffers, departing packets]
11
Reservation Table
[Figure: reservation table with one row per memory block (1, 2, ..., i, ...) and columns 1, 2, ..., K, filled with example entries]
Use a counter of size log2 N bits to keep track of the actual number of packets in N packet locations.
Reduce the size of the reservation table by
12
Packet Access Conflicts
Arrival conflicts
– An arriving packet keeps a bank busy for b cycles.
– Need b-1 additional banks.
Departure conflicts
– It takes b cycles to read a packet to the output.
– Need b additional banks.
Overflow conflicts
– Incoming packets with departure times within N frames are stored in the same memory block.
– There can be N×b such arrivals; however, each memory section stores at most N packets.
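The grouping of departure times into frames and memory blocks can be sketched as follows. This is an illustration rather than the authors' implementation: the function name is hypothetical, slots are assumed to be numbered from 0, and reusing the M memory blocks circularly (the modulo step) is an assumption not stated on the slides.

```python
def locate(departure_slot, b, N, M):
    """Map a departure time slot to (frame index, memory block index).

    b -- time slots per frame (one DRAM read or write)
    N -- frames covered by one memory block
    M -- number of memory blocks, assumed to be reused circularly
    """
    frame = departure_slot // b
    block = (frame // N) % M
    return frame, block

b, N, M = 4, 2, 3

# Slots 0..7 span frames 0 and 1, which share block 0.
print(locate(0, b, N, M))    # (0, 0)
print(locate(7, b, N, M))    # (1, 0)
print(locate(8, b, N, M))    # (2, 1)
print(locate(24, b, N, M))   # (6, 0): wraps around under the circular assumption

# Up to N*b packets can target one block, yet a section holds only N,
# so the packets of one block have to be spread over several banks.
print(N * b)                 # 8
```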
13
Water-filling Algorithm
[Figure: memory sections of one memory block, marked busy or occupied, with the most empty available bank indicated]
A memory block is managed by a row of the reservation table.
14
Packet Access Conflicts
Water-filling Algorithm
– Pick the most empty bank to store the arriving packet.
– Solves overflow conflicts.
Theorem: With at least 3b-1 DRAM banks, it is always possible to admit all arriving packets and write them into memory blocks based on their departure times.
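The selection rule can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name, the representation of a reservation-table row as a list of per-bank counts, the explicit set of busy banks, and the first-index tie-break are all assumptions.

```python
def water_fill(row, busy, capacity):
    """Choose a bank for one arriving packet.

    row      -- per-bank packet counts for one memory block
                (one row of the reservation table)
    busy     -- set of bank indices that cannot accept a write right now
    capacity -- maximum packets per memory section (N on the slides)

    Returns the chosen bank index and increments its count, or None if
    no bank is available.
    """
    candidates = [k for k in range(len(row))
                  if k not in busy and row[k] < capacity]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda k: row[k])  # ties go to the lowest index
    row[chosen] += 1
    return chosen

b = 2
K = 3 * b - 1              # bank count from the theorem: 5
row = [2, 0, 1, 3, 1]      # one count per bank, len(row) == K
print(water_fill(row, busy={1}, capacity=4))   # 2
print(row)                                     # [2, 0, 2, 3, 1]
```

Bank 1 is the emptiest but busy, so the packet goes to the emptiest of the remaining banks.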
15
DRAM Selection Logic
[Figure: reservation table R with M rows (..., s, s+1, ..., s+u, ...) and K columns (1, 2, ..., K); write candidate vector W; vector X, in which some entries are ∞; m = 3]
16
Packet Arrival
[Figure: the reservation table R, write candidate vector W and vector X from the DRAM Selection Logic slide]
Use the write candidate vector W to check for arrival conflicts and departure conflicts.
17
Packet Arrival
[Figure: the reservation table R, write candidate vector W and vector X from the DRAM Selection Logic slide]
Pick the most empty bank to store the incoming packet.
18
Packet Departure
Packets in a memory block are moved to one of the departure reorder buffers before their departure times.
Upon departure, pick the fullest memory section first.
It is always possible to read all the packets from a memory section, even if the section is full.
All packets are guaranteed to depart on time.
19
SRAM Bypass Buffer
In the worst case, the minimum round-trip latency for storing a packet to and retrieving it from one of the DRAM banks is (2N+1)×b time slots.
A bypass buffer stores packets whose departure times are less than (2N+1)×b time slots away.
[Figure: bypass buffer with labels: arriving packets, departing packets, packet locator, head pointer]
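The bypass decision reduces to a threshold test on the time remaining until departure. The sketch below is illustrative: the function names are hypothetical, and treating "shorter than" as a strict inequality is an assumption.

```python
def bypass_threshold(N, b):
    """Worst-case DRAM round-trip latency in time slots, per the slide."""
    return (2 * N + 1) * b

def goes_to_bypass(arrival_slot, departure_slot, N, b):
    """True if the packet departs too soon to make the DRAM round trip."""
    return departure_slot - arrival_slot < bypass_threshold(N, b)

N, b = 4, 16
print(bypass_threshold(N, b))           # 144
print(goes_to_bypass(100, 200, N, b))   # True: only 100 slots away
print(goes_to_bypass(100, 244, N, b))   # False: exactly 144 slots away
```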
20
SRAM Requirement (in MB)
N is the number of packets represented by one entry in the reservation table.
Line rate is 100 Gb/s.

N      reservation table   departure buffers   bypass buffer   TOTAL
1      30                  0.01                                30.01
32     4.69                0.04                                4.77
64     2.82                0.08                                2.97
128    1.65                0.16                                1.96
256    0.94                0.32                                1.57
512    0.53                0.63                                1.78
1024   0.30                1.25                1.26            2.80
21
SRAM Requirement Comparison
Line rate is 40 Gb/s. RTT is 250 ms. b = 16. K = 3b-1.
Average packet size is 40 bytes.

Scheme              SRAM size
prefetching-based   64 MB
frame-based         12 MB
This paper          1 MB

The total SRAM size in our proposed block-based packet buffer is only 8.3% of the previous frame-based scheme and 1.6% of the state-of-the-art SRAM/DRAM prefetching buffer scheme.
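The quoted percentages and the bank count follow directly from the numbers on this slide. The short check below uses only those values; the variable names are illustrative.

```python
b = 16
K = 3 * b - 1                      # number of DRAM banks: 47

# SRAM sizes in MB, as listed on the slide.
sram_mb = {"prefetching-based": 64, "frame-based": 12, "this paper": 1}

ours = sram_mb["this paper"]
vs_frame = round(100 * ours / sram_mb["frame-based"], 1)           # 8.3
vs_prefetch = round(100 * ours / sram_mb["prefetching-based"], 1)  # 1.6

print(K, vs_frame, vs_prefetch)
```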
22
Conclusion
Packet buffer architecture for routers with deterministic packet departures, e.g., Switch-Memory-Switch and Load-Balanced Routers.
SRAM requirement grows logarithmically with the line rate.
The required number of DRAM banks is a small constant, independent of the arrival traffic patterns, the number of flows and the number of priority classes.
Scalable to growing packet storage requirements in future routers while matching increasing line rates.
23
Thank You for Your Kind Attention