Download presentation
Presentation is loading. Please wait.
1
Analysis of a Memory Architecture for Fast Packet Buffers Sundar Iyer, Ramana Rao Kompella & Nick McKeown (sundaes,ramana, nickm)@stanford.edu Departments of Electrical Engineering & Computer Science, Stanford University http://klamath.stanford.edu
2
Stanford University 2 Packet Buffer Architecture Goal Determine and analyze techniques for building high speed (>40Gb/s) electronic packet buffers. Buffer Memory Example OC768c = 40Gb/s; RTT*BW = 10Gb; 64 byte packets Write Rate, R 1 packet every 12.8 ns Read Rate, R 1 packet every 12.8 ns
3
Stanford University 3 This is not Possible Today Use SRAM? + fast enough random access time, but - too expensive, and - too low density to store 10Gb of data. Use DRAM? + high density means we can store data, but - too slow (typically 50ns random access time)
4
Stanford University 4 Characteristics of Packet Buffer Architectures The total throughput needed is at least 2(Ingress Rate) Size of Buffer is at least R * RTT The buffers have one or more FIFOs The sequence in which the FIFOs are accessed is determined by an arbiter and is unknown apriori
5
Stanford University 5 Memory Hierarchy Arriving Packets R Arbiter or Scheduler Requests Departing Packets R 12 1 Q 2 1 2 34 34 5 123456 Small egress SRAM cache of FIFO heads Large DRAM memory holds the middle of FIFOs 5768109 798 11 12141315 5052515354 868887899190 8284838586 929493 95 68791110 1 Q 2 Writing b cells Reading b cells cache of FIFO tails 55 56 9697 87 88 57585960 899091 1 Q 2 Small ingress SRAM
6
Stanford University 6 Some Thoughts The buffer architecture itself is well known Many buffers are designed to give only statistical guarantees We would like to be able to give deterministic guarantees
7
Stanford University 7 System Design Parameters Main Parameters –SRAM Size –Latency faced by a cell System Parameters –I/O Bandwidth –Number of addresses Use single address on every DRAM Use different addresses on every DRAM –Use/Non Use of DRAM Burst Mode –(non) Existence of Bank conflicts
8
Stanford University 8 The Six Commandments…. Thou shall … Assumptions on system parameters –Have no speedup on DRAM –Have no speedup on I/O = 2R –Use a single address bus for every DRAM Read/Write cells from the same queue –Split packets into cells of size “C”. –Use no DRAM Burst Mode –Design with no bank conflicts
9
Stanford University 9 Symmetry Argument The analysis and working of the ingress and egress buffer architectures are similar We shall analyze only the egress buffer architecture
10
Stanford University 10 Questions How large does the SRAM cache need to be: 1.To guarantee that a packet is immediately available in SRAM when requested, or 2.To guarantee that a packet is available within a maximum bounded time? What Memory Management Algorithm (MMA) should we use?
11
Stanford University 11 A Bad Case for the Queues …1 Q=5, w=9+, b=5 t = 1 Cells t = 3 Cells t = 4 Cells t = 5 Cells t = 7 Cells t = 2 Cells t = 6 Cells t = 0 Cells Q w
12
Stanford University 12 A Bad Case for the Queues …2 Q=5, w=9+, b=5 t = 8 Cells t = 13 Cells t = 9 Cells t = 10 Cells t = 11 Cells t = 12 Cells t = 14 Cells … t = 17 Cells
13
Stanford University 13 Result Single Address Bus Impatient Arbiter: MDQF-MMA (maximum deficit queue first), with a SRAM buffer of size Qb[2 +ln Q] guarantees zero latency.
14
Stanford University 14 Questions How large does the SRAM cache need to be: 1.To guarantee that a packet is immediately available in SRAM when requested, or 2.To guarantee that a packet is available within a maximum bounded time? What Memory Management Algorithm (MMA) should we use? This talk…
15
Stanford University 15 Earliest Critical Queue First (ECQF-MMA) 3.Compute: Find out the queue which will runs into “trouble” earliest green! 2.Lookahead: Arbiter requests for future Q(b-1) + 1 slots are known Q(b-1) + 1 Requests in lookahead buffer b - 1 1.In SRAM: A Dynamic Buffer of size Q(b-1) Q Cells 4.Replenish: “b” cells for the “troubled” queue Q b - 1 Cells
16
Stanford University 16 Example of ECQF-MMA: Q=4, b=4 t = 0; Green Critical Requests in lookahead buffer Cells t = 1 Cells Requests in lookahead buffer t = 2 Cells Requests in lookahead buffer t = 3 Cells Requests in lookahead buffer t = 4; Blue Critical Cells Requests in lookahead buffer t = 5 Cells Requests in lookahead buffer t = 6 Cells Requests in lookahead buffer t = 7 Cells Requests in lookahead buffer t = 8; Red Critical Cells Requests in lookahead buffer
17
Stanford University 17 Result Single Address Bus Patient Arbiter: ECQF-MMA (earliest critical queue first), minimizes the size of SRAM buffer to Q(b-1) cells; and guarantees that requested cells are dispatched within Q(b-1)+1 cell slots.
18
Stanford University 18 Implementation Numbers (64byte cells, b = 8, DRAM T = 50ns) 1.VOQ Switch - 32 ports –Brute Force: Egress. SRAM = 10 Gb, no DRAM –Patient Arbiter: Egress. SRAM = 115kb, Lat. = 2.9 us, DRAM = 10Gb –Impatient Arbiter: Egress. SRAM = 787 kb, DRAM = 10Gb –Patient Arbiter(MA): No SRAM, Lat. = 3.2us, DRAM = 10Gb 2.VOQ Switch - 32 ports, 16 Diffserv classes –Brute Force: Egress. SRAM = 10Gb, no DRAM –Patient Arbiter: Egress. SRAM = 1.85Mb, Lat. = 45.9us, DRAM 10Gb –Impatient Arbiter: Egress. SRAM = 18.9Mb, DRAM = 10Gb –Patient Arbiter(MA): No SRAM, Lat. = 51.2us, DRAM = 10Gb
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.