Block-Based Packet Buffer with Deterministic Packet Departures
Hao Wang and Bill Lin
University of California, San Diego
HPSR 2010, Dallas

Packet Buffer in Routers
[Figure: scheduler and packet buffers on the input linecards, with input and output links]
At 40 Gb/s, an input linecard has only 8 ns to write and read each packet.
Routers need to buffer packets to deal with congestion:
– Bandwidth × RTT = 40 Gb/s × 250 ms = 10 Gb of buffering.
– Too large to store in SRAM, hence the need to use DRAM.
Problem: DRAM access time is ~40 ns, roughly a 10x speed difference.
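A quick sanity check of these numbers (a minimal Python sketch; the 40 Gb/s rate, 40-byte packets, and 250 ms RTT are the values from this slide):

line_rate_bps = 40e9        # 40 Gb/s line rate
packet_bits = 40 * 8        # 40-byte minimum-size packet
rtt_s = 0.250               # 250 ms round-trip time

slot_ns = packet_bits / line_rate_bps * 1e9
buffer_bits = line_rate_bps * rtt_s
print(f"time per packet: {slot_ns:.0f} ns")            # 8 ns
print(f"required buffer: {buffer_bits / 1e9:.0f} Gb")  # 10 Gb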

Parallel and Interleaved DRAM
[Figure: packets written across parallel, interleaved DRAM banks]
Assume the DRAM-to-SRAM access latency ratio is 3.

Problems with Parallelism
[Figure: packets 1-12 spread across interleaved DRAM banks]
Access patterns may create problems.
To access packets 3, 6, 9, and 11 one after another, it is possible to issue interleaved read requests and read those packets out at line rate.

Problems with Parallelism
[Figure: the same interleaved DRAM banks]
But accessing 2 & 3, or 10 & 11, in succession is problematic. This is an example of a packet access conflict.

Use Packet Departure Time
There is a wide class of routers (crossbar routers) in which packet departures are determined by the scheduler on the fly.
– Packet buffers that cater to these routers exist, but they are complex.
There are other high-performance routers, such as Switch-Memory-Switch and load-balanced routers, for which the packet departure time can be calculated when the packet is inserted into the buffer.

Solution
We use the known departure times of the packets to schedule them to different DRAM banks such that there are no conflicts at arrival or departure.

Packet Buffer Abstraction
Fixed-size packets, slotted time (example: 40 Gb/s and 40-byte packets => 8 ns time slots).
The buffer may contain an arbitrarily large number of logical queues, but access is deterministic.
Single-write, single-read, time-deterministic packet buffer model.
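This abstraction can be captured by a toy reference model that ignores memory-speed constraints (the class and method names below are illustrative, not from the paper): at most one write and one read per time slot, and each write carries the packet's departure slot.

class TimeDeterministicBuffer:
    """Idealized single-write, single-read buffer: each packet's departure
    time slot is known when it arrives."""

    def __init__(self):
        self.by_departure_slot = {}

    def write(self, packet, departure_slot):
        # At most one write per time slot; the departure slot is known on arrival.
        assert departure_slot not in self.by_departure_slot
        self.by_departure_slot[departure_slot] = packet

    def read(self, current_slot):
        # At most one read per time slot; returns the packet due to depart now, if any.
        return self.by_departure_slot.pop(current_slot, None)

The rest of the talk is about realizing this behavior on top of slow, interleaved DRAM.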

Packet Buffer Architecture
Interleaved memory architecture with multiple slower DRAM banks:
– K slower DRAM banks.
– b time slots to complete a single memory read or write operation.
– b consecutive time slots form a frame.
– Each bank is segmented into several sections.
– A memory block is a collection of sections.
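These pieces can be pictured as a few nested data structures. The sketch below is illustrative only: the parameter values are examples, and the choice of one section per memory block in each bank is an assumption made for simplicity.

# Illustrative parameters (example values, not from the paper)
K = 47   # number of DRAM banks (3b - 1 for b = 16; see the theorem later)
b = 16   # time slots for one DRAM read or write; b consecutive slots form a frame
N = 32   # packets per memory section
M = 64   # number of memory blocks

class Section:
    """One section of a DRAM bank: holds up to N packets."""
    def __init__(self):
        self.packets = []

class Bank:
    """One slow DRAM bank, segmented into sections (here, one per memory block)."""
    def __init__(self):
        self.sections = [Section() for _ in range(M)]
        self.busy_until = 0   # time slot at which the bank's current b-slot operation completes

banks = [Bank() for _ in range(K)]
# Memory block m is the collection of sections banks[k].sections[m] across the banks.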

Proposed Architecture
[Figure: arriving packets pass through a reservation table and are written into DRAM banks D1 ... DK, each organized into memory blocks of N-packet sections; a bypass buffer and departure reorder buffers sit on the path to the departing packets]

Reservation Table
[Figure: reservation table with one row per memory block and K columns, one per DRAM bank]
Use a counter of log2(N) bits to keep track of the actual number of packets in each group of N packet locations.
This reduces the size of the reservation table.
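A reservation table along these lines can be kept as an M × K array of small counters. The sketch below reuses the parameters and bank structures from the earlier sketch and is illustrative, not the paper's exact bookkeeping.

# One row per memory block, one column per DRAM bank. Each entry counts how
# many packets are stored in that bank's section of that block (0..N).
reservation = [[0] * K for _ in range(M)]

def record_write(block, bank):
    assert reservation[block][bank] < N, "section full"
    reservation[block][bank] += 1

def record_read(block, bank):
    assert reservation[block][bank] > 0, "section empty"
    reservation[block][bank] -= 1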

Packet Access Conflicts
Arrival conflicts:
– An arriving packet keeps a bank busy for b cycles.
– Need b-1 additional banks.
Departure conflicts:
– It takes b cycles to read a packet to the output.
– Need b additional banks.
Overflow conflicts:
– Incoming packets with departure times within N frames are stored in the same memory block.
– There can be N×b such arrivals; however, each memory section stores at most N packets.

Water-filling Algorithm
[Figure: a memory block spanning the banks, showing busy banks, occupied sections, available banks, and the most-empty bank]
A memory block is managed by a row of the reservation table.

Packet Access Conflicts
Water-filling Algorithm:
– Pick the most-empty bank to store the arriving packet.
– Solves overflow conflicts.
Theorem: With at least 3b-1 DRAM banks, it is always possible to admit all arriving packets and write them into memory blocks based on their departure times.
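A sketch of this water-filling write rule, reusing the banks and reservation table from the earlier sketches. The conflict checks are simplified (the fuller selection logic appears on the next slides); the theorem's 3b-1 bound is what guarantees that an eligible bank always exists.

def waterfill_write(block, packet, current_slot):
    """Pick the most-empty eligible bank in this memory block for an arriving packet."""
    best_bank, best_count = None, N + 1
    for k in range(K):
        if banks[k].busy_until > current_slot:   # bank busy with an ongoing b-slot operation
            continue
        if reservation[block][k] >= N:           # this bank's section of the block is full
            continue
        if reservation[block][k] < best_count:   # water-filling: the most-empty bank wins
            best_bank, best_count = k, reservation[block][k]
    assert best_bank is not None                 # guaranteed when K >= 3b - 1 (theorem)
    banks[best_bank].sections[block].packets.append(packet)
    banks[best_bank].busy_until = current_slot + b   # the write occupies the bank for b slots
    record_write(block, best_bank)
    return best_bank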

DRAM Selection Logic
[Figure: reservation table R with M rows and K columns, a write candidate vector W over the K banks, and the row (memory block) selected for the arriving packet based on its departure time]

Packet Arrival
[Figure: the reservation table and write candidate vector during an arrival]
Use the write candidate vector W to check for arrival conflicts and departure conflicts.

Packet Arrival
[Figure: the most-empty conflict-free bank is chosen from the write candidate vector]
Pick the most-empty bank to store the incoming packet.
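The write candidate vector can be modeled as a per-bank flag that rules out conflicting banks before the water-filling choice. The predicates below are simplifications for illustration, not the paper's exact selection logic; banks_reading is a hypothetical set of banks already scheduled for reads.

def write_candidate_vector(block, current_slot, banks_reading):
    """W[k] is True iff bank k can accept the arriving packet without a conflict."""
    W = []
    for k in range(K):
        no_arrival_conflict   = banks[k].busy_until <= current_slot
        no_departure_conflict = k not in banks_reading
        has_room              = reservation[block][k] < N
        W.append(no_arrival_conflict and no_departure_conflict and has_room)
    return W

# The arriving packet then goes to the most-empty bank k with W[k] == True.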

Packet Departure
Packets in a memory block are moved to one of the departure reorder buffers before their departure times.
– Pick the fullest memory section first upon departure.
– It is always possible to read all the packets from a memory section, even if the section is full.
– All packets are guaranteed to depart on time.
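A sketch of the fullest-section-first read rule, again in terms of the earlier structures. The reorder buffer is only a placeholder list here; how packets are re-sequenced for departure is not modeled.

departure_reorder_buffers = []   # placeholder for the SRAM departure reorder buffers

def schedule_block_reads(block, current_slot):
    """Drain a memory block toward the reorder buffers, fullest section first."""
    order = sorted(range(K), key=lambda k: reservation[block][k], reverse=True)
    for k in order:
        if reservation[block][k] == 0:
            break                                 # remaining sections are empty
        if banks[k].busy_until > current_slot:
            continue                              # bank busy; retry in a later frame
        packet = banks[k].sections[block].packets.pop(0)
        banks[k].busy_until = current_slot + b    # the read occupies the bank for b slots
        record_read(block, k)
        departure_reorder_buffers.append(packet)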

SRAM Bypass Buffer
The worst-case minimum round-trip latency for storing a packet in, and retrieving it from, one of the DRAM banks is (2N+1)×b time slots.
A bypass buffer stores packets whose departure times are less than (2N+1)×b time slots away.
[Figure: SRAM bypass buffer with a packet locator and head pointer, between the arriving and departing packets]
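The arrival path then reduces to a threshold test against the (2N+1)×b bound. A minimal sketch under the same assumptions as before; the admit function and bypass_buffer list are illustrative names.

BYPASS_THRESHOLD = (2 * N + 1) * b   # worst-case DRAM store-and-retrieve round trip, in slots
bypass_buffer = []                   # small SRAM buffer for soon-departing packets

def admit(packet, departure_slot, current_slot, block):
    """Send an arriving packet either to the SRAM bypass buffer or to DRAM."""
    if departure_slot - current_slot < BYPASS_THRESHOLD:
        bypass_buffer.append((departure_slot, packet))   # too soon for a DRAM round trip
    else:
        waterfill_write(block, packet, current_slot)     # normal block-based DRAM path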

SRAM Requirement (in MB)
N is the number of packets represented by one entry in the reservation table. Line rate is 100 Gb/s.
[Table: total SRAM size versus N, broken down into reservation table, departure buffers, and bypass buffer]

SRAM Requirement Comparison
Line rate 40 Gb/s, RTT 250 ms, b = 16, K = 3b-1, average packet size 40 bytes.
prefetching-based: 64 MB | frame-based: 12 MB | This paper: 1 MB
The total SRAM size of our proposed block-based packet buffer is only 8.3% of the previous frame-based scheme and 1.6% of the state-of-the-art SRAM/DRAM prefetching buffer scheme.

Conclusion
A packet buffer architecture for routers with deterministic packet departures, e.g., Switch-Memory-Switch and load-balanced routers.
– SRAM requirement grows logarithmically with the line rate.
– The required number of DRAM banks is a small constant, independent of the arrival traffic patterns, the number of flows, and the number of priority classes.
– Scalable to the growing packet storage requirements of future routers while matching increasing line rates.

Thank You for Your Kind Attention