OR Project Group II: Packet Buffer Proposal
Da Chuang, Isaac Keslassy, Sundar Iyer, Greg Watson, Nick McKeown, Mark Horowitz

Presentation transcript:

1 Optical Router Project. OR Project Group II: Packet Buffer Proposal
Da Chuang, Isaac Keslassy, Sundar Iyer, Greg Watson, Nick McKeown, Mark Horowitz

2 Outline
- Load-Balancing Background
- Mis-sequencing Problem
- Datapath Architecture
- First stage - Segmentation
- Second stage - Main Buffering
- Third stage - Reassembly

3 100Tb/s router
[Figure: electronic linecards #1 to #625, each with line termination, IP packet processing, and packet buffering, attach at 160 Gb/s to a switch fabric; an arbitration block exchanges requests and grants with the linecards. 100Tb/s = 625 * 160Gb/s.]

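4 Load-Balanced Switch
[Figure: N external inputs, N internal inputs, and N external outputs. A load-balancing cyclic shift connects the external inputs to the internal inputs; a switching cyclic shift connects the internal inputs to the external outputs.]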
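A minimal sketch of what a cyclic-shift stage does, assuming the simplest possible pattern (in time slot t, port i on one side connects to port (i + t) mod N on the other). The function name `connections` and the choice of N = 4 are illustrative assumptions, not taken from the slides.

```python
def connections(t, n):
    """Assumed pattern: in slot t, port i connects to port (i + t) mod n."""
    return [(i + t) % n for i in range(n)]

N = 4  # illustrative size; the proposal uses N = 625 linecards

# One connection pattern per time slot, over a full cycle of N slots.
schedule = [connections(t, N) for t in range(N)]

# Every slot is a permutation: no two inputs share an internal input.
every_slot_is_permutation = all(sorted(slot) == list(range(N)) for slot in schedule)

# Over N slots, external input 0 is connected to each internal input exactly once,
# so its traffic is spread evenly regardless of destination.
visited_by_input_0 = sorted(slot[0] for slot in schedule)

print(every_slot_is_permutation, visited_by_input_0)
```

The same function can stand in for the second (switching) stage, since both stages are described as cyclic shifts.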

5 160 Gbps Linecard
[Figure: linecard datapath with links at rate R. Blocks shown: lookup/processing, segmentation into fixed-size packets, input block, load-balancing, intermediate input block with VOQs (labeled 1, 2, ..., N), switching, output block, reassembly.]

6 Outline
- Load-Balancing Background
- Mis-sequencing Problem
- Datapath Architecture
- First stage - Segmentation
- Second stage - Main Buffering
- Third stage - Reassembly

7 Problem: Unbounded Mis-sequencing
[Figure: the load-balanced switch with N external inputs, N internal inputs, and N external outputs; the interconnection is labeled "Spanning Set of Permutations".]

8 Preventing Mis-sequencing
- Uniform Frame Spreading:
  - Group cells by frames of N cells each (frame building)
  - Spread each frame across all middle linecards
  - Each middle stage receives the same type of packets, and so has the same queue occupancy state
[Figure: frames of N cells from inputs 1 to N spread across the middle stage.]
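The frame-building and spreading steps can be sketched as follows. This is a simplified illustration for a single input linecard, assuming cells are identified only by their destination output; the function name `spread_frames` and the sample arrival sequence are hypothetical.

```python
from collections import defaultdict

def spread_frames(arrivals, n):
    """arrivals: destination output of each arriving cell, in arrival order.

    Cells wait in a per-output queue until N of them form a full frame;
    cell k of the frame then goes to middle linecard k.
    """
    pending = defaultdict(list)                     # partial frames, per output
    middle = [defaultdict(int) for _ in range(n)]   # middle[m][output] = cells queued
    for seq, out in enumerate(arrivals):
        pending[out].append(seq)
        if len(pending[out]) == n:                  # frame complete: spread it
            for m in range(n):
                middle[m][out] += 1
            pending[out].clear()
    return middle, pending

middle, pending = spread_frames([0, 1, 0, 0, 1, 0, 1, 1], n=3)
occupancy = [dict(m) for m in middle]
print(occupancy)
print(dict(pending))
```

Because only complete frames are released, every middle linecard holds the same number of cells for each output, which is the equal-occupancy property the slide relies on. Cells in incomplete frames stay at the input.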

9 Outline
- Load-Balancing Background
- Mis-sequencing Problem
- Datapath Architecture
- First stage - Segmentation
- Second stage - Main Buffering
- Third stage - Reassembly

10 Three stages on a linecard
[Figure: 1st stage, segmentation/frame building; 2nd stage, main buffering; 3rd stage, reassembly. Each stage holds queues 1, 2, ..., N; link rates R and R/N are marked.]

11 Technology Assumptions in 2005
- DRAM Technology: Access Time ~ 40 ns; Size ~ 1 Gbits; Memory Bandwidth ~ 16 Gbps (16 data pins)
- On-chip SRAM Technology: Access Time ~ 2.5 ns; Size ~ 64 Mbits
- Serial Link Technology: Bandwidth ~ 10 Gb/s; >100 serial links per chip

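12 First Stage
[Figure: segmentation cuts variable-size packets arriving at rate R into 128-byte cells held in queues 1, 2, ..., N; 16-byte slices go at R/8 to frame building blocks, each with queues 1, 2, ..., N and a 16-byte output at R/8.]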

13 Segmentation Chip (1st stage)
- Incoming: 16x10 Gb/s
- Outgoing: 8x2x10 Gb/s
- On-chip Memory: N x 1500 bytes = 7.2 Mbits of 3.2ns SRAM
[Figure: segmentation of variable-size packets at rate R into 128-byte cells in queues 1, 2, ..., N, with 16-byte outputs at R/8.]

14 Frame Building Chip (1st stage)
- Incoming: 2x10 Gb/s
- Outgoing: 2x10 Gb/s
- On-chip Memory: N^2 x 16 bytes = 48 Mbits of 3.2ns SRAM
[Figure: frame building block with queues 1, 2, ..., N; input and output each carry 16-byte slices (bytes 0-15) at R/8.]
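The on-chip memory figures quoted for the two first-stage chips can be reproduced with a short calculation. This sketch assumes N = 625 (the linecard count from the 100Tb/s router slide) and assumes that 1 Mbit means 2^20 bits, which is the convention under which the arithmetic lands on the quoted values; neither assumption is stated on these slides.

```python
N = 625                 # assumed: number of linecards
BITS_PER_MBIT = 2**20   # assumed: binary megabit

# Segmentation chip: one maximum-size (1500-byte) packet per queue, N queues.
segmentation_bits = N * 1500 * 8
segmentation_mbits = segmentation_bits / BITS_PER_MBIT

# Frame building chip: N queues, each holding a frame of N 16-byte slices.
frame_building_bits = N * N * 16 * 8
frame_building_mbits = frame_building_bits / BITS_PER_MBIT

print(round(segmentation_mbits, 1))   # about 7.2
print(round(frame_building_mbits))    # about 48
```

Both results fit within the ~64 Mbit on-chip SRAM budget listed in the technology assumptions.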

15 Three stages on a linecard
[Figure: 1st stage, segmentation/frame building; 2nd stage, main buffering; 3rd stage, reassembly. Each stage holds queues 1, 2, ..., N; link rates R and R/N are marked.]

16 Packet Buffering Problem
Packet buffers for a 160Gb/s router linecard
[Figure: a buffer manager in front of a 40Gbit buffer memory. Write rate R: one 128B packet every 6.4ns. Read rate R: one 128B packet every 6.4ns.]
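The 6.4ns figure follows directly from the line rate and the cell size. The 40Gbit buffer size is consistent with the common rule of thumb of buffering one round-trip time of traffic; the 0.25 s round-trip time used below is an assumption that reproduces the quoted number, not something stated on the slide.

```python
import math

LINE_RATE_BPS = 160e9   # R, from the slide
CELL_BYTES = 128        # from the slide
ASSUMED_RTT_S = 0.25    # assumption: rule-of-thumb round-trip time

# Time to receive (or transmit) one 128-byte cell at the line rate.
cell_time_ns = CELL_BYTES * 8 / LINE_RATE_BPS * 1e9

# Buffer needed to hold one assumed RTT of traffic at the line rate.
buffer_gbits = LINE_RATE_BPS * ASSUMED_RTT_S / 1e9

print(cell_time_ns, buffer_gbits)
```

So the buffer must accept one write and serve one read in every 6.4ns window.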

17 Memory Technology
- Use SRAM?
  + Fast enough random access time, but
  - Too low density to store 40Gbits of data.
- Use DRAM?
  + High density means we can store data, but
  - Can’t meet random access time.

18 Can’t we just use lots of DRAMs in parallel?
[Figure: the buffer manager reads/writes 1280B every 32ns across a bank of parallel buffer memories. Write rate R: one 128B packet every 6.4ns. Read rate R: one 128B packet every 6.4ns.]
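A quick check of why a 1280B access every 32ns is enough: it delivers twice the line rate, which covers one write stream and one read stream at R. The variable names are illustrative.

```python
BLOCK_BYTES = 1280       # wide-word access size, from the slide
BLOCK_PERIOD_NS = 32.0   # one wide access every 32 ns, from the slide
CELL_BYTES = 128
LINE_RATE_GBPS = 160.0   # R

# Bits per nanosecond is numerically equal to Gb/s.
block_bandwidth_gbps = BLOCK_BYTES * 8 / BLOCK_PERIOD_NS

# Number of 128-byte cells carried by one wide access.
cells_per_block = BLOCK_BYTES // CELL_BYTES

print(block_bandwidth_gbps, cells_per_block)
```

The catch, raised on the following slides, is that all the cells in one wide access must belong to the same FIFO.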

19 Works fine if there is only one FIFO
[Figure: the buffer manager gathers arriving 128B packets into 1280B blocks, writes them to the buffer memory, and reads 1280B blocks back to emit 128B packets. Write rate R: one 128B packet every 6.4ns. Read rate R: one 128B packet every 6.4ns.]

20 In practice, buffer holds many FIFOs
e.g.
- In an IP Router, Q might be 200.
- In an ATM switch, Q might be
How can we write multiple packets into different queues?
[Figure: FIFOs 1, 2, ..., Q in 1280B-wide buffer memory, with partially filled blocks (320B, ?B) per queue. Write rate R: one 128B packet every 6.4ns. Read rate R: one 128B packet every 6.4ns.]

21 Hybrid Memory Hierarchy
[Figure: arriving packets at rate R enter a small tail SRAM (cache for FIFO tails, queues 1 to Q); b bytes at a time are written to a large DRAM memory that holds the body of the FIFOs; b bytes at a time are read into a small head SRAM (cache for FIFO heads, queues 1 to Q), from which packets depart at rate R in response to requests from an arbiter or scheduler.]
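A minimal sketch of the tail side of this hierarchy, assuming the simplest policy: cells accumulate per queue in SRAM, and as soon as a queue holds b cells they are moved to DRAM as one block. The class name `TailCache` and its methods are hypothetical, and the sketch ignores DRAM access timing and the choice of which queue to service, both of which matter in a real design.

```python
from collections import defaultdict, deque

class TailCache:
    """Per-queue tail cache in SRAM with block writes of b cells to DRAM."""

    def __init__(self, b):
        self.b = b
        self.sram = defaultdict(deque)   # small tail SRAM, per queue
        self.dram = defaultdict(deque)   # stands in for the large DRAM, per queue

    def enqueue(self, queue_id, cell):
        tail = self.sram[queue_id]
        tail.append(cell)
        if len(tail) >= self.b:
            # One wide DRAM write carrying b cells of the same queue.
            block = [tail.popleft() for _ in range(self.b)]
            self.dram[queue_id].append(block)

cache = TailCache(b=4)
for i in range(9):
    cache.enqueue(0, f"q0-cell{i}")
for i in range(2):
    cache.enqueue(1, f"q1-cell{i}")

print(len(cache.dram[0]), len(cache.sram[0]), len(cache.dram[1]), len(cache.sram[1]))
```

Every DRAM access carries cells of a single queue, so FIFO order within each queue is preserved; the head cache would mirror this by reading one block at a time.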

22 SRAM/DRAM results
- How much SRAM buffering, given:
  - DRAM Trc = 40ns
  - Write and read a 128-byte cell every 6.4ns
- Let Q = 625, b = 2*40ns/6.4ns = 12.5
- Two Options [Iyer]
  - Zero Latency: Qb[2+lnQ] = 61k cells = 66 Mbits
  - Some Latency: Q(b-1) = 7.5k cells = 7.5 Mbits
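The two sizing formulas can be evaluated directly. This sketch plugs in the slide's parameters; the function names are illustrative, and the rounding convention behind the slide's quoted totals is not stated, so the literal results should be read as approximate checks rather than exact reproductions.

```python
import math

Q = 625            # number of queues, from the slide
TRC_NS = 40.0      # DRAM random access time, from the slide
CELL_TIME_NS = 6.4 # one 128-byte cell time, from the slide

# Cells per DRAM block: one write and one read must fit in each access period.
b = 2 * TRC_NS / CELL_TIME_NS

def zero_latency_cells(q, b):
    """SRAM size in cells for the zero-latency option: Qb(2 + ln Q)."""
    return q * b * (2 + math.log(q))

def some_latency_cells(q, b):
    """SRAM size in cells for the some-latency option: Q(b - 1)."""
    return q * (b - 1)

print(b)
print(zero_latency_cells(Q, b))
print(some_latency_cells(Q, b))
print(some_latency_cells(Q, math.ceil(b)))
```

With b rounded up to 13 cells, Q(b-1) gives 7500 cells, in line with the quoted 7.5k. Evaluated literally with b = 12.5, the zero-latency formula gives roughly 66k cells, the same order as the slide's figure. Either way, the some-latency option needs roughly an order of magnitude less SRAM.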

23 Outline
- Load-Balancing Background
- Mis-sequencing Problem
- Datapath Architecture
- First stage - Segmentation
- Second stage - Main Buffering
- Third stage - Reassembly

24 Problem Statement
[Figure: a queue manager in front of 40 Gb of DRAM, with 160 Gb/s in and out. Write rate R: one 128B cell every 6.4ns. Read rate R: one 128B cell every 6.4ns.]

25 Second Stage
[Figure: main buffering blocks, each with queues 1, 2, ..., N served at R/N, receiving and sending 16-byte slices at R/8.]

26 Queue Manager Chip (2nd stage)
- Incoming: 2x10 Gb/s
- Outgoing: 2x10 Gb/s
- 35 pins/DRAM x 5 DRAMs = 175 pins
- SRAM/DRAM Memory: Q(b-1) = 2.8 Mbits of 3.2ns SRAM
- SRAM linked list = 1 Mbit of 3.2ns SRAM
[Figure: main buffering block with queues 1, 2, ..., N served at R/N, 16-byte slices (bytes 0-15) in and out at R/8, and an R/4 interface to 5 x 1Gb DRAM.]

27 Outline
- Load-Balancing Background
- Mis-sequencing Problem
- Datapath Architecture
- First stage - Segmentation
- Second stage - Main Buffering
- Third stage - Reassembly

28 Three stages on a linecard
[Figure: 1st stage, segmentation/frame building; 2nd stage, main buffering; 3rd stage, reassembly. Each stage holds queues 1, 2, ..., N; link rates R and R/N are marked.]

29 Third stage
- Incoming: 8x2x10 Gb/s
- Outgoing: 16x10 Gb/s
- On-chip Memory: N x 1500 bytes = 7.2 Mbits of 3.2ns SRAM
[Figure: reassembly of 16-byte slices arriving at R/8 into variable-size packets leaving at rate R, with queues 1, 2, ..., N.]

30 Linecard Datapath Requirements
- 1st stage
  - 1 segmentation chip
  - 8 frame building chips
- 2nd stage
  - 8 queue manager chips
  - 40 1 Gb DRAMs
- 3rd stage
  - 1 reassembly chip
- Total chip count
  - 18 ASIC chips
  - 40 1 Gb DRAMs
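The totals can be cross-checked against the per-stage counts. The dictionary below simply restates the slide's numbers; the 5 DRAMs per queue manager comes from the queue manager chip slide, and the variable names are illustrative.

```python
# ASIC counts per chip type, as listed on the slide.
asics = {
    "segmentation": 1,
    "frame_building": 8,
    "queue_manager": 8,
    "reassembly": 1,
}
DRAMS_PER_QUEUE_MANAGER = 5   # from the queue manager chip slide

total_asics = sum(asics.values())
total_drams = asics["queue_manager"] * DRAMS_PER_QUEUE_MANAGER

print(total_asics, total_drams)
```

The factor of 8 in the frame building and queue manager counts matches the split of each 128-byte cell into eight 16-byte slices at R/8.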