Packet-Mode Emulation of Output-Queued Switches
David Hay, CS, Technion
Joint work with Hagit Attiya (CS, Technion) and Isaac Keslassy (EE, Technion)


CIOQ Switches

Cell-Mode Scheduling

Trend towards Packet-Mode
- Cell-mode scheduling is getting too hard
  - Fragmentation and reassembly should work very fast, at the external rate
  - Extra header for each cell, hence loss of bandwidth
- For optical switches such fragmentation and reassembly are prohibitive
- Cell-mode schedulers are packet-oblivious
  - Degradation of the overall performance

Packet-Mode Scheduling

- No need for fragmentation and reassembly
- Must ensure contiguous packet delivery over the fabric
  - While input i delivers a packet to output j, neither input i nor output j can handle other packets.
- Can packet-mode schedulers provide performance guarantees similar to those of cell-mode schedulers?
  [Marsan et al., 2002] [Ganjali et al., 2003] [Turner, 2006]
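The contiguity constraint can be made concrete with a small checker. This is only a sketch under assumptions that are not in the slides: a schedule is represented as a list of fabric slots, each slot is a dict mapping an input to an (output, packet_id) pair, and packet ids are unique per packet. The function name is hypothetical.

```python
def is_packet_mode_schedule(schedule):
    """Check a fabric schedule for contiguous packet delivery.

    schedule[t] maps input -> (output, packet_id) for fabric slot t
    (an assumed representation, chosen for illustration only).
    """
    slots_of = {}  # packet_id -> slots in which one of its cells crosses
    for t, slot in enumerate(schedule):
        outputs = [out for out, _ in slot.values()]
        if len(outputs) != len(set(outputs)):
            return False  # two inputs target the same output in one slot
        for _, packet_id in slot.values():
            slots_of.setdefault(packet_id, []).append(t)
    # Every packet must occupy one unbroken run of slots.
    return all(ts[-1] - ts[0] + 1 == len(ts) for ts in slots_of.values())
```

For example, a schedule in which input 0 sends one cell of packet "a", then a cell of another packet, then the second cell of "a" is rejected, because packet "a" is interrupted on the fabric.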

Output Queuing Emulation
- OQ switches are considered optimal with respect to queuing delay and throughput
  - But too hard to implement in practice…
- Emulation: the same input traffic yields the same output traffic
- How hard is it for a cell-mode / packet-mode CIOQ switch to emulate an OQ switch?

Cell-Mode Emulation is Possible
- Easy with speedup S=N
  - N scheduling decisions every time-slot:
    - In the 1st decision forward the cell of input 1
    - In the 2nd decision forward the cell of input 2
    - ⋮
    - In the Nth decision forward the cell of input N
- Possible with speedup S=2: CCF algorithm
- Lower bound: S ≥ 2-1/N is required [Chuang et al., 1999]
- What is the speedup required for packet-mode emulation?

Packet-Mode Emulation is Impossible
- Regardless of speedup
  - Even with speedup S=N

Packet-Mode Emulation is Impossible

Emulation w/ Relative Queuing Delay
- The CIOQ switch is allowed a bounded lag behind the shadow OQ switch
  - Exact same behavior as the optimal OQ switch, but with some extra delay
  - Called relative queuing delay (RQD)
- Can we provide packet-mode OQ emulation with bounded RQD and small speedup?

Our Results: Speedup-RQD tradeoff
[Plot: axes Speedup and RQD; marked values 2, 4 and 2L_max]
- L_max = maximum packet size
- Lower bound on RQD (even with infinite speedup)
- Lower bound on the speedup (from cell-mode scheduling)
- Generalization of cell-mode scheduling with S=2: taking each packet of size ≤ L_max as one huge cell
- First algorithm: S ≈ 4 with RQD = O(N L_max)

Intuition for Emulation Algorithms
[Diagram: Packet-Mode CIOQ, Packet-Mode OQ, Cell-Mode CIOQ w/ S=2]

Underlying CCF Algorithm
- Observation: a packet-mode OQ switch is a cell-mode OQ switch with a different queuing discipline (called PIFO)
- Cell-mode CIOQ w/ CCF (and speedup S=2) emulates any PIFO cell-mode OQ switch [Chuang et al., 1999]
  - But CCF does not maintain contiguous packet forwarding over the fabric!
[Diagram: Packet-Mode CIOQ, Packet-Mode OQ = PIFO Cell-Mode OQ, Cell-Mode CIOQ w/ S=2]

Intuition for Emulation Algorithms
[Diagram: Packet-Mode CIOQ, Packet-Mode OQ, Cell-Mode CIOQ w/ S=2]
Two sub-steps:
1. Framing
2. Contiguous Decomposition

Frame-Based Schedulers
- Works in a pipelined, frame-based manner
- Within each frame:
  - Build a demand matrix for this frame
  - Schedule the demand matrix of the previous frame
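The pipelining can be sketched as a simple loop. This is an illustrative skeleton only: the function name and the two callbacks (`build_demand`, `schedule_demand`) are hypothetical placeholders for the framing and decomposition steps, and the sketch ignores timing inside a frame.

```python
def run_pipelined_frames(frames, build_demand, schedule_demand):
    """Schedule the previous frame's demand, then build the current one.

    frames: iterable of per-frame observations (assumed input format).
    build_demand / schedule_demand: caller-supplied, hypothetical callbacks.
    """
    previous_demand = None
    results = []
    for observations in frames:
        if previous_demand is not None:
            # The demand built during frame f-1 is served during frame f.
            results.append(schedule_demand(previous_demand))
        previous_demand = build_demand(observations)
    if previous_demand is not None:
        # Drain the demand of the final frame.
        results.append(schedule_demand(previous_demand))
    return results
```

The one-frame lag between building and scheduling a demand matrix is what contributes a term proportional to the frame length T to the relative queuing delay.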

Building the Demand Matrix
- In each frame of size T, CCF forwards at most 2T cells from each input and to each output.
- Matrix entry (1,1): number of cells CCF sent from input 1 to output 1 in the last frame
- Each row sum and each column sum is ≤ 2T
- Problem: a packet may span several frames.

Building the Demand Matrix
- Count only packets whose last cell is forwarded by CCF in the frame
- Each row/column in the matrix is bounded by 2T+N(L_max-1)
  - For each input-output pair, only the cells of one additional packet can be added.
- Translates into an RQD of 2T+L_max-2.
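The counting rule above can be sketched as follows. The event format is an assumption made for illustration: each cell that CCF forwards during the frame is reported as a tuple (input, output, packet_length, is_last_cell). The function name is hypothetical.

```python
def build_demand_matrix(n, forwarded_cells):
    """Build the N x N demand matrix (in cells) for one frame.

    forwarded_cells: (input, output, packet_length, is_last_cell) for every
    cell CCF forwarded in this frame (assumed event format).
    """
    demand = [[0] * n for _ in range(n)]
    for i, j, packet_length, is_last_cell in forwarded_cells:
        if is_last_cell:
            # The whole packet is charged to the frame in which CCF
            # forwards its last cell, even if it started in an earlier frame.
            demand[i][j] += packet_length
    return demand
```

A packet that started in an earlier frame therefore adds up to L_max-1 cells beyond those CCF forwarded in the current frame, which is where the extra N(L_max-1) term in the row/column bound comes from.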

Decomposing the Demand Matrix
- Challenge: decompose the matrix into permutations while maintaining contiguous packet delivery.
  - Each permutation dictates a scheduling decision.
  - Speedup = number of permutations / frame length
- First try: an optimal Birkhoff-von Neumann decomposition results in 2T+N(L_max-1) permutations.

Contiguous Greedy Decomposition
- To maintain contiguous packet delivery:
  - If (i,j) was matched in iteration t-1 and there are more (i,j) cells to schedule, keep the match for iteration t.
  - Find a greedy matching for the rest of the matrix.
[Figure: iterations t-1 and t, with cells left from input 1 to output 1]
- Speedup: S = 4+(2N(L_max-1)-1)/T
- RQD: 2T+L_max-2
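A minimal sketch of this rule, assuming the demand matrix is a list of lists of cell counts and each matching is returned as a dict from input to output. The first-fit order used for the greedy step is an arbitrary choice for illustration, not something the slides specify.

```python
def contiguous_greedy_decomposition(demand):
    """Decompose a demand matrix into matchings, keeping (i, j) matched
    for as long as cells from input i to output j remain."""
    n = len(demand)
    remaining = [row[:] for row in demand]  # do not modify the caller's matrix
    previous = {}
    matchings = []
    while any(any(row) for row in remaining):
        # Carry over every pair from the last iteration that still has cells.
        matching = {i: j for i, j in previous.items() if remaining[i][j] > 0}
        used_outputs = set(matching.values())
        # Greedy (first-fit) matching on the rest of the matrix.
        for i in range(n):
            if i in matching:
                continue
            for j in range(n):
                if j not in used_outputs and remaining[i][j] > 0:
                    matching[i] = j
                    used_outputs.add(j)
                    break
        for i, j in matching.items():
            remaining[i][j] -= 1
        matchings.append(matching)
        previous = matching
    return matchings
```

An entry (i,j) waits only while input i or output j is busy with other cells, so it finishes within roughly the sum of its row and column totals. With rows and columns bounded by 2T+N(L_max-1), that gives at most 4T+2N(L_max-1)-1 matchings per frame, consistent with the speedup stated on the slide.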

Our Results: Speedup-RQD tradeoff
[Plot: axes Speedup and RQD; marked values 2, 4 and 2L_max]
- S = 4+(2N(L_max-1)-1)/T, RQD = 2T+L_max-2
- Next…
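To see how the frame length T trades speedup against delay, the two formulas can be evaluated directly. The function name and the sample parameters below are illustrative only; exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def greedy_tradeoff(n, l_max, t):
    """Return (speedup, RQD) from the formulas on the slide:
    S = 4 + (2N(L_max - 1) - 1)/T and RQD = 2T + L_max - 2."""
    speedup = 4 + Fraction(2 * n * (l_max - 1) - 1, t)
    rqd = 2 * t + l_max - 2
    return speedup, rqd


# Illustrative parameters (not from the slides): N=4 ports, L_max=3 cells.
for t in (15, 150):
    print(t, greedy_tradeoff(4, 3, t))
```

Increasing T pushes the speedup towards 4 while the RQD grows linearly in T.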

Packet-Mode Emulation w/ S ≈ 2
- Separate demand matrix for every possible packet size
- Concatenate packets of the same size into mega-packets of size k = LCM(1,…,L_max)
- Leftover matrix for each size m
[Diagram: Packet-Mode CIOQ, Packet-Mode OQ, Cell-Mode CIOQ w/ S=2; sub-steps: 1. Framing, 2. Contiguous Decomposition]
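The mega-packet construction can be sketched for a single input-output pair. The representation is an assumption for illustration: `packet_counts[m]` is the number of size-m packets in the frame's demand, and the function names are hypothetical.

```python
from functools import reduce
from math import gcd


def mega_packet_size(l_max):
    """k = LCM(1, ..., L_max), so every packet size divides k."""
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, l_max + 1), 1)


def split_into_mega_packets(packet_counts, l_max):
    """Group same-size packets into mega-packets of k cells each.

    packet_counts[m] = number of size-m packets for one input-output pair
    (assumed representation). Returns (k, mega_packet_count, leftover),
    where leftover[m] is the number of size-m packets left ungrouped.
    """
    k = mega_packet_size(l_max)
    mega_packets, leftover = 0, {}
    for m, count in packet_counts.items():
        per_mega = k // m  # size-m packets that fill one mega-packet
        mega_packets += count // per_mega
        if count % per_mega:
            leftover[m] = count % per_mega
    return k, mega_packets, leftover
```

Because all mega-packets have the same size k, their matrix can be treated like a matrix of uniform "huge cells"; only the per-size leftovers need separate handling.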

Packet-Mode Emulation w/ S ≈ 2
- Optimally decompose (w/ Birkhoff-von Neumann)
  - the mega-packets matrix
  - then the leftover matrices
[Diagram: Packet-Mode CIOQ, Packet-Mode OQ, Cell-Mode CIOQ w/ S=2; sub-steps: 1. Framing, 2. Contiguous Decomposition]

Wrap-up
- Packet-mode scheduling can be done with the same speedup as cell-mode scheduling
  - At the price of bounded RQD
  - Future work: lower bounds?

Thank You!