048866: Packet Switch Architectures Dr. Isaac Keslassy Electrical Engineering, Technion Scheduling.

048866: Packet Switch Architectures Dr. Isaac Keslassy Electrical Engineering, Technion Scheduling in Input-Queued Switches Uniform Traffic Birkhoff-von Neumann

Spring – Packet Switch Architectures2 Where We Are
- We introduced IQ switches
- We saw that HoL blocking reduces throughput
- We got tools from queueing theory to analyze more complex queueing systems

Spring – Packet Switch Architectures3 Where We Are
- We will now study input-queued switches with VOQs (Virtual Output Queues)
  - No HoL blocking
  - But we need good scheduling algorithms to obtain 100% throughput

Spring – Packet Switch Architectures4 History
1. [Karol et al., 1987] HoL Blocking: Throughput limited to 58% for Bernoulli IID uniform traffic.

Spring – Packet Switch Architectures5 History
2. [Tamir and Frazier, 1988] VOQs: remove HoL blocking, increase throughput

Spring – Packet Switch Architectures6 History
3. [Anderson et al., 1993] MSM: analogy to MSM (Maximum Size Matching) in bipartite graph

Spring – Packet Switch Architectures7 History
4. [McKeown et al., 1995] MWM: MSM (Maximum Size Matching) does not guarantee 100% throughput. MWM (Maximum Weight Matching) does.
5. [Chuang et al., 1998] CIOQ: IQ can emulate OQ with speedup 2.
6. [Chang et al., 1999] BvN: A schedule implementing a Birkhoff-von Neumann decomposition gets 100% throughput.

Spring – Packet Switch Architectures8 History
7. [Leonardi et al., 2000; Dai and Prabhakar, 2000] Maximal: IQ can get 100% throughput with speedup 2 using maximal matchings. For instance, WFA [Tamir and Chi, 1993], PIM [Anderson et al., 1993], iSLIP [McKeown et al., 1993].
8. [Andrews and Zhang, 2001] Network: A network of MWM switches is unstable.
9. [Chang et al., 2002] LBR: A Load-Balanced Router provides 100% throughput without scheduling.

Spring – Packet Switch Architectures9 Achieving 100% throughput
1. Switch model
2. Uniform traffic
   - Technique: Uniform schedule (easy)
3. Non-uniform traffic, but known traffic matrix
   - Technique: Non-uniform schedule (Birkhoff-von Neumann)
4. Unknown traffic matrix
   - Technique: Lyapunov functions (MWM)
5. Faster scheduling algorithms
   - Technique: Speedup (maximal matchings)
   - Technique: Memory and randomization (Tassiulas)
   - Technique: Twist architecture (buffered crossbar)
6. Accelerate scheduling algorithm
   - Technique: Pipelining
   - Technique: Envelopes
   - Technique: Slicing
7. No scheduling algorithm
   - Technique: Load-balanced router

Spring – Packet Switch Architectures10 Head-of-Line Blocking Blocked!

Spring – Packet Switch Architectures13 Virtual Output Queues

Spring – Packet Switch Architectures14 Scheduler VOQs VOQs: How Packets Move

Spring – Packet Switch Architectures15 Question: do more lanes help?
- Answer: it depends on the scheduling
[Figure panels: Head-of-Line Blocking; VOQs with Bad Scheduling; Good Scheduling? (Ayalon: depends on traffic matrix…)]

Spring – Packet Switch Architectures16 Basic Switch Model
[Figure: N×N switch with per-input arrivals $A_1(n), \dots, A_N(n)$, per-VOQ arrivals $A_{ij}(n)$, queues $Q_{11}(n), \dots, Q_{NN}(n)$, schedule $S(n)$, and departures $D_{ij}(n)$]

Spring – Packet Switch Architectures17 Notations: Arrivals
- $A_{ij}(n)$: packet arrivals at input $i$ for output $j$ at time-slot $n$
  - $A_{ij}(n) = 0$ or $1$
- $\lambda_{ij} = E[A_{ij}(n)]$: arrival rate
- $\Lambda = [\lambda_{ij}]$: traffic matrix
- $A = [A_{ij}(n)]$ is admissible iff:
  - For all $i$, $\sum_j \lambda_{ij} < 1$: no input is oversubscribed
  - For all $j$, $\sum_i \lambda_{ij} < 1$: no output is oversubscribed
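
The admissibility condition is easy to check mechanically. The sketch below is only an illustration: the function name `is_admissible` and the two example matrices are assumptions made here, not part of the lecture.

```python
from typing import List


def is_admissible(rates: List[List[float]]) -> bool:
    """Return True if every row sum and every column sum is strictly below 1."""
    n = len(rates)
    if any(len(row) != n for row in rates):
        raise ValueError("traffic matrix must be square")
    rows_ok = all(sum(row) < 1 for row in rates)
    cols_ok = all(sum(rates[i][j] for i in range(n)) < 1 for j in range(n))
    return rows_ok and cols_ok


# Hypothetical examples (not taken from the slides).
uniform = [[0.2] * 4 for _ in range(4)]   # every row and column sums to about 0.8
overloaded = [[0.6, 0.5], [0.1, 0.1]]     # the first input receives 1.1 packets per slot
print(is_admissible(uniform), is_admissible(overloaded))
```

Note the strict inequality: a row or column that sums to exactly 1 is rejected, matching the "< 1" in the definition.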

Spring – Packet Switch Architectures18 Notations: Schedule
- $Q_{ij}(n)$: queue size of VOQ $(i,j)$
  - $Q = [Q_{ij}(n)]$
- $S_{ij}(n)$: whether the schedule connects input $i$ to output $j$
  - $S_{ij}(n) = 0$ or $1$
  - No speedup: each input is connected to at most one output, each output to at most one input
  - We will assume that each input is connected to exactly one output, and each output to exactly one input
  - $S = [S_{ij}(n)]$ is a permutation matrix

Spring – Packet Switch Architectures19 Scheduling Algorithm
- What it does: determine $S(n)$
- How:
  - Either using the traffic matrix $\Lambda$,
  - Or, in most cases, using the queue sizes $Q(n)$ (because $\Lambda$ is unknown)
- Objective: 100% throughput
  - So that lines are fully utilized
- Secondary objective: minimize packet delays/backlogs

Spring – Packet Switch Architectures20 What is “100% throughput”?
- Work-conserving scheduler
  - Definition: if there are one or more packets in the system for an output, then that output is busy.
  - An output-queued switch is work-conserving.
  - Each output can be modeled as an independent single-server queue.
  - If [condition not preserved in the transcript], then $E[Q_{ij}(n)] < C$ for some $C$.
  - Therefore, we say it achieves “100% throughput”.
  - For fixed-size packets, work conservation also minimizes average packet delay. (Q: What happens when packet sizes vary?)
- Non-work-conserving scheduler
  - An input-queued switch is, in general, non-work-conserving.
  - Q: What definitions make sense for “100% throughput”?

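Spring – Packet Switch Architectures21 Some common definitions of 100% throughput
1. Work-conserving
2. For all $n, i, j$: $Q_{ij}(n) < C$
3. For all $n, i, j$: $E[Q_{ij}(n)] < C$
4. Departure rate = arrival rate
Each definition is weaker than the one above it. We will focus on this definition. [The slide's marker showing which definition is meant, and the formulas that followed each “i.e.”, are not preserved in the transcript.]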

Spring – Packet Switch Architectures23 Uniform Traffic
- Definition: $\lambda_{ij} = \lambda$ for all $i, j$
  - i.e., all input-output pairs have the same traffic rate
- Condition for admissible traffic: $\lambda < 1/N$
- Example: Bernoulli traffic
  - $\lambda = \rho/N$
  - Arrivals at input $i$ are Bernoulli($\rho$) and i.i.d.

Spring – Packet Switch Architectures24 Algorithms that give 100% throughput for uniform traffic
- Nearly all algorithms in the literature can give 100% throughput when traffic is uniform
- For example:
  - Uniform cyclic.
  - Random permutation.
  - Wait-until-full [simulations].
  - Maximum size matching (MSM) [simulations].
  - Maximal size matching (e.g. WFA, PIM, iSLIP) [simulations].

Spring – Packet Switch Architectures25 Uniform Cyclic Scheduling
[Figure: cyclic schedule among ports A, B, C, D]
- Each $(i,j)$ pair is served every $N$ time slots: Geom/D/1
- Stable for $\rho < 1$
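
A minimal sketch of a uniform cyclic schedule, assuming the common convention that input i is connected to output (i + slot) mod N; the slide does not spell out the offset rule, so that convention and the names below are illustrative only. The loop confirms that over N consecutive slots every (input, output) pair is served exactly once.

```python
def cyclic_schedule(n_ports: int, slot: int) -> list:
    """Return perm where perm[i] is the output connected to input i in this slot."""
    return [(i + slot) % n_ports for i in range(n_ports)]


N = 4
served = {}  # (input, output) -> number of times served in one period of N slots
for slot in range(N):
    for i, j in enumerate(cyclic_schedule(N, slot)):
        served[(i, j)] = served.get((i, j), 0) + 1

# Every one of the N*N pairs appears, each exactly once per period.
print(len(served), set(served.values()))
```

Because each VOQ gets one deterministic service opportunity every N slots, it behaves like a queue with Bernoulli arrivals and deterministic service, which is the Geom/D/1 model named on the slide.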

Spring – Packet Switch Architectures26 Wait-until-full
- We don’t have to do much at all to achieve 100% throughput when arrivals are Bernoulli IID uniform.
- For example, simulation suggests that the following algorithm leads to 100% throughput.
- Wait-until-full:
  - If any VOQ is empty, do nothing (i.e. serve no queues).
  - If no VOQ is empty, pick a random permutation.
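
The per-slot decision of wait-until-full can be sketched as follows. The function name, the use of `None` to mean "serve nothing", and the list-of-lists queue representation are assumptions made for this illustration.

```python
import random


def wait_until_full_step(queues, rng):
    """One time slot of wait-until-full.

    queues[i][j] is the occupancy of VOQ (i, j). Returns None (idle slot) if
    any VOQ is empty; otherwise a uniformly random permutation, where
    perm[i] is the output served by input i.
    """
    if any(q == 0 for row in queues for q in row):
        return None
    perm = list(range(len(queues)))
    rng.shuffle(perm)
    return perm


rng = random.Random(0)
print(wait_until_full_step([[1, 0], [2, 3]], rng))  # one VOQ is empty, so the slot is idle
```

When the function does return a permutation, every VOQ is non-empty, so every connection in the chosen schedule carries a packet.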

Spring – Packet Switch Architectures27 Maximum Size Matching (MSM)
- Intuition: maximize instantaneous throughput
- Simulations suggest 100% throughput for uniform traffic.
[Figure: request graph (an edge wherever $Q_{ij}(n) > 0$), a bipartite match, and a maximum size match]

Spring – Packet Switch Architectures28 Some simple algorithms that achieve 100% throughput
[Figure; algorithms shown: Wait until full, Maximal Matching Algorithm (iSLIP), MSM, Uniform Cyclic]
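
The slide does not say how the maximum size match is computed. As one possible illustration, the sketch below uses the classic augmenting-path method for bipartite matching (often attributed to Kuhn); that choice, and all names in the code, are assumptions rather than the lecture's method.

```python
def maximum_size_matching(queues):
    """Return match where match[i] is the output matched to input i, or None.

    The request graph has an edge (i, j) whenever queues[i][j] > 0. The
    number of matched pairs is maximized with augmenting paths.
    """
    n = len(queues)
    owner = [None] * n  # owner[j] = input currently matched to output j

    def try_augment(i, visited):
        for j in range(n):
            if queues[i][j] > 0 and j not in visited:
                visited.add(j)
                # Take output j if it is free, or if its owner can move elsewhere.
                if owner[j] is None or try_augment(owner[j], visited):
                    owner[j] = i
                    return True
        return False

    for i in range(n):
        try_augment(i, set())

    match = [None] * n
    for j, i in enumerate(owner):
        if i is not None:
            match[i] = j
    return match


# Input 0 can use either output, input 1 only output 0. Matching input 0 to
# output 0 first would strand input 1; the maximum size match serves both.
print(maximum_size_matching([[1, 1], [1, 0]]))
```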

Spring – Packet Switch Architectures29 Uniform Random Scheduling
- At each time-slot, pick a schedule uniformly at random among:
  - The N cyclic permutations
  - Or the N! permutations
- Then $P(S_{ij} = 1) = 1/N$
  - Q: why?
[Figure: schedules among ports A, B, C, D]

Spring – Packet Switch Architectures30 Uniform Random Scheduling
- We get a Geom/Geom/1 system: [formula not preserved in the transcript]
  - We studied the birth-death chain
  - We get: [formula not preserved in the transcript]
  - Stable when $\rho < 1$
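
One way to answer the "why?" is a counting argument: exactly one of the N cyclic permutations sends input i to output j, and exactly (N-1)! of the N! permutations do. The sketch below checks this by exhaustive enumeration for N = 4; the helper name `connection_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def connection_probability(schedules, i, j):
    """Exact P(S_ij = 1) when a schedule is drawn uniformly from `schedules`.

    Each schedule is a sequence s where s[i] is the output connected to input i.
    """
    hits = sum(1 for s in schedules if s[i] == j)
    return Fraction(hits, len(schedules))


N = 4
cyclic = [tuple((i + k) % N for i in range(N)) for k in range(N)]
all_perms = list(permutations(range(N)))

# Both families give probability 1/N for every (i, j) pair.
print(connection_probability(cyclic, 0, 2), connection_probability(all_perms, 0, 2))
```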

Spring – Packet Switch Architectures32 Non-Uniform Traffic
- Assume the traffic matrix is: [matrix not preserved in the transcript]
- $\Lambda$ is admissible
- … and non-uniform

Spring – Packet Switch Architectures33 Uniform Schedule?
- What if we use a uniform schedule?
  - Each VOQ is serviced at rate $\mu = 1/N = 1/4$
  - But arrivals to VOQ(1,2) have rate $\lambda_{12} = 0.57$
  - Birth-death chain with birth rate > death rate → switch unstable!
- Need to adapt the schedule to the traffic matrix
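
A small simulation makes the instability concrete. It is a sketch under stated assumptions: a single VOQ with Bernoulli arrivals at rate 0.57, served once every N = 4 slots as under a uniform cyclic schedule. The function name, the seed, and the slot count are arbitrary choices made here.

```python
import random


def simulate_voq(arrival_rate, n_ports, n_slots, seed=0):
    """Simulate one VOQ and return its backlog after n_slots.

    Arrivals are Bernoulli(arrival_rate); the queue gets one service
    opportunity every n_ports slots, i.e. a service rate of 1/n_ports.
    """
    rng = random.Random(seed)
    backlog = 0
    for slot in range(n_slots):
        if rng.random() < arrival_rate:
            backlog += 1
        if slot % n_ports == 0 and backlog > 0:
            backlog -= 1
    return backlog


# Arrival rate 0.57 exceeds the service rate 0.25, so the backlog keeps
# growing, on average by about 0.57 - 0.25 = 0.32 packets per slot.
print(simulate_voq(0.57, 4, 20000))
# A rate below 1/4 stays small by comparison.
print(simulate_voq(0.10, 4, 20000))
```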

Spring – Packet Switch Architectures34 Example 1: (Trivial) scheduling to achieve 100% throughput
- Assume we know the traffic matrix, it is admissible, and it follows a permutation: [matrix not preserved in the transcript]
- Then we can simply choose: [schedule not preserved in the transcript]

Spring – Packet Switch Architectures35 Example 2
- Assume we know the traffic matrix, and it doesn’t follow a permutation. For example: [matrix not preserved in the transcript]
- Then we can choose the sequence of service permutations: [sequence not preserved in the transcript]
- And either cycle through it or pick randomly
- In general, if we know an admissible $\Lambda$, can we pick a sequence $S(n)$ so that [condition not preserved in the transcript]?

Spring – Packet Switch Architectures36 Doubly Stochastic Matrices
- $\Lambda$ is admissible, or “doubly (strictly) sub-stochastic”
- Theorem 1 (von Neumann): There exists $\Lambda' = \{\lambda'_{ij}\}$ such that $\Lambda < \Lambda'$ and $\Lambda'$ is doubly stochastic: $\sum_i \lambda'_{ij} = \sum_j \lambda'_{ij} = 1$
- Example: [matrix not preserved in the transcript]
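
Theorem 1 is an existence statement; the slide gives no construction. One simple constructive sketch, assumed here for illustration, walks the rows and columns and pours each row's remaining slack into a column that still has slack. It produces an entrywise-larger-or-equal doubly stochastic matrix (so it shows "≤" per entry rather than the strict "<" on the slide). The function name and the example matrix are hypothetical.

```python
def complete_to_doubly_stochastic(rates, tol=1e-12):
    """Raise entries of a doubly sub-stochastic matrix until all sums equal 1.

    Returns a new matrix; the input is not modified. Works because the total
    row slack always equals the total column slack.
    """
    n = len(rates)
    result = [list(row) for row in rates]
    row_slack = [1.0 - sum(row) for row in result]
    col_slack = [1.0 - sum(result[i][j] for i in range(n)) for j in range(n)]
    i = j = 0
    while i < n and j < n:
        if row_slack[i] <= tol:
            i += 1          # row i is full, move to the next row
        elif col_slack[j] <= tol:
            j += 1          # column j is full, move to the next column
        else:
            add = min(row_slack[i], col_slack[j])
            result[i][j] += add
            row_slack[i] -= add
            col_slack[j] -= add
    return result


lam = [[0.5, 0.2], [0.1, 0.6]]   # hypothetical admissible matrix
print(complete_to_doubly_stochastic(lam))
```

Each step fills at least one row or one column, so the loop performs at most 2N - 1 additions.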

Spring – Packet Switch Architectures37 Doubly Stochastic Matrices
- Fact 1: the set of doubly stochastic matrices is convex and compact in $\mathbb{R}^n$
- Fact 2: any convex, compact set in $\mathbb{R}^n$ has extreme points, and is equal to the convex hull of its extreme points (Krein-Milman Theorem)

Spring – Packet Switch Architectures38 Doubly Stochastic Matrices
- Theorem 2 (Birkhoff): Permutation matrices are the extreme points of the set of doubly stochastic matrices
- In other words: given $\Lambda'$, there exist $K$ numbers $\alpha_k > 0$ and $K$ permutation matrices $P_k$ such that $\Lambda' = \sum_{k=1}^{K} \alpha_k P_k$, with $\sum_{k=1}^{K} \alpha_k = 1$
- Further, $K \le N^2 - 2N + 2$.

Spring – Packet Switch Architectures39 Birkhoff-von Neumann (BvN) Scheduling
- BvN decomposition: $\Lambda \rightarrow \Lambda' \rightarrow \{\alpha_k, P_k\}$
- BvN weighted random scheduling: pick $P_k$ with probability $\alpha_k$
- Theorem: BvN scheduling achieves 100% throughput
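
The decomposition and the weighted random pick can be sketched as below. This is one standard greedy construction, assumed here for illustration: repeatedly find a permutation supported on the positive entries of the residual matrix, give it the smallest entry it covers as its weight, and subtract. All function names, the tolerance, and the example matrix are hypothetical, and floating-point rounding means the weights sum to 1 only approximately.

```python
import random


def _perfect_matching(matrix, tol):
    """Find a permutation using only entries > tol, or None if there is none."""
    n = len(matrix)
    owner = [None] * n  # owner[j] = row currently assigned to column j

    def augment(i, visited):
        for j in range(n):
            if matrix[i][j] > tol and j not in visited:
                visited.add(j)
                if owner[j] is None or augment(owner[j], visited):
                    owner[j] = i
                    return True
        return False

    for i in range(n):
        if not augment(i, set()):
            return None
    perm = [0] * n
    for j, i in enumerate(owner):
        perm[i] = j
    return perm


def bvn_decompose(doubly_stochastic, tol=1e-9):
    """Return [(weight, perm), ...] whose weighted sum approximates the input."""
    residual = [list(row) for row in doubly_stochastic]
    terms = []
    while True:
        perm = _perfect_matching(residual, tol)
        if perm is None:
            break
        weight = min(residual[i][perm[i]] for i in range(len(perm)))
        for i, j in enumerate(perm):
            residual[i][j] -= weight   # zeroes at least one entry per round
        terms.append((weight, perm))
    return terms


def pick_schedule(terms, rng):
    """BvN weighted random scheduling: draw a permutation with probability equal to its weight."""
    weights = [w for w, _ in terms]
    perms = [p for _, p in terms]
    return rng.choices(perms, weights=weights, k=1)[0]


example = [[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6]]  # hypothetical
terms = bvn_decompose(example)
print(terms, pick_schedule(terms, random.Random(1)))
```

With this schedule, input i is connected to output j with probability equal to the (i, j) entry of the decomposed matrix, which is the property the throughput theorem relies on.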

Spring – Packet Switch Architectures40 BvN and 100% Throughput
- Proof:
  - Lindley’s equation: [equation not preserved in the transcript]
  - Birth-death chain
  - Birth rate: $P(A_{ij}(n)=1) = E[A_{ij}(n)] = \lambda_{ij}$
  - Death rate: [expression not preserved in the transcript]
  - Birth rate < death rate → 100% throughput (“ergodic”)
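
The steps of this proof can be written out as follows. This is a hedged reconstruction, not the slide's own formulas: it assumes that service precedes arrivals within a slot (one common form of Lindley's recursion), writes $\alpha_k$ for the BvN weights, $P_k(i,j)$ for the $(i,j)$ entry of $P_k$, and $\lambda'_{ij}$ for the entries of the doubly stochastic matrix $\Lambda'$.

```latex
\begin{align*}
Q_{ij}(n+1) &= \bigl[\,Q_{ij}(n) - S_{ij}(n)\,\bigr]^{+} + A_{ij}(n+1)
  && \text{(Lindley recursion, assumed ordering)}\\
\text{birth rate} &= P\bigl(A_{ij}(n)=1\bigr) = \lambda_{ij}\\
\text{death rate} &= P\bigl(S_{ij}(n)=1\bigr)
  = \sum_{k=1}^{K} \alpha_k\, P_k(i,j) = \lambda'_{ij}\\
\lambda_{ij} &< \lambda'_{ij}
  && \text{(from } \Lambda < \Lambda' \text{)}
\end{align*}
```

Since the schedule is drawn independently in each slot, each VOQ sees service opportunities with probability $\lambda'_{ij}$ per slot, strictly above its arrival probability, which is the "birth rate < death rate" condition used in the last step.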