Localized Asynchronous Packet Scheduling for Buffered Crossbar Switches Deng Pan and Yuanyuan Yang State University of New York Stony Brook.

Presentation transcript:

Localized Asynchronous Packet Scheduling for Buffered Crossbar Switches Deng Pan and Yuanyuan Yang State University of New York Stony Brook

Outline Introduction Related work Localized asynchronous packet scheduling Simulation results Conclusions

Introduction Crossbar switches have long been the preferred structures for high speed switches and routers: Provide non-blocking capability. Overcome the bandwidth limitation of bus-based switches. Packet forwarding is simple.

Introduction For a crossbar switch, packets may be buffered at the output ports, the input ports, or the crosspoints.

Introduction Output queued (OQ) switches only have buffer space at the output side. Achieve 100% throughput. Require a speedup of N for an N×N switch. Input queued (IQ) switches only have buffer space at the input side. Require no speedup. Have to work with high time complexity algorithms in order to achieve 100% throughput.

Introduction Combined input-output queued (CIOQ) switches make a trade-off between the crossbar speedup and the complexity of the scheduling algorithms. Have a small fixed speedup of two. Achieve 100% throughput with any iterative maximal matching algorithm. Emulate OQ switches.

Introduction Buffered crossbar switches are a special type of CIOQ switches. Each crosspoint of the crossbar has a small buffer. Crosspoint buffers eliminate the input and output contention. Buffered crossbar switches can directly schedule and switch variable length packets.

Introduction Previous scheduling algorithms for crossbar switches mainly focused on fixed length packet scheduling or cell scheduling. At input ports, new packets are segmented into fixed length cells. The cells are used as the scheduling units and transmitted across the switching fabric. At output ports, the cells are reassembled into original packets.
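To make the cost of cell scheduling concrete, the following minimal sketch segments a variable length packet into fixed length cells and reports the padding carried by the last cell. The 64-byte cell size and the function name `segment` are assumptions chosen for illustration; the slides do not specify a cell size.

```python
import math

CELL_SIZE = 64  # bytes; hypothetical cell size, not given in the slides


def segment(packet_len: int, cell_size: int = CELL_SIZE) -> tuple[int, int]:
    """Return (number of cells, padding bytes) needed to carry one packet."""
    if packet_len <= 0:
        raise ValueError("packet length must be positive")
    cells = math.ceil(packet_len / cell_size)
    padding = cells * cell_size - packet_len  # unused bytes in the last cell
    return cells, padding


for length in (50, 64, 65, 1500):
    cells, padding = segment(length)
    print(f"{length:>5} B -> {cells:>2} cells, {padding:>2} B padding")
```

A packet that is one byte longer than a cell needs a second, almost empty cell, which is the kind of overhead that scheduling whole packets avoids.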

Introduction Variable length packet scheduling, or packet scheduling, improves the switch efficiency by avoiding the segmentation-and-reassembly (SAR) process. Higher throughput. Shorter packet latency. Lower hardware cost.

Introduction [Turner Infocom’06] proposed two packet scheduling algorithms for buffered crossbar switches. They can provide work-conserving guarantees, or emulate scheduling algorithms for OQ switches. They schedule packets by imposing an order on buffered packets. Each crosspoint needs 2L or more buffer space, where L is the maximum packet length.

Introduction We consider the other side of the problem: packet scheduling algorithms that have low time complexity and are easy to implement. We present the Localized Asynchronous Packet Scheduling (LAPS) algorithm and analyze its performance. Local info based No comparison Crosspoint buffer size of L

Outline Introduction Related work Localized asynchronous packet scheduling Simulation results Conclusions

Related work Scheduling algorithms in the literature for buffered crossbar switches are generally designed with two possible objectives: To achieve high throughput. To emulate scheduling algorithms for OQ switches. The latter is a stronger requirement, but the implementation of the former can be simpler.

Related work Cell scheduling algorithms for high throughput CIXB-1, CIXOB-k, MCBF, SCBF… Cell scheduling algorithms to emulate scheduling algorithms for OQ switches GBVOQ_OCF, GBFG_SP, MCAF-LTF… Packet scheduling schemes Packet VOQ, Packet LOOFA, DPFQ…

Outline Introduction Related work Localized asynchronous packet scheduling Simulation results Conclusions

Localized asynchronous packet scheduling Structure of a buffered crossbar switch. $In_i$: input port. $Out_j$: output port. $B_{ij}$: crosspoint buffer. $Q_{ij}$: virtual queue. The crossbar has a speedup of two.

Localized asynchronous packet scheduling Based on the locations of the packets to be scheduled, there are three types of scheduling involved in a buffered crossbar switch. Input scheduling Crossbar scheduling Output scheduling

Localized asynchronous packet scheduling Output scheduling has been well studied, and various scheduling algorithms have been proposed. Output scheduling usually does not affect the throughput performance as long as the algorithm is work-conserving. We use a simple FIFO algorithm for output scheduling, which is work-conserving.

Localized asynchronous packet scheduling For input scheduling, Select a backlogged virtual queue whose crosspoint buffer is empty, and send its head packet to the crosspoint buffer. When there are multiple eligible virtual queues, different arbitration rules can be used. Since the crossbar has speedup of two, the packet is sent to the crosspoint buffer with bandwidth 2R. Crossbar scheduling is similar.
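The input scheduling rule can be sketched as a single decision function for one input port. This is a minimal illustration, not the authors' implementation: the names `input_schedule`, `virtual_queues`, `crosspoint_busy` and `arbiter` are hypothetical, and the default arbiter (`min`, i.e. lowest port index first) is just one possible arbitration rule.

```python
from collections import deque
from typing import Callable, Optional


def input_schedule(
    virtual_queues: list[deque],
    crosspoint_busy: list[bool],
    arbiter: Callable[[list[int]], int] = min,
) -> Optional[tuple[int, object]]:
    """Make one scheduling decision for a single input port.

    virtual_queues[j] holds the packets waiting for output j, and
    crosspoint_busy[j] is True while crosspoint buffer B_ij holds a packet.
    Returns (output index, packet), or None if no virtual queue is eligible.
    """
    # Eligible = backlogged virtual queue whose crosspoint buffer is empty.
    eligible = [
        j for j, queue in enumerate(virtual_queues)
        if queue and not crosspoint_busy[j]
    ]
    if not eligible:
        return None
    j = arbiter(eligible)                 # default `min` acts as fixed priority
    packet = virtual_queues[j].popleft()  # head packet moves to the crosspoint
    crosspoint_busy[j] = True             # occupied until the crossbar drains it
    return j, packet


queues = [deque(), deque([100]), deque([200])]
busy = [False, True, False]
print(input_schedule(queues, busy))
```

Crossbar scheduling for an output port would follow the same pattern, choosing among non-empty crosspoint buffers in that output's column.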

Localized asynchronous packet scheduling In order to reduce the packet latency, cut-through switching can be used on the crossbar. Similarly, cut-through switching can be used at output ports.

Localized asynchronous packet scheduling

In input scheduling, the scheduling candidates of an input port are only the virtual queues whose crosspoint buffers are empty. This restriction simplifies the implementation by enabling one bit to represent the status of each crosspoint buffer.

Localized asynchronous packet scheduling With speedup of two, LAPS achieves 100% throughput for any admissible traffic. Define $Z_{ij}(t) = Q_{ij}(t) + B_{ij}(t)$. If $B_{ij}$ is not empty at time $t$, $\sum_k Z_{kj}(t)$ has a negative derivative. If $Q_{ij}$ is not empty at time $t$, $\sum_k Q_{ik}(t) + \sum_k Z_{kj}(t)$ has a negative or zero derivative.

Localized asynchronous packet scheduling Assume that the traffic arrives according to a Poisson process and the packet length follows an exponential distribution with mean M. $In_i$ can be approximately modeled as an M/M/1 system, and accordingly
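As a reminder of what an M/M/1 approximation provides, the sketch below evaluates the textbook M/M/1 steady-state results. It is not necessarily the expression used on the slide. The function name `mm1_metrics`, the numeric values, and in particular the mapping of the service rate to 2R/M (packets of mean length M served at the crossbar bandwidth 2R) are assumptions made for illustration.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict[str, float]:
    """Textbook M/M/1 steady-state results (rates in packets per second)."""
    if not 0 <= arrival_rate < service_rate:
        raise ValueError("stability requires 0 <= arrival_rate < service_rate")
    rho = arrival_rate / service_rate
    return {
        "utilization": rho,
        "mean_in_system": rho / (1 - rho),                       # L = rho / (1 - rho)
        "mean_sojourn_time": 1 / (service_rate - arrival_rate),  # W = 1 / (mu - lambda)
    }


# Hypothetical numbers, chosen only to exercise the formulas.
R = 1e9            # port bandwidth in bit/s
M = 8000           # assumed mean packet length in bits
load = 0.8         # assumed offered load on the input port
lam = load * R / M  # packet arrival rate
mu = 2 * R / M      # assumed service rate: packets drained at bandwidth 2R

print(mm1_metrics(lam, mu))
```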

Localized asynchronous packet scheduling Hardware implementation Only local info is necessary, so the algorithm is suitable for distributed implementation and is highly scalable. Since no comparison is necessary, the arbiters can be implemented by priority encoders, which can make fast decisions in hardware. Since each crosspoint buffer needs only L buffer space, it minimizes the cost for the switch.
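The one-bit status per crosspoint and the priority encoder can be mimicked in software with bit masks. This is only a software stand-in under assumed names (`eligible_mask`, `priority_encode`); bit j of each mask is assumed to describe virtual queue j and its crosspoint buffer.

```python
def eligible_mask(backlogged: int, crosspoint_full: int, n_ports: int) -> int:
    """Bit j is set when virtual queue j is backlogged and crosspoint j is empty."""
    all_ports = (1 << n_ports) - 1
    return backlogged & ~crosspoint_full & all_ports


def priority_encode(request_mask: int) -> int:
    """Index of the lowest set bit, or -1 when no request is pending."""
    if request_mask == 0:
        return -1
    lowest = request_mask & -request_mask  # isolates the lowest set bit
    return lowest.bit_length() - 1


requests = eligible_mask(backlogged=0b1110, crosspoint_full=0b0010, n_ports=4)
print(bin(requests), priority_encode(requests))
```

Both steps use only AND, NOT and a lowest-set-bit search; no queue lengths or timestamps are compared.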

Outline Introduction Related work Localized asynchronous packet scheduling Simulation results Conclusions

Simulation results We have conducted simulations to verify the 100% throughput of LAPS and to measure its delay and buffer requirement. We consider five different LAPS implementations: Fixed priority (FP) Random (RD) Round-robin (RR) Oldest packet first (OPF) Longest queue first (LQF)
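The five arbitration rules can be sketched as interchangeable selection functions over the eligible candidates. The `Candidate` record and all function signatures below are hypothetical; the slides name the rules but do not describe how they were implemented in the simulator.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    port: int            # index of the eligible virtual queue or crosspoint
    head_arrival: float  # arrival time of the packet at the head of the queue
    queue_bytes: int     # current backlog of the queue


def fixed_priority(cands):
    return min(cands, key=lambda c: c.port)


def random_choice(cands, rng=random):
    return rng.choice(cands)


def round_robin(cands, last_served, n_ports):
    # Pick the first candidate found when scanning cyclically after last_served.
    return min(cands, key=lambda c: (c.port - last_served - 1) % n_ports)


def oldest_packet_first(cands):
    return min(cands, key=lambda c: c.head_arrival)


def longest_queue_first(cands):
    return max(cands, key=lambda c: c.queue_bytes)


cands = [Candidate(0, 5.0, 100), Candidate(2, 1.0, 900), Candidate(3, 3.0, 400)]
print(fixed_priority(cands).port, round_robin(cands, last_served=2, n_ports=4).port)
```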

Simulation results In order to reflect the bursty nature of real network traffic, we emulate the incoming traffic by a Markov modulated Poisson process.
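A Markov modulated Poisson process can be sketched as a Poisson source whose rate is switched by a continuous-time Markov chain. The two-state version below is a minimal illustration; the number of states, the rates and the function name `mmpp_arrivals` are assumptions, since the slides do not give the parameters used in the simulations.

```python
import random


def mmpp_arrivals(rates, switch_rates, horizon, seed=0):
    """Arrival times of a two-state MMPP on [0, horizon).

    rates[s]        : Poisson arrival rate while the chain is in state s
    switch_rates[s] : rate of leaving state s (exponential sojourn time, > 0)
    """
    rng = random.Random(seed)
    t, state, arrivals = 0.0, 0, []
    while t < horizon:
        # The chain stays in the current state for an exponential time.
        stay_until = min(t + rng.expovariate(switch_rates[state]), horizon)
        if rates[state] > 0:
            # Poisson arrivals at the current state's rate until the state changes.
            nxt = t + rng.expovariate(rates[state])
            while nxt < stay_until:
                arrivals.append(nxt)
                nxt += rng.expovariate(rates[state])
        t = stay_until
        state = 1 - state  # alternate between the two states
    return arrivals


# Hypothetical parameters: a high-rate "burst" state and a low-rate "idle" state.
times = mmpp_arrivals(rates=(10.0, 0.5), switch_rates=(1.0, 1.0), horizon=50.0)
print(len(times), "arrivals")
```

Because exponential inter-arrival times are memoryless, restarting the arrival clock at each state change does not alter the process.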

Simulation results We considered both uniform traffic and non-uniform traffic. The packet length in the simulation is uniformly distributed over [50, 1500] bytes. We consider a 16×16 switch, and each input port or output port has a bandwidth of 1 Gbps.
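These settings translate directly into packet transmission times. The sketch below uses the stated values (1 Gbps ports, packet lengths in [50, 1500] bytes, speedup of two); integer packet lengths and the helper names are assumptions.

```python
import random

LINK_RATE_BPS = 1e9          # port bandwidth R, from the slides
SPEEDUP = 2                  # crossbar speedup, from the slides
MIN_LEN, MAX_LEN = 50, 1500  # packet length range in bytes, from the slides


def sample_packet_length(rng: random.Random) -> int:
    """Draw a packet length in bytes (integer lengths are an assumption)."""
    return rng.randint(MIN_LEN, MAX_LEN)  # both endpoints inclusive


def transmission_time(length_bytes: int, rate_bps: float) -> float:
    """Seconds needed to send a packet of the given length at the given rate."""
    return length_bytes * 8 / rate_bps


rng = random.Random(0)
length = sample_packet_length(rng)
print(length, "bytes:",
      transmission_time(length, LINK_RATE_BPS), "s on the port,",
      transmission_time(length, SPEEDUP * LINK_RATE_BPS), "s across the crossbar")
```

A maximum-length 1500-byte packet occupies a port for 12 microseconds and the crossbar for 6 microseconds; the mean packet length under this distribution is 775 bytes.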

Simulation results Throughput

Simulation results Average delay

Simulation results Maximum queue length

Outline Introduction Related work Localized asynchronous packet scheduling Simulation results Conclusions

Conclusions Due to the introduction of crosspoint buffers, buffered crossbar switches can directly schedule and transmit variable length packets. Packet scheduling algorithms avoid SAR and are more efficient than cell scheduling algorithms. Higher throughput Shorter latency Lower hardware cost

Conclusions We presented the Localized Asynchronous Packet Scheduling (LAPS) scheme. Local info based No comparison Crosspoint buffer of size L We theoretically proved that LAPS achieves 100% throughput with speedup of two, and conducted simulations to verify the results.

Thank you! Questions?