ENTS689L: Packet Processing and Switching. Buffer-less Switch Fabric Architectures. Vahid Tabatabaee, Fall 2006.

Presentation transcript:

1. ENTS689L: Packet Processing and Switching. Buffer-less Switch Fabric Architectures. Vahid Tabatabaee, Fall 2006

2. References
- Light Reading Report on Switch Fabrics (available online).
- Panos C. Lekkas, Network Processors: Architectures, Protocols, and Platforms, McGraw-Hill.
- I. Elhanany, D. Chiou, V. Tabatabaee, R. Noro, A. Poursepanj, "The Network Processing Forum Switch Fabric Benchmark Specifications: An Overview", IEEE Network Magazine, March/April 2005.

3. Buffer-less Switching Element
- There is no major buffering in the switching element.
- The only buffering is for cell alignment.
- After alignment, incoming cells are switched simultaneously to the output ports.
- The performance of the switch depends heavily on the scheduling algorithm.

4. Switching Element Architecture

5. Data flow in the switching element
- Cells are sent continuously from the line card to the switch card and from the switch card to the line card.
- Transmitted cells may not carry valid data.
- The switch scheduler decides the connections between input and output ports and then sends the corresponding command to each line interface chip.
- The line interface chip then sends the switch one cell destined to the assigned output port.
- The switching element needs some information about the backlogged cells at the input ports.
- The line card interface needs to know its designated output port for the next time slot.
- Both pieces of information travel in the cell header: backlog information from the line interface to the switch, and the output-port assignment from the switch to the line interface.

6. Why do we need cell alignment?
- Consider a simple 2x2 switch.
- Red cells are destined to output 1 and blue cells to output 2.
- We need cell alignment if the line cards are not all at the same distance from the switch cards.

7. Why do we need cell alignment?
- If the cells are not aligned, cells may be switched to the wrong destination, or cells going to the same destination may contend with each other.

8. Why do we need cell alignment?
- We can buffer the cells in either the switch chip or the line card to artificially equalize the distances.
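The equalization idea can be sketched in a few lines: every link is padded with enough buffering delay to match the most distant line card. The latency values (in clock ticks) and the helper name `alignment_delays` are hypothetical, chosen only for illustration.

```python
def alignment_delays(link_latency):
    """Return the extra buffering delay each line card needs so that
    cells from all cards reach the switching element in the same tick.

    link_latency: dict mapping line-card id -> link latency in clock
    ticks (assumed units, not from the slides).
    """
    slowest = max(link_latency.values())
    # Pad every link up to the latency of the most distant card.
    return {card: slowest - latency for card, latency in link_latency.items()}


latencies = {"card0": 3, "card1": 7, "card2": 5}  # assumed example values
delays = alignment_delays(latencies)
arrival = {card: latencies[card] + delays[card] for card in latencies}

print(delays)   # {'card0': 4, 'card1': 0, 'card2': 2}
print(arrival)  # every card now arrives at tick 7
```

The padding can sit at either end of the link; the sketch only computes how much is needed.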

9. Switch Throughput
- Throughput is the maximum normalized traffic rate between the line card and the switch card.
- Throughput cannot be larger than one.
- Throughput is usually shown with a plot of average delay versus normalized rate.
- In theory the curve looks like a hockey stick: delay stays low and then rises sharply as the rate approaches the throughput limit.
- In practice, because buffering is limited, the delay curve saturates.

10. What causes throughput limitation?
- If there is no contention among the input and output ports, throughput can reach 100%.
- Because of contention, some ports can remain idle even though they have cells to send or receive.
- The scheduling algorithm decides the input-output connections and resolves contention.
- Therefore the scheduling algorithm determines the throughput of the switch.

11. Scheduling Problem
- The scheduling algorithm specifies the input-output connections.
- We can model a switch as a bipartite graph.
- There are two sets of nodes, corresponding to the input ports and the output ports.
- There is a link between an input node and an output node if a cell is buffered for that connection.
- The scheduling algorithm finds a matching in this bipartite graph.
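A minimal sketch of this model, assuming the backlog is kept as a matrix `voq[i][j]` holding the number of cells buffered at input i for output j (the matrix and both helper names are illustrative, not from the slides):

```python
def request_graph(voq):
    """Edges (i, j) for which input i holds at least one cell for output j."""
    return {(i, j)
            for i, row in enumerate(voq)
            for j, backlog in enumerate(row)
            if backlog > 0}


def is_matching(edges, graph):
    """True if every edge exists in the graph and no port is used twice."""
    inputs = [i for i, _ in edges]
    outputs = [j for _, j in edges]
    return (set(edges) <= graph
            and len(set(inputs)) == len(inputs)
            and len(set(outputs)) == len(outputs))


# Assumed 3x3 backlog matrix: rows are inputs, columns are outputs.
voq = [[2, 0, 1],
       [0, 0, 3],
       [1, 4, 0]]
graph = request_graph(voq)

print(sorted(graph))                                  # [(0, 0), (0, 2), (1, 2), (2, 0), (2, 1)]
print(is_matching([(0, 0), (1, 2), (2, 1)], graph))   # True
print(is_matching([(0, 2), (1, 2)], graph))           # False: output 2 used twice
```

A valid schedule for one time slot is any edge set for which `is_matching` returns True; the scheduling algorithms differ in which matching they pick.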

12. 100% Throughput Scheduling
- Is it possible to achieve 100% throughput with crossbar-based schedulers?
- We can achieve 100% throughput with maximum weighted matching (MWM).
- Each link has a weight equal to the number of backlogged cells for that connection.
- We find the matching with the maximum total weight.
- This is guaranteed to achieve 100% throughput.
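To make the definition concrete, here is a brute-force sketch that tries every input-to-output permutation and keeps the heaviest one. It is exhaustive (factorial time) and only usable for very small N; it is not the polynomial-time algorithm a real scheduler would use. The backlog matrix `voq` and the function name are assumptions for illustration.

```python
from itertools import permutations


def max_weight_matching(voq):
    """Exhaustively find the matching with maximum total backlog.

    voq[i][j] is the assumed number of cells queued at input i for output j.
    Returns (matching, total_weight); pairs with an empty queue are dropped.
    """
    n = len(voq)
    best_perm, best_weight = None, -1
    for perm in permutations(range(n)):
        weight = sum(voq[i][perm[i]] for i in range(n))
        if weight > best_weight:
            best_perm, best_weight = perm, weight
    matching = [(i, j) for i, j in enumerate(best_perm) if voq[i][j] > 0]
    return matching, best_weight


voq = [[2, 0, 1],
       [0, 0, 3],
       [1, 4, 0]]
matching, weight = max_weight_matching(voq)
print(matching, weight)  # [(0, 0), (1, 2), (2, 1)] 9
```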

13. Alternative 100% Throughput Algorithms
Alternative algorithms that achieve 100% throughput:
- Maximum Weighted Matching (MWM): maximizes the total weight of links; O(N^3) complexity.
- Longest Port First (LPF): maximizes the total weight of nodes; O(N^3) complexity.
- Maximum Node Containing Matching (MNCM): includes all nodes whose weight is greater than (1 - 1/N) times the maximum node weight; O(N^2.5) complexity.

14. Practical Approaches
- These algorithms are not amenable to hardware implementation.
- Instead we use simpler algorithms that can be implemented in hardware.
- To compensate for their lower performance, we make the switch run faster than the line card (speedup).
- It has been proved that any maximal size matching with 2X speedup can achieve 100% throughput.
- A matching is maximal if no more links can be added to it.
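The definition of a maximal matching can be illustrated with a generic greedy pass over the request edges. This is a sketch of the concept only, not any particular hardware scheduler, and the function names are hypothetical. The example also shows that a maximal matching need not be the largest possible one.

```python
def greedy_maximal_matching(edges):
    """Scan the edges once, keeping each edge whose ports are both free."""
    used_in, used_out, matching = set(), set(), []
    for i, j in edges:
        if i not in used_in and j not in used_out:
            matching.append((i, j))
            used_in.add(i)
            used_out.add(j)
    return matching


def is_maximal(matching, edges):
    """True if no remaining edge could still be added to the matching."""
    used_in = {i for i, _ in matching}
    used_out = {j for _, j in matching}
    return all(i in used_in or j in used_out for i, j in edges)


edges = [(0, 0), (0, 1), (1, 0)]       # assumed request edges (input, output)
greedy = greedy_maximal_matching(edges)

print(greedy)                              # [(0, 0)]
print(is_maximal(greedy, edges))           # True, although it has only one edge
print(is_maximal([(0, 1), (1, 0)], edges)) # True, and this one has two edges
```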

15. iSLIP Scheduling Algorithm
- There is an arbiter associated with every input and output node.
- Every arbiter receives up to N active signals and selects one of them using a round-robin scheduler.
- Every output arbiter receives a request signal from each input that has a backlogged cell for that output.
- It grants the first request after the previously ACCEPTED grant.
- Each input arbiter accepts the first grant after its previously accepted grant.
- Every arbiter has a pointer that points to the previously accepted port.
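The request, grant, and accept steps above can be sketched as a single-iteration software model. The data layout (`requests[i][j]` is True when input i has a cell for output j), the function names, and the initial pointer values are assumptions for illustration; real arbiters are combinational hardware, not loops.

```python
def first_after(candidates, last, n):
    """Round-robin choice: first port in `candidates` strictly after `last`."""
    for step in range(1, n + 1):
        port = (last + step) % n
        if port in candidates:
            return port
    return None


def islip_iteration(requests, grant_ptr, accept_ptr):
    """One request/grant/accept round. Pointers hold the last accepted port
    and are updated in place, only for grants that were accepted."""
    n = len(requests)
    grants = {}  # input -> set of outputs that granted it
    for out in range(n):
        requesters = {i for i in range(n) if requests[i][out]}
        chosen = first_after(requesters, grant_ptr[out], n)
        if chosen is not None:
            grants.setdefault(chosen, set()).add(out)
    matching = []
    for inp, offered in sorted(grants.items()):
        out = first_after(offered, accept_ptr[inp], n)
        matching.append((inp, out))
        accept_ptr[inp] = out
        grant_ptr[out] = inp
    return matching


requests = [[True, True, False],    # input 0 wants outputs 0 and 1
            [True, False, False],   # input 1 wants output 0
            [False, True, True]]    # input 2 wants outputs 1 and 2
grant_ptr = [2, 2, 2]               # assumed start: search begins at port 0
accept_ptr = [2, 2, 2]

slot1 = islip_iteration(requests, grant_ptr, accept_ptr)
slot2 = islip_iteration(requests, grant_ptr, accept_ptr)
print(slot1)  # [(0, 0), (2, 2)]
print(slot2)  # [(0, 1), (1, 0), (2, 2)]
```

With the same requests in the second slot, the pointer at output 0 has moved past input 0, so input 1 is served and the matching grows from two pairs to three.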

16. Arbiter Connections
[Figure labels: Output Arbiters, Input Arbiters]

17. Inside an Arbiter

18. Multiple Iteration
- We can increase the matching size by running multiple iterations.
- The arbiter pointers are updated only in the first iteration.
- Grant and accept arbiters can each perform their function in one clock cycle.
- To run k iterations we need 2k clock cycles without pipelining.
- We can pipeline the work to reduce the time required.
[Figure labels: Grant1, Accept1, Grant2, Accept2, Grant3, Accept3]
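A software sketch of the multi-iteration behaviour described here, under the same kind of assumptions as any simple model: `requests[i][j]` is True when input i has a cell for output j, pointers hold the last accepted port, and the returned cycle count is just the unpipelined 2k figure from the slide. All names are illustrative.

```python
def first_after(candidates, last, n):
    """Round-robin choice: first port in `candidates` strictly after `last`."""
    for step in range(1, n + 1):
        port = (last + step) % n
        if port in candidates:
            return port
    return None


def islip(requests, grant_ptr, accept_ptr, iterations):
    """Run several grant/accept iterations over still-unmatched ports.
    Returns (sorted matching, unpipelined clock cycles = 2 * iterations)."""
    n = len(requests)
    matched_in, matched_out, matching = set(), set(), []
    for it in range(iterations):
        grants = {}
        for out in range(n):
            if out in matched_out:
                continue
            requesters = {i for i in range(n)
                          if requests[i][out] and i not in matched_in}
            chosen = first_after(requesters, grant_ptr[out], n)
            if chosen is not None:
                grants.setdefault(chosen, set()).add(out)
        for inp, offered in sorted(grants.items()):
            out = first_after(offered, accept_ptr[inp], n)
            matching.append((inp, out))
            matched_in.add(inp)
            matched_out.add(out)
            if it == 0:  # pointers move only for first-iteration matches
                accept_ptr[inp] = out
                grant_ptr[out] = inp
    return sorted(matching), 2 * iterations


requests = [[True, True, False],
            [True, False, False],
            [False, True, False]]
one_pass = islip(requests, [2, 2, 2], [2, 2, 2], iterations=1)
gp, ap = [2, 2, 2], [2, 2, 2]
two_pass = islip(requests, gp, ap, iterations=2)
print(one_pass)  # ([(0, 0)], 2)
print(two_pass)  # ([(0, 0), (2, 1)], 4)
print(gp, ap)    # [0, 2, 2] [0, 2, 2]
```

The second iteration adds the pair (2, 1), yet the pointers of output 1 and input 2 stay where they were because that match was not made in the first iteration.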

19. iSLIP Throughput and Arrival Process
- Good performance for uniform traffic.
- Degraded performance for non-uniform traffic.
- In general, the performance of a switch depends on the characteristics of the input data.
- Three characteristics are important in a switch:
  - Arrival pattern:
    - Uniform: usually modeled as Bernoulli i.i.d. arrivals. At each time slot there is a probability p of a new arrival.
    - Non-uniform: usually modeled with a two-state Markov chain.
      - In the ON state we keep generating packets.
      - In the OFF state no packet is generated.
  - Packet length: number of bytes in the generated packets.
  - Load distribution: destination of the packets generated at each input.
    - Uniform: packets are divided among destinations with equal probability.
    - Non-uniform: some destinations are more probable (hot spots).
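The two arrival models can be sketched as simple slot-by-slot generators. The transition probabilities of the ON/OFF chain, the choice to start in the OFF state, and the function names are assumptions made for illustration; the slides do not give parameter values.

```python
import random


def bernoulli_arrivals(p, slots, rng):
    """Bernoulli i.i.d. arrivals: one cell per slot with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(slots)]


def on_off_arrivals(p_on_to_off, p_off_to_on, slots, rng):
    """Two-state Markov chain: a cell is generated in every ON slot,
    none in OFF slots. The chain starts in OFF (an assumption)."""
    state_on = False
    arrivals = []
    for _ in range(slots):
        if state_on:
            if rng.random() < p_on_to_off:
                state_on = False
        elif rng.random() < p_off_to_on:
            state_on = True
        arrivals.append(1 if state_on else 0)
    return arrivals


rng = random.Random(1)  # fixed seed so runs are repeatable
uniform = bernoulli_arrivals(0.3, 20, rng)
bursty = on_off_arrivals(0.1, 0.1, 20, rng)
print(uniform)
print(bursty)
```

For the ON/OFF chain the long-run load is p_off_to_on / (p_on_to_off + p_off_to_on), so the two example calls offer different loads (0.3 and 0.5); what the bursty source adds is correlation, with arrivals grouped into runs.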

20. Typical uniform traffic throughput

21. Typical non-uniform traffic throughput curve

22. Benchmarking & Comparison of Switch Fabrics
- How should we compare switch fabrics?
- First, compare the general design parameters.
- Second, compare the performance of the fabrics.

23. Primary Design Parameters
1. Switching Capacity
2. Sample Availability
3. NPU/TM Interfaces
4. Integrated Traffic Management
5. Power (per 10 Gbit/s)
6. Price (per 10 Gbit/s)
7. Integrated Linecard SerDes
8. [...] Gbit/s Device Count
9. [...] Gbit/s (with 1:1 Redundancy) Device Count
10. [...] Gbit/s Device Count
11. [...] Gbit/s (with 1:1 Redundancy) Device Count
12. Switch Architecture
13. Guaranteed Latency
14. TDM Support
15. Sub-ports per 10-Gbit/s Line Interface
16. Traffic Flows per 10-Gbit/s Port
17. Frame Payload (Bytes)
18. Frame Distribution Across Fabric
19. Fabric Overspeed
20. Backplane Link Speed
21. Backplane Links per 10-Gbit/s Port
22. Redundancy Modes
23. Host Interface

24. Performance Benchmarking
- Traffic Modeling
- Performance Metrics
- Benchmark Suites

25. Traffic Modeling
- Destination distribution:
  - The Zipf law has been proposed to model non-uniform traffic distribution among destinations.
  - k = 0 corresponds to uniform traffic.
  - k = infinity corresponds to a single, completely preferred destination.
  - Typically k varies from 0 to 5.
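The slide's formula did not survive in the transcript. A common form of the Zipf law, assumed here, gives the destination with rank i a probability proportional to i^(-k), normalized over all N destinations. The function name and the rank-to-port mapping are illustrative.

```python
def zipf_destination_probs(n, k):
    """Probability of each of n destinations under an assumed Zipf law:
    P(rank i) = i**(-k) / sum over j of j**(-k), for ranks 1..n."""
    weights = [rank ** (-k) for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


uniform = zipf_destination_probs(4, 0)   # k = 0: every destination equal
skewed = zipf_destination_probs(4, 5)    # k = 5: rank 1 dominates

print(uniform)                           # [0.25, 0.25, 0.25, 0.25]
print([round(p, 4) for p in skewed])
```

This matches the two limiting cases listed on the slide: k = 0 yields equal probabilities, and as k grows nearly all traffic goes to the first-ranked destination.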

26. Traffic Modeling
- Packet arrival process:
  - Bernoulli i.i.d. arrivals
  - ON/OFF model
  - ON/OFF model with non-delimited burst streams
  - ON/OFF model with minimum burst size
- Multicast
  - Multiplicity factor: realistically should not exceed 10, with an average value of 2-4.
  - Distribution of the destinations
- QoS
  - Distribution of the traffic among a number of classes

27. Performance Metrics
- Fabric latency: latency between points 2 and 3.
- Total latency: latency between points 1 and 3.
- Accepted vs. offered bandwidth: the number of cells the fabric accepts at point 2 divided by the number of frames offered to it at point 1.
- Jitter: the difference between the time interval separating a pair of consecutive cells of the same flow at the ingress and the interval separating them at the egress.
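These metrics reduce to simple arithmetic on per-cell timestamps. The sketch below assumes one flow with an ingress and an egress timestamp per cell (hypothetical values in time slots) and computes per-cell latency, per-pair jitter as defined above, and the accepted-to-offered ratio. The function names and numbers are illustrative.

```python
def latencies(ingress, egress):
    """Latency of each cell: egress time minus ingress time."""
    return [e - i for i, e in zip(ingress, egress)]


def jitter(ingress, egress):
    """For each pair of consecutive cells of one flow:
    (gap at egress) - (gap at ingress)."""
    in_gaps = [b - a for a, b in zip(ingress, ingress[1:])]
    out_gaps = [b - a for a, b in zip(egress, egress[1:])]
    return [o - i for i, o in zip(in_gaps, out_gaps)]


def accepted_ratio(accepted, offered):
    """Accepted vs. offered bandwidth as a fraction."""
    return accepted / offered


ingress = [0, 10, 20, 30]   # assumed timestamps, in time slots
egress = [5, 18, 25, 35]

print(latencies(ingress, egress))   # [5, 8, 5, 5]
print(jitter(ingress, egress))      # [3, -3, 0]
print(accepted_ratio(90, 100))      # 0.9
```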

28. Benchmark Suites
- Hardware benchmarks:
  - Memory speed, processing speed, port-to-port minimum latency, switch fabric overhead, internal cell size, ...
  - In these tests there is no contention between packets, which minimizes the impact of scheduling and arbitration.
  - Zero-load latency, maximum port load

29. Benchmark Suites
- Arbitration benchmarks:
  - Study the performance of the fabric when there is contention.
  - Performance is studied for different traffic patterns and destination load distributions.