Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Packet Switches.


Slide 2: Packet switches
- In a circuit switch, the path of a sample is determined at connection-establishment time
  - No sample header is needed; the position in the frame is used
- In a packet switch, packets carry a destination field or label
  - Need to look up the destination port on the fly
- Datagram switches
  - Lookup based on the entire destination address (longest-prefix match)
- Cell or label switches
  - Lookup based on VCI or labels
- L2 switches, L3 switches, L4-L7 switches
  - Key difference is in the lookup function (i.e., filtering), not in switching (i.e., not in forwarding)
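To make the lookup distinction concrete, here is a minimal sketch contrasting the two lookup styles. The table contents, port numbers, and the function names `longest_prefix_match` and `label_lookup` are illustrative assumptions, not taken from the slides; a real router would use a trie or TCAM rather than a linear scan.

```python
import ipaddress

def longest_prefix_match(table, addr):
    """Datagram-style lookup: pick the most specific prefix covering addr.

    `table` maps prefix strings to output ports (hypothetical example data).
    A linear scan is used only for clarity.
    """
    ip = ipaddress.ip_address(addr)
    best_len, best_port = -1, None
    for prefix, port in table.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and net.prefixlen > best_len:
            best_len, best_port = net.prefixlen, port
    return best_port

def label_lookup(table, label):
    """Cell/label-style lookup: an exact match on a short label (e.g. a VCI)."""
    return table.get(label)

routes = {"0.0.0.0/0": 0, "10.0.0.0/8": 1, "10.1.0.0/16": 2}
print(longest_prefix_match(routes, "10.1.2.3"))  # most specific match wins
print(label_lookup({42: 3}, 42))                 # single exact-match probe
```

The forwarding step after either lookup is the same: move the packet to the chosen output port.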

Slide 3: Shared Memory Switches
- Dual-ported RAM
- Incoming cells converted from serial to parallel
- Elegant, but memory speeds and port counts don't scale
- Output buffering
  - 100% throughput under heavy load
  - Minimize buffers
- E.g.: CNET Prelude, Hitachi shared buffer switch, AT&T GCNS-2000

Slide 4: Shared memory fabrics: more…
- Memory interface hardware is expensive => many "ports" share fewer memory interfaces
  - E.g.: dual-ported memory
- Separate low-speed bus lines for the controller

Slide 5: Shared Medium Switches
- Share a medium (i.e., bus, ring, etc.) instead of memory
- Medium has to be N times as fast
- Address filters and output buffers must also run at the medium speed!
- TDM + round robin
- E.g.: IBM PARIS and plaNET switches, Fore Forerunner ASX-100, NEC ATOM

Slide 6: Fully Interconnected Switches
- Full interconnections
- Broadcast + address filters
- Multicasting is natural
- Output queuing
- All hardware runs at the same speed => scalable
- Quadratic growth of buffers/filters
- Knockout switch (AT&T) reduced the number of buffers: fixed L (= 8) buffers per output + a tournament method to eliminate packets
  - Small residual packet loss rate (1 in a million)
- E.g.: Fujitsu bus matrix, GTE SPANet

Slide 7: Crossbar: "Switched" interconnections
- 2N media (i.e., buses), BUT…
- Use "switches" between each input and output bus instead of broadcasting
- Total number of "paths" required = N+M
- Number of switching points = N×M
- Arbitration/scheduling needed to deal with port contention

Slide 8: Multi-Stage Fabrics
- Compromise between pure time division and pure space division
- Attempt to combine the advantages of each
  - Lower cost from time division
  - Higher performance from space division
- Technique: limited sharing
  - E.g.: Banyan switch
- Features
  - Scalable
  - Self-routing, i.e., no central controller
  - Packet queues allowed, but not required
- Note: multi-stage switches share the "crosspoints", which have now become "expensive" resources…

Slide 9: Multi-stage switches: fewer crosspoints
- Issue: output and internal blocking…

Slide 10: Banyan Switch Fabric (Contd)
- Basic building block = 2x2 switch, labelled by 0/1
- Can be synchronous or asynchronous
  - Asynchronous => packets can arrive at arbitrary times
  - Synchronous banyan offers TWICE the effective throughput!
- Worst case: all inputs receive packets with the same label

Slide 11: Switch fabric element
- Goal: "self-routing" fabrics
  - Build complicated fabrics from simple elements
- Routing rule: if 0, send the packet to the upper output, else to the lower output
- If both packets go to the same output, buffer or drop one of them

Slide 12: Multi-stage Interconnects (MINs): Banyan
- Key: reduce the number of crosspoints in a crossbar
- 8x8 banyan: recursive design
  - Use the first bit to route the cell through the first stage, to either the upper or the lower 4x4 network
  - Use the last 2 bits to route the cell through the 4x4 network to the appropriate output port
- Self-routing: the output address completely specifies the route through the network (aka digit-controlled routing)
- Simple elements, scalable, parallel routing, elements at same speed
- E.g.: Bellcore Sunshine, Alcatel DN 1100

Slide 13: Banyan Fabric: another view…

Slide 14: Banyan
- Simplest self-routing recursive fabric
- Two packets want to go to the same output => output blocking
- Banyan: packets may block even if they want to go to different outputs => internal blocking!
  - Unlike a crossbar, because it has fewer crosspoints
- However, feasible non-blocking schedules exist => pre-sort and shuffle packets to get to such non-blocking schedules
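Self-routing and internal blocking can be illustrated with a small model. The sketch below assumes an omega (shuffle-exchange) network, which is one banyan-class topology and not necessarily the exact wiring drawn in the slides; the function names `omega_path` and `link_conflicts` are hypothetical. Each stage applies a perfect shuffle and then a 2x2 element that follows the routing rule "0 goes up, 1 goes down" using the next destination bit.

```python
def omega_path(src, dst, n_bits):
    """Return the link position a cell occupies after each stage.

    Assumption: an omega network with 2**n_bits ports. Destination bits
    are consumed most-significant first (digit-controlled routing).
    """
    mask = (1 << n_bits) - 1
    pos = src
    path = []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the position left by one bit.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # 2x2 element: destination bit 0 -> upper output, 1 -> lower output.
        bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        path.append(pos)
    return path

def link_conflicts(cells, n_bits):
    """List (stage, position) pairs claimed by more than one cell."""
    paths = [omega_path(src, dst, n_bits) for src, dst in cells]
    conflicts = []
    for stage in range(n_bits):
        counts = {}
        for path in paths:
            counts[path[stage]] = counts.get(path[stage], 0) + 1
        conflicts += [(stage, pos) for pos, c in sorted(counts.items()) if c > 1]
    return conflicts

# Different outputs (0 and 1), yet the cells collide inside the fabric.
print(link_conflicts([(0, 0), (4, 1)], 3))
# Same two outputs from inputs 0 and 1: no collision at any stage.
print(link_conflicts([(0, 0), (1, 1)], 3))
```

In this model a conflict at the last stage corresponds to output blocking, and a conflict at an earlier stage corresponds to internal blocking.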

Slide 15: Non-Blocking Batcher-Banyan
[Figure: Batcher sorter followed by a self-routing network]
- Fabric can be used as scheduler.
- Batcher-Banyan network is blocking for multicast.

Slide 16: Blocking in Banyan Switches: Sorting
- Can avoid blocking by choosing the order in which packets appear at input ports
- If we can
  - present packets at inputs sorted by output
  - "trap" duplicates (i.e., packets going to the same output port)
  - remove gaps
  - precede the banyan with a perfect shuffle stage
- then there is no internal blocking
- For example: [X, 011, 010, X, 011, X, X, X]:
  - Sort => [010, 011, 011, X, X, X, X, X]
  - Trap duplicates => [010, 011, X, X, X, X, X, X]
  - Shuffle => [010, X, 011, X, X, X, X, X]
- Need sort, shuffle, and trap networks
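The sort, trap, and shuffle steps can be sketched in software as follows. This is only a functional model under stated assumptions: `None` stands for an idle input (the slide's X), output addresses are plain integers, and the function name `sort_trap_shuffle` is hypothetical. In hardware each step is a separate network.

```python
def sort_trap_shuffle(cells):
    """Apply sort, trap-duplicates, gap removal, and a perfect shuffle.

    `cells` has one entry per input port (a power of two): an output
    address, or None for an idle port.
    """
    n = len(cells)
    n_bits = n.bit_length() - 1

    # Sort: active cells in increasing output order, idle ports last.
    active = sorted(c for c in cells if c is not None)

    # Trap: keep only the first cell for each output address.
    unique = []
    for c in active:
        if not unique or unique[-1] != c:
            unique.append(c)

    # Remove gaps: survivors occupy the lowest-numbered inputs.
    compact = unique + [None] * (n - len(unique))

    # Perfect shuffle: the cell at position i moves to rotate_left(i).
    shuffled = [None] * n
    for i, c in enumerate(compact):
        j = ((i << 1) | (i >> (n_bits - 1))) & (n - 1)
        shuffled[j] = c
    return shuffled

# Requests for outputs 010 and 011, with 011 requested twice.
result = sort_trap_shuffle([None, 0b011, 0b010, None, 0b011, None, None, None])
print(result)
```

The trapped duplicate is simply discarded here; a real switch would typically recirculate or buffer it for a later slot.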

Slide 17: Sorting using Merging
- Build sorters from merge networks
- Assume we can merge two sorted lists
- Sort pairwise, merge, recurse
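One way to realize "sort pairwise, merge, recurse" is Batcher's odd-even merge sort, sketched below as a generator of compare-exchange pairs. The slides do not say which of Batcher's merge networks is intended (bitonic merging is the other common choice), so treat this as one possible instance. The helper names are hypothetical, and the input size is assumed to be a power of two.

```python
from itertools import product

def oddeven_merge(lo, hi, r):
    """Comparators that merge two sorted halves of lo..hi (inclusive)."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)
        yield from oddeven_merge(lo + r, hi, step)
        yield from ((i, i + r) for i in range(lo + r, hi - r, step))
    else:
        yield (lo, lo + r)

def oddeven_merge_sort(lo, hi):
    """Comparators that sort positions lo..hi (inclusive), recursively."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort(lo, mid)
        yield from oddeven_merge_sort(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)

def apply_network(network, values):
    """Run the compare-exchange elements in order on a copy of values."""
    v = list(values)
    for a, b in network:
        if v[a] > v[b]:
            v[a], v[b] = v[b], v[a]
    return v

def sorts_all_binary_inputs(network, n):
    """0-1 principle: a network that sorts all 0/1 inputs sorts everything."""
    return all(apply_network(network, bits) == sorted(bits)
               for bits in product([0, 1], repeat=n))

network = list(oddeven_merge_sort(0, 7))
print(len(network), sorts_all_binary_inputs(network, 8))
print(apply_network(network, [3, 7, 2, 7, 0, 5, 1, 4]))
```

Because the comparator sequence is fixed in advance and independent of the data, it maps directly onto a hardware network of 2x2 compare-exchange elements.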

Slide 18: Putting together: Batcher-Banyan

Slide 19: Scaling Banyan Networks: Challenges
1. Batcher-banyan networks of significant size are physically limited by the achievable circuit density and the number of input/output pins of the integrated circuit. When several boards must be interconnected, interconnection complexity and power dissipation constrain the number of boards.
2. The entire set of N cells must be synchronized at every stage.
3. Large sizes increase the difficulty of ensuring reliability and repairability.
4. All modifications that maximize the throughput of space-division networks increase the implementation complexity.

Slide 20: Other Non-Blocking Fabrics: Clos Network

Slide 21: Other Non-Blocking Fabrics: Clos Network
- Expansion factor required = 2 − 1/N (but still blocking for multicast)
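The expansion factor follows from the classic strict-sense non-blocking condition for a three-stage Clos network: with n inputs per first-stage switch, at least 2n − 1 middle-stage switches are needed. The sketch below assumes that the slide's N denotes this per-switch input count n; the function names are hypothetical.

```python
from fractions import Fraction

def min_middle_switches(n):
    """Middle-stage switches needed for strict-sense non-blocking operation."""
    return 2 * n - 1

def expansion_factor(n):
    """Ratio of middle-stage links to inputs on a first-stage switch."""
    return Fraction(min_middle_switches(n), n)

for n in (2, 4, 8):
    # (2n - 1) / n equals 2 - 1/n, approaching 2 as n grows.
    print(n, min_middle_switches(n), expansion_factor(n))
```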

Slide 22: Blocking and Buffering

Slide 23: Blocking in packet switches
- Can have both internal and output blocking
- Internal: no path to output
- Output: trunk unavailable
- Unlike a circuit switch, cannot predict if packets will block (why?)
- If a packet is blocked => must either buffer or drop it

Slide 24: Dealing with blocking in packet switches
- Over-provisioning: internal links much faster than inputs
- Buffers: at input or output
- Backpressure: if the switch fabric doesn't have buffers, prevent a packet from entering until a path is available
- Parallel switch fabrics: increase effective switching capacity

Slide 25: Blocking in Banyan Fabric

Slide 26: Buffering: where?
- Input
- Output
- Internal
- Re-circulating

Slide 27: Queuing: input, output buffers

Slide 28: Switch Fabrics: Buffered crossbar
- What happens if packets at two inputs both want to go to the same output?
- Can defer one at an input buffer
- Or, buffer the cross-points: complex arbiter

Slide 29: Queuing: Two basic practical techniques
- Input Queueing: usually a non-blocking switch fabric (e.g. crossbar)
- Output Queueing: usually a fast bus

Slide 30: Queuing: Output Queueing
- Individual output queues: memory b/w = (N+1)·R
- Centralized shared memory: memory b/w = 2N·R
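A quick calculation shows how these two bandwidth requirements grow with port count. The sketch assumes R is the line rate of one port and N the number of ports, which the slide does not spell out. In each cell time a single output queue may have to absorb up to N arriving cells while sending one, and a shared memory must absorb N arrivals and serve N departures. The function names and the example figures are hypothetical.

```python
def output_queue_memory_bw(n_ports, line_rate):
    """Per-queue bandwidth: up to N writes plus 1 read per cell time."""
    return (n_ports + 1) * line_rate

def shared_memory_bw(n_ports, line_rate):
    """Shared-buffer bandwidth: N writes plus N reads per cell time."""
    return 2 * n_ports * line_rate

# Example figures (assumed): 16 ports at 10 Gb/s each.
print(output_queue_memory_bw(16, 10))  # Gb/s needed by each output queue
print(shared_memory_bw(16, 10))        # Gb/s needed by the single shared memory
```

Either way the memory must run roughly N times faster than a port, which is why output queueing becomes difficult at high line rates.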

Slide 31: Output Queuing

Slide 32: Input Queuing

Slide 33: Input Queueing: Head of Line Blocking
[Plot: delay versus load; load axis marked at 58.6% and 100%]
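The 58.6% figure is the well-known saturation throughput of FIFO input queues under uniform traffic; for a large number of ports it tends to 2 − √2 ≈ 0.586. A small simulation can reproduce the effect. The sketch below assumes saturated inputs (every input always has a head-of-line cell), uniformly random destinations, and a random winner per contended output; the function name and parameters are hypothetical.

```python
import random

def hol_saturation_throughput(n_ports, n_slots, seed=0):
    """Estimate per-port throughput of a saturated FIFO input-queued switch."""
    rng = random.Random(seed)
    # Destination of the head-of-line cell at each input.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    served = 0
    for _ in range(n_slots):
        # Group inputs by the output their head-of-line cell wants.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each requested output serves one contender; losers stay blocked,
        # along with everything queued behind them.
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)
            hol[winner] = rng.randrange(n_ports)  # next cell moves to the head
            served += 1
    return served / (n_ports * n_slots)

print(hol_saturation_throughput(2, 20000))   # close to 0.75 for two ports
print(hol_saturation_throughput(32, 3000))   # approaches 2 - sqrt(2) as N grows
```

The exact numbers depend on the random seed, but the estimate drops from about 0.75 at two ports toward roughly 0.59 as the switch grows.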

Slide 34: Solution: Input Queueing w/ Virtual output queues (VOQ)

Slide 35: Head-of-Line (HOL) in Input Queuing

Slide 36: Input Queues: Virtual Output Queues
[Plot: delay versus load; load axis marked at 100%]

Slide 37: Input Queueing
- Scheduler: can be quite complex!
- Memory b/w = 2R

Slide 38: Input Queueing: Scheduling

Slide 39: Input Queueing: Scheduling: Example
[Figure: request graph and a bipartite matching (weight = 18)]

Slide 40: Input Queueing: Longest Queue First or Oldest Cell First
- Weight = { queue length (Longest Queue First), waiting time (Oldest Cell First) } => 100% throughput

Slide 41: Input Queueing: Scheduling
- Maximum Size
  - Maximizes instantaneous throughput
  - Does it maximize long-term throughput?
- Maximum Weight
  - Can clear most backlogged queues
  - But does it sacrifice long-term throughput?
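The difference between the two objectives shows up even on a 3x3 switch. The sketch below finds a maximum size and a maximum weight matching by brute force over all permutations, using VOQ occupancy as the weight. The occupancy matrix and the function name `best_matchings` are made-up illustrations; brute force is only workable for tiny N, and real schedulers use the heuristics discussed later.

```python
from itertools import permutations

def best_matchings(voq):
    """Return (maximum-size matching, maximum-weight matching).

    voq[i][j] is the number of cells queued at input i for output j.
    A matching is a list of (input, output) pairs with non-empty VOQs.
    """
    n = len(voq)
    best_size, best_weight, best_weight_value = [], [], -1
    for perm in permutations(range(n)):
        edges = [(i, j) for i, j in enumerate(perm) if voq[i][j] > 0]
        weight = sum(voq[i][j] for i, j in edges)
        if len(edges) > len(best_size):
            best_size = edges
        if weight > best_weight_value:
            best_weight, best_weight_value = edges, weight
    return best_size, best_weight

# Hypothetical occupancies: input 0 has a long queue for output 0.
voq = [[7, 1, 0],
       [4, 0, 0],
       [0, 0, 1]]
max_size, max_weight = best_matchings(voq)
print(max_size)    # serves three queues, total weight 6
print(max_weight)  # serves two queues, total weight 8
```

Here the maximum size matching moves more cells in this slot, while the maximum weight matching drains the most backlogged queue.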

Slide 42: Input Queuing
Why is serving long/old queues better than serving the maximum number of queues?
- When traffic is uniformly distributed, servicing the maximum number of queues leads to 100% throughput.
- When traffic is non-uniform, some queues become longer than others.
- A good algorithm keeps the queue lengths matched, and services a large number of queues.
[Plots: average occupancy per VOQ, for uniform traffic and for non-uniform traffic]

Slide 43: Input Queueing: Practical Algorithms
- Maximal Size Algorithms
  - Wave Front Arbiter (WFA)
  - Parallel Iterative Matching (PIM)
  - iSLIP
- Maximal Weight Algorithms
  - Fair Access Round Robin (FARR)
  - Longest Port First (LPF)

Slide 44: iSLIP
[Figure: iterations #1 and #2, each showing the Request, Grant, and Accept/Match steps with round-robin selection]

Slide 45: iSLIP Properties
- Random under low load
- TDM under high load
- Lowest priority to MRU (most recently used)
- 1 iteration: fair to outputs
- Converges in at most N iterations; on average <= log2 N
- Implementation: N priority encoders
- Up to 100% throughput for uniform traffic
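A compact software model of the request, grant, and accept steps helps show why iSLIP behaves like TDM under high load. This is a sketch of the commonly described algorithm and not the hardware design: round-robin pointers advance one position past a match, and only for matches made in the first iteration. The function name `islip_slot` and its arguments are hypothetical.

```python
def islip_slot(requests, grant_ptr, accept_ptr, iterations=1):
    """Compute one cell time's matching; updates the pointers in place.

    requests[i][j] is True when input i has a cell queued for output j.
    Returns a list mapping each input to its matched output (or None).
    """
    n = len(requests)
    in_match = [None] * n
    out_match = [None] * n
    for it in range(iterations):
        # Grant: each unmatched output picks the first requesting,
        # unmatched input at or after its round-robin pointer.
        grants = {}
        for out in range(n):
            if out_match[out] is not None:
                continue
            for k in range(n):
                inp = (grant_ptr[out] + k) % n
                if in_match[inp] is None and requests[inp][out]:
                    grants.setdefault(inp, []).append(out)
                    break
        if not grants:
            break
        # Accept: each input picks the first granting output at or
        # after its own round-robin pointer.
        for inp, outs in grants.items():
            for k in range(n):
                out = (accept_ptr[inp] + k) % n
                if out in outs:
                    in_match[inp], out_match[out] = out, inp
                    if it == 0:  # pointers move only on first-iteration matches
                        grant_ptr[out] = (inp + 1) % n
                        accept_ptr[inp] = (out + 1) % n
                    break
    return in_match

# Fully loaded 3x3 switch, one iteration per cell time.
full = [[True] * 3 for _ in range(3)]
g, a = [0, 0, 0], [0, 0, 0]
slots = [islip_slot(full, g, a) for _ in range(3)]
print(slots)
```

With every VOQ backlogged, the grant pointers start out synchronized and only one match is made; within a few cell times they drift apart and each output serves a different input, which is the time-division pattern named in the slide.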

Slide 46: iSLIP

Slide 47: iSLIP

Slide 48: iSLIP Implementation
[Figure: grant arbiters 1, 2, …, N and accept arbiters 1, 2, …, N; each is a programmable priority encoder with N-bit inputs, state, and a log2 N-bit decision]

Slide 49: Throughput results
Theory:
- Input Queueing (IQ): 58% [Karol, 1987]
- IQ + VOQ, maximum weight matching: 100% [M et al., 1995]
- Different weight functions, incomplete information, pipelining: 100% [Various]
- Randomized algorithms: 100% [Tassiulas, 1998]
- IQ + VOQ, maximal size matching, speedup of two: 100% [Dai & Prabhakar, 2000]
Practice:
- Input Queueing (IQ)
- IQ + VOQ, sub-maximal size matching, e.g. PIM, iSLIP
- Various heuristics, distributed algorithms, and amounts of speedup

Slide 50: Speedup: Context
[Figure: a generic switch, with memory at the inputs and at the outputs]
The placement of memory gives
- Output-queued switches
- Input-queued switches
- Combined input- and output-queued switches

Slide 51: Output-queued switches
- Best delay and throughput performance
  - Possible to erect "bandwidth firewalls" between sessions
- Main problem
  - Requires high fabric speedup (S = N)
- Unsuitable for high-speed switching

Slide 52: Input-queued switches
- Big advantage
  - Speedup of one is sufficient
- Main problem
  - Can't guarantee delay due to input contention
- Overcoming input contention: use higher speedup

Slide 53: The Speedup Problem
Find a compromise: 1 < Speedup << N
- to get the performance of an OQ switch
- close to the cost of an IQ switch
Essential for high speed QoS switching

Slide 54: Intuition (Bernoulli IID inputs)
- Speedup = 1: fabric throughput = 0.58
- Speedup = 2: fabric throughput = 1.16; input efficiency ρ = 1/1.16; average input queue = 6.25

Slide 55: Intuition (continued) (Bernoulli IID inputs)
- Speedup = 3: fabric throughput = 1.74; input efficiency = 1/1.74; average input queue = 1.35
- Speedup = 4: fabric throughput = 2.32; input efficiency = 1/2.32; average input queue = 0.75
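The numbers on these two slides can be reproduced with a short calculation. The sketch assumes that fabric throughput scales linearly with speedup from the 0.58 head-of-line limit, that input efficiency is ρ = 1/(fabric throughput), and that the average input queue is ρ/(1 − ρ). That last formula is an inference: it matches the slide values but is not stated in the slides. The function name is hypothetical.

```python
def speedup_intuition(speedup, hol_throughput=0.58):
    """Return (fabric throughput, input efficiency rho, average input queue).

    The queue estimate rho / (1 - rho) is an assumed model that happens
    to match the slide values; it is None when rho >= 1 (unstable).
    """
    fabric = speedup * hol_throughput
    rho = 1 / fabric
    avg_queue = rho / (1 - rho) if rho < 1 else None
    return fabric, rho, avg_queue

for s in (1, 2, 3, 4):
    fabric, rho, queue = speedup_intuition(s)
    print(s, round(fabric, 2), round(rho, 3), queue)
```

Under this model a speedup of 1 cannot keep up with fully loaded inputs, while speedups of 2, 3, and 4 give average queues of about 6.25, 1.35, and 0.76 cells (the slide shows the last value as 0.75).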