Fast Switches Switch Fabric Architecture Fast Datagram Switches Higher-Layer and Active Processing (From Kwangwoon Univ.)


Introduction
How the path is determined and set up
– Centralized control: a single point of control
– Distributed control: per-input-port processing
– Self-routing: autonomous control
– Distributed control and self-routing
  Advantage: they do not limit scalability
  Disadvantage: global optimization is difficult
Blocking characteristics
– Strictly nonblocking
– Wide-sense nonblocking: depends on the switching algorithm used to set the path
– Rearrangeably nonblocking: existing paths may have to be rearranged
– Virtually nonblocking: low probability of blocking
Nonblocking Switch Fabric Principle: Avoid blocking by space-division parallelism, internal speedup, and internal pipelined buffering with cut-through.

Buffering
Why buffering?
– If all traffic were uniform, buffering would not be needed.
– When traffic is bursty, buffering is needed because colliding packets would otherwise be dropped.
[Figure: packets from IN 1 and IN 2 collide at OUT; the colliding packet is delayed]

Buffering
Buffer Location
– Unbuffered
  Undesirable for fast packet switches, which need buffering.
  Suited to optical components, because they offer no way to queue.
  – Dealing with contention in an optical burst switch
    » Drop the burst and retransmit end-to-end.
    » Deflect the burst by scheduling.
    » Convert the burst to the electronic domain for queueing.
– Internally buffered
  Increases complexity
– Input or output queued
– Input and output queued
– Shared buffer switch
  A logical partitioning of physical memory

Buffering Buffer location –Unbuffered vs internally buffered

Buffering Buffer Location –Input or output buffered switches.

Buffering Buffer Location –Combined input/output buffered switch

Buffering Buffer Location –Shared buffer switch

Buffering
Head-of-line blocking
– Input queueing: holds packets until the switch is able to direct them to the appropriate output
– Output queueing: needed on a shared medium network because of contention from other network nodes for the MAC
– Speedup (S): the ratio of internal to external data rates
– Internal buffering
– Internal expansion: Clos fabric
Head-of-Line Blocking Avoidance Principle: Output queueing requires internal speedup, expansion, or buffering. Virtual output queueing requires additional queues or queueing complexity. The two techniques must be traded against one another, and can be used in combination.

Buffering Head-of-Line Blocking –Input versus output queueing

Buffering Head-of-line blocking –Clos fabric

Buffering
Virtual Output Queueing
– Packets must be multiplexed and timestamped to determine the arrival order among the queues at each input.
– A scheduling algorithm can be applied to determine which packets to accept so that they match a set of nonconflicting outputs.
– Disadvantage: waste of buffer space
– Tradeoff
  Increased memory density makes more queues practical.
  Increased logic density makes more complex hardware practical.

Buffering Virtual Output Queueing : –Head of line blocking can be eliminated
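
The difference between a single input FIFO and virtual output queues can be sketched for one time slot. This is only an illustration under stated assumptions: the function names are hypothetical, each packet is represented by its destination output number, and inputs are served greedily in port order rather than by a real scheduling algorithm.

```python
from collections import deque

def fifo_slot(queues):
    """One time slot with one FIFO per input: only head packets compete."""
    busy_outputs = set()
    sent = []
    for port, q in enumerate(queues):
        # A head packet whose output is taken blocks everything behind it.
        if q and q[0] not in busy_outputs:
            dest = q.popleft()
            busy_outputs.add(dest)
            sent.append((port, dest))
    return sent

def voq_slot(queues):
    """One time slot with virtual output queues (assumed greedy matching):
    any queued packet whose output is still free may be selected."""
    busy_outputs = set()
    sent = []
    for port, q in enumerate(queues):
        for dest in list(q):
            if dest not in busy_outputs:
                q.remove(dest)
                busy_outputs.add(dest)
                sent.append((port, dest))
                break
    return sent

# Both inputs have a packet for output 0 at the head of the queue.
print(fifo_slot([deque([0, 1]), deque([0, 2])]))  # input 1 is blocked
print(voq_slot([deque([0, 1]), deque([0, 2])]))   # input 1 sends to output 2
```

In the FIFO case the packet for output 2 waits behind the blocked head packet even though output 2 is idle; in the virtual output queue case it is transferred in the same slot.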

Single-Stage Shared Elements Bus Interconnects [Figure: inputs i0 to i7 and outputs o0 to o7 attached to a shared bus labelled w]

Single-Stage Shared Elements
Bus Interconnects
– Packets must wait in input queues until the bus is free.
– Aggregate throughput: r_i < w/(nt) (w: bandwidth; 1/(nt): bit rate available to each of the n ports)
– Bus speedup is limited by the available electronic technology.
– Multicast
– Ring switches
– Throughput can be higher due to better ring utilization by the MAC protocol and the isolation of electrical effects.

Single-Stage Shared Elements Shared Memory Fabrics [Figure: inputs I0 to I7 feed an input multiplexer into shared memory, followed by an output demultiplexer to outputs o0 to o7]

Single-Stage Shared Elements
Shared Memory Fabrics
– Difficulties
  Memory density is increasing exponentially, but memory access times are not improving at the same pace.
  A packet must typically be read completely into memory before being output.
– Multicast

Single-Stage Space Division Elements Basic Switch Element –Electronic Switch Elements: 2 × 2 switch element [Figure: 2 × 2 switch element with inputs i0 and i1, control input c, and outputs o0 and o1; components: control, packet buffer, cut-through, output multiplexor; states: straight, cross, duplicate]

Single-Stage Space Division Elements Basic Switch Element –Electronic Switch Elements (2 × 2 self-routing switch element) [Figure: 2 × 2 self-routing switch element with inputs i0 and i1 and outputs o0 and o1; components: header decode and delay on each input, control, packet buffer, cut-through, output multiplexor]

Single-Stage Space Division Elements Basic Switch Element –Optical Switch Elements [Figure: optical switch element with two inputs, two outputs, and an electrode, shown in the cross state and in the straight state (voltage applied)]

Single-Stage Space Division Elements Crossbar –Crossbar switch point states [Figure: crosspoint between row i and column o_j in electronic, optical, and MEMS implementations; states: cross, turn, duplicate]

Multistage Switches
Crossbar
– Advantage: simplicity and regularity
– Disadvantage: scaling complexity (n^2)
Simple model of the cost in chip area
– A = a_c + n(a_i + a_o) + n^2 a_x
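
The chip area model can be evaluated directly to see how the quadratic term takes over. A minimal sketch follows; the reading of the symbols (a_c as control area, a_i and a_o as per-input and per-output area, a_x as per-crosspoint area) is an assumption, and the numeric values are made up for illustration.

```python
def crossbar_area(n, a_c, a_i, a_o, a_x):
    """Area of an n x n crossbar: A = a_c + n(a_i + a_o) + n^2 a_x.

    Assumed meanings: a_c control, a_i per input, a_o per output,
    a_x per crosspoint (all in the same arbitrary area unit).
    """
    return a_c + n * (a_i + a_o) + n * n * a_x

# Hypothetical unit areas; doubling n roughly quadruples the crosspoint term.
for n in (8, 16, 32):
    print(n, crossbar_area(n, a_c=10, a_i=2, a_o=2, a_x=1))
```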

Single-Stage Space Division Elements Crossbar –Crossbar switch [Figure: 8 × 8 crossbar switch with inputs I0 to I7 and outputs o0 to o7]

Multistage Switches
Tiling Crossbar
– Tile switch elements in a square array.
– This is not a cost-effective solution for large switches.
Multistage Interconnection Networks (MINs)
– Delta switch
  Advantages
  – Elimination of central switch control (self-routing)
  – Preservation of packet sequence, since cells follow the same path
  Disadvantage
  – Load is not distributed
– Benes switch
  Dynamically routes packets using additional stages.
  – Requires a resequencing buffer that uses a timestamp inserted into the internal switch header
– Banyan switch
  Uses shared memory and crossbar switches.
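
Self-routing without central control can be illustrated with destination-tag routing. The sketch below assumes an omega (shuffle-exchange) network, which is one member of the delta class; the slides do not say which delta topology is meant, and the function name is hypothetical. At each stage the switch element inspects one bit of the destination address and picks its upper (0) or lower (1) output.

```python
def omega_path(src, dst, n_bits):
    """Positions of a packet after each stage of a 2^n_bits-port omega network.

    Each stage performs a perfect shuffle (rotate the position left by one
    bit) and then the 2 x 2 element sets the low bit to the next destination
    bit, most significant bit first.
    """
    mask = (1 << n_bits) - 1
    pos = src
    path = []
    for stage in range(n_bits):
        bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = ((pos << 1) & mask) | bit
        path.append(pos)
    return path

# 16-port network: from input 3 to output 10 (binary 1010).
print(omega_path(3, 10, 4))
```

The path depends only on the source and destination, which is why packet sequence is preserved and also why load cannot be spread over alternative paths.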

Multistage Switches Multistage Interconnection Networks –Delta switch [Figure: 16 × 16 delta switch with inputs I0 to I15 and outputs o0 to o15]

Multistage Switches Multistage Interconnection Networks –Benes switch [Figure: 16 × 16 Benes switch with inputs I0 to I15, outputs o0 to o15, and labelled stages]

Multistage Switches Multistage Interconnection Networks –Banyan switch [Figure: 16 × 16 banyan switch with inputs I0 to I15, outputs o0 to o15, and stages S0 and S1]

Multistage Switches
Multistage Interconnection Networks
– Optical Multistage Networks
  Incapable of buffering: nonblocking bufferless interconnection fabrics
  Crosstalk problem: dilation techniques
[Figure: dilated Benes switch; element states: pass, cross]

Multistage Switches Scaling Speed (parallel switch slices) [Figure: parallel switch slices σ0 to σ(m-1) under a common fabric control, with inputs i0 to i(n-1), control lines c0 to c(n-1), outputs up to o(n-1), and a data delay]

Multicast Support
Crossbar Switch Multicast
– Service disciplines
  No fanout splitting: a packet is sent only when none of its destination outputs is blocked
  Fanout splitting: a packet may be sent to a subset of its destination outputs, leaving a residue
– Goals of the service schedule
  Throughput is high.
  Some fairness measure is met; in particular, packets should not be starved.
  The scheduling discipline can be implemented at high speed (line rate).
– A variety of scheduling disciplines are possible
  Concentrate the residue among as few inputs as possible
  Weight based
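
The two service disciplines can be contrasted for a single time slot. This is a sketch under assumptions: the function name is hypothetical, inputs are served greedily in port order (not a residue-concentrating or weight-based policy), and the fanout sets are made up.

```python
def schedule_slot(fanouts, allow_splitting):
    """Serve one slot of multicast traffic on a crossbar.

    fanouts maps input port -> set of destination outputs of its head packet.
    Returns (served, residue): the outputs granted to each input in this slot
    and the outputs each input still has to reach afterwards.
    """
    taken = set()
    served, residue = {}, {}
    for port in sorted(fanouts):
        wanted = fanouts[port]
        available = wanted - taken
        if allow_splitting:
            grant = available                  # send to whatever is free
        else:
            grant = wanted if available == wanted else set()  # all or nothing
        if grant:
            served[port] = grant
            taken |= grant
        left = wanted - grant
        if left:
            residue[port] = left
    return served, residue

fanouts = {0: {0, 1}, 1: {1, 2}}
print(schedule_slot(fanouts, allow_splitting=False))
print(schedule_slot(fanouts, allow_splitting=True))
```

Without splitting, input 1 sends nothing because output 1 is taken; with splitting it reaches output 2 now and keeps output 1 as residue.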

Multicast Support Crossbar Switch Multicast scheduling [Figure: multicast scheduling on a 5 × 5 crossbar; each input I1 to I5 is annotated with the fanout set of its packet over outputs o1 to o5]

Multicast Support Multistage Fabric Multicast [Figure: 16 × 16 multistage fabric with inputs I0 to I15 and outputs o0 to o15; labelled sections: copy stages, routing stages, translate]

Review – Fast Packet Switching
– 1980s: link rate technology improved; connection-oriented fast packet switching technologies were developed for high-speed networks.
– 1990s: widely deployed, e.g. ATM for high-speed backbone networks.
Benefits (5.3)
– Simplifies packet processing and forwarding.
– Eliminates the store-and-forward latency.
– Provides QoS guarantees and resource reservation.

Fast Datagram Switches
Factors that resisted the global deployment of connection-oriented networks (CON)
– IP-based Internet, WWW
– The limitations of shared medium link protocols were overcome
Fast Datagram Switches
– Motivation
  Maintain high performance.
  Support connectionless networks.
– Derivation
  Complexity of switch input and output processing

Fast Packet Switching Architecture

Connection-Oriented vs. Connectionless
Similarity
– At a high level, each switch has the same functional blocks.
– Ex. routing, signaling, management
Difference
– Input processing
  Address lookup using a prefix table
  Packet classification
– Output processing
  Packet scheduling to meet QoS requirements

Architecture of Fast Datagram Switching

Packet Processing Rates
Designing a switch
– Datagram size: minimum 40 bytes, maximum 1500 bytes
– Rule of thumb: average packet size
Form of processing
– Sequential processing: designed for the minimum packet size
– Parallel processing: designed for the average packet size
Packet Processing Rate: The packet processing rate is a key throughput measure of a switch. Packet processing software and shared parallel hardware resources must be able to sustain the average packet processing rate.
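
The required packet processing rate follows from the link rate and the packet size. In the sketch below the 40-byte and 1500-byte sizes come from the slide, while the 10 Gb/s link rate and the function name are assumptions chosen for illustration; link-layer framing overhead is ignored.

```python
def packets_per_second(link_bps, packet_bytes):
    """Packets per second needed to keep a link of link_bps busy."""
    return link_bps / (packet_bytes * 8)

LINK_BPS = 10_000_000_000  # assumed 10 Gb/s port

# Per-port sequential logic has to survive back-to-back minimum-size packets.
print(packets_per_second(LINK_BPS, 40))
# Maximum-size packets give the lowest rate on the same link.
print(packets_per_second(LINK_BPS, 1500))
```

The gap between the two rates is why the minimum packet size drives the worst-case design, while shared resources can be sized for an average.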

Fast Forwarding Lookup
Review – Fast Packet Switching
– CID for fast packet switching
– Problem: table entry size
Fast Datagram Switching
– Problem: similar to fast packet switching
– Solutions
  Flat addressing
  Hierarchical addresses
  Software prefix matching
  Hardware matching support
  Source routing

Flat Addressing

Software Search
Lookup time
– Worst case: minimum packet size with the worst-case lookup algorithm
Memory required
– Tradeoff (performance vs. cost)
– The amount of memory must be reasonable to contain in the switch input processing.
Update time
– Depends on the lookup data structure
Techniques
– Tree search: O(log_B N) for N entries, where B is the branching factor
– Hash function: O(1) when there are no hash collisions
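
The two software techniques can be compared on a small flat address table. This sketch makes assumptions: binary search over a sorted array stands in for tree search with B = 2, a Python dict stands in for the hash table, and the addresses and port names are invented.

```python
import bisect

def tree_lookup(sorted_addrs, ports, addr):
    """Binary search (tree search with branching factor 2): O(log2 N)."""
    i = bisect.bisect_left(sorted_addrs, addr)
    if i < len(sorted_addrs) and sorted_addrs[i] == addr:
        return ports[i]
    return None

def hash_lookup(table, addr):
    """Hash lookup: O(1) expected when collisions are rare."""
    return table.get(addr)

# Hypothetical flat addresses mapped to output ports.
addrs = [0x0A01, 0x0A02, 0x0B10, 0x0C7F]
ports = ["p0", "p3", "p1", "p2"]
table = dict(zip(addrs, ports))

print(tree_lookup(addrs, ports, 0x0B10), hash_lookup(table, 0x0B10))
print(tree_lookup(addrs, ports, 0x0FFF), hash_lookup(table, 0x0FFF))
```

The sorted array is cheap in memory but costly to update, while the hash table trades memory for speed, which is the performance versus cost tradeoff noted above.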

Content Addressable Memory (CAM)
Features
– Parallel scan
– Memory access
– Referenced by key
– Returns associated data
Benefit
– Intuitive and fast
Model
– Each word consists of a key and its associated data.
– All words are checked in parallel in a single CAM cycle.
– The return-field portion of the word is the output of the CAM read.
– Some CAMs are specifically designed for network address lookup.
[Figure: CAM word layout with key and associated data]

Hierarchical Addresses
– Exploited to reduce the size of the forwarding tables.
– Forwarding entries can be represented as prefix addresses: the higher-order bit portion of an address that must be matched to lead toward the destination.
– Similar to the PSTN.

Software Prefix Matching

Basic Prefix Matching Algorithm
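
One common way to implement prefix matching in software is a binary trie searched for the longest matching prefix. The sketch below is an illustration, not necessarily the algorithm shown on the slide; the class name, the use of bit strings for addresses, and the example prefixes are all assumptions.

```python
class PrefixTrie:
    """Binary trie for longest-prefix match over bit strings such as '1011'."""

    def __init__(self):
        self.root = {}

    def insert(self, prefix, next_hop):
        node = self.root
        for bit in prefix:
            node = node.setdefault(bit, {})
        node["hop"] = next_hop

    def lookup(self, addr):
        node = self.root
        best = node.get("hop")          # default route, if one was inserted
        for bit in addr:
            node = node.get(bit)
            if node is None:
                break
            if "hop" in node:
                best = node["hop"]      # remember the longest match so far
        return best

trie = PrefixTrie()
trie.insert("10", "A")
trie.insert("1011", "B")
print(trie.lookup("10110000"))  # matches both prefixes, the longer one wins
print(trie.lookup("10010000"))  # matches only the short prefix
```

The worst-case lookup walks one node per address bit, which is the cost that hardware matching support tries to avoid.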

Hardware Matching Support
Motivation
– Complexity of software algorithms
Hardware techniques for line-rate lookup
– Assisting logic can be embedded in the memory: CAMs for variable prefixes.
– Translation logic can be provided that assists the location of addresses in conventional memory: multistage lookup.

CAMs for Variable Prefixes
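
A CAM that supports variable-length prefixes can be modelled as a list of (value, mask) words in which masked-out bits are "don't care". The sketch below is a software model under assumptions: 8-bit addresses for brevity, entries ordered longest prefix first, and a sequential loop standing in for what the hardware does in parallel with a priority encoder.

```python
def tcam_lookup(entries, addr):
    """Return the next hop of the first entry whose masked bits match addr.

    entries: list of (value, mask, next_hop), ordered longest prefix first
    so that the first match is also the longest match.
    """
    for value, mask, next_hop in entries:
        if (addr & mask) == value:
            return next_hop
    return None

# Hypothetical 8-bit table: prefix 1011/4 -> B, prefix 10/2 -> A.
entries = [
    (0b10110000, 0b11110000, "B"),
    (0b10000000, 0b11000000, "A"),
]
print(tcam_lookup(entries, 0b10110101))
print(tcam_lookup(entries, 0b10010000))
```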

Multistage Lookup

Source Routing
– Eliminates the per-hop address lookup by precomputing the route.
– Includes the entire path in the packet header.
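
Source routing can be sketched as a header that carries the list of output ports, one per hop. The packet layout and the function names below are hypothetical; the point is only that each switch consumes the next entry instead of consulting a forwarding table.

```python
def forward(packet):
    """One hop: take the next output port from the header, no table lookup."""
    port = packet["route"][0]
    remaining = {"route": packet["route"][1:], "payload": packet["payload"]}
    return port, remaining

def deliver(packet):
    """Follow the precomputed route to the end and list the ports used."""
    ports = []
    while packet["route"]:
        port, packet = forward(packet)
        ports.append(port)
    return ports

# The sender precomputed a three-hop path (ports 3, 0, 5).
print(deliver({"route": [3, 0, 5], "payload": "data"}))
```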

Packet Classification
Two other common forms
– Separation of control packets
– Separation of packets belonging to different traffic classes
General classification includes
– Classification into a QoS traffic class
– Policy-based routing
– Security
– Higher-layer switching functions
– Active networking

Packet Filtering Problem
– Classification occurs before queueing in the node.
– General problem of classification.

Packet Classification Implementations
Hardware classification
– Ternary CAMs can be used to match the rules in parallel.
– Similar to address lookup.
Software classification
– Forwarding table lookup (section 5.1.1)
– “Grid of tries”, “Tuple space search”
Preprocessing classifiers
– Preprocess all possible packet fields.
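
The classification problem itself can be shown with the simplest software approach, a linear first-match search over an ordered rule list. This is deliberately not grid of tries or tuple space search, which exist to avoid this linear scan; the rule fields, the 8-bit source addresses, and the action names are assumptions for illustration.

```python
def classify(rules, packet, default="best-effort"):
    """Return the action of the first rule that matches every field."""
    for rule in rules:
        src_ok = (packet["src"] & rule["src_mask"]) == rule["src_value"]
        port_ok = rule["port_lo"] <= packet["dst_port"] <= rule["port_hi"]
        if src_ok and port_ok:
            return rule["action"]
    return default

# Hypothetical rules: a source prefix plus a destination port range.
rules = [
    {"src_value": 0b10100000, "src_mask": 0b11110000,
     "port_lo": 5000, "port_hi": 5999, "action": "guaranteed"},
    {"src_value": 0b00000000, "src_mask": 0b00000000,
     "port_lo": 22, "port_hi": 22, "action": "control"},
]
print(classify(rules, {"src": 0b10101111, "dst_port": 5060}))
print(classify(rules, {"src": 0b01010101, "dst_port": 22}))
print(classify(rules, {"src": 0b01010101, "dst_port": 80}))
```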

Output Processing and Packet Scheduling (1)
Reasons to perform output scheduling
– Datagram traffic consists of
  Guaranteed service classes
  Best-effort traffic
– Scheduling must be sufficient to meet delay and bandwidth bounds.
– Fair service among the best-effort flows.
– Congestion control mechanisms do not protect guaranteed service classes from the best-effort traffic.

Output Processing and Packet Scheduling (2)
Fair queueing
– Packet Fair Queueing (PFQ)
– Weighted Fair Queueing (WFQ)
Per-flow queueing
– The highest degree of isolation
– Per-flow control is possible when per-flow queueing is used.
Congestion control
– Large, building queues increase delay, resulting in congestion.
– Discard packets to keep queues from building.
– Ex. RED (Random Early Detection)
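
The idea of discarding early to keep queues from building can be sketched with the basic RED drop curve. This is a simplified sketch: it shows only the averaged queue length and the linear drop probability between two thresholds, omits the packet-count adjustment of the full algorithm, and uses made-up parameter values and function names.

```python
def update_average(avg_queue, current_queue, weight):
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_queue + weight * current_queue

def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability as a function of the average queue length.

    Below min_th nothing is dropped, at or above max_th everything is
    dropped, and in between the probability rises linearly up to max_p.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)

# Hypothetical thresholds in packets.
for avg in (2, 10, 20):
    print(avg, red_drop_probability(avg, min_th=5, max_th=15, max_p=0.1))
```

Using the average rather than the instantaneous queue length lets short bursts through while still reacting to sustained queue growth.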

Higher-Layer and Active Processing
Active networking uses general classification techniques.
– First, it identifies packets for active processing.
– It then executes active applications in the network nodes on the identified packets, connections, or flows to provide the desired service.
Motivation for “Active networking”
– Open, flexible interfaces allow provisioning of new protocols and services.
Condition for “Active networking”
– It should not impede the non-active fast path.

Active Network Node Reference Model