1 EE384Y: Packet Switch Architectures Part II Load-balanced Switch (Borrowed from Isaac Keslassy's Defense Talk) Nick McKeown Professor of Electrical Engineering.

Presentation transcript:

1 EE384Y: Packet Switch Architectures Part II Load-balanced Switch (Borrowed from Isaac Keslassy's Defense Talk) Nick McKeown Professor of Electrical Engineering and Computer Science, Stanford University

2 The Arbitration Problem A packet switch fabric is reconfigured for every packet transfer. For example, at 160Gb/s, a new IP packet can arrive every 2ns. The configuration is picked to maximize throughput and not waste capacity. Known algorithms are probably too slow.
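The 2ns figure can be checked with a short calculation. This is a minimal sketch that assumes the slide refers to a minimum-size 40-byte IP packet; the packet size and the helper name `packet_time_ns` are assumptions, not part of the slides.

```python
LINE_RATE_BPS = 160e9     # linecard rate quoted on the slide
MIN_PACKET_BYTES = 40     # assumption: minimum-size IP packet


def packet_time_ns(packet_bytes, line_rate_bps):
    """Time to receive one packet at the given line rate, in nanoseconds."""
    return packet_bytes * 8 / line_rate_bps * 1e9


arrival_interval_ns = packet_time_ns(MIN_PACKET_BYTES, LINE_RATE_BPS)
print(f"{arrival_interval_ns:.1f} ns")  # prints 2.0 ns
```

Under that assumption, the fabric has roughly 2ns to pick and apply each new configuration.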

3 Approach We know that a crossbar with VOQs and uniform Bernoulli i.i.d. arrivals gives 100% throughput for the following scheduling algorithms: Pick a permutation uniformly at random (uar) from all permutations. Pick a permutation uar from a set of N permutations in which each input-output pair (i,j) is connected exactly once. From the same set as above, repeatedly cycle through a fixed sequence of N different permutations. Can we make non-uniform, bursty traffic uniform enough for the above to hold?
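One set with the property described above is the set of N cyclic shifts. The sketch below builds that set and checks that every input-output pair is connected exactly once; the function names are illustrative and the choice of cyclic shifts is one example, not the only valid set.

```python
def cyclic_permutations(n):
    """Return n permutations; permutation t connects input i to output (i + t) % n."""
    return [[(i + t) % n for i in range(n)] for t in range(n)]


def covers_each_pair_once(perms):
    """True if every (input, output) pair appears in exactly one permutation."""
    n = len(perms)
    seen = set()
    for perm in perms:
        if sorted(perm) != list(range(n)):
            return False  # not a valid permutation of the outputs
        for i, j in enumerate(perm):
            if (i, j) in seen:
                return False  # pair connected more than once
            seen.add((i, j))
    return len(seen) == n * n


perms = cyclic_permutations(4)
print(perms[1])                      # [1, 2, 3, 0]
print(covers_each_pair_once(perms))  # True
```

Cycling through `perms` in a fixed order corresponds to the third algorithm in the list: no scheduler decision is needed at run time.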

4 Design Example. Goals: Scale to high linecard speeds (160Gb/s): no centralized scheduler, optical switch fabric, low packet-processing complexity. Scale to high number of linecards (640). Provide performance guarantees: 100% throughput guarantee, no packet reordering. Stanford Optics in Routers project. Some challenging numbers: 100Tb/s, 160Gb/s linecards, 640 linecards.

5 Outline Basic idea of load-balancing Packet mis-sequencing An optical switch fabric Scaling number of linecards Arbitrary arrangement of linecards

6 100% Throughput in a Mesh Fabric. Router capacity = NR. Switch capacity = N^2 R. [Figure: inputs and outputs at rate R connected by a mesh whose link rates are marked "?".]

7 If Traffic Is Uniform [Figure: inputs and outputs at rate R connected by a mesh with links at rate R/N.]

8 Real Traffic is Not Uniform [Figure: the same mesh, with links labeled R/N, R, and "?".]

9 Load-Balanced Switch. 100% throughput for weakly mixing traffic (Valiant, C.-S. Chang). [Figure: a load-balancing stage followed by a forwarding stage; each stage is a mesh with links at rate R/N between ports at rate R.]

10 Load-Balanced Switch [Figure: two meshes in series, links at rate R/N, ports at rate R.]

11 Load-Balanced Switch [Figure: two meshes in series, links at rate R/N, ports at rate R.]

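12 Intuition: 100% Throughput. Arrivals to second mesh: [formula not in transcript]. Capacity of second mesh: [formula not in transcript]. Second mesh: arrival rate < service rate [C.-S. Chang]. [Figure: two meshes in series, links at rate R/N, ports at rate R.]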
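The slide's formulas did not survive in the transcript, so the following is only a sketch of the usual argument, under the assumption that the first mesh spreads every flow evenly over the N middle ports. Rates are normalized so that R = 1; the traffic matrix and the function name are made up for illustration.

```python
def second_mesh_link_loads(traffic):
    """Load on each second-mesh link (middle port k -> output j).

    traffic[i][j] is the rate from input i to output j, as a fraction of R.
    Assumption: the first stage sends 1/N of every flow to each middle port.
    """
    n = len(traffic)
    loads = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                loads[k][j] += traffic[i][j] / n
    return loads


# Non-uniform but admissible traffic: every row and column sums to less than 1.
traffic = [
    [0.90, 0.00, 0.00],
    [0.00, 0.50, 0.40],
    [0.05, 0.40, 0.50],
]
link_capacity = 1 / len(traffic)  # each mesh link runs at R/N
loads = second_mesh_link_loads(traffic)
print(all(load < link_capacity for row in loads for load in row))  # True
```

Each link into output j carries 1/N of the total traffic destined to j, so as long as no output is oversubscribed the load stays below the link rate R/N.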

13 Another way of thinking about it. Load balancing: the first stage load-balances incoming packets; the second stage is a cyclic shift. [Figure: external inputs 1..N, a load-balancing cyclic shift, internal inputs 1..N, a switching cyclic shift, external outputs 1..N.]

14 Load-Balanced Switch [Figure: external inputs 1..N, a load-balancing cyclic shift, internal inputs 1..N, a switching cyclic shift, external outputs 1..N.]

15

16 Outline of Chang's Proof

17 Outline Basic idea of load-balancing Packet mis-sequencing An optical switch fabric Scaling number of linecards Arbitrary arrangement of linecards

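18 Packet Reordering [Figure: two meshes in series, links at rate R/N, with packets 1 and 2.]

19 Bounding Delay Difference Between Middle Ports [Figure: two meshes in series, links at rate R/N, with packets 1 and 2.]

20 UFS (Uniform Frame Spreading) [Figure: two meshes in series, links at rate R/N, with packets 1 and 2.]

21 FOFF (Full Ordered Frames First) [Figure: two meshes in series, links at rate R/N, with packets 1 and 2.]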

22 FOFF (Full Ordered Frames First): Input Algorithm. N FIFO queues corresponding to the N output flows. Spread each flow uniformly: if the last packet was sent to middle port k, send the next to k+1. Every N time-slots, pick a flow: - If a full frame exists, pick it and spread it like UFS. - Else, if all frames are partial, pick one in round-robin order and send it.
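The input algorithm above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the class name, the separate round-robin pointers for full and partial frames, and the tuple format of the result are all hypothetical choices.

```python
from collections import deque


class FoffInput:
    """Sketch of one FOFF input with n per-output FIFO queues."""

    def __init__(self, n):
        self.n = n
        self.queues = [deque() for _ in range(n)]  # one FIFO per output flow
        self.next_port = [0] * n   # next middle port for each flow
        self.rr_full = 0           # round-robin pointer over full frames (assumption)
        self.rr_partial = 0        # round-robin pointer over partial frames

    def enqueue(self, output, packet):
        self.queues[output].append(packet)

    def _pick(self, start, eligible):
        """First eligible flow at or after `start`, in cyclic order."""
        for offset in range(self.n):
            flow = (start + offset) % self.n
            if eligible(flow):
                return flow
        return None

    def serve_frame(self):
        """Call once every n time-slots; returns (middle_port, output, packet) tuples."""
        n = self.n
        flow = self._pick(self.rr_full, lambda f: len(self.queues[f]) >= n)
        if flow is not None:
            self.rr_full = (flow + 1) % n
        else:
            flow = self._pick(self.rr_partial, lambda f: len(self.queues[f]) > 0)
            if flow is None:
                return []  # nothing queued at this input
            self.rr_partial = (flow + 1) % n
        sent = []
        for _ in range(min(n, len(self.queues[flow]))):
            port = self.next_port[flow]
            sent.append((port, flow, self.queues[flow].popleft()))
            self.next_port[flow] = (port + 1) % n  # last went to k, next goes to k+1
        return sent


inp = FoffInput(3)
for pkt in "ab":
    inp.enqueue(0, pkt)        # partial frame for output 0
for pkt in "pqrs":
    inp.enqueue(1, pkt)        # one full frame plus one packet for output 1
print(inp.serve_frame())       # [(0, 1, 'p'), (1, 1, 'q'), (2, 1, 'r')]
print(inp.serve_frame())       # [(0, 0, 'a'), (1, 0, 'b')]
print(inp.serve_frame())       # [(0, 1, 's')]
```

Keeping a per-flow `next_port` pointer is what makes a partial frame continue from where the flow's previous packet left off, so each flow is still spread uniformly over the middle ports.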

23 Bounding Reordering [Figure: two meshes in series, links at rate R/N, with packets 1, 2, and 3.]

24 FOFF Output Properties. N FIFO queues corresponding to the N middle ports. Buffer size less than N^2 packets. If there are N^2 packets, one of the head-of-line packets is in order. [Figure: an output with N FIFO queues.]

25 FOFF Properties Property 1: FOFF maintains packet order. Property 2: FOFF has O(1) complexity. Property 3: Congestion buffers operate independently. Property 4: FOFF maintains an average packet delay within a constant of that of an ideal output-queued router. Corollary: FOFF has 100% throughput for any adversarial traffic.

26 Output-Queued Router [Figure: inputs and outputs at rate R connected by a mesh whose link rates are marked "?".]

27 Outline Basic idea of load-balancing Packet mis-sequencing An optical switch fabric Scaling number of linecards Arbitrary arrangement of linecards

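28 From Two Meshes to One Mesh [Figure: two meshes in series with links at rate R/N; the "In" and "Out" sides of one linecard are highlighted.]

29 From Two Meshes to One Mesh [Figure: four linecards, each with In and Out at rate R, connected to a first mesh and a second mesh.]

30 From Two Meshes to One Mesh [Figure: the same four linecards connected to a single combined mesh at rate 2R per linecard, with R in and R out.]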

31 Many Fabric Options. Options: Space: full uniform mesh. Time: round-robin crossbar. Wavelength: static WDM. Any spreading device. [Figure: each linecard connects to channels C1, C2, ..., CN; N channels, each at rate 2R/N.]

32 AWGR (Arrayed Waveguide Grating Router). A passive optical component. Wavelength i on input port j goes to output port (i+j-1) mod N. Can shuffle information from different inputs. [Figure: linecards 1..N on each side of an NxN AWGR, each input carrying wavelengths 1, 2, ..., N.]
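The routing rule above can be checked with a few lines of code. This sketch uses 1-indexed wavelengths and ports as on the slide, and assumes that a result of 0 from the modulo is read as port N; that convention and the function name are assumptions.

```python
def awgr_output_port(wavelength, input_port, n):
    """Output port (1..n) reached by `wavelength` entering on `input_port`.

    Uses the slide's rule (i + j - 1) mod N.
    Assumption: a result of 0 is interpreted as port n.
    """
    port = (wavelength + input_port - 1) % n
    return port if port != 0 else n


n = 4
for input_port in range(1, n + 1):
    outputs = [awgr_output_port(w, input_port, n) for w in range(1, n + 1)]
    print(input_port, outputs)
# 1 [1, 2, 3, 4]
# 2 [2, 3, 4, 1]
# 3 [3, 4, 1, 2]
# 4 [4, 1, 2, 3]
```

Each input reaches every output on a distinct wavelength, and each wavelength defines a cyclic shift across the inputs, which is why a passive AWGR can play the role of the uniform mesh.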

33 Static WDM Switching: Packaging. AWGR: passive and almost zero power. N WDM channels, each at rate 2R/N. [Figure: four linecards around an AWGR; each input fiber carries A, B, C, D, and the output fibers carry A, A, A, A; B, B, B, B; C, C, C, C; D, D, D, D.]

34 Outline Basic idea of load-balancing Packet mis-sequencing An optical switch fabric Scaling number of linecards Arbitrary arrangement of linecards

35 Scaling Problem For N < 64, an AWGR is a good solution. We want N = 640. Need to decompose.

36 A Different Representation of the Mesh [Figure: four linecards, each with In and Out at rate R, connected to a mesh at rate 2R per linecard.]

37 A Different Representation of the Mesh [Figure: the same four linecards with the mesh drawn as individual links at rate 2R/N.]

Example: N= R/8

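39 When N is Too Large. Decompose into groups (or racks). [Figure: linecards at rate 2R grouped together, with links at rate 4R/4 between groups.]

40 When N is Too Large. Decompose into groups (or racks). [Figure: Group/Rack 1 through Group/Rack G, each with linecards 1, 2, ..., L at rate 2R; 2RL per group and 2RL/G between each pair of groups.]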

41 Outline Basic idea of load-balancing Packet mis-sequencing An optical switch fabric Scaling number of linecards Arbitrary arrangement of linecards

42 When Linecards Fail. Solution: replace mesh with sum of permutations. [Figure: Group/Rack 1 through Group/Rack G, each with linecards 1, 2, ..., L at rate 2R; 2RL per group and 2RL/G between each pair of groups; the mesh is written as a sum of permutations.]

43 Hybrid Electro-Optical Architecture Using MEMS Switches [Figure: Group/Rack 1 through Group/Rack G, each with linecards 1, 2, ..., L at rate 2R, interconnected through MEMS switches.]

44 When Linecards Fail [Figure: Group/Rack 1 through Group/Rack G, each with linecards 1, 2, ..., L at rate 2R, interconnected through MEMS switches.]

45 Fiber Link Capacity. Link capacity: 64 λs * 5 Gb/s/λ = 320 Gb/s = 2R. [Figure: groups/racks interconnected through MEMS switches; each link has a laser/modulator and a MUX.]

46 Example: 2 Groups of 2 Linecards [Figure: Group/Rack 1 and Group/Rack 2, each with linecards 1 and 2 at rate 2R and 4R per group.]

47 Number of MEMS Switches. Theorem: M = L + G - 1 MEMS switches are sufficient for bandwidth. Examples: G groups, L_i linecards in group i,

48 Packet Schedule [Figure: Group A and Group B, each with linecards 1 and 2 at rate 2R and 4R per group, on both the transmit and receive sides.]

49 Rules for Packet Schedule. At each time-slot: Each transmitting linecard sends one packet. Each receiving linecard receives one packet. (MEMS constraint) Each transmitting group i sends at most one packet to each receiving group j through each MEMS connecting them. In a schedule of N time-slots: Each transmitting linecard sends exactly one packet to each receiving linecard.

50 Packet Schedule
             T+1  T+2  T+3  T+4
Tx Group A
  Tx LC A1    ?    ?    ?    ?
  Tx LC A2    ?    ?    ?    ?
Tx Group B
  Tx LC B1    ?    ?    ?    ?
  Tx LC B2    ?    ?    ?    ?

51 Packet Schedule
             T+1  T+2  T+3  T+4
Tx Group A
  Tx LC A1    A1   A2   B1   B2
  Tx LC A2    B2   A1   A2   B1
Tx Group B
  Tx LC B1    B1   B2   A1   A2
  Tx LC B2    A2   B1   B2   A1

52 Bad Packet Schedule
             T+1  T+2  T+3  T+4
Tx Group A
  Tx LC A1    A1   A2   B1   B2
  Tx LC A2    B2   A1   A2   B1
Tx Group B
  Tx LC B1    B1   B2   A1   A2
  Tx LC B2    A2   B1   B2   A1

53 Group Schedule T+1T+2T+3T+4 Tx Group AAB Tx Group BAB

54 Good Packet Schedule
             T+1  T+2  T+3  T+4
Tx Group A
  Tx LC A1    A1   A2   B1   B2
  Tx LC A2    B2   B1   A2   A1
Tx Group B
  Tx LC B1    B1   B2   A1   A2
  Tx LC B2    A2   A1   B2   B1
Theorem: There exists a polynomial-time algorithm that finds the correct packet schedule.
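The difference between the bad and good schedules can be checked mechanically against the rules for a packet schedule. The sketch below is only a checker, not the polynomial-time algorithm from the theorem. It assumes that a linecard's group is the first letter of its name and that each pair of groups can exchange at most one packet per time-slot; that limit and the function name are assumptions for this 2-group example.

```python
from collections import Counter

# Receiving linecard for each transmitting linecard at T+1..T+4 (from the slides).
BAD = {
    "A1": ["A1", "A2", "B1", "B2"],
    "A2": ["B2", "A1", "A2", "B1"],
    "B1": ["B1", "B2", "A1", "A2"],
    "B2": ["A2", "B1", "B2", "A1"],
}
GOOD = {
    "A1": ["A1", "A2", "B1", "B2"],
    "A2": ["B2", "B1", "A2", "A1"],
    "B1": ["B1", "B2", "A1", "A2"],
    "B2": ["A2", "A1", "B2", "B1"],
}


def schedule_violations(schedule, per_group_pair_limit=1):
    """Return a list of (slot_index, reason); slot_index is None for whole-schedule rules."""
    txs = sorted(schedule)
    n_slots = len(next(iter(schedule.values())))
    problems = []
    for t in range(n_slots):
        receivers = [schedule[tx][t] for tx in txs]
        if len(set(receivers)) != len(receivers):
            problems.append((t, "a receiver gets more than one packet"))
        # Count packets per (transmitting group, receiving group) in this slot.
        pair_counts = Counter((tx[0], schedule[tx][t][0]) for tx in txs)
        if any(count > per_group_pair_limit for count in pair_counts.values()):
            problems.append((t, "group pair exceeds MEMS limit"))
    for tx in txs:
        if sorted(schedule[tx]) != txs:
            problems.append((None, tx + " does not reach every linecard exactly once"))
    return problems


print(schedule_violations(GOOD))  # []
print(schedule_violations(BAD))
# [(1, 'group pair exceeds MEMS limit'), (3, 'group pair exceeds MEMS limit')]
```

In the bad schedule both Group A linecards send within Group A at T+2 and both send to Group B at T+4, which is what the group-level constraint rules out.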

55 Outline Basic idea of load-balancing Packet mis-sequencing An optical switch fabric Scaling number of linecards Arbitrary arrangement of linecards