A Switch-Tagged Routing Methodology for PC Clusters with VLAN Ethernet 2011 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS Authors: Michihiro Koibuchi,

Slides:



Advertisements
Similar presentations
CCNA3 v3 Module 7 v3 CCNA 3 Module 7 JEOPARDY K. Martin.
Advertisements

University of Calgary – CPSC 441.  We need to break down big networks to sub-LANs  Limited amount of supportable traffic: on single LAN, all stations.
Larger Site Networks Part 1. 2 Small Site –Single-hub or Single- Switch Ethernet LANs Large Site –Multi-hub Ethernet LANs –Ethernet Switched Site Networks.
© 2008 Cisco Systems, Inc. All rights reserved.Cisco ConfidentialPresentation_ID 1 W. Schulte Chapter 5: Inter-VLAN Routing Routing And Switching.
© 2008 Cisco Systems, Inc. All rights reserved.Cisco ConfidentialPresentation_ID 1 Chapter 5: Inter-VLAN Routing Routing & Switching.
1 CCNA 3 v3.1 Module 7. 2 CCNA 3 Module 7 Spanning Tree Protocol (STP)
STP Spanning tree protocol. Trunk port : A trunk port is a port that is assigned to carry traffic for all the VLANs that are accessible by a specific.
1 Version 3 Module 8 Ethernet Switching. 2 Version 3 Ethernet Switching Ethernet is a shared media –One node can transmit data at a time More nodes increases.
1 Interconnecting LAN segments Repeaters Hubs Bridges Switches.
1 Lecture 25: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) Review session,
1 K. Salah Module 4.3: Repeaters, Bridges, & Switches Repeater Hub NIC Bridges Switches VLANs GbE.
1 25\10\2010 Unit-V Connecting LANs Unit – 5 Connecting DevicesConnecting Devices Backbone NetworksBackbone Networks Virtual LANsVirtual LANs.
(part 3).  Switches, also known as switching hubs, have become an increasingly important part of our networking today, because when working with hubs,
DataLink Layer1 Ethernet Technologies: 10Base2 10: 10Mbps; 2: 200 meters (actual is 185m) max distance between any two nodes without repeaters thin coaxial.
LOGO Local Area Network (LAN) Layer 2 Switching and Virtual LANs (VLANs) Local Area Network (LAN) Layer 2 Switching and Virtual LANs (VLANs) Chapter 6.
Layer 2 Switch  Layer 2 Switching is hardware based.  Uses the host's Media Access Control (MAC) address.  Uses Application Specific Integrated Circuits.
© 2008 Cisco Systems, Inc. All rights reserved.Cisco ConfidentialPresentation_ID 1 Chapter 2: LAN Redundancy Scaling Networks.
We will be covering VLANs this week. In addition we will do a practical involving setting up a router and how to create a VLAN.
© 2008 Cisco Systems, Inc. All rights reserved.Cisco ConfidentialPresentation_ID 1 Chapter 5: Inter-VLAN Routing Routing And Switching.
T. S. Eugene Ngeugeneng at cs.rice.edu Rice University1 COMP/ELEC 429 Introduction to Computer Networks Lecture 8: Bridging Slides used with permissions.
Chapter 4: Managing LAN Traffic
© 2008 Cisco Systems, Inc. All rights reserved.Cisco ConfidentialPresentation_ID 1 Chapter 2: LAN Redundancy Scaling Networks.
© 2008 Cisco Systems, Inc. All rights reserved.Cisco ConfidentialPresentation_ID 1 Lecture 12: LAN Redundancy Switched Networks Assistant Professor Pongpisit.
CCNA Guide to Cisco Networking Fundamentals Fourth Edition
1 CISCO NETWORKING ACADEMY PROGRAM (CNAP) SEMESTER 1/ MODULE 8 Ethernet Switching.
Saeed Darvish Pazoki – MCSE, CCNA Abstracted From: Cisco Press – ICND 2 – Chapter 2 Spanning tree Protocol 1.
CS 350 Chapter-11Switching. Switching Service Hardware-based bridging (ASIC: application-specific integrated circuits) Wire speed Low latency Low cost.
CN2668 Routers and Switches (V2) Kemtis Kunanuraksapong MSIS with Distinction MCTS, MCDST, MCP, A+
1/28/2010 Network Plus Network Device Review. Physical Layer Devices Repeater –Repeats all signals or bits from one port to the other –Can be used extend.
QoS Support in High-Speed, Wormhole Routing Networks Mario Gerla, B. Kannan, Bruce Kwan, Prasasth Palanti,Simon Walton.
S3C2 – LAN Switching Addressing LAN Problems. Congestion is Caused By Multitasking, Faster operating systems, More Web-based applications Client-Server.
Steffen/Stettler, , 4-SpanningTree.pptx 1 Computernetze 1 (CN1) 4 Spanning Tree Protokoll 802.1D-2004 Prof. Dr. Andreas Steffen Institute for.
1 Michihiro Koibuchi, Takafumi Watanabe, Atsushi Minamihata, Masahiro Nakao, Tomoyuki Hiroyasu, Hiroki Matsutani, and Hideharu Amano
Cisco 3 - LAN Perrine. J Page 110/20/2015 Chapter 8 VLAN VLAN: is a logical grouping grouped by: function department application VLAN configuration is.
High Performance Cluster Computing Architectures and Systems Hai Jin Internet and Cluster Computing Center.
LOGO Local Area Network (LAN) Layer 2 Switching and Virtual LANs (VLANs) Local Area Network (LAN) Layer 2 Switching and Virtual LANs (VLANs) Chapter 6.
Link Layer Review CS244A Winter 2008 March 7, 2008 Ben Nham.
OSI Model. Switches point to point bridges two types store & forward = entire frame received the decision made, and can handle frames with errors cut-through.
Computer Networks 15-1 Chapter 15. Connecting LANs, Backbone Networks, and Virtual LANs 15.1 Connecting devices 15.2 Backbone networks 15.3 Virtual LANs.
Michihiro Koibuchi(NII, Japan ) Tomohiro Otsuka(Keio U, Japan ) Hiroki Matsutani ( U of Tokyo, Japan ) Hideharu Amano ( Keio U/ NII, Japan ) An On/Off.
S7C5 – Spanning Tree Protocol And other topics. Switch Port Aggregation Bundling –Combining 2 to 8 links of FE (Fast Ethernet) or GE (Gigabit) Full duplex.
Sem1 - Module 8 Ethernet Switching. Shared media environments Shared media environment: –Occurs when multiple hosts have access to the same medium. –For.
Jose Miguel Montanana (NII, Japan) Michihiro Koibuchi (NII, Japan ) Hiroki Matsutani ( U of Tokyo, Japan ) Hideharu Amano ( Keio U/ NII, Japan ) Stabilizing.
Spanning Tree V1.2 Slide 1 of 1 Purpose:
STP LAN Redundancy Introduction Network redundancy is a key to maintaining network reliability. Multiple physical links between devices provide redundant.
Computer Science and Engineering Copyright by Hesham El-Rewini Advanced Computer Architecture CSE 8383 April 11, 2006 Session 23.
Switching Topic 2 VLANs.
McGraw-Hill©The McGraw-Hill Companies, Inc., 2004 Chapter 16 Connecting LANs, Backbone Networks, and Virtual LANs.
1 Version 3.0 Module 7 Spanning Tree Protocol. 2 Version 3.0 Redundancy Redundancy in a network is needed in case there is loss of connectivity in one.
CCNP 3: Chapter 3 Implementing Spanning Tree. Overview Basics of implementing STP Election of Root Bridge and Backup Enhancing STP RSTP MSTP EtherChannels.
BZUPAGES.COM Introduction to Cisco Devices Interfaces and modules –LAN interfaces (Fast Ethernet, Gigabit Ethernet) –WAN interfaces(Basic Rate Interface.
W&L Page 1 CCNA CCNA Training 2.5 Describe how VLANs create logically separate networks and the need for routing between them Jose Luis.
1 Chapter 3: Packet Switching (Switched LANs) Dr. Rocky K. C. Chang 23 February 2004.
4: DataLink Layer1 Hubs r Physical Layer devices: essentially repeaters operating at bit levels: repeat received bits on one interface to all other interfaces.
1 LAN switching and Bridges Relates to Lab Outline Interconnection devices Bridges/LAN switches vs. Routers Bridges Learning Bridges Transparent.
Chapter-5 STP. Introduction Examine a redundant design In a hierarchical design, redundancy is achieved at the distribution and core layers through additional.
Youngstown State University Cisco Regional Academy
Instructor Materials Chapter 3: STP
Spanning Tree Protocol
Chapter 4 Data Link Layer Switching
Chapter 5: Inter-VLAN Routing
Lecture#10: LAN Redundancy
Spanning Tree Protocol
Spanning Tree Protocol
LAN switching and Bridges
NT2640 Unit 9 Activity 1 Handout
Chapter 16 Connecting LANs, Backbone Networks, and Virtual LANs
LAN switching and Bridges
Chapter 15. Connecting Devices
Virtual LAN (VLAN).
Presentation transcript:

A Switch-Tagged Routing Methodology for PC Clusters with VLAN Ethernet 2011 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS Authors: Michihiro Koibuchi, Member, IEEE, Tomohiro Otsuka, Tomohiro Kudoh, Member, IEEE Computer Society, and Hideharu Amano, Member, IEEE Computer Society Speaker: Ming-Chao Hsu, National Cheng Kung University

Outline INTRODUCTION RELATED WORK SWITCH-TAGGED ROUTING METHODOLOGY FUNDAMENTAL EVALUATION EVALUATION USING PC CLUSTERS CONCLUSIONS

INTRODUCTION High-throughput commercial Ethernet switches are now available, and the link bandwidth of Ethernet has rapidly increased – The standardizations of 10-gigabit Ethernet (10 GbE). – As of November 2008, GbEs were employed as interconnects on 56 percent of the TOP500 supercomputers Recent PC clusters with Ethernet employ system software that supports low latency zero- or one-copy communication used in system area networks (SANs) Most current PC clusters using Ethernet have employed simple tree- based topologies. – To avoid broadcast storms which circulate packets forever in layer-2 Ethernet – IEEE 802.1D STP or 802.1D-2004 Rapid STP (RSTP)

INTRODUCTION IEEE 802.1Q tagged virtual LAN (VLAN) technology – For setting up multiple paths between a pair of switches on loop and mesh topologies. – The existing VLAN-based routing method cannot be easily applied to most of current PC clusters Message Passing Interface (MPI) communication libraries used in PC clusters – Usually do not support tagged VLAN technology. – Not much work exists on evaluating deadlock free routing algorithms in Ethernet, or their impact on real PC clusters with many Ethernet switches. Switch-tagged routing methodology for PC clusters – It simply configures a switch It disables the spanning-tree protocol (STP), allocates the VLAN sets, and optionally registers static MAC addresses of hosts for the routing. – MPI communication libraries do not need to use tagged VLAN technology – Various deadlock-free routing algorithms

RELATED WORK High-performance deterministic routing algorithms (Static MAC routing) – Statically registering MAC addresses of hosts without VLAN technology – It is difficult to stabilize the management of frames with such a configured Ethernet when a broadcast storm occurs. Routing implementation techniques using VLAN technology (VLAN Routing) – Partitioning hosts into multiple groups in the Internet backbone for the QoS control – The VLAN technology was not intended for increasing network throughput Multiple paths between hosts can be obtained by using VLANs( Multiple VLANs) – Each host having a different tree of the physical network – All pairs of hosts can communicate with each other via any VLAN tree topology, and there are multiple paths that consist of different link sets between each pair of hosts. Ethernet switches support IEEE 802.1D STP or 802.1D-2004 Rapid STP (RSTP) to prevent loops in a network – STP and RSTP are not aware of VLANs. – When these protocols are enabled, all links out of a spanning tree are automatically disabled.

SWITCH-TAGGED ROUTING METHODOLOGY Frame Tagging at Switch - A switch behavior of the VLAN tagging operation – When an untagged frame enters a port, it is tagged with a default VLAN ID tag number (port VLAN ID, PVID). – Frames leaving the switch are either tagged or untagged depending on the port’s VLAN configuration. If the port is a “tagged” member of a VLAN, the output frame is tagged with the respective VLAN ID. If the port is an “untagged” member of a VLAN, the output frame is left untagged. Fixed VLAN assignment for the dimension-order routing (DOR) in 4 x 4 2D mesh

SWITCH-TAGGED ROUTING METHODOLOGY Renamed VLAN assignment of fat tree. – (a) Paths in fat tree

SWITCH-TAGGED ROUTING METHODOLOGY Renamed VLAN assignment of fat tree. – (b) Renamed VLAN assignment of switch s1

SWITCH-TAGGED ROUTING METHODOLOGY MAC Address Management at switches – Ethernet switches usually learn unknown MAC addresses when they receive frames. – When a path from host A to B and one from B to A use different VLANs, the intermediate switches of both paths cannot learn the destination MAC address. This is because the MAC address self-learning procedure is independently performed on each VLAN. This problem can be resolved through static MAC address registration. – Ethernet switches statically register pairs of MAC addresses, VLAN IDs, and output port numbers.

SWITCH-TAGGED ROUTING METHODOLOGY On/Off and Multispeed Link Regulation for Saving Power – The power consumption of links can be reduced by using the port-shutdown operation available in most commercial Ethernet switches. – Their operation was not originally intended to reduce power consumption; it is normally used to block the injection of unexpected frames from neighboring switches. – Standard management information base (MIB) + simple network management protocol (SNMP). Power Consumption of GbE Switches (W) – The “All except ports” is the power consumption of switches when all the ports are shutdown, – “Max (port ratio)” is the power consumption when all ports are activated with 1 Gbps. – “PC5324,”“PC6224,” and “PC6248” stand for Dell PowerConnect 5324(layer 2), 6224, and 6248 (layer 3) switches – “SF-420” is Planex Communications SF-0420G

FUNDAMENTAL EVALUATION Fixed VLAN assignment in fat tree (2,4,2) – switch has u upper links and d lower links, and the number of its layers is r. (u, d, r) Renamed VLAN assignment

FUNDAMENTAL EVALUATION Overhead of VLAN Operations – Dell PowerConnect 5324, 6224(layer-3), Netgear GSM7212, and Planex SF- 0420G – ICMP message – The tagged VLAN operation does not affect the latency at these switches. Latency of Switch (in Microseconds)

FUNDAMENTAL EVALUATION Overhead of VLAN Operations – Tperf 1.5 – The bandwidth of the tagged frame transfer (T-T and T-U) is thus slightly lower than that of untagged frame transfer in all switches – Renamed VLAN assignment, (U-(T)-U), has no bandwidth overhead, since it transfers untagged frames between switches. Bandwidth of Switch (in Megabits Per Second)

FUNDAMENTAL EVALUATION Topology considered in evaluations – The “fixed path” represents frame transfers using VLAN A, – The “dynamic path” represents frame transfers using VLANs A and B that are dynamically changed at every a few seconds. – The frame bandwidth is measured at every 100 mseconds (GtrcNET-1.) – The performance of latency and throughput was unaffected even though path set is switched at every a few seconds. Performance of Path Modification (Testbed)

EVALUATION USING PC CLUSTERS Three types of PC clusters – SCore cluster - Testbed SCore is an open-source cluster system software package that provides various parallel programming environments – Misc cluster- 66-host PC cluster using six GbE switches (Dell PowerConnect 6248, 48 ports) Misc Each switch in the Misc cluster connects to 11 hosts, – SuperNova cluster- 225-host PC cluster using the eight same GbE switches. Each switch in the SuperNova cluster connects to 28 or 29 hosts SuperNova Cluster Testbed Misc Cluster The IEEE 802.3x link-level flow control was enabled at every port

EVALUATION USING PC CLUSTERS Evaluated indirect topologies – (a) Fat tree (2,4,2) – (b) Myri-clos(4x4). Evaluated Topologies in Testbed “Avg.H” represents the average number of switches that compose a path.

EVALUATION USING PC CLUSTERS The number of hosts was 16 in all topologies on the testbed. – maximum UDP datagram size, 1,470 bytes – UDP transfer of Tperf-1.5 – flat 1-switch network (flat, nonblocking full crossbar) in which all 16 hosts were connected to a single switch it is hardly possible to employ such a topology in a large-scale cluster with thousands of hosts. – M-tree Single tree-based topology with no VLANs Network throughput – bit-reversal traffic a host with the identifier (a0,a1,…an-1) sends a packet to the host whose identifier is the bit reversal (an-1,…,a1,a0) of the source host. – matrix transpose traffic a host (x; y) sends a packet to the host (k-y-1; k-x-1) (k is the number of hosts in each dimension) or (k-x-1; k-y-1) when x + y =k-1.

EVALUATION USING PC CLUSTERS NAS Parallel Benchmarks(NPBs) – The unit of performance is the execution time (second) – The number of processes for parallel execution was fixed to 16 in the case of the PC-cluster testbed – the SuperNova cluster used 128 processes in CG, FT, IS, LU, and MG – And 225 processes (the maximum size) in SP and BT. – the performance of M-tree is especially degraded on FT and IS. It is known that FT and IS frequently perform the MPI_Alltoall function, and thus, require a large bisection bandwidth. Integer Sort (IS), Fast Fourier Transformation (FT), Multigrid (MG), Conjugate Gradient (CG), Lower-Upper (LU) diagonal, Scalar Pentadiagonal (SP), Block Tridiagonal (BT), Embarrassingly Parallel (EP)

EVALUATION USING PC CLUSTERS NAS Parallel Benchmarks(NPBs) – “Tree (6link)” stands for the tree topology that uses six links between switches using link aggregation – “Compl(2link)” stands for a completely connected topology that uses two links between switches

EVALUATION USING PC CLUSTERS NAS Parallel Benchmarks(NPBs) – the proposed methodology improve the performance by up to 650 percent – Torus topology with three links between switches, that uses 36 links in total, achieves up to a 27 percent performance improvement compared with the tree topology with six links that uses 42 links in total.

EVALUATION USING PC CLUSTERS The simple static on/off link selection algorithm selects the deactivated links as follows: – 1) Estimate the amount of the application traffic in each channel when all the links are activated. The maximum amount of traffic on a single channel is calculated. – 2) Reduce the number of links between switches under the condition that the amount of the application traffic on every channel is less than the maximum amount of traffic on a channel calculated in step 1. SuperNova cluster, 128 processes the power consumption of all switches is reduced by up to 20 percent

EVALUATION USING PC CLUSTERS Impact of On/Off Link Regulation – the power consumption of all switches is reduced by up to 19 percent Misc cluster

EVALUATION USING PC CLUSTERS Impact of On/Off Link Regulation – the power consumption can be reduced by up to 25 percent in the case of PowerConnect 5324s Power consumption of on/off link regulation (SuperNova cluster, 64 processes, PC5324).

CONCLUSIONS Proposed a switch-tagged routing methodology – Implement the routing algorithms on PC clusters with Ethernet – Simple host configuration and high portability Evaluation results – NAS parallel benchmarks performance comparable to that of an ideal 1-switch (full crossbar) network, – The torus topology achieves up to a 27 percent performance improvement compared with the tree topology using link aggregation. – The on/off and multispeed link regulation reduces the power consumption of switches by up to 25 percent