1 CS294 Project Virtual and Redundant Switches IRAM Retreat – Winter 2001 Sam Williams.

2 CS294 Project Outline
Motivation; Existing Products; Arrayed Commodity Switches; Adding Redundancy; Optimizing; Generalization; Conclusions

3 CS294 Project Motivation
The cost of switches grows very quickly: O(Ports^2) for crossbar-based designs. Address tables and buffers must grow as well. The industry-leading MTBF for a single switch is about 50K hours, and a typical figure is perhaps only 25K hours. Modular switches provide redundancy for management and power, but not for the data transport fabric. MTTR is typically over 1 hour. Can the money saved by cascading commodity switches be applied towards improved performance or redundancy? The goals are to improve the MTBF, improve performance, and simplify the work needed to replace a failed switch.
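The quadratic growth mentioned above can be illustrated with a crosspoint count. This is only a sketch: the 32-port size, the 8-node-ports-per-leaf layout, and the function names are illustrative assumptions, and crosspoints are a rough stand-in for cost (address tables, buffers, and packaging are ignored).

```python
def crossbar_crosspoints(ports: int) -> int:
    """An n-port crossbar needs one crosspoint per (input, output) pair."""
    return ports * ports


def cascade_crosspoints(leaves: int, node_ports_per_leaf: int,
                        uplinks_per_leaf: int = 1) -> int:
    """Crosspoints in a two-level cascade: small leaf switches plus one top switch.

    Hypothetical layout: every leaf spends `uplinks_per_leaf` ports on uplinks,
    and the top switch has exactly one port per uplink.
    """
    leaf_ports = node_ports_per_leaf + uplinks_per_leaf
    top_ports = leaves * uplinks_per_leaf
    return leaves * crossbar_crosspoints(leaf_ports) + crossbar_crosspoints(top_ports)


# Illustrative 32-port network: one monolithic crossbar vs. 4 leaves of 8 node ports.
monolithic = crossbar_crosspoints(32)
cascaded = cascade_crosspoints(leaves=4, node_ports_per_leaf=8)
print(monolithic, cascaded)
```

The cascade uses far fewer crosspoints, but it is no longer non-blocking: traffic between leaves has to share the uplinks, which is the performance trade-off the later slides examine.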

4 CS294 Project Existing Products
Existing modular aggregators can merge several smaller switches (modules) into a single large virtual switch. In this case, each 36-port switch module has a pair of gigabit uplinks to the switching fabric, which has either 6 or 24 gigabit ports (full duplex). Redundancy is also provided for management modules, fans, and power supplies, but not for the switching modules or the switching fabric. So if the switching fabric fails, the entire device fails, but if an individual switching module fails, only that sub-network fails. Management modules can infer priority to improve performance for critical activity.
Figure labels: 3Com Switch 4007; management module; 4 x 36-port switching modules, each with 2 gigabit uplinks; 120 Gbps backplane (16 used); logical view; switching fabric with 24 internal gigabit ports.

5 CS294 Project Existing Products (Analysis)
The cost analysis here is based on the use of either 18 or 48 Gbps switching fabrics, 36-port switching modules, and either a 7- or 13-bay chassis. Performance is measured as the slowdown in the time to send from every node to every other node, compared to a true n*36 port switch. MTBF is for a failure of any part of the network. MTTR was at least 1 hour. Repair cost is about $4000 per failure. Modularization helps to keep this low, but yearly maintenance cost will grow with the number of ports.

6 CS294 Project Examples of failure
Switching module fails: each of the nodes/sub-networks attached to it is now disconnected from all other nodes. (More likely case.)
Switching fabric fails: each of the switches is now disconnected from the others, but nodes attached to the same switch can still communicate with each other.

7 CS294 Project Examples of failure (continued)
Redundancy allows for this failure, with reduced performance. These are not commodity switches, and they are considerably more expensive. However, in this case, the failure does cause a network split. This is the more likely case, so why not allow the extra switch to be used to cover any other switch's failure? This could be extended to nodes, but then you pay double for NICs and ports.

8 CS294 Project Virtual switch from commodity switches
Cheaper virtual switches can be built by nothing more than cascading commodity switches, although without the management functions and with lower performance. This analysis is based on 5, 8, 16, and 24 port switches, each with the last port of MDI type, from 5 different companies. Performance is poor since the uplinks are only 100 Mbps. Adding a second uplink port only moderately alleviates this deficiency.

9 CS294 Project Virtual switch from mid-range switches
By using switches more suited to this design (with higher-speed uplinks), we can improve performance. These designs use 8 or 24 port switches at the bottom, each with 1 or 2 gigabit uplink modules, and a 4, 8, or 12 port gigabit switch at the top. The gigabit uplinks and gigabit switches drive the cost to at least twice that of the commodity solution, but with 10x better performance. Performance is near that of a monolithic switch if 2 uplinks are used. Compared to the packaged solution, it is about half the cost with slightly lower performance, but with no management functionality.

10 CS294 Project Port Virtualization for Redundancy
The re-mapping stage is much simpler than a full n*m port switch. Essentially, each of the m n-bit buses is mapped to one of the k n-bit internal buses, which are connected directly to the switches. In this example, each of the 4 groups of 8 virtual ports is mapped to one of the 5 groups of physical ports. The uplinks of the first-stage switches are sent back through the re-mapping stage and into one of the top-level switches. An even simpler solution, for single redundancy, would be to map each group either directly or to the spare. In this design the single point of failure is the re-mapping block, since the first- and second-level switches have redundancy. For this example, MTBF is improved by about 50% (from 208 days to 347 days).
Figure labels: port re-mapping; extra switches for redundancy.
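The slide does not say how its MTBF figures were derived. As a hedged worked example, the sketch below uses the standard series-system rule (failure rates add when any one failure breaks the system) under assumptions that are not stated in the slides: independent failures with constant rates, the "typical" 25K-hour MTBF from the Motivation slide for every switch, and five switches in the non-redundant design (four first-stage plus one top-level). Under those assumptions the result lands on the 208-day baseline; the 347-day figure is not derived here.

```python
HOURS_PER_DAY = 24


def series_mtbf_hours(mtbfs_hours):
    """MTBF of a system that fails when any one component fails.

    Assumes independent components with constant failure rates,
    so the system failure rate is the sum of the component rates.
    """
    return 1.0 / sum(1.0 / m for m in mtbfs_hours)


# Assumption: 4 first-stage switches + 1 top-level switch, 25K hours each.
switches = [25_000] * 5
baseline_days = series_mtbf_hours(switches) / HOURS_PER_DAY
print(round(baseline_days))
```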

11 CS294 Project Operation (Homogeneous switches)
In this somewhat rigid example, there are 6 bays: 4 are mapped either directly or to the spare, one is the switching fabric slot, and one is a slot for the redundant switch, which can replace either of the other two classes. In this case, the switching fabric switch failed, and the uplink ports were remapped to the spare. At this point the admin must replace the failed switch. If any other switch fails before this, the network will be partially split.

12 CS294 Project Operation - continued
In this case, one of the first-level switches failed. Instead of those nodes losing their connection to the rest of the network, they are remapped to the spare. Once again, the admin must replace the failed switch. If any other switch fails before this, the network will be partially split. If instead the spare had gone down, it would need to be replaced to restore redundancy.
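The failover behaviour described on the two Operation slides can be sketched as a small mapping table. This is a software model only, written under assumptions: the class name `PortRemapper`, its methods, and the integer switch IDs are hypothetical, and the real re-mapping block would be hardware steering whole port groups rather than a dictionary.

```python
class PortRemapper:
    """Maps each virtual port group to a physical switch, with spares held back."""

    def __init__(self, virtual_groups: int, physical_switches: int):
        # Group v starts on physical switch v; the remaining switches are spares.
        self.mapping = {v: v for v in range(virtual_groups)}
        self.spares = list(range(virtual_groups, physical_switches))

    def fail(self, switch: int) -> bool:
        """Record a failed switch; return True if every group is still served."""
        if switch in self.spares:
            # Only redundancy is lost; no group was using this switch.
            self.spares.remove(switch)
            return True
        for group, physical in self.mapping.items():
            if physical == switch:
                if not self.spares:
                    return False  # no spare left: the network is partially split
                self.mapping[group] = self.spares.pop(0)
                return True
        return True  # switch was not in use


remapper = PortRemapper(virtual_groups=4, physical_switches=5)
print(remapper.fail(2), remapper.mapping)  # group 2 moves to the spare
print(remapper.fail(0))                    # second failure before repair
```

With one spare, the first failure is absorbed and the second one splits the network, matching the "if any other switch fails before this" warning on the slides.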

13 CS294 Project Port Virtualization for Higher Performance
The previous performance analysis was based on "1-to-all" messaging. However, it is likely that network access patterns can be broken into groups with high inter-node communication. Thus monitoring can be performed, and the network can be periodically partitioned into activity groups. Create a graph based on the bandwidth used between nodes, then use something like Kernighan-Lin partitioning to separate it into a number of partitions equal to the number of first-stage switches (a power of 2). The re-mapping stage is only slightly simpler than a full n*m port switch (no buffers, never any contention, etc.).
Figure labels: logical view; 3 switches reserved as spares; 1 failed, and the network was repartitioned.
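The partitioning step can be sketched as follows. Note the swapped-in technique: this is a plain greedy pairwise-swap refinement, which is simpler than Kernighan-Lin proper (that algorithm evaluates gain-ordered sequences of tentative swaps with locking). The traffic figures, node names, and function names are made up for illustration. Swapping nodes, rather than moving them, keeps the number of nodes per switch fixed, which matches a fixed port count per first-stage switch.

```python
from itertools import combinations


def cut_weight(traffic, group_of):
    """Total bandwidth that has to cross between different switches."""
    return sum(w for (a, b), w in traffic.items() if group_of[a] != group_of[b])


def refine_partition(traffic, group_of):
    """Swap pairs of nodes between groups while that lowers inter-switch traffic."""
    group_of = dict(group_of)  # work on a copy
    improved = True
    while improved:
        improved = False
        best = cut_weight(traffic, group_of)
        for a, b in combinations(sorted(group_of), 2):
            if group_of[a] == group_of[b]:
                continue
            group_of[a], group_of[b] = group_of[b], group_of[a]
            cost = cut_weight(traffic, group_of)
            if cost < best:
                best = cost
                improved = True
            else:
                group_of[a], group_of[b] = group_of[b], group_of[a]  # undo the swap
    return group_of


# Hypothetical measured bandwidth between node pairs (arbitrary units).
traffic = {("A", "B"): 10, ("C", "D"): 10, ("A", "C"): 1}
initial = {"A": 0, "C": 0, "B": 1, "D": 1}  # heavy talkers start on different switches
result = refine_partition(traffic, initial)
print(cut_weight(traffic, initial), cut_weight(traffic, result))
```

Each accepted swap strictly lowers the cut weight, so the loop terminates, though only at a local optimum. For more than two first-stage switches, the same refinement works unchanged because it only compares group labels.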

14 CS294 Project Performance / Availability
MTTR for aggregators was typically over an hour. This is on top of the time to detect the failure. By automating recovery, the downtime can be significantly reduced. This depends on timely detection of a failed switch, which could be handled via packet injection. Once the failing switch is identified, a new mapping can quickly be determined. For the performance-optimizing case, restoring connectivity is the top priority; a previously scheduled performance repartition can be done later.
Figure: three timelines (axes: perf vs. time). (1) Hard fail, fail detected, switches have adapted, repartition for performance, switches have adapted. (2) Hard fail, fail detected, switches have adapted. (3) Hard fail, admin notices and fixes failure, switches have adapted.
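To give a rough sense of what automated recovery buys, the sketch below applies the standard steady-state availability formula, MTBF / (MTBF + downtime per failure). The 25K-hour MTBF and the 1-hour manual repair time come from the slides; the 1-second automated detect-and-remap time is purely a hypothetical placeholder, since the slides give no figure for it.

```python
HOURS_PER_YEAR = 24 * 365


def availability(mtbf_hours: float, downtime_hours: float) -> float:
    """Steady-state availability: fraction of time the network is up."""
    return mtbf_hours / (mtbf_hours + downtime_hours)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected unavailable minutes per year for a given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR * 60


MTBF_HOURS = 25_000            # "typical" switch MTBF from the Motivation slide
MANUAL_REPAIR_HOURS = 1.0      # MTTR of at least 1 hour, from the slides
AUTO_REMAP_HOURS = 1.0 / 3600  # assumption: 1 second to detect and remap

manual = availability(MTBF_HOURS, MANUAL_REPAIR_HOURS)
automated = availability(MTBF_HOURS, AUTO_REMAP_HOURS)
print(downtime_minutes_per_year(manual), downtime_minutes_per_year(automated))
```

In the automated case the failed switch still has to be replaced, but the network keeps running on the spare while that happens, so only the detect-and-remap interval counts as downtime.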

15 CS294 Project Generalization
Use homogeneous switches. A mapping layer maps physical to virtual ports. This can range from a simple 1-to-2 mapping to a complex 1-to-n mapping with performance monitoring and repartitioning. Performance can be gained by using some faster switches where needed.
Figure labels: extra switches for redundancy or extra performance; monitor and port re-mapping.

#  Description                                               Fails tolerated  Performance  Cost
0  Array of switches                                         0                Low          N switches
1  Single redundancy                                         1                Low          N switches, trivial mapper
2  R-way redundancy                                          R                Low          N switches + R + general mapper
3  Array of switches with partitioning                       0                Adaptive     N switches + expensive mapper
4  R-way redundancy with partitioning                        R                Adaptive     N switches + R + expensive mapper
5  R-way redundancy with partitioning and total utilization  R                Adaptive     N switches + R + expensive mapper

16 CS294 Project Conclusion
It is possible to make a larger virtual switch out of smaller switches and still get reasonable performance. With a little additional hardware and a monitoring agent, it is possible to make it fault tolerant, with several spare switches which can be automatically swapped in. In the simple case the cost is about O(Spares * Ports); more complex designs make it O(Ports^2). With a very simple but large switch, it is also possible to optimize for performance by balancing network bandwidth among the switches in the pool. This is a much more costly solution. A generalization would provide a pool of switches connected by the port mapper, with some or none reserved as spares. Both of these concepts and their functionality could be integrated into a single ASIC, or even implemented using a network processor.

17 CS294 Project Future Work
How do switches fail? This determines the failure detection method. Implementing a type 1 or type 2 switch would be possible given the relatively simple mapper. Types 3, 4, and 5 would require a complex ASIC, which should be replaced with a network processor and software.