Mendosus A SAN-Based Fault Injection Test-Bed for Construction of Highly Available Network Services Xiaoyan Li, Richard Martin, Kiran Nagaraja, Thu D.


Mendosus A SAN-Based Fault Injection Test-Bed for Construction of Highly Available Network Services Xiaoyan Li, Richard Martin, Kiran Nagaraja, Thu D. Nguyen and Bin Zhang Dept. of Computer Science, Rutgers University

Talk Outline
• Motivation
• Design
• Implementation
• Benchmarks
• Case Studies
• Related Work
• Future Work

Motivation
• Ubiquitous network access
  • Exponential growth in network services
• Availability is one key challenge
  • Networked systems comprise large numbers of heterogeneous components
  • Faults are not uncommon
  • Complex interaction between components
  • Examples of costly failures: eBay, Britannica
• Currently difficult to assess service availability
  • How to analyze impact of failures?
  • How to set up an appropriate test-bed?

Mendosus
• Goal: provide infrastructure for service designers to assess the availability of network services
• Overview:
  • Provide flexible infrastructure to accurately model a variety of different networking systems from the application's point of view
  • Run application in real time and inject faults to assess application's behavior
• Two key components:
  • Real-time emulation of a variety of interconnects
  • General fault injection infrastructure

Vision
• Map available resources to emulated network

Design

Mendosus Architecture
[Figure: applications and the Mendosus daemon at user level; emulator module in the kernel with Routing, Fault Inclusion and Latency stages plus network state; central controller sending events; fast and reliable SAN underneath]

Design Decisions
• Central controller
  • Advantage: consistent network and fault information
  • Disadvantage: limits scalability
  • Not involved in network emulation so should still scale well to targeted system sizes (thousands or tens of thousands of components)
• Entire network state is maintained at each end node
  • Advantage: performance
  • Disadvantage: limits scalability
  • Only maintain state for LAN
• Emulation module embedded within kernel
  • Advantage: no modifications to application code
  • Disadvantage: more difficult to modify and extend

Functional Components
• Topology Maintenance
• Fault Injection
• Emulation

Topology Maintenance
• Specification: simple ns-2-like topology scripts
  • Specify available resources
• Central controller manages topology
  • Initializes original topology on each node
  • Consistent view
• Real-time topology changes
  • Specified as scripted events
• Controller monitors network connectivity
  • Detects partitions

Fault Injection
• Every network component can have a fault profile
  • Switches, hubs, NICs, links, end nodes
• Fault specification:
  • Trace files or theoretical distributions
  • Exponential, Weibull, constant
• Simulate fail-stop components
  • MTTR: constant or follows a distribution
  • E.g. unplugging, port shutdown
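A fault profile of this kind can be sketched as a generator of fail and repair times for one fail-stop component. This is only an illustration, not Mendosus code: the function names, the dictionary-based profile format and the use of seconds are assumptions made for the example.

```python
import random

def sample_duration(rng, dist, **params):
    """Draw one duration (seconds) from a named distribution (hypothetical format)."""
    if dist == "constant":
        return params["value"]
    if dist == "exponential":
        # expovariate takes the rate, i.e. 1 / mean
        return rng.expovariate(1.0 / params["mean"])
    if dist == "weibull":
        # weibullvariate(alpha, beta): alpha is the scale, beta the shape
        return rng.weibullvariate(params["scale"], params["shape"])
    raise ValueError(f"unknown distribution: {dist}")

def fault_schedule(rng, horizon, ttf, ttr):
    """Return [(fail_time, repair_time), ...] for one fail-stop component.

    ttf describes time-to-failure, ttr time-to-repair (the MTTR side).
    Repairs that would end after the horizon are clipped to it.
    """
    events = []
    now = 0.0
    while True:
        now += sample_duration(rng, **ttf)
        if now >= horizon:
            return events
        repair = now + sample_duration(rng, **ttr)
        events.append((now, min(repair, horizon)))
        now = repair

# Example: a link that fails on average once an hour and takes 30 s to repair.
rng = random.Random(42)
link_faults = fault_schedule(
    rng, horizon=24 * 3600,
    ttf={"dist": "exponential", "mean": 3600.0},
    ttr={"dist": "constant", "value": 30.0},
)
```

A trace file would replace `sample_duration` with a reader that returns recorded intervals; the schedule loop would stay the same.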

Emulation
• Completely distributed
  • Every node has enough network state
• Emulation messaging sequence
  • Application initiates communication
  • Routing: determine route
  • Fault Inclusion: effect of injected faults
  • Latency: corresponding to route taken
• We do not implement the innards of network components
  • Switching
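The messaging sequence can be pictured as a three-stage pipeline applied to each outgoing message. The sketch below is a user-level toy under several assumptions: the real module sits in the kernel, the route search here is a plain breadth-first search, only node-like components (hosts, NICs, switches) can fail, and latency is a fixed per-hop cost. All names are hypothetical.

```python
from collections import deque

def find_route(links, src, dst):
    """Breadth-first search; returns the node list from src to dst, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(links.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

def emulate_send(links, failed, hop_latency_us, src, dst):
    """Return ("delivered", latency_us) or ("dropped", reason)."""
    route = find_route(links, src, dst)        # 1. Routing
    if route is None:
        return ("dropped", "no route")
    for component in route:                    # 2. Fault inclusion
        if component in failed:
            return ("dropped", f"{component} failed")
    hops = len(route) - 1                      # 3. Latency for the route taken
    return ("delivered", hops * hop_latency_us)

# Two hosts attached to one switch.
links = {"h1": {"s1"}, "s1": {"h1", "h2"}, "h2": {"s1"}}
ok = emulate_send(links, set(), 50, "h1", "h2")
down = emulate_send(links, {"s1"}, 50, "h1", "h2")
```

Because every node holds the full network state, each stage is a local lookup; no other node or the controller is consulted on the send path.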

Implementation

Ethernet LAN Emulation
• Routing
  • Emulate computation of Ethernet spanning tree
  • Controller chooses root of tree
  • Emulator on each node computes identical spanning tree
  • Reconfiguration performed periodically (every 2 secs)
• Broadcast & Multicast
  • Emulate using sequence of unicasts
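One way to get an identical tree on every node is to make the computation a deterministic function of the shared topology and the controller-chosen root. The sketch below uses a breadth-first search with sorted neighbor order as a stand-in; the slides do not say which algorithm Mendosus uses, and this is not the IEEE 802.1D protocol. Function names and the callback are assumptions.

```python
from collections import deque

def spanning_tree(links, root):
    """Deterministic BFS tree: maps each reachable node to its parent."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Sorted iteration gives the same tree on every node that
        # starts from the same topology and the same root.
        for nxt in sorted(links[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return parent

def broadcast(tree, hosts, sender, send_unicast):
    """Emulate a broadcast as one unicast per reachable host."""
    for host in sorted(hosts):
        if host != sender and host in tree:
            send_unicast(sender, host)

# Three switches in a triangle (one redundant link), one host per switch.
links = {
    "s1": {"s2", "s3", "a"}, "s2": {"s1", "s3", "b"}, "s3": {"s1", "s2", "c"},
    "a": {"s1"}, "b": {"s2"}, "c": {"s3"},
}
tree = spanning_tree(links, "s1")
sent = []
broadcast(tree, {"a", "b", "c"}, "a", lambda s, d: sent.append((s, d)))
```

In this example the s2-s3 link is left out of the tree, which is the loop-breaking effect a spanning tree is meant to have. The periodic reconfiguration would amount to rerunning `spanning_tree` on the current topology.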

Ethernet LAN Emulation - Faults
• Network partitions
  • Controller monitors connectivity
  • Multiple roots: one for each partition
• NIC fail-over
  • Multiple interfaces using IP aliasing support in Linux
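Partition handling can be sketched as a connected-components pass over the links that are still up, followed by picking one root per component. This is a hedged illustration: how the controller actually detects partitions and selects roots is not stated in the slides, and choosing the lowest identifier is an arbitrary rule used here only to make the result deterministic.

```python
def partitions(nodes, links, failed_links):
    """Group nodes into connected components, ignoring failed links."""
    adj = {n: set() for n in nodes}
    for a, b in links:
        if (a, b) not in failed_links and (b, a) not in failed_links:
            adj[a].add(b)
            adj[b].add(a)
    seen, parts = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:                      # iterative depth-first search
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        parts.append(comp)
    return parts

def choose_roots(parts):
    """Pick one spanning-tree root per partition (lowest ID, an assumption)."""
    return [min(p) for p in parts]

# A chain of four switches; failing the middle link splits it in two.
nodes = {"s1", "s2", "s3", "s4"}
links = [("s1", "s2"), ("s2", "s3"), ("s3", "s4")]
parts = partitions(nodes, links, failed_links={("s2", "s3")})
roots = choose_roots(parts)
```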

Emulation completeness…
Feature | Ethernet | Emulated Ethernet
P-to-P | Yes |
Broadcast | Hardware | Software (multiple unicast)
Multicast | Hardware | Software (broadcast w/ filters)
Layer 3, 4 services (e.g. VLAN, IGMP) | Some advanced switches | Not implemented

Micro-benchmarks

Emulation Limits
[Table: throughput (MB/sec) and RTT (usec) against number of switches in the topology, for Fast Ethernet, Gigabit Ethernet and the emulator; values not preserved in the transcript]

Software Broadcast Scaling

Fault View Convergence

Case Studies

Group Membership
• Test protocol behavior under faults
  • Subtle interactions in distributed protocols
• Three Round Membership algorithm
  • Robust against multiple node failures, packet drops and network partitions
  • Two modes of operation: normal and FCM

Membership Observations
[Figure: nodes A, B, C and D, with link L]
Fault sequence:
1. NIC failure at B
2. Link L down
3. NIC at B recovers
4. Packet drops at A
5. Link L up

Multi-Level Switched Network
• Large enterprise LANs have multiple layers of network components
  • Access, core and aggregation switches
• How to evaluate availability vs. cost vs. complexity?
• Study service availability with increased redundancy
  • Faults following exponential distributions
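For intuition about why redundancy helps, the standard steady-state formulas give a rough analytic baseline. This is not the method of the case study, which measures availability by running the service under emulated faults. The sketch assumes independent failures, availability = MTTF / (MTTF + MTTR) per component, and a layered path in which each layer is up if at least one of its replicas is up. All numbers are made up.

```python
def availability(mttf, mttr):
    """Steady-state availability of one component."""
    return mttf / (mttf + mttr)

def replicated(a, n):
    """Availability of n independent replicas where one working replica suffices."""
    return 1.0 - (1.0 - a) ** n

def in_series(*layers):
    """Availability of a path that needs every layer to be up."""
    result = 1.0
    for a in layers:
        result *= a
    return result

# Hypothetical switch: fails every 99 hours on average, 1 hour to repair.
switch = availability(mttf=99.0, mttr=1.0)

# Access -> aggregation -> core, first with one switch per layer,
# then with two aggregation and two core switches.
single = in_series(switch, switch, switch)
redundant = in_series(switch, replicated(switch, 2), replicated(switch, 2))
```

Under these assumptions the unreplicated access layer dominates the redundant configuration, which is one reason an emulation study of the whole service is more informative than the formula alone.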

Enterprise LAN

Availability Vs Redundancy

Related Work
• Network emulation
  • Distributed emulation
    • Emulab [Utah], DelayLine
  • Centralized emulation
    • NISTNET, Lancaster emulator
• Fault injection
  • Script-based probing and fault injection
    • Orchestra, DOCTOR
  • Correlated faults
    • Loki [UIUC]
• Simulation
  • NS-2, REAL [Cornell], SSFNet, x-sim [Arizona]

Future Work
• Extend Mendosus to emulate other networks
  • WAN: build in performance dynamics model
  • Wireless LAN: realistic fault and performance models
• Support pluggable modules within network components which add functionality and additional failures
  • Intelligent routing protocols (e.g. HSRP)
  • Dynamic DNS, RR DNS

Summary
• Test-bed for service designers to systematically analyze network and protocol design against failures
• Results show that real-time emulation is feasible given capability of current SAN networks
• Demonstrated the flexibility and usefulness of Mendosus through 2 case studies
• Another step towards building highly available services…