ATIF MEHMOOD MALIK KASHIF SIDDIQUE Improving dependability of Cloud Computing with Fault Tolerance and High Availability.

Presentation transcript:

Dependability
• In systems engineering, dependability is a measure of a system's availability, reliability and maintainability.
• It is the ability of a system to deliver services that can justifiably be trusted.
• It is often considered the third axis of system quality.
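
The availability part of dependability is commonly quantified as the fraction of time a system is up. The sketch below assumes the textbook steady-state formula A = MTTF / (MTTF + MTTR), which does not appear in the slides; the function names and sample figures are illustrative only.

```python
def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up: MTTF / (MTTF + MTTR)."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_hours_per_year(availability: float) -> float:
    """Expected yearly downtime for a given availability (365-day year)."""
    return (1.0 - availability) * 365 * 24


if __name__ == "__main__":
    # Assumed example: one hour of repair for every 999 hours of operation.
    a = steady_state_availability(mttf_hours=999.0, mttr_hours=1.0)
    print(f"availability = {a:.4f}")
    print(f"downtime per year = {downtime_hours_per_year(a):.2f} h")
```

Shortening repair time (maintainability) raises availability just as lengthening time to failure (reliability) does, which is why the three attributes are usually considered together.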

Dependability ontology

Dependability challenges in cloud computing
• Lack of trust in shared virtualized infrastructures
• Management of a cloud computing service by a single provider or vendor is in effect a single point of failure
• APIs are proprietary
• Virtualization increases complexity
• Higher resource utilization
• Common-mode outages
• Multiple administrative domains
• Legal and privacy implications

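Threats to dependability
• Faults, errors and failures
• A fault is the underlying cause of an error; an error is an incorrect system state that can lead to a failure, which is a deviation of the delivered service from its expected behavior
• Faults may arise from hardware failure, software bugs, user error and network problems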

Fault Tolerance
• The ability of a system to continue providing services to its users when some of its components fail
• Faults can be introduced at:
  - Application level
  - Virtual machine level
  - Physical resource level

Fault Tolerance
• Application Fault Tolerance:
  - Application health is continuously monitored by special software components called sensors
  - A sensor may trigger specific procedures to start repairing an application that is malfunctioning
  - Example: VMware App HA

Fault Tolerance
• Virtual Machine Fault Tolerance:
  - Virtual machine failures can be detected by both the customer and the service provider
  - Customers can detect a virtual machine failure by monitoring its state with sensors deployed in the cloud
  - The cloud service provider can provide VM fault tolerance by installing a single sensor per physical server that monitors all virtual machines hosted on that server
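
The provider-side arrangement above, one sensor per physical server watching every hosted VM, can be sketched as a heartbeat monitor. This is a minimal illustration, not how any particular cloud platform implements it: the class name `HostSensor`, the heartbeat mechanism and the timeout are all assumptions. The clock is injectable so the behavior can be checked without waiting.

```python
import time
from typing import Callable, Dict, List


class HostSensor:
    """One sensor per physical server, watching every VM hosted on it."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, vm_id: str) -> None:
        """Record that a VM reported in just now."""
        self._last_seen[vm_id] = self._clock()

    def failed_vms(self) -> List[str]:
        """Return VMs whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(
            vm for vm, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )
```

A repair procedure (restart, migration) would be triggered for each identifier returned by `failed_vms()`; that part is left out here.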

Fault Tolerance
• Physical Machine Fault Tolerance:
  - Can be implemented by the cloud service provider, which monitors the state of its physical servers and, on hardware failure, resumes all affected virtual machines on a new server

Fault Tolerance Techniques
• Reactive Fault Tolerance
  - When a failure occurs, these techniques reduce its effect on application execution
• Proactive Fault Tolerance
  - These techniques work by predicting faults and proactively replacing the suspected components with working ones

Reactive Fault Tolerance
• Checkpointing
• Replication
• Job migration
• SGuard
• Retry
• Task resubmission
• User-defined exception handling
• Rescue workflow
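
Two of the techniques listed above, checkpointing and retry, combine naturally: progress is saved after each completed unit of work, and a failed unit is retried without redoing earlier ones. The sketch below is a minimal in-memory illustration; the function name, the dictionary used as a checkpoint store and the retry limit are assumptions, and a real system would persist the checkpoint to stable storage.

```python
from typing import Any, Callable, Dict, List


def run_with_checkpoints(
    items: List[int],
    step: Callable[[int], int],
    checkpoint: Dict[str, Any],
    max_retries: int = 3,
) -> List[int]:
    """Process items in order, resuming from the last checkpoint on failure."""
    checkpoint.setdefault("next_index", 0)
    checkpoint.setdefault("results", [])
    retries = 0  # total retries across the whole run
    while checkpoint["next_index"] < len(items):
        i = checkpoint["next_index"]
        try:
            result = step(items[i])
        except Exception:
            retries += 1
            if retries > max_retries:
                raise  # give up; the checkpoint still holds completed work
            continue  # retry the same item; earlier items are not redone
        # Record progress only after the step has succeeded.
        checkpoint["results"].append(result)
        checkpoint["next_index"] = i + 1
    return list(checkpoint["results"])
```

Passing the same `checkpoint` dictionary to a later call resumes where the previous run stopped, which is the essence of restarting a job from a checkpoint rather than from scratch.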

Proactive Fault Tolerance
• Software rejuvenation
• Self-healing
• Pre-emptive migration
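
Proactive techniques depend on predicting trouble before it happens. As a rough sketch, assume that steadily growing memory use is the symptom being watched and that linear extrapolation is a good enough predictor; neither assumption comes from the slides, and the function name is hypothetical.

```python
from typing import Sequence


def predicts_exhaustion(memory_mb: Sequence[float],
                        limit_mb: float,
                        horizon: int) -> bool:
    """Linearly extrapolate memory use `horizon` samples ahead.

    Returns True when the projected value reaches the limit, which would be
    the cue to rejuvenate the software or migrate the workload pre-emptively.
    """
    if len(memory_mb) < 2:
        return False  # not enough history to estimate a trend
    growth = (memory_mb[-1] - memory_mb[0]) / (len(memory_mb) - 1)
    return memory_mb[-1] + growth * horizon >= limit_mb
```

With samples of 100, 110 and 120 MB and a 200 MB limit, the projection stays below the limit five samples ahead but reaches it eight samples ahead, so the action would be scheduled within that window.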

Tools for implementing fault tolerance
• HAProxy:
  - Open-source high availability and load balancing solution for TCP- and HTTP-based applications
  - De facto standard open-source load balancer
• ASSURE:
  - Automatic Software Self-healing Using REscue points
  - Uses rescue points to detect, tolerate and recover from software faults

Tools for implementing fault tolerance
• SHelp:
  - Upgraded version of ASSURE
  - Assigns weighted values to rescue points and uses error virtualization techniques so that applications bypass the faulty path

High Availability
• Can be achieved by having redundant failover servers
• Can be achieved at the application level, infrastructure level and data center level
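
The benefit of redundant failover servers can be estimated with a standard reliability formula: if each replica is available with probability a and failures are independent, the service is down only when all n replicas are down, so availability is 1 - (1 - a)^n. The independence assumption is mine, not the slides', and it is exactly what the common-mode outages listed earlier would violate. The function name is illustrative.

```python
def redundant_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` independent servers, any one of which suffices."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("single-server availability must be in [0, 1]")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    # The service is unavailable only if every replica is down at once.
    return 1.0 - (1.0 - single) ** replicas


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, f"{redundant_availability(0.9, n):.3f}")
```

Under these assumptions, each added replica multiplies the expected unavailability by the unavailability of a single server.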

Types of Virtual Machine High Availability
• Load sharing
  - Both replicas are active
  - Service requests are distributed equally between them
• Updated dedicated hot standby
  - Two identical virtual machines execute on two different physical servers
  - Both virtual machines are kept fully synchronized with state information
  - VMware Fault Tolerance is an example
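
For the load sharing case, distributing requests equally between active replicas can be as simple as round-robin assignment. This is a minimal sketch; the function name and the replica and request identifiers are made up for illustration, and production load balancers also track replica health.

```python
from itertools import cycle
from typing import Iterable, Iterator, List, Tuple


def distribute(requests: Iterable[str],
               replicas: List[str]) -> Iterator[Tuple[str, str]]:
    """Pair each request with a replica in round-robin order."""
    if not replicas:
        raise ValueError("at least one replica is required")
    # cycle() repeats the replica list indefinitely; zip() stops when the
    # requests run out.
    return zip(requests, cycle(replicas))


if __name__ == "__main__":
    for request, replica in distribute(["r1", "r2", "r3", "r4"], ["vm-1", "vm-2"]):
        print(request, "->", replica)
```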

Types of Virtual Machine High Availability
• Not dedicated hot standby
  - The standby VM runs in parallel with the active VM
  - The standby is not fully synchronized
  - VMware HA and Symantec's Veritas Cluster Server are examples

Types of Virtual Machine High Availability
• Shared hot standby
  - Uses a checkpointing mechanism to update the standby replica
  - Requires fewer resources for the standby replica
• Cold standby
  - The standby replica is powered off and resides on storage media
  - It is brought into service when the active VM fails
  - Useful for situations where availability requirements are low
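
The trade-off of a checkpoint-updated standby is that it costs less than full synchronization but loses whatever changed since the last checkpoint. The toy model below illustrates that; the class name and the dictionary standing in for VM state are assumptions made for the example.

```python
import copy
from typing import Any, Dict


class CheckpointedPair:
    """Active replica plus a standby that only sees periodic checkpoints."""

    def __init__(self) -> None:
        self.active_state: Dict[str, Any] = {}
        self.standby_state: Dict[str, Any] = {}

    def apply(self, key: str, value: Any) -> None:
        """Update the active replica only."""
        self.active_state[key] = value

    def checkpoint(self) -> None:
        """Copy the active state to the standby."""
        self.standby_state = copy.deepcopy(self.active_state)

    def fail_over(self) -> Dict[str, Any]:
        """Promote the standby; updates since the last checkpoint are lost."""
        self.active_state = copy.deepcopy(self.standby_state)
        return self.active_state
```

If `a` is written, a checkpoint is taken, and then `b` is written before the active replica fails, the promoted standby holds only `a`. More frequent checkpoints shrink that window at the price of more replication traffic.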

Conclusion
• Dependability is one of the major challenges in cloud computing.
• Addressing these dependability challenges can increase the adoption of cloud computing.