Cloud Computing
Lecture Outline
- Exploring Cloud Backups
- Disaster Recovery in Cloud
- Cloud Management
Exploring Cloud Backups
Cloud Backup
Cloud backup, also known as online backup, is a strategy for backing up data that involves sending a copy of the data over a proprietary or public network to an off-site server. The server is usually hosted by a third-party service provider, which charges the backup customer a fee based on capacity, bandwidth or number of users.
Backup Types
1. Full system or image backups
2. Point-in-time (PIT) backups or snapshots
3. Differential and incremental backups
4. Reverse delta backup
5. Continuous Data Protection (CDP) or mirroring
6. Open file backup
7. Data archival
Full system or image backups
An image backup creates a complete copy of a volume, including all system files, the boot record, and any other data contained on the disk. Ghost is an example of software that supplies this type of backup.
Point-in-time (PIT) backups or snapshots
The data is backed up, and then every so often changes are appended to the backup, creating what is referred to as an incremental backup. This type of backup lets you restore your data to a point in time and saves multiple copies of any file that has been changed. At least 10 to 30 copies of previous versions of files should be saved. The first backup is quite slow over an Internet connection, but the incremental backup can be relatively fast. For example, software such as Carbonite may take several days to back up a system, but only minutes to create the snapshot.
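The idea of keeping a bounded number of previous versions per file can be sketched as follows. This is a minimal illustration, not how Carbonite or any other product stores snapshots; `VersionStore`, `snapshot`, `restore` and `max_versions` are hypothetical names, and the default of 10 is simply the lower end of the range mentioned above.

```python
from collections import defaultdict, deque


class VersionStore:
    """Keeps the most recent `max_versions` copies of each file (hypothetical helper)."""

    def __init__(self, max_versions=10):
        self.max_versions = max_versions
        # Each path maps to a bounded queue; the oldest version drops off automatically.
        self._versions = defaultdict(lambda: deque(maxlen=max_versions))

    def snapshot(self, path, content, timestamp):
        """Store a new version only if the content changed since the last one."""
        history = self._versions[path]
        if history and history[-1][1] == content:
            return False  # unchanged, nothing to store
        history.append((timestamp, content))
        return True

    def restore(self, path, at_time):
        """Return the newest stored version taken at or before `at_time`."""
        candidates = [c for (t, c) in self._versions[path] if t <= at_time]
        if not candidates:
            raise KeyError(f"no version of {path} at or before {at_time}")
        return candidates[-1]
```

Calling `restore("report.doc", at_time)` returns the file as it looked at that point in time, provided that version has not yet been pushed out by newer ones.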
Differential and Incremental backups
In a differential backup, the backup software copies all files that have changed since the last full backup. This requires the software to leave the archive bit set to ON after any differential backup, because only a full backup clears every file's archive bit. In an incremental backup, by contrast, only the files changed since the last backup of any type are copied, and their archive bits are cleared afterward.
Reverse Delta backup
A reverse delta backup creates a full backup first and then periodically synchronizes the full copy with the live version. Software that uses this system includes Apple's Time Machine and the rdiff-backup utility.
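The archive-bit behavior can be sketched in a few lines. This is a simplified model rather than any product's implementation: `files` is a hypothetical dictionary mapping a file name to its archive bit, and the incremental variant follows the common convention of clearing the archive bit of each file it copies.

```python
def full_backup(files):
    """Copy every file and clear every archive bit."""
    copied = sorted(files)
    for name in files:
        files[name] = False
    return copied


def differential_backup(files):
    """Copy files changed since the last full backup; leave archive bits set."""
    return sorted(name for name, changed in files.items() if changed)


def incremental_backup(files):
    """Copy files changed since the last backup of any kind; clear their bits."""
    copied = sorted(name for name, changed in files.items() if changed)
    for name in copied:
        files[name] = False
    return copied


# files maps a file name to its archive bit (True = modified since last clear).
files = {"a.doc": True, "b.xls": True, "c.db": True}
full_backup(files)                    # everything copied, all bits cleared
files["a.doc"] = True                 # a.doc is modified on Monday
monday = differential_backup(files)   # ['a.doc']
files["b.xls"] = True                 # b.xls is modified on Tuesday
tuesday = differential_backup(files)  # ['a.doc', 'b.xls']
```

Because the differential backup never clears the bit, each one grows until the next full backup, while a restore needs only the full backup plus the latest differential.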
Continuous Data Protection (CDP) or mirroring
The goal of this type of backup system is to create a cloned copy of your current data or drive.
Open file backup
Some applications such as database systems and messaging systems are mission critical and cannot be shut down before being backed up. An open file backup analyzes the transactions that are in progress, compares them to the file(s) at the start of the backup and the file(s) at the end of the backup, and creates a backup that represents a complete file as it would exist at the time the backup started after all the transactions have been processed.
Data archival
The term archiving is used to specify the migration of data that is no longer in use to secondary or tertiary long-term data storage for retention. An archive is useful for legal compliance or to provide a long-term historical record.
Note: Data archives are often confused with backups, but the two operations are quite different. A backup creates a copy of the data, whereas an archive removes older information that is no longer operational and saves it for long-term storage. You can't restore your current data set from an archive.
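The difference between the two operations comes down to copy versus move. The sketch below illustrates this with the standard library on a temporary directory; the function names `backup` and `archive` and the directory layout are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def backup(src: Path, backup_dir: Path) -> Path:
    """A backup copies the file; the original stays in the live data set."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, backup_dir / src.name))


def archive(src: Path, archive_dir: Path) -> Path:
    """An archive moves the file out of the live data set into long-term storage."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(src), str(archive_dir / src.name)))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    live = root / "live"
    live.mkdir()
    (live / "current.txt").write_text("in use")
    (live / "old.txt").write_text("no longer in use")
    backup(live / "current.txt", root / "backup")
    archive(live / "old.txt", root / "archive")
    # Only current.txt is still part of the live data set.
    remaining = sorted(p.name for p in live.iterdir())
```

After the archive step, `old.txt` exists only in long-term storage, which is why an archive cannot be used to restore the current data set.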
Disaster Recovery in cloud
DISASTER RECOVERY
Disaster recovery (DR) is about preparing for and recovering from a disaster.
DISASTER RECOVERY
Cloud disaster recovery (cloud DR) is a backup and restore strategy that involves storing and maintaining copies of electronic records in a cloud computing environment as a security measure.
GOAL OF DR
The goal of cloud DR is to provide an organization with a way to recover data and/or implement failover in the event of a man-made or natural catastrophe.
DISASTER
Any event that has a negative impact on your business continuity or finances could be termed a disaster.
The Need for Disaster Recovery
Why Downtime Matters
- $41.3 billion: total economic damage from disasters in 2009*
- $10.8 billion: economic impact felt in the US from disasters in 2009*
*Source: Forrester, "Business Continuity and Disaster Recovery are top IT Priorities for 2010 and ...", September 2, 2010
Better Understanding of Protection
- 78% of enterprises have indicated that improving disaster recovery capabilities is a high or critical priority*
  - Critical priority: 30%
  - High priority: 48%
- Better able to identify and quantify risk
- Better understanding of economic impact
- Less tolerance for downtime and data loss
*Source: Forrester, "The State of Enterprise IT: 2009 to ...", Jan. 25, 2010
WHY ARE WE TALKING ABOUT DR?
- Over 70% of businesses involved in a major fire either do not reopen or subsequently fail within 3 years of the fire. (Source: continuitycentral.com)
- 80% of businesses affected by a major incident either never re-open or close within 18 months. (Source: Axa)
- 70 percent of companies go out of business after a major data loss. (Source: continuitycentral.com)
- 80% of businesses suffering a computer disaster, who have no disaster recovery plans, go out of business. (Source: "A Bridge Too Far", IBM BusinessRecovery Service & Cranfield, 1993)
- A recent study from Gartner, Inc., found that 90 percent of companies that experience data loss go out of business within two years.
- 80 percent of companies without well-conceived data protection and recovery strategies go out of business within 2 years of a major disaster. (Source: US National Archives and Records Administration)
Define Your Objectives
Recovery Time Objective (RTO)
- Time between declaration and service availability
- Time to restore services to a usable state
Recovery Point Objective (RPO)
- Data in the system lost at disaster time
- Amount of data entered since the last backup
Test Time Objective (TTO)
- Time required to test recovery plans
- Resources used for testing
RTO: WHAT DOES IT IMPLY?
- A system records 1000 transactions per hour.
- A snapshot of the system is taken at 03:00 am every day.
- At 10:00 am a disaster event occurs.
- You spend 1 hour sorting things out for the backup (off-site retrieval, preparation, etc.).
- The recovery operation takes 4 hours to get back to operation at a minimum service level.
- 1 + 4 = 5 hours is the RECOVERY TIME OBJECTIVE.
RPO: WHAT DOES IT IMPLY?
- A system records 1000 transactions per hour.
- A snapshot of the system is taken at 03:00 am every day.
- At 10:00 am a disaster event occurs.
- In this case we lose around 7000 transactions: 1000 transactions from 03:00 to 04:00, 1000 transactions from 04:00 to 05:00, and so on.
- But: with one snapshot per day, we are accepting up to 24 hours of lost transactions (RPO).
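The arithmetic behind the two examples can be checked with a short script. The figures come from the slides; the calendar date and the variable names are arbitrary choices made for illustration.

```python
from datetime import datetime, timedelta

TRANSACTIONS_PER_HOUR = 1000

last_snapshot = datetime(2024, 1, 1, 3, 0)   # daily snapshot at 03:00 (date is arbitrary)
disaster = datetime(2024, 1, 1, 10, 0)       # disaster strikes at 10:00
snapshot_interval = timedelta(hours=24)      # one snapshot per day
preparation = timedelta(hours=1)             # sorting out the off-site backup
restore = timedelta(hours=4)                 # recovery to minimum service level


def hours(delta):
    """Convert a timedelta to a number of hours."""
    return delta.total_seconds() / 3600


# Data actually lost in this incident: everything since the last snapshot.
actual_loss = hours(disaster - last_snapshot) * TRANSACTIONS_PER_HOUR

# Worst case accepted by the schedule (the RPO): a full snapshot interval.
worst_case_loss = hours(snapshot_interval) * TRANSACTIONS_PER_HOUR

# Time from disaster to restored service (the RTO in the example).
recovery_time = preparation + restore
service_restored_at = disaster + recovery_time
```

Here `actual_loss` is 7000 transactions, `worst_case_loss` is 24000, and `recovery_time` is 5 hours, so service returns at 15:00. Shortening the snapshot interval lowers the RPO without affecting the RTO.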
The Cloud Stack
DR and the Cloud
What is a Workload?
- A workload is an integrated stack of application, middleware, and operating system that accomplishes a computing task.
- A workload is portable and platform agnostic: it can run in physical, virtual or cloud computing environments.
- A workload or a collection of workloads makes up a business service, which is what the end user consumes.
Update your DR with Virtualization
- One virtual server host can protect several servers in production
- Eliminate the multi-platform problem
- Simplify testing, as virtual machines can be isolated
Consolidated Recovery
Leveraging Virtual Infrastructure For Protection of All Your Servers
Solution
- Replicate workload into an off-line virtual machine
- One click failover
- One click test restore
- Flexible failback
Benefits
- Drastically reduce TCO and achieve whole workload protection
- Simplify testing with bootable backups
- Finally a way to complete your DR architecture
[Diagram labels: Virtual production servers, Physical production servers, Virtual Recovery Hosts]
Protect to the Cloud
[Diagram labels: Virtual production servers, Physical production servers, Wide Area Network, Hosted Virtual Recovery Hosts]
Products from Novell
PlateSpin Protect
Whole-workload protection for all server workloads.
- Backup to virtual machines
- Incremental replication
- Easy to test
- One-click failover
[Diagram labels: Physical Servers, Virtual Hosts, Blade Servers, Image Archives, Workload Decoupled from Hardware]
PlateSpin® Forge
World's first disaster recovery hardware appliance with embedded virtualization
- Protects up to 25 workloads
- PlateSpin Forge includes: storage, replication software, hypervisor
- Plug In and Protect solution for:
  - Medium enterprises
  - Branch or field use for large enterprises
  - Hosted recovery
Build a Protection Cloud
Build a Recovery Cloud
PlateSpin Protect + Virtual Resources = Recovery Cloud
[Diagram labels: Site A, Site B, Site C]
Setup Workload Replications
- Replicate every hour (1h RPO)
- Scheduled replications: workload changes are automatically replicated into virtual machines inside the Recovery Cloud
[Diagram label: Recovery Cloud]
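Replicating only what changed since the last run is what keeps an hourly schedule practical. The sketch below is a generic illustration of change detection by comparing block hashes; it is not a description of how PlateSpin Protect works internally, and `block_hashes`, `changed_blocks` and the tiny block size are assumptions made for the example.

```python
import hashlib


def block_hashes(data: bytes, block_size: int = 4):
    """Split data into fixed-size blocks and hash each one."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(source: bytes, replica: bytes, block_size: int = 4):
    """Return indexes of source blocks that differ from, or are missing in, the replica.

    Simplification: a replica longer than the source is not truncated here.
    """
    src = block_hashes(source, block_size)
    dst = block_hashes(replica, block_size)
    return [i for i in range(len(src)) if i >= len(dst) or src[i] != dst[i]]


# Only block 1 differs, so only that block would need to cross the WAN.
to_send = changed_blocks(b"AAAABBBBCCCC", b"AAAAXXXXCCCC")
```

With a one-hour schedule, the amount of data sent each run depends on how much changed in the last hour rather than on the total size of the workload.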
Easy Test Failover
- Test failover: recover workloads in isolated virtual networks to avoid production disruptions
- Users connect to running workloads to test their applications
[Diagram labels: Recovery Cloud, Isolated Virtual Network]
Recover Workloads In Minutes
- Offline detection: PlateSpin Protect sends out a notification when the protected workload goes offline
- Failover: workloads are recovered in minutes inside the Recovery Cloud
- Users connect to workloads running in the Recovery Cloud
[Diagram label: Recovery Cloud]
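Offline detection is commonly built on heartbeats: a workload that has not reported within a timeout is treated as offline and a notification is raised. The following is a generic sketch under that assumption, not PlateSpin Protect's actual mechanism; the function names, the 60-second timeout and the workload names are hypothetical.

```python
def find_offline(last_heartbeat, now, timeout_seconds=60):
    """Return workloads whose last heartbeat is older than the timeout."""
    return sorted(name for name, seen in last_heartbeat.items()
                  if now - seen > timeout_seconds)


def build_notifications(last_heartbeat, now, timeout_seconds=60):
    """Create one notification message per offline workload."""
    return [f"Workload {name} is offline; failover to the recovery site is available"
            for name in find_offline(last_heartbeat, now, timeout_seconds)]


# Timestamps are seconds on an arbitrary clock.
heartbeats = {"web": 100, "db": 30}
offline = find_offline(heartbeats, now=120)        # db has been silent for 90 seconds
messages = build_notifications(heartbeats, now=120)
```

The administrator who receives the notification can then trigger the failover, which matches the one-click model described earlier.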
Restore the Production Environment
- Failback: move the workload back into production, to the same or a different host
[Diagram labels: Recovery Cloud, Virtual or Physical Host]
Solution Flexibility
On-Premise
[Diagram labels: Production Data Center, Service Provider Data Center, Administrator, Recovery Resources, WAN, Protected Workloads, Protect Node, Protect Management Console]
Virtual Private Cloud
[Diagram labels: Customer Data Center, Service Provider Data Center, Administrator, Recovery Resources, WAN, Protected Workloads, Protect Node, Protect Management Console]
Hybrid Model
[Diagram labels: Customer Data Center, Service Provider Data Center, Administrator, WAN, Protect Node, Protect Management Console, Protect Node, Protected Workloads, Recovery Resources]
What do Customers Have to Say?
Customer Results
Nichols College: "Disaster recovery solutions can be very complex... PlateSpin Forge is very straightforward. There's just one piece of hardware to manage. It's low maintenance, and has low overhead. Without it, we certainly would have spent more money on another disaster recovery solution that would have required more resources to support it."
Reed Smith LLP: "With PlateSpin Protect, we can recover multiple sites with the same set of hardware quite easily, in a matter of minutes."
Cloud Management
Cloud Management
- Management of cloud computing products and services.
- Software and technologies designed for operating and monitoring applications, data and services residing in the cloud.
Why Cloud Management
Cloud computing deployments must be monitored and managed in order to be optimized for best performance.
Cloud Management
Cloud management includes not only managing resources in the cloud, but managing resources on-premises. The management of resources in the cloud requires new technology. But, management of resources on-premises allows vendors to use well-established network management technologies.
Cloud Management Software
Cloud management software provides capabilities for managing faults, configuration, accounting, performance, and security; this is referred to as FCAPS. Many products address one or more of these areas, and through network frameworks, you can access all five areas. Framework products are being repositioned to work with cloud systems.
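Checking whether a chosen set of tools covers all five FCAPS areas is a simple set operation. In the sketch below the product names and the areas assigned to them are made up for illustration; only the five area names come from the text.

```python
FCAPS = {"fault", "configuration", "accounting", "performance", "security"}


def missing_areas(products):
    """Return the FCAPS areas that none of the given products cover."""
    covered = set().union(*products.values())
    return FCAPS - covered


# Hypothetical tool selection: each tool addresses one or more areas.
selection = {
    "tool_a": {"fault", "performance"},
    "tool_b": {"configuration", "security"},
}
gaps = missing_areas(selection)  # accounting is not covered by either tool
```

An empty result means the combination, much like a management framework, reaches all five areas.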
Benefits of Cloud Management
1. Global management
2. Remote office and distributed storage management
3. Information access for disaster recovery
4. Cost reduction
5. Real-time reporting
6. Easy upgrades
7. Encrypted information
8. Compliance management
9. Ease of implementation
Administrating the cloud
These fundamental features are offered by traditional network management systems:
- Administration of resources
- Configuring resources
- Enforcing security
- Monitoring operations
- Optimizing performance
- Policy management
- Performing maintenance
- Provisioning of resources
Administrating the cloud
Network management systems are often described in terms of the acronym FCAPS, which stands for these features:
- Fault
- Configuration
- Accounting
- Performance
- Security
Most network management packages have one or more of these characteristics; no single package provides all five elements of FCAPS.
The following five vendors now have products for cloud management:
- BMC Cloud Computing
- Computer Associates Cloud
- HP Cloud Computing
- IBM Cloud Computing
- Microsoft Cloud Services
Cloud Management Responsibilities
What separates a network management package from a cloud computing management package is the “cloudly” characteristics that a cloud management service must have:
- Billing is on a pay-as-you-go basis.
- The management service is extremely scalable.
- The management service is ubiquitous.
- Communication between the cloud and other systems uses cloud networking standards.
To monitor an entire cloud computing deployment stack, you monitor six different categories:
1. End-user services such as HTTP, TCP, POP3/SMTP, and others
2. Browser performance on the client
3. Application monitoring in the cloud, such as Apache, MySQL, and so on
4. Cloud infrastructure monitoring of services such as Amazon Web Services, GoGrid, Rackspace, and others
5. Machine instance monitoring, where the service measures processor utilization, memory usage, disk consumption, queue lengths, and other important parameters
6. Network monitoring and discovery using standard protocols and tools like the Simple Network Management Protocol (SNMP), Configuration Management Database (CMDB), Windows Management Instrumentation (WMI), and the like
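Machine instance monitoring (category 5) usually reduces to comparing measured values against limits. The sketch below shows that comparison only; collecting the metrics is out of scope, and the metric names, sample values and thresholds are all hypothetical.

```python
def check_instance(metrics, thresholds):
    """Return (metric, value, limit) for every metric that exceeds its threshold."""
    return [(name, metrics[name], limit)
            for name, limit in sorted(thresholds.items())
            if name in metrics and metrics[name] > limit]


# Hypothetical readings from one machine instance.
metrics = {
    "cpu_percent": 93.0,
    "memory_percent": 61.5,
    "disk_percent": 88.0,
    "queue_length": 4,
}
# Hypothetical alert limits chosen by the operator.
thresholds = {"cpu_percent": 85, "disk_percent": 80, "queue_length": 10}

alerts = check_instance(metrics, thresholds)
```

With these sample values, processor utilization and disk consumption raise alerts while the queue length stays within its limit.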
Management Responsibilities by Service Model Type
In the cloud, the particular service model you are using directly affects the type of monitoring you are responsible for.
- Consider the case of an Infrastructure as a Service (IaaS) vendor such as Amazon Web Services or Rackspace.
- You can monitor your usage of resources either through their native monitoring tools, like Amazon CloudWatch or Rackspace Control Panel, or through the numerous third-party tools that work with these sites' APIs.
Management Responsibilities by Service Model Type
The situation becomes even more restrictive as you move first to Platform as a Service (PaaS), like Windows Azure or Google App Engine, and then on to Software as a Service (SaaS), for which Salesforce.com is a prime example.
When you deploy an application on Google's PaaS App Engine cloud service, the Administration Console provides you with the following monitoring capabilities:
- Create a new application, and set it up in your domain.
- Invite other people to be part of developing your application.
- View data and error logs.
- Analyze your network traffic.
- Browse the application data store, and manage its indexes.
- View the application's scheduled tasks.
- Test the application, and swap out versions.
Life cycle management
Cloud services have a defined lifecycle, just like any other system deployment. A management program has to touch on each of the six different stages in that lifecycle:
1. The definition of the service as a template for creating instances: creation, updating, and deletion of service templates.
2. Client interactions with the service, usually through an SLA (Service Level Agreement) contract: creation, updating, and deletion of service offerings.
3. The deployment of an instance to the cloud and the runtime management of instances.
4. The definition of the attributes of the service while in operation and performance of modifications of its properties. The chief task during this management phase is to perform service optimization and customization.
5. Management of the operation of instances and routine maintenance. During Phase 5, you must monitor resources, track and respond to events, and perform reporting and billing functions.
6. Retirement of the service. End-of-life tasks include data protection and system migration, archiving, and service contract termination.
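The six stages can be modeled as a small state machine so that a management tool rejects out-of-order operations. The stage names below paraphrase the list above, while the allowed transitions (a mostly linear flow in which tuning and operation may alternate) are an assumption made for this sketch, as are the class and method names.

```python
from enum import Enum


class Stage(Enum):
    DEFINITION = 1     # service template created
    OFFERING = 2       # SLA-backed offering published to clients
    DEPLOYMENT = 3     # instance deployed and running
    OPTIMIZATION = 4   # attributes tuned and customized
    OPERATION = 5      # monitoring, events, reporting, billing
    RETIREMENT = 6     # data protection, migration, archiving, termination


# Assumed transitions: mostly linear, with tuning and operation able to alternate.
ALLOWED = {
    Stage.DEFINITION: {Stage.OFFERING},
    Stage.OFFERING: {Stage.DEPLOYMENT},
    Stage.DEPLOYMENT: {Stage.OPTIMIZATION},
    Stage.OPTIMIZATION: {Stage.OPERATION},
    Stage.OPERATION: {Stage.OPTIMIZATION, Stage.RETIREMENT},
    Stage.RETIREMENT: set(),
}


class ManagedService:
    """Tracks which lifecycle stage a cloud service is in (hypothetical helper)."""

    def __init__(self, name):
        self.name = name
        self.stage = Stage.DEFINITION
        self.history = [Stage.DEFINITION]

    def advance(self, target):
        """Move to `target` if the transition is allowed, otherwise raise."""
        if target not in ALLOWED[self.stage]:
            raise ValueError(f"{self.stage.name} -> {target.name} is not allowed")
        self.stage = target
        self.history.append(target)
```

A service created with `ManagedService("crm")` starts in `DEFINITION`; calling `advance(Stage.RETIREMENT)` at that point raises `ValueError`, because retirement is only reachable from operation.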
Cloud Management Products
Product                                     Description
AbiCloud                                    Virtual machine conversion and management
BMC Cloud Computing Initiative              Cloud planning, lifecycle management, optimization, and guidance
CloudKick                                   Cloud server monitoring
HP Cloud Computing                          A variety of management products and services, both released and under development
IBM Service Management and Cloud Computing  Various IBM Tivoli managers and monitors
Intune                                      Cloud-based Windows system management
ScienceLogic                                Datacenter and cloud management solutions and appliances
Cloud management software and services are a very young industry, and as such, it has a very large number of companies, some with new products and others with older products, competing in this area.
Core Management Features
- Support of different cloud types
- Creation and provisioning of different types of cloud resources, such as machine instances, storage, or staged applications
- Performance reporting including availability and uptime, response time, resource quota usage, and other characteristics
- The creation of dashboards that can be customized for a particular client's needs
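Two of the reported figures, availability and resource quota usage, are simple ratios. The sketch below computes them for a 30-day month; the function names, the downtime figure and the storage numbers are made-up sample values.

```python
def availability_percent(total_minutes, downtime_minutes):
    """Share of the reporting period during which the service was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def quota_usage_percent(used, quota):
    """Share of an allocated quota that has been consumed."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    return 100.0 * used / quota


month_minutes = 30 * 24 * 60  # 43,200 minutes in a 30-day month

# Sample values: 43.2 minutes of downtime, 350 GB used of a 500 GB quota.
report = {
    "availability_percent": round(availability_percent(month_minutes, 43.2), 3),
    "storage_quota_used_percent": round(quota_usage_percent(350, 500), 1),
}
```

With these inputs the report shows 99.9 percent availability and 70.0 percent of the storage quota used, the kind of values a customized dashboard would display.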
Cloud Management - Emerging standards
Efforts are underway to develop cloud management interoperability standards:
1. DMTF's (Distributed Management Task Force) Open Cloud Standards Incubator. The goal of these efforts is to develop management tools that work with any cloud type.
2. Cloud Commons is developing a technology called the Service Measurement Index (SMI). SMI aims to deploy methods for measuring various aspects of cloud performance in a standard way.
A Case Study
Need
A company wanted to help diabetes patients by making it easier to share accurate, up-to-date medical information with physicians.
Solution
Working with IBM, the company created a mobile app, supported by IBM Bluemix and SoftLayer technologies, that captures and aggregates data directly from other medical devices and health tools.
Benefit
The virtualized IBM architecture cuts ongoing operational costs by 30 percent with a flexible platform, while physicians and diabetics have clearer insight into patient health, encouraging better care options.