Lecture 2. CAP and Challenges


1 Lecture 2. CAP and Challenges
COSC6376 Cloud Computing
Lecture 2. CAP and Challenges
Instructor: Weidong Shi (Larry), PhD
Computer Science Department, University of Houston

2 Outline
Ecosystem
CAP
Challenges

3 Summary Assignment Paper can be downloaded from the class website
Due next Tuesday in class

4 NIST: Interactions between Actors in Cloud Computing
Actors: Cloud Consumer, Cloud Provider, Cloud Broker, Cloud Auditor, Cloud Carrier
Communication paths shown in the figure:
The communication path between a cloud provider and a cloud consumer
The communication paths for a cloud auditor to collect auditing information
The communication paths for a cloud broker to provide service to a cloud consumer

5 Conceptual Reference Diagram
Cloud Consumer
Cloud Auditor: Security Audit, Privacy Impact Audit, Performance Audit
Cloud Provider:
  Service Layer: SaaS, PaaS, IaaS
  Resource Abstraction and Control Layer
  Physical Resource Layer: Hardware, Facility
  Cloud Service Management: Business Support, Provisioning/Configuration, Portability/Interoperability
Cloud Broker: Service Intermediation, Service Aggregation, Service Arbitrage
Cloud Carrier

6 Resource Abstraction
"All problems in computer science can be solved by another level of indirection (abstraction)." - David John Wheeler

7 Six Layers of Cloud Services
Salesforce.com, Webex
App Engine, Microsoft Azure
Amazon AWS, Rackspace, IBM Ensembles
Savvis, Internap, Digital Realty Trust
AT&T
VMware, IBM, Xen

8 Spectrum of Clouds
Instruction Set VM (Amazon EC2, 3Tera)
Bytecode VM (Microsoft Azure)
Framework VM (Google AppEngine, Force.com)
Lower-level, less management -> Higher-level, more management: EC2, Azure, AppEngine, Force.com

9 Amazon EC2
Like physical hardware, users can control nearly the entire software stack, from the kernel upwards.
A few API calls to request and configure the virtualized hardware.
No limit on the kinds of applications that can be hosted.
The low level of virtualization (raw CPU cycles, IP connectivity) allows developers to code whatever they want.
Hard to offer scalability and failover.

10 Google AppEngine and Force.com
Does one thing well: running web apps
App Engine handles HTTP(S) requests, nothing else
Think RPC: request in, processing, response out
Works well for the web and AJAX; also for other services
Request-reply based. Not suitable for general-purpose computing.
Applications are severely rationed in how much CPU time they can use in a request.
Automatic scaling and high availability.
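The "request in, processing, response out" model can be sketched with a plain WSGI handler. This is only an illustration of the request-reply style, not App Engine's or Force.com's actual API; the names `app` and `call` are hypothetical, and `call` is a small helper that drives the handler once without a network socket.

```python
def app(environ, start_response):
    """Request in, processing, response out: no state is kept between calls."""
    name = environ.get("QUERY_STRING", "") or "world"
    body = f"hello, {name}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call(handler, query=""):
    """Hypothetical helper: invoke a WSGI handler once and collect its reply."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": "GET", "QUERY_STRING": query}
    chunks = handler(environ, start_response)
    return captured["status"], b"".join(chunks)


status, body = call(app, "cloud")
print(status, body)
```

Because the handler holds no state between requests, a platform is free to run as many copies as the load requires, which is what makes automatic scaling practical for this class of application.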

11 Microsoft’s Azure
Applications are written using the .NET libraries and compiled to the language-independent managed environment.
General-purpose computing.
Users get a choice of language, but cannot control the operating system or runtime.
Libraries provide automatic network configuration and failover/scalability, but need the users' cooperation as well.

12 Spectrum
Azure: general-purpose; cannot control the OS; a degree of scalability
Google AppEngine/Force.com: highly scalable, yet not general-purpose
Amazon EC2: general-purpose; hard to offer scalability

13 Major Cloud Providers and Service Offerings

14 Public, Private, and Hybrid Clouds

15 Hybrid Clouds
Using multiple clouds for different applications to match needs
Moving an application to meet requirements at specific stages in its lifecycle, from early development through unit test, scale testing, pre-production and ultimately full production scenarios
Moving workloads closer to end users across geographic locations, including user groups within the enterprise, partners and external customers
Meeting peak demands efficiently in the cloud while the low steady-state is handled internally
Maintaining confidential data on better protected clouds while allowing distributed computation on more computationally efficient ones

16

17 Cloud Interoperability Standards
Open Cloud Computing Interface – Infrastructure
EC2 API
Simple Storage Service (S3) API
Windows Azure Storage Service REST APIs
Windows Azure Service Management REST APIs
Deltacloud API
Rackspace Cloud Servers API
Rackspace Cloud Files API
Cloud Data Management Interface
vCloud API
GlobusOnline REST API

18 CAP

19 The CAP Theorem
Three properties of a system: consistency, availability and partition tolerance

20 The CAP Theorem
Consistency: once a writer has written, all readers will see that write

21 Consistency Model
A consistency model determines rules for visibility and apparent order of updates.
For example:
  Row X is replicated on nodes M and N
  Client A writes row X to node N
  Some period of time t elapses
  Client B reads row X from node M
  Does client B see the write from client A?
Consistency is a continuum with tradeoffs
For NoSQL, the answer would be: maybe
CAP Theorem states: strict consistency can't be achieved at the same time as availability and partition tolerance.
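The row X example can be replayed with a toy model. This is a minimal sketch assuming asynchronous replication between two replicas; the `Cluster` and `Replica` classes and the explicit `replicate()` step are hypothetical stand-ins for whatever propagation a real store performs.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.rows = {}


class Cluster:
    """Two replicas, M and N, with asynchronous replication between them."""

    def __init__(self):
        self.m = Replica("M")
        self.n = Replica("N")
        self.pending = []  # (target replica, key, value) not yet delivered

    def write(self, replica, key, value):
        # The write is acknowledged as soon as the local replica has it.
        replica.rows[key] = value
        other = self.m if replica is self.n else self.n
        self.pending.append((other, key, value))

    def read(self, replica, key):
        return replica.rows.get(key)

    def replicate(self):
        # Deliver every queued update; stands in for "time t elapses".
        for target, key, value in self.pending:
            target.rows[key] = value
        self.pending.clear()


cluster = Cluster()
cluster.write(cluster.n, "X", "v1")      # client A writes row X to node N
before = cluster.read(cluster.m, "X")    # client B reads from M too early
cluster.replicate()                      # the update reaches M
after = cluster.read(cluster.m, "X")     # client B reads again
print(before, after)  # None v1
```

Whether client B sees the write depends only on whether the read happens before or after the update is delivered, which is why the answer on the slide is "maybe".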

22 Consistency
Case 1
  Upload a picture to Facebook
  Send a message to your friend to check out the picture
  Will your friend see it?
Case 2
  Post a comment C1 on your friend’s page at time t
  Post another comment C2 10 seconds later
  Will your friend see two comments with C1 first, followed by C2?

23 Eventual Consistency
When no updates occur for a long period of time, eventually all updates will propagate through the system and all the nodes will be consistent.
For a given accepted update and a given node, eventually either the update reaches the node or the node is removed from service.
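One way to picture this convergence is pairwise gossip with a last-writer-wins merge. The sketch below is an assumption for illustration (the slide does not name a propagation mechanism): each node stores `key -> (version, value)`, and the helper names `merge` and `gossip_until_converged` are hypothetical.

```python
import random


def merge(a, b):
    """Last-writer-wins: for each key keep the pair with the higher version."""
    out = dict(a)
    for key, (version, value) in b.items():
        if key not in out or version > out[key][0]:
            out[key] = (version, value)
    return out


def gossip_until_converged(nodes, rng, max_rounds=1000):
    """Pick random pairs and exchange state until all nodes hold equal state."""
    rounds = 0
    while any(n != nodes[0] for n in nodes) and rounds < max_rounds:
        i, j = rng.sample(range(len(nodes)), 2)
        merged = merge(nodes[i], nodes[j])
        nodes[i] = merged
        nodes[j] = dict(merged)
        rounds += 1
    return rounds


rng = random.Random(0)
nodes = [{} for _ in range(5)]
nodes[0]["X"] = (1, "old")   # an early write accepted by node 0
nodes[3]["X"] = (2, "new")   # a later write accepted by node 3
nodes[4]["Y"] = (1, "y")     # an unrelated write accepted by node 4

rounds = gossip_until_converged(nodes, rng)
print(rounds, nodes[0])
```

No new updates arrive during the exchange, so information only accumulates and every node ends with the same newest value for each key.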

24 GPS Powered Distributed Database
"Spanner allows server nodes to coordinate without a whole lot of communication."
Google Spanner, the Largest Single Database on Earth

25 The CAP Theorem
Availability: every request received by a non-failing node in the system must result in a response (must terminate)
The system is available during software and hardware upgrades and node failures

26 Availability
Traditionally thought of as the server/process being available five 9’s (99.999%) of the time.
However, for a system with a large number of nodes, at almost any point in time there’s a good chance that a node is down or that there is a network disruption among the nodes.
Want a system that is resilient in the face of network disruption.
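The arithmetic behind both statements is short. The sketch below assumes node failures are independent and uses an illustrative per-node availability of 99.9% and a cluster of 1000 nodes; neither figure comes from the slides.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability):
    """Expected downtime per year for a given availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def prob_any_node_down(n_nodes, node_availability):
    """Probability that at least one of n independent nodes is down right now."""
    return 1.0 - node_availability ** n_nodes


five_nines = downtime_minutes_per_year(0.99999)
any_down = prob_any_node_down(1000, 0.999)
print(round(five_nines, 2))  # about 5.26 minutes per year
print(round(any_down, 3))    # about 0.632
```

Five 9's allows roughly five minutes of downtime a year for one server, yet with a thousand nodes that are each up 99.9% of the time, some node is down more often than not.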

27 The CAP Theorem
Partition tolerance: a system can continue to operate in the presence of network partitions

28 The CAP Theorem
You can have at most two of these three properties for any shared-data system.
To scale out, you have to partition. That leaves either consistency or availability to choose from.
In almost all cases, you would choose availability over consistency.
Triangle: Consistency (C), Availability (A), Partition-resilience (P)
Claim: every distributed system is on one side of the triangle.
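The choice that remains once a partition happens can be sketched with two replicas. This is a toy model under stated assumptions: the `Node` class, the `"CP"`/`"AP"` mode flag and the `Unavailable` exception are hypothetical names, and replication is synchronous while the link is up.

```python
class Unavailable(Exception):
    """Raised when a consistency-first node refuses a request."""


class Node:
    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.value = None
        self.peer = None
        self.partitioned = False

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Keep replicas identical by giving up availability.
                raise Unavailable("cannot reach peer; refusing write")
            # Stay available by accepting the write locally; replicas diverge.
            self.value = value
            return
        self.value = value
        self.peer.value = value     # synchronous replication while connected


def make_pair(mode):
    a, b = Node(mode), Node(mode)
    a.peer, b.peer = b, a
    return a, b


def partition(a, b):
    a.partitioned = b.partitioned = True


ap_a, ap_b = make_pair("AP")
ap_a.write(1)
partition(ap_a, ap_b)
ap_a.write(2)                       # accepted, but ap_b still holds 1

cp_a, cp_b = make_pair("CP")
cp_a.write(1)
partition(cp_a, cp_b)
try:
    cp_a.write(2)
    cp_refused = False
except Unavailable:
    cp_refused = True               # rejected, so both replicas still hold 1

print(ap_a.value, ap_b.value, cp_a.value, cp_b.value, cp_refused)
```

The AP pair keeps answering but its replicas disagree until the partition heals; the CP pair stays in agreement by turning requests away.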

29 Challenges

30 Adoption Challenges
Challenge: Opportunity
Availability: Multiple providers & DCs
Data lock-in: Standardization
Data Confidentiality, Auditability, and Privacy: Encryption, VLANs, Firewalls; Geographical Data Storage; Privacy-preserving data outsourcing

31 Challenges and Opportunities
Availability of Service
Service and outage; duration; date
S3 outage: authentication service overload leading to unavailability; 2 hours; 2/15/08
S3 outage: single bit error leading to gossip protocol blowup; 6-8 hours; 7/20/08
AppEngine partial outage: programming error; 5 hours; 6/17/08
Gmail; 1.5 hours; 08/11/08

32 Adoption Challenges
Challenge: Opportunity
Availability: Multiple providers & DCs
Data lock-in: Standardization
Data Confidentiality, Auditability, and Privacy: Encryption, VLANs, Firewalls; Geographical Data Storage; Privacy-preserving data outsourcing

33

34 Senior Execs Move Forward with Cloud Investments

35 Legal framework of Cloud Computing
Legal compliance issues
Service levels and performance
Cross-border issues
Data protection, rights and usage
Privacy and security
Termination and transition

36 Compliance of Cloud Computing
Auditing requirements
  Many contracts impose auditing possibilities that include physical inspection
  How can these auditing requirements be complied with when geographically decentralized cloud services are used?
Compliance
  IaaS
    Data retention obligations
    Tax related storage requirements
    Labor law related book keeping requirements
  SaaS
    Electronic invoicing legislation
    Ecommerce legislation
    Electronic signature legislation

37 HIPAA Compliance?
What is HIPAA?
  Health Insurance Portability and Accountability Act of 1996, a federal law
What is regulated?
  Accountability: protects health data integrity, confidentiality and availability; reduces fraud and abuse
  Establishes standards for protection of health information
  Privacy (Operational, Consumer Control, Administration)
  Security (Administrative, Physical, Technical, Network)
Definition of privacy
  Privacy is the right of an individual to keep his/her individual health information from being disclosed

38 Cross Border Data Transfer/Storage
EU
  Only use a cloud provider with a data center within the EU, e.g. Amazon EC2: choice of location (US East, US West or Ireland)
Australia
  Financial services companies must first notify the Australian Prudential Regulation Authority (APRA) of an offshore data transfer
Make sure that an agreement is concluded with the cloud provider

39 Cross Border Data Transfer/Storage
Applicable law and competent court
  If outside own country, any litigation can become prohibitively expensive
Data stored in the U.S. is subject to U.S. law, for example:
  USA PATRIOT Act: the US government’s authority extends to compelling disclosure of records held by cloud providers

40 Challenges of Datasets over Multiple Clouds
Interesting datasets might be available in different clouds
  Different cloud providers
  Private or public clouds
Services mashing up datasets
  Inevitably crossing clouds
Federated cloud architectures

41 Growth Challenges
Challenge: Opportunity
Data transfer bottlenecks: FedEx-ing disks, Data Backup/Archival
Performance unpredictability: Improved VM support, flash memory, scheduling VMs
Scalable storage: Invent scalable store
Bugs in large distributed systems: Invent Debugger that relies on Distributed VMs
Scaling quickly: Invent Auto-Scaler that relies on ML; Snapshots

42 Challenges and Opportunities
∙Data transfer bottlenecks
Obstacles: transferring large data sets is expensive.
  e.g. Ship 10 TB from UC Berkeley to Amazon
  Bandwidth: 20 Mbit/s
  Time: more than 45 days
  Money: $1000
Opportunities:
  Ship disks.
  Make it attractive to keep data in the cloud.
  Reduce the cost of WAN bandwidth.
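The 45-day figure follows from the bandwidth alone. A quick check, assuming "20 M/s" on the slide means 20 megabits per second and decimal terabytes; the $100 per TB rate is an assumption implied by the slide's $1000 for 10 TB, and the function name `transfer_days` is hypothetical.

```python
def transfer_days(size_tb, mbit_per_s):
    """Days needed to push size_tb terabytes over a link of mbit_per_s."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (mbit_per_s * 1e6)
    return seconds / 86400


ASSUMED_USD_PER_TB = 100  # implied by $1000 for 10 TB on the slide

days = transfer_days(10, 20)
cost = 10 * ASSUMED_USD_PER_TB
print(round(days, 1), cost)  # 46.3 1000
```

At 4,000,000 seconds of sustained transfer, shipping disks overnight is faster by more than a month, which is the opportunity the slide points to.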

43 Growth Challenges
Challenge: Opportunity
Data transfer bottlenecks: FedEx-ing disks, Data Backup/Archival
Performance unpredictability: Improved VM support, flash memory, scheduling VMs
Scalable storage: Invent scalable store
Bugs in large distributed systems: Invent Debugger that relies on Distributed VMs
Scaling quickly: Invent Auto-Scaler that relies on ML; Snapshots

44 Real-time Bidding (Ads)

45 Algorithms on Big Data
Working on “Big Data”
  Data mining
  Machine learning
  Visualization
Traditionally, algorithms assume data is in flat files or relational databases
Distributed data organization poses new challenges
  Redesign algorithms
  Redesign frameworks

46 Policy and Business Challenges
Challenge: Opportunity
Reputation Fate Sharing: Offer reputation-guarding services like those for email
Software Licensing: Pay-for-use licenses; Bulk use sales

47 Come to the Dark Side
Spam as a service
Crimeware as a service
Password cracking cloud
DoS attack as a service
How likely is the risk?
  Buy services using stolen credit card numbers
  Create EC2 instances using stolen keys
  Attack authentication (SOAP, XML; XML wrapping attacks)
  Hijack cloud infrastructure

48 Botnet as a Service

49 C & C Activities
Source: “2013 Global Threat Intelligence Report (GTIR)”

50 Underground E-shop selling access to malware-infected hosts

51 Botnet prices (Trend Micro)
DDoS attacks
Spamming (email, social networks)
Covert channel for information exchange
PsyOPS in social networks
Bitcoins

