Introduction to cloud computing Jiaheng Lu Department of Computer Science Renmin University of China www.jiahenglu.net.

Slides:



Advertisements
Similar presentations
Client/Server Computing (the wave of the future) Rajkumar Buyya School of Computer Science & Software Engineering Monash University Melbourne, Australia.
Advertisements

Introduction to cloud computing Jiaheng Lu Department of Computer Science Renmin University of China
Advanced data management Jiaheng Lu Department of Computer Science Renmin University of China
Database Architectures and the Web
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 9 Distributed Systems Architectures Slide 1 1 Chapter 9 Distributed Systems Architectures.
Lecturer: Sebastian Coope Ashton Building, Room G.18 COMP 201 web-page: Lecture.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.
Distributed Processing, Client/Server, and Clusters
Technical Architectures
Introduction to Distributed Systems
Distributed Systems Architectures
ABCSG - Distributed Database 1 Data Management Distributed Database Data Replication.
Overview Distributed vs. decentralized Why distributed databases
EEC-681/781 Distributed Computing Systems Lecture 3 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Ch 12 Distributed Systems Architectures
The Architecture of Transaction Processing Systems
Chapter 12 Distributed Database Management Systems
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 17 Client-Server Processing, Parallel Database Processing,
Chapter 9: The Client/Server Database Environment
Definition of terms Definition of terms Explain business conditions driving distributed databases Explain business conditions driving distributed databases.
Distributed Systems Architecture Presentation II Presenters Rose Kit & Turgut Tezir.
Client/Server Architecture
1 © Prentice Hall, 2002 The Client/Server Database Environment.
DATABASE MANAGEMENT SYSTEMS 2 ANGELITO I. CUNANAN JR.
Module 14: Scalability and High Availability. Overview Key high availability features available in Oracle and SQL Server Key scalability features available.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 12 Slide 1 Distributed Systems Design 1.
Lecture The Client/Server Database Environment
 Distributed Software Chapter 18 - Distributed Software1.
The Client/Server Database Environment
Chapter 3 Database Architectures and the Web Pearson Education © 2009.
Client/Server Architectures
Cloud Computing for the Enterprise November 18th, This work is licensed under a Creative Commons.
12 1 Chapter 12 Distributed Database Management Systems Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 12 Slide 1 Distributed Systems Architectures.
MBA 664 Database Management Systems Dave Salisbury ( )
Database Architectures and the Web Session 5
1 소프트웨어공학 강좌 Chap 9. Distributed Systems Architectures - Architectural design for software that executes on more than one processor -
Lecture 3: Sun: 16/4/1435 Distributed Computing Technologies and Middleware Lecturer/ Kawther Abas CS- 492 : Distributed system.
Source: George Colouris, Jean Dollimore, Tim Kinderberg & Gordon Blair (2012). Distributed Systems: Concepts & Design (5 th Ed.). Essex: Addison-Wesley.
Unit – I CLIENT / SERVER ARCHITECTURE. Unit Structure  Evolution of Client/Server Architecture  Client/Server Model  Characteristics of Client/Server.
Database Systems: Design, Implementation, and Management Tenth Edition Chapter 12 Distributed Database Management Systems.
Week 5 Lecture Distributed Database Management Systems Samuel ConnSamuel Conn, Asst Professor Suggestions for using the Lecture Slides.
1 Introduction to Middleware. 2 Outline What is middleware? Purpose and origin Why use it? What Middleware does? Technical details Middleware services.
Mainframe (Host) - Communications - User Interface - Business Logic - DBMS - Operating System - Storage (DB Files) Terminal (Display/Keyboard) Terminal.
Personal Computer - Stand- Alone Database  Database (or files) reside on a PC - on the hard disk.  Applications run on the same PC and directly access.
Database Architectures Database System Architectures Considerations – Data storage: Where do the data and DBMS reside? – Processing: Where.
SEMINOR. INTRODUCTION 1. Middleware is connectivity software that provides a mechanism for processes to interact with other processes running on multiple.
DISTRIBUTED COMPUTING Introduction Dr. Yingwu Zhu.
Lecture 22: Client-Server Software Engineering
Oracle's Distributed Database Bora Yasa. Definition A Distributed Database is a set of databases stored on multiple computers at different locations and.
1 Distributed Databases BUAD/American University Distributed Databases.
Distributed System Architectures Yonsei University 2 nd Semester, 2014 Woo-Cheol Kim.
6.894: Distributed Operating System Engineering Lecturers: Frans Kaashoek Robert Morris
Lecture 4 Mechanisms & Kernel for NOSs. Mechanisms for Network Operating Systems  Network operating systems provide three basic mechanisms that support.
CSC 480 Software Engineering Lecture 17 Nov 4, 2002.
Em Spatiotemporal Database Laboratory Pusan National University File Processing : Database Management System Architecture 2004, Spring Pusan National University.
Introduction to Distributed Platforms
Database Architectures and the Web
The Client/Server Database Environment
Distributed system (Lecture 02)
The Client/Server Database Environment
CSC 480 Software Engineering
Chapter 9: The Client/Server Database Environment
Database Architectures and the Web
#01 Client/Server Computing
Chapter 17: Database System Architectures
Mobile Agents.
Database System Architectures
#01 Client/Server Computing
Presentation transcript:

Introduction to cloud computing Jiaheng Lu Department of Computer Science Renmin University of China

Cloud computing

Review: What is cloud computing? Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a serve over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.

Review: Characteristics of cloud computing Virtual. software, databases, Web servers, operating systems, storage and networking as virtual servers. On demand. add and subtract processors, memory, network bandwidth, storage.

IaaS Infrastructure as a Service PaaS Platform as a Service SaaS Software as a Service Review: Types of cloud service

Any question and any comments ?

Google App Engine

10 Review: Google App Engine Does one thing well: running web apps Simple app configuration Scalable Secure

Any question and any comments ?

Distributed system

A distributed system A B C D

client-server system Server Client

multiple servers Server

Applications Amazon Google Airline reservation systems Finance: e-transactions, e-banking, stock-exchange Military

Why distributed systems? What are the advantages? distributed vs centralized? multi-servervs client-server?

Why distributed systems? What are the advantages? distributed vs centralized? multi-servervs client-server? Geography Concurrency => Speed High-availability (if failures occur).

Why not distributed systems? What are the disadvantages? distributed vs centralized? multi-servervs client-server?

Why not distributed systems? What are the disadvantages? distributed vs centralized? multi-servervs client-server? Expensive (to have redundancy) Concurrency => Interleaving => Bugs Failures lead to incorrectness.

Outline Concepts and Terminology What is Distributed Distributed data & objects Distributed execution Three tier architectures Transaction concepts

What’s a Distributed System? Centralized: everything in one place stand-alone PC or Mainframe Distributed: some parts remote distributed users distributed execution distributed data

Why Distribute? No best organization Companies constantly swing between Centralized: focus, control, economy Decentralized: adaptive, responsive, competitive Why distribute? reflect organization or application structure empower users / producers improve service (response / availability) distributed load use PC technology (economics)

What Should Be Distributed? Users and User Interface Thin client Processing Trim client Data Fat client Will discuss tradeoffs later Database Business Objects workflow Presentation

Transparency in Distributed Systems Make distributed system as easy to use and manage as a centralized system Give a Single-System Image Location transparency: hide fact that object is remote hide fact that object has moved hide fact that object is partitioned or replicated Name doesn’t change if object is replicated, partitioned or moved.

Naming- The basics Objects have Globally Unique Identifier (GUIDs) location(s) = address(es) name(s) addresses can change objects can have many names Names are context dependent: KGB  CIA) Many naming systems UNC: \\node\device\dir\dir\dir\object Internet: LDAP: ldap://ldap.domain.root/o=org,c=US,cn=dir guid Jim Address James

Name Servers in Distributed Systems Name servers translate names + context to address (+ GUID) Name servers are partitioned (subtrees of name space) Name servers replicate root of name tree Name servers form a hierarchy Distributed data from hell: high read traffic high reliability & availability autonomy root North South Southern names Northern names root

Autonomy in Distributed Systems Owner of site (or node, or application, or database) Wants to control it If my part is working, must be able to access & manage it (reorganize, upgrade, add user,…) Autonomy is Essential Difficult to implement. Conflicts with global consistency examples: naming, authentication, admin…

Security The Basics Authentication server subject + Authenticator => (Yes + token) | No Security matrix: who can do what to whom Access control list is column of matrix “who” is authenticated ID In a distributed system, “who” and “what” and “whom” are distributed objects subject Object Permissions

Security in Distributed Systems Security domain: nodes with a shared security server. Security domains can have trust relationships: A trusts B: A “believes” B when it says this is Security domains form a hierarchy. Delegation: passing authority to a server when A asks B to do something (e.g. print a file, read a database) B may need A’s authority Autonomy requires: each node is an authenticator each node does own security checks Internet Today: no trust among domains (fire walls, many passwords) trust based on digital signatures

Clusters The Ideal Distributed System. Cluster is distributed system BUT single location manager security policy relatively homogeneous communications is high bandwidth low latency low error rate Clusters use distributed system techniques for load distribution storage execution growth fault tolerance

Cluster: Shared What? Shared Memory Multiprocessor Multiple processors, one memory all devices are local DEC or SGI or Sequent 16x nodes Shared Disk Cluster an array of nodes all shared common disks VAXcluster + Oracle Shared Nothing Cluster each device local to a node ownership may change Tandem, SP2, Wolfpack

Outline Concepts and Terminology Why Distribute Distributed data & objects Partitioned Replicated Distributed execution Three tier architectures Transaction concepts

Partitioned Data Break file into disjoint groups Exploit data access locality Put data near consumer Less network traffic Better response time Better availability Owner controls data autonomy Spread Load data or traffic may exceed single store Orders N.A. S.A. Europe Asia

How to Partition Data? How to Partition by attribute or random or by source or by use Problem: to find it must have Directory (replicated) or Algorithm Encourages attribute-based partitioning N.A. S.A. Europe Asia

Replicated Data Place fragment at many sites Pros: +Improves availability +Disconnected (mobile) operation +Distributes load +Reads are cheaper Cons:  N times more updates  N times more storage Placement strategies: Dynamic: cache on demand Static: place specific Catalog

Updating Replicated Data When a replica is updated, how do changes propagate? Master copy, many slave copies (SQL Server) always know the correct value (master) change propagation can be transactional as soon as possible periodic on demand Symmetric, and anytime (Access) allows mobile (disconnected) updates updates propagated ASAP, periodic, on demand non-serializable colliding updates must be reconciled. hard to know “real” value

Outline Concepts and Terminology Why Distribute Distributed data & objects Partitioned Replicated Distributed execution remote procedure call queues Three tier architectures Transaction concepts

Distributed Execution Threads and Messages Thread is Execution unit (software analog of cpu+memory) Threads execute at a node Threads communicate via Shared memory (local) Messages (local and remote) threads shared memory messages

Peer-to-Peer or Client-Server Peer-to-Peer is symmetric: Either side can send Client-server client sends requests server sends responses simple subset of peer-to-peer request response

Connection-less or Connected Connection-less request contains client id client context work request client authenticated on each message only a single response message e.g. HTTP, NFS v1  Connected (sessions)  open - request/reply - close  client authenticated once  Messages arrive in order  Can send many replies (e.g. FTP)  Server has client context (context sensitive)  e.g. Winsock and ODBC  HTTP adding connections

Remote Procedure Call: The key to transparency Object may be local or remote Methods on object work wherever it is. Local invocation y = pObj->f(x); f() x val y = val; return val;

Remote Procedure Call: The key to transparency Remote invocation Obj Local? x val y = val; f() return val; y = pObj->f(x); marshal un marshal un marshal x proxy un marshal pObj->f(x) marshal x stub Obj Local? x val f() return val; val

Transaction Object Request Broker (ORB) Registers Servers Manages pools of servers Connects clients to servers Does Naming, request-level authorization, Provides transaction coordination (new feature) Old names: Transaction Processing Monitor, Web server, NetWare Object-Request Broker

Send updates to correct partition Using RPC for Transparency Partition Transparency part Local? x y = pfile->write(x); send to correct partition send to correct partition x un marshal pObj->write(x) marshal x x val write() return val; val

Send updates to EACH node Using RPC for Transparency Replication Transparency x y = pfile->write(x); Send to each replica Send to each replica x val

Outline Concepts and Terminology Why Distributed Distributed data & objects Distributed execution remote procedure call queues Three tier architectures what why Transaction concepts

Client/Server Interactions All can be done with RPC Request-Response response may be many messages Conversational server keeps client context Dispatcher three-tier: complex operation at server Queued de-couples client from server allows disconnected operation CS CS C S S S C S S

Queued Request/Response Time-decouples client and server Three Transactions Almost real time, ASAP processing Communicate at each other’s convenience Allows mobile (disconnected) operation Disk queues survive client & server failures Client Server Submit Perform Response

Why Queued Processing? Prioritize requests ambulance dispatcher favors high-priority calls Manage Workflows Deferred processing in mobile apps Interface heterogeneous systems EDI, MOM: Message-Oriented-Middleware DAD: Direct Access to Data OrderBuildShipInvoicePay

Work Distribution Spectrum Presentation and plug-ins Workflow manages session & invokes objects Business objects Database Fat Thin Fat Thin Database Business Objects workflow Presentation

PC Evolution to Three Tier Intelligence migrated to server Stand-alone PC (centralized) PC + File & print server message per I/O PC + Database server message per SQL statement PC + App server message per transaction ActiveX Client, ORB ActiveX server, Xscript disk I/O IO request reply SQL Statement Transaction

The Pattern: Three Tier Computing Clients do presentation, gather input Clients do some workflow (Xscript) Clients send high-level requests to ORB (Object Request Broker) ORB dispatches workflows and business objects -- proxies for client, orchestrate flows & queues Server-side workflow scripts call on distributed business objects to execute task Database Business Objects workflow Presentation

The Three Tiers Web Client HTML VB or Java Script Engine VB or Java Virt Machine VBscritpt JavaScrpt VB Java plug-ins Internet ORB HTTP+ DCOM Object server Pool Middleware ORB TP Monitor Web Server... DCOM (oleDB, ODBC,...) Object & Data server. LU6.2 IBM Legacy Gateways

Why Did Everyone Go To Three-Tier? Manageability Business rules must be with data Middleware operations tools Performance (scalability) Server resources are precious ORB dispatches requests to server pools Technology & Physics Put UI processing near user Put shared data processing near shared data Database Business Objects workflow Presentation

What Middleware Does ORB, TP Monitor, Workflow Mgr, Web Server Registers transaction programs workflow and business objects (DLLs) Pre-allocates server pools Provides server execution environment Dynamically checks authority (request-level security) Does parameter binding Dispatches requests to servers parameter binding load balancing Provides Queues Operator interface

Server Side Objects Easy Server-Side Execution ORB gives simple execution environment Object gets start invoke shutdown Everything else is automatic Drag & Drop Business Objects Network Thread Pool Queue Connections Context Security Shared Data Receiver Synchronization Service logic Configuration Management A Server