Lecture 18: Scalable Web Services

Licensed under Creative Commons Attribution Sharealike Noncommercial
Presentation transcript:

COSC6376 Cloud Computing
Lecture 18: Scalable Web Services
Instructor: Weidong Shi (Larry), PhD
Computer Science Department, University of Houston

Outline
- Scalable services

Building scalable web services
- A relatively easy problem? Why?
  - HTTP is a stateless, request-response protocol
  - Requests are decoupled and independent
- How?
  - Divide and conquer
  - Replicate, partition, distribute, load balance
- In the end you want to build a Web 2.0 app that can serve millions of users with ZERO downtime

The Variables
- Scalability: number of users / sessions / transactions / operations the entire system can support
- Performance: optimal utilization of resources
- Responsiveness: time taken per operation
- Availability: probability of the application, or a portion of it, being available at any given point in time
- Downtime impact: the impact of downtime of a server / service / resource (number of users affected, type of impact, etc.)
- Cost
- Maintenance effort
Goal: high scalability, availability, performance and responsiveness; low downtime impact, cost and maintenance effort.
Source: Building a Scalable Architecture for Web Apps. Bhavin Turakhia

The Factors
- Platform selection
- Hardware
- Application design
- Database / datastore structure and architecture
- Deployment architecture
- Storage architecture
- Abuse prevention
- Monitoring mechanisms
- … and more

Let's Start …
We will now build an example architecture for an example app using the following iterative, incremental steps:
- Inspect the current architecture
- Identify scalability bottlenecks
- Identify SPOFs (single points of failure) and availability issues
- Identify downtime impact risk zones
- Apply one of: vertical scaling, vertical partitioning, horizontal scaling, horizontal partitioning
- Repeat the process

Step 1 – Let's Start …
[Diagram: a single node hosting both the app server and the DB server]

Step 2 – Vertical Scaling
[Diagram: the single app server / DB server node with additional CPU and RAM]

Step 2 – Vertical Scaling
- Introduction
  - Increasing the hardware resources without changing the number of nodes
  - Referred to as "scaling up" the server
- Advantages
  - Simple to implement
- Disadvantages
  - Finite limit
  - Hardware does not scale linearly (diminishing returns for each incremental unit)
  - Requires downtime
  - Increases downtime impact
  - Incremental costs rise steeply as the hardware gets larger
[Diagram: the app server / DB server node with four CPUs and four RAM modules]

Step 3 – Vertical Partitioning (Services)
- Introduction
  - Deploying each service on a separate node
- Positives
  - Increases per-application availability
  - Task-based specialization, optimization and tuning become possible
  - Better cache performance
  - Reduces context switching
  - No changes to the app required
  - Flexibility increases
- Example: www.blah.com, mail.blah.com, images.blah.com, shopping.blah.com, my.blah.com, etc.
[Diagram: AppServer and DBServer on separate nodes]

Vertical Partitioning (Services)
- Disadvantages
  - Lower peak capacity
  - Sub-optimal resource utilization
  - Coarse load balancing across servers / services
  - Finite scalability
  - Increased management costs

Understanding Vertical Partitioning
The term Vertical Partitioning denotes:
- An increase in the number of nodes by distributing the tasks / functions
- Each node (or cluster) performs separate tasks
- Each node (or cluster) is different from the others
Vertical partitioning can be performed at various layers (app / server / data / hardware, etc.).

Step 4 – Horizontal Scaling (App Server)
- Introduction
  - Increasing the number of app server nodes through load balancing
  - Referred to as "scaling out" the app server
[Diagram: a load balancer in front of three AppServer nodes, all backed by one DBServer]

Understanding Horizontal Scaling
The term Horizontal Scaling denotes:
- An increase in the number of nodes by replicating the nodes
- Each node performs the same tasks
- Each node is identical
- The collection of nodes is typically known as a cluster
- Also referred to as "scaling out"
Horizontal scaling can be performed for any particular type of node (AppServer / DBServer, etc.).
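
To make the idea of identical, interchangeable nodes concrete, here is a minimal sketch of a round-robin dispatcher. The slides do not prescribe a balancing policy; round robin, the class name and the node names are assumptions chosen for illustration.

```python
import itertools


class RoundRobinBalancer:
    """Hands out identical app-server nodes in rotation (illustrative only)."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        self._cycle = itertools.cycle(self._nodes)

    def next_node(self):
        # Every node performs the same tasks, so any node can take any request.
        return next(self._cycle)


# Hypothetical node names.
balancer = RoundRobinBalancer(["app1", "app2", "app3"])
picked = [balancer.next_node() for _ in range(6)]
print(picked)
```

Because the nodes are identical, adding capacity in this sketch only means passing a longer node list to the balancer.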

Load Balancer – Hardware vs Software
- Hardware load balancers are faster
- Software load balancers are more customizable
- With HTTP servers, load balancing is typically combined with HTTP accelerators

Load Balancer – Session Management: Sticky Sessions
- Requests for a given user are sent to a fixed app server
- Observations
  - Asymmetrical load distribution
  - Downtime impact: loss of session data
[Diagram: User 1 and User 2 each pinned by the load balancer to one of three AppServer nodes]
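
One common way to pin a user to a fixed server is to hash a stable identifier. The sketch below assumes a hypothetical list of node names and uses SHA-256 only as a stable hash; real load balancers often use cookies or source IPs instead.

```python
import hashlib

# Hypothetical node names.
APP_SERVERS = ["app1", "app2", "app3"]


def sticky_server(user_id: str, servers=APP_SERVERS) -> str:
    """Map a user to a fixed server using a stable hash of the user id.

    Python's built-in hash() is salted per process, so a cryptographic
    digest is used here to get the same answer on every run.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


first = sticky_server("user-1")
second = sticky_server("user-1")
print(first == second)  # the same user always lands on the same node
```

The sketch also shows where the slide's observations come from: the mapping ignores current load, and if the chosen node goes down the session data held on it is gone.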

Load Balancer – Session Management: Central Session Storage
- All app servers read and write sessions in a central session store
- Introduces an SPOF
- Adds an additional variable to the system
- Session reads and writes generate disk and network I/O
- Also known as a shared session store cluster
[Diagram: a load balancer in front of three AppServer nodes that share one Session Store]
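
A minimal sketch of the central-store pattern follows. The in-memory `SessionStore` class is a stand-in for what would really be a separate networked service; the class and method names are assumptions, not part of the slides.

```python
import copy


class SessionStore:
    """Stand-in for a central session store shared by all app servers."""

    def __init__(self):
        self._sessions = {}

    def load(self, session_id):
        # deepcopy mimics the serialization a network round trip would do.
        return copy.deepcopy(self._sessions.get(session_id, {}))

    def save(self, session_id, data):
        self._sessions[session_id] = copy.deepcopy(data)


class AppServer:
    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.load(session_id)      # one read per request
        session.setdefault("cart", []).append(item)
        self.store.save(session_id, session)       # one write per request
        return session


store = SessionStore()
app1, app2 = AppServer("app1", store), AppServer("app2", store)
app1.add_to_cart("s1", "book")
result = app2.add_to_cart("s1", "pen")  # a different node sees the same session
print(result)
```

Any node can serve any request, at the price of a read and a write against the store on every request and a dependency on the store staying up.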

Load Balancer – Session Management: Clustered Session Management
- Easier to set up
- No SPOF
- Network I/O grows rapidly (faster than linearly) as the number of nodes increases
- In very rare circumstances a request may get stale session data:
  - The user's request reaches a subsequent node faster than the inter-node message
  - Inter-node communication fails
[Diagram: a load balancer in front of three AppServer nodes that replicate sessions among themselves]
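
The network cost of clustered sessions can be estimated with a small calculation. It assumes, beyond what the slide states, that every node broadcasts each session write to every other node and that each node produces the same number of writes; under those assumptions the message count grows with the square of the node count.

```python
def replication_messages(nodes: int, writes_per_node: int) -> int:
    """Messages sent when every node broadcasts each session write to all peers."""
    if nodes < 1:
        raise ValueError("need at least one node")
    # Each of the `nodes` nodes sends every one of its writes to (nodes - 1) peers.
    return nodes * writes_per_node * (nodes - 1)


for n in (2, 4, 8, 16):
    print(n, replication_messages(n, writes_per_node=100))
```

Doubling the cluster from 8 to 16 nodes more than quadruples the traffic in this model, which is why the approach suits small clusters with few session writes.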

Load Balancer – Session Management: Recommendation
- Use clustered session management if you have:
  - A smaller number of app servers
  - Fewer session writes
- Use a central session store elsewhere
- Use sticky sessions only if you have to

Load Balancer – Removing SPOF
- In a load-balanced app server cluster, the LB itself is an SPOF
- The diagrams show two ways to remove it: an Active-Passive LB pair and an Active-Active LB pair
- Active-Active nevertheless assumes that each LB can independently take up the load of the other
- If one wants ZERO downtime, Active-Active becomes truly cost-beneficial only if multiple LBs (more than 3 to 4) are daisy-chained as Active-Active, forming an LB cluster
[Diagrams: users reach three AppServer nodes through two load balancers, once as Active-Passive and once as Active-Active]
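
The Active-Passive arrangement can be sketched as a simple selection rule. The `LoadBalancer` class and its `healthy` flag are hypothetical; real deployments rely on heartbeats and a floating IP or similar mechanism rather than a function call.

```python
class LoadBalancer:
    def __init__(self, name):
        self.name = name
        self.healthy = True  # would be set by a health check in practice


def active_lb(primary, standby):
    """Active-Passive: traffic goes to the primary unless its health check fails."""
    if primary.healthy:
        return primary
    if standby.healthy:
        return standby
    raise RuntimeError("both load balancers are down")


lb1, lb2 = LoadBalancer("lb1"), LoadBalancer("lb2")
before = active_lb(lb1, lb2).name
lb1.healthy = False            # simulate failure of the active LB
after = active_lb(lb1, lb2).name
print(before, after)
```

In this model the standby carries no traffic until failover, whereas Active-Active would have both LBs serving requests at once.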

Step 4 – Horizontal Scaling (App Server): at the end of Step 4
- Positives
  - Increases availability and scalability
  - No changes to the app required
  - Easy setup
- Negatives
  - Finite scalability
[Diagram: load-balanced app servers backed by a single DBServer]

Step 5 – Vertical Partitioning (Hardware)
- Introduction
  - Partitioning out the storage function using a SAN
- Positives
  - Allows "scaling up" the DB server
  - Boosts performance of the DB server
- Negatives
  - Increases cost
[Diagram: load-balanced app servers, a DBServer, and a SAN behind it]

Step 6 – Horizontal Scaling (DB)
- Introduction
  - Increasing the number of DB nodes
  - Referred to as "scaling out" the DB server
- Options
  - Shared-nothing cluster
  - Real Application Cluster (or shared-storage cluster)
[Diagram: load-balanced app servers, three DBServer nodes, and a SAN]

Shared Nothing Cluster
- Each DB server node has its own complete copy of the database
- Nothing is shared between the DB server nodes
- This is achieved through DB replication at the DB / driver / app level, or through a proxy
- Supported by most database software natively or through third-party software
- Note: the actual DB files may be stored on a central SAN
[Diagram: three DBServer nodes, each with its own Database]

Replication Considerations: Master-Slave vs Multi-Master
- Master-Slave
  - Writes are sent to a single master, which replicates the data to multiple slave nodes
  - Replication may be cascaded
  - Simple setup
  - No conflict management required
- Multi-Master
  - Writes can be sent to any of the multiple masters, which replicate them to the other masters and slaves
  - Conflict management required
  - Deadlocks are possible if the same data is modified simultaneously in multiple places

Replication Considerations: Asynchronous
- Guaranteed, but out-of-band, replication from master to slave
- The master updates its own DB and returns a response to the client
- Replication from master to slave takes place asynchronously
- Faster response to the client
- Slave data is marginally behind the master
- Requires modification to the app to send critical reads and writes to the master, and load balance all other reads
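
The lag between master and slave can be shown with a toy in-memory model. The classes below are illustrative assumptions; the explicit `replicate()` call stands in for the background shipping of changes that a real database performs.

```python
from collections import deque


class Slave:
    def __init__(self):
        self.data = {}


class AsyncMaster:
    def __init__(self, slaves):
        self.data = {}
        self.slaves = slaves
        self._pending = deque()  # changes not yet shipped to the slaves

    def write(self, key, value):
        self.data[key] = value
        self._pending.append((key, value))
        return "ok"  # returned before any slave has the change

    def replicate(self):
        """Out-of-band step; runs in the background in a real system."""
        while self._pending:
            key, value = self._pending.popleft()
            for slave in self.slaves:
                slave.data[key] = value


slave = Slave()
master = AsyncMaster([slave])
master.write("balance", 100)
stale = slave.data.get("balance")   # slave has not caught up yet
fresh = master.data["balance"]      # a critical read goes to the master
master.replicate()
caught_up = slave.data["balance"]
print(stale, fresh, caught_up)
```

The window between `write()` and `replicate()` is the reason critical reads are routed to the master.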

Replication Considerations: Synchronous
- Guaranteed, in-band replication from master to slave
- The master updates its own DB and confirms that all slaves have updated theirs before returning a response to the client
- Slower response to the client
- Slaves have the same data as the master at all times
- Requires modification to the app to send writes to the master and load balance all reads
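
For contrast, a toy model of in-band replication is sketched here. The class names are assumptions, and the sketch deliberately omits what a real system needs when a slave fails midway (rollback or a commit protocol).

```python
class Slave:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value
        return True  # acknowledgement sent back to the master


class SyncMaster:
    def __init__(self, slaves):
        self.data = {}
        self.slaves = slaves

    def write(self, key, value):
        self.data[key] = value
        # In-band: wait for every slave to acknowledge before replying.
        for slave in self.slaves:
            if not slave.apply(key, value):
                raise RuntimeError("slave did not confirm the write")
        return "ok"


slaves = [Slave(), Slave()]
master = SyncMaster(slaves)
status = master.write("balance", 100)
replica_values = [s.data["balance"] for s in slaves]
print(status, replica_values)
```

Once `write()` returns, every slave already holds the value, so any read can be load balanced; the cost is that each write waits for the slowest slave.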

Replication Considerations: RDBMS Level vs Driver Level
- Replication at RDBMS level
  - Support may exist in the RDBMS or through a third-party tool
  - Faster and more reliable
  - The app must send writes to the master, reads to any DB, and critical reads to the master
- Replication at driver level
  - The driver layer ensures writes are performed on all connected DBs
  - Reads are load balanced
  - Critical reads are sent to a master
  - In most cases RDBMS-agnostic
  - Slower and in some cases less reliable

Real Application Cluster
- All DB servers in the cluster share a common storage area on a SAN
- All DB servers mount the same block device
- The filesystem must be a clustered file system (e.g. GFS)
- Currently supported only by Oracle Real Application Clusters
- Can be very expensive (licensing fees)
[Diagram: three DBServer nodes sharing one Database on a SAN]

Recommendation
- Try to choose a DB that natively supports master-slave replication
- Use master-slave asynchronous replication
- Write your application's database layer to ensure that:
  - Writes are sent to a single DB
  - Reads are load balanced
  - Critical reads are sent to a master
[Diagram: load-balanced app servers send writes and critical reads to one DBServer and other reads to the remaining DBServer nodes]
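
The routing rule in this recommendation fits in a few lines. The sketch below is one possible shape for such a layer; the class name, the `critical` flag and the server names are assumptions made for illustration.

```python
import itertools


class ReplicationRouter:
    """Chooses a DB node per operation under master-slave replication."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master for reads if there are no slaves.
        self._reads = itertools.cycle(list(slaves) or [master])

    def route(self, operation, critical=False):
        if operation == "write" or critical:
            return self.master          # writes and critical reads
        if operation == "read":
            return next(self._reads)    # other reads are load balanced
        raise ValueError(f"unknown operation: {operation!r}")


# Hypothetical server names.
router = ReplicationRouter("db-master", ["db-slave1", "db-slave2"])
targets = [
    router.route("write"),
    router.route("read", critical=True),
    router.route("read"),
    router.route("read"),
    router.route("read"),
]
print(targets)
```

Keeping the rule in one place means the rest of the application only has to label a read as critical or not.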

Step 6 – Horizontal Scaling (DB): our architecture now looks like this
- Positives
  - As web servers grow, database nodes can be added
  - The DB server is no longer an SPOF
- Negatives
  - Finite limit
[Diagram: load-balanced app servers, a DB cluster, and a SAN]

Step 7 – Vertical / Horizontal Partitioning (DB)
- Introduction
  - Increasing the number of DB clusters by dividing the data
- Options
  - Vertical partitioning: dividing by tables / columns
  - Horizontal partitioning: dividing by rows (value)
[Diagram: load-balanced app servers, a DB cluster, and a SAN]

Vertical Partitioning (DB)
- Take a set of tables and move them onto another DB
  - E.g. in a social network, the users table and the friends table can be on separate DB clusters
- Each DB cluster has different tables
- Application code, driver code, or a proxy knows where a given table is and directs queries to the appropriate DB
- Can also be done at column level by moving a set of columns into a separate table
[Diagram: an app cluster in front of DB Cluster 1 (Tables 1 and 2) and DB Cluster 2 (Tables 3 and 4)]

Vertical Partitioning (DB)
- Negatives
  - One cannot perform SQL joins or maintain referential integrity across tables that live on different clusters
  - Finite limit
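
A small sketch of table-level routing, and of what replaces a SQL join once two tables sit on different clusters. The cluster names and the in-memory `USERS` and `FRIENDS` data are made up for illustration and stand in for real queries.

```python
# Hypothetical placement of tables on clusters.
TABLE_TO_CLUSTER = {
    "users": "db-cluster-1",
    "friends": "db-cluster-2",
}


def cluster_for_table(table: str) -> str:
    """The app, driver, or proxy consults a map like this before each query."""
    return TABLE_TO_CLUSTER[table]


# Stand-ins for the two tables.
USERS = {1: "alice", 2: "bob", 3: "carol"}   # lives on db-cluster-1
FRIENDS = [(1, 2), (1, 3)]                    # (user_id, friend_id) on db-cluster-2


def friend_names(user_id: int):
    """An application-side 'join': one query per cluster, combined in code."""
    friend_ids = [f for (u, f) in FRIENDS if u == user_id]   # query cluster 2
    return [USERS[f] for f in friend_ids]                    # query cluster 1


print(cluster_for_table("friends"), friend_names(1))
```

The database can no longer enforce that every `friend_id` exists in `users`, so that check also moves into application code.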

Horizontal Partitioning (DB)
- Take a set of rows and move them onto another DB
  - E.g. in a social network, each DB cluster can contain all data for 1 million users
- Each DB cluster has identical tables
- Application code, driver code, or a proxy knows where a given row is and directs queries to the appropriate DB
- Negatives
  - SQL unions for search-type queries must be performed in code
[Diagram: an app cluster in front of DB Cluster 1 and DB Cluster 2, each holding Tables 1 to 4 for 1 million users]

Horizontal Partitioning (DB): Techniques
- FCFS: the first million users are stored on cluster 1 and the next million on cluster 2
- Round robin
- Least used (balanced): each time a new user is added, the DB cluster with the fewest users is chosen
- Hash based: a hashing function determines the DB cluster into which the user's data is inserted
- Value based: user ids 1 to 1 million are stored on cluster 1, or all users with names starting with A-M are stored on cluster 1
- Except for hash-based and value-based, all techniques also require an independent lookup map from user to DB cluster
- This map will itself be stored on a separate DB (which may in turn need to be replicated)
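
Three of these techniques are sketched below to show why only some of them need a lookup map. The cluster names, the choice of SHA-256 as the hash, and the `LeastUsedPlacer` class are assumptions for illustration.

```python
import hashlib

# Hypothetical cluster names.
CLUSTERS = ["cluster1", "cluster2", "cluster3"]


def hash_based(user_id: int, clusters=CLUSTERS) -> str:
    """The cluster is computed from the id, so no lookup map is needed."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return clusters[int.from_bytes(digest[:8], "big") % len(clusters)]


def value_based(user_id: int, users_per_cluster=1_000_000, clusters=CLUSTERS) -> str:
    """Ids 1..1,000,000 go to the first cluster, the next million to the second."""
    return clusters[(user_id - 1) // users_per_cluster]


class LeastUsedPlacer:
    """Places each new user on the emptiest cluster and records where."""

    def __init__(self, clusters=CLUSTERS):
        self.counts = {c: 0 for c in clusters}
        self.lookup = {}  # the independent lookup map: user id -> cluster

    def add_user(self, user_id: int) -> str:
        cluster = min(self.counts, key=self.counts.get)
        self.counts[cluster] += 1
        self.lookup[user_id] = cluster
        return cluster

    def find(self, user_id: int) -> str:
        return self.lookup[user_id]  # placement cannot be recomputed from the id


placer = LeastUsedPlacer()
placements = [placer.add_user(uid) for uid in (10, 11, 12, 13)]
print(value_based(1_000_001), placements, placer.find(12))
```

Note that the simple modulo hash shown here remaps most users when the number of clusters changes, which is one reason a lookup map can still be attractive.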

Step 7 – Vertical / Horizontal Partitioning (DB): our architecture now looks like this
- Positives
  - As app servers grow, database clusters can be added
- Note: this is not the same as the table partitioning provided by the DB (e.g. MSSQL)
- We may actually want to further segregate these into Sets, each serving a collection of users (see the next slide)
[Diagram: load-balanced app servers, a lookup map, two DB clusters, and a SAN]

Step 8 – Separating Sets
- Now we consider each deployment as a single Set serving a collection of users
[Diagram: a Global Redirector with a Global Lookup Map in front of SET 1 and SET 2 (10 million users each); each Set has its own load-balanced app servers, lookup map, DB clusters, and SAN]

Creating Sets
- The goal behind creating Sets is easier manageability
- Each Set is independent and handles transactions for a set of users
- Each Set is architecturally identical to the others
- Each Set contains the entire application with all its data structures
- Sets can even be deployed in separate datacenters
- Users may even be added to a Set that is closer to them in terms of network latency
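
A global redirector can be thought of as one more lookup in front of the Sets. The sketch below is an assumption about its shape: the class, the Set names and the example.com URLs are all hypothetical.

```python
class GlobalRedirector:
    """Sends each user to the entry point of the Set that holds their data."""

    def __init__(self, set_entry_points):
        self.set_entry_points = dict(set_entry_points)  # Set name -> URL
        self.global_lookup = {}                         # user id -> Set name

    def assign(self, user_id, set_name):
        if set_name not in self.set_entry_points:
            raise KeyError(f"unknown Set: {set_name!r}")
        self.global_lookup[user_id] = set_name

    def entry_point(self, user_id):
        return self.set_entry_points[self.global_lookup[user_id]]


redirector = GlobalRedirector({
    "set1": "https://set1.example.com",
    "set2": "https://set2.example.com",
})
redirector.assign("alice", "set1")
redirector.assign("bob", "set2")     # e.g. the Set closest to the user
alice_url = redirector.entry_point("alice")
bob_url = redirector.entry_point("bob")
print(alice_url, bob_url)
```

Moving a user to another Set then amounts to copying their data and updating one entry in the global lookup map.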

Step 8 – Horizontal Partitioning (Sets): our architecture now looks like this
- Positives
  - Effectively unbounded scalability: more Sets can be added as the user base grows
- Negatives
  - Aggregation of data across Sets is complex
  - Users may need to be moved across Sets if sizing is improper
  - Global app settings and preferences need to be replicated across Sets
[Diagram: a Global Redirector in front of SET 1 and SET 2, each with an app server cluster, two DB clusters, and a SAN]