Scalable Platforms for Web Services EECS600 Internet Applications Michael Rabinovich.

Presentation transcript:

Scalable Platforms for Web Services EECS600 Internet Applications Michael Rabinovich

Web Resource Provisioning Challenge
How many resources to provision?
– Potentially unlimited client populations
How to add capacity quickly?
– In time for serving flash crowds
What to do with extra capacity once the flash crowd ebbs?

Web Resource Types and Scaling Technologies
Resource type: scaling technologies
Static files: caching, CDNs
Dynamic pages: HPP, CSI, ESI, result caching
Internet applications: ???

Utility Computing
Analogous to power grid
– Autonomous resources added to the grid
– Clients’ needs satisfied by transparent work distribution
Started in scientific computing for CPU-intensive tasks
Many difficult challenges in moving to Internet at large
– Security and privacy
– Billing and accounting
– Migration of computation
– Data consistency

Utility Computing for the Web: A Feasible Interim Approach
Confined to a single network/hosting service provider
– Security and privacy simplified by private network
Natural boundaries in computation simplify migration
– Automatic app deployment rather than migration of computation
Typical tiered architectures simplify data consistency

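Tiered Architecture of Internet Applications
[Figure: two tiered deployments spanning the Internet and the corporate network. Three-tier labels: Web gateway, app servers, corporate DB server (Tier 1, Tier 2, Tier 3). Two-tier labels: Web/app server, core app servers, corporate DB server (Tier 1, Tier 2).]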

High-Level View of the Platform
[Figure labels: authoritative DNS, policy engine, request distribution table, usage feedback, telemetry, replica placement instructions, and nodes hosting VIP1 through VIP4.]

Granularity of Resource Sharing
[Figure: same platform diagram as on the High-Level View slide.]
Application-level server sharing
– Efficient
– Complex load predictions
– Poor resource isolation (security, accounting)
Kernel-level server sharing
– Multiple app server processes
Virtual machine sharing
– Runtime overhead (up to 30x slowdown for system calls)
Whole server allocation
– Simple
– Good resource isolation
– Supported by cluster technologies
– Slow server allocation

Major Components
Framework
– Dynamically installing and uninstalling applications
– Maintaining replica consistency
– Supporting stateful client sessions
Algorithms and policies
– Admission control algorithm
– Request distribution algorithm
– Application placement algorithm
Performance and demand monitoring
[Figure labels: authoritative DNS, policy engine.]

Application Server Sharing
Built on top of a standard Web server: Apache + (Fast) CGI
Uses standard HTTP throughout

Replication Framework
Inspired by work on software distribution (e.g., Marimba)
Metafile for each application containing:
– A list of time-stamped files (data and executable files)
– An initialization script (or a pointer to it)
Example metafile:
    FILE /home/applications/mapping/query_engine.cgi 1999.apr.14.08:46:12
    FILE /home/applications/mapping/map_database 2000.oct.15.13:15:59
    FILE /home/applications/mapping/user_preferences 2001.jan.30.18:00:05
    SCRIPT
    mkdir /home/applications/mapping/access_stats
    setenv ACCESS_DIRECTORY /home/applications/mapping/access_stats
    ENDSCRIPT
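The metafile example above can be read by a very small parser. The sketch below is only an illustration: it assumes one record per line (the transcript flattens the layout), paths without spaces, and a hypothetical function name `parse_metafile`; none of this is the authors' code.

```python
from datetime import datetime

# Month abbreviations as they appear in the slide's timestamps (assumed lowercase English).
MONTHS = {m: i + 1 for i, m in enumerate(
    "jan feb mar apr may jun jul aug sep oct nov dec".split())}


def parse_timestamp(text):
    """Parse a stamp such as '1999.apr.14.08:46:12' into a datetime."""
    year, month, day, clock = text.split(".")
    hour, minute, second = (int(part) for part in clock.split(":"))
    return datetime(int(year), MONTHS[month.lower()], int(day), hour, minute, second)


def parse_metafile(text):
    """Return ({path: timestamp}, [script lines]) for a metafile body."""
    files, script, in_script = {}, [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_script:
            if line == "ENDSCRIPT":
                in_script = False
            else:
                script.append(line)
        elif line == "SCRIPT":
            in_script = True
        elif line.startswith("FILE "):
            _, path, stamp = line.split()  # assumes no spaces in the path
            files[path] = parse_timestamp(stamp)
        else:
            raise ValueError(f"unrecognized metafile line: {line!r}")
    return files, script


EXAMPLE_METAFILE = """
FILE /home/applications/mapping/query_engine.cgi 1999.apr.14.08:46:12
FILE /home/applications/mapping/map_database 2000.oct.15.13:15:59
FILE /home/applications/mapping/user_preferences 2001.jan.30.18:00:05
SCRIPT
mkdir /home/applications/mapping/access_stats
setenv ACCESS_DIRECTORY /home/applications/mapping/access_stats
ENDSCRIPT
"""

files, script = parse_metafile(EXAMPLE_METAFILE)
```

Here `files` maps each listed path to its timestamp and `script` holds the two initialization commands in order.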

Application Metafile
A metafile is a simple static Web page
Having a metafile is sufficient to deploy the application
Having the current metafile is sufficient to bring the application replica up-to-date
Consistency of application replicas = consistency of cached copies of the metafile

Framework Tasks
Creating a replica
– Obtaining a metafile
– Obtaining a tar file with all files listed in the metafile
– Running the initialization script
Updating a replica
– Obtaining the diff of metafiles
– Obtaining files that are new
Deleting a replica
– Retaining the deleted replica for some time to process residual requests before physical deletion
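The "updating a replica" step can be sketched as a comparison of two metafile listings. The example below is a hedged illustration: it assumes each metafile has already been reduced to a path-to-timestamp mapping, treats any changed timestamp as "new", and also reports files that disappeared from the current metafile (that last part is an assumption; the slide only mentions fetching new files). The name `diff_metafiles` is hypothetical.

```python
def diff_metafiles(old, new):
    """Compare two {path: timestamp} mappings.

    Returns (to_fetch, to_remove):
      to_fetch  - paths that are absent locally or whose timestamp changed
      to_remove - paths no longer listed in the current metafile (assumed behavior)
    """
    to_fetch = sorted(p for p, stamp in new.items() if old.get(p) != stamp)
    to_remove = sorted(p for p in old if p not in new)
    return to_fetch, to_remove


# Timestamps kept as opaque strings here; only equality matters for the diff.
local_copy = {
    "/home/applications/mapping/query_engine.cgi": "1999.apr.14.08:46:12",
    "/home/applications/mapping/map_database": "2000.oct.15.13:15:59",
}
current = {
    "/home/applications/mapping/query_engine.cgi": "1999.apr.14.08:46:12",
    "/home/applications/mapping/map_database": "2001.feb.01.09:00:00",   # hypothetical newer stamp
    "/home/applications/mapping/user_preferences": "2001.jan.30.18:00:05",
}

to_fetch, to_remove = diff_metafiles(local_copy, current)
```

Only the changed database file and the newly listed preferences file need to be transferred; the unchanged CGI script is skipped.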

Fail-Over Support Session state maintenance

Using Cookies for Session Fail-Over

Algorithms and Policies
Metrics
Measurement infrastructure
Algorithms

Metrics Taxonomy
Measuring “What”
– Proximity
– Server load
– Aggregate
Measuring “How”
– Passive measurements
– Active measurements
Measuring “When”
– Synchronous
– Asynchronous
Metric stability
– Dynamic
– Static

Proximity Metrics
Goal: select a “nearby” server to process a client request
Benefits: lower latency and network load
Static metrics:
– Geographical distance
  (+) Provides strong lower bound for latency
  (-) May not correlate with routing paths
– Autonomous System hops
  (+) Counts peering points and not over-provisioned backbone hops
  (-) Equates large and small ASs
– Network hops
  (+) Distinguishes large and small ASs
  (-) Equates local area hops, wide-area hops, and NAP hops
– General drawback: do not account for congestion
Dynamic metrics:
– Message RTT
– Path bandwidth

Combining Static Proximity Metrics
Divide the globe into large regions
A client is closer to servers in the same region than in different regions
Among servers in the same region, a client is closer to a server that is fewer AS hops away
Among servers in the same region and the same AS, a client is closer to a server with fewer network hops
Otherwise servers are equidistant
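The combination rule above amounts to a lexicographic ordering. The following sketch is one possible reading of it, not the platform's implementation: the `Server` record, its field names, and the region labels are hypothetical, and "same AS" is interpreted as the server sharing the client's AS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    region: str        # coarse geographic region (hypothetical labels)
    as_number: int     # autonomous system hosting the server
    as_hops: int       # AS hops between client and server
    network_hops: int  # router hops between client and server


def proximity_key(client_region, client_as, server):
    """Smaller keys mean 'closer'; equal keys mean equidistant."""
    if server.region != client_region:
        return (1, 0, 0)                     # other regions: all equidistant
    if server.as_number == client_as:
        return (0, 0, server.network_hops)   # same region and AS: fewer network hops wins
    return (0, server.as_hops, 0)            # same region: fewer AS hops wins


servers = [
    Server("far", "asia", 1, 3, 12),
    Server("same_as", "europe", 100, 0, 4),
    Server("same_as_near", "europe", 100, 0, 2),
    Server("other_as", "europe", 200, 2, 9),
]
ranked = [s.name for s in sorted(servers, key=lambda s: proximity_key("europe", 100, s))]
```

Network hops are deliberately ignored for servers outside the client's AS, which matches the "otherwise servers are equidistant" clause.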

Proximity Metrics in Practice
Companies’ proprietary secret sauce
– Knowledge of peering points and their congestion
– Pricing with neighboring ISPs
– Knowledge of Internet topology

Server Load Metrics
CPU utilization
Disk utilization
Network card utilization
Number of TCP connections
Number of pending requests
Server response time
Others

Passive (Aggregate) Metrics

Aging Dynamic Metrics
Exponential aging:
    ave_new = (1 - r) × ave_cur + r × sample
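The aging formula is a standard exponentially weighted moving average. A minimal sketch follows; the function name `aged_average` and the sample values are made up for illustration, and `r` is assumed to lie between 0 and 1.

```python
def aged_average(ave_cur, sample, r):
    """One step of exponential aging: ave_new = (1 - r) * ave_cur + r * sample."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be between 0 and 1")
    return (1 - r) * ave_cur + r * sample


# Hypothetical response-time samples (ms) folded into a running average.
average = 100.0
history = []
for sample in [200.0, 200.0]:
    average = aged_average(average, sample, r=0.5)
    history.append(average)
```

A larger `r` makes the average track new samples faster, while a smaller `r` smooths out short spikes at the cost of slower reaction.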

Measurement Infrastructure

Client Groups
Replica placement and request distribution components must agree on the notion of proximity
Replica placement is based on accesses by clients
Request distribution is based on client DNS
Associate clients with their LDNS servers
Aggregate proximity metrics over client groups
[Figure labels: authoritative DNS, LDNS, Node 1, Node 2.]

Measuring Client-Node Proximity
Goal: (client_group I, node J, app K) → Performance
End-to-end response time measurements
– Simple
– Catch-all measure
Scalability problem
– 1.5M client groups
– ~20 nodes
– ~100 applications
– 3G entries with random access!
Measurement availability problem
Proximity depends on multiple factors
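The "3G entries" figure follows directly from the three counts on the slide, one table entry per (client group, node, application) triple:

```latex
\[
\underbrace{1.5 \times 10^{6}}_{\text{client groups}}
\times \underbrace{20}_{\text{nodes}}
\times \underbrace{100}_{\text{applications}}
= 3 \times 10^{9}\ \text{entries}
\]
```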

Measurements Architecture
[Figure labels: client group 1, client group 2.]
Front-end delay measurements (aggregated over all applications)
Back-end delay measurements (aggregated over all client groups)
Front-end and back-end traffic ratios (aggregated over all client groups and AAN nodes)
Cost(client group I, node J, app K) =
      Front_delay(I,J) * Front_traffic_ratio(K)
    + Back_delay1(J) * Back_traffic1_ratio(K)
    + Back_delay2(J) * Back_traffic2_ratio(K)
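The cost formula can be evaluated from three small lookup tables instead of one table per (group, node, app) triple. The sketch below assumes dictionary-based tables and uses made-up delays, ratios, and names (`g1`, `n1`, `n2`, `mapping`); it is an illustration of the formula, not the platform's code.

```python
def cost(group, node, app, front_delay, back_delay1, back_delay2,
         front_ratio, back1_ratio, back2_ratio):
    """Cost(I, J, K) as on the slide: delays weighted by per-application traffic ratios."""
    return (front_delay[(group, node)] * front_ratio[app]
            + back_delay1[node] * back1_ratio[app]
            + back_delay2[node] * back2_ratio[app])


# Hypothetical measurements (ms) and traffic ratios.
front_delay = {("g1", "n1"): 50.0, ("g1", "n2"): 80.0}   # per (client group, node)
back_delay1 = {"n1": 10.0, "n2": 2.0}                    # per node
back_delay2 = {"n1": 5.0, "n2": 1.0}                     # per node
front_ratio = {"mapping": 1.0}                           # per application
back1_ratio = {"mapping": 2.0}
back2_ratio = {"mapping": 0.5}

tables = (front_delay, back_delay1, back_delay2, front_ratio, back1_ratio, back2_ratio)
costs = {node: cost("g1", node, "mapping", *tables) for node in ("n1", "n2")}
best_node = min(costs, key=costs.get)
```

With the counts quoted on the previous slide (1.5M client groups, about 20 nodes), the largest table here is the front-end delay table at roughly 30M entries, compared with the 3G entries of the direct approach.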

Algorithms
Request distribution
Replica placement
Taxonomy
– Global optimization vs. greedy
– Centralized vs. distributed/hierarchical

Request Distribution Issues
Combination of server load and client proximity factors
Algorithm stability
– Load oscillations due to “herd effect”
– Randomization violates proximity in the common case
Challenge:
– Select the closest server
– Guard against overload
– Be responsive in load redistribution
– Avoid load oscillations

Replica Placement Issues
Combination of server load and client proximity factors
Simple idea (which does not quite work):
– If an existing server is overloaded, create more replicas
– If an existing server is underloaded, remove some replicas
– If a node is closer than existing replicas for a significant part of demand, migrate or replicate there
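The three rules of the "simple idea" can be written down literally, which also makes it easier to see why the slide says it does not quite work. Everything in this sketch is an assumption made for illustration: the function name, the action labels, the thresholds, and the idea of expressing demand as a fraction.

```python
def naive_placement_actions(replica_loads, closer_demand_share,
                            high_load, low_load, demand_threshold):
    """Apply the three naive rules independently and return the resulting actions.

    replica_loads       - {node: load} for nodes currently holding a replica
    closer_demand_share - {candidate node: fraction of demand closer to it
                           than to any existing replica}
    """
    actions = []
    for node in sorted(replica_loads):
        load = replica_loads[node]
        if load > high_load:
            actions.append(("add_replica_for", node))    # rule 1: overloaded
        elif load < low_load:
            actions.append(("remove_replica", node))     # rule 2: underloaded
    for node in sorted(closer_demand_share):
        if node not in replica_loads and closer_demand_share[node] >= demand_threshold:
            actions.append(("replicate_to", node))       # rule 3: proximity
    return actions


actions = naive_placement_actions(
    replica_loads={"n1": 0.9, "n2": 0.1},
    closer_demand_share={"n3": 0.4, "n4": 0.05},
    high_load=0.8, low_load=0.2, demand_threshold=0.3,
)
```

In this made-up scenario the rules simultaneously ask for a new replica, the removal of an existing one, and a proximity-driven copy, because each rule fires without regard to the others.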

Vicious Cycles of Replications
Proximity-driven replication is based on demand thresholds
– Expressed in requests or bytes served
Load-based replication is based on load thresholds
Load and demand thresholds may cause vicious cycles
Tuning thresholds is possible but leads to a fragile system
[Figure labels: 20 reqs/sec, App1, 15 reqs/sec, 5 reqs/sec.]
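One way such a cycle can arise is sketched below. This is an interpretation, not something the slide spells out: it assumes that 5 of the 20 reqs/sec are closer to a second node, that a replica is created when nearby demand reaches a demand threshold, and that a replica is removed when its load falls below a load threshold. The function name and threshold values are hypothetical.

```python
def simulate_placement(remote_demand, replicate_threshold, delete_threshold, rounds):
    """Track whether a second replica exists over several decision rounds."""
    has_remote_replica = False
    history = []
    for _ in range(rounds):
        if not has_remote_replica and remote_demand >= replicate_threshold:
            has_remote_replica = True        # proximity rule: enough nearby demand
            history.append("replicate")
        elif has_remote_replica and remote_demand < delete_threshold:
            has_remote_replica = False       # load rule: replica looks underloaded
            history.append("delete")
        else:
            history.append("stable")
    return history


# 5 reqs/sec of nearby demand; thresholds chosen to show the conflict.
oscillating = simulate_placement(5, replicate_threshold=4, delete_threshold=6, rounds=4)
# Lowering the deletion threshold below the replication threshold stops the cycle.
settled = simulate_placement(5, replicate_threshold=4, delete_threshold=3, rounds=4)
```

The second call shows the kind of threshold tuning the slide describes: it stabilizes this case, but only because the two thresholds happen to be ordered correctly for this particular demand.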