G22.3250-001 Robert Grimm New York University Scalable Network Services

Administrivia
- Final reports due Tuesday, May 4, 10 am
  - Just like a conference paper, pages
- Final presentations Thursday, May 6, 3:30-4:45 pm
  - Just like a conference talk
  - 20 minutes talk
  - 10 minutes questions
- Added one more paper for 4/15 class
  - Follow-up discussion to main paper from HotOS
- “Manageability” typo on course home page fixed

Altogether Now: The Three Questions
- What is the problem?
- What is new or different?
- What are the contributions and limitations?

Implementing Global Services: Clusters to the Rescue?
- Advantages
  - Scalability
    - Internet workloads tend to be highly parallel
    - Clusters can be more powerful than the biggest iron
    - Clusters can also be expanded incrementally
  - Availability
    - Clusters have “natural” redundancy due to the independence of nodes
    - There are no central points of failure in clusters (anymore?)
    - Clusters can be upgraded piecemeal (hot upgrade)
  - Commodity building blocks
    - Clusters of PCs have better cost/performance than big iron
    - PCs are also much easier to upgrade and order

Implementing Global Services: Clusters to the Rescue? (cont.)
- Challenges
  - Administration
    - Maintaining lots of PCs vs. one big box can be daunting
  - Component vs. system replication
    - To achieve incremental scalability, we need component-level instead of system-level replication
  - Partial failures
    - Clusters need to handle partial failures while big iron does not
    - Do we believe this?
  - Shared state
    - Distributing shared state across the cluster is even harder than for big iron

BASE Semantics
- Trading between consistency, performance, and availability
- One extreme: ACID
  - Perfect consistency, higher overheads, limited availability
- An alternative point: BASE (a soft-state sketch follows below)
  - Basically available
    - Can tolerate stale data
  - Soft state to improve performance
  - Eventual consistency
    - Can provide approximate answers
- The reality: different components with different semantics
  - Focus on services with mostly BASE data
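
A minimal sketch of BASE-style soft state, not code from the paper: the cache below can be lost or go stale without affecting correctness, and the hypothetical `recompute` callback regenerates values, which is where eventual consistency comes from. The staleness bound is an assumed illustration value.

```python
# Minimal sketch (not from the paper): a BASE-style soft-state lookup.
import time

cache = {}            # soft state: key -> (value, timestamp)
MAX_STALENESS = 30.0  # seconds of staleness we tolerate (assumed value)

def lookup(key, recompute, overloaded=False):
    entry = cache.get(key)
    if entry is not None:
        value, ts = entry
        # Basically available: under overload, serve stale data rather than block.
        if overloaded or time.time() - ts < MAX_STALENESS:
            return value
    # Eventual consistency: regenerate the answer and refresh the soft state.
    value = recompute(key)
    cache[key] = (value, time.time())
    return value

print(lookup("page-1", lambda k: f"rendered {k}"))
print(lookup("page-1", lambda k: f"rendered {k}", overloaded=True))  # may be stale
```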

Overview of Architecture
- Front ends accept requests and include service logic (request flow sketched below)
- Workers perform the actual operations
- Caches store content before and after processing
- User profile DB stores user preferences (ACID!)
- Manager performs load balancing
- Graphical monitor supports administrators
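
A minimal sketch with assumed names, not the paper's code: how one request might flow through the components listed above. Dictionaries stand in for the ACID profile DB and the soft-state cache, and `pick_worker` stands in for the manager's load-balanced choice.

```python
# Hypothetical request flow through the cluster components (stand-ins only).
profile_db = {"alice": {"quality": "low"}}   # user profile DB (ACID stand-in)
cache = {}                                   # content cache (soft state)

def pick_worker(prefs):
    # Stand-in for the manager's load-balanced worker choice.
    return lambda url, p: f"distilled({url}, quality={p['quality']})"

def handle_request(user, url):
    prefs = profile_db[user]                 # front end looks up preferences
    key = (url, prefs["quality"])
    if key in cache:                         # cache stores processed content
        return cache[key]
    worker = pick_worker(prefs)              # manager performs load balancing
    result = worker(url, prefs)              # worker does the actual operation
    cache[key] = result
    return result

print(handle_request("alice", "http://example.com/img.jpg"))
```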

Software Structured in Three Layers
- SNS (Scalable Network Service)
  - To be reused across clusters
- TACC (Transformation, Aggregation, Caching, Customization)
  - Easily extensible through building blocks
- Service
  - Application-specific

The SNS Support Layer
- Components replicated for availability and scalability
  - Instances are independent from each other and launched on demand
  - Exceptions: network, resource manager, user profile database
- Control decisions isolated in front ends
  - Workers are simple and stateless (profile data is part of the inputs)
- Load balancing enabled by central manager (sketched below)
  - Load data collected from workers and distributed to front ends
  - Centralization limits complexity
- Overflow pool helps handle bursts
  - User workstations within an organization (is this realistic?)
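
A minimal sketch with assumed interfaces: the central manager collects load reports from workers and hands front ends a snapshot to balance against. The report-age bound and units are illustration assumptions, not numbers from the paper.

```python
# Central manager collecting load reports and exposing a (slightly stale) snapshot.
import time

class Manager:
    def __init__(self, max_report_age=2.0):
        self.loads = {}                      # worker id -> (queue length, report time)
        self.max_report_age = max_report_age

    def report_load(self, worker_id, queue_length):
        # Worker stubs periodically report their load to the manager.
        self.loads[worker_id] = (queue_length, time.time())

    def snapshot(self):
        # Front ends get slightly stale data; under BASE semantics that is
        # acceptable, since timeouts let them recover from bad choices.
        now = time.time()
        return {w: q for w, (q, ts) in self.loads.items()
                if now - ts <= self.max_report_age}

manager = Manager()
manager.report_load("worker-1", 3)
manager.report_load("worker-2", 0)
print(manager.snapshot())    # a front end would pick, e.g., worker-2 here
```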

The SNS Support Layer (cont.)
- Timeouts (?) detect failures; peers restart services
  - Cached state used to initialize the restarted service
  - Soft state rebuilt over time
- Stubs hide complexity from front ends and workers
  - Worker stubs supply load data and report failures
    - Also hide multi-threaded operation
  - Manager stubs select workers (sketched below)
    - Including load balancing decisions based on hints
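
A minimal sketch with assumed names: a manager stub on the front end that picks the least-loaded worker from possibly stale load hints and retries elsewhere when the chosen worker times out. The `invoke` callback is a hypothetical RPC stand-in.

```python
# Manager-stub-style worker selection with timeout-based recovery (sketch).
import random

def pick_worker(load_hints):
    least = min(load_hints.values())
    candidates = [w for w, q in load_hints.items() if q == least]
    return random.choice(candidates)         # random tie-break among least loaded

def call_with_retry(load_hints, invoke, attempts=3):
    for _ in range(attempts):
        if not load_hints:
            break
        worker = pick_worker(load_hints)
        try:
            return invoke(worker)            # hypothetical RPC; raises TimeoutError on failure
        except TimeoutError:
            load_hints.pop(worker, None)     # drop the failed worker and retry elsewhere
    raise RuntimeError("no worker answered in time")

hints = {"w1": 2, "w2": 0}
print(call_with_retry(hints, lambda w: f"handled by {w}"))
```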

The TACC Programming Model
- Transformation
  - Changes the content of a single resource
- Aggregation
  - Collects data from several sources
- Caching
  - Avoids lengthy fetches from origin servers
  - Avoids repeatedly performing transformations/aggregations
- Customization
  - Based on key-value pairs provided with the request
- Key goal: compose workers just like Unix pipes (sketched below)
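
A minimal sketch, not the real TACC API: composing workers like a Unix pipe. Each worker takes (data, args), where args are the per-request customization key-value pairs, and the output of one worker feeds the input of the next. The two toy workers are hypothetical; real workers would, for example, distill images or summarize HTML.

```python
# Pipe-style composition of transformation workers (sketch).
from functools import reduce

def compose(*workers):
    def pipeline(data, args):
        return reduce(lambda d, worker: worker(d, args), workers, data)
    return pipeline

def lowercase(text, args):
    return text.lower()

def truncate(text, args):
    return text[: int(args.get("max_chars", 20))]

distill = compose(lowercase, truncate)
print(distill("A Very LONG Web Page ...", {"max_chars": 10}))
```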

TranSend Implementation
- Web distillation proxy for Berkeley’s modem pool
- Front ends process HTTP requests
  - Look up user preferences
  - Build a pipeline of distillers
- Manager tracks distillers
  - Announces itself on an IP multicast group
  - Collects load information as a running estimate
  - Performs lottery scheduling between instances (sketched below)
  - Spawns new instances of distillers
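
A minimal sketch of lottery scheduling between distiller instances; the inverse-load weighting is an assumption chosen for illustration, so that lightly loaded instances hold more tickets and win the lottery more often.

```python
# Lottery scheduling over reported load estimates (sketch, assumed weighting).
import random

def lottery_pick(load_estimates):
    workers = list(load_estimates)
    # Fewer queued requests -> more tickets -> higher chance of being picked.
    tickets = [1.0 / (1.0 + load_estimates[w]) for w in workers]
    return random.choices(workers, weights=tickets, k=1)[0]

print(lottery_pick({"distiller-1": 4, "distiller-2": 0, "distiller-3": 1}))
```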

TranSend Implementation (cont.)
- Fault tolerance and recovery provided by peers
  - Based on broken connections, timeouts, and beacons (beacon checking sketched below)
  - For the former, distillers register with the manager
  - Manager (re)starts distillers and front ends
  - Front ends restart the manager
- Overall scalability and availability aided by BASE
  - Load balancing data is (slightly) stale
  - Timeouts allow recovery from incorrect choices
  - Content is soft state and can be recovered from the cache/server
  - Answers may be approximate (on system overload)
    - Use cached data instead of recomputing it
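
A minimal sketch with assumed intervals: beacon-based failure detection. A peer that misses several beacons in a row is presumed dead and restarted; its soft state is then rebuilt from the cache over time.

```python
# Beacon-based liveness tracking (sketch; intervals are assumed values).
import time

class BeaconMonitor:
    def __init__(self, beacon_interval=1.0, missed_limit=3):
        self.last_beacon = {}                          # peer -> time of last beacon
        self.deadline = beacon_interval * missed_limit

    def beacon(self, peer):
        self.last_beacon[peer] = time.time()

    def dead_peers(self):
        now = time.time()
        return [p for p, ts in self.last_beacon.items()
                if now - ts > self.deadline]

monitor = BeaconMonitor()
monitor.beacon("distiller-1")
for peer in monitor.dead_peers():
    print("restarting", peer)    # in TranSend, the manager restarts distillers
```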

Comparison between TranSend and HotBot

TranSend Evaluation
- Based on traces from the modem pool
  - 25,000 users
  - 600 modems
  - 14.4 Kbps and 28.8 Kbps links
- Distribution of content sizes
  - Most content accessed is small
  - But the average byte is part of large content
  - Therefore, distillation makes sense
- Evaluation of TranSend based on a trace playback engine and an isolated cluster

Burstiness of Requests
- Visible at three different time scales
  - 24 hours
  - 3.5 hours
  - 3.5 minutes
- How to manage the overflow pool?
  - Select a utilization level for the worker pool
  - Or select a utilization level for the overflow pool
  - Are these the right strategies?

Distiller Performance
- Not surprisingly, distillation time grows with input size: roughly 8 ms per KB

Cache Partition Performance
- Summary of results
  - Average cache hit takes 27 ms, 15 ms of which are spent in TCP
    - What’s wrong with this picture?
  - 95% of all cache hits take less than 100 ms to service
  - Miss penalty varies widely: 100 ms to 100 s
- Some observations
  - Cache hit rate grows with cache size, but plateaus for a given population size (why?)
  - Cache hit rate also grows with population size, even for a fixed-size cache (why?)
  - What do we conclude for front end capacity? (see the back-of-the-envelope sketch below)
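
A back-of-the-envelope sketch of expected cache service time. The 27 ms hit time comes from the slide; the hit rates and the single representative miss penalty are assumptions for illustration only (the slide only gives a 100 ms to 100 s range for misses).

```python
# Expected service time = hit_rate * hit_time + (1 - hit_rate) * miss_penalty.
def expected_service_time_ms(hit_rate, hit_ms=27.0, miss_ms=500.0):
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms

for hit_rate in (0.4, 0.6, 0.8):
    print(f"hit rate {hit_rate:.0%}: about {expected_service_time_ms(hit_rate):.0f} ms")
```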

Self-Tuning and Load Balancing
- Spawning threshold H, stability period D

Self-Tuning and Load Balancing: The Gory Details
- What is the trade-off for D? (see the sketch and note below)
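
A minimal sketch of the self-tuning rule under assumed semantics for H and D: spawn a new worker when the observed load exceeds the spawning threshold H, but at most once per stability period D, so the system does not oscillate. Units and values are assumptions, not the paper's.

```python
# Threshold-and-stability-period spawning rule (sketch).
import time

H = 5.0    # spawning threshold (assumed unit: average queue length per worker)
D = 30.0   # stability period in seconds (assumed value)

last_spawn = 0.0

def maybe_spawn(avg_queue_length, spawn):
    global last_spawn
    now = time.time()
    if avg_queue_length > H and now - last_spawn >= D:
        spawn()                  # e.g., launch another distiller instance
        last_spawn = now

maybe_spawn(7.2, lambda: print("spawning a new distiller"))
```

As for the trade-off: a larger D damps oscillation but reacts slowly to bursts, while a smaller D reacts quickly but risks repeatedly spawning and reaping workers.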

Scalability under Load
- What are the most important bottlenecks?
- How can we overcome them?

What Do You Think?