Cluster Load Balancing for Fine-grain Network Services


Cluster Load Balancing for Fine-grain Network Services

Kai Shen, Tao Yang, and Lingkun Chu
Department of Computer Science, University of California at Santa Barbara
http://www.cs.ucsb.edu/projects/neptune
IPDPS 2002

Cluster-based Network Services

Emerging deployment of large-scale, complex clustered services:
- Google: 150M searches per day; an index of more than 2B pages; thousands of Linux servers.
- Teoma search (powering Ask Jeeves search): a Sun/Solaris cluster of hundreds of processors.
- Web portals: Yahoo!, MSN, AOL, etc.

Key requirements: availability and scalability.

Architecture of a Clustered Service: Search Engine

[Diagram: clients reach the cluster through a firewall/Web switch; Web servers/query handlers dispatch over a local-area network to partitioned index servers (partitions 1 and 2) and document servers.]

"Neptune" Project

http://www.cs.ucsb.edu/projects/neptune

A scalable cluster-based software infrastructure that shields clustering complexities from service authors:
- Scalable clustering architecture with load-balancing support.
- Integrated resource management.
- Service replication: replica consistency and performance scalability.

Deployment:
- At the Internet search engine Teoma (www.teoma.com) for more than a year.
- Serving Ask Jeeves search (www.ask.com) since December 2001 (6-7M searches per day as of January 2002).

Neptune Clustering Architecture: Inside a Node

[Diagram: inside each node, a service access point connects to the network for the rest of the cluster; components include a service load-balancing subsystem, a service availability directory, a publishing subsystem, service consumers, and the service runtime hosting services.]

Cluster Load Balancing

Design goals:
- Scalability: scalable performance with non-scaling overhead.
- Availability: no centralized node or component.

For fine-grain services (already widespread), there are additional challenges:
- Severe system state fluctuation → more sensitivity to load-information delay.
- More frequent service requests → the need for low per-request load-balancing overhead.

Evaluation Traces

Traces of two service cluster components from the Internet search engine Teoma, collected during one week in July 2001; the peak-time portion is used.

Trace       | Requests (total) | Requests (peak) | Arrival interval (mean / std-dev) | Service time (mean / std-dev)
MediumGrain | 1,055,359        | 126,518         | 341.5 ms / 321.1 ms               | 208.9 ms / 62.9 ms
FineGrain   | 1,171,838        | 98,933          | 436.7 ms / 349.4 ms               | 22.2 ms / 10.0 ms

Broadcast Policy

- An agent at each node collects the local load index and broadcasts it at certain intervals.
- Another agent listens to broadcasts from other nodes and maintains a directory locally.
- Each service request is directed to the node with the lightest load index in the local directory.

Load index: the number of active service requests.

Advantages:
- Requires no centralized component.
- Very low per-request overhead.
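The broadcast policy can be sketched as below. This is a minimal in-memory sketch (the `LoadDirectory` name is ours, not Neptune's), standing in for the real per-node agents and network broadcasts:

```python
import time

class LoadDirectory:
    """Hypothetical local directory an agent maintains from nodes' load broadcasts."""
    def __init__(self):
        self.loads = {}  # node id -> (load index, timestamp of last broadcast)

    def on_broadcast(self, node, active_requests):
        # The load index in the paper is the number of active service requests.
        self.loads[node] = (active_requests, time.monotonic())

    def pick_node(self):
        # Direct the request to the node with the lightest known load index.
        # Between broadcasts this information grows stale, which is what hurts
        # fine-grain services (many consumers "flock" onto the same node).
        return min(self.loads, key=lambda n: self.loads[n][0])

d = LoadDirectory()
d.on_broadcast("node-a", 7)
d.on_broadcast("node-b", 2)
d.on_broadcast("node-c", 5)
print(d.pick_node())  # node-b is the lightest-loaded
```

Note that the selection step itself never touches the network, which is why the per-request overhead is so low; the cost is that every decision is made on possibly stale data.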

Broadcast Policy with Varying Broadcast Frequency (16-node)

[Figure: mean response time, normalized to the centralized policy, vs. mean broadcast interval (31.25-1000 ms) for the MediumGrain and FineGrain traces; panel A: servers 50% busy, panel B: servers 90% busy.]

Fine-grain services at high load depend too heavily on frequent broadcasts. Reasons: load-index staleness and the flocking effect.

Random Polling Policy

For each service request, a polling agent on the service-consumer node randomly polls a certain number (the poll size) of service nodes for load information, then picks the node responding with the lightest load.

Random polling with a small poll size:
- Requires no centralized components.
- Per-request overhead is bounded by the poll size.
- Small load-information delay due to just-in-time polling.
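The per-request polling step can be sketched as follows; the function name and the `get_load` callback are ours (standing in for the network poll), not Neptune's API:

```python
import random

def random_poll(nodes, get_load, poll_size, rng=random):
    """Poll `poll_size` randomly chosen nodes; return the lightest-loaded one.

    `get_load(node)` stands in for the just-in-time network poll that returns
    a node's load index (its number of active service requests).
    """
    polled = rng.sample(nodes, min(poll_size, len(nodes)))
    return min(polled, key=get_load)

# Illustrative loads; with poll_size covering all nodes, this behaves like
# the centralized policy and always finds the global minimum ("n2").
loads = {"n1": 4, "n2": 1, "n3": 6, "n4": 3}
choice = random_poll(list(loads), loads.get, poll_size=2, rng=random.Random(1))
```

Because the load information is fetched at request time rather than read from a directory, its staleness is bounded by the round-trip time of the poll.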

Is a Small Poll Size Enough?

[Figure: mean response time (ms) vs. number of service nodes (up to 100) for the MediumGrain trace (panel A) and the FineGrain trace (panel B), comparing Random, Polling 2, Polling 3, Polling 4, and Centralized; service nodes are kept 90% busy on average.]

In principle, this matches the analytical results on the supermarket model. [Mitzenmacher96]
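A toy simulation in the spirit of the supermarket model [Mitzenmacher96] illustrates why a small poll size suffices (this is our own sketch, not the paper's experiment): sending each arriving request to the shortest of d=2 randomly sampled queues keeps the maximum queue length far closer to the mean than purely random placement does.

```python
import random

def place_requests(num_nodes, num_requests, poll_size, seed=0):
    """Return per-node queue lengths after d-choice placement of all requests."""
    rng = random.Random(seed)
    queues = [0] * num_nodes
    for _ in range(num_requests):
        # Sample `poll_size` candidate nodes, send the request to the shortest.
        candidates = rng.sample(range(num_nodes), poll_size)
        best = min(candidates, key=lambda i: queues[i])
        queues[best] += 1
    return queues

# 10,000 requests over 100 nodes; the mean load per node is exactly 100.
random_max = max(place_requests(100, 10000, poll_size=1))  # random placement
poll2_max = max(place_requests(100, 10000, poll_size=2))   # poll size 2
```

The gap between d=1 and d=2 is the well-known exponential improvement from "two choices"; going from d=2 to larger d yields only marginal further gains, which is consistent with the figure above.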

System Implementation of Random Polling Policies

Configuration: 30 dual-processor Linux servers connected by a fast Ethernet switch.

Implementation:
- Service availability announcements are made through IP multicast.
- Application-level services are loaded into the Neptune runtime module as DLLs and run as threads.
- For each service request, polls are made concurrently over UDP.

Experimental Evaluation of Random Polling Policy (16-node)

[Figure: mean response time (ms) vs. server load level (50%-90%) for the MediumGrain trace (panel A) and the FineGrain trace (panel B), comparing Random, Polling 2, Polling 3, Polling 4, Polling 8, and Centralized.]

For the FineGrain trace, a large poll size performs even worse, due to excessive polling overhead and long polling delay.

Discarding Slow-responding Polls

Polling delay with a poll size of 3:
- 290 μs polling delay when service nodes are idle.
- In a typical run with service nodes 90% busy: mean polling delay of 3 ms; 8.1% of polls not returned within 10 ms.
→ Significant for fine-grain services (service times in the tens of milliseconds).

Discarding slow-responding polls shortens the polling delay → an 8.3% reduction in mean response time.
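The concurrent UDP polls with a reply deadline might look like the following sketch (the function names, ports, and delays are illustrative; the real Neptune implementation differs). Replies that miss the deadline are simply discarded, and the request goes to the lightest-loaded node among the replies that made it in time:

```python
import socket
import threading
import time

def load_reporter(port, load, stop, delay=0.0):
    """Toy per-node agent: answers each UDP probe with this node's load index.
    `delay` simulates a slow-responding (busy) node."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", port))
    s.settimeout(0.05)
    while not stop.is_set():
        try:
            _, addr = s.recvfrom(64)
        except socket.timeout:
            continue
        time.sleep(delay)
        s.sendto(str(load).encode(), addr)
    s.close()

def poll_with_deadline(ports, deadline):
    """Send UDP probes to all polled nodes at once; keep only replies that
    arrive before the deadline, discarding slow-responding polls."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for p in ports:
        s.sendto(b"poll", ("127.0.0.1", p))
    replies = {}
    end = time.monotonic() + deadline
    while len(replies) < len(ports):
        remaining = end - time.monotonic()
        if remaining <= 0:
            break  # deadline reached: late replies are discarded
        s.settimeout(remaining)
        try:
            data, (_, port) = s.recvfrom(64)
        except socket.timeout:
            break
        replies[port] = int(data)
    s.close()
    if not replies:
        return None, replies
    return min(replies, key=replies.get), replies
```

The deadline bounds the polling delay by construction: one outlier node can no longer stall the whole request, at the cost of occasionally deciding on fewer load samples.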

Related Work

- Clustering middleware and distributed systems: Neptune, WebLogic/Tuxedo, COM/DCOM, MOSIX, TACC, MultiSpace.
- HTTP switching: Alteon, ArrowPoint, Foundry, Network Dispatcher.
- Load balancing for distributed systems: [Mitzenmacher96], [Goswami93], [Kunz91], MOSIX, [Zhou88], [Eager86], [Ferrari85].
- Low-latency network architectures: VIA, InfiniBand.

Conclusions

- Random-polling-based load balancing policies are well suited to fine-grain network services.
- A small poll size provides sufficient information for load balancing, while an excessively large poll size may even degrade performance.
- Discarding slow-responding polls can further improve system performance.

http://www.cs.ucsb.edu/projects/neptune