The Tail at Scale Dean and Barroso, CACM Feb 2013 Presenter: Chien-Ying Chen (cchen140) CS538 - Advanced Computer Networks 1.

Slides:

Advertisements

Similar presentations

Operating System.

Advertisements

Distributed Systems Major Design Issues Presented by: Christopher Hector CS8320 – Advanced Operating Systems Spring 2007 – Section 2.6 Presentation Dr.

System Area Network Abhiram Shandilya 12/06/01. Overview Introduction to System Area Networks SAN Design and Examples SAN Applications.

Introduction CSCI 444/544 Operating Systems Fall 2008.

1 SEDA: An Architecture for Well- Conditioned, Scalable Internet Services Matt Welsh, David Culler, and Eric Brewer Computer Science Division University.

Panoptes: A Scalable Architecture for Video Sensor Networking Applications Wu-chi Feng, Brian Code, Ed Kaiser, Mike Shea, Wu-chang Feng (OGI: The Oregon.

1. Overview  Introduction  Motivations  Multikernel Model  Implementation – The Barrelfish  Performance Testing  Conclusion 2.

Improving Robustness in Distributed Systems Jeremy Russell Software Engineering Honours Project.

Cs238 CPU Scheduling Dr. Alan R. Davis. CPU Scheduling The objective of multiprogramming is to have some process running at all times, to maximize CPU.

Internet and Intranet Protocols and Applications Section V: Network Application Performance Lecture 11: Why the World Wide Wait? 4/11/2000 Arthur P. Goldberg.

Device Management.

Computer Organization and Architecture

1 I/O Management in Representative Operating Systems.

DISTRIBUTED COMPUTING

Advances in Language Design

Computer System Architectures Computer System Software

9/14/2015B.Ramamurthy1 Operating Systems : Overview Bina Ramamurthy CSE421/521.

Operating System. Architecture of Computer System Hardware Operating System (OS) Programming Language (e.g. PASCAL) Application Programs (e.g. WORD, EXCEL)

CPU Scheduling Chapter 6 Chapter 6.

Introduction and Overview Questions answered in this lecture: What is an operating system? How have operating systems evolved? Why study operating systems?

Parallel Programming Models Jihad El-Sana These slides are based on the book: Introduction to Parallel Computing, Blaise Barney, Lawrence Livermore National.

B.Ramamurthy9/19/20151 Operating Systems u Bina Ramamurthy CS421.

Lecture 8: Design of Parallel Programs Part III Lecturer: Simon Winberg.

1 I/O Management and Disk Scheduling Chapter Categories of I/O Devices Human readable Used to communicate with the user Printers Video display terminals.

1. Memory Manager 2 Memory Management In an environment that supports dynamic memory allocation, the memory manager must keep a record of the usage of.

Dynamic Content On Edge Cache Server (using Microsoft.NET) Name: Aparna Yeddula CS – 522 Semester Project Project URL: cs.uccs.edu/~ayeddula/project.html.

Real Time Operating Systems Lecture 10 David Andrews

Scalable Web Server on Heterogeneous Cluster CHEN Ge.

CS 5204 (FALL 2005)1 Leases: An Efficient Fault Tolerant Mechanism for Distributed File Cache Consistency Gray and Cheriton By Farid Merchant Date: 9/21/05.

임규찬. 1. Abstract 2. Introduction 3. Design Goals 4. Sample-Based Scheduling for Parallel Jobs 5. Implements.

Lecture 3 Process Concepts. What is a Process? A process is the dynamic execution context of an executing program. Several processes may run concurrently,

Computing Infrastructure for Large Ecommerce Systems -- based on material written by Jacob Lindeman.

Computers Operating System Essentials. Operating Systems PROGRAM HARDWARE OPERATING SYSTEM.

MapReduce and GFS. Introduction r To understand Google’s file system let us look at the sort of processing that needs to be done r We will look at MapReduce.

Operating Systems David Goldschmidt, Ph.D. Computer Science The College of Saint Rose CIS 432.

Fault Tolerance in CORBA and Wireless CORBA Chen Xinyu 18/9/2002.

OPERATING SYSTEMS CS 3530 Summer 2014 Systems with Multi-programming Chapter 4.

Client-Server Model of Interaction Chapter 20. We have looked at the details of TCP/IP Protocols Protocols Router architecture Router architecture Now.

GFS. Google r Servers are a mix of commodity machines and machines specifically designed for Google m Not necessarily the fastest m Purchases are based.

Empirical Quantification of Opportunities for Content Adaptation in Web Servers Michael Gopshtein and Dror Feitelson School of Engineering and Computer.

Introduction to z/OS Basics © 2006 IBM Corporation Chapter 7: Batch processing and the Job Entry Subsystem (JES) Batch processing and JES.

Operating Systems (CS 340 D) Dr. Abeer Mahmoud Princess Nora University Faculty of Computer & Information Systems Computer science Department.

Querying the Internet with PIER CS294-4 Paul Burstein 11/10/2003.

Background Computer System Architectures Computer System Software.

The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Presenter: Chao-Han Tsai (Some slides adapted from the Google’s series lectures)

PERFORMANCE MANAGEMENT IMPROVING PERFORMANCE TECHNIQUES Network management system 1.

DISTRIBUTED FILE SYSTEM- ENHANCEMENT AND FURTHER DEVELOPMENT BY:- PALLAWI(10BIT0033)

Information Technology. *At Home *In business *In Education *In Healthcare Computer Uses.

OPERATING SYSTEMS CS 3502 Fall 2017

EEE Embedded Systems Design Process in Operating Systems 서강대학교 전자공학과

Operating Systems : Overview

Green cloud computing 2 Cs 595 Lecture 15.

Operating Systems (CS 340 D)

Alternative system models

Data Dissemination and Management (2) Lecture 10

Local secondary storage (local disks)

Software Architecture in Practice

Operating Systems (CS 340 D)

湖南大学-信息科学与工程学院-计算机与科学系

The Tail At Scale Dean and Barroso, CACM 2013, Pages 74-80

Operating Systems Bina Ramamurthy CSE421 11/27/2018 B.Ramamurthy.

TimeTrader: Exploiting Latency Tail to Save Datacenter Energy for Online Search Balajee Vamanan, Hamza Bin Sohail, Jahangir Hasan, and T. N. Vijaykumar.

Operating Systems : Overview

Operating Systems : Overview

Operating Systems : Overview

Operating Systems : Overview

Multithreaded Programming

CS510 - Portland State University

Data Dissemination and Management (2) Lecture 10

Presentation transcript:

The Tail at Scale Dean and Barroso, CACM Feb 2013 Presenter: Chien-Ying Chen (cchen140) CS538 - Advanced Computer Networks 1

Interacting with Web Services Internet What causes variable response time? Response Time Request Latency 2

Factors of Variable Response Time Shared Resources (Local) –CPU cores –Processors caches –Memory bandwidth Global Resource Sharing –Network switches –Shared file systems Daemons –Scheduled Procedures 3

Factors of Variable Response Time Maintenance Activities –Data reconstruction in distributed file systems –Periodic log compactions in storage systems –Periodic garbage collection in garbage-collected languages Queueing –Queueing in intermediate servers and network switches 4

Factors of Variable Response Time Power Limits –Throttling due to thermal effects on CPUs Garbage Collection –Random access in solid-state storage devices Energy Management –Power saving modes –Switching from inactive to active modes 5

Component-Level Variability Amplified By Scale Latency variability is magnified at the service level

Request Latency Measurement 7 Key Observation: –5% servers contribute nearly 50% latency. 50%+

Key Question What would you do with that 5% servers which contribute 50% latency? 8 Eliminateall interactive request latencies. Live with But it’s infeasible to eliminate all variations!

Reducing Component Variability Differentiating Service Classes –Differentiate non-interactive requests High Level Queuing –Keep low level queues short 9

Reducing Component Variability Reduce Head-of-line Blocking –Break long-running requests into a sequence of smaller requests. Synchronize Disruption –Do background activities altogether. 10

Living with Latency Variability Within Request Short-Term Adaptations –Handles latency of 10+ ms –Takes the advantage of redundancy Cross-Request Long-Term Adaptations –Handles latency of 10+ seconds –Takes the advantage of partitions 11

Within Request Short-Term Adaptations Hedged Requests –Send redundant requests. –Use the results from whichever replica responds first. –The overhead can be further reduced by tagging them as lower priority than primary requests. 12

Within Request Short-Term Adaptations 13 Tied Requests –Hedged requests with cancellation mechanism.

Within Request Short-Term Adaptations Alternatives: –Requests to the least-loaded server. (Probe and Submit) –The Distributed Shortest-Positioning Time First System method. 14

Cross-Request Long-Term Adaptations Conventional Data Partitions Micro-Partitions –Like the notion of virtual servers in distributed hashing table (DHT). Selective Replication Latency-Induced Probation 15

Large Information Retrieval Systems Google search engine –No certain answers “Good Enough” –Google’s IR systems are tuned to occasionally respond with good-enough results when an acceptable fraction of the overall corpus has been searched. 16

Large Information Retrieval Systems Canary Requests –Some requests exercising an untested code path may cause crashes or long delays. –Send requests to one or two leaf servers for testing. –The remaining servers are only queried if the root gets a successful response from the canary in a reasonable period of time. 17

Large Information Retrieval Systems Canary Requests –May introduce extra latency. –Comparing to the consequence of long delays, it’s worth to apply canary requests. 18

Hardware Trends and Their Effects Hardware will only be more and more diverse –So tolerating variability through software techniques are even more important over time. Higher bandwidth reduces per-message overheads. –It further reduces the cost of tied requests (making it more likely that cancellation messages are received in time). 19

Conclusion Variability of latency exists in large-scale services. Live with variable request latency –Infeasible to eliminate variable request latency in large-scale services. The importance of these techniques will only increase. (scale of services getting larger) Techniques for living with variable latency turn out to increase system utilization 20

Discussion How to minimize interactive request latencies in large-scale web services? Under what condition a request will get worst latency? Why Google uses canary-requests despite its introduce overhead? How much time should it wait until canary requests respond? 21