
The Tail at Scale
Dean and Barroso, CACM 2013, pp. 74-80
Presented by: Gregory Kesden, CSE-291, Fall 2016

What Is "The Tail"?
- Most requests are answered promptly; some requests are slower
- As typical response times get faster, the difference between the fast and the slow grows
- There are many reasons for slowness: some simple, some dynamic and resulting from interactions
- There is a lot of probability mass near "fast", but a really long tail extends from there
- Once a request isn't answered promptly, the likelihood of it taking just a little longer and the likelihood of it taking much, much longer become much closer
- Even a very thin tail carries a lot of probability mass if it is long enough

Why Does Variability Exist?
- Shared local resources, e.g., processor, disk, etc.
- Background daemons
- Global resource sharing, e.g., network, distributed file systems, authentication, etc.
- Queueing: a dynamic phenomenon, especially when multi-level
  - Consider a simple example, such as synchronization at a busy network switch, and last-drop vs. random-drop policies
- Burst vs. steady-state processor operation (heat-shedding limits)
- Energy-saving state transitions
- Garbage collection

Parallelism
- Can hide latency
  - Make multiple queries, take the fastest
  - Of course, expensive
- Can make latency dramatically worse
  - If you need to wait for the slowest result
  - Make 1000 queries and depend upon the slowest: even a 0.001 chance of a slow query per request makes the overall result bad (see the arithmetic below)
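To make the 1000-query bullet concrete, here is a minimal sketch of the arithmetic: the chance that at least one of n independent parallel queries lands in the tail is 1 - (1 - p)^n. The fan-out values other than 1000 are illustrative assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// pAnySlow returns the probability that at least one of n
// parallel queries lands in the tail, given each query is
// independently slow with probability p: 1 - (1-p)^n.
func pAnySlow(p float64, n int) float64 {
	return 1 - math.Pow(1-p, float64(n))
}

func main() {
	// One query in a thousand is slow (the 99.9th percentile).
	p := 0.001
	for _, n := range []int{1, 100, 1000} {
		fmt.Printf("fan-out %4d: P(at least one slow) = %.3f\n",
			n, pAnySlow(p, n))
	}
	// Fan-out 1000 prints ~0.632: most requests hit the tail.
}
```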

Service Classes
- Interactive requests go before batch requests, etc.
- Often means higher-level queueing, rather than relying upon the queues the platform provides, e.g., within the OS, etc. (see the sketch below)
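As a sketch of what "higher-level queueing" can look like (the two-class, strict-priority design and all names here are assumptions for illustration, not a design from the paper):

```go
package main

import "fmt"

// Request is a unit of work tagged with a service class.
type Request struct {
	ID          int
	Interactive bool
}

// twoClassOrder drains the interactive queue completely before
// touching the batch queue: a simple strict-priority policy
// implemented at the application level rather than in the OS.
func twoClassOrder(interactive, batch []Request) []Request {
	order := make([]Request, 0, len(interactive)+len(batch))
	order = append(order, interactive...)
	order = append(order, batch...)
	return order
}

func main() {
	interactive := []Request{{1, true}, {2, true}}
	batch := []Request{{3, false}}
	for _, r := range twoClassOrder(interactive, batch) {
		fmt.Printf("serve request %d (interactive=%v)\n", r.ID, r.Interactive)
	}
}
```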

Synchronizing Background Activities
- Consider highly parallel operations: one slow result slows all, even if slowness is "rare"
- Unsynchronized disruptions are constantly disrupting something somewhere, and that slows down everything
- Synchronized disruptions affect all systems at the same time
  - A short period of bad response, rather than a long period
  - The rest of the time the path is clear
- This basically stacks the disruptions into a few requests, rather than spreading them across many, because the penalty for a request is the same regardless of how many disruptions are concentrated within it (sketched below)
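A minimal sketch of the synchronization idea (the five-minute period, five-second window, and function name are assumptions for illustration): every node derives the same maintenance window from the wall clock, so background work across the fleet fires in lockstep instead of at random:

```go
package main

import (
	"fmt"
	"time"
)

// inMaintenanceWindow reports whether background work (GC,
// compaction, log rotation, ...) is allowed to run right now.
// Every node computes this from the wall clock, so the whole
// fleet pauses and resumes together.
func inMaintenanceWindow(now time.Time, period, window time.Duration) bool {
	offset := now.UnixNano() % int64(period)
	return offset < int64(window)
}

func main() {
	period := 5 * time.Minute // one window every five minutes
	window := 5 * time.Second // background work runs for five seconds
	if inMaintenanceWindow(time.Now(), period, window) {
		fmt.Println("run background maintenance now")
	} else {
		fmt.Println("defer background maintenance; serve requests")
	}
}
```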

Work-Arounds: Replication
- Especially for read-only (or read-rarely, or loosely synchronized) data, where synchronization isn't an issue
- More throughput, reduced queueing, limited impact of a bad node
- Can be selective, e.g., for hot items (see the sketch below)
  - Heat up (add replicas) past a high-water mark
  - Cool off (remove replicas) below a low-water mark
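A sketch of selective replication with watermarks (the thresholds and names are assumptions made up for illustration): replica counts follow request rate, with hysteresis between the two marks so counts don't oscillate:

```go
package main

import "fmt"

// nextReplicas adjusts an item's replica count with hysteresis:
// grow above the high-water mark, shrink below the low-water
// mark, and otherwise leave it alone.
func nextReplicas(current int, reqPerSec, highWater, lowWater float64) int {
	switch {
	case reqPerSec > highWater:
		return current + 1 // heat up: spread the hot item wider
	case reqPerSec < lowWater && current > 1:
		return current - 1 // cool off: reclaim space
	default:
		return current
	}
}

func main() {
	replicas := 1
	for _, load := range []float64{50, 900, 1200, 1500, 300, 40} {
		replicas = nextReplicas(replicas, load, 1000, 100)
		fmt.Printf("load=%6.0f req/s -> %d replica(s)\n", load, replicas)
	}
}
```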

Work-Arounds: Hedged Requests
- Issue parallel requests, take the first response
  - Expensive
- Alternately, reissue only after a delay (sketched below)
  - If past the "sweet spot", ask again
  - Even 2x the sweet spot is better than waiting out much of the long tail
  - Can cancel the prior request when the hedge is issued, or when either succeeds
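A minimal sketch of a delayed hedge in Go (the 50ms sweet-spot value and the simulated query function are assumptions for illustration): send one request; if it hasn't answered within the sweet-spot delay, send a second copy and take whichever finishes first, cancelling the loser:

```go
package main

import (
	"context"
	"fmt"
	"math/rand"
	"time"
)

// query stands in for an RPC to one replica; here it just
// sleeps a random amount to simulate a latency distribution.
func query(ctx context.Context, replica int) (string, error) {
	d := time.Duration(rand.Intn(100)) * time.Millisecond
	select {
	case <-time.After(d):
		return fmt.Sprintf("answer from replica %d after %v", replica, d), nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// hedged sends to one replica, then to a second if the first
// has not answered within hedgeDelay; the first result wins.
func hedged(hedgeDelay time.Duration) string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // cancel whichever request is still running

	results := make(chan string, 2)
	go func() {
		if r, err := query(ctx, 1); err == nil {
			results <- r
		}
	}()
	select {
	case r := <-results:
		return r // fast path: no hedge needed
	case <-time.After(hedgeDelay):
		go func() {
			if r, err := query(ctx, 2); err == nil {
				results <- r
			}
		}()
	}
	return <-results
}

func main() {
	// Assume ~50ms is the observed sweet-spot (e.g. 95th-percentile) latency.
	fmt.Println(hedged(50 * time.Millisecond))
}
```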

Work-Arounds: Micro-Partitions
- Break big units of work into many small partitions (see the sketch below)
- Avoids head-of-line blocking behind large units
- Enables them to be parallelized
- Enables finer-grained load balancing
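A sketch of why many more partitions than machines helps (the counts are illustrative assumptions): with about 20 micro-partitions per server, rebalancing moves a few small pieces rather than whole machines' worth of work:

```go
package main

import "fmt"

// assign spreads nPartitions across nServers round-robin.
// Shedding load from a slow server then means reassigning a
// handful of small partitions, not an entire server's data.
func assign(nPartitions, nServers int) map[int][]int {
	owners := make(map[int][]int)
	for p := 0; p < nPartitions; p++ {
		owners[p%nServers] = append(owners[p%nServers], p)
	}
	return owners
}

func main() {
	owners := assign(1000, 50) // ~20 micro-partitions per server
	fmt.Printf("server 0 owns %d partitions: %v...\n",
		len(owners[0]), owners[0][:5])
	// Rebalancing granularity is 1/1000 of the data, not 1/50.
}
```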

Work-Arounds: Probation
- Temporarily exclude poorly responding servers
  - After a certain number of bad responses
  - Can exponentially back off, etc. (sketched below)
  - Can issue shadow requests to find out when a server is okay again
- Many sources of latency are temporary
  - Daemons, garbage collection, backups, checkpointing and pruning, network storms
  - Just wait for them to pass
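A sketch of probation bookkeeping (the threshold of three slow responses and the type names are assumptions for illustration): consecutive slow responses put a server on probation with exponentially growing back-off, and a fast shadow-request response lifts it:

```go
package main

import (
	"fmt"
	"time"
)

// serverState tracks consecutive slow responses and puts a
// server on probation with exponential back-off.
type serverState struct {
	slowStreak int
	until      time.Time // on probation while now < until
}

func (s *serverState) recordSlow(now time.Time) {
	s.slowStreak++
	if s.slowStreak >= 3 { // threshold: an assumption for illustration
		backoff := time.Second << uint(s.slowStreak-3) // 1s, 2s, 4s, ...
		s.until = now.Add(backoff)
	}
}

func (s *serverState) recordFast() { s.slowStreak = 0; s.until = time.Time{} }

func (s *serverState) onProbation(now time.Time) bool { return now.Before(s.until) }

func main() {
	var s serverState
	now := time.Now()
	for i := 0; i < 4; i++ {
		s.recordSlow(now)
	}
	fmt.Println("on probation:", s.onProbation(now))
	// Meanwhile, shadow requests keep probing; a fast response
	// calls recordFast() and lifts the probation.
	s.recordFast()
	fmt.Println("on probation:", s.onProbation(now))
}
```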

Good Enough
- Large information-retrieval systems may not need the exact answer
  - Top 10 best advertisements? Might 10 of the top 15 do?
- Just ignore a small fraction of slow responses and move on (see the sketch below)
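A sketch of "good enough" gathering (the fan-out, deadline, and simulated latencies are assumptions for illustration): fan the query out, collect what arrives by the deadline, and move on without the stragglers:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// gatherGoodEnough fans a query out to n leaves and returns
// whatever has arrived by the deadline, ignoring stragglers.
func gatherGoodEnough(n int, deadline time.Duration) []int {
	results := make(chan int, n)
	for i := 0; i < n; i++ {
		go func(leaf int) {
			// Simulated leaf latency; most answer quickly.
			time.Sleep(time.Duration(rand.Intn(200)) * time.Millisecond)
			results <- leaf
		}(i)
	}
	var got []int
	timeout := time.After(deadline)
	for len(got) < n {
		select {
		case r := <-results:
			got = append(got, r)
		case <-timeout:
			return got // good enough: move on without the tail
		}
	}
	return got
}

func main() {
	got := gatherGoodEnough(100, 150*time.Millisecond)
	fmt.Printf("answered with %d of 100 leaf responses\n", len(got))
}
```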

Canaries
- Requests that miss caches, etc., are not common, and they are more likely to find bad code paths
- Test against a few servers first, then send to all (sketched below)
- Since canaries go to only a few servers, they are unlikely to hit a server having a bad moment and add significant latency; they aren't scaled out
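A sketch of the canary pattern (the send function and server counts are assumptions for illustration): try the request on a couple of servers first, and fan out to the rest only if the canaries survive:

```go
package main

import (
	"errors"
	"fmt"
)

// send stands in for delivering a request to one server;
// a real implementation would make an RPC.
func send(server int, req string) error {
	if req == "poison" {
		return errors.New("handler crashed")
	}
	fmt.Printf("server %d handled %q\n", server, req)
	return nil
}

// canaryFanout tries the request on a couple of canary
// servers first; only if they succeed does it go wide.
func canaryFanout(servers []int, req string) error {
	for _, s := range servers[:2] { // small, so latency impact is negligible
		if err := send(s, req); err != nil {
			return fmt.Errorf("canary on server %d failed: %w", s, err)
		}
	}
	for _, s := range servers[2:] {
		send(s, req) // full fan-out after the canaries pass
	}
	return nil
}

func main() {
	servers := []int{0, 1, 2, 3, 4}
	fmt.Println(canaryFanout(servers, "normal query"))
	fmt.Println(canaryFanout(servers, "poison"))
}
```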

Going To Get Better? (Naw)
- The faster hardware gets, the greater the variability between full speed and zero
- Energy efficiency and heat shedding are getting more complex
- Faster speeds in some cases cause trouble in others
- Scale is growing
- As device scale shrinks, variance within the same class of devices becomes more significant
- (But there is some hope: in general, things are getting faster, wider, and more directly connected and parallel)