Dapper, a Large-Scale Distributed System Tracing Infrastructure

Dapper, a Large-Scale Distributed System Tracing Infrastructure
Google Technical Report, 2010
Authors: B. H. Sigelman, L. A. Barroso, M. Burrows, P. Stephenson, M. Plakal, D. Beaver, S. Jaspan, C. Shanbhag
Presenter: Lei Jinjiang

Background
Modern Internet services are often implemented as complex, large-scale distributed systems. These applications are constructed from collections of software modules that may be developed by different teams, perhaps in different programming languages, and could span many thousands of machines across multiple physical facilities.

Background
Imagine a single search request coursing through Google's massive infrastructure. A single request can run across thousands of machines and involve hundreds of different subsystems. And, oh by the way, you are processing more requests per second than any other system in the world.

Problem
How do you debug such a system?
How do you figure out where the problems are?
How do you determine if programmers are coding correctly?
How do you keep sensitive data secret and safe?
How do you ensure products don't use more resources than they are assigned?
How do you store all the data? How do you make use of it?
That is where Dapper comes in!

Dapper
Dapper is Google's production tracing system. It was originally created to understand system behavior starting from a search request. Today, Google's production clusters generate more than 1 terabyte of sampled trace data per day.

Requirements and Design Goals
Requirements: (1) ubiquitous deployment, (2) continuous monitoring.
Design goals: (1) low overhead, (2) application-level transparency, (3) scalability.

Distributed Tracing in Dapper
Two classes of solutions: black-box vs. annotation-based. Black-box schemes infer causality statistically from message records; Dapper takes the annotation-based approach, relying on applications or middleware to explicitly tag every record with a global identifier.

Trace trees and spans
(Figure) The causal and temporal relationships between five spans in a Dapper trace tree.

Trace trees and spans
(Figure) A detailed view of a single span from the last figure. Be mindful of time skew: timestamps come from each host's local clock.
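
To make the span structure concrete, here is a minimal sketch of the fields a Dapper-style span carries. The names and types are hypothetical (the paper describes the model but does not publish Dapper's data structures); the parent span ID is what links spans into a trace tree, and the host-local timestamps are where time skew shows up.

import java.util.List;

// Sketch of a Dapper-style span (hypothetical names and types).
class Span {
    long traceId;       // shared by every span in the same trace tree
    long spanId;        // probabilistically unique within the trace
    long parentSpanId;  // links this span to its parent; absent for the root span
    String name;        // human-readable span name, e.g. the RPC method
    List<Annotation> annotations; // timestamped events, incl. client/server send/recv
}

class Annotation {
    long timestampMicros; // from the host-local clock -- client and server
                          // clocks may disagree, hence "mindful of time skew"
    String message;       // e.g. "client send", "server recv", or custom text
}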

Instrumentation points
When a thread handles a traced control path, Dapper attaches a trace context to thread-local storage. Most Google developers use a common control-flow library to construct callbacks; Dapper ensures that all such callbacks store the trace context of their creator.

Callback In computer programming, a callback is a reference to executable code, or a piece of executable code, that is passed as an argument to other code. This allows a lower-level software layer to call a subroutine (or function) defined in a higher-level layer.
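
Putting the two previous slides together, here is a minimal sketch of the mechanism, with hypothetical names (Dapper's control-flow library is internal to Google): the trace context lives in thread-local storage, and the callback wrapper captures it at creation time and restores it when the callback later runs, possibly on a different thread.

// Sketch: propagating a trace context across a callback (hypothetical API).
class TraceContext {
    final long traceId, spanId;
    TraceContext(long traceId, long spanId) { this.traceId = traceId; this.spanId = spanId; }

    // The context of whatever traced request this thread is currently handling.
    static final ThreadLocal<TraceContext> CURRENT = new ThreadLocal<>();
}

class ControlFlow {
    // Wrap a callback so it runs under the context of the thread that
    // *created* it, not the thread that eventually executes it.
    static Runnable wrap(Runnable callback) {
        final TraceContext captured = TraceContext.CURRENT.get(); // capture at creation
        return () -> {
            TraceContext previous = TraceContext.CURRENT.get();
            TraceContext.CURRENT.set(captured);                   // restore for the callback
            try {
                callback.run();
            } finally {
                TraceContext.CURRENT.set(previous);               // avoid leaking context
            }
        };
    }
}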

Annotations
// C++:
const string& request = ...;
if (HitCache())
  TRACEPRINTF("cache hit for %s", request.c_str());
else
  TRACEPRINTF("cache miss for %s", request.c_str());

// Java:
Tracer t = Tracer.getCurrentTracer();
String request = ...;
if (hitCache())
  t.record("cache hit for " + request);
else
  t.record("cache miss for " + request);

Sampling
Low overhead was a key design goal for Dapper, since service operators would be understandably reluctant to deploy a new tool of yet unproven value if it had any significant impact on performance… Therefore, …, we further control overhead by recording only a fraction of all traces.

Trace collection
The Dapper trace collection pipeline: instrumented binaries write span data to local log files, Dapper daemons pull it from all production hosts, and collectors write each trace to a single Bigtable row, with one column per span.

Out-of-band trace collection
Why is trace data collected out-of-band rather than in-band with the RPC responses?
Firstly, in-band Dapper trace data would dwarf the application data and bias the results of subsequent analyses.
Secondly, many middleware systems return a result to their caller before all of their own backends have returned a final result, so an in-band scheme cannot assume perfectly nested RPC trees.
Security and privacy considerations.

Production coverage
Given how ubiquitous Dapper-instrumented libraries are, we estimate that nearly every Google production process supports tracing. There are cases where Dapper is unable to follow the control path correctly; these typically stem from the use of non-standard control-flow primitives. Dapper tracing can also be turned off as a production safety measure.

Use of trace annotations
Currently, 70% of all Dapper spans and 90% of all Dapper traces have at least one application-specified annotation. 41 Java and 68 C++ applications have added custom application annotations in order to better understand intra-span activity in their services.

Trace collection overhead
CPU resource usage for the Dapper daemon during load testing:

Process count (per host)   Data rate (per process)   Daemon CPU usage (single CPU core)
25                         10K/sec                   0.125%
10                         200K/sec                  0.267%
50                         2K/sec                    0.130%

The daemon never uses more than 0.3% of one core of a production machine during collection, and it has a very small memory footprint. The Dapper daemon is also restricted to the lowest possible priority in the kernel scheduler. At an average of 426 bytes per span, trace data accounts for less than 0.01% of the network traffic in Google's production environment.
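
As a rough sanity check, combining the 426 bytes/span average with the earlier 1 terabyte/day figure: 10^12 bytes / 426 bytes per span ≈ 2.3 × 10^9, i.e. on the order of two billion sampled spans written per day.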

Trace collection overhead
The effect of different [non-adaptive] Dapper sampling frequencies on the latency and throughput of a Web search cluster:

Sampling frequency   Avg. latency (% change)   Avg. throughput (% change)
1/1                  16.3%                     -1.48%
1/2                  9.40%                     -0.73%
1/4                  6.38%                     -0.30%
1/8                  4.12%                     -0.23%
1/16                 2.12%                     -0.08%
1/1024               -0.20%                    -0.06%

The experimental errors for these latency and throughput measurements are 2.5% and 0.15%, respectively.

Adaptive sampling
Lower-traffic workloads may miss important events at such a low sampling rate (1/1024). Workloads with low traffic automatically increase their sampling rate, while those with very high traffic lower it, so that overhead remains under control. Reliability …
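
The slide leaves the mechanism abstract. The sketch below shows one plausible implementation, illustrative only and not Dapper's published code: periodically recalibrate the sampling probability toward a desired rate of sampled traces per unit time, which is the parameterization the paper describes.

// Sketch: adaptive sampling toward a target rate of sampled traces/sec.
// Illustrative only; Dapper's actual adaptive scheme is not published in detail.
class AdaptiveSampler {
    private final double targetTracesPerSec; // desired rate of *sampled* traces
    private double probability = 1.0 / 1024; // current sampling probability
    private long seen = 0;                   // traces observed this interval

    AdaptiveSampler(double targetTracesPerSec) {
        this.targetTracesPerSec = targetTracesPerSec;
    }

    synchronized boolean sample() {
        seen++;
        return Math.random() < probability;
    }

    // Called once per second: high-traffic workloads drive the probability
    // down, low-traffic workloads drive it up (capped at 1.0).
    synchronized void recalibrate() {
        if (seen > 0) probability = Math.min(1.0, targetTracesPerSec / seen);
        seen = 0;
    }
}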

Additional sampling during collection
The Dapper team also needs to control the total size of data written to its central repositories, so a second round of sampling is incorporated for that purpose. For each span seen in the collection system, we hash the associated trace ID into a scalar z, where 0 ≤ z ≤ 1. If z is less than our collection sampling coefficient, we keep the span and write it to Bigtable; otherwise, we discard it.
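
A sketch of this second sampling stage (hypothetical code; the actual hash function is unspecified). Because the decision is a deterministic function of the trace ID alone, every span belonging to a given trace is kept or discarded together, and the global write rate can be tuned by changing a single coefficient.

// Sketch: collection-time sampling keyed on the trace ID (hypothetical).
class CollectionSampler {
    static boolean keepSpan(long traceId, double collectionCoefficient) {
        // Map the trace ID to a scalar z roughly uniform in [0, 1).
        double z = (Long.hashCode(traceId) & 0x7fffffff) / (double) (1L << 31);
        return z < collectionCoefficient; // keep: write the span to Bigtable
    }
}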

The Dapper Depot API (DAPI)
Access by trace ID.
Bulk access: MapReduce-based access to billions of Dapper traces in parallel.
Indexed access: an index maps from commonly requested trace features (host machine, service name) to distinct Dapper traces.
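
As an illustration of the three access patterns (hypothetical interface; neither the talk nor the paper gives DAPI's concrete signatures):

import java.util.List;

// Sketch of the three DAPI access patterns (hypothetical names and types).
interface DapperDepot {
    Trace byTraceId(long traceId);          // 1. direct lookup of one trace
    Iterable<Trace> bulkScan();             // 2. bulk access, e.g. as MapReduce input
    List<Long> traceIdsFor(String feature,  // 3. indexed access: map a feature such as
                           String value);   //    host machine or service name to trace IDs
}

class Trace { /* the root span plus all descendant spans of one request */ }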

User interface

Experiences
Using Dapper during development (integration with exception monitoring)
Addressing long-tail latency
Inferring service dependencies
Network usage of different services
Layered and shared storage systems (e.g., GFS)

Other Lessons Learned
Coalescing effects
Tracing batch workloads
Finding a root cause
Logging kernel-level information

Thank you!