Distributed Tracing How to do latency analysis for microservice-based applications Reshmi Krishna @reshmi9k.

Presentation transcript:

Distributed Tracing How to do latency analysis for microservice-based applications Reshmi Krishna @reshmi9k

About Me
Software Engineer
Platform Architect, Pivotal
Women in Tech community member
Twitter: @reshmi9k
Meetup: Cloud-Native-New-York

Agenda
Distributed Tracing
Tracers and Tracing Systems
Zipkin
Incorporating distributed tracing into an existing microservice
Demo

From Monolith ...
Customer, Loyalty, Web Frontend, Payment, Notifications
A monolith usually looks like a big ball of mud: entangled dependencies, a lack of cohesion, and direct DB queries instead of interfaces and APIs. It does not do one thing very well. It usually does a lot of things, which makes it brittle and difficult to reason about.
All functionality must be deployed together.
No language or framework heterogeneity.
A failure is more likely to cascade, which reduces resilience: brittle, high-risk deployments.
Scale vertically, or accept limited horizontal scaling of everything at once.
Large team: anti-agile.
Harder to reuse.
Harder to modify: thousands of lines of hard-to-understand code.
Harder to replace: mean time to recovery is limited.
Harder to get up to speed.
Wikipedia: A big ball of mud is a software system that lacks a perceivable architecture. Although undesirable from a software engineering point of view, such systems are common in practice due to business pressures, developer turnover and code entropy. They are a type of design anti-pattern.

To Microservices
Death Star architecture, by Adrian Cockcroft, as visualized by AppDynamics, Boundary.com, and Twitter internal tools.

Troubleshooting Latency Issues
When was the event?
How long did it take?
How do I know it was slow?
Why did it take so long?
Which microservice was responsible?

Distributed Tracing
Distributed tracing is the process of collecting end-to-end transaction graphs in near real time. Distributed tracing systems are often used for this purpose; Zipkin is an example.
A trace represents the entire journey of a request.
A span represents a single operation call; it is the basic unit of work.
As a request flows from one microservice to another, tracers add logic to create a unique trace ID and span ID.
A span is identified by a unique 64-bit ID. The trace that the span is part of is also identified by a 64-bit ID.
The trace gives you the structure through which you can identify your calls. You can think of a trace as a tree and of the tree nodes as spans. The edges indicate a causal relationship between a span and its parent span.
Independent of its place in a larger trace tree, a span is also a simple log of timestamped records that encode the span's start and end time, any RPC timing data, and zero or more application-specific annotations.
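The span and trace description above can be made concrete with a small data model. This is a minimal sketch, not Zipkin's or Sleuth's actual classes: the names `Span`, `new_id`, `start_trace`, and `child_of` are hypothetical and chosen only for illustration.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


def new_id() -> int:
    """Return a random 64-bit identifier (hypothetical helper)."""
    return secrets.randbits(64)


@dataclass
class Span:
    trace_id: int                 # shared by every span in the same trace
    span_id: int                  # unique to this unit of work
    parent_id: Optional[int]      # None for the root span of the trace
    name: str
    start: float = 0.0            # start time, in seconds
    end: float = 0.0              # end time, in seconds
    # Timestamped, application-specific records attached to the span.
    annotations: List[Tuple[float, str]] = field(default_factory=list)

    def annotate(self, message: str, timestamp: Optional[float] = None) -> None:
        """Attach a timestamped annotation; defaults to the current time."""
        ts = time.time() if timestamp is None else timestamp
        self.annotations.append((ts, message))

    @property
    def duration(self) -> float:
        """Elapsed time covered by this span."""
        return self.end - self.start


def start_trace(name: str, start: float) -> Span:
    """Create the root span; this is where the trace id is generated."""
    return Span(trace_id=new_id(), span_id=new_id(), parent_id=None,
                name=name, start=start)


def child_of(parent: Span, name: str, start: float) -> Span:
    """Create a child span: same trace id, new span id, edge to the parent."""
    return Span(trace_id=parent.trace_id, span_id=new_id(),
                parent_id=parent.span_id, name=name, start=start)
```

The `parent_id` field is what turns a flat collection of spans into the tree described above: each non-root span points at the span that caused it.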

Visualization - Traces & Spans
UI: Trace Id 1, Span Id 1
Back-Office-Microservice: Trace Id 1, Parent Id 1, Span Id 2
Customer-Microservice: Trace Id 1, Parent Id 2, Span Id 4
Account-Microservice: Trace Id 1, Parent Id 2, Span Id 5
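The parent ids on this slide are enough to rebuild the call tree from a flat list of spans. The sketch below uses the four spans from the slide; the function names `build_children` and `render` are hypothetical, and a real tracing UI would of course do much more.

```python
from collections import defaultdict

# (service, trace_id, parent_id, span_id), taken from the slide.
# The UI span has no parent, so it is the root of the trace.
SPANS = [
    ("UI", 1, None, 1),
    ("Back-Office-Microservice", 1, 1, 2),
    ("Customer-Microservice", 1, 2, 4),
    ("Account-Microservice", 1, 2, 5),
]


def build_children(spans):
    """Group spans by parent id so each node can look up its children."""
    children = defaultdict(list)
    for service, _trace_id, parent_id, span_id in spans:
        children[parent_id].append((span_id, service))
    return children


def render(spans):
    """Return the trace as indented lines, one per span, depth first."""
    children = build_children(spans)
    lines = []

    def walk(parent_id, depth):
        for span_id, service in sorted(children.get(parent_id, [])):
            lines.append("  " * depth + f"{service} (span {span_id})")
            walk(span_id, depth + 1)

    walk(None, 0)  # start from spans that have no parent
    return lines


if __name__ == "__main__":
    print("\n".join(render(SPANS)))
```

Both Customer-Microservice and Account-Microservice hang off span 2, which shows that the back-office service called each of them while handling the UI request.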

Dapper Paper by Google
The paper describes Dapper, Google's production distributed systems tracing infrastructure.
Design goals: low overhead, application-level transparency, scalability.
The Dapper paper was published in 2010.
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf

Zipkin
Zipkin is a distributed tracing system. It helps gather the timing data needed to troubleshoot latency problems in microservice architectures.
Its design is based on the Google Dapper paper.
It aggregates spans into trace trees.
It manages both the collection and the lookup of this data.
In 2015, OpenZipkin became the primary fork.
Zipkin started as a project during Twitter's first Hack Week. The initial version implemented the Dapper design for Thrift. Today it has grown to include support for tracing HTTP, Thrift, Memcache, SQL, and Redis requests.
The Apache Thrift software framework, for scalable cross-language services development, combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi, and other languages.

Initial Zipkin Architecture
Tracers collect timing data and transport it over HTTP or Kafka. We use Scribe to transport all the traces from the different services to Zipkin and Hadoop. Scribe was developed by Facebook and is made up of a daemon that can run on each server in your system. It listens for log messages and routes them to the correct receiver depending on the category.
Once the trace data arrives at the Zipkin collector daemon, we check that it is valid, store it, and then index it for lookups.
Zipkin was originally built with Cassandra for storage: it is scalable, has a flexible schema, and is heavily used within Twitter. However, this component is now pluggable, with support for Redis, HBase, MySQL, PostgreSQL, SQLite, and H2.
Users query for traces via Zipkin's web UI or API.
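The collector steps described here (validate, store, index for lookup) can be sketched with an in-memory stand-in. This is an illustration only: `InMemoryCollector`, its field names, and the dictionary shape of a span are assumptions, not Zipkin's collector or storage API.

```python
from collections import defaultdict


class InMemoryCollector:
    """Toy collector: validates incoming spans, stores them, indexes them."""

    # Keys this sketch requires on every span (an assumed schema).
    REQUIRED = ("trace_id", "span_id", "service", "timestamp", "duration")

    def __init__(self):
        self.spans_by_trace = defaultdict(list)     # storage: trace id -> spans
        self.traces_by_service = defaultdict(set)   # index: service -> trace ids

    def is_valid(self, span):
        """Reject spans with missing fields or a negative duration."""
        return all(key in span for key in self.REQUIRED) and span["duration"] >= 0

    def accept(self, span):
        """Validate, store, then index one span. Returns False if rejected."""
        if not self.is_valid(span):
            return False
        self.spans_by_trace[span["trace_id"]].append(span)
        self.traces_by_service[span["service"]].add(span["trace_id"])
        return True

    def find_traces(self, service):
        """Lookup path used by a query API: all traces that touched a service."""
        trace_ids = sorted(self.traces_by_service.get(service, ()))
        return {trace_id: self.spans_by_trace[trace_id] for trace_id in trace_ids}
```

Keeping the index separate from the span store mirrors the idea of a pluggable storage component: the lookup interface stays the same whichever backend holds the spans.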

Tracers
Tracers add logic to create a unique trace ID.
The trace ID is generated when the first request is made.
A span ID is generated as the request arrives at each microservice.
An example tracer is Spring Cloud Sleuth.
Tracers execute in your production apps! They are written to not log too much.
Tracers have an instrumentation or sampling policy.
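The ID handling described on this slide can be sketched as follows: the trace ID is created once at the first request and then carried along, while every service that receives the request creates its own span ID. The header names `X-B3-TraceId` and `X-B3-SpanId` follow Zipkin's B3 propagation convention; the functions `extract_or_start` and `inject` are hypothetical, and the model is simplified (for instance, header lookup here is case-sensitive, unlike real HTTP headers).

```python
import secrets


def new_id() -> str:
    """Random 64-bit id rendered as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def extract_or_start(headers):
    """Continue the caller's trace, or start a new one at the first request."""
    trace_id = headers.get("X-B3-TraceId")
    parent_id = headers.get("X-B3-SpanId")
    if trace_id is None:
        # No inbound trace context: this service is the entry point.
        trace_id, parent_id = new_id(), None
    # A fresh span id is generated as the request arrives at this service.
    return {"trace_id": trace_id, "span_id": new_id(), "parent_id": parent_id}


def inject(context):
    """Build the headers to attach to an outbound call to the next service."""
    return {
        "X-B3-TraceId": context["trace_id"],
        "X-B3-SpanId": context["span_id"],
    }


if __name__ == "__main__":
    edge = extract_or_start({})                 # first request, no headers yet
    downstream = extract_or_start(inject(edge)) # next microservice in the chain
    print(edge)
    print(downstream)
```

In the usage at the bottom, both contexts share one trace ID, and the downstream span records the edge span as its parent.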

Demo: Architecture Diagram
Apps: four apps, each instrumented with Spring Cloud Sleuth
Transport: MQ / HTTP / Log
Zipkin: Collector, Span Store, Query Server, Zipkin UI
Tracers use an instrumentation or sampling policy to manage the volume of traces and spans.
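One way a sampling policy can limit the volume of traces is to keep only a fixed fraction of them, deciding from the trace ID so that every service makes the same choice for a given trace. The sketch below illustrates that idea under stated assumptions; `make_sampler` is a hypothetical helper and is not Spring Cloud Sleuth's sampler API.

```python
def make_sampler(rate: float):
    """Return a function that keeps roughly `rate` of all traces.

    The decision depends only on the trace id, so it is consistent across
    services without any coordination between them.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    buckets = 10_000
    threshold = round(rate * buckets)

    def should_sample(trace_id: int) -> bool:
        # Map the trace id onto one of 10,000 buckets and keep the low ones.
        return trace_id % buckets < threshold

    return should_sample


if __name__ == "__main__":
    sampler = make_sampler(0.1)
    kept = sum(sampler(trace_id) for trace_id in range(100_000))
    print(f"kept {kept} of 100000 traces")
```

Deciding per trace rather than per span keeps sampled traces complete: either all spans of a request are reported or none are.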

Let’s look at some code & Demo

Summary
Distributed tracing allows you to quickly see latency issues in your system.
Zipkin is a great tool to visualize the latency graph and system dependencies.
Spring Cloud Sleuth integrates with Zipkin and gives you log correlation.
Log correlation allows you to match logs for a given trace.
Pivotal Cloud Foundry makes integration of your apps with Spring Cloud Sleuth and Zipkin easier.
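Log correlation can be illustrated by filtering log lines on a trace ID. The sketch assumes each log line carries a bracketed `[app,traceId,spanId,exportable]` section; that layout, the sample format, and the function `logs_for_trace` are assumptions for illustration, so check the actual log pattern your tracer is configured to emit.

```python
import re

# Assumed layout of the correlation section: [app,traceId,spanId,exportable]
LOG_PATTERN = re.compile(
    r"\[(?P<app>[^,\]]*),(?P<trace>[0-9a-f]*),(?P<span>[0-9a-f]*),"
    r"(?P<exported>true|false)\]"
)


def logs_for_trace(lines, trace_id):
    """Return (app, line) pairs for every log line that belongs to trace_id."""
    matched = []
    for line in lines:
        found = LOG_PATTERN.search(line)
        if found and found.group("trace") == trace_id:
            matched.append((found.group("app"), line))
    return matched
```

Given logs collected from several services, calling `logs_for_trace(lines, some_trace_id)` returns only the entries written while handling that one request, in their original order.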

Links
Dapper, Google: http://research.google.com/pubs/pub36356.html
Code for this presentation: https://github.com/reshmik/DistributedTracingDemo_Velocity2016.git
Sleuth's documentation: http://cloud.spring.io/spring-cloud-sleuth/spring-cloud-sleuth.html
Repo with Spring Boot Zipkin server: https://github.com/openzipkin/zipkin-reporter-java.git
Zipkin deployed on PCF: https://github.com/reshmik/Zipkin/tree/master/spring-cloud-sleuth-samples/spring-cloud-sleuth-sample-zipkin-stream
Pivotal Web Services trial: https://run.pivotal.io/
Pivotal Cloud Foundry on your laptop: https://docs.pivotal.io/pcf-dev/
@reshmi9k