Dynatrace AI Demystified

Presentation transcript:

Dynatrace AI Demystified Andreas Grabner, @grabnerandi

Why we built “the new” Dynatrace: OneAgent, Smartscape, Root Cause Detection, Hypercube Baselining, Anomaly Detection

The idea: “Automatic APM” (~2012). Next-gen AI-based APM solution. Detect anomalies automatically. Automatically understand dependencies. Show correlations between incidents. Automatically detect root cause (component). Measure/predict impact. Assisted code-level root cause analysis.

Dynatrace SaaS: US East, US West, Ireland, Australia. Dynatrace Managed: your data center.

One Agent to monitor them all

Dynatrace Full Stack Monitoring

Dependencies between each entity, across all your data centers

Automated End-to-End Tracing

PurePath with Code-Level Details on each request

All time-series data you can wish for: network, container, cloud, servers, hosts

Everything automatically baselined!

Automated Log Analytics and Change Detection

AI Supported Performance Engineering: Your Users, Your Apps/Services, Dynatrace OneAgent

Insights into the AI

Smart anomaly detection (“Hypercube baselining”). Automatic baselining is on by default and reliable (fewer false positives than the competition) due to special algorithms for different metrics: response time / load time / visually complete, error rate, and user load (availability). Multidimensional baselining: new instances require no learning, with up to 10k cells per web/mobile app or backend service. The 5 dimensions: user action / service method, region, browser, operating system, connection bandwidth.
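The idea of one baseline per cell of a multidimensional cube can be sketched in a few lines. This is only an illustration under stated assumptions, not Dynatrace's algorithm: the class name `CellBaseline`, the mean-plus-three-standard-deviations threshold, and the `min_samples` cutoff are all hypothetical choices made for the example. The point it shows is that the baseline is keyed by the five dimensions rather than by instance, so a new instance serving an already known cell needs no learning phase of its own.

```python
from collections import defaultdict
from statistics import mean, stdev

# One cell = one combination of the five dimensions named on the slide.
DIMENSIONS = ("action", "region", "browser", "os", "bandwidth")


class CellBaseline:
    """Hypothetical per-cell baseline for a single metric (response time in ms)."""

    def __init__(self, min_samples=5, tolerance=3.0):
        self.samples = defaultdict(list)  # cell tuple -> observed values
        self.min_samples = min_samples    # assumed: below this, never alert
        self.tolerance = tolerance        # assumed: allowed standard deviations

    def observe(self, cell, response_time_ms):
        self.samples[cell].append(response_time_ms)

    def is_anomalous(self, cell, response_time_ms):
        history = self.samples.get(cell, [])
        if len(history) < self.min_samples:
            return False  # not enough data for this cell yet
        threshold = mean(history) + self.tolerance * stdev(history)
        return response_time_ms > threshold


baseline = CellBaseline()
cell = ("checkout", "EMEA", "Firefox", "Linux", "broadband")
for value in [100, 110, 90, 105, 95]:
    baseline.observe(cell, value)
slow_request_flagged = baseline.is_anomalous(cell, 200)
normal_request_flagged = baseline.is_anomalous(cell, 105)
```

A real system would likely use different statistics per metric (the slide mentions special algorithms for response time, error rate, and load), which this sketch does not attempt.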

From events (incidents) to problems. Input: notification sequence of starting and ending events. Event correlation: calculation of impact relationships among all active events. Event grouping (problems): identify events with the same root cause. Causation: rank events to identify the root cause within each group. [Figure: timeline of Events 1 to 5, grouped and ranked]
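The correlation and grouping steps can be sketched as follows. This is a simplified illustration, not the product's logic: it assumes two events are related when they overlap in time and sit on the same entity or on entities joined by a known dependency, and it groups related events as connected components. The `Event` fields, the `depends_on` set, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    entity: str   # the monitored component the event was raised on
    start: float  # start and end timestamps of the event
    end: float


def overlaps(a, b):
    """True if the two events were active at the same time."""
    return a.start <= b.end and b.start <= a.end


def group_events(events, depends_on):
    """Group events into problems. depends_on is a set of (caller, callee) pairs."""

    def related(a, b):
        linked = (
            a.entity == b.entity
            or (a.entity, b.entity) in depends_on
            or (b.entity, a.entity) in depends_on
        )
        return linked and overlaps(a, b)

    groups = []
    unvisited = list(events)
    while unvisited:
        # Depth-first walk collects one connected component per problem.
        stack = [unvisited.pop()]
        group = []
        while stack:
            current = stack.pop()
            group.append(current)
            neighbours = [e for e in unvisited if related(current, e)]
            for e in neighbours:
                unvisited.remove(e)
            stack.extend(neighbours)
        groups.append(group)
    return groups


events = [
    Event("e1", "web", 0, 10),
    Event("e2", "db", 5, 15),
    Event("e3", "cache", 100, 110),
]
problems = group_events(events, depends_on={("web", "db")})
```

Here `e1` and `e2` end up in one problem because the web tier depends on the database and the events overlap, while `e3` forms its own problem. Ranking the events inside each group is a separate step.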

Some Slides removed from original presentation because of confidential content

The Big Picture: Root cause ranking. Impact calculation only quantifies how individual events are related to each other, but we need to evaluate the big picture to isolate the fault domain. Big picture: graph analysis of the resulting “impact graph”, aka “Dynatrace Problem”. Vertices in the problem graph are ranked with a custom eigenvector centrality algorithm: the score of an event depends on the scores of connected events and the weights of the respective incoming edges. Root cause: events that receive a distinguished score. Eigencentrality: the weight of a vertex (event) is determined by the weight of its neighbors. Eigenvector centrality: think of PageRank. It assigns relative scores to all nodes in the network based on the concept that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes. [Figure: example problem graphs (“Problem” 7 and “Problem” 23) with vertices A to F and weighted edges]
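Plain weighted eigenvector centrality can be computed by power iteration, as in the minimal sketch below. The slide describes a custom variant whose details are not given, so this is the textbook version only. The example graph and its weights are made up for illustration and are not the graph from the slide. The update keeps each node's previous score before adding the weighted scores from incoming edges (a shift by the identity matrix, which leaves the ranking unchanged but helps the iteration settle), and it assumes a strongly connected graph.

```python
def eigenvector_centrality(edges, iterations=100, tol=1e-9):
    """edges maps (source, target) -> weight; returns node -> normalized score."""
    nodes = sorted({node for edge in edges for node in edge})
    scores = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Start from the old scores, then add weighted scores of incoming edges.
        updated = dict(scores)
        for (source, target), weight in edges.items():
            updated[target] += weight * scores[source]
        total = sum(updated.values())
        updated = {node: score / total for node, score in updated.items()}
        if max(abs(updated[n] - scores[n]) for n in nodes) < tol:
            return updated
        scores = updated
    return scores


# Hypothetical impact graph: an edge (X, Y) means event X points at event Y.
impact_graph = {
    ("A", "B"): 0.5,
    ("B", "A"): 0.5,
    ("C", "A"): 0.7,
    ("A", "C"): 0.1,
}
scores = eigenvector_centrality(impact_graph)
top_ranked = max(scores, key=scores.get)
```

In this toy graph `A` collects the heaviest incoming edges and ends up with the highest score, which is the sense in which a root-cause candidate "receives a distinguished score".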

Impact (measured and extrapolated!)

2 clicks! Impact (measured and extrapolated!)

Impact (measured and extrapolated!)
