Demo Modules 3 Steps to diagnose a performance problem

Presentation transcript:

Demo Modules: 3 Steps to Diagnose a Performance Problem; Analyzing Individual Transactions; Exploring Transaction Data; Isolating the Problem

A. Three steps to diagnose an Application Performance Problem (5 min duration). This is the essential demo highlighting key AppInternals features. It tells the story of how we can troubleshoot an application problem in three simple steps.

1) This is the end-to-end monitoring dashboard for our Tradefast application. Tradefast is a multi-tier .NET application that uses web services connecting to a SQL Server. With the instrumentation of our APM solution we are able to capture the actual end user experience from real users, as well as monitor all the critical components of the application. The unified dashboards allow us to bring multiple performance datasets into a single pane of glass. They are fully customizable and can be viewed on mobile and tablet devices. Dashboards update in real time and can be used to present data to different teams. You can easily build a custom experience for Operations, Development, Business or IT Management. The map shows where our users are coming from and what response time they are experiencing.

2) We have noticed that users in California are experiencing degraded performance. A sizable portion of our end users are seeing increased application latency: 2.5% of users are experiencing delayed response times. Let us investigate the degradation further by inspecting how individual users are experiencing the home page.

Users from CA accessing the TradeFast app. Slowest transactions. Most of the delay appears to be server side. 3) We are observing all users who visited our TradeFast app from California. Every dot represents a page view by an individual user. We want to investigate the big red dots, which indicate transactions with severe performance degradation. With our Big Data approach we capture and store every transaction end to end. Having this complete data set allows us to troubleshoot and identify root cause as well as perform historical trending. In the current time frame, the category-of-delay pie chart on the right indicates that the majority of the delays for the home page are due to slower response times from our servers, as shown by "First request to First byte".

Let's take one of the slowest transactions and look at what may be contributing to the delay. 4) We can select one of the slowest transactions and get more details for that user and page view. In this case the majority of the delay was caused by a server-side delay on the backend. The DNS, connection and browser rendering times were all negligible.
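The delay breakdown described above can be sketched in a few lines of Python. This is only an illustration: the `PageTiming` record, its field names and the sample numbers are assumptions made up for this example, not an AppInternals schema or API. It splits one page view into the four categories named in the notes (DNS, connection, "First request to First byte", browser rendering) and reports which one dominates.

```python
from dataclasses import dataclass


@dataclass
class PageTiming:
    """Hypothetical timings for one page view, in milliseconds."""
    dns_ms: float
    connect_ms: float
    first_byte_ms: float  # "First request to First byte", i.e. server side
    render_ms: float


def delay_breakdown(t: PageTiming) -> dict:
    """Return each category's share of the total page time (0.0 to 1.0)."""
    parts = {
        "dns": t.dns_ms,
        "connection": t.connect_ms,
        "first_request_to_first_byte": t.first_byte_ms,
        "browser_rendering": t.render_ms,
    }
    total = sum(parts.values())
    if total == 0:
        return {name: 0.0 for name in parts}
    return {name: value / total for name, value in parts.items()}


def dominant_category(t: PageTiming) -> str:
    """Name the category that contributes the most time to the page view."""
    shares = delay_breakdown(t)
    return max(shares, key=shares.get)


# Made-up sample resembling the slow page view in the demo.
slow_view = PageTiming(dns_ms=5, connect_ms=20, first_byte_ms=4200, render_ms=150)
print(dominant_category(slow_view))
```

With these sample values the server-side category dominates, which mirrors the conclusion drawn from the pie chart in the demo.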

5) The Server tab gives us more detail on where this backend delay is coming from: web, application code, database or other remote calls. We can dissect the backend transaction and find the bottlenecks on the server side. Let's find out why the server response time is so high on the backend for this transaction.

6) AppInternals can give us a per-transaction map of delay for the backend multi-tier transaction. We can quickly identify the bottlenecks for this one transaction. It looks like a delay in an application method is the biggest culprit. There are several different data sets we can review for this transaction.

We can search on the previously mentioned method and see if it is a problem across the board or just 1 user, 1 server, etc. 7) Now that we know the offending class, we can verify whether this is a problem across the board. As you can see, these degradations are happening periodically and impacting a lot of users (a dozen dots above the 5s response time line).

B. Analyzing Individual Transactions (5 min duration). The goal for this module is to dive deeper into application code and internals and be able to show our depth to Developers. Key things to stress: web-based interface; data that is easy to collaborate with; dynamic maps for every transaction; multi-tier stitching; very rich record about each transaction.

Transaction details: let's deep dive. CTSecure and BondRequestHandler. Summary of transaction. 1) We can investigate individual transactions in depth in the Transaction Details view. We have identified CTSecure and BondRequestHandler as bottlenecks.

Top calls. 2) We can do a broader inspection of the internals of the transaction. For example, we can get a list of the slowest classes and methods and see which tier they are running on. This is very useful if we want to systematically review performance and work with development to improve it.
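A "top calls" list of this kind amounts to grouping call timings by tier and method and sorting by total time. The sketch below assumes each call is a `(tier, method, elapsed_ms)` tuple; that record shape, the method names and the timings are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def slowest_methods(calls, top_n=3):
    """Sum elapsed time per (tier, method) and return the slowest first."""
    totals = defaultdict(float)
    for tier, method, elapsed_ms in calls:
        totals[(tier, method)] += elapsed_ms
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(tier, method, ms) for (tier, method), ms in ranked[:top_n]]


# Made-up call records for a single transaction.
calls = [
    ("web", "HomeController.index", 40.0),
    ("app", "BondRequestHandler.handle", 1800.0),
    ("app", "BondRequestHandler.handle", 1500.0),
    ("db", "StockEntity.load", 300.0),
]

for tier, method, ms in slowest_methods(calls, top_n=2):
    print(f"{tier:4} {method:28} {ms:8.1f} ms")
```

Summing per method rather than taking the single slowest call surfaces methods that are cheap individually but called many times.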

SQL queries. 3) Transaction details also capture which SQL statements have been executed and their timing. Often SQL or remote calls are the culprit behind slow application performance. Once you have narrowed down the list of SQL statements to be optimized, a DBA can help get more insight into how to speed them up.

Exceptions. 4) Knowing application performance is critical, but we also have to understand whether we have other internal failures. In certain cases, we can encounter transactions that run very quickly but are in an error state. The example here shows a transaction that executed in milliseconds but failed with a malformed SQL query, so the end user received an error message in return. We can identify errors as well as performance bottlenecks.

Performance metrics contributing to user experience. 5) We have a snapshot of key system metrics taken alongside the execution of the transaction. This is useful as it can point out system bottlenecks that are impeding application performance. In this case our system CPU is pegged while this transaction is running, most likely contributing to the delays.

We can then take this analysis and create a report to send to relevant teams. 6) Collaboration is critical to achieving an Enterprise APM strategy. Individual transaction performance reports are easy to share. We can get a unique link to the full transaction report and send it to our coworkers so they can get involved and help.

C. Exploring Transaction Data (5 min duration). This module provides an introduction to accessing transaction data and quickly narrowing down what we are looking for. Key points: Big Data repository that stores all transactions efficiently; collaboration; fast access to billions of records; flexible and simple filtering of results; open-ended search; numerous transaction fields to search on.

TTW allows us to search a wide variety of data in a very short amount of time. Here we are searching on a URL being accessed by "Sam". 1) The Warehouse provides a very intuitive data access mechanism. We can search on different attributes, and type-ahead suggests the available options. In this case we are searching on a URL and a user name, which will give us all transactions for the home page by user sam.

We can search on all exceptions thrown. 2) We can also search for transactions that have thrown an exception and ended in error. These are just as important as the slow transactions, as they result in a poor user experience (users getting 500 or 404 errors).

Stockentity class. Execution times. 3) We can search on numerous attributes. In this case we are searching for all transactions that have used the stockentity class. It is very easy to find and analyze application code execution with this information. An application developer can get a list and find areas for improvement very quickly.
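The three searches above (URL plus user, transactions in error, transactions that used a given class) are all attribute filters over transaction records. The following is a minimal in-memory sketch of that idea, not the Warehouse query interface: the dictionary fields `url`, `user`, `status` and `classes`, and the sample records, are assumptions invented for this example.

```python
def matches(txn, field, wanted):
    """True if the field equals the wanted value, or contains it for list-like fields."""
    value = txn.get(field)
    if isinstance(value, (list, set, tuple)):
        return wanted in value
    return value == wanted


def search(transactions, **criteria):
    """Return the transactions that satisfy every given criterion."""
    return [
        txn for txn in transactions
        if all(matches(txn, field, wanted) for field, wanted in criteria.items())
    ]


# Made-up transaction records.
transactions = [
    {"url": "/home", "user": "sam", "status": 200, "classes": ["stockentity"]},
    {"url": "/home", "user": "ann", "status": 500, "classes": ["BondRequestHandler"]},
    {"url": "/quotes", "user": "sam", "status": 200, "classes": ["stockentity"]},
]

home_by_sam = search(transactions, url="/home", user="sam")
errored = search(transactions, status=500)
used_stockentity = search(transactions, classes="stockentity")
print(len(home_by_sam), len(errored), len(used_stockentity))
```

Combining criteria narrows the result set, which is the same effect as adding more search fields in the demo.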

We can search on a wide variety of criteria. More searching.

The Warehouse is a Big Data store. It holds very detailed, complete records and can scale to billions of records per instance. The store is efficient and uses minimal disk space to store the transaction records.

E. Isolating the problem area (5 min duration). This module showcases some quick triage scenarios and fault domain isolation with AppInternals data in Dashboards. Key points: all web-based, real-time and role-based UIs; collaboration; cross-domain rich data (app, system, network, database); collecting high-resolution data every second; easy workflows.

Dashboards can be shared by publishing the link. Collaboration is key to a successful Enterprise APM implementation. Dashboards can be shared with colleagues through a URL.

Here we see our slow pages metric in violation. We can double click to drill into the violation. Let's walk through a few scenarios showing how we can isolate the problem area and focus our analysis in the right direction. I am seeing that one of my key metrics, %Slow pages, is exceeding the thresholds set. With a double click we can drill into the aggregate and see that it is really the home page showing performance degradation. We have narrowed it down.
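The drill-down from an aggregate "%Slow pages" metric to the page responsible can be illustrated as follows. This is a sketch under stated assumptions: a page view counts as slow above 5000 ms (borrowing the 5s line mentioned in the first module), the 10% alert threshold is invented, and the `(page, response_ms)` records are sample data, not output from the product.

```python
from collections import defaultdict

SLOW_MS = 5000          # assumed cutoff for a "slow" page view
ALERT_PERCENT = 10.0    # hypothetical threshold for the %Slow pages metric


def percent_slow(views, slow_ms=SLOW_MS):
    """Percentage of page views whose response time exceeds slow_ms."""
    if not views:
        return 0.0
    slow = sum(1 for _, response_ms in views if response_ms > slow_ms)
    return 100.0 * slow / len(views)


def percent_slow_by_page(views, slow_ms=SLOW_MS):
    """Drill down: compute the same metric separately for each page."""
    by_page = defaultdict(list)
    for page, response_ms in views:
        by_page[page].append((page, response_ms))
    return {page: percent_slow(group, slow_ms) for page, group in by_page.items()}


# Made-up page views as (page, response time in ms).
views = [
    ("/home", 6200), ("/home", 5400), ("/home", 800),
    ("/quotes", 700), ("/quotes", 650),
]

overall = percent_slow(views)
if overall > ALERT_PERCENT:
    per_page = percent_slow_by_page(views)
    worst_page = max(per_page, key=per_page.get)
    print(f"%Slow pages = {overall:.1f}, worst page: {worst_page}")
```

The aggregate only says that something is wrong; recomputing the metric per page is what narrows it to the home page.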

We can see the exact time the event occurred. The performance degradation occurred at 16:33.

At the same time we see 2 other key indicators deviate from their norm. At the exact same time I am seeing two of my other key indicators change. There is a slight increase in Server processing time. From this observation, it is not network related. Next I notice that my Application Component charts are showing more processing in certain areas. Let's drill down into that. I am maximizing the chart so we can inspect it.

CTSecure app code RT is spiking. I can see more detail, and it looks like CTSecure Application Code (Other classes) is periodically showing more processing time. I drill down further with a double click.

The CTSecure instance is degrading the most. I can right click and drill down to see all code executing within the CTSecure instance. In the sparkline view it is easy to compare time series charts. I am observing that the CTSecure instance is degraded the worst of all instances. I want to understand what individual classes and code may be contributing to the degradation. I right click on CTSecure and drill down to all the code executing in CTSecure Application Code.

BondRequestHandler appears to be showing the largest delays. We were able to isolate user experience to an individual method executing within a class of code. Let's look at why this is happening. It looks like the BondRequestHandler class is showing the biggest delays. I was able to isolate the user experience problem step by step: it is server and not network related, a specific instance is the bottleneck, and a specific application class is degrading the overall experience.

We can right click on the resource and drill down into system metrics to see if there might be a resource shortage. It looks like this is a periodic issue. My next question is "Why is the BondRequestHandler periodically degrading?" I can drill down into System Metrics and see if there is a resource shortage.

CPU is spiking, memory is low, RT is up. There definitely seems to be a shortage of the CPU cycles needed to process the application workload. CPU is spiking and we have a higher rate of memory processing in this instance. We have to look into optimizing the application code, or adding more CPUs to speed up the performance of this app under these workloads.