1
.NET Debugging for the Production Environment
Part 5: Debugging High CPU Hangs

Should already have open: DebugDiag, NetDebugDemos, Decompiler, Task Manager (no live perfmon, since it won't show threads for just a single exe; it shows all).

Intro: Today we continue our series on .NET debugging in the production environment, with a focus on debugging high CPU hangs. A reminder to our audience: this entire series assumes we're in the production environment. It's an entirely different ballgame from the development environment or some other pre-production environment, where we are free to debug the issue live or where it may be much simpler to reproduce the problem. In production, our top priority is to get the application up and running as quickly as possible, and troubleshooting takes second priority. But armed with the right tools and knowledge, we can gather the data required to find root cause while having minimal impact on production.

Brad Linscott
Premier Field Engineering
2
Agenda
- High CPU hangs
- Data to collect in production environment
- Common debugger commands for high CPU hangs
- Demo

We'll start today's session with a quick review of high CPU hangs: how we define one and how we recognize when we have one. Then we'll review the data we should collect in our production environment before we recycle the application (or take whatever steps are necessary) to get it functioning properly again. Finally, we'll look at some of the commands commonly used when debugging a dump of a high CPU hang, before moving on to the main part of today's session: a demo of how to debug a high CPU hang.
3
High CPU Hangs
- When a single process consumes 90% or more CPU
- Symptoms may include:
  - Application that is non-responsive or very slow to respond
  - If it's a desktop application consuming all the CPU on the box, multiple apps on the desktop may be visibly slow to respond
- Quick check: Task Manager, Processes tab

First, let's make sure we all share an understanding of what we mean by "high CPU hang". It's when a single process consumes, say, 90%, 95%, or even 100% CPU. A single-threaded application that suffers from high CPU usage will consume only one of the logical cores; a multi-threaded process exhibiting high CPU usage can consume all the cores.

Symptoms that an application is suffering from high CPU can include:
- The application is either completely non-responsive or responds much more slowly than normal. For example, an application that displays data in a grid control may fill the control so slowly that it appears non-responsive.
- If a single process consumes all the CPU on the box, other applications may also be slow to respond.

A quick and easy way to verify you've got a "runaway process" is to go to the Processes tab in Task Manager and sort by the CPU column. This displays the highest consumer(s) of CPU in descending order.
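The point about single-threaded versus multi-threaded processes affects how the Task Manager number should be read. The following is a minimal sketch, assuming the CPU column is normalized across all logical cores; the function names and the 0.9 threshold factor are illustrative and not part of the original material.

```python
def max_process_cpu_percent(busy_threads: int, logical_cores: int) -> float:
    """Upper bound on the machine-wide CPU % a process can show when
    `busy_threads` of its threads each keep one logical core busy."""
    if logical_cores < 1 or busy_threads < 0:
        raise ValueError("need at least one core and a non-negative thread count")
    # A thread can run on only one core at a time, so extra threads
    # beyond the core count cannot push the total past 100%.
    return 100.0 * min(busy_threads, logical_cores) / logical_cores


def looks_like_runaway(observed_percent: float, busy_threads: int,
                       logical_cores: int, threshold: float = 0.9) -> bool:
    """True if the observed reading is within `threshold` of the ceiling
    for that many busy threads (assumed rule of thumb, not a standard)."""
    ceiling = max_process_cpu_percent(busy_threads, logical_cores)
    return ceiling > 0 and observed_percent >= threshold * ceiling


# One spinning thread on an 8-core box tops out at 12.5% of the machine.
print(max_process_cpu_percent(1, 8))
print(looks_like_runaway(12.0, busy_threads=1, logical_cores=8))
```

Under that assumption, a single-threaded runaway process on a multi-core server may never get near 90% of the whole machine, so a reading that sits flat at roughly 100 divided by the core count is worth a second look.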
4
Data to Collect for High CPU Hangs
- If you can do a live debug, break into the process during the high CPU and debug it. Otherwise...
- "Ideal" data
  - Two or more dumps 30 seconds apart during the high CPU hang
  - Perfmon log that includes Process & Thread objects
    - Beware of log file size over a long time period
- Minimum required data
  - One dump during the high CPU hang

So when we encounter a high CPU hang, what's the best way to troubleshoot it? In a pre-prod environment you'd debug it live. But you can't do this in a production environment, so you need to gather data that can be analyzed later. The goal in production is to keep the applications up and running.

Ideally, you'd get:
- Two or more dumps about 30 seconds apart. These "staggered" dumps let you see how the threads change between dumps and identify which one(s) consume the CPU. From there, you can analyze the stacks of the high CPU thread(s) to find root cause.
- A perfmon log that encompasses the time frame of the high CPU. Not only will this tell you the pattern of high CPU usage (spiking, rising steadily, flatlining, etc.), but the Thread object can easily identify the problem thread(s). However, be aware that the Thread object severely bloats the size of the log file. On a server with a hundred processes, each with its own set of threads, the default collection interval of 15 seconds can result in a log size of half a GB, a GB, or more. This can severely impact both the time it takes to transfer the log to the computer where you analyze it and the analysis itself. Relog can help, but the time and energy spent on this often outweigh the benefit of the perfmon log in this situation.

At a minimum, get a dump of the process consuming the high CPU while it's exhibiting the problem. This is often all you get when a high CPU issue is first encountered and there's a bit of "panic" in the environment to "hurry up and get it under control". Under these circumstances, it's rare for people to take the time to collect a perfmon log for 5 minutes or more, or to remember to get two dumps. However, it's quite common that a single dump is all that's needed to find root cause.

The point is, it's better to have data that you may not need (two dumps and a perfmon log) than to need that data and not have it (e.g., because analysis of a single dump isn't conclusive).
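Comparing staggered dumps comes down to subtracting per-thread CPU times. The sketch below assumes the text of `!runaway` from each dump has been saved and follows the usual "index:id  N days H:MM:SS.mmm" layout; the sample text, thread IDs, and function names are made up for illustration.

```python
import re
from datetime import timedelta

# Matches lines such as "  12:1a4c      0 days 0:00:41.250"
_ROW = re.compile(
    r"^\s*\d+:([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d{2}):(\d{2})\.(\d{3})\s*$"
)


def parse_runaway(text: str) -> dict:
    """Map OS thread ID (hex string) to the CPU time listed for it."""
    times = {}
    for line in text.splitlines():
        m = _ROW.match(line)
        if m:
            tid, days, hours, mins, secs, millis = m.groups()
            times[tid.lower()] = timedelta(
                days=int(days), hours=int(hours), minutes=int(mins),
                seconds=int(secs), milliseconds=int(millis))
    return times


def cpu_deltas(first: dict, second: dict) -> list:
    """CPU time each thread gained between two dumps, largest first.
    Threads that only exist in the second dump start from zero."""
    deltas = {tid: t - first.get(tid, timedelta()) for tid, t in second.items()}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)


# Hypothetical output captured from two dumps taken about 30 seconds apart.
DUMP_1 = """ User Mode Time
  Thread       Time
  12:1a4c      0 days 0:00:41.250
   0:0f3c      0 days 0:00:02.100
"""
DUMP_2 = """ User Mode Time
  Thread       Time
  12:1a4c      0 days 0:01:10.750
   0:0f3c      0 days 0:00:02.115
"""

for tid, gained in cpu_deltas(parse_runaway(DUMP_1), parse_runaway(DUMP_2)):
    print(tid, gained.total_seconds())
```

Keying on the OS thread ID rather than the debugger's thread index is a deliberate choice here, since the index can differ between dumps if threads are created or exit in between.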
5
Common Debugger Commands for High CPU Hangs
- !runaway: beware of jumping to conclusions
- To find the offending function(s), use stack & thread commands
  - Any variant of 'k' (kb, kn, kp, k, etc.) for native stacks
  - !psscorX.clrstack for managed stack (X=2, 4)
  - Other thread/stack commands
- Sometimes we need help finding the executing line of code in the offending function
  - u; !u

Here are some of the more common commands I use when debugging a dump of a high CPU hang.
- !runaway is from the uext extension that ships with the debuggers and is loaded into your debugging session by default. It lists threads by the CPU time they have consumed, in descending order, but note which threads are the GC threads: high CPU time on those may be expected.
- The ~ commands are common, for example to make a different thread the active thread or to look at the stacks of all the threads.
- If you find a call on the stack you'd like to investigate, looking at the disassembly (u for native code, !u for managed code) may help you get closer to the line of source that was running when the dump was taken.
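The commands above can also be run in one pass over a dump with the console debugger. This is a rough sketch that only assembles a `cdb` command line (`-z` opens a dump file, `-c` runs the given commands at startup); the helper name, the dump path, and the assumption that the psscor extension DLL is on the debugger's extension path are all illustrative.

```python
def build_cdb_args(dump_path: str, clr_major: int = 4,
                   cdb_exe: str = "cdb.exe") -> list:
    """Build an argument list that opens a dump and runs a first-pass
    set of high CPU commands, then quits the debugger."""
    if clr_major not in (2, 4):
        raise ValueError("psscor ships for CLR 2 and CLR 4 only")
    psscor = f"psscor{clr_major}"
    commands = [
        "!runaway",                   # per-thread CPU time, highest first
        "~*kb",                       # native stacks for every thread
        f".load {psscor}",            # assumes the DLL can be found by name
        f"~*e !{psscor}.clrstack",    # managed stack for every thread
        "q",                          # exit once the output is captured
    ]
    return [cdb_exe, "-z", dump_path, "-c", "; ".join(commands)]


# "app_highcpu.dmp" is a placeholder file name.
print(build_cdb_args("app_highcpu.dmp"))
```

On a machine with the Debugging Tools for Windows installed, the returned list could be passed to `subprocess.run(..., capture_output=True, text=True)` to save the output for review. Treat the result as a starting point only: the busiest thread in `!runaway` still needs its stack read before drawing conclusions.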
6
Demo: Debugging high cpu hang