Windows Crash Dump Analysis
Daniel Pearson, David Solomon Expert Seminars


Daniel Pearson
- Started working with Windows NT 3.51
- Three years at Digital Equipment Corporation
  - Supporting Intel and Alpha systems running Windows NT
- Seven years at Microsoft
  - Senior Escalation Lead in Windows base team
  - Worked in the Mobile Internet sustained engineering team
- Instructor for David Solomon, co-author of the Windows Internals book series

Agenda
- Causes of Windows crashes
- What happens during a crash
- Configuring Windows crash options
- Writing a crash dump
- Automated and manual crash analysis
- Using Driver Verifier to detect errors
- Attaching a kernel debugger

Note: Portions of this session are based on material developed by Mark Russinovich and David Solomon.

Why Analyze a Crash?
- When Windows Error Reporting has no solution or when it blames “a device driver”

Why Does Windows Crash?
- A device driver or part of the operating system incurs an unhandled exception
- A device driver or part of the operating system explicitly crashes the system due to an unrecoverable condition
- A page fault occurs at an interrupt request level (IRQL) of dispatch level or higher
- A hardware condition occurs, such as a nonmaskable interrupt (NMI) or faulty memory or disk

Causes of Windows Crashes
[Chart not preserved in this transcript.]
Source: Microsoft Corporation Online Crash Analysis research performed in September of 2008.

What Happens During a Crash
- When a condition is detected that requires a crash, the kernel API KeBugCheckEx is called
- KeBugCheckEx accepts a bugcheck code that indicates the reason for the crash and four parameters that supply additional information

    KeBugCheckEx(
        IN ULONG BugCheckCode,
        IN ULONG_PTR BugCheckParameter1,
        IN ULONG_PTR BugCheckParameter2,
        IN ULONG_PTR BugCheckParameter3,
        IN ULONG_PTR BugCheckParameter4
    );
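To make the shape of that call concrete, here is a minimal Python sketch that models a bugcheck as a code plus four parameters and formats it as a stop-screen style line. The `BugCheck` class and `stop_line` method are hypothetical names introduced for illustration, and the name table is only a small subset of the documented codes, not an authoritative list.

```python
from dataclasses import dataclass

# Illustrative subset of documented bugcheck codes; the full list is far longer.
BUGCHECK_NAMES = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x0000001E: "KMODE_EXCEPTION_NOT_HANDLED",
    0x00000050: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x000000C4: "DRIVER_VERIFIER_DETECTED_VIOLATION",
    0x000000D1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
    0x000000E2: "MANUALLY_INITIATED_CRASH",
}


@dataclass(frozen=True)
class BugCheck:
    """Hypothetical model of the five arguments passed to KeBugCheckEx."""
    code: int
    params: tuple  # the four ULONG_PTR parameters

    def __post_init__(self):
        if len(self.params) != 4:
            raise ValueError("a bugcheck carries exactly four parameters")

    def name(self):
        return BUGCHECK_NAMES.get(self.code, "UNKNOWN_BUGCHECK")

    def stop_line(self, pointer_bits=64):
        # ULONG_PTR is pointer-sized, so pad to 8 or 16 hex digits.
        width = pointer_bits // 4
        args = ", ".join(f"0x{p:0{width}X}" for p in self.params)
        return f"STOP: 0x{self.code:08X} ({args}) {self.name()}"


if __name__ == "__main__":
    crash = BugCheck(0xD1, (0x10, 0x2, 0x0, 0xFFFFF80001234567))
    print(crash.stop_line())
```

The meaning of the four parameters depends on the bugcheck code, which is why the sketch keeps them as an uninterpreted tuple.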

Inside of KeBugCheckEx
- KeBugCheckEx performs several functions
  - Disables interrupts
  - Notifies other CPUs to halt execution
  - Notifies registered drivers
  - Writes crash dump information to disk *
  - Restarts the system *

* Only if the system is configured to do so

The Windows Stop Screen

Bugcheck Codes
- Shared by many components and drivers
- The Windows Driver Kit currently documents over 250 unique bugcheck codes

Memory Dump Types
- Small memory dump
  - Records the smallest set of useful information
- Kernel memory dump *
  - Records only kernel memory, which speeds up the process of writing a crash dump
- Complete memory dump *
  - Records the entire contents of system memory

* If either a Kernel or Complete memory dump is selected, the system will also create a minidump and store it in the %SystemRoot%\minidump directory
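The dump type chosen in the Startup and Recovery dialog is stored in the registry. The sketch below decodes it, assuming the commonly documented `CrashDumpEnabled` value under the `CrashControl` key (0 = none, 1 = complete, 2 = kernel, 3 = small). The value name and its numbering are not stated on the slide, and the function names are hypothetical.

```python
# Assumed mapping for the CrashDumpEnabled registry value (not from the slides).
DUMP_TYPES = {
    0: "None",
    1: "Complete memory dump",
    2: "Kernel memory dump",
    3: "Small memory dump",
}


def describe_dump_setting(value):
    """Translate a CrashDumpEnabled value into the dump type it selects."""
    return DUMP_TYPES.get(value, f"Unrecognized setting ({value})")


def also_writes_minidump(value):
    """Per the slide, kernel and complete dumps also produce a minidump."""
    return value in (1, 2)


def read_dump_setting():
    """Read the live setting; only works on Windows, so import winreg lazily."""
    import winreg
    path = r"SYSTEM\CurrentControlSet\Control\CrashControl"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        value, _value_type = winreg.QueryValueEx(key, "CrashDumpEnabled")
    return value


if __name__ == "__main__":
    for setting in (0, 1, 2, 3):
        print(setting, describe_dump_setting(setting), also_writes_minidump(setting))
```

Later Windows releases may use additional values, so unknown numbers are reported rather than rejected.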

Configuring Debugging Information Options

Writing a Crash Dump
- Crash dump information is written to the paging file on the boot volume or to a dedicated dump file if specified
  - Too risky to create a new file on the system
- How does the system know it’s safe?
  - The boot volume paging file’s on-disk mapping is obtained when the system starts
  - Critical crash components are checksummed
  - When a crash occurs, if the checksum doesn’t match, a memory dump is not written

Why Would You Not Get a Dump?
- Problems with page file configuration
  - The paging file on the boot volume is too small or one does not exist
  - The system crashed before the paging file was initialized
- Critical crash components are corrupted
- Windows didn’t crash!
  - The system spontaneously restarted
  - The system is hung
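The paging file checks above can be sketched as a small pre-flight test. The thresholds are assumptions based on commonly cited guidance (about 2 MB for a small dump, physical memory plus 1 MB for a complete dump) and are not stated on the slide. The size of a kernel dump depends on how much kernel memory is in use, so the sketch does not guess a number for it. All function names are hypothetical.

```python
MB = 1024 * 1024
GB = 1024 * MB


def minimum_paging_file_bytes(dump_type, physical_memory_bytes):
    """Assumed minimum boot-volume paging file size, or None if it varies."""
    if dump_type == "small":
        return 2 * MB
    if dump_type == "complete":
        return physical_memory_bytes + 1 * MB
    if dump_type == "kernel":
        return None  # depends on kernel memory in use at crash time
    raise ValueError(f"unknown dump type: {dump_type!r}")


def dump_blockers(dump_type, physical_memory_bytes, boot_paging_file_bytes):
    """Return paging-file reasons why a dump might not be written."""
    if not boot_paging_file_bytes:
        return ["no paging file exists on the boot volume"]
    needed = minimum_paging_file_bytes(dump_type, physical_memory_bytes)
    if needed is not None and boot_paging_file_bytes < needed:
        return [
            "paging file on the boot volume is too small: "
            f"{boot_paging_file_bytes // MB} MB present, "
            f"about {needed // MB} MB needed"
        ]
    return []


if __name__ == "__main__":
    print(dump_blockers("complete", 4 * GB, 2 * GB))
    print(dump_blockers("small", 4 * GB, 64 * MB))
```

An empty list only means the paging file is not the obstacle; the other causes on the slide, such as corrupted crash components or a hang, cannot be detected this way.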

Analyzing a Crash Dump
- The Microsoft kernel debuggers can be used to open and analyze a crash dump
  - kd, a command line tool, and WinDbg, a GUI tool
  - Available as part of the Debugging Tools for Windows
- Configure the debugger to point to symbols
  - srv*C:\SYMBOLS*

Automated Analysis
- When you open a crash dump with WinDbg or kd, the debugger performs basic crash analysis *
  - Displays stop code and parameter information
  - Takes a guess at the offending driver
- The analysis is the result of the automated execution of the !analyze debugger command
  - !analyze uses the bugcheck parameters and a set of heuristics to determine what component is the likely cause of the crash

* Set the environment variable DBGENG_NO_BUGCHECK_ANALYSIS=1 to disable
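For batch work, the same analysis can be driven from the command line. The sketch below only builds the argument list; it assumes `kd.exe` is on the path and uses the debugger's `-z` (dump file), `-y` (symbol path) and `-c` (initial command) options. The symbol server URL is a placeholder you must supply, since the transcript does not preserve it, and the function names are hypothetical.

```python
def symbol_path(cache_dir, server_url):
    """Build a srv*<local cache>*<symbol server> symbol path string."""
    return f"srv*{cache_dir}*{server_url}"


def kd_analyze_command(dump_file, sympath, kd_exe="kd.exe"):
    """Argument list that opens a dump, runs verbose analysis, then quits."""
    return [kd_exe, "-z", dump_file, "-y", sympath, "-c", "!analyze -v; q"]


if __name__ == "__main__":
    # Placeholder URL: substitute the symbol server you actually use.
    path = symbol_path(r"C:\SYMBOLS", "https://symbols.example.invalid")
    command = kd_analyze_command(r"C:\Windows\MEMORY.DMP", path)
    print(command)
    # On a machine with the Debugging Tools installed, this list could be
    # passed to subprocess.run(command, capture_output=True, text=True).
```

Passing a list instead of a single string avoids quoting problems with the asterisks and backslashes in the symbol path.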

Automated Analysis Using !analyze

Memory Corruption
- Occurs when a driver writes past the end (an overrun) or before the beginning (an underrun) of its memory allocation
- Usually detected when overwritten data is referenced by the kernel or another driver
  - It’s possible there’s a long delay between corruption and detection
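The delay between corruption and detection is easier to see in a toy model. The sketch below is not how the Windows pool allocator works; it is a deliberately simplified simulation in which each allocation is preceded by a 4-byte header, writes are unchecked, and the header is only validated when a block is freed. `ToyPool` and its methods are hypothetical.

```python
HEADER = b"POOL"  # stand-in for real allocator bookkeeping


class ToyPool:
    """Simplified bump allocator with a header in front of every block."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.next_free = 0

    def alloc(self, length):
        start = self.next_free
        if start + len(HEADER) + length > len(self.mem):
            raise MemoryError("toy pool exhausted")
        self.mem[start:start + len(HEADER)] = HEADER
        self.next_free = start + len(HEADER) + length
        return start + len(HEADER)  # address of the usable bytes

    def write(self, addr, data):
        # No bounds check, just like a buggy driver writing through a pointer.
        self.mem[addr:addr + len(data)] = data

    def free(self, addr):
        # Corruption is only noticed here, possibly long after it happened.
        if self.mem[addr - len(HEADER):addr] != HEADER:
            raise RuntimeError(f"pool header corrupted before block at {addr}")


if __name__ == "__main__":
    pool = ToyPool(64)
    culprit = pool.alloc(8)
    victim = pool.alloc(8)
    pool.write(culprit, b"A" * 12)  # 4-byte overrun into the victim's header
    pool.free(culprit)              # succeeds: the culprit's header is intact
    try:
        pool.free(victim)           # the victim is the one that "crashes"
    except RuntimeError as error:
        print(error)
```

The failure surfaces in code that handles the victim block, which is why the component named in a dump is often not the one that did the damage.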

Viewing the Effects of Memory Corruption

Crash Transformation
- For crashes that are difficult to analyze
  - The “victim” crashed the system, not the culprit
  - The debugger points to ntoskrnl.exe, win32k.sys or other Windows components
  - You get many different crash dumps all pointing at different causes
- Your goal isn’t to analyze difficult crashes …
  - It’s to try to make an “unanalyzable” crash into one that can be easily analyzed

Driver Verifier
- Useful for identifying code defects in drivers
  - Performs more thorough checks on the system and device drivers as well as simulating failures
- Support is built into the operating system
- The requirements for the Windows logo program state that a driver must not fail while running under Driver Verifier

Using Driver Verifier to Catch a Buffer Overrun

Manual Analysis
- Sometimes !analyze isn’t enough
  - It might not tell you anything useful
  - You want to know in more detail what was happening at the time of the crash
- Several useful commands and techniques
  - Verify the time of the crash, .time
    - A short uptime value can mean frequent problems
  - Check the stack on each CPU; stacks are read from the bottom to the top
    - !cpuinfo will display a list of all the CPUs
    - Use ~Ns, where N is the CPU number, to switch to a different CPU for investigation
    - k to display the stack

Manual Analysis
- Several useful commands and techniques
  - Look at memory usage, !vm
    - Make sure memory pools are not depleted or contain errors
    - Use !poolused to identify large users
  - Check the currently running thread, !thread
    - May or may not be related to the crash
    - Check pending I/O requests using !irp
  - List all processes on the system, !process 0 0
    - Make sure you understand what was running at the time
  - List loaded drivers, lm t n
    - Make sure all the drivers are recognizable and up to date
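The commands from the two Manual Analysis slides can be collected into one triage script. The sketch below joins them with the debugger's `;` command separator so the result can be pasted into kd or WinDbg. `!irp` is left out because it needs the address of a specific IRP, usually taken from the `!thread` output. The function and list names are hypothetical.

```python
# Commands taken from the slides, paired with the reason for running each.
TRIAGE_COMMANDS = [
    (".time", "crash time and system uptime"),
    ("!cpuinfo", "list the CPUs in the system"),
    ("!vm", "memory usage and pool health"),
    ("!poolused", "largest pool consumers"),
    ("!thread", "thread running at the time of the crash"),
    ("!process 0 0", "all processes on the system"),
    ("lm t n", "loaded drivers with timestamps"),
]


def per_cpu_stack_commands(cpu_count):
    """Switch to each CPU in turn (~Ns) and display its stack (k)."""
    commands = []
    for cpu in range(cpu_count):
        commands += [f"~{cpu}s", "k"]
    return commands


def triage_script(cpu_count):
    """One line of debugger input covering the whole checklist."""
    commands = [command for command, _reason in TRIAGE_COMMANDS]
    commands += per_cpu_stack_commands(cpu_count)
    return "; ".join(commands)


if __name__ == "__main__":
    print(triage_script(2))
```

The CPU count has to come from the dump itself, for example from the `!cpuinfo` output, so a first pass with that command alone may be needed.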

Manual Analysis of a Crash Dump

Attaching a Kernel Debugger
- Required for debugging initialization failures and crashes where no dump file is created
- Requires that the system be started with the debugger enabled
- Support for null-modem, IEEE 1394 and USB 2.0 cables, as well as virtual machines and over the network in Windows 7
- Limited support for local kernel debugging
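On Windows Vista and later, starting the system with the debugger enabled is typically done with `bcdedit`. The slide does not name the tool, so treat that as an assumption. The sketch below only builds the command lines for the three cable transports listed above; the default port, baud rate, and channel values are illustrative, and the function name is hypothetical.

```python
def debug_enable_commands(transport, **settings):
    """Return the bcdedit command lines that enable kernel debugging."""
    if transport == "serial":
        port = settings.get("debugport", 1)
        baud = settings.get("baudrate", 115200)
        args = ["serial", f"debugport:{port}", f"baudrate:{baud}"]
    elif transport == "1394":
        channel = settings.get("channel", 1)
        args = ["1394", f"channel:{channel}"]
    elif transport == "usb":
        # The target name is chosen by the user; there is no sensible default.
        args = ["usb", f"targetname:{settings['targetname']}"]
    else:
        raise ValueError(f"unsupported transport: {transport!r}")
    return [
        ["bcdedit", "/debug", "on"],
        ["bcdedit", "/dbgsettings", *args],
    ]


if __name__ == "__main__":
    for command in debug_enable_commands("serial", debugport=1):
        print(" ".join(command))
```

These commands must be run from an elevated prompt on the target machine, and the change takes effect at the next boot.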

Attaching a Kernel Debugger to a Live System

Hung Systems
- Sometimes systems become unresponsive
  - Keyboard and mouse frozen
- Two types of hangs
  - Instant lockup
    - Kernel synchronization deadlock
    - Infinite loop at a high IRQL or a very high priority thread
  - Slowly grinding to a halt
    - Resource depletion

Initiating a Manual Crash
- Using the keyboard
  - Requires a PS/2 keyboard + registry key
    HKLM\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters\CrashOnCtrlScroll
- Using an NMI button
  - Requires specialized hardware + registry key
    HKLM\SYSTEM\CurrentControlSet\Control\CrashControl\NMICrashDump
- Using the debugger
  - Break in and execute the .crash command
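The two registry settings above can be prepared ahead of time as a `.reg` file. The sketch below generates one, assuming both values are DWORDs set to 1 to enable the feature; the slide gives the value names but not their data. The helper names are hypothetical, and a reboot is normally needed before the settings take effect.

```python
REG_HEADER = "Windows Registry Editor Version 5.00"

# (key, value name) pairs from the slide, with HKLM spelled out in full.
MANUAL_CRASH_SETTINGS = [
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters",
     "CrashOnCtrlScroll"),
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl",
     "NMICrashDump"),
]


def reg_dword(key, name, value):
    """One .reg section that sets a single DWORD value."""
    return f'[{key}]\n"{name}"=dword:{value:08x}\n'


def manual_crash_reg_file(enabled=True):
    """Text of a .reg file enabling (or disabling) both manual crash triggers."""
    data = 1 if enabled else 0  # assumption: 1 enables, 0 disables
    sections = [reg_dword(key, name, data) for key, name in MANUAL_CRASH_SETTINGS]
    return REG_HEADER + "\n\n" + "\n".join(sections)


if __name__ == "__main__":
    print(manual_crash_reg_file())
```

Writing the text to a file and importing it with the Registry Editor applies both settings at once; the `i8042prt` setting only affects PS/2 keyboards, as the slide notes.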

Debugging a Hung System

Additional Information
- Windows Internals, 5th edition
- Debugging Tools for Windows documentation
- Mark Russinovich’s Blog
- Advanced Windows Debugging Blog
- Crash Dump Analysis and Debugging Portal

Additional Information
- David Solomon Expert Seminars offers training on Windows Internals as public and private workshops and as public webinars via the Internet
- Currently scheduled upcoming classes
  - Public workshop in London, April 12th – April 16th
  - Public webinar, April 26th & April 28th
  - Public workshop in New York, May 3rd – May 7th
  - Public workshop in San Francisco, November 8th – November 12th
- Visit http:// for further course descriptions and up-to-date information