CrashDump Microsoft Corporation
Agenda
- Problem
  - Unexplainable field phenomena
  - New Developments in Crashdump
- Solution
  - How to get to ‘The Why’?
- Opportunity
  - New work and timelines
Unexplainable Field Phenomena
- All of these devices worked normally after reboot.
- No defects were found by file system scan.
Unexplainable Field Phenomena
- Microsoft has been tracking these for years
- Things aren’t getting better
- Customers expect a solution from us
- We have nothing to give them
- There are 2 theories
  - The device has a flaw
  - The device has been mishandled
Theory #1: The device has a flaw
- Goal: Address the flaw.
- Assumption: ATA devices are sophisticated enough to perform their own internal ‘crashdump’.
- Microsoft is not able to address these issues alone.
- Digression: Microsoft attempts to do this with the millions of crash reports it receives every day.
  - In general, user mode crashes are available to partners through different portals.
  - Partners with kernel mode drivers can download ~50 randomly selected CABs for a given bucket through the WER portal.
  - Partners only receive external mini dumps. Full dumps and internal crashes may only be given out by selected groups.
  - Kernel mode crashes typically are driver issues that cause Blue Screens of Death or reset the machine.
  - Analysis of data has found that device failure is a significant source of perceived “driver issues”.
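The per-bucket sampling mentioned in the digression can be pictured with a small sketch. This is not how the WER portal is implemented; it only illustrates the idea of grouping crash reports by bucket and handing out up to about 50 randomly chosen CABs from one bucket. The function names, the `(bucket_id, cab_name)` input shape, and the `seed` parameter are assumptions made for illustration.

```python
import random
from collections import defaultdict

SAMPLE_SIZE = 50  # the slide says roughly 50 CABs per bucket


def group_by_bucket(reports):
    """Group (bucket_id, cab_name) pairs into bucket_id -> list of CAB names."""
    buckets = defaultdict(list)
    for bucket_id, cab_name in reports:
        buckets[bucket_id].append(cab_name)
    return buckets


def sample_bucket(cabs, sample_size=SAMPLE_SIZE, seed=None):
    """Return up to sample_size CABs chosen at random without replacement."""
    rng = random.Random(seed)
    return rng.sample(cabs, min(sample_size, len(cabs)))
```

A bucket with fewer than 50 CABs is returned whole (in random order), so small buckets are never padded or repeated.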
Agenda
- Problem
  - Unexplainable field phenomena
  - New Developments in CrashDump
- Solution
  - How to get to ‘The Why’?
- Opportunity
  - New work and timelines
Windows devices customer experience data flow
[Diagram labels: Cloud Services (OCA, SQM, RAC); IHV; End user; MS info; Vendor info; Improved reliability of Windows storage experience]
Response Example
OCA process and workflow
OCA’s Expanding Focus
[Diagram labels: +Devices, +Drivers, +ISVs, MSFT]
Theory #2: The device has been mishandled
- Goal: Enable proper device handling.
- Assumptions:
  - The device has background scan information about internal issues, error handling, and the results of attempted corrections.
  - This background scan information would be useful to manufacturers if there were a method for delivering it from actively deployed systems.
  - Background scanning can result in actionable requests from devices, improving robustness and raising handling issues to the user's attention.
Agenda
- Problem
  - Unexplainable field phenomena
  - New Developments in Crashdump
- Solution
  - How to get to ‘The Why’?
- Opportunity
  - New work and timelines
How to get to ‘The Why’
- How to transport?
  - Time limited
  - Size negotiation
  - Security
- When to transport?
  - Host triggers
  - Device triggers
  - Dump persistence and recycling
- What to transport?
  - Bucketization
  - Device CrashDump (flavors?)
  - Background scan info
  - Does DSM affect collection content?
- How big are we willing to let this feature become?
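The "time limited" and "size negotiation" transport questions above can be made concrete with a minimal sketch. Nothing here is a proposed ATA command or an existing API: `read_chunk` is a hypothetical callback standing in for whatever mechanism reads dump bytes from the device, and the chunk size and time budget are arbitrary illustrative values.

```python
import time


def negotiate_size(host_max_bytes, device_dump_bytes):
    """Agree on a transfer size: never more than the host is willing to accept."""
    return min(host_max_bytes, device_dump_bytes)


def transfer_dump(read_chunk, dump_bytes, host_max_bytes, chunk_bytes=4096,
                  time_budget_s=1.0, clock=time.monotonic):
    """Pull a device dump in chunks until done or the time budget runs out.

    read_chunk(offset, length) is a hypothetical callback returning bytes.
    Returns (data, complete), where complete means the negotiated size
    (which may be smaller than the full dump) was fully transferred.
    """
    agreed = negotiate_size(host_max_bytes, dump_bytes)
    deadline = clock() + time_budget_s
    data = bytearray()
    while len(data) < agreed:
        if clock() >= deadline:
            return bytes(data), False  # out of time: keep the partial dump
        length = min(chunk_bytes, agreed - len(data))
        chunk = read_chunk(len(data), length)
        if not chunk:
            return bytes(data), False  # device returned nothing: stop early
        data.extend(chunk)
    return bytes(data), True
```

Returning the partial data together with a completeness flag leaves the "dump persistence and recycling" decision to the caller, for example whether to retry later or discard.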
Background Scan Coordination Components
- Idle time notification
- Power event notification
- Background Scan vs. Power policy precedence
- Host/Device Event synchronization (TimeStamped)
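Two of the components above, timestamped host/device event synchronization and scan-versus-power precedence, can be sketched as follows. This is an illustration under stated assumptions, not a proposed design: it assumes host and device timestamps are already on a common clock, the event and power state names are invented, and because the deck leaves the precedence question open it is exposed as a parameter rather than decided.

```python
import heapq
from typing import NamedTuple


class Event(NamedTuple):
    timestamp: float  # seconds on a clock shared by host and device (assumed)
    source: str       # "host" or "device"
    name: str         # hypothetical event name, e.g. "idle_start"


def merge_timelines(host_events, device_events):
    """Merge two timestamp-sorted event lists into one ordered timeline."""
    return list(heapq.merge(host_events, device_events,
                            key=lambda e: e.timestamp))


def may_start_scan(host_idle, power_state, power_policy_wins=True):
    """Decide whether the device may start a background scan.

    power_policy_wins models the open precedence question: when True, a
    low-power state blocks scanning even if the host is idle.
    """
    if not host_idle:
        return False
    if power_state == "low_power" and power_policy_wins:
        return False
    return True
```

A merged timeline makes it possible to see, for instance, whether a device scan was still running when the host issued a power-down event.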
Background Scan Coordination Considerations
Agenda
- Problem
  - Unexplainable field phenomena
  - New Developments in Crashdump
- Solution
  - How to get to ‘The Why’?
- Opportunity
  - New work and timelines
New work and Timelines
- Call for feedback, now.
- Proposal for T13 in June
- Approval in August