Troubleshooting SQL Server When You Cannot Access the Machine
Vicky Harp, Sr. Manager of SQL Server Product Management, IDERA
Let’s talk bugs
- All software has failure conditions
  - Corruption
  - Tampering
- Most software has bugs
  - Errors in logic
  - Errors in execution
Let’s talk bugs
- Grey areas
  - Scalability
  - Platform and hardware compatibility
  - Language and regional support
- When a customer experiences a problem, it’s no longer a grey area - it’s a bug.
Let’s talk bugs
- You will have a bug in your database.
- A customer is going to experience that bug.
- They will not be able to let you log in to their database.
- This is going to happen.
Troubleshooting without access to the machine is very hard. But it’s the best way to troubleshoot!
Why don’t you have access?
- Security policies
- Topological constraints
- Logistics and time zones
- Language barriers
- Availability of key personnel
- Inability to reproduce the issue on demand
Why shouldn’t you want access?
- Need for NDAs or security policies on customer data
- Responsibility for unintended effects
- No test case enrichment
- Skillset mismatch
- Single-threaded troubleshooting
It’s better to have scripts
- More secure
- More relevant
- Build up a library which can be used by others
- Can be re-used in the QA process
- Usable by a wide variety of personnel
- Easier to train support staff
Build supportability into your apps
- Distinct application names
- Metadata information in the database
- Application and database logging
- Self-tracing or verbose output flags for the data access layer
- Human-readable object and column names
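As an illustration of why a distinct application name helps: when the application sets one in its connection string (for example the `Application Name` keyword in SqlClient), a support script can isolate that application's sessions on a shared instance. The sketch below assumes a hypothetical name prefix `MyApp`; substitute whatever your application actually reports.

```sql
-- Sessions opened by one application, identified by the name it supplied
-- at connect time. N'MyApp%' is a hypothetical prefix used for illustration.
SELECT s.program_name,
       s.login_name,
       COUNT(*) AS session_count
FROM sys.dm_exec_sessions AS s
WHERE s.is_user_process = 1
  AND s.program_name LIKE N'MyApp%'
GROUP BY s.program_name, s.login_name
ORDER BY session_count DESC;
```

If every component (service, report runner, installer) uses its own name, the same query also shows which component is holding the connections.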
Build a support toolbox
- Create both generic scripts and problem-specific scripts
- Document and maintain these scripts like your production code
- Use your toolbox during your QA cycle
Build a support toolbox
- Almost always collect low-cost, useful data
  - Application metadata
  - Database topology
  - Fundamental performance metrics
  - Scope and scale data
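A generic "always collect" script might look something like the sketch below. It reads only server-level catalog views, so it should be cheap to run; the choice of columns is an assumption about what a support team would want, not a prescribed set.

```sql
-- Database status and settings (names and states only, no user data).
SELECT d.name,
       d.state_desc,
       d.recovery_model_desc,
       d.compatibility_level
FROM sys.databases AS d
ORDER BY d.name;

-- File topology: one row per data or log file.
-- size is stored in 8 KB pages, so convert to MB; cast first to avoid int overflow.
SELECT DB_NAME(mf.database_id) AS database_name,
       mf.type_desc,
       mf.name AS logical_file_name,
       CAST(mf.size AS bigint) * 8 / 1024 AS size_mb,
       mf.growth,
       mf.is_percent_growth
FROM sys.master_files AS mf
ORDER BY database_name, mf.file_id;
```

Physical file paths are left out on purpose, since some customers treat them as sensitive.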
Build a support toolbox
- Make it easy to collect problem-specific data
  - Reproduction cases
  - Errors
  - Performance data
Build a support toolbox
- Create your scripts with security in mind
  - Assume it might be code reviewed
  - Clearly document what the script is doing
  - Do not return potentially sensitive data (you can mask it)
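One possible way to mask data in a support script is sketched below. It borrows the `MyCustomers` table and `CustomerName` column from the later SET STATISTICS TIME slide; the `dbo` schema and the masking rule (keep the first character, star out the rest) are assumptions made for illustration, and a stricter policy might return only lengths or counts.

```sql
-- Purpose: show the shape of customer names without exposing the values.
-- Returns: first character + asterisks, and the name length. No other columns.
SELECT TOP (20)
       LEFT(CustomerName, 1)
         + REPLICATE('*', CASE WHEN LEN(CustomerName) > 1
                               THEN LEN(CustomerName) - 1
                               ELSE 0 END) AS MaskedName,
       LEN(CustomerName) AS NameLength
FROM dbo.MyCustomers
ORDER BY NameLength DESC;
```

The header comment states exactly what leaves the customer's environment, which makes a code review on their side quicker.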
Remote troubleshooting techniques
- DMV queries
- SET STATISTICS IO
- SET STATISTICS TIME
- Query plans
- Annotated stored procedures
- XEvent sessions
DMV Queries
- SQL Server ships with hundreds of dynamic management views
- Most DMVs are lightweight and return almost instantly
- Very well documented and uncontroversial for users
DMV Queries
- Practical uses:
  - Table sizes, row counts
  - File and filegroup topology
  - Database status
  - Backup history, job history
  - Performance metrics and blocking chains
  - Index utilization
  - Schema verification
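Two of the uses listed above can be sketched as follows. The first query joins `sys.dm_db_partition_stats` to the `sys.tables` and `sys.schemas` catalog views for table sizes and row counts; the second lists currently blocked requests from `sys.dm_exec_requests`. Column choices are illustrative.

```sql
-- Table sizes and row counts for the current database.
-- Rows are counted from the heap (index_id 0) or clustered index (index_id 1) only,
-- so nonclustered indexes do not inflate the count.
SELECT s.name AS schema_name,
       t.name AS table_name,
       SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count,
       SUM(ps.reserved_page_count) * 8 / 1024 AS reserved_mb
FROM sys.dm_db_partition_stats AS ps
JOIN sys.tables  AS t ON t.object_id = ps.object_id
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
GROUP BY s.name, t.name
ORDER BY reserved_mb DESC;

-- Requests that are currently blocked, and the session blocking each one.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       r.command
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;
```

Following `blocking_session_id` from row to row leads to the head of a blocking chain.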
DMV Queries
- Thoughtfully written scripts will not need modification for different environments
- DMVs are partially dependent on SQL Server version, but less so with more contemporary releases
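One way to keep a single script working across versions is to branch on the server's major version, as in the sketch below. It relies on `sys.dm_exec_session_wait_stats` having been introduced in SQL Server 2016 (major version 13) and falls back to the instance-wide `sys.dm_os_wait_stats` on older servers; the choice of wait statistics as the example is an assumption.

```sql
-- ProductVersion looks like '15.0.2000.5'; PARSENAME(..., 4) returns the first part.
DECLARE @major int =
    CAST(PARSENAME(CAST(SERVERPROPERTY('ProductVersion') AS nvarchar(128)), 4) AS int);

SELECT @major AS major_version,
       SERVERPROPERTY('Edition')      AS edition,
       SERVERPROPERTY('ProductLevel') AS product_level;

IF @major >= 13
    -- Per-session waits exist on SQL Server 2016 and later. Dynamic SQL keeps the
    -- reference away from older servers that would not recognise the view.
    EXEC sp_executesql
        N'SELECT TOP (10) session_id, wait_type, wait_time_ms
          FROM sys.dm_exec_session_wait_stats
          ORDER BY wait_time_ms DESC;';
ELSE
    -- Older servers: instance-wide waits only.
    SELECT TOP (10) wait_type, wait_time_ms
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;
```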
DEMO – DMV Queries
SET STATISTICS IO
- SET STATISTICS IO ON
- Verbose and valuable performance data
- Needs to be paired with the statement to be tuned

    set statistics io on
    exec p_myProblemProcedure @input='Repro'
SET STATISTICS IO

    set statistics io on
    exec sp_help

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'sysschobjs'. Scan count 1, logical reads 41, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'spt_values'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'sysscalartypes'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SET STATISTICS IO
- Columns reported per table: Table Name, Scan Count, Logical Reads, Physical Reads, Read-Ahead Reads, LOB Logical Reads, LOB Physical Reads, LOB Read-Ahead Reads
- Look for the tables and worktables with a lot of IO
- Look for tables you don’t expect to be accessed
SET STATISTICS TIME
- SET STATISTICS TIME ON
- Returns parse, compile, and execution time
- Works well in conjunction with SET STATISTICS IO ON
SET STATISTICS TIME

    select count(*) from MyCustomers where left(CustomerName,1) = 'a'
    select count(*) from MyCustomers where CustomerName like 'a%'

    SQL Server parse and compile time:
       CPU time = 0 ms, elapsed time = 5 ms.
    (1 row(s) affected)
    SQL Server Execution Times:
       CPU time = 93 ms, elapsed time = 97 ms.
    SQL Server Execution Times:
       CPU time = 32 ms, elapsed time = 30 ms.
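Since the two settings work well together, a script handed to a customer might simply wrap the reproduction call in both, as sketched here. `p_myProblemProcedure` and its `@input` value are the placeholder names from the earlier slide, not a real procedure.

```sql
-- Turn on both diagnostics for this session only.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- The statement to be tuned (placeholder procedure and parameter).
EXEC p_myProblemProcedure @input = 'Repro';

-- Switch the diagnostics back off so later statements are not affected.
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The statistics are returned as informational messages rather than as a result set, so the customer needs to send back the message output (the Messages tab in SSMS), not the results grid.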
DEMO – SET STATISTICS IO / SET STATISTICS TIME
Query Plans
- Gold standard in diagnosing query performance issues
- Does not require the user to actually execute the problem code
- Requires some level of expertise to interpret
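One way to capture an estimated plan without running the problem code is `SET SHOWPLAN_XML`, sketched below with the placeholder procedure from the earlier slide. While the setting is on, SQL Server compiles each statement and returns its plan as XML instead of executing it.

```sql
-- SET SHOWPLAN_XML must be the only statement in its batch.
-- GO is the batch separator understood by SSMS and sqlcmd, not a T-SQL statement.
SET SHOWPLAN_XML ON;
GO

-- Not executed: the estimated plan is returned as an XML document instead.
EXEC p_myProblemProcedure @input = 'Repro';
GO

SET SHOWPLAN_XML OFF;
GO
```

The customer can save the returned XML with a .sqlplan extension and send it back; it opens as a graphical plan in SSMS.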
DEMO – Query Plans
Annotated Stored Procedures
- Add error statements to help troubleshoot logical flow
- Helpful for complex procedures where query plans are cumbersome
- Either ship annotated procedures or have them available for swap-in
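A minimal sketch of an annotated procedure follows. The procedure name, the `@debug` flag, and the body are hypothetical; the table and column are borrowed from the SET STATISTICS TIME slide. It uses `RAISERROR` at severity 0 with `NOWAIT`, which sends an informational message to the client immediately rather than raising an actual error.

```sql
-- Hypothetical annotated variant of a procedure, kept available for swap-in.
CREATE PROCEDURE dbo.p_myProblemProcedure_annotated
    @input varchar(50),
    @debug bit = 0          -- annotations are silent unless this is set to 1
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @rows int;

    IF @debug = 1
        RAISERROR(N'Step 1: counting customers for input "%s"', 0, 1, @input) WITH NOWAIT;

    SELECT @rows = COUNT(*)
    FROM dbo.MyCustomers
    WHERE CustomerName LIKE @input + '%';

    IF @debug = 1
        RAISERROR(N'Step 1 done: %d matching rows', 0, 1, @rows) WITH NOWAIT;

    IF @rows = 0
    BEGIN
        IF @debug = 1
            RAISERROR(N'Branch taken: no matches, returning early', 0, 1) WITH NOWAIT;
        RETURN;
    END;

    SELECT @rows AS MatchingCustomers;
END;
```

Calling it with `@debug = 1` produces a step-by-step trace in the message output, which shows which branch was taken without anyone having to read a query plan.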
DEMO – Annotated Stored Procedures
XEvent Sessions
- Watch what is happening in your application and interpret it after the fact
- Can be tuned for a very exact use case or a broader diagnostic one
- May return sensitive information!
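A narrowly tuned session could be sketched like this. The session name, the `MyApp` application name, the one-second threshold, and the file name are all assumptions for illustration. Note that the completed events include the statement text, which may carry parameter values: this is where the sensitive-information warning applies.

```sql
-- Capture RPC calls and batches from one application that run longer than 1 second.
-- duration is reported in microseconds for these events.
CREATE EVENT SESSION [MyAppSupportTrace] ON SERVER
ADD EVENT sqlserver.rpc_completed (
    ACTION (sqlserver.client_app_name, sqlserver.database_name, sqlserver.session_id)
    WHERE (sqlserver.client_app_name = N'MyApp' AND duration > 1000000)
),
ADD EVENT sqlserver.sql_batch_completed (
    ACTION (sqlserver.client_app_name, sqlserver.database_name, sqlserver.session_id)
    WHERE (sqlserver.client_app_name = N'MyApp' AND duration > 1000000)
)
ADD TARGET package0.event_file (
    SET filename = N'MyAppSupportTrace.xel', max_file_size = (50)   -- size in MB
)
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS, STARTUP_STATE = OFF);
GO

ALTER EVENT SESSION [MyAppSupportTrace] ON SERVER STATE = START;
GO

-- ... the customer reproduces the problem here ...

ALTER EVENT SESSION [MyAppSupportTrace] ON SERVER STATE = STOP;
GO
DROP EVENT SESSION [MyAppSupportTrace] ON SERVER;
GO
```

The resulting .xel file can be reviewed by the customer before it is sent, then opened in SSMS for interpretation after the fact.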
DEMO – XEvent Sessions
A note on reproducing issues
- You do not have to fully reproduce an issue: you may simulate it in code
  - Hard-coding wait times
  - Contrived blocking chains
  - Query hints
  - Alternate tables
- Simulation and reproduction cases should be archived like source code
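A contrived blocking chain with a hard-coded wait might be simulated as below, using two query windows. The table is the `MyCustomers` example from the earlier slide (schema `dbo` assumed), and the 30-second delay is arbitrary.

```sql
-- Session 1: take an exclusive table lock and hold it for 30 seconds.
BEGIN TRANSACTION;

SELECT TOP (1) 1
FROM dbo.MyCustomers WITH (TABLOCKX, HOLDLOCK);   -- query hints force the lock

WAITFOR DELAY '00:00:30';                          -- hard-coded wait time

ROLLBACK TRANSACTION;

-- Session 2: run this in a second window while session 1 is still waiting.
-- Under the default locking READ COMMITTED isolation it blocks until session 1
-- rolls back; with READ_COMMITTED_SNAPSHOT enabled it reads row versions and
-- does not block.
SELECT COUNT(*)
FROM dbo.MyCustomers
WHERE CustomerName LIKE 'a%';
```

Checking such a script into the same repository as the support toolbox keeps the simulation repeatable for QA.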
Questions?