Published by John Walsh. Modified over 6 years ago.
1
Wait-Time Based Performance Management David Waugh Confio Software
2
Agenda
Methodology
Case Study I: Poor Code Design
Case Study II: Locking Problem
Case Study III: Network Issue
Case Study IV: High CPU Usage
Q&A
3
Working on the Right Problems?
Before we start, do you know…
  The most important problem in your database?
  Whether the vendor really fixed the problem in that patch?
  Which bottlenecks in your database directly impact end-user service?
4
Symptoms of Conventional Tools
Server health, not performance information
Statistics not tied to user experience
Finger pointing over problem ownership
5
Wait Time Tuning vs. Ratios
6
Measure End-User Wait Time
Database processing includes hundreds of steps
Identify Wait Time at every step
Rank bottlenecks by impact on end user
7
Grocery Store Analogy
Cashier is the CPU
Customer being checked out is “running”
Customers waiting in line are “runnable”
Customer 1 requires a price check
  Customer 1 “waits” on “Price Check”
  Customer 2 is checked out, i.e. “running”
  Customer 3 is “runnable”
Price check is completed
  Customer 1 goes to “runnable”
8
Execution Model
CPU 1
  SPID 60 – Running
Waiter List
  SPID 52 – ASYNC_NETWORK_IO
  SPID 53 – OLEDB
  SPID 54 – PAGELATCH_IO
  SPID 57 – LCK_M_S
  SPID 59 – WRITELOG
CPU 1 Queue
  SPID 51 – Runnable
  SPID 61 – Runnable
9
Execution Model (cont)
CPU 1
  SPID 60 – Running (needs to perform IO)
  SPID 51 – Running
Waiter List
  SPID 52 – OLEDB
  SPID 53 – WRITELOG
  SPID 54 – PAGELATCH_IO
  SPID 57 – LCK_M_X
  SPID 59 – WAITFOR
  SPID 60 – PAGELATCH_IO
CPU 1 Queue
  SPID 51 – Runnable
  SPID 61 – Runnable
  SPID 59 – Runnable
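The three states in the execution model (running, runnable, waiting) can be observed directly on SQL Server 2005 and later. The following is a minimal sketch, assuming the sys.dm_exec_requests view (not named in the slides) and VIEW SERVER STATE permission. Its status column reports running, runnable, or suspended, where suspended corresponds to the waiter list.

```sql
-- Sketch: classify current requests the way the execution model slides do.
-- Assumes SQL Server 2005+; sys.dm_exec_requests is not named in the slides.
SELECT r.session_id,                 -- the SPID
       r.status,                     -- running, runnable, suspended, ...
       r.wait_type,                  -- NULL unless the request is waiting
       r.wait_time AS wait_time_ms,  -- duration of the current wait, in ms
       r.last_wait_type
FROM sys.dm_exec_requests AS r
WHERE r.session_id <> @@SPID         -- leave out this query's own session
ORDER BY r.status, r.session_id;
```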
10
Wait Time Tables
sysprocesses: loginame, hostname, program_name, spid, dbid, waittype, lastwaittype, sql_handle, stmt_start, stmt_end, cmd
syscacheobjects (SQL2000): usecounts, pagesused
dm_exec_query_stats (SQL2005/8): execution_count, total_logical_writes, total_physical_reads, total_logical_reads
sysdatabases: dbid, name
::fn_get_sql: text
11
Sysprocesses Table
A MASTER Table
Holds SQL Server Process Information
COLUMNS
  loginame – Database user login
  hostname – Name of workstation
  program_name – Name of application
  spid – SQL Server process ID
  dbid – ID of database currently used by process
  waittype – Binary internal column, 0x0000 if not waiting
  lastwaittype – Name of last or current wait type
  sql_handle – Currently executing batch or object
  stmt_start – Starting offset of current SQL within the batch identified by sql_handle
  stmt_end – Ending offset of current SQL within the batch identified by sql_handle
  cmd – Command currently being executed
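A query along these lines shows how the columns fit together. It is a sketch only, assuming the SQL Server 2000-era system tables in the master database; the waittime column is not listed on the slide but exists in sysprocesses.

```sql
-- Sketch: list sessions that are currently waiting, longest wait first.
-- Assumes the SQL Server 2000-era system tables in master.
SELECT p.spid,
       p.loginame,
       p.hostname,
       p.program_name,          -- application name column in sysprocesses
       d.name AS database_name,
       p.lastwaittype,
       p.waittime,              -- current wait in ms (not listed on the slide)
       p.cmd
FROM master.dbo.sysprocesses AS p
JOIN master.dbo.sysdatabases AS d ON d.dbid = p.dbid
WHERE p.waittype <> 0x0000      -- 0x0000 means the session is not waiting
ORDER BY p.waittime DESC;
```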
12
SQL Statistics Information
SYSCACHEOBJECTS (SQL2000 – Master Table)
COLUMNS:
  sql – Procedure name or first 128 characters of batch
  cacheobjtype – Type of object in cache (Executable or Compiled Plans)
  usecounts – Number of times cache object has been used
  pagesused – Number of memory pages used by cache object
DM_EXEC_QUERY_STATS (SQL2005/8 – Dynamic Management View)
  sql_handle – Token which refers to the batch or stored procedure
  execution_count – Number of times the plan has executed since compiled
  total_logical_writes – For executions of plan since last compiled
  total_physical_reads – For executions of plan since last compiled
  total_logical_reads – For executions of plan since last compiled
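As an illustration of how these counters can be used on SQL Server 2005/2008, the sketch below ranks cached statements by logical reads. The averaged column and the TOP (10) cutoff are choices made for this example, not part of the presentation.

```sql
-- Sketch: top 10 cached statements by logical reads (SQL Server 2005/2008).
SELECT TOP (10)
       qs.execution_count,
       qs.total_logical_reads,
       qs.total_physical_reads,
       qs.total_logical_writes,
       qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
       st.text AS batch_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;
```

The counters cover only plans still in cache, so statements whose plans were evicted or recompiled start again from zero.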
13
SQL Text & Database Info
::fn_get_sql function
  Accepts sql_handle (from sysprocesses table)
  Returns current SQL Text for SPID
  In 2005/2008, use sys.dm_exec_sql_text instead
sysdatabases – Master table
  Contains one row for each database
  COLUMNS: dbid, name
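Both lookups can be sketched as follows. SPID 52 is an arbitrary example value, and sys.dm_exec_requests is an assumption of this sketch (the slide names only sys.dm_exec_sql_text).

```sql
-- SQL Server 2000: look up the handle, then call ::fn_get_sql.
-- SPID 52 is an arbitrary example value.
DECLARE @handle BINARY(20);
SELECT @handle = sql_handle
FROM master.dbo.sysprocesses
WHERE spid = 52;
SELECT [text] FROM ::fn_get_sql(@handle);

-- SQL Server 2005/2008: the same idea with sys.dm_exec_sql_text.
SELECT r.session_id, t.text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id = 52;
```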
14
Sample Wait Types
WRITELOG – Waiting for a log flush to complete
LCK_M_S, LCK_M_U, LCK_M_X… – Waiting to acquire locks
NETWORKIO – Waiting on the network
OLEDB – Waiting for an OLE DB provider to return data
WAITFOR (idle event) – Waiting during a WAITFOR command
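To see which of these wait types dominate on a given server, a query like the one below can help. It is a sketch assuming SQL Server 2005 or later, where the sys.dm_os_wait_stats view (not named in the slides) exposes cumulative waits and the NETWORKIO wait appears as ASYNC_NETWORK_IO. On SQL Server 2000, DBCC SQLPERF(WAITSTATS) is the rough equivalent.

```sql
-- Sketch: rank wait types by total time waited since the counters were last reset.
-- Assumes SQL Server 2005+.
SELECT TOP (10)
       ws.wait_type,
       ws.waiting_tasks_count,
       ws.wait_time_ms,
       ws.signal_wait_time_ms       -- portion spent runnable, waiting for a CPU
FROM sys.dm_os_wait_stats AS ws
WHERE ws.wait_type <> 'WAITFOR'     -- idle event, per the slide
ORDER BY ws.wait_time_ms DESC;
```

These are server-wide totals. They show which waits matter overall but not which SQL statement incurred them.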
15
Compliant Tool Types
Two Primary Types of Tools
Tracing Tools
  Tools that focus on one session at a time, often by tracing the process
  Examples: SQL Server Profiler, …
Continuous DB-Wide Monitoring Tools
  Tools that focus on all sessions by sampling
  Examples: Confio Ignite, Idera, PDW (2008) …
Both have a place in the organization
16
Tracing
Tracing with waits complies
High overhead
Point-in-time data only
Use cautiously due to session statistics skew
  95 of 100 sessions are running well
  5 of 100 have spent 99% of their time waiting for locked rows
  If you trace one of the “95” sessions, it appears as if you have no locking issues (and you spend time trying to tune other items that may not be important)
  If you trace one of the “5” sessions, it appears as if you could fix the locking problems and reduce your wait time by 99%
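The skew can be made concrete with a little arithmetic. As an illustrative assumption (not from the slides), suppose each of the 100 sessions accumulates the same total time T, the 5 affected sessions spend 99% of it on lock waits, and the other 95 spend none. The share of all session time lost to locking is then:

```latex
\[
\frac{5 \times 0.99\,T}{100 \times T} = 0.0495 \approx 5\%
\]
```

A trace of a single session reports either 0% or 99%, while the system-wide figure under this assumption is about 5%.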
17
Tracing (cont)
Very precise - may be only way to get some statistics
Variable information is available
Can provide detailed analysis even deeper than just waits
Ideal if a known problem is going to occur in the future
Difficult to see trends over time
18
Continuous Monitoring Tools
24/7 sampling provides real-time and historical perspective
Allows DBA to go back in time
  “I had a problem at 3:00 pm yesterday”
Not the level of detail provided by tracing
Most of these tools have trend reports that allow communication with other groups
  What is starting to perform poorly?
  What progress have we made while tuning?
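The sampling idea can be sketched in a few lines of T-SQL. This is a toy collector under stated assumptions, not how any of the named tools work: dbo.wait_samples is a hypothetical table name, sys.dm_exec_requests (SQL Server 2005+) stands in for whatever source a real tool polls, and a real collector would run continuously rather than for one minute.

```sql
-- Sketch of a sampling collector (SQL Server 2005+).
-- dbo.wait_samples is a hypothetical table name.
CREATE TABLE dbo.wait_samples (
    sample_time DATETIME      NOT NULL,
    session_id  SMALLINT      NOT NULL,
    sql_handle  VARBINARY(64) NULL,
    wait_type   NVARCHAR(60)  NULL     -- NULL: on CPU or runnable, not waiting
);

DECLARE @i INT;
SET @i = 0;
WHILE @i < 60                          -- one sample per second for one minute
BEGIN
    INSERT INTO dbo.wait_samples (sample_time, session_id, sql_handle, wait_type)
    SELECT GETDATE(), r.session_id, r.sql_handle, r.wait_type
    FROM sys.dm_exec_requests AS r
    WHERE r.session_id > 50            -- commonly used cutoff for user sessions
      AND r.session_id <> @@SPID;      -- skip the collector itself
    WAITFOR DELAY '00:00:01';
    SET @i = @i + 1;
END;

-- Each sample stands for roughly one second of activity.
SELECT sql_handle, wait_type, COUNT(*) AS approx_seconds
FROM dbo.wait_samples
GROUP BY sql_handle, wait_type
ORDER BY approx_seconds DESC;
```

Because every row carries a timestamp, the same table answers "what happened at 3:00 pm yesterday" by filtering on sample_time.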
19
Performance Intelligence (PI) - Wait Time Methodology
Four Key Principles of PI
  SQL View: All statistics and information at SQL statement level
  Time View: Measure time, not the number of times something occurred
  Full View: Measure every wait individually to isolate the source of problems
  Historical View: Store data long term to spot trends, anomalies, and relationships, and to make analysis easier
20
Case Study I: Poor Code Design
21
Problem Observed
Situation:
  Developer noticed long processing times when updating data in test db
  Production DBAs would not let code go to production that was taking this long
  Existing database tools not giving enough information to resolve issues in a timely fashion
  Used wait time methodology to determine what the process was waiting for
22
Original Problem
Problem Code – WRITELOG Waits
4.5 hrs waiting on CPU
23
What does Performance Data tell us?
Which SQL: Insert Loop
Which Wait Type: WRITELOG
How much time: 7+ Minutes
24
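Code Review
Inserted 70,000 rows in 7:34
  DECLARE @i INT
  SET @i = 1
  WHILE @i < 70000
  BEGIN
      BEGIN TRANSACTION
      INSERT INTO [jpetstore].[dbo].[product] ([productid], [category], [name], [descn])
      -- The value expressions are partly lost in the source; CAST(...) is an assumed stand-in.
      VALUES (@i, @i / 1000,
              'PROD' + CAST(@i AS VARCHAR(10)),
              'PROD' + CAST(@i AS VARCHAR(10)))
      SET @i = @i + 1
      COMMIT   -- one commit, and so one log flush, per row
  END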
25
“WRITELOG” Description
Occurs while waiting for a log flush to complete. Common operations that cause log flushes are checkpoints and transaction commits.
Solutions
  Commit data less often
  Add additional IO bandwidth to the disk subsystem where the transaction log is stored
  Move non-transaction log IO from the disk
  Move the transaction log to a less busy disk
  Reducing the size of the transaction log has also helped in some cases
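Before moving the log or adding IO bandwidth, it can help to check how long log writes actually stall. The sketch below assumes SQL Server 2005 or later and uses sys.dm_io_virtual_file_stats and sys.database_files, neither of which is named in the slides. Run it in the database under investigation.

```sql
-- Sketch: average stall per write for the current database's log file(s).
-- Assumes SQL Server 2005+.
SELECT f.name AS logical_file_name,
       f.physical_name,
       vfs.num_of_writes,
       vfs.io_stall_write_ms,
       vfs.io_stall_write_ms * 1.0
           / NULLIF(vfs.num_of_writes, 0) AS avg_write_stall_ms  -- NULL if no writes yet
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
JOIN sys.database_files AS f ON f.file_id = vfs.file_id
WHERE f.type_desc = 'LOG';
```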
26
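Resolution
Inserted 70,000 rows in 0:05 vs. 7:34
  DECLARE @i INT
  SET @i = 1
  BEGIN TRANSACTION
  WHILE @i < 70000
  BEGIN
      INSERT INTO [jpetstore].[dbo].[product] ([productid], [category], [name], [descn])
      -- The value expressions are partly lost in the source; CAST(...) is an assumed stand-in.
      VALUES (@i, @i / 1000,
              'PROD' + CAST(@i AS VARCHAR(10)),
              'PROD' + CAST(@i AS VARCHAR(10)))
      SET @i = @i + 1
  END
  COMMIT   -- single commit after the loop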
27
Case Study II: Locking Problem
28
Problem Observed
Situation:
  Web application performance unsatisfactory
  Database performance causing excessive delays for end users
  Existing database tools not giving information to resolve issues in a timely fashion
  DBA team concerned because escalations and finger pointing were occurring
29
Offending SQL Statements
GetState SQL – 8 hours wait time
30
Wait Types During Problem
49% of time waited on LCK_M_U
27% of time waited on WRITELOG
31
What does Performance Data tell us?
Which SQL: GetState
Which Wait Type: LCK_M_U (49%), WRITELOG (27%)
How much time: 8 hours of wait time per day
32
“LCK_M_U” Description
Update Lock. Normally occurs when attempting to update a row that is locked by another session.
Resolved by: DBAs and Developers
Solutions
  For shared locks, check the isolation level of the transaction.
  Keep transactions as short as possible.
  Check for memory pressure, which causes more physical I/O, thus prolonging the duration of transactions and locks.
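When lock waits such as LCK_M_U dominate, the next question is usually which session holds the lock. A minimal sketch, assuming SQL Server 2005 or later and the sys.dm_os_waiting_tasks view (not named in the slides); on SQL Server 2000 the blocked column of sysprocesses gives similar information.

```sql
-- Sketch: who is waiting on a lock, and which session is blocking them.
-- Assumes SQL Server 2005+.
SELECT wt.session_id          AS waiting_session,
       wt.blocking_session_id AS blocking_session,
       wt.wait_type,
       wt.wait_duration_ms,
       wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.wait_type LIKE 'LCK[_]M[_]%'   -- lock waits only; [_] matches a literal underscore
ORDER BY wt.wait_duration_ms DESC;
```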
33
“WRITELOG” Description
Occurs while waiting for a log flush to complete. Common operations that cause log flushes are checkpoints and transaction commits.
Solutions
  Add additional IO bandwidth to the disk subsystem where the transaction log is stored
  Move non-transaction log IO from the disk
  Move the transaction log to a less busy disk
  Reducing the size of the transaction log has also helped in some cases
  Commit data less often
34
Results
Found Locking & Logging Wait Problems to be 76% of the total wait time
Solutions
  Deleted obsolete rows from table. Reduced wait time for this procedure from 8 hours on March 9th to 30 minutes after removing the data.
  Rebuilt indexes
  Resized transaction logs
35
Results Observed
36
Case Study III: Network Issues
37
Problem Observed
Situation:
  Microsoft Access application was performing very poorly
  Some screens in Access took several minutes to return to the user
  Access was hitting tables in SQL Server database
  Access developers blamed SQL Server DBAs
  Classic finger-pointing scenario
38
Problem Details
4.5 hrs waiting on CPU
2 hrs waiting on CPU
39
Offending SQLs
40
What does PI Data tell us?
Which SQL: PatImage
Which Resource: NETWORKIO
How much time: Hours of wait time per day
41
PatImage Details
42
“NETWORKIO” Description
Occurs on network writes when the task is blocked behind the network. May be blocked waiting for the client to receive data.
Verify that the client is processing data from the server
Resolved by: Network Administrators or Developers
Solutions:
  If abnormally high, check that a component of the network isn't malfunctioning.
  Otherwise, may need to speed up the client to accept or process data faster.
43
Resolution
Query is waiting on NETWORKIO
Call the Network Admin, right?
Not so fast – review the query again
  Query has no WHERE clause
  Access was sending this query to SQL Server, getting every row in the PatImage table
  Access then joined it to another table queried in a similar fashion
  Access did the joins instead of SQL Server
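The difference can be sketched as follows. Only the PatImage table name comes from the slides. The Patient table, the PatID and ImageData columns, and the @PatID value are hypothetical names invented for illustration, so the second query will not run without adapting it to the real schema.

```sql
-- What reached SQL Server, per the slide: every row of PatImage, no filter.
SELECT * FROM PatImage;

-- Sketch of the alternative: let SQL Server do the join and the filtering.
-- Patient, PatID, ImageData and @PatID are hypothetical names.
DECLARE @PatID INT;
SET @PatID = 1001;

SELECT p.PatID, i.ImageData
FROM Patient AS p
JOIN PatImage AS i ON i.PatID = p.PatID
WHERE p.PatID = @PatID;      -- only the matching rows cross the network
```

From Access, a pass-through query or a server-side view is one common way to make sure the join runs on the server.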
44
Case Study IV: High CPU Usage
45
Problem Observed
Situation:
  Encountering high CPU usage during the day
  Database performance causing excessive delays for external customers
  Existing database tools not giving enough information to resolve issues in a timely fashion
  Management wanted to purchase a new server with more powerful CPUs
46
Offending SQL
4.5 hrs waiting on CPU
2 hrs waiting on CPU
47
What does PI Data tell us?
Which SQL: WebLoad_Itemstyle (closeout & bike SQLs)
Which Resource: CPU
How much time: Hours of wait time per day
48
“CPU” Description
The database is typically using the CPU and/or memory (not necessarily waiting to use the CPU).
Solutions
  Memory scans: Queries that have high waits on CPU may be reading more data from memory than necessary (full table / inefficient index scans).
  Try to issue fewer queries. It is also possible to cache data in the application so that fewer queries are needed against the database.
  Check to see if other database activity, such as large batch jobs, can be scheduled for another time. These types of jobs may cause significant memory and/or CPU contention.
49
High Logical Reads
50
Results
Found CPU Usage / Wait Problem to be 100% of the total wait time.
Solutions
  Review current index usage to reduce the amount of data being read in memory.
  Move other batch processes that are running at the same time to other timeslots when CPU is not high.
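One way to start the index review on SQL Server 2005 or later is the sketch below. It assumes the sys.dm_db_index_usage_stats view (not named in the slides) and lists indexes in the current database that are scanned more often than they are sought. The counters reset when the instance restarts, so they reflect activity since startup only.

```sql
-- Sketch: indexes in the current database that are scanned more than sought.
-- Assumes SQL Server 2005+.
SELECT OBJECT_NAME(us.object_id) AS table_name,
       i.name AS index_name,       -- NULL for a heap
       us.user_seeks,
       us.user_scans,
       us.user_lookups,
       us.user_updates
FROM sys.dm_db_index_usage_stats AS us
JOIN sys.indexes AS i
  ON i.object_id = us.object_id
 AND i.index_id  = us.index_id
WHERE us.database_id = DB_ID()
  AND us.user_scans > us.user_seeks
ORDER BY us.user_scans DESC;
```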
51
Conclusion
Conventional tuning focuses on “system health” and can lead to finger-pointing and confusion
Wait Types tuning implemented according to Confio PI Methods is the best way to tune
  Continuous DB-wide monitoring tool
  4 Key Principles: SQL, time, resource (wait type), and historical views
Questions & Answers
52
Who is Confio
Founded 2002 by a DBA to change how database performance is managed
All products developed and supported in Boulder, CO
Hundreds of customers worldwide, including:
53
What We Do
Develop best-practice database performance solutions
Focus on end-user Wait-Time
Incorporate Performance Intelligence
Create Igniter Suite to complement existing tools
Deliver single solution for Microsoft, Oracle, Sybase & IBM databases
Download FREE evaluation at
54
Contact Info
David Waugh