Wait-Time Based Oracle Performance Management Prepared for Ohio Oracle User Group Presented by Dean Richards Senior Engineer, Confio Software.

Slides:



Advertisements
Similar presentations
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Advertisements

Advanced Oracle DB tuning Performance can be defined in very different ways (OLTP versus DSS) Specific goals and targets must be set => clear recognition.
Chapter 9. Performance Management Enterprise wide endeavor Research and ascertain all performance problems – not just DBMS Five factors influence DB performance.
13 Copyright © 2005, Oracle. All rights reserved. Monitoring and Improving Performance.
Module 13: Performance Tuning. Overview Performance tuning methodologies Instance level Database level Application level Overview of tools and techniques.
Adam Jorgensen Pragmatic Works Performance Optimization in SQL Server Analysis Services 2008.
Rollback Segments Nilendu Misra (MAR99)
The Architecture of Oracle
IO Waits Kyle Hailey #.2 Copyright 2006 Kyle Hailey Waits Covered in this Section  db file sequential read  db file scattered.
Enqueue Waits : Locks. #.2 Copyright 2006 Kyle Hailey Wait Tree - Locks Waits Disk I/O Library Cache Enqueue Undo TX 6 Row Lock TX 4 ITL Lock HW Lock.
Wait-Time Based Oracle Performance Management
Buffer Cache Waits. #.2 Copyright 2006 Kyle Hailey Buffer Cache Waits Waits Disk I/O Buffer Busy Library Cache Enqueue SQL*Net Free Buffer Hot Blocks.
Oracle Architecture. Instances and Databases (1/2)
INTRODUCTION TO ORACLE Lynnwood Brown System Managers LLC Performance And Tuning – Lecture 7 Copyright System Managers LLC 2007 all rights reserved.
INTRODUCTION TO ORACLE DATABASE ADMINISTRATION Lynnwood Brown System Managers LLC Introduction – Lecture 1 Copyright System Managers LLC 2007 all rights.
Operating System Support Focus on Architecture
1 - Oracle Server Architecture Overview
Memory Management 2010.
Harvard University Oracle Database Administration Session 5 Data Storage.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 11 Database Performance Tuning and Query Optimization.
Chapter 9 Overview  Reasons to monitor SQL Server  Performance Monitoring and Tuning  Tools for Monitoring SQL Server  Common Monitoring and Tuning.
Backup & Recovery 1.
Practical Database Design and Tuning. Outline  Practical Database Design and Tuning Physical Database Design in Relational Databases An Overview of Database.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Database Performance Tuning and Query Optimization.
IT The Relational DBMS Section 06. Relational Database Theory Physical Database Design.
Chapter Oracle Server An Oracle Server consists of an Oracle database (stored data, control and log files.) The Server will support SQL to define.
By Lecturer / Aisha Dawood 1.  You can control the number of dispatcher processes in the instance. Unlike the number of shared servers, the number of.
1 Robert Wijnbelt Health Check your Database A Performance Tuning Methodology.
Physical Database Design & Performance. Optimizing for Query Performance For DBs with high retrieval traffic as compared to maintenance traffic, optimizing.
7202ICT Database Administration Lecture 7 Managing Database Storage Part 2 Orale Concept Manuel Chapter 3 & 4.
Wait-Time Based Oracle Performance Management Prepared for UNYOUG Presented by Matt Larson CTO, Confio Software.
Extents, segments and blocks in detail. Database structure Database Table spaces Segment Extent Oracle block O/S block Data file logical physical.
Improving Efficiency of I/O Bound Systems More Memory, Better Caching Newer and Faster Disk Drives Set Object Access (SETOBJACC) Reorganize (RGZPFM) w/
Oracle9i Performance Tuning Chapter 1 Performance Tuning Overview.
Oracle Tuning Considerations. Agenda Why Tune ? Why Tune ? Ways to Improve Performance Ways to Improve Performance Hardware Hardware Software Software.
Oracle Tuning Ashok Kapur Hawkeye Technology, Inc.
9 Storage Structure and Relationships. 9-2 Objectives Listing the different segment types and their uses Controlling the use of extents by segments Stating.
DB Performance Ana Stanescu CIS764 - Fall 08 KSU.
Oracle9i Performance Tuning Chapter 12 Tuning Tools.
15 Copyright © 2006, Oracle. All rights reserved. Performance Tuning: Summary.
1 CS 430 Database Theory Winter 2005 Lecture 16: Inside a DBMS.
Achieving Scalability, Performance and Availability on Linux with Oracle 9iR2-RAC Grant McAlister Senior Database Engineer Amazon.com Paper
Database structure and space Management. Database Structure An ORACLE database has both a physical and logical structure. By separating physical and logical.
1 Chapter 10 Joins and Subqueries. 2 Joins & Subqueries Joins – Methods to combine data from multiple tables – Optimizer information can be limited based.
Database structure and space Management. Segments The level of logical database storage above an extent is called a segment. A segment is a set of extents.
#.6 Sampling Kyle Hailey
1 Chapter 13 Parallel SQL. 2 Understanding Parallel SQL Enables a SQL statement to be: – Split into multiple threads – Each thread processed simultaneously.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
10 Managing Rollback Segments Objectives Planning the number and size of rollback segments Creating rollback segments using appropriate storage.
Chapter 5 Index and Clustering
1 Copyright © 2005, Oracle. All rights reserved. Following a Tuning Methodology.
Preface 1Performance Tuning Methodology: A Review Course Structure 1-2 Lesson Objective 1-3 Concepts 1-4 Determining the Worst Bottleneck 1-5 Understanding.
IMS 4212: Database Implementation 1 Dr. Lawrence West, Management Dept., University of Central Florida Physical Database Implementation—Topics.
1 Chapter 9 Tuning Table Access. 2 Overview Improve performance of access to single table Explain access methods – Full Table Scan – Index – Partition-level.
for all Hyperion video tutorial/Training/Certification/Material Essbase Optimization Techniques by Amit.
Database Systems, 8 th Edition SQL Performance Tuning Evaluated from client perspective –Most current relational DBMSs perform automatic query optimization.
Same Plan Different Performance Mauro Pagano. Consultant/Developer/Analyst Oracle  Enkitec  Accenture DBPerf and SQL Tuning Training Tools (SQLT, SQLd360,
This document is provided for informational purposes only and Microsoft makes no warranties, either express or implied, in this document. Information.
Oracle Database Architectural Components
Practical Database Design and Tuning
Performance Management
Wait-Time Based Performance Management David Waugh Confio Software
Indexing Structures for Files and Physical Database Design
Database structure and space Management
Database Performance Tuning and Query Optimization
Practical Database Design and Tuning
Transactions, Locking and Query Optimisation
Troubleshooting Techniques(*)
Chapter 11 Database Performance Tuning and Query Optimization
Presentation transcript:

Wait-Time Based Oracle Performance Management Prepared for Ohio Oracle User Group Presented by Dean Richards Senior Engineer, Confio Software

2 Who is Dean Richards?  Senior DBA/Engineer for Confio Software  Former DBA performance consultant with Oracle for five years  Specialize in performance tuning using Oracle Wait Event interface  Review performance of Oracle databases at hundreds of customers each year

3 Agenda  Confio Performance Intelligence  Case Study One: Hot Block Issue  Case Study Two: Full Table Scans  Case Study Three: Inefficient Indexes  Q&A

4 Working the Right Problems?  After spending an agonizing week tuning Oracle to minimize I/O operations, management typically rewards you with: A. An all expense paid vacation B. A free lunch C. Crumbs from the kitchen D. Reward? Nobody even noticed!

5 Why Does This Happen?  Many tools measure system health  Assumption: If I make the database healthy, users benefit  Symptoms DBA finds “big” problem and fixes it, users report no impact Lots of data to review and things to fix, not sure which to do first Unclear view of performance leads to Finger-pointing Developer or vendor It’s your Code! It’s your Database! IT staff

6 Confio Performance Intelligence  Three Key Principles 1. SQL View: All statistics at SQL statement level 2. Full View: Separately measure every resource (Oracle wait events) to isolate source of problems 3. Time View: Measure Time, not number of times a resource is utilized

7 Focuses on User Wait-Time 1. Identify each bottleneck affecting the user 2. Rank bottlenecks by user impact 3. Implement proven suggestions 4. Set correct expectations on impact of fix 5. Show proof the fix helped users

8 SQL View Principle  Example: ‘CEO’ measuring ‘employee’ output  Averaging over entire company gives no useful data  Must measure each job separately  DBA must manage database similarly  Measure and identify bottlenecks for each SQL independently

9 Time View Principle  Example: ‘CEO’ counting ‘tasks’ vs. ‘time to complete’  Counting system statistics not meaningful  Must measure Time to complete  System stats (buffer size, hit ratios, I/O counts) do not identify where database customers are waiting  Identify and optimize Wait Time for each SQL as best indicator of performance

10 Full View Principle  Example: ‘CEO’ measuring results with blind spot hiding key processes  Without direct visibility, valuable info is lost  Must have visibility to every process step  Distinctly identify and measure each Oracle resource for each distinct SQL

11 Compliant Performance Tool Types Two Primary Types of Tools  Session Specific Tools Tools that focus on one session at a time often by tracing the process Examples: OraSRP Profiler (open source), Hotsos Profiler, tkprof  Continuous DB Wide Monitoring Tools Tools that focus on all sessions by sampling Oracle Examples: Confio Ignite, Symantec i3  Both tools have a place in the organization

12 Tracing  Use cautiously due to session statistics skew 95 out of 100 users are running well 5 out of 100 have spent 99% of time waiting for locked rows If you trace one of the “95” sessions –No locking problems at all –May spend time trying to tune other items that may not be important If you trace one of the “5” sessions –Severe locking problems –Appears that you could fix the locking problems and reduce your wait time by 99%

13 Tracing (cont)  Advantages Very precise - may be only way to get some statistics Bind variable information is available Can provide detailed analysis even deeper than wait events  Disadvantages Only works if a known problem is going to occur in the future (and known session) Difficult to see trends over time

14 Continuous DB Wide Monitoring Tools  24/7 sampling provides real-time and historical perspective  Allows DBA to go back in time I had a problem at 3:00 pm yesterday  Use Performance Intelligence - trend reports, graphs, etc to communicate with other groups What is starting to perform poorly? What progress have we made while tuning?

15 Oracle Wait Interface  V$SESSION_WAIT (X$KSUSECST) SID (join to v$session) EVENT P1, P1RAW, P2, P2RAW, P3, P3RAW STATE = ‘WAITING’ – currently waiting on event STATE = ‘WAITED…’ – currently on CPU (or in queue)  Oracle 10g added this info to V$SESSION

16 Oracle Sessions  V$SESSION (X$KSUSE) SID USERNAME SQL_HASH_VALUE –Join to V$SQL PROGRAM MODULE / ACTION –DBMS_APPLICATION_INFO PLAN_HASH_VALUE –Join to V$SQL_PLAN

17 Base Query SELECT sid, username, program, module, action, machine, osuser, sql_hash_value, … decode(state, ‘WAITING’, event, ‘CPU’) event, p1, p1raw, p2, …, SYSDATE FROM V$SESSION s WHERE s.status = ‘ACTIVE’ AND event NOT IN ( );

18 Additional Information  V$SESSION service_name, machine, client_info row_wait_obj#, blocking_session  Go back later to get Sql_text from v$sql SQL stats from v$sqlarea Execution plan from v$sql_plan Object info from dba_objects

19 Active Session History  V$ACTIVE_SESSION_HISTORY Data warehouse for session statistics Oracle 10g and higher Data is sampled every second Holds at least one hour of history Never bigger than: –2% of SGA_TARGET –5% of SHARED_POOL (if automatic sga sizing is turned off)  WRH$_ACTIVE_SESSION_HISTORY Above table gets flushed to this table

20 Oracle 11g Wait Events  928 wait events in 11.1  871 wait events in 10GR2

21 Top Wait Time (52 Customers)  db file sequential read - 28%  db file scattered read- 27%  CPU- 12%  direct path read / write- 11%  buffer busy waits- 5%  log file sync- 3%  library cache lock- 2%  log buffer space- 2%

Case Study One Hot Block Issue

23 Problem Observed  Critical situation: application performance unsatisfactory All coming into and going out of the company was tracked in order to find: –Viruses –Espionage –Legal reasons However, was getting behind not getting to end-users for several hours Declared top priority in company

24 Wait Events During Problem Buffer busy waits Query that is doing “real” work Log file waits

25 What do we know?  Which SQL: DDL or Commits SQL hash_value=0  Which Resource: buffer busy waits log file waits  How much time: 163 Hours of wait time per day

26 “buffer busy waits” Description  Buffer is being read into cache by another session and this session is waiting for that process to complete. In Oracle 10g buffer busy waits are further refined and this becomes “read by other session”  Buffer is already in the cache but in an incompatible mode, i.e. another session is changing it.

27 “buffer busy waits” Description  P1 – file number information  P2 – block number information SELECT owner, segment_name, segment_type FROM dba_extents WHERE file_id = &P1 AND &P2 BETWEEN block_id AND block_id + blocks -1  Gives information about the object being waited for

28 “buffer busy waits” Description  Waiting on Data Blocks Tune Inefficient Queries Eliminate Hot Blocks  Waiting on Segment Headers Optimize PCTFREE / PCTUSED Use Multiple Freelists Use Larger Extents Optimize Application

29 “buffer busy waits” Analysis

30 Results  Found hot block problem “buffer busy waits” was waiting for Block #2 in the file “…staging01.dbf” The processing code was creating a series of staging tables, every time it executed  Solutions Started using temporary tables vs. create/drop distinct tables each time the process ran

Case Study Two DB File Scattered Reads

32 Problem Observed  Problem: Login taking 12 minutes for each user when they started their day High wait accumulation from 6:30 – 8:30 am 240 Users X 12 Minutes = 48 Hours Every Day 6 employees wasted time per day $400,000+ wasted per year  Applied Confio methods to problem identification Identify Wait Time, offending SQL, offending Resource

33 Wait Events During Problem

34 Investigation

35 What do we know?  Which SQL: LoginLookup UpdateInventory  Which Resource: scattered read buffer busy waits  How much time: 48+ Hours Every Day

36 Hypotheses: Oracle Interpretations Key Questions: 1.Is full table scan necessary? 2.What causes a full table scan for this SQL Statement? Two Alternative paths for optimization: I.Eliminate Full Table Scan 1.Add Index / Collect Histograms 2.Update Statistics 3.Utilize Query Hints II.Full Table Scan Required - Improve response time 1.Parallelized Reads 2.Optimize I/O Subsystem 3.Optimize Application

37 I. Unnecessary Full Table Scan? Solutions: 1. Add / Modify index(es) on the table 2. Update table and/or index statistics if proper index not being used 3. Add hint to use existing index

38 Full Table Scan is Needed Two alternative paths for optimization: I. Eliminate Full Table Scan There isn’t a need to read the whole table, so we need to find the right shortcut II. Improve response time We need to read most or all of the table anyway, so let’s just figure out how to do it faster

39 Solutions: 1. Use Parallel Reads 2. Set Database Parameters 3. Improve I/O Speed 4. Optimize the application II. Improve Response Time for Db File Scattered Reads

40 1. Use Parallel Reads = Faster FTS  Parallel Reads Can be set at the table level (use with caution) Alter table customer parallel degree 4; Normally used by hinting in the SQL Statement select /*+ FULL(customer) PARALLEL(customer, 4) */ customer_name from customer;  A delicate tradeoff sacrifice the performance of others for the running query.  Not necessarily efficient Just may be faster. Parallel Reads may actually do twice the work of a sequential query but have four workers, thus finishing in half the time while using 8x resource

41 2. Set database parameters  DB_FILE_MULTIBLOCK_READ_COUNT specifies the maximum number of blocks read in one I/O operation during a sequential scan Impacts the optimizer Reduces number of I/Os required For OLTP, typically between 4 to 16 Optimizer will more likely to FTS if set too high  Ensure that the database read requests are synced up with the O/S.  This gets tricky if different block sizes are used in different tablespaces

42 3. Improve I/O speed  Get your SA involved  Investigate I/O sub-system Iostat, vmstat, sar, … for potential problems Monitor during high activity  Investigate contention at the disk/controller level. Learn which disks share common resources Use more disks to spread I/O and reduce hot spots  Investigate caching and current memory usage

43 4. Optimizing the Application  Review application – do you have access to code for changes? No – look into stored outlines  Techniques to Optimize a statement: Reduce the number of calls for a SQL –Caching the data in the application –Creating a summary table (perhaps via a materialized view) –Eliminating the need for the data Retrieve Less Data with each statement –Add fields to the WHERE clause Combine SQLs for fewer calls –Combine several SQLs with different bind variables into one large statement that retrieves all the data in one shot

44 Results  Added indexes to underlying tables  Added Materialized View Full Table Scan Fixed

Case Study Three DB File Sequential Reads

46 Problem Observed  Data Warehouse loads were taking too long  Noticed high wait times on “db file sequential read” wait event  DBAs were confused – why are data loads “reading” data  Applied Confio Method to problem identification Identify Wait Time, offending SQL, offending Resource

47 Investigation

48 Investigation for an INSERT Statement Sequential read time by object for SQL

49 What do we know?  Which SQL: Load Process  Which Resource: DB File Sequential Read  How much time: 5 hour+ 90% of wait time

50 Investigating db file sequential reads  Often considered a “good” read  DB file sequential reads normally occur during index lookups  Often a single-block read although it may retrieve more than one block. P1 – file id P2 – block id Join to DBA_EXTENTS (see buffer busy waits)

51 Hypotheses: Oracle Interpretations of Sequential Reads Causes of excessive wait times: I. Reading too many index leaf blocks II. Low cardinality first column index III. Not finding block in buffer cache forces disk read IV. Slow disk reads V. Contention for certain blocks

52 I. Reading too many index and table blocks (cont) 1. Rebuild Fragmented Indexes alter index rebuild [online]; 2. Compress Indexes alter index rebuild compress; Uses more CPU 3. Multi-column indexes Avoid the table lookup Will create a larger index 4. Pre-sort Table data

53 II. Low cardinality first column index  If first column of index is low / medium cardinality, much time is spent scanning the index leaf blocks  The additional columns do not lower the number of leaf blocks that are read.  Solutions: Use a leading column with better cardinality Compress the index

54 III. Not finding block in buffer cache forces disk read  Db File sequential reads occur because the block is not in the buffer cache.  How do we make sure more blocks are already in the cache?  Solutions 1.Increase the size of the buffer cache(s) 2.Put the object in a cache where it is less likely to get flushed out

55 IV. Slow disk reads  With databases, it often comes down to this – the disk just needs to be faster  Put certain objects on the fastest disk  O/S file caching using special software that makes normal files perform like raw files  Increase Storage System Caching – such as an EMC cache

56 Results  Many sessions were loading data and all were updating low cardinality indexes  Modified index and noticed a 50% performance improvement in an INSERT  Customer is also analyzing global vs. local indexes  Reviewing usage of bitmap indexes  Removed unused indexes  Enhanced the disk subsystem

Case Study Four Enqueue

58 Problem Observed  Problem: High Wait on Production Database Accumulated wait 11 hours (40,000 sec) during 4:00 – 8:00 am End users in UK were complaining loudly  Applied RMM approach to problem identification: Identify Wait Time, offending SQL, offending Resource

59 Investigation Huge waits on locking

60 Enqueue (enq: xxx in 10g) enq: TX  Locks held for the life of a transaction until a COMMIT or ROLLBACK. enq: TM  Locks being held when foreign key constraints are not indexed properly. enq: ST  Locks held during dynamic space allocations. enq: HW  Serialization for the allocation of space beyond the high water mark Causes

61  Generally due to application or table setup issues  Is acquired when a transaction initiates an UPDATE Row is locked by the session Others may select from the row (read consistency) Others wanting to UPDATE same row must wait  Lock is held until a COMMIT or ROLLBACK is issued TX enqueue

62 Waits caused by “normal” active transactions  Just issue a COMMIT or ROLLBACK  Determine what the true unit of work is TX enqueue

63 Waits due to Insufficient 'ITL' slots in a Block  The ITL (interested transaction list) is an area at the top of each data block where Oracle keeps track of which rows are locked by which transaction  Every transaction wanting to change a block requires a slot in the ITL list of the block  The number of ITL slots is controlled by INITRANS, initial number of slots at block creation MAXTRANS, total allowable slots over time ITL list will expand to allow MAXTRANS only if space is available TX enqueue

64 Waits due to rows being covered by the same BITMAP index fragment  Bitmap indexes allow one index entry to cover many rows within a table  If two sessions try to insert or update the same key value the second session has to wait TX enqueue

65  Caused by space management operations Can happen with small extent sizes, allocation of temporary segments for sorting May get an ORA indicating a timeout  Unnecessary Sorting Disk sorting requires space management and thus contention on the ST enqueue Eliminate as much disk sorting as possible Evaluate SORT_AREA_SIZE or PGA_AGGREGATE_TARGET parameters ST enqueue

66  General TEMP tablespace advice SMON cleanup of temporary space Set PCTINCREASE equal to zero stop cleanup/coalesce Set temporary tablespaces as TEMPORARY  SMON in parallel environment SMON Cleanup operations are magnified  Hanging or slow system Side effect of many processes on the ST enqueue ST enqueue

67  Acquired to move the HW mark  High volume of inserts across concurrent session could cause a wait on this contention  Recreate / modify the object with larger extents  Pre-allocate extents ALTER TABLE … ALLOCATE EXTENT  V$LOCK.ID1 is the tablespace number.  V$LOCK.ID2 is the relative dba of segment header of the object for which space is being allocated. HW enqueue

68 What is blocking session waiting on?  Idle Session  DB File Scattered Reads  Another session

69 Wait Events for Development  Tuning SQL for optimal performance  Debug/test/integrate/pilot process  Understand impact on existing database  Understand Oracle impact on application performance  View into production for better development prioritization and feedback  Reduce finger-pointing

70 Conclusion  Conventional Tuning focus on “system health” and lead to finger-pointing and confusion  Wait event tuning implemented according to Confio Performance Intelligence is the best way to tune  Two compliant tools types Tracing tools Continuous DB-wide monitoring tools

71 About Confio Software  Developer of Performance Tools  Igniter Suite Ignite for Oracle, DB2, SQL Server, Sybase Ignite for Java  Packaged, easy-to-use implementation of Wait Time Performance Tuning  Based in Colorado, 100’s of worldwide customers  Free trial at

72 Thank you for coming Dean Richards Stop by the Confio Booth Contact Information ext. 116 Company website