How STATSPACK Was Used to Solve Common Performance Issues Brian Hitchcock OCP 8, 8i, 9i DBA Sun Microsystems


How STATSPACK Was Used to Solve Common Performance Issues
Brian Hitchcock, OCP 8, 8i, 9i DBA, Sun Microsystems
Session id:
Dedicated to Pramitha Chowrira, the Goddess of the Rockies; Mike Waldron, the Student who became the Master; Sheryl Driscoll, who leads purely for the glory; and Ann Bischoff, who took me in trade for a developer...

What STATSPACK Is
• Set of SQL and PL/SQL
• Collects performance data from v$ tables
• Stores collected data in separate tables
• Each collection of data is a ‘snapshot’
• Reports deltas in data between snapshots
• Supports ad-hoc SQL queries of the snapshot data

STATSPACK Details
• Works for onwards
• Gathers data for a single instance
• Snapshot levels
  – Determine how much data is collected
  – Defaults are fine
• Snapshot interval of 15 minutes suggested
• Long report periods miss transient events
  – Reports over an instance restart are not valid

STATSPACK -- Good
• Free (very cool!)
• Gathers a wide range of data
  – You don’t know what you’re looking for at first
  – Root cause isn’t usually obvious
• Standard process to collect performance data
  – Gathers the same data on all instances
  – Easy to share with vendors, support groups

STATSPACK -- Not Perfect
• Gathers a wide range of data
  – Ocean of data
  – Any information?
  – Easy to get lost
• Does not tell you what the problem is
  – Shows what is happening in the instance
  – You need to figure out if this is a problem or not
• Does not tell you the solution
• Does not tell you that you are done tuning...

How to Interpret Output?
• Requires experience with your system
  – No single way to analyze output
  – Must have history of your system
  – Look for possible problem areas
  – Trial and error to change problem behavior
  – Only you can tell if you have a performance problem

STATSPACK Report Sections
  – Instance Summary, Efficiency
  – Top 5 Wait Events, Wait Events
  – SQL Ordered by Gets, Reads, Executions
  – Instance Activity Stats
  – Tablespace IO Stats Ordered by IOs, Tblspc-file
  – Buffer Pool Statistics
  – Rollback Segment Stats, Storage
  – Latch Activity, Sleep, Miss Sources
  – Dictionary Cache Stats
  – Library Cache Activity
  – SGA Memory Summary, Breakdown
  – init.ora Parameters

Documentation of output
• No comprehensive documentation
  – Oracle 8i Reference
    • Appendix A -- Wait Events defined
    • Appendix B -- Enqueue Names defined
    • Appendix C -- Statistics Descriptions
  – Database Performance Guide and Ref
    • Chapters 21-23, Supplied packages, how to use
  – $ORACLE_HOME/rdbms/admin/spdoc.txt
  – ORACLE High-Performance Tuning with STATSPACK, Donald K. Burleson, Oracle Press ISBN
• No explanation of
  – What output means for your system

Configuration Used
• Oracle
• Snapshots every 15 minutes
  – snapshots taken continuously
• Default STATSPACK snapshot ‘level’
• Application loads and analyzes web site click stream data
  – Lots of data
  – More data all the time
  – We don’t know what vendor code looks like

Actual Use
• 4 Performance issues in 2002
  – Case 1) Reports Running Slow
    • STATSPACK output didn’t show the problem
  – Case 2) Vendor Demo Slow
    • STATSPACK output didn’t show the problem
  – Case 3) Data Load Slow
    • STATSPACK output led to 18x speedup (1800%)
  – Case 4) Data Load Time Varies
    • STATSPACK output led to the root cause

Case 1) Reports Running Slow
• Vendor code allows users to set up reports
  – Vendor code generates SQL for report
• Long report runs interfere with the next day’s data load
• STATSPACK captures SQL
  – Generate explain plan(s)
• Generated report SQL doesn’t build the WHERE clause properly, so partition pruning isn’t used
• Vendor refuses to change their code
• We simply removed the reports
  – Performance issue ‘resolved’

Case 2) Vendor Demo Slow
• Due to issues like Case 1)
  – New vendor sets up demo, data load slow
  – Data load runs twice as fast at vendor
  – STATSPACK output doesn’t show anything obvious
  – Compare configuration of vendor and our dbs
  – Vendor has only one redo log file per group
  – We had two redo log files per group
  – We drop one file per group, performance issue resolved

Case 3) Data Load Slow
• First time loading new type of web log data
• No baseline to compare with
  – Classical performance tuning doesn’t always apply to the real world
• Data load so slow no time for daily reporting
• Must run faster or the data won’t be loaded
• We don’t know if this load will run faster
• Do we have a ‘performance’ issue?
  – Yes, data load must run faster to be useful
  – No, perhaps this is as fast as it can be...

Case 3) Data Load Slow
• SQL -- Highest Gets per Exec

    SQL ordered by Gets for DB: BHDATA04  Instance: BHDATA04  Snaps:
    -> End Buffer Gets Threshold:
    -> Note that resources reported for PL/SQL includes the resources used by
       all SQL statements called within the PL/SQL code.  As individual SQL
       statements are also reported, it is possible and valid for the summed
       total % to exceed 100

    Buffer Gets   Executions   Gets per Exec   % Total   Hash Value

    SELECT t526.keyvalueid
    FROM bh_lqueryvalue t526
    WHERE t526.queryinfoid = :ph0
    ORDER BY t526.keyvalueid ASC

Bad SQL?
• SQL shouldn’t cost much
  – Select looking for one row
  – Table has two indexes
  – Explain plan shows ‘index full scan’
  – Should show ‘index range scan’
  – Explain plan with hint to force one index
  – Verify cost of each index
  – Optimizer is choosing wrong index!
• Drop the costly index!
  – Indexes added by vendor ‘to be safe’...

Explain Plan Force Index1

    SQL> truncate table plan_table;
    Table truncated.
    SQL> explain plan set Statement_Id = 'TEST' for
         SELECT /*+ INDEX(t526 X_LQRYVL_QUERYIDKYVL) */ t526.keyvalueid
         FROM bh_lqueryvalue t526
         WHERE t526.queryinfoid = 100
         ORDER BY t526.keyvalueid ASC;
    Explained.

    Plan Table
    | Operation         | Name      | Rows | Bytes | Cost | Pstart | Pstop |
    | SELECT STATEMENT  |           |    3 |    24 |    1 |        |       |
    |  INDEX RANGE SCAN | X_LQRYVL_ |    3 |    24 |    3 |        |       |

• Cost 4

Explain Plan Force Index2

    SQL> explain plan set Statement_Id = 'TEST' for
         SELECT /*+ INDEX(t526 X_LQYVAL_KYVLQRYID) */ t526.keyvalueid
         FROM bh_lqueryvalue t526
         WHERE t526.queryinfoid = 100
         ORDER BY t526.keyvalueid ASC;
    Explained.

    Plan Table
    | Operation         | Name      | Rows | Bytes | Cost | Pstart | Pstop |
    | SELECT STATEMENT  |           |    3 |    24 | 2334 |        |       |
    |  INDEX FULL SCAN  | X_LQYVAL_ |    3 |    24 | 9333 |        |       |
    SQL>

• Cost 11,667

Solution
• After dropping costly index
  – data load time was 18 hours, became 1 hour
  – 18:1 improvement (1800%)
• Why did optimizer choose wrong index?
  – No idea, Oracle requested running 18 hour data load to gather instance data
  – Business users said “NO!”
  – Indexes created by vendor, no need for both indexes
• Know when to quit tuning!
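The improvement quoted on this slide can be checked with a few lines of arithmetic. This is a minimal sketch only: the helper name `speedup` is an assumption made for illustration, and the only figures taken from the slide are the 18-hour and 1-hour load times.

```python
def speedup(before_hours: float, after_hours: float) -> dict:
    """Summarise a run-time improvement (hypothetical helper, not part of STATSPACK)."""
    if before_hours <= 0 or after_hours <= 0:
        raise ValueError("run times must be positive")
    # How many times faster the new run is.
    factor = before_hours / after_hours
    # How much of the original run time was eliminated, as a percentage.
    reduction_pct = (1 - after_hours / before_hours) * 100
    return {"factor": factor, "reduction_pct": reduction_pct}


# Load times from the slide: 18 hours before, 1 hour after dropping the index.
result = speedup(18, 1)
print(f"{result['factor']:.0f}x faster, run time cut by {result['reduction_pct']:.1f}%")
```

Reporting both numbers avoids ambiguity: the load runs 18 times as fast, while the elapsed time itself drops by roughly 94%.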

What About Wait Events?
• Popular DBAs
  – Wait Events are all that matters
• I want (desperately) to be popular too…
  – Return to Case 3)
  – Examine Top 5 Wait Events section
  – Try to understand what is causing the wait time

Case 3) Wait Events
• Top 5 Wait Events

    Top 5 Wait Events
    ~~~~~~~~~~~~~~~~~
    Event                          Waits   Wait Time (cs)   % Total Wt Time
    PX Deq: Execution Msg          335  1,
    latch free                     1,
    control file parallel write
    db file sequential read
    log file parallel write

• PX -- parallel query issues?
  – Contact Oracle Tech Support

Ask the Experts
• Oracle Tech Support
  – Many wait events in STATSPACK output should be ignored (a bug perhaps? Or an RFE?)
  – Requests the full STATSPACK report
  – Tells me that the latch free wait event must be addressed
  – set session_cached_cursors = 100
  – Performance improvement will be ‘significant’
  – Data load now takes 19 hours (about 10% worse)

What Happened?
• How could the experts miss the bad SQL?
• Wait events are important
  – If the total wait time is the largest problem
• In this case
  – Bad SQL dominated the overall run time
• Wait event analysis
  – Many events should be ignored
  – Need to determine how much of total run time is due to wait events only
• Go back and fix the SQL issue

Total CPU Time
• Total CPU time is reported in 10s of milliseconds
  – 10 milliseconds is 1 centisecond (cs) = 0.01 sec
  – > cs = seconds
  – Report interval was 4.60 minutes (276 seconds)
  – confused? How many cs left until Happy Hour?
• From Instance Activity Stats

    Instance Activity Stats for DB: BHDATA04  Instance: BHDATA04  Snaps:
    Statistic                      Total   per Second   per Trans
    CPU used by this session       27,  ,441.0
    CPU used when call started     27,  ,475.0
    …

Total Wait Time
• Look at all Wait Events

    Wait Events for DB: BHDATA04  Instance: BHDATA04  Snaps:
    -> cs - centisecond - 100th of a second
    -> ms - millisecond - 1000th of a second
    -> ordered by wait time desc, waits desc (idle events last)

    Event   Waits   Timeouts   Total Wait Time (cs)   Avg wait (ms)   Waits /txn
    PX Deq: Execution Msg  ,
    latch free  1,740  1,  ######
    control file parallel write
    db file sequential read
    log file parallel write
    enqueue
    refresh controlfile command
    PX Deq: Msg Fragment
    PX Deq: Parse Reply
    control file sequential read  2,  ######
    log file sync
    PX Deq: Signal ACK
    PX Deq: Join ACK
    PX Deq: Execute Reply
    SQL*Net more data to client
    file open
    db file parallel write
    PX Idle Wait  4,316  4,  ,  ######
    SQL*Net message from client  2,  ,  ######
    SQL*Net message to client  2,  ######

    > Total Wait Time cs

Real Total Wait Time
• Remove idle events
  – MetaLink Note: PQ Wait Events
  – STATSPACK report should filter out idle events
  – Database Performance Guide and Ref
    • Explains more about this ‘feature’

Real Total Wait Time
• Idle Events

    Wait Events for DB: BHDATA04  Instance: BHDATA04  Snaps:
    -> cs - centisecond - 100th of a second
    -> ms - millisecond - 1000th of a second
    -> ordered by wait time desc, waits desc (idle events last)

    Event   Waits   Timeouts   Total Wait Time (cs)   Avg wait (ms)   Waits /txn
    PX Deq: Execution Msg  ,  <----- remove
    latch free  1,740  1,  ######
    control file parallel write
    db file sequential read
    log file parallel write
    enqueue
    refresh controlfile command
    PX Deq: Msg Fragment
    PX Deq: Parse Reply
    control file sequential read  2,  ######
    log file sync
    PX Deq: Signal ACK  <----- remove
    PX Deq: Join ACK
    PX Deq: Execute Reply
    SQL*Net more data to client
    file open
    db file parallel write
    PX Idle Wait  4,316  4,  ,  ######  <----- remove
    SQL*Net message from client  2,  ,  ######  <----- remove
    SQL*Net message to client  2,  ######

    > Total Wait Time 1397 cs
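The filtering step on this slide can be sketched as follows. The event names in `IDLE_EVENTS` are the ones the transcript shows flagged "remove"; the wait times in `sample` are made up purely for illustration (the real values did not survive in the transcript), and the function name is an assumption.

```python
# Events flagged "<----- remove" on the slide (as far as the transcript shows).
IDLE_EVENTS = {
    "PX Deq: Execution Msg",
    "PX Deq: Signal ACK",
    "PX Idle Wait",
    "SQL*Net message from client",
}


def non_idle_wait_cs(waits_cs: dict, idle_events=IDLE_EVENTS) -> int:
    """Sum wait time (centiseconds) over events that are not idle events."""
    return sum(cs for event, cs in waits_cs.items() if event not in idle_events)


# Illustrative wait times in centiseconds. These are NOT the values from the report.
sample = {
    "PX Deq: Execution Msg": 5000,
    "latch free": 900,
    "db file sequential read": 300,
    "PX Idle Wait": 40000,
    "SQL*Net message from client": 20000,
}

print(non_idle_wait_cs(sample))  # only "latch free" and "db file sequential read" count
```

With the idle events left in, the sample total would be dominated by time spent waiting for work, which is the distortion the slide warns about.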

Total Response Time
• Total CPU Time + Total Wait Time
  – 27475 cs + 1397 cs = 28872 cs
• Total Wait Time
  – 1397/28872 = 0.05
  – 5% of Total Response Time
• Wait Time was never an issue!
• If you don’t remove the idle events
  – /( ) = 97%
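The response-time split on this slide can be reproduced with a short calculation. This is a sketch under one stated assumption: the CPU figure is inferred by subtracting the 1397 cs of non-idle wait time from the 28872 cs total that appears in the slide's division; the function name is hypothetical.

```python
def wait_share(cpu_cs: float, wait_cs: float) -> float:
    """Fraction of total response time (CPU + wait) spent waiting."""
    total = cpu_cs + wait_cs
    if total <= 0:
        raise ValueError("total response time must be positive")
    return wait_cs / total


WAIT_CS = 1397            # non-idle wait time from the slide, in centiseconds
CPU_CS = 28872 - WAIT_CS  # assumption: CPU time inferred from the slide's total

share = wait_share(CPU_CS, WAIT_CS)
total_seconds = (CPU_CS + WAIT_CS) / 100  # 1 cs = 0.01 s
print(f"response time: {total_seconds:.2f} s, wait share: {share:.1%}")
```

The wait share comes out at about 4.8%, which the slide rounds to 5%.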

Bad SQL Rules
• Slow data load time
  – Time due to Bad SQL
  – Time due to All Others
    • Including Wait Events

Case 4) Data Load Time Varies
• Data Load Time
  – Normal 6.5 hours, Long 16 hours
  – Varies randomly, no pattern
• Generate STATSPACK report
  – Normal, Long
  – Compare reports
  – Look for differences between reports
• Tablespace IO Stats Section
  – Normal tablespaces accessed
  – Long tablespaces accessed
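The comparison described on this slide amounts to a set difference between the tablespaces listed in the two reports. A minimal sketch, assuming the tablespace names have already been copied out of each report's Tablespace IO Stats section; the names below are hypothetical and the helper is not part of STATSPACK.

```python
def extra_tablespaces(normal_run, long_run):
    """Return tablespaces touched during the long run but not the normal run."""
    return sorted(set(long_run) - set(normal_run))


# Hypothetical tablespace names, for illustration only.
normal_run = ["BH_DATA", "BH_INDEX"]
long_run = ["BH_DATA", "BH_INDEX", "USERS", "TOOLS"]

print(extra_tablespaces(normal_run, long_run))
```

Any tablespace that shows up only in the long run points at activity outside the data load, which is how the unexpected access stood out in this case.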

Problem and Solution
• Vendor data load shouldn’t touch all tables
• What process would access all tables?
• Production db supported by another group
  – We aren’t allowed to connect as ‘oracle’
  – Can’t see what they might be running (cron?)
• Turns out
  – Production DBAs decided we needed full exports
  – We weren’t notified
  – Stop the exports, performance issue goes away!

What About Cache Hit Rates?
• Back to the subject of experts
  – Remember when it was cool to discuss hit rates?
• For Case 4)
  – Compute buffer cache hit ratio from tables
  – Tables larger than physical memory
  – Can’t have all pages in memory at once
  – Buffer cache hit ratio won’t be 100%
• Even if we had 100%
  – Bad SQL (index) was the real problem
  – Buffer cache hit ratio wasn’t relevant

Select Buffer Cache Hit Ratio
• Data Load without Exports running

Total Wait Time?
• For Case 4), 15 minute report interval
  – Wait Time is 28% of total time

    Top 5 Wait Events
    ~~~~~~~~~~~~~~~~~
    Event                          Waits   Wait Time (cs)   % Total Wt Time
    PX Deq: Table Q Normal         495,566  1,017,  <-- remove
    slave wait                     25,  ,
    PX Deq Credit: send blkd       406,  ,  <-- remove
    PX Deq: Execution Msg          33,  ,  <-- remove
    latch free                     42,  ,

    cs total time = =
    Wait time is / = 28%

    Instance Activity Stats for DB: BHDATA01  Instance: BHDATA01  Snaps:
    Statistic                      Total   per Second   per Trans
    CPU used by this session       1,422,921  1,
    CPU used when call started     2,237,915  2,  <-----

Review
• For the 4 issues we had, STATSPACK output
  – Was useful for all 4
    • Provided standard set of data for all involved
  – Fixed 2 issues
    • Provided the data that led to the root cause
    • Verified that the fix was working
    • Performance improvements were substantial
  – Tuning process much faster with STATSPACK
  – Same process worked for all 4 issues
  – Decided not to look for further improvements
    • Wait Time analysis might be useful...

oraperf.com Analyzer
• Website oraperf.com
  – Submit STATSPACK report
  – Analyzer reviews report
  – Generates detailed analysis
    • CPU time
    • Wait time
  – Gives specific advice
  – Not perfect, but it is fast and free!
    • Has same issues with idle wait events as STATSPACK report

oraperf.com
• Who or what is oraperf.com?
  – From the website...

    Oraperf.com is run by Anjo Kolk. Anjo has worked for over 16 years at Oracle ( ). While at Oracle he worked in different countries and different departments.

    Many people generate utlbstat/utlestats and statspack reports, but don't know how to interpret the data. People that do look at these reports are also mostly looking at the wrong information and end up making the wrong tuning decisions. That is why the reports are analyzed based on the YAPP method. The YAPP method will show what component of the total response time should be tuned first.

    YAPP-Method -- Yet Another Performance Profiling Method

oraperf.com -- Case 3)
• Upload report from slow data load
• Analyzer shows
  – Response time
    • 91.63% CPU Time
    • 8.37% Wait Time
  – Advice?
    • Reduce the number of buffer gets or executions
• Wait time
  – Matters only as a % of total response time

oraperf.com -- Case 3)
• Upload report from fast data load
• Analyzer shows
  – Response time
    • 5.16% CPU Time
    • 94.84% Wait Time
  – Advice?
    • Tune PX Deq: Execution Msg event
      – But this is an idle event...
• Non-idle wait time is only about 25% of total time

oraperf.com -- Case 3)
• Conclusion
  – oraperf.com analyzer provides another tool for performance tuning
  – Well worth using if only to compute
    • Response Time
    • CPU Time
    • Wait Time
      – Check for idle wait events...

No Excuses
• Install STATSPACK
• Generate two snapshots
• Generate standard report
• Upload to oraperf.com
• Review advice
• Fast, free performance analysis!

Installing STATSPACK
• Create separate tablespace
• Create PERFSTAT user
• Execute SQL script to create tables
• Set up job to execute snapshots
• Set up process to purge data over time
• Set timed_statistics = TRUE
  – Not required, but needed to get wait time data

Installing STATSPACK
• As user ‘SYS’

    create tablespace perfstat datafile '/xxx/xxx/perfstat_01.dbf' size 500M;
    cd $ORACLE_HOME/rdbms/admin
    sqlplus
    Enter value for default_tablespace: perfstat
    Enter value for temporary_tablespace: temp

Generate Standard Report
• Report SQL supplied by Oracle

    sqlplus
    execute statspack.snap
    Enter value for begin_snap: 1
    Enter value for end_snap: 2
    Enter value for report_name: testing

Select STATSPACK Data
• Query the tables directly

    select to_char(snap_time,'yyyy-mm-dd HH24') mydate,
           new.name buffer_pool_name,
           (((new.consistent_gets-old.consistent_gets)+
             (new.db_block_gets-old.db_block_gets))-
            (new.physical_reads-old.physical_reads))
           /
           ((new.consistent_gets-old.consistent_gets)+
            (new.db_block_gets-old.db_block_gets)) bhr
    from perfstat.stats$buffer_pool_statistics old,
         perfstat.stats$buffer_pool_statistics new,
         perfstat.stats$snapshot sn
    where new.snap_id >
      and new.snap_id <
      and new.name = old.name
      and new.snap_id = sn.snap_id
      and old.snap_id = sn.snap_id-1;

Based on SQL from ORACLE High-Performance Tuning with STATSPACK, Donald K. Burleson, Oracle Press ISBN
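The ratio this query computes is (logical reads - physical reads) / logical reads, taken over the deltas between two consecutive snapshots. The sketch below restates that arithmetic outside the database; the dictionary keys mirror the column names in the query, while the counter values and the function name are hypothetical.

```python
from typing import Optional


def buffer_hit_ratio(old: dict, new: dict) -> Optional[float]:
    """Buffer cache hit ratio between two snapshots of cumulative counters.

    Returns None when there were no logical reads in the interval.
    """
    logical = ((new["consistent_gets"] - old["consistent_gets"])
               + (new["db_block_gets"] - old["db_block_gets"]))
    physical = new["physical_reads"] - old["physical_reads"]
    if logical == 0:
        return None  # avoid dividing by zero for an idle interval
    return (logical - physical) / logical


# Hypothetical cumulative counters from two consecutive snapshots.
old_snap = {"consistent_gets": 1000, "db_block_gets": 200, "physical_reads": 100}
new_snap = {"consistent_gets": 9000, "db_block_gets": 1200, "physical_reads": 1000}

print(buffer_hit_ratio(old_snap, new_snap))
```

Returning None for an interval with no logical reads is a design choice for this sketch; the SQL on the slide would raise a divide-by-zero error in that situation.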

Buffer Cache Hit Ratio Case 4)
• Output of SQL on previous slide

    yr. mo dy Hr   BUFFER_POOL_NAME   BHR
    DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT DEFAULT 1.00

Space Used
• Snapshot size varies with
  – Number of tablespaces
  – Number of SQL statements captured
• Db1 21 tablespaces --> 0.15 MB/snapshot
• Db2 376 tablespaces --> 0.37 MB/snapshot
• Assuming a snapshot every 15 minutes
  – 96 snapshots per day
  – Db1 --> 14.4 MB/day
  – Db2 --> 35.6 MB/day
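The daily growth figures follow from the snapshot size and the snapshot interval. A minimal sketch of that estimate, using the per-snapshot sizes quoted on the slide; the function name is an assumption.

```python
def daily_growth_mb(mb_per_snapshot: float, interval_minutes: int = 15) -> float:
    """Estimate space consumed per day by snapshots taken at a fixed interval."""
    if interval_minutes <= 0:
        raise ValueError("interval must be positive")
    snapshots_per_day = 24 * 60 / interval_minutes  # 96 for a 15 minute interval
    return mb_per_snapshot * snapshots_per_day


# Per-snapshot sizes from the slide.
for name, size_mb in (("Db1", 0.15), ("Db2", 0.37)):
    print(f"{name}: {daily_growth_mb(size_mb):.1f} per day")
```

For Db2 the rounded 0.37 figure gives about 35.5 per day; the slide's 35.6 presumably reflects a per-snapshot size with more decimal places than shown.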

Removing Snapshot Data
• Oracle supplied SQL
  – SQL removes snapshot data for a range of snapshot id numbers
• Example

    (listing of all existing snapshots)...
    Specify the Lo Snap Id and Hi Snap Id range to purge
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Enter value for losnapid: 4001
    Using 4001 for lower bound.
    Enter value for hisnapid: 5000
    Using 5000 for upper bound.
    Deleting snapshots
    commit;

Note: large deletes may fill rollback segments

Summary
• STATSPACK
  – Free, easy to install, easy to run
  – Output can be very useful or confusing
  – Real-world use has resulted in big performance gains
  – Useful for all instances
  – Standard way to gather performance data

Reminder – please complete the OracleWorld session survey Thank you.