Upgrading Oracle Database 9i to 10g for Siebel using SQL Performance Analyzer (SPA)
Sameer Marwa, Infogig Consulting
Khaled Yagoub, Oracle Development
Ravi Sharma, Oracle Consulting

Outline
- The Beginning and Challenges
- Proposed Solution
- SQL Performance Analyzer (SPA) & Traditional Approach
- Modified Test Approach
- Strategy & Results: Production Run
- Lessons Learned & Future Plans
- Q/A

The Beginning: Business Requirements
- Upgrade the SIEBEL database, containing critical customer data, orders and service requests, from 9i to 10g
And by the way:
- Provide the ability to fail back to 9i with no data loss
- Complete the upgrade within the maintenance window (6 hrs)
- No screw-up like the last time (the last CBO refresh resulted in DB outages!)

Devil is in the Details: Challenges
- Very large environment: 10 TB database on a cluster of two HP Superdome servers (96 CPUs each), 25+ application servers
- High transaction volume: 15k concurrent users
- Large number of SQL statements to analyze: 100k unique SQL (initial estimates indicated that 50% of these would have different execution plans with the new 10g CBO)
- Modifications/enhancements to the SIEBEL application often change 20-30% of the SQL
- A minor application release was planned 4 weeks before the 10g upgrade

Devil is in the Details: Challenges (continued)
- Expensive downtime: $$$ per hour
- Lengthy database/application restart: each restart requires 2 hours ($$$ per restart)
- Hosted environment: multiple teams added to the complexity

Get Jack out of the Box: Proposed Solution
- Build a new 10g environment while maintaining the 9i system, allowing a failback to 9i in case of a problem
  - The 10g DB is kept in sync with the 9i environment using GoldenGate
- Use SQL Performance Analyzer (SPA) to mitigate SQL-related concerns (end-to-end analysis should take no more than 2 business days)

What is SPA?
- Utility that predicts the impact of system changes on SQL workload response time
- SQL statements are captured and stored in a SQL Tuning Set (STS), which holds the SQL workload: SQL text + binds
- SQL performance data before and after the change is generated using one of:
  - Method 1: EXECUTE SQLs
  - Method 2: GENERATE PLANs
  - Method 3: CONVERT SQLSET
- Regressed SQL statements are identified by comparing the execution statistics (buffer gets, CPU time, etc.)
[Diagram: SQL Tuning Set -> pre-change SQL trial and post-change SQL trial (SQL plans + stats each) -> compare -> SQL Performance Analysis report]

What is SPA? (continued)
- Replays individual SQL statements, but not a concurrent SQL workload
- Common system change scenarios:
  - Database upgrades and patches
  - Database parameter changes
  - Schema changes
  - Statistics gathering
  - Implementation of tuning recommendations
  - OS/hardware changes
- Integrated with SQL tuning sets, SQL plan baselines, SQL Tuning Advisor, and Enterprise Manager to provide an end-to-end solution

SPA: Traditional Approach for 9i to 10g Upgrade*
- Use SQL_TRACE to capture SQL statements on the 9i production system
  - Trace database sessions to capture application-related SQL with related bind variables and execution statistics
- Build a SQL Tuning Set (STS) from the SQL_TRACE files
- Build the SQL trial baseline: 9i SQL plans and statistics
  - Convert the SQL tuning set into a SPA SQL trial
- Build the post-upgrade SQL trial: 10g SQL plans and statistics
  - Use the EXECUTE method to remotely execute SQL against the 10g database
- Compare the baseline to the post-upgrade SQL trial
  - Identify SQL statements that regressed (modifiable thresholds) in 10g
- Tune the regressed SQL statements
  - Use SQL profiles or stored outlines, rewrite the SQL, or change the CBO statistics
- Verify fixes by re-running SPA and comparing the baseline to the post-tuning SQL trial
*OTN: Testing Performance Impact of an Oracle Database 9i/10g Release 1 to Oracle Database 10g Release 2 Upgrade with SPA

Challenges With the Traditional Approach
- Potential impact of SQL_TRACE on transaction response time
- Very difficult to determine the relevant sessions to trace
- Potentially high number of sessions to trace in order to capture a representative set of application SQL statements
- Potential impact of tracing (CPU and disk space for trace files) is difficult to quantify
- A very large number of SQL statements to test with is expected

SPA Approach with Few Modifications
SPA Traditional Approach:
- Use SQL_TRACE to capture SQL statements on the 9i production system
- Build a SQL Tuning Set (STS) from the SQL_TRACE files
- Build the SQL trial baseline: 9i SQL plans and statistics
- Build the post-upgrade SQL trial: 10g SQL plans and statistics using the EXECUTE method
- Compare the baseline to the post-upgrade SQL trial to identify regressed SQL
- Tune the regressed SQL statements
- Verify fixes
SPA Modified Approach:
- Manually query the v$sql_xxx views to capture SQL statements and plans on the 9i production system
- Manually build the STS from the v$sql_xxx results
- Convert the STS to build the SQL trial baseline: 9i
- Build the post-upgrade SQL trial using the GENERATE PLAN method: plans only
- Compare 9i and 10g plans to identify changed plans
- Manually analyze changed plans to identify critical plan differences
- Use SQL_TRACE on 9i production to capture BINDS for only the critical SQL subset
- Use the SPA EXECUTE method on the critical subset of SQL to identify regressions
- Compare … … etc.

SPA Approach with Few Modifications: Setup
[Diagram: 9i production DB cluster, 10g test / production DB cluster, and an 11g SPA server (4 CPU win2k)]
- Manual process (9i production): get execution plans for all SQL statements; get bind values for critical SQL statements
- SPA process (11g SPA server against 10g): remotely get execution plans for all SQL statements using a DB link; compare with 9i plans; execute SQL with bind values; find regressed SQL statements

Production Run: Details
Step 1: Captured SQL statements with related plans and execution statistics
- Query V$SQL, V$SQLTEXT_WITH_NEWLINES, and V$SQL_PLAN
- Select the following statement types: SELECT, UPDATE and DELETE
  - Ignore INSERT, since they do not have sub-queries and hence will have the same performance on 10g as on 9i
- Create two tables to save the results:
  - STATEMENTS: stores SQL text (CLOB), parsing schema name, number of executions, elapsed time, buffer gets, etc.
  - PLANS: stores SQL plan lines
- Export the resulting two tables to the 11g DB used to run SPA
[Diagram: 9i production database (V$SQL, V$SQLTEXT_WITH_NEWLINES, V$SQL_PLAN) -> query -> tables PLANS and STATEMENTS -> export -> 11g SPA server]

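The capture in Step 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the script used in the project: the staging table name STATEMENTS_STAGE, the ROWNUM-based SEQ key and the exact column selection are hypothetical. The dynamic views are referenced by their data-dictionary names (V$SQL, V$SQL_PLAN, V$SQLTEXT_WITH_NEWLINES), and command types 3, 6 and 7 stand for SELECT, UPDATE and DELETE.

```sql
-- Sketch only: run on the 9i production database.
-- STATEMENTS_STAGE and the ROWNUM-based SEQ key are assumptions of this example.

-- 1. Statement-level statistics for SELECT (3), UPDATE (6) and DELETE (7).
CREATE TABLE statements_stage AS
SELECT ROWNUM        AS seq,
       s.address,
       s.hash_value,
       s.child_number,
       u.username    AS parsing_schema_name,
       s.executions,
       s.elapsed_time,
       s.cpu_time,
       s.buffer_gets
FROM   v$sql s, dba_users u
WHERE  u.user_id = s.parsing_schema_id
AND    s.command_type IN (3, 6, 7);

-- 2. Plan lines for the captured cursors, keyed by the same SEQ value.
CREATE TABLE plans AS
SELECT st.seq AS sql_seq,
       p.id, p.parent_id, p.position,
       p.operation, p.options,
       p.object_node, p.object_owner, p.object_name,
       p.optimizer, p.search_columns,
       p.cost, p.cardinality, p.bytes,
       p.other_tag, p.partition_start, p.partition_stop, p.partition_id,
       p.distribution, p.cpu_cost, p.io_cost, p.temp_space,
       p.access_predicates, p.filter_predicates
FROM   v$sql_plan p, statements_stage st
WHERE  p.address      = st.address
AND    p.hash_value   = st.hash_value
AND    p.child_number = st.child_number;

-- 3. Full SQL text arrives in pieces; order by PIECE when assembling the CLOB.
SELECT st.seq, t.piece, t.sql_text
FROM   v$sqltext_with_newlines t, statements_stage st
WHERE  t.address    = st.address
AND    t.hash_value = st.hash_value
ORDER  BY st.seq, t.piece;
```

Concatenating the pieces from the third query into the CLOB column of STATEMENTS is left out here; it would typically be a small PL/SQL loop per SEQ value.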
Production Run: Details
Step 2: Built a SQL Tuning Set (STS) from the 9i SQL and related plans in the 11g SPA database
- Import tables PLANS and STATEMENTS into the SPA server
- Create a new, empty SQL tuning set
- Run a script to manually populate the SQL tuning set from tables PLANS and STATEMENTS
[Diagram: import -> 11g SPA server (tables PLANS and STATEMENTS) -> query -> SQL Tuning Set in 11g: 115K unique SQL]

Production Run: Details (Code Snippet)
  declare
    stscur dbms_sqltune.sqlset_cursor;
  begin
    -- create the empty SQL tuning set that will hold the 9i workload
    dbms_sqltune.create_sqlset('ALL_SQL_STS');

    -- one sqlset_row per captured statement, with its 9i plan lines attached
    open stscur for
      select sqlset_row(
               sql_text            => STATEMENTS.sql_text,
               parsing_schema_name => STATEMENTS.parsing_schema_name,
               executions          => STATEMENTS.executions,
               elapsed_time        => STATEMENTS.elapsed_time,
               priority            => STATEMENTS.seq,
               cpu_time            => STATEMENTS.cpu_time,
               plan_hash_value     => 4294967295,
               optimizer_cost      => (select cost from PLANS
                                       where sql_seq = STATEMENTS.seq and id = 0),
               sql_plan            => (select cast(collect(
                   sql_plan_row_type(NULL, NULL, NULL, NULL,
                     operation, options, object_node, object_owner, object_name,
                     NULL, NULL, NULL, optimizer, search_columns, id, parent_id,
                     NULL, position, cost, cardinality, bytes, other_tag,
                     partition_start, partition_stop, partition_id, distribution,
                     cpu_cost, io_cost, temp_space,
                     access_predicates, filter_predicates,
                     null, null, null, null)) as sql_plan_table_type)
                                       from PLANS
                                       where sql_seq = STATEMENTS.seq))
      from STATEMENTS;

    -- load into the set created above (the name must match create_sqlset)
    dbms_sqltune.load_sqlset('ALL_SQL_STS', stscur);
    close stscur;
  end;
  /

Production Run: Details
Step 3: Built SQL Trial Baseline: 9i SQL text, plans and statistics
- Use Enterprise Manager (EM) or the DBMS_SQLPA package to convert the SQL tuning set into a SPA SQL trial
Step 4: Built Post-upgrade SQL Trial: 10g SQL with execution plans only
- Specify a DB link to connect to the remote 10g system
- Use the SPA GENERATE PLAN method to collect execution plans for all SQL statements in the STS
[Diagram: SQL Tuning Set in 11g (input) -> SPA in 11g -> 3. convert -> baseline trial; 4. build, remotely generating plans over the DB link to the 10g database system -> post-upgrade trial]

Production Run: Details (Code Snippet)
  declare
    tname varchar2(30);
  begin
    -- create the analysis task on top of the 9i SQL tuning set
    tname := dbms_sqlpa.create_analysis_task(
               task_name   => '10gUpgrade',
               sqlset_name => 'ALL_SQL_STS',
               description => 'identify changed plans for 10g Upgrade');

    -- trial 1: baseline built from the plans already stored in the STS
    dbms_sqlpa.execute_analysis_task(
      task_name      => '10gUpgrade',
      execution_name => '9i_plans',
      execution_type => 'CONVERT SQLSET');

    -- trial 2: generate plans remotely on the 10g system over the DB link
    dbms_sqlpa.execute_analysis_task(
      task_name        => '10gUpgrade',
      execution_name   => '10g_plans',
      execution_type   => 'EXPLAIN PLAN',
      execution_params => dbms_advisor.arglist('DATABASE_LINK', '10G_TEST_DB'));

    -- compare the two trials on optimizer cost
    dbms_sqlpa.execute_analysis_task(
      task_name        => '10gUpgrade',
      execution_name   => 'compare plans',
      execution_type   => 'COMPARE',
      execution_params => dbms_advisor.arglist('COMPARISON_METRIC', 'OPTIMIZER_COST'));
  end;
  /

Production Run: Details
Step 5: Compared the baseline and post-upgrade SQL trials and analyzed the SPA report
- SPA compares 9i and 10g execution plans
- SPA identifies SQL with plan changes
Step 6: Manually analyzed changed plans to identify critical plan differences
- Used custom SQL queries on SPA views to identify SQL statements with a potential performance violation:
  - SQL execution plans that have different JOIN driving tables in 10g
  - SQL plans involving certain undesirable operations such as INDEX FAST FULL SCAN, FULL TABLE SCAN, etc.
- SPA DBA views: DBA/USER_ADVISOR_SQLPLANS, DBA/USER_ADVISOR_OBJECTS, DBA/USER_ADVISOR_FINDINGS
[Diagram: 9i baseline SQL trial + 10g SQL plans -> 11g SPA server -> comparison and analysis report]

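One of the custom checks from Step 6 might look like the query below. It is a sketch, not the authors' query. It assumes the plan lines of both trials are exposed through DBA_ADVISOR_SQLPLANS with TASK_NAME, EXECUTION_NAME, SQL_ID, OPERATION, OPTIONS and OBJECT_NAME columns, and that the trials are named '9i_plans' and '10g_plans' as in the code snippet for the plan comparison. It lists full scans that appear in the 10g plan with no matching operation on the same object in the 9i plan.

```sql
-- Sketch only: run on the 11g SPA server after both trials exist.
-- View and column names are assumptions; check them against your release.
SELECT DISTINCT
       p10.sql_id,
       p10.object_owner,
       p10.object_name,
       p10.operation,
       p10.options
FROM   dba_advisor_sqlplans p10
WHERE  p10.task_name      = '10gUpgrade'
AND    p10.execution_name = '10g_plans'
AND   (   (p10.operation = 'TABLE ACCESS' AND p10.options = 'FULL')
       OR (p10.operation = 'INDEX'        AND p10.options = 'FAST FULL SCAN'))
AND    NOT EXISTS (
         -- same operation on the same object already present in the 9i plan
         SELECT 1
         FROM   dba_advisor_sqlplans p9
         WHERE  p9.task_name      = p10.task_name
         AND    p9.execution_name = '9i_plans'
         AND    p9.sql_id         = p10.sql_id
         AND    p9.operation      = p10.operation
         AND    p9.options        = p10.options
         AND    p9.object_name    = p10.object_name)
ORDER  BY p10.object_name, p10.sql_id;
```

Restricting the result to large objects would need an extra join, for example against segment sizes. A driving-table comparison needs its own rule for picking the driving step of each plan, so it is not shown here.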
Reality Check: Findings From Plan Comparison (Bad News)
- Very large set of SQL captured
  - 115k unique SQL captured from the 9i production database
- Large number of changed plans
  - Close to 40% (38k) of the SQL had different plans on 10g
- Still unmanageable!
  - Close to 35% (9k) of the changed plans had driving step differences
  - It was very difficult to capture BIND VALUES for the critical set
[Chart: Number of SQL Statements]

Reality Check: Findings From Plan Comparison (Good News)
[Chart: SQL statements by execution frequency]
  - < 10 executions/day: 108.9k SQL
  - 10 to 70 executions/day: 2.9k SQL
  - > 70 executions/day: 3.2k SQL (2.8%), accounting for 99% of all SQL executions and 69% of all buffer gets
- The number of SQL statements with high executions (> 70 times/day) is 3.2K
  - They represent 99% of ALL SQL executions
  - They represent 69% of ALL buffer gets
- Of the 9K SQL with driving step differences, only 347 were executed more than 70 times/day!
- Of the SQL with executions < 70/day, only 37 had FULL SCANS on large objects (with no such operation in 9i)

Strategy for Production Run
- Limit the amount of work while still covering the majority of the SQL workload
  - Focus on SQL with high executions (> 70/day), SELECT, UPDATE and DELETE only
  - To address concerns about SQL with low executions but undesirable operations, include the 37 SQL with FULL SCANS on large objects (with no such operations in 9i)
- Capture BIND VALUES for the most often executed SQL using SQL_TRACE
[Chart: SQL statements by execution frequency: < 10/day = 108.9k; 10 to 70/day = 2.9k; > 70/day = 3.2k (2.8%), i.e. 99% of all SQL executions and 69% of all buffer gets]

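The frequency breakdown behind this strategy could be reproduced with a query along these lines. It is a sketch under assumptions: it reads the STATEMENTS table described in Step 1, assumes that table carries EXECUTIONS and BUFFER_GETS columns, and uses a hypothetical bind variable :capture_days for the number of days the captured counters cover.

```sql
-- Sketch only: run on the 11g SPA server against the imported STATEMENTS table.
-- :capture_days is a hypothetical bind variable (days covered by the counters).
SELECT bucket,
       COUNT(*)                                                  AS sql_count,
       ROUND(100 * RATIO_TO_REPORT(COUNT(*))         OVER (), 1) AS pct_of_sql,
       ROUND(100 * RATIO_TO_REPORT(SUM(executions))  OVER (), 1) AS pct_of_executions,
       ROUND(100 * RATIO_TO_REPORT(SUM(buffer_gets)) OVER (), 1) AS pct_of_buffer_gets
FROM  (SELECT executions,
              buffer_gets,
              CASE
                WHEN executions / :capture_days > 70  THEN '3: > 70/day'
                WHEN executions / :capture_days >= 10 THEN '2: 10 to 70/day'
                ELSE                                       '1: < 10/day'
              END AS bucket
       FROM   statements)
GROUP  BY bucket
ORDER  BY bucket;
```

Where to set the cut-off (70/day in this run) is a judgment call: the point is to find the smallest bucket that still covers most executions and buffer gets.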
Production Run: Details
Step 7: Used SQL_TRACE on 9i production to capture BINDS for only the highly executed SQL
- Attempt to capture BINDS using SQL_TRACE for the highly executed SQL statements
- Export the resulting trace files to the SPA 11g server
- Use SPA to convert the SQL trace files into a SQL tuning set
  - All SQL statements in this STS have bind values
- Some of the highly executed SQL statements might not be in the STS because they were not captured by SQL_TRACE
  - Create a separate STS for SQL without bind values

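A minimal sketch of Step 7 follows, under several assumptions: tracing is switched on per session with event 10046 at level 4 (which records bind values), the 11g release on the SPA server provides DBMS_SQLTUNE.SELECT_SQL_TRACE, and the directory object SPA_DIR, its path, and the mapping table MAPPING_TABLE (object and user IDs exported from the 9i system) are hypothetical names. The STS name matches the one used in the EXECUTE-method snippet.

```sql
-- Sketch only. On the 9i production system, in the session to be traced:
ALTER SESSION SET EVENTS '10046 trace name context forever, level 4';
-- ... application SQL runs here ...
ALTER SESSION SET EVENTS '10046 trace name context off';

-- On the 11g SPA server, after copying the trace files over.
-- The directory path is a placeholder for wherever the files were copied.
CREATE OR REPLACE DIRECTORY spa_dir AS '/path/to/9i/trace/files';

DECLARE
  trccur dbms_sqltune.sqlset_cursor;
BEGIN
  dbms_sqltune.create_sqlset('HIGH_EXEC_SQL_WITH_BINDS');

  -- read every *.trc file in the directory; MAPPING_TABLE translates the
  -- object and user IDs found in the 9i trace files into names
  OPEN trccur FOR
    SELECT VALUE(p)
    FROM   TABLE(dbms_sqltune.select_sql_trace(
                   directory          => 'SPA_DIR',
                   file_name          => '%trc',
                   mapping_table_name => 'MAPPING_TABLE')) p;

  dbms_sqltune.load_sqlset(
    sqlset_name     => 'HIGH_EXEC_SQL_WITH_BINDS',
    populate_cursor => trccur);
  CLOSE trccur;
END;
/
```

Tracing only the sessions that run the high-execution statements keeps the CPU and disk overhead on production bounded, which was the main objection to tracing everything.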
Production Run: Details
Step 8: Built the 10g SQL trials
- Used the SPA GENERATE PLAN method to remotely collect 10g execution plans for SQL without BIND VALUES
- Used the SPA EXECUTE method to remotely collect 10g execution plans and statistics for SQL with BIND VALUES
Step 9: Analyzed the results
- SQL without BIND VALUES
  - Used the SPA plan comparison method to identify SQL statements with different execution plans in 10g (labeled the changed-plan set)
  - Used custom SQL queries on SPA views to identify SQL statements with a potential performance violation, e.g., SQL execution plans that have different driving tables in 10g
- SQL with BIND VALUES
  - Used SPA to identify SQL with regressed performance in 10g, using buffer_gets as the comparison metric

Production Run: Details (Code Snippet for EXECUTE Method)
  declare
    tname varchar2(30);
  begin
    -- create the analysis task on the STS that carries bind values
    tname := dbms_sqlpa.create_analysis_task(
               task_name   => '10gUpgradeExecute',
               sqlset_name => 'HIGH_EXEC_SQL_WITH_BINDS',
               description => 'identify sql regressions for 10g Upgrade');

    -- trial 1: baseline from the 9i statistics stored in the STS
    dbms_sqlpa.execute_analysis_task(
      task_name      => tname,
      execution_name => '9i_exec',
      execution_type => 'CONVERT SQLSET');

    -- trial 2: execute the SQL remotely on the 10g system over the DB link
    dbms_sqlpa.execute_analysis_task(
      task_name        => tname,
      execution_name   => '10g_exec',
      execution_type   => 'TEST EXECUTE',
      execution_params => dbms_advisor.arglist('DATABASE_LINK', '10G_TEST_DB'));

    -- compare the two trials on buffer gets
    dbms_sqlpa.execute_analysis_task(
      task_name        => tname,
      execution_name   => 'compare plans',
      execution_type   => 'COMPARE',
      execution_params => dbms_advisor.arglist('COMPARISON_METRIC', 'BUFFER_GETS'));
  end;
  /

Production Run: Result Details
[Diagram: SPA 11g server running both methods against the 10g test DB]
- GENERATE PLAN method: SQL without BIND VALUES (1.8K SQL) -> changed plans (1.6k SQL) -> driving step difference (159 SQL)
- EXECUTE method: SQL with BIND VALUES (1.4K SQL) -> regressed SQL (96 SQL)

Production Run: Final Results
Step 10: Manually identified and tuned the final subset of potentially problematic SQL statements
- 96 regressed SQL identified using the EXECUTE method
- 159 SQL with driving step differences identified using the GENERATE PLAN method
- 37 SQL with FULL SCANS on large objects (with no corresponding operation in 9i)
- Developed 36 stored outlines to tune the regressed SQL
- Used SPA to validate the tuning changes
- Deployed the tuning changes in the 10g database
[Diagram: 96 regressed SQL + 159 driving-step-changed SQL + 37 FULL SCAN SQL -> manual analysis -> 36 stored outlines]

Go-Live with 10g!
- Only 6 SQL statements had to be tuned post go-live on 10g!
  - These SQL statements showed the same execution plans as in 9i, but used a different plan during execution on 10g (due to bind peeking!)
- System stable for 4 weeks with no database issues encountered (that is, until the next Siebel release was introduced into production!)

Lessons Learned: SPA Advantage!
- Very convenient to test with a large number of SQL statements
  - Remote SQL plan generation for 100k SQL took approximately 6 hours
  - Plan comparison of 100k plans took under 30 minutes!
- Flexible architecture to perform what-if analysis
  - SQL workload and SQL trials are stored in the database, which makes it easy to query, build, and keep a history of all SQL performance tests
- Database views with details of SPA inputs and results make it easy to perform custom analysis and generate custom reports
  - DBA/USER_ADVISOR_OBJECTS
  - DBA/USER_ADVISOR_SQLPLANS
  - DBA/USER_ADVISOR_FINDINGS

Lessons Learned: But Customize the Testing Approach
- Find a process to capture SQL and execution statistics from the production database that fits your environment:
  - For 9i, use SQL_TRACE to capture bind values for the critical subset of SQL statements
  - For 10g, use SQL Tuning Sets with appropriate filters and capture execution frequency
- A large number of SQL statements will have changed execution plans in 10g
  - Prioritize the plan changes (driving table differences, unexpected operations in the SQL plan, etc.) that are relevant for your environment
- Execution method analysis (executing SQL with bind data) is a better indicator of SQL performance post change
  - The execution method can still produce a large list of SQL (with regressed performance) to analyze
  - Use the comparison metric judiciously and/or use custom filters

We Are on 10g, Now What?
- No more SQL_TRACE!
- Use SPA to analyze the impact of new indices to be deployed into the 10g production system
- Use SPA to analyze the impact of refreshed CBO statistics:
  - Gather CBO statistics in the 10g test system
  - Capture SQL, related plans and execution statistics from the 10g production database using SQL Tuning Sets (STS)
  - Import the production STS into the 11g SPA DB
  - Use SPA (EXECUTE method) against the new CBO statistics to identify regressed SQL

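On 10g the capture step no longer needs tracing. A minimal sketch of an STS capture from the cursor cache is shown below, assuming a 10g Release 2 production database. The STS name PROD_10G_STS, the schema name SIEBEL, the polling window and the filter values are hypothetical.

```sql
-- Sketch only: run on the 10g production database.
BEGIN
  dbms_sqltune.create_sqlset(
    sqlset_name => 'PROD_10G_STS',
    description => 'production workload for SPA statistics-refresh test');

  -- poll the cursor cache every 5 minutes for 1 hour; the filter keeps only
  -- the application schema and frequently executed statements
  -- (EXECUTIONS is cumulative for the cursor, not a per-day rate)
  dbms_sqltune.capture_cursor_cache_sqlset(
    sqlset_name     => 'PROD_10G_STS',
    time_limit      => 3600,
    repeat_interval => 300,
    basic_filter    => 'parsing_schema_name = ''SIEBEL'' AND executions > 70');
END;
/
```

Moving the captured set to the 11g SPA database would typically go through a staging table, using DBMS_SQLTUNE.CREATE_STGTAB_SQLSET, PACK_STGTAB_SQLSET and UNPACK_STGTAB_SQLSET. Unlike the 9i route, this set already carries bind values and execution statistics.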
Q & A