
SingingSQL Presents: Recent-Rows-First Case Study, February 2010. ©2010 Dan Tow, all rights reserved.



Presentation transcript:


2 SingingSQL Presents: Recent-Rows-First Case Study
February 2010. ©2010 Dan Tow, all rights reserved.
dantow@singingsql.com / www.singingsql.com

3 The Standard SQL-Tuning Problem Statement
Find SQL that needs tuning. Treat that SQL as a functional spec for the rows the application needs at that point in its flow of control. Find some database change (usually a new index) or some transformation of the SQL that absolutely guarantees (assuming no wrong-rows bug in Oracle, which is a safe assumption!) returning the same rows as the original SQL spec, only with a faster execution plan, and confirm that it really is faster after the change.

4 Finding the Fastest Plan, Normally
The fastest execution plan is usually the one that touches the fewest rows; that, in turn, is the one that discards the fewest rows, and that discards the rows that must be discarded as early as possible, before wasting work on unneeded joins. Such a plan usually starts with the best filter condition (the one reaching the smallest fraction of its table) and reaches every large table by the fastest path: usually a filter-column index for the first table and nested loops to a join-key index for each later large table. The goal is normally a query that runs in minutes at most, without requiring parallel threads.
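The heuristic above can be sketched as follows. This is a minimal illustration, not the case-study SQL: all table, column, and index names here are hypothetical.

```sql
-- Hedged sketch with hypothetical names: drive from the most selective
-- filter via its own index, then nested-loop into the large joined table
-- through its join-key index, discarding unneeded rows as early as possible.
CREATE INDEX orders_status_ix ON orders (status);

SELECT /*+ leading(o) use_nl(li)
           index(o orders_status_ix) index(li line_items_order_ix) */
       o.order_id, li.amount
FROM   orders o, line_items li
WHERE  o.status = 'PENDING'        -- best filter: smallest fraction of its table
AND    li.order_id = o.order_id;   -- join-key index path into the big table
```

The hints only pin down the plan the heuristic would choose anyway; with accurate statistics, the optimizer would normally find this order on its own.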

5 A Hideously Nasty Problem of Rare Proportions
A new user report consistently timed out (snapshot-too-old error, ORA-01555) after hours of runtime, in a process planned to run during business hours, so it should not use parallel threads. Several tables were on the order of half a billion rows, so even gathering stats or trying out alternatives was very time-consuming. Sampling strategies helped the analysis (e.g., test-joining only every hundredth row).
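The sampling idea can be sketched like this, reusing the anonymized column names from the next slide; the exact sampling predicate is an assumption, one simple way to keep roughly every hundredth driving row.

```sql
-- Hedged sketch: estimate per-join row counts cheaply on half-billion-row
-- tables by test-joining only a ~1% sample of the driving rows.
SELECT COUNT(*)
FROM   LB, IT
WHERE  LB.DATE_T BETWEEN '1-NOV-2009' AND '6-NOV-2009'
AND    LB.A_ID = '9'
AND    MOD(LB.ID, 100) = 0   -- keep ~every hundredth driving row for the trial join
AND    LB.ID = IT.E_ID;
```

Multiplying the sampled count by 100 gives a cheap estimate of how many rows the full join would process, which is how the per-step runtimes below were projected without running the full query.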

6 The SQL Core, Anonymized

SELECT ...
FROM LB, IT, HB, LA, HA, CA, PGI
WHERE LB.DATE_T BETWEEN '1-NOV-2009' AND '6-NOV-2009'
AND LB.A_ID = '9'
AND LB.ID = IT.E_ID
AND IT.B_ID = HB.ID
AND IT.ID1 = LA.I_ID
AND HB.C_NUM = LA.SC_NUM
AND LA.L_STATUS_ID <> 10096
AND LA.ATTRIBUTE8 = 'Mid-Term'
AND LA.ATTRIBUTE9 = 'TO'
AND LA.E_PRICE > 0
AND LA.SUM_DET = 'D'
AND HA.HEADER_ID = LA.HEADER_ID
AND HA.ATTRIBUTE3 = 'Y'
AND HA.STATUS_ID = 10047
AND CA.SITE_ID = HA.INVOICE_SITE_ID
AND CA.SITE_CODE = 'BILL_TO'
AND PGI.COUNTRY_CODE = CA.COUNTRY
AND UPPER(PGI.THEATER) = UPPER('EURO')

7 The SQL Diagram
[Join diagram of LB, IT, HB, LA, HA, CA, PGI, with filter selectivities: LB 0.0008 (recent rows), LA 0.0013, HA 0.028, PGI 0.13]

8 The SQL Diagram
[Same join diagram as slide 7: LB 0.0008 (recent rows), LA 0.0013, HA 0.028, PGI 0.13]
Standard (heuristic-best) join order: LB, IT, HB, LA, HA, CA, PGI. Too slow: 4+ hours.

9 The SQL Diagram
[Join diagram, further annotated: filter selectivities LB 0.0008, LA 0.0013 (no index), HA 0.028, PGI 0.13; combined selectivities 0.008 and 0.00012; table rowcounts 698M, 231, 1M, 479M, 2.7M, 703M]

10 The SQL Diagram
[Same annotated join diagram as slide 9]
Best join order: HA, CA, PGI, LA, HB, IT, LB. But there is a catch! (Takes 2:40.)

11 The SQL Diagram
[Same annotated join diagram as slide 9]
Takes 2:40, but at the rate it begins (based on those sampling tests), it should take half as long! Explanation: rollback overhead!

12 The SQL Diagram
[Same annotated join diagram as slide 9]
Takes 2:40, but at the rate it begins, it should take half as long! Explanation: rollback overhead! Solution: hit the recent rows first, as these are the most likely to have rollback! Takes 1:20!

13 The SQL Fix, Anonymized

SELECT /*+ leading(ha_iv la hb it lb) use_nl(la hb it lb) index(it IT_N2) */ ...
FROM LA,
     (SELECT /*+ leading(HA) index_desc(HA HA_PK) first_rows */ ...
      FROM HA, CA, PGI
      WHERE HA.HEADER_ID > 0 /* true always; drives the descending range scan */
      AND HA.ATTRIBUTE3 = 'Y'
      AND HA.STATUS_ID = 10047 /* the two filters, together: 29K/1M */
      AND CA.SITE_ID = HA.INVOICE_SITE_ID
      AND CA.SITE_CODE = 'BILL_TO'
      AND PGI.COUNTRY_CODE = CA.COUNTRY
      AND UPPER(PGI.THEATER) = UPPER('EURO')
      AND ROWNUM > 0 /* true always; prevents merging of the inline view */) HA_IV,
     ...
WHERE ...

14 Conclusions
Most optimized queries drive to the recent rows first, automatically. Recent rows are the hottest: the most likely to require rollback. The best way to avoid rollback overhead for these queries is simply to find the fastest plan; with less runtime, fewer rows will require rollback.

15 Conclusions
Sometimes, optimized queries drive from time-independent conditions. The best way to avoid rollback overhead for these is still to speed up the query, resulting in fewer row changes during the query's runtime!

16 Conclusions
Sometimes, optimized queries drive from time-independent conditions, and in rare cases these queries run so long that by the time they reach the hot rows, rollback overheads are high. In these cases, driving to recent rows first, independent of any filter on the driving table (something like a reverse full table scan), can avoid rollback overhead, because the last rows reached (the old rows) are the least likely to require rollback.
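Generalizing the fix from slide 13, the reverse-scan trick can be sketched as below. The names are hypothetical, and the sketch assumes the primary key is sequence-generated, so descending key order approximates newest-first.

```sql
-- Hedged sketch: with no selective filter on the driving table, an
-- always-true predicate on a sequence-based key plus index_desc makes
-- the plan visit the hottest (newest) rows first, so the old rows are
-- read late, when they are least likely to need undo applied.
SELECT /*+ leading(t) index_desc(t T_PK) first_rows */
       t.id, t.payload
FROM   big_table t
WHERE  t.id > 0;   -- true for every row; exists only to give the optimizer
                   -- a range condition for the descending primary-key scan
```

The key design point is that the WHERE clause filters nothing; it exists purely to make a descending index range scan legal, turning row-visit order itself into the tuning lever.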

17 Questions?

