Download presentation
Presentation is loading. Please wait.
Published byMartha Turner Modified over 9 years ago
1
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005
2
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?2 Gathering Statistics The way from ANALYZE TABLE to DBMS_STATS Increasing number of parameters and options changing defaults CBO more intelligent a “small change” can have a big effect... and all that stuff only to get some redundant information
3
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?3 Complexity Trap In the chain of dependent objects the quantity is increasing
4
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?4 Are My Statistics Adequate? Way 1 - observe (they are not if you encounter problems) This is reactive approach Way 2 - test it (define test cases and check them) define invariant the invariant must respect the dynamic of the database!
5
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?5 Basis Scenario Motto: fix the problem in its origin Limit the scope to the basic table access only (full scan, index access) This reduces the complexity significantly and can expose the real origin of problems in the execution plan
6
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?6 Simple Example select * from ta, tb where ta.rand = tb.rand and ta.norm = 141 and ta.rand <= 770 and (tb.norm = 1 or tb.norm = -1) --------------------------------------------------------- | Id | Operation | Name | Rows | --------------------------------------------------------- | 0 | SELECT STATEMENT | | 65073 | |* 1 | HASH JOIN | | 65073 | |* 2 | TABLE ACCESS BY INDEX ROWID| TA | 2771 | |* 3 | INDEX RANGE SCAN | TA_INDEX | 3598 | |* 4 | TABLE ACCESS FULL | TB | 23580 | --------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("TA"."RAND"="TB"."RAND") 2 - filter("TA"."RAND"<=770) 3 - access("TA"."NORM"=141) 4 - filter(("TB"."NORM"=(-1) OR "TB"."NORM"=1) AND "TB"."RAND"<=770)
7
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?7 Verify it! Re-estimate For each table access Get owner, object name and alias Option FULL – get filter predicate Option (b-tree) INDEX – get access and filter predicate on table and index line enclose in brackets and connect with AND form a re-estimation query execute the query
8
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?8 Re-estimation – how does it work? Input from execution plan |* 2 | TABLE ACCESS BY INDEX ROWID| TA | 2771 | |* 3 | INDEX RANGE SCAN | TA_INDEX | 3598 | |* 4 | TABLE ACCESS FULL | TB | 23580 | --------------------------------------------------------- 2 - filter("TA"."RAND"<=770) 3 - access("TA"."NORM"=141) 4 - filter(("TB"."NORM"=(-1) OR "TB"."NORM"=1) AND "TB"."RAND"<=770) Generated re-estimation query select count(*) from "TB" where (("TB"."NORM"=(-1) OR "TB"."NORM"=1) AND "TB"."RAND"<=770) ; -- gives 60317 rows select count(*) from "TA" where ("TA"."NORM"=141) and ("TA"."RAND"<=770) ; -- gives 1 row Run the re-estimation query to get the real cardinality, compare it with the original estimation
9
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?9 Re-estimation – Summary Run the re-estimation query to get the real cardinality, compare it with the original estimation Note, that the TA access cardinality is overestimated The TB access cardinality is underestimated
10
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?10 Re-estimation Basic Idea The simplification is based on the assumption that all problems in the execution plan are caused directly or indirectly by the cardinality assumption in the base table access. Examples Cardinality underestimation can lead to NL join with big inner table Cardinality overestimation can switch a NL join to a SORT MERGE join improper assumption of cardinality 1 can lead to CARTESIAN join
11
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?11 Access and Filter Predicates Access Predicate – controls the index access Filter Predicate – additional filter condition of index of table Availability of predicates Oracle Version Predicates available 8on demand only (in trace) 9current status (for cached cursors) V$SQL_PLAN 10historical overview for a controlled period of time – AWR (view DBA_HIST_SQL_PLAN )
12
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?12 Hypothesis By selecting a limited set of objects and corresponding predicates we can try to find an answer to some fundamental questions: are my statistics precise enough? are my statistics up-to-date? does a histogram help? is the sample size OK? is dynamic sampling relevant? is the partition design suited for the application? ready for migration?
13
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?13 Are My Statistics Precise Enough? Comparing the estimated and re-estimated cardinality we obtain the relative and absolute difference Predicate - where (("TB"."NORM"=(-1) OR "TB"."NORM"=1) AND "TB"."RAND"<=770) ACCE EST_CARD RE_EST_CARD ABS_DIFF REL_DIFF ---- ---------- ----------- ---------- ---------- same 23580 60317 36737 61 Predicate - where ("TA"."NORM"=141) and ("TA"."RAND"<=770) ACCE EST_CARD RE_EST_CARD ABS_DIFF REL_DIFF ---- ---------- ----------- ---------- ---------- same 2771 1 -2770 -277000
14
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?14 Are My Statistics Up-to-date? By observing the historical view we obtain the dynamic aspect of predicate selectivity
15
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?15 Processing Two basic steps are extracting and processing of the predicate information for execution plans example administrative task operative task
16
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?16 Simple Model
17
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?17 Does a Histogram Help? Histogram candidates: column is used in predicate bad estimation access with different values real cardinality is in wide range Test the influence of histogram size on the precision of selectivity estimation using re-estimation queries.
18
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?18 Histogram / 2 Without a histogram (SIZE=1) the cardinality is underestimated (average). If the histogram size is higher than the number of distinct values the estimation is very precise. For columns with a high number of distinct values the estimation remains imprecise. Target precision
19
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?19 Sample Size Re-estimate a predicate on statistics gathered with different estimate_percent parameter observe the influence of the sample size Possible results: relatively small percentage returns good results when increasing estimate percent the error is reduced there is marginal effect of the sample size on the precision Q: is there one optimal single sample size for an object?
20
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?20 Re-estimation and Dynamic Sampling add a dynamic sampling hint to the re-estimation query trace with 10053 event select /*+ dynamic_sampling(tc,2) */ * from tc where norm300 = 0; Generated DS query SELECT /* some hints */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ FULL("TC") some other hints */ 1 AS C1, CASE WHEN "TC"."NORM300"=0 THEN 1 ELSE 0 END AS C2 FROM "TC" SAMPLE BLOCK (1.076923, 1) SEED (1) "TC" ) SAMPLESUB Total rows in sample rows satisfying the predicate Sample size controlled by level
21
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?21 Re-estimation and Dynamic Sampling / 2 Results of DS query level : 2 sample pct. : 50.806452 actual sample size : 54147 filtered sample card. : 6018 orig. card. : 111358 single table dynamic sel. est. : 0.11114189 DS blocks / table blocks Total rows in sample rows satisfying the predicate Computed selectivity filtered / actual rows (= satisfying / total rows)
22
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?22 Re-estimation and Dynamic Sampling / 3 Dynamic sampling has a neat positive side effect Level 10 performs exactly the re-estimation no need to execute the re-estimation query, parsing with DS level 10 returns exact result Lower levels of DS are relatively cheap and can give some hints
23
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?23 Dynamic Sampling – Levels Dynamic sampling levels are absolute (except for level 10) Even for a moderate table the estimate percent of DS could be too low DS uses block sampling only Block size 8KB
24
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?24 Re-estimation – Final Approach For a selected predicate build re-estimation query parse it with different dynamic sampling hints (e.g. 0, 2, 5 and 10) peek in the PLAN_TABLE and get the cardinality and access option Predicate where part_key = 3 and GROW_SHIF > 0 and shif = 0 ACCE EST_CARD EST_DS2_CARD EST_DS5_CARD RE_EST_CARD DIFF DIFF_PREC ---- ---------- ------------ ------------ ----------- ---------- ---------- same 7982 94 116 89 -7893 -8869 Predicate where part_key = 2 and GROW = 100 ACCE EST_CARD EST_DS2_CARD EST_DS5_CARD RE_EST_CARD DIFF DIFF_PREC ---- ---------- ------------ ------------ ----------- ---------- ---------- same 4281 13 2 0 -4281 -428100
25
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?25 Estimation of Lower Levels of Dynamic Sampling Two extreme cases can be observed while re- estimation on lower levels of DS. Dynamic sampling is extremely precise even on lower levels Dynamic sampling doesn’t “work” until a high level Real Cardinality Estimated Cardinality DS Level 0 510
26
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?26 Evolution of Dynamic Sampling Consider that dynamic sampling itself is subject of evolution dynamic sampling on a predicate with index: 9i rel 2 – FULL TABLE scan with SAMPLE clause 10g rel 1 – INDEX ACCESS with ROWNUM <= 2500 10g rel 2 – FULL TABLE scan with SAMPLE clause
27
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?27 Re-estimation and Partitioning Constant pstart and pstop values Add predicates for range and list partitions / subpartitions --------------------------------- ----------------- | Operation | Name | | Pstart| Pstop | --------------------------------- ----------------- | SELECT STATEMENT | | | | | | PARTITION RANGE SINGLE| | | 2 | 2 | | PARTITION LIST SINGLE| | | 3 | 3 | | TABLE ACCESS FULL | T2S | | 6 | 6 | ---------------------------------- ---------------- Verify granularity parameter Verify partition pruning
28
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?28 Predicates in Parallel Plans For parallel execution plans no additional processing by generating re-estimation queries is required | Id | Operation | Name | Rows | ---------------------------------------------------- | 0 | SELECT STATEMENT | | 82647 | | 1 | PX COORDINATOR | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | 82647 | |* 3 | HASH JOIN | | 82647 | | 4 | PX BLOCK ITERATOR | | 3519 | |* 5 | TABLE ACCESS FULL | TA | 3519 | | 6 | BUFFER SORT | | | | 7 | PX RECEIVE | | 23580 | | 8 | PX SEND BROADCAST | :TQ10000 | 23580 | |* 9 | TABLE ACCESS FULL| TB | 23580 |
29
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?29 Bitmap Index Access Generating of re-estimation queries for bitmap index access must respect the access logic 1 | TABLE ACCESS BY INDEX ROWID | T1 | 13 | 2 | BITMAP CONVERSION TO ROWIDS| | | 3 | BITMAP AND | | | 4 | BITMAP INDEX SINGLE VALUE| IND2_T1 | | 5 | BITMAP INDEX SINGLE VALUE| IND1_T1 | | Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("T1"."GROUP2_ID"=28) 5 - access("T1"."STATUS"='X') Generated predicate ("T1"."GROUP2_ID"=28) AND ("T1"."STATUS"='X') The bitmap index cardinality is missing in the execution plan
30
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?30 Ready for Upgrade or Crash? How can the behavior of an application in migrated environment be predicted? A sophisticated but expensive method is to compare execution plans Re-estimation method can be used as a lightweight timely warning system
31
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?31 Some Limitations Limited usage on predicates with bind variables and parse time partition pruning Predicates stored in ACCESS_PREDICATES and FILTER_PREDICATES are not always complete A single predicate is limited to the length of 4000 bytes Further Work Weighted function for the result of re-estimation Relevance to SQL profile
32
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?32 Summary Method to verify the “status” of database statistics Introduction of “predicate mining” A green bar for database statistics Dynamic sampling may be used as a verification method
33
www.db-nemec.com J.NemecAre Your Statistics Bad Enough?33 References Christian Antognini, CBO: A Configuration Roadmap, Hotsos Symposium 2005 Wolfgang Breitling, Using DBMS_STATS in Access Path Optimization, UKOUG 2004 Jonathan Lewis, Strategies for Statistics, UKOUG 2004 metalink 72539.1 Interpreting Histogram Information 114671.1 Gathering Statistics for the Cost Based Optimizer 236935.1 Global statistics - An Explanation
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.