Copyright © 2006 Quest Software
Partitioning with Oracle 11G
Bert Scalzo, Domain Expert, Oracle Solutions

About the Author …
Domain Expert & Product Architect for Quest Software
Oracle Background:
  – Worked with Oracle databases for over two decades (starting with version 4)
  – Work history includes time at both “Oracle Education” and “Oracle Consulting”
Academic Background:
  – Several Oracle Masters certifications
  – BS, MS and PhD in Computer Science
  – MBA (general business)
  – Several insurance industry designations
Key Interests:
  – Data Modeling
  – Database Benchmarking
  – Database Tuning & Optimization
  – "Star Schema" Data Warehouses
  – Oracle on Linux – and specifically: RAC on Linux
Articles for:
  – Oracle’s Technology Network (OTN)
  – Oracle Magazine, Oracle Informant
  – PC Week (eWeek)
  – Dell Power Solutions Magazine
  – The Linux Journal
  – www.linux.com
  – www.orafaq.com
This presentation draws heavily on these areas

Books by Author …
Coming in 2008 …

Agenda
Partitioning Benefits
Partitioning History
Partitioning Options
Partitioning Advisor (if you’re licensed)
Typical Data Warehousing Environment
TPC-H “Data Warehouse” Benchmark Results
TPC-H with Various Partition Strategies
What about OLTP Environments and the TPC-C/E
Lessons Learned (and their relevance/application)
Questions & Answers

Partitioning Benefits: Facts
Manageability
  – Classic “Divide & Conquer” technique
  – More granular storage allocation options
  – Keeps otherwise time-consuming options viable
Availability
  – More granular online/offline options
  – More granular rebuild/reorganization options
  – More granular object level backup/restore options
Capacity Management
  – Enables a “Tiered Storage Architecture” approach
  – More granular storage cost management decision points
Performance
  – Partition Pruning
  – Partition-Wise Joins

Partitioning Benefits: Opinion (Mine)
Manageability: 40%
Availability: 20%
Capacity Management: 20%
Performance: 20%
Don’t over-sell/over-expect the performance aspect
Need to experiment to find the best approach for a database
Better to take longer at the start to get it right, because very often it’s far too expensive to change afterwards
Examples demonstrate very positive performance, but better to be conservative and err on the side of caution – then be very pleasantly surprised…
Why to Partition

Partition Pruning (Restriction Based)
From Docs: In partition pruning, the optimizer analyzes FROM and WHERE clauses in SQL statements to eliminate unneeded partitions when building the partition access list. This enables Oracle Database to perform operations only on those partitions that are relevant...
“Divide and Conquer” for performance
  – Sometimes can yield an order of magnitude improvement
  – But once again, best not to oversell and/or over-expect
Some Potential Issues to be aware of:
  – SQL*Plus Auto-Trace can sometimes miss partition pruning
  – “Old Style” Explain Plans via simple SELECT have issues too
  – Best to always use DBMS_XPLAN and/or SQL_TRACE
Note: Trace file analysis is much easier these days – SQL Developer + free Hotsos plug-in, metalink trace analysis scripts, Quest Toad DBA
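As a rough illustration of restriction-based pruning, the Python sketch below models a range-partitioned table as a list of exclusive upper bounds and keeps only the partitions whose key range can overlap a query's date predicate. The partition names, bounds, and the prune helper are hypothetical; the sketch mimics the idea and is not how Oracle's optimizer is implemented.

```python
from datetime import date

# Hypothetical range partitions: (name, exclusive upper bound),
# in the spirit of a VALUES LESS THAN clause.
PARTITIONS = [
    ("p1995", date(1996, 1, 1)),
    ("p1996", date(1997, 1, 1)),
    ("p1997", date(1998, 1, 1)),
    ("p1998", date(1999, 1, 1)),
]


def prune(partitions, lo, hi):
    """Return names of partitions that may hold keys in the inclusive range [lo, hi]."""
    kept = []
    lower = date.min  # inclusive lower bound of the current partition
    for name, upper in partitions:
        # The partition holds keys with lower <= key < upper,
        # so it is relevant only if that range overlaps [lo, hi].
        if lo < upper and hi >= lower:
            kept.append(name)
        lower = upper
    return kept


if __name__ == "__main__":
    # A predicate such as: order_date BETWEEN 1996-06-01 AND 1997-03-31
    print(prune(PARTITIONS, date(1996, 6, 1), date(1997, 3, 31)))
```

Only the partitions returned by prune would need to be scanned; the rest are eliminated before any I/O happens, which is where the potential order of magnitude improvement comes from.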

Partition-Wise Join (Multi-Object Based)
From Docs: Partition-wise joins reduce query response time by minimizing the amount of data exchanged among parallel execution servers when joins execute in parallel. This significantly reduces response time & improves the use of both CPU & memory resources.
Different Flavors:
  – Full – Single to Single
  – Full – Composite to Single
  – Full – Composite to Composite
  – Partial – Single
  – Partial – Composite
Indexing Strategy Counts
  – Local Prefixed/Non-Prefixed
  – Global
All of these affect the explain plan
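To make the full partition-wise join idea concrete, here is a minimal Python sketch under simplifying assumptions: both inputs are hash-partitioned on the join key into the same number of buckets, so each pair of matching buckets can be joined on its own with no data exchanged between pairs. The function names and the dictionary row format are hypothetical, and the loop runs serially where a database would hand each pair to a parallel execution server.

```python
from collections import defaultdict


def hash_partition(rows, key, n):
    """Split rows (dicts) into n buckets by hashing the join key."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[hash(row[key]) % n].append(row)
    return buckets


def partition_wise_join(left, right, left_key, right_key, n=4):
    """Join each pair of equally numbered partitions independently."""
    lparts = hash_partition(left, left_key, n)
    rparts = hash_partition(right, right_key, n)
    result = []
    for p in range(n):  # each iteration could run in its own parallel worker
        # Build a lookup on the left partition, then probe it with the right one.
        lookup = defaultdict(list)
        for lrow in lparts.get(p, []):
            lookup[lrow[left_key]].append(lrow)
        for rrow in rparts.get(p, []):
            for lrow in lookup.get(rrow[right_key], []):
                result.append({**lrow, **rrow})
    return result


if __name__ == "__main__":
    orders = [{"o_orderkey": 1}, {"o_orderkey": 2}, {"o_orderkey": 3}]
    lineitems = [
        {"l_orderkey": 1, "l_linenumber": 1},
        {"l_orderkey": 1, "l_linenumber": 2},
        {"l_orderkey": 3, "l_linenumber": 1},
    ]
    for row in partition_wise_join(orders, lineitems, "o_orderkey", "l_orderkey"):
        print(row)
```

Because rows with equal keys always land in the same bucket number on both sides, no bucket ever needs rows from a differently numbered bucket, which is the property that keeps data exchange between parallel workers low.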

Picture Worth 1000 Words (from Docs)
Simple Mantra: Subdivide the work into equally paired chunks, then perform all that work using many parallel processes
Make sure not to over-allocate CPUs – remember there will also be a concurrent workload

Partitioning History (from Oracle 11G training+)
Oracle 7: Partition Views – really more of a cheat
Oracle 5: Before Tablespaces – we had partitions

Partitioning Options – Part 1
IOTs can be partitioned as well in later versions of Oracle, so the basic choices are even more complex than this…

Partitioning Options – Part 2
Prior to 11G
Source: Oracle White Paper, “Partitioning in Oracle Database 11g”, 2007

Partitioning Options – Part 3
Post 11G: Very exciting new options…
Source: Oracle White Paper, “Partitioning in Oracle Database 11g”, 2007

Partitioning Advisor (if you’re licensed)
Advisor Central -> SQL Advisors -> SQL Access Advisor

Typical Data Warehouse Architecture
TPC-H

Typical Environments
Environment types: OLTP | ODS | OLAP | DM/DW
Business Focus: Operational | Operational Tactical | Tactical | Tactical Strategic
End User Tools: Client Server Web | Client Server | Client Server Web
DB Technology: Relational | Cubic | Relational
Trans Count: Large | Medium | Small
Trans Size: Small | Medium | Large
Trans Time: Short | Medium | Long
Size in Gigs: 10 – 200 | 50 – 400 | 400 – 4000
Normalization: 3NF | N/A | 0NF
Data Modeling: Traditional ER | N/A | Dimensional
We’ll come back to this picture

TPC-H Benchmark
Industry Standard “Data Warehouse” Benchmark
URL: www.tpc.org/tpch
Spec: http://tpc.org/tpch/spec/tpch2.7.0.pdf
8 Tables
22 Queries (answer complex business questions)
Database scaling:
  – Factor = 1, 10, 30, 100, 300, 1000, 3000, 10000, 30000, 100000
  – Size GB = 1, 10, 30, 100, 300, 1000, 3000, 10000, 30000, 100000

TPC-H Data Model
(Diagram. Table cardinalities shown: 5; 25; SF * 10,000; SF * 200,000; SF * 800,000; SF * 150,000; SF * 6,000,000; SF * 1,500,000. Legend: Partitions, Sub-Partitions)
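The scale-factor arithmetic behind the data model can be sketched in a few lines of Python. The pairing of table names with multipliers is taken from the TPC-H specification rather than from the slide text, so treat it as an assumption here; the LINEITEM figure is approximate in the specification, and the row_counts helper is hypothetical.

```python
# Fixed-size tables, independent of the scale factor (SF).
FIXED = {"REGION": 5, "NATION": 25}

# Rows per unit of scale factor; names paired per the TPC-H specification
# (an assumption, since the slide only lists the multipliers).
PER_SF = {
    "SUPPLIER": 10_000,
    "PART": 200_000,
    "PARTSUPP": 800_000,
    "CUSTOMER": 150_000,
    "LINEITEM": 6_000_000,  # approximate in the specification
    "ORDERS": 1_500_000,
}


def row_counts(sf):
    """Return the expected row count per table for a given scale factor."""
    counts = dict(FIXED)
    counts.update({table: per_sf * sf for table, per_sf in PER_SF.items()})
    return counts


if __name__ == "__main__":
    for table, rows in row_counts(10).items():
        print(f"{table:<10}{rows:>12,}")
```

At SF = 10 this gives roughly 60 million LINEITEM rows and 15 million ORDERS rows, which lines up with the row estimates for H_LINEITEM and H_ORDER in the example explain plan later in the deck.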

TPC-H Permits Partitioning …
But what to do, what to do ???

Disclosure Reports
http://tpc.org/tpch/results/tpch_perf_results.asp

Disclosure Report – Lots of Info
This is where people document exactly which advanced database features and storage parameters they used – this info is invaluable

Disclosure Report – Appendix B

Sample Expensive Query

Example Explain Plan

Example Explain Plan
Explain complete.
Plan hash value: 2545634784
----------------------------------------------------------------------------------------------
| Id  | Operation               | Name       | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |            | 42533 |  5648K|       |   641K  (1)| 01:57:41 |
|   1 |  SORT GROUP BY          |            | 42533 |  5648K|   105M|   641K  (1)| 01:57:41 |
|*  2 |   HASH JOIN             |            |   715K|    92M|       |   631K  (1)| 01:55:51 |
|   3 |    TABLE ACCESS FULL    | H_NATION   |    25 |   725 |       |     3   (0)| 00:00:01 |
|*  4 |    HASH JOIN            |            |   715K|    72M|       |   631K  (1)| 01:55:51 |
|   5 |     TABLE ACCESS FULL   | H_SUPPLIER |   100K|   781K|       |   646   (1)| 00:00:08 |
|*  6 |     HASH JOIN           |            |   720K|    68M|    68M|   631K  (1)| 01:55:44 |
|*  7 |      HASH JOIN          |            |   751K|    59M|   232M|   589K  (1)| 01:48:10 |
|*  8 |       HASH JOIN         |            |  3004K|   197M|  4984K|   485K  (1)| 01:28:57 |
|*  9 |        TABLE ACCESS FULL| H_PART     |   100K|  3808K|       | 11805   (1)| 00:02:10 |
|  10 |        TABLE ACCESS FULL| H_LINEITEM |    60M|  1716M|       |   342K  (1)| 01:02:52 |
|  11 |       TABLE ACCESS FULL | H_ORDER    |    15M|   200M|       | 72122   (1)| 00:13:14 |
|  12 |      TABLE ACCESS FULL  | H_PARTSUPP |  8000K|   122M|       | 25945   (1)| 00:04:46 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("S_NATIONKEY"="N_NATIONKEY")
   4 - access("S_SUPPKEY"="L_SUPPKEY")
   6 - access("PS_SUPPKEY"="L_SUPPKEY" AND "PS_PARTKEY"="L_PARTKEY")
   7 - access("O_ORDERKEY"="L_ORDERKEY")
   8 - access("P_PARTKEY"="L_PARTKEY")
   9 - filter("P_NAME" LIKE :SYS_B_2)

Method of Attack
Since many data warehouses are used for data mining, we can’t always know every query likely to run, so success needs an aggregate measure
Thus we’ll compare the benchmark’s weighted performance scores for the TPC-H using various partitioning schemes (all within spec of course…)
Goal will be to find the best overall partitioning …
Then we’ll examine some specific explain plans …

10G Sample Test Cases
10G Simple Approach (Just Huge Tables)
  – Range: ORDER (order date)
  – Hash: LINEITEM (order key)
10G Basic Approach (Single Level Partitions)
  – Range: ORDER (order date)
  – Hash: LINEITEM (order key)
  – List: CUSTOMER (nation key)
  – Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)
10G Complex Approach (Composite Partitions)
  – Range-Hash: ORDER (order date & cust key)
  – Multi-Hash: LINEITEM (part, supp & order keys)
  – List: CUSTOMER (nation key)
  – Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)

11G Sample Test Cases
11G Simple Approach (+Interval)
  – Interval-Hash: ORDER (order date & cust key)
  – Multi-Hash: LINEITEM (part, supp & order keys)
  – List: CUSTOMER (nation key)
  – Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)
11G Basic Approach (+Virtual)
  – Interval-Hash: ORDER (virtualized order date & cust key)
  – Multi-Hash: LINEITEM (part, supp & order keys)
  – List: CUSTOMER (nation key)
  – Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)
11G Complex Approach (+REF)
  – Interval-Hash: ORDER (virtualized order date & cust key)
  – REF: LINEITEM (order key)
  – List: CUSTOMER (nation key)
  – Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)
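Interval partitioning, the first 11G addition in the test cases above, has the database create range partitions on demand at a fixed interval instead of requiring each one to be declared up front. A minimal Python sketch of how a key maps to its interval partition, assuming a monthly interval anchored at a hypothetical start date of 1992-01-01; the helper names are illustrative and this is not Oracle's implementation.

```python
from datetime import date


def _add_months(first_of_month, n):
    """Shift a first-of-month date forward by n whole months."""
    total = first_of_month.year * 12 + (first_of_month.month - 1) + n
    return date(total // 12, total % 12 + 1, 1)


def interval_partition_bounds(key, start, months=1):
    """Return the [lower, upper) bounds of the interval partition that would hold key."""
    if start.day != 1:
        raise ValueError("this sketch assumes the interval starts on the first of a month")
    if key < start:
        raise ValueError("key falls below the first interval boundary")
    elapsed = (key.year - start.year) * 12 + (key.month - start.month)
    idx = elapsed // months  # zero-based interval number
    return _add_months(start, idx * months), _add_months(start, (idx + 1) * months)


if __name__ == "__main__":
    start = date(1992, 1, 1)  # hypothetical anchor for the first interval
    print(interval_partition_bounds(date(1997, 3, 15), start))            # monthly
    print(interval_partition_bounds(date(1997, 12, 31), start, months=3)) # quarterly
```

The practical consequence is that new order dates never fail for lack of a partition; the bounds are derived from the key itself, so the partition can be created the first time a row needs it.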

Is that It?
No – just six very obvious high-level scenarios
Your selections and actual mileage will vary
Experimentation usually yields the best results
Always trust “empirical results” over conjecture
So improved response time beats a better explain plan
Remember, DWs usually have unpredictable queries
So don’t tune for just a few queries; look for the best overall and/or more generic performance solution

Intermediate Results can be Misleading
TPC-H Power score seemingly implies that every partitioning scheme is incrementally better
TPC-H Throughput score seems to show that non-partitioned is equal to the best

Final Results tell the Real Truth
TPC-H Query/Hour score shows that some partitioning schemes are better, and some are not
TPC-H $/Query/Hour score confirms the inverse in terms of dollars per unit of work
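In the TPC-H specification the composite Query-per-Hour metric is the geometric mean of the Power and Throughput metrics, and price/performance divides the total system price by that composite. The Python sketch below uses made-up scores (not results from this presentation) to show how a scheme can look better on Power alone yet rank worse on the composite.

```python
from math import sqrt


def qphh(power, throughput):
    """Composite Query-per-Hour: geometric mean of Power and Throughput."""
    return sqrt(power * throughput)


def price_per_qphh(system_price, power, throughput):
    """Price/performance: total system price per unit of composite QphH."""
    return system_price / qphh(power, throughput)


# Hypothetical (power, throughput) scores, for illustration only.
SCHEMES = {
    "non-partitioned": (900.0, 1600.0),
    "scheme A": (1000.0, 900.0),
    "scheme B": (1100.0, 1600.0),
}

if __name__ == "__main__":
    price = 120_000.0  # hypothetical total system price
    for name, (power, throughput) in SCHEMES.items():
        score = qphh(power, throughput)
        cost = price_per_qphh(price, power, throughput)
        print(f"{name:<16} QphH={score:8.1f}  $/QphH={cost:7.2f}")
```

In this made-up data, scheme A beats the non-partitioned case on Power but loses on the composite because its Throughput is lower, and at a fixed price the $/QphH figure moves in the opposite direction to QphH.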

Why such seemingly Opposite Results ???
Run times and explain plans apply to a single measurable operation
Even aggregate & averaged run times don’t relate the entire truth!
Need an answer based upon sound mathematics (reliable & repeatable)

TPC-C Benchmark
Historical Industry Standard “OLTP” Benchmark
URL: www.tpc.org/tpchc
Spec: http://tpc.org/tpcc/spec/tpcc_current.pdf
Probably the most used & widely quoted benchmark
But suffers from overly simplistic design & code logic
Generally considered unreliable with modern RDBMS
But still a decent rough “sounding board” for many ….
Being replaced by the newer TPC-E (later slides)

TPC-C Data Model
(Diagram. Annotations: Clustered; Base Scaling Unit; # Terminals/Warehouse, i.e. concurrent users. Legend: Partitions, Sub-Partitions)

TPC-E Benchmark
Emerging Industry Standard “OLTP” Benchmark
URL: www.tpc.org/tpche
Spec: http://tpc.org/tpce/spec/TPCE-v1.5.1.pdf
Very new and still evolving – but highly promising
Not too many published TPC-E results as of yet …
Design not compromised by RDBMS features
Much more realistic (i.e. real world) in nature
Nowhere near as easy as the old TPC-C test

TPC-E Data Model

TPS is Moot, Average Response Time is King
But wait: adding cluster & partitioning yields a negative result – why ???
Look to Stats Pack, AWR and ADDM Reports to investigate…
tpmC: 584, 582, 579, 578

DISTRICT table needs to be clustered
Single block reads: 32 ms!

Clustering worked: it made SQL the #1 performance issue, as was expected
Single block read: 5 ms
Partitioning did not shine through just yet, possibly skewed by the first issue
Suggest fixing one major item per test iteration, so made the choice to address this first
Single block read: 32 ms

Switch MEMORY_TARGET to SGA/PGA_TARGETS
Notice that manual memory management resulted in a 13% gain !!!
But wait, there’s more (isn’t that almost always the case) …
-13%

Now it’s all just SQL, so time for SQL Tuning Advisor & SQL Tuning Sets

SQL Tuning Sets & 11G Results/Client Cache
Going to stop – have quickly reached the “good enough” point
-9%

Architecture Findings
Environment types: OLTP | ODS | OLAP | DM/DW
Business Focus: Operational | Tactical Strategic
End User Tools: Client Server Web
DB Technology: Relational
Trans Count: Large | Small
Trans Size: Small | Large
Trans Time: Short | Long
Size in Gigs: 10 – 200 | 400 – 4000
Normalization: 3NF | 0NF
Data Modeling: Traditional ER | Dimensional
Mostly Partition Elimination | Mostly Partition Wise Join
Design to eliminate per object | Design to parallelize across objects

Other Interesting Findings
64-bit scales much more reliably than 32-bit, even when on the same hardware and using the <= 4GB memory model
If you know the application code’s nature and it’s well definable, manual memory management may be better
Using manual SGA/PGA targets with floors yields much more scalable results and also more predictable patterns
Partitioning is not an automatic bonus; you must experiment to identify the optimal partitioning scheme per situation
Don’t forget older technologies like clusters; they can still add a positive to the overall equation in certain cases
Don’t forget SQL Tuning Sets & SQL Advisor, or the “explain plans” may not in fact be the best obtainable
Don’t forget 11G’s Result and Client Caches

Thank you
Presenter: Bert Scalzo
E-mail: Bert.Scalzo@Quest.com
Questions and Answers …
Note: these slides should be available on the Open World web site, but I’ll also make sure to post them on my company’s web site: www.toadworld.com/Experts/BertScalzosToadFanaticism/tabid/318/Default.aspx
