Published by Hubert Kevin Price. Modified over 9 years ago.
1
Oracle BIWA SIG Basics
  Worldwide association of 2000 professionals interested in Oracle Database-centric business intelligence, data warehousing, and analytical products, features and options.
  Membership is FREE.
  Open forum to foster success in the use and development of Oracle BIWA products.
  BIWA’s goals include sharing “best practices” and “novel and interesting use cases” of Oracle BIWA-centric technology.
  See the Mission Statement and Charter at oraclebiwa.org.
2
Next Oracle BIWA SIG Conference
  BIWA Training Days at COLLABORATE 10 - IOUG Forum
  “Get Analytical with BIWA Training Days”
  April 18-22, 2010, Mandalay Bay Convention Center, Las Vegas, Nevada
  REGISTER with Offer Code “BIWA2010” for the IOUG Member Discount Rate
  See oraclebiwa.org for details and links
3
Top 5 Tips on: Reducing Storage Cost while Improving Performance
Jean-Pierre Dijcks, Data Warehouse Product Management
4
Agenda
  Tip 1: Appropriate Hardware
  Tip 2: Tier your Storage
  Tip 3: Partition your Data
  Tip 4: Compress your Data
  Tip 5: Think, Plan and Design
  Q&A
5
Agenda
  Tip 5: Think, Plan and Design
  Tip 1: Appropriate Hardware
  Tip 2: Tier your Storage
  Tip 3: Partition your Data
  Tip 4: Compress your Data
  Software Forces a Paradigm Shift
  Q&A
6
Tip #5: Think, Plan and Design
Understand the requirements
  Data retention rates
  What to do with the older data
  What you are doing with the newer data
  Performance requirements for all data (not just the latest stuff)
Plan for the worst (kind of)
  What is the performance goal, and can you still achieve it in 6 months or 2 years?
  What is the data retention rate, and can you deal with it at double your data size?
Design the system to still work tomorrow
7
Tip #5: Think, Plan and Design
Understand or learn about the trends
  Hardware
    Low-price commodity servers
    High refresh rates of components (CPUs etc.)
    Ever-growing sizes and speeds at ever-dropping prices
  Software
    More aligned with hardware
    Push-down of data-intensive tasks into storage
    Consolidation: more workloads are thrown at the software
8
Tip #5: Think, Plan and Design
One interpretation of these trends:
  We will see self-provisioning of vast resources by (end) users
  This will be achieved by making a flexible grid of resources available
  More people will get and use more compute power
  More and more workloads will run on the “same” hardware
  Integrated software services will provide the value-add for these users and make consolidation work…
This has major implications for all of us… I think…
9
Tip #1: Balance your Hardware
Driver: Flexibility in Performance
[Diagram: storage hierarchy. Tier labels: Memory etc.; Solid State Disks, Flash Cards and Disks; 2TB 10K RPM SATA Disks and other high capacity media. Axis labels: Cost and Speed, from Higher to Lower. ILM moves data Upward and Downward between tiers.]
10
Tip #1: Balance your Hardware
Driver: Flexibility in Performance
[Diagram: storage tiers. Tier labels: Memory, Flash Technology, SAS drives, SATA drives, Off-line Data Archives. Data share labels: < 10% of your data, < 50% of your data, 100% of your data. Query share labels: 60% of your queries, 35% of your queries, 5% of your queries.]
Disclaimer: Illustration purposes only!
11
Tip #2: Tier your Storage
Driver: Cost and Performance
Speed and Cost (% of capacity per tier; => cost indicator)

  Configuration   SATA drives   Flash Technology   Memory   SAS drives
  Performance     0             10                 5        85
  Capacity        99.75         0                  0.25     0

Disclaimer: Illustration purposes only!
12
Tip #2: Tier your Storage
Driver: Cost and Performance
Speed and Cost (% of capacity per tier; => cost indicator)

  Configuration          SATA drives   Flash Technology   Memory   SAS drives
  Balanced Performance   0             5                  1        94
  Balanced Capacity      98            1.5                0.5      0

Disclaimer: Illustration purposes only!
13
Tip #1 and Tip #2: Balance and Flexibility
Create a grid of compute and storage resources
Allow for a hierarchy of storage solutions within the grid
Balance the hardware to:
  Achieve acceptable performance for the majority workload
  Achieve great performance for mission-critical actions
  Achieve a reasonable price / performance balance
Do not size just for performance, nor just for capacity
Very diverse workloads + same hardware = need for flexibility
14
Tip #3: Partition your Data
Impact: Performance and Ease of Maintenance
Maintenance:
  It is easier to work on smaller chunks of data
  Separate management and performance strategies can be specified per chunk
Performance:
  Faster maintenance operations (as shown above)
  Less data volume to scan
  A potential way to let parallel operations optimize data processing
15
The Concept of Partitioning
Simple yet powerful
  Large Table: Difficult to Manage
  Partition: Divide and Conquer; Easier to Manage; Improve Performance
  Composite Partition: Higher Performance; More flexibility to match business needs
[Diagram: a SALES table split into Jan and Feb partitions, then sub-partitioned into Europe and USA.]
16
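Partition for Performance: Partition Pruning
Q: What was the total sales amount for May 20 and May 21, 2009?

  SELECT SUM(sales_amount)
  FROM   sales
  WHERE  sales_date >= TO_DATE('05/20/2009', 'MM/DD/YYYY')
  AND    sales_date <  TO_DATE('05/22/2009', 'MM/DD/YYYY');

[Diagram: Sales table with daily partitions 5/17, 5/18, 5/19, 5/20, 5/21, 5/22.]
Only the 2 relevant partitions (5/20 and 5/21) are read. The upper bound is exclusive, so rows stamped exactly at midnight on 5/22 do not pull in a third partition.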
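The pruning idea on this slide can be sketched in a few lines of Python. This is only an illustration, not how Oracle implements it: the `partitions` dictionary, the amounts in it, and the `pruned_sum` helper are all hypothetical. The sketch uses a half-open date range (start inclusive, end exclusive) so that exactly the May 20 and May 21 partitions qualify.

```python
from datetime import date

# Hypothetical stand-in for a SALES table partitioned by day:
# key = the partition's day, value = the sales amounts stored in it.
partitions = {
    date(2009, 5, 17): [100.0, 250.0],
    date(2009, 5, 18): [75.0],
    date(2009, 5, 19): [310.0, 40.0],
    date(2009, 5, 20): [300.0, 50.0],
    date(2009, 5, 21): [120.0],
    date(2009, 5, 22): [999.0],
}

def pruned_sum(partitions, start, end):
    """Sum amounts for start <= day < end, reading only matching partitions."""
    # Deciding which partitions qualify uses only the partition keys
    # (metadata); the row data of other partitions is never touched.
    scanned = sorted(day for day in partitions if start <= day < end)
    total = sum(sum(partitions[day]) for day in scanned)
    return total, scanned

total, scanned = pruned_sum(partitions, date(2009, 5, 20), date(2009, 5, 22))
print(total, [d.isoformat() for d in scanned])
```

Four of the six partitions are skipped without reading a single row, which is the whole point of pruning.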
17
Parallel Processing: Partition Wise Join

  SELECT SUM(sales_amount)
  FROM   sales s, customer c
  WHERE  s.cust_id = c.cust_id;

Both tables have the same degree of parallelism and are partitioned the same way on the join column (cust_id).
A large join is divided into multiple smaller joins; each joins a pair of partitions in parallel.
[Diagram: Sales (range partition May 18th 2008, sub parts 1 to 4) joined pairwise to Customer (sub parts 1 to 4).]
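A rough Python sketch of the partition-wise join idea follows. It is not Oracle's implementation: the row layout, the `hash_partition` and `join_pair` helpers, and the choice of four partitions are assumptions made for illustration, and it assumes `cust_id` is unique in the customer table. Both inputs are partitioned the same way on `cust_id`, so partition i of one table only ever needs to meet partition i of the other.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_PARTS = 4  # assumed: four sub-partitions, as drawn on the slide

def hash_partition(rows, key, num_parts=NUM_PARTS):
    """Split rows into num_parts buckets using the join key."""
    parts = {p: [] for p in range(num_parts)}
    for row in rows:
        parts[row[key] % num_parts].append(row)
    return parts

def join_pair(sales_part, customer_part):
    """Join one pair of matching partitions and return the partial sum."""
    customer_ids = {c["cust_id"] for c in customer_part}
    return sum(s["sales_amount"] for s in sales_part
               if s["cust_id"] in customer_ids)

def partition_wise_sum(sales, customers):
    s_parts = hash_partition(sales, "cust_id")
    c_parts = hash_partition(customers, "cust_id")
    # Each pair of partitions is an independent small join, run in parallel.
    with ThreadPoolExecutor(max_workers=NUM_PARTS) as pool:
        partials = pool.map(lambda p: join_pair(s_parts[p], c_parts[p]),
                            range(NUM_PARTS))
        return sum(partials)

# Hypothetical sample data.
sales = [
    {"cust_id": 1, "sales_amount": 10.0},
    {"cust_id": 2, "sales_amount": 20.0},
    {"cust_id": 5, "sales_amount": 5.0},
    {"cust_id": 9, "sales_amount": 7.0},  # no matching customer
]
customers = [{"cust_id": 1}, {"cust_id": 2}, {"cust_id": 5}]

result = partition_wise_sum(sales, customers)
print(result)
```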
18
Exadata Storage Index
Transparent I/O Elimination with No Overhead
  Exadata Storage Indexes maintain summary information about table data in memory
    Store MIN and MAX values of filter columns
    Typically one index entry for every MB of disk
  Eliminate disk I/Os if MIN and MAX can never match the “where” clause of a query (a “negative index”)
  Completely automatic and transparent

Example table (the Cust_ID, Prod_ID and Amount values are fused in the source and are shown as extracted):

  Order_date    Ship_date     Cust_ID / Prod_ID / Amount
  03-SEP-2009   19-SEP-2009   100753293210,000.00
  03-SEP-2009   05-SEP-2009   20098 20,000.00
  03-SEP-2009   07-OCT-2009   100892001015,000.00
  03-SEP-2009   01-OCT-2009   201001000035,000.00
  03-SEP-2009   19-OCT-2009   803003000010,000.00
  03-SEP-2009   03-NOV-2009   10000 203040,000.00

  First set of rows:  MIN ship_date = '19-SEP-2009', MAX ship_date = '07-OCT-2009'
  Second set of rows: MIN ship_date = '01-OCT-2009', MAX ship_date = '03-NOV-2009'

  SELECT * FROM orders WHERE ship_date < '30-SEP-2009'

Only the first set of rows can match.
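The MIN/MAX elimination described on this slide can be mimicked with a small Python sketch. It is only a conceptual model, not Exadata's implementation: the two `regions`, their ship dates (chosen to reproduce the MIN and MAX values quoted on the slide), and the helper names are assumptions.

```python
from datetime import date

# Hypothetical storage regions, each holding the ship dates of a few rows.
regions = [
    [date(2009, 9, 19), date(2009, 9, 25), date(2009, 10, 7)],
    [date(2009, 10, 1), date(2009, 10, 19), date(2009, 11, 3)],
]

def build_index(regions):
    """Keep only (MIN, MAX) of the filter column for each region."""
    return [(min(rows), max(rows)) for rows in regions]

def scan_less_than(regions, index, cutoff):
    """Evaluate ship_date < cutoff, skipping regions that cannot match."""
    hits, regions_read = [], 0
    for rows, (lo, _hi) in zip(regions, index):
        if lo >= cutoff:
            # Even the smallest value is too large: skip the I/O entirely.
            continue
        regions_read += 1
        hits.extend(d for d in rows if d < cutoff)
    return hits, regions_read

index = build_index(regions)
hits, regions_read = scan_less_than(regions, index, date(2009, 9, 30))
print(len(hits), regions_read)
```

The index never says where a row is, only where it cannot be, which is why the slide calls it a "negative index".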
19
Tip #3: Partition your Data
Impact: Performance and Ease of Maintenance
This does not just mean “apply Partitioning”; there is more that Oracle can do to allow better performance:
  Partitioning, of course: to improve scan speeds and maintenance operations
  Storage Indexes: to improve scan speeds
  Smart Scans: to reduce the amount of data moved around
Breaking up a large data set delivers both performance and ease of maintenance
20
Tip #3: Partition your Data
Driver: Flexibility in Performance
[Diagram: storage tiers. Tier labels: SAS drives, SATA drives, Off-line Data Archives. Data share labels: < 75% of your data, 100% of your data. Query share labels: 95% of your queries, 5% of your queries.]
Improve scan rates to leverage slower storage tiers = Downward Mobility
Disclaimer: Illustration purposes only!
21
Tip #4: Compress your Data
Impact: Cost and Performance
Compression in Oracle:
  “Data Warehouse” compression
    2-3x compression ratio
    Included in DB license
  “OLTP” compression
    3-4x compression ratio
    Database Option: Advanced Compression
  Exadata Hybrid Columnar Compression
    10-50x compression ratio
    Included in Oracle Exadata license
22
Tip #4: Compress your Data
Hybrid Columnar Compression
Data is grouped by column and then compressed
  Query Mode for data warehousing
    Optimized for speed
    10X compression typical
    Scans improve proportionally
  Archival Mode for infrequently accessed data
    Optimized to reduce space
    15X compression is typical
    Up to 50X for some data
23
Tip #4: Compress your Data
Usage Matrix
Apply compression at the Partition level or higher
Change compression over the lifetime of a Partition or Table
Both EHCC and DW compression will start to “decompress” when data is updated

  Workload                                                Preferred      Possible
  Bulk Load (write once, read many times)                 EHCC Query     DW
  Operational DW (write, update, delete, update, read)    OLTP           ---
  Archive (static data, read once in a while)             EHCC Archive   DW
24
Tip #4: Compress your Data
Applying Compression across Partitions
[Diagram: a timeline of partitions labeled Day 1, Day 2, Day 8, Day 9, Day 10, Month 7, Month 8, with compression bands labeled OLTP Compression, EHCC Query Mode and EHCC Archive Mode.]
25
Tip #4: Compress your Data
Improving performance and lowering cost
[Diagram: storage tiers. Tier labels: Memory, Flash Technology, SAS drives, SATA drives. Data share labels: < 30% of your data, 100% of your data. Query share labels: 75% of your queries, 25% of your queries.]
Move more data onto high performance storage tiers = Upward Mobility
Disclaimer: Illustration purposes only!
26
Software forces a Paradigm Shift
Applying software changes the balance
  Adding software into the mix fundamentally changes the way we use and think about storage
  Software-driven partitioning of data changes the cost per scanned TB in relation to total data volume
  Compression changes the cost per TB stored significantly
  Compression changes the cost per scanned TB in relation to non-compressed data
27
Compounding Returns
Less Storage, Better Performance
  10 TB of user data: requires 10 TB of IO
  1 TB with compression
  100 GB with partition pruning
  20 GB with Storage Indexes
  5 GB with Smart Scans
Sub-second response times
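The arithmetic behind this slide is worth spelling out: each technique multiplies the savings of the one before it. The short Python sketch below uses only the slide's figures; treating 1 TB as 1000 GB is an assumption, and the slide's numbers are illustrative rather than measured.

```python
GB_PER_TB = 1000  # assumed decimal units

# I/O volume after each step, in GB, using the figures from the slide.
steps = [
    ("user data", 10 * GB_PER_TB),
    ("with compression", 1 * GB_PER_TB),
    ("with partition pruning", 100),
    ("with Storage Indexes", 20),
    ("with Smart Scans", 5),
]

def reduction_factors(steps):
    """Return (step name, reduction factor versus the previous step)."""
    return [(name, prev / cur)
            for (_, prev), (name, cur) in zip(steps, steps[1:])]

factors = reduction_factors(steps)
overall = steps[0][1] / steps[-1][1]

for name, factor in factors:
    print(f"{name}: {factor:g}x less I/O")
print(f"overall: {overall:g}x less I/O")
```

The individual factors are 10x, 10x, 5x and 4x, which compound to a 2000x reduction from 10 TB to 5 GB.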
28
Conclusion
Software and Hardware Together are the Solution
  Building a hierarchy of storage solutions allows you to:
    Be more flexible
    Deliver better performance for lower cost
  Partitioning and compression are technologies that change the hardware status quo
    Partitioning allows slower HW to deliver better performance
    Compression allows faster HW to hold more data and deliver better performance
  With today’s technology you can improve performance while reducing storage cost
29
Q & A