Oracle Database Compression with Oracle Database 12c Kevin Jernigan Senior Director Product Management System Technologies, Oracle
The following is intended to outline our general product direction The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Program Agenda Data Growth Challenges New in Oracle Database 12c Compression Capabilities Additional Capabilities Benefits Summary
Data Growth Challenges IT must support fast-growing amounts of data Explosion in online access and content Government data retention regulations Performance often declines as data grows Automate the compression and movement of data Reduce storage costs Improve performance Need to manage data growth Without hurting performance Without growing costs With minimal administrative intervention
Advanced Compression Option Benefits Transparent: 100% Application Transparent Smaller: Reduces Footprint CapEx: Lowers server & storage costs for primary, standby, backup, test & dev databases … OpEx: Lowers heating, cooling, floor space costs … Additional ongoing savings over life of a database as database grows in size Faster: Transactional, Analytics, DW Greater speedup from in-memory: 3-10x more data fits in buffer cache & flash cache Faster queries Faster backup & restore speeds End-to-end Cost / Performance Benefits across CPU, DRAM, Flash, Disk & Network
New in Oracle Database 12c
Heat Map – visualize use of data Insight helps identify opportunity & drive automation Table Level Heat Map Partition Level Heat Map 101010101110101001101011100001010001011011101010100101001001000010001010101 Row Level Heat Map
Heat Map Heat Map of the Database High Performance Active Recently inserted, actively updated Frequent Access Infrequently updated, frequently queried for reporting Occasional Access Infrequent access for query and updates Dormant Retained for long term analytics and compliance with corporate policies and regulations Heat Map of the Database Database level Heat Map shows when tables and partitions are being used Block level Heat Map shows access at the row level Shows both Reads and Writes Distinguishes Row access from Full Table Scans Shows Last Access as well as Histogram of access over time High Performance “In-Memory” technology Very low CPU overhead
Automatic Data Optimization As data ages: Activity declines Volume grows Older data primarily for reporting Compliance & Reporting Reporting OLTP 10x compressed 15x compressed alter table … add policy … compress for query after 3 months of no modification … compress for archive after 1 year of no modification This Quarter This Year Prior Years Row Store for fast OLTP Compressed Column Store for fast analytics Archive Compressed Column Store for max compression As data cools down, Automatic Data Optimization automatically converts data to columnar compressed Online
Automatic Data Optimization (ADO) Declarative Policy Specification: Condition Action ALTER TABLE employee ILM ADD POLICY ROW STORE COMPRESS ADVANCED ROW AFTER 3 DAYS OF NO MODIFICATION; Conditions are time period after creation, access, modification of data Actions can be Compression Tiering or Tablespace Tiering Policies are inherited from the tablespace or table New tables inherit from tablespace; can also be applied to existing tables New partitions (including interval partitions) inherit from table Immediate and background policy execution Row level policies are executed periodically Users can configure the frequency of execution Segment level policies are executed in maintenance windows Policies can be extended to incorporate Business Rules Users can add custom conditions to control placement e.g. 3 months after the ship date of an order
Automatic Data Optimization Simple Declarative SQL extension ALTER TABLE sales ILM add… Active Frequent Access Occasional Dormant OLTP Compressed (2x-4x) Affects ONLY Candidate Rows Cached in DRAM & FLASH compress for Advanced Row after 2 days of no update Warehouse Compressed (6x-10x) High Performance Storage compress for Query Low after 1 week of no update Low Cost Storage tier to SATA Tablespace Archive Compressed (10x-15x) Archival Storage compress for Archive High after 6 months no access
Advanced Network Compression Compresses network data to be transmitted at the sending side and then uncompressed at the receiver Reduces the size of the session data unit (SDU) transmitted over a data connection Reducing the size of data reduces the time required to transmit the SDU On narrow bandwidth connections, with faster CPU, it can significantly improve performance SQL query response becomes faster due to the reduced transmission time
Flashback Data Archive History Tables Flashback Data Archive - FDA Transparently tracks changes when they are made Tracked in separate history tables History tables are secure, tamper-proof FDA History Table Optimization Advanced Row Compression Advanced LOB Compression and Deduplication Compression Tiering Use Cases Governance / regulatory / compliance (GRC) Substitute for app development Error correction Replacement for CDC and DW
Additional ACO Capabilities Storage Snapshot Optimization Storage snapshots can be recovered in one step No need to put database in online backup mode Supports full recovery or point-in-time Online Move Partition (any compression level) ALTER TABLE ... MOVE PARTITION ONLINE enables non-blocking online DDL DML operations continue to run uninterrupted on the partition that is being moved Supports all compression formats - BASIC, ACO or HCC Oracle Database Enterprise Edition Data Partitioning Option Advanced Compression Option Hybrid Columnar Compression 2x to 4x Compression Typical Active Less Active Historical
Compression Capabilities
Compression Techniques COMPRESSION TYPE: SUITABLE FOR: Basic Compression Read only tables and partitions in Data Warehouse environments or “inactive” data partitions in OLTP environments Advanced Row Compression Active tables and partitions in OLTP and Data Warehouse environments Advanced LOB Compression and Deduplication Non-relational data in OLTP and Data Warehouse environments Advanced Network Compression and Data Guard Redo Transport Compression All environments RMAN/Data Pump Backup Compression Index Key Compression Indexes on tables for OLTP and Data Warehouse Hybrid Columnar Compression – Warehouse Level Read only tables and partitions in Data Warehouse environments Hybrid Columnar Compression – Archive Level “Inactive” data partitions in OLTP and Data Warehousing environments Basic Compression Basic compression, first made available with Oracle9i Database Enterprise Edition, is used to compress read-only data that may be found in a data warehouse, or in the non-active partitions of a large partitioned table underneath OLTP applications. The data is compressed as it is loaded into the table (or moved into the partition), with typically around a 2 to 4 times compression ratio achieved. SecureFiles Compression The Advanced Compression option of Oracle Database 11g also supports compression of data stored as large objects (LOBs) in the database. This is especially important for environments that mix structured data with unstructured data underneath rich applications – for example, XML, spatial, audio, image or video information stored in Oracle Database 11g can also be compressed. Compression ratios are dependent on the type of media being compressed, however three levels of compression are provided to allow the desired level of compression to be achieved based on available CPU resources Index Compression Oracle Database 11g not only compresses row data, but also compresses the indexes associated with these rows. Indexes can be compressed independent of whether the underlying table data is compressed or not. Index compression can significantly reduce the storage associated with large indexes in large database environments. Backup Compression Oracle Database 11g can also compress backups of data stored in the rows and indexes in databases, independent of whether the data in these rows or indexes have in turn been compressed.
ACO Compression Capabilities Advanced Row Compression Compress data partitions/tables Advanced LOB Compression Advanced LOB Deduplication Compress and deduplicate unstructured data RMAN/DataPump Compression Compress backups Data Guard Redo Transport Compression Advanced Network Compression Compress network traffic Flashback Data Archive History Table Oracle Database Enterprise Edition Data Partitioning Option Advanced Compression Option Flashback Data Archive History table 2x to 4x Compression Typical Active Less Active Historical 18
Advanced Row Compression Partition/table/tablespace data compression Support for conventional DML Operations (INSERT, UPDATE) 2x to 4x compression ratio’s typical Significantly eliminates/reduces write overhead of DML’s Batched compression minimizes impact on transaction performance “Database-Aware” compression Does not require data to be uncompressed – keeps data compressed in memory Reads often see improved performance due to fewer I/Os and enhanced memory efficiency
Real World Compression Results - ERP Database 10 Largest Tables Storage Utilization MB 3x Smaller Table Scan Performance (seconds) Time 2.5x Faster DML Performance Less than 3% Overhead
Why Hybrid Columnar Compression Traditionally, data has been organized in a row format Alternative approach: store data in a column format Storing column data together dramatically increases compression Exadata storage optimized to maximize query performance on HCC-compressed tables Takes advantage of the processing power, memory and Infiniband network that are integral parts of the Exadata storage server Hybrid Columnar Compression extended to Pillar Axiom and Sun ZFS Storage Appliance (ZFSSA) storage
About Hybrid Columnar Compression Hybrid Columnar Compressed Tables Compressed tables can be modified using conventional DML operations Useful for data that is bulk loaded and queried How it Works Tables are organized into Compression Units (CUs) CUs are multiple database blocks Within Compression Unit, data is organized by column instead of by row Column organization brings similar values close together, enhancing compression Compression Unit Column 1 Column 2 Column 3 Column 4 Column 5 10x to 15x Reduction
Logical Compression Unit Compression Units Compression Unit Logical structure spanning multiple database blocks Data organized by column during data load Each column compressed separately All column data for a set of rows stored in compression unit CU HEADER BLOCK HEADER C3 C4 C1 C2 C7 C5 C6 C8 Logical Compression Unit
Query and Archive Compression Query Compression 10x average storage savings 10x reduction in Scan IO Archive Compression 15x average storage savings Up to 70x on some data For cold or historical data Optimized for Speed on Exadata Optimized for Space Smaller Warehouse Faster Performance Reclaim 93% of Disks Keep Data Online
Query Compression Levels Query Compression provides two levels of compression HIGH typically provides a 10x reduction in storage LOW typically provides a 6x reduction Optimized to increase scan query performance Default is HIGH Maximizes storage savings and query performance benefits
Archive Compression Archive Compression provides two levels of compression HIGH typically provides a 15x reduction in storage LOW typically provides a 10x reduction Best approach for Information Lifecycle Management (ILM) and data archival Minimum storage footprint No need to move data to tape or less expensive disks Data is always online and always accessible Intended for tables or partitions that store data that is rarely accessed Run queries against historical data (without recovering from tape) Update historical data Supports schema evolution (add/drop columns)
Customer Compression Examples Data Warehouse Customers (Warehouse Compression) Top Financial Services 1: 11x Top Financial Services 2: 24x Top Financial Services 3: 18x Top Telco 1: 8x Top Telco 2: 14x Top Telco 3: 6x Scientific Data Customer (Archive Compression) Top R&D customer (with PBs of data): 28x OLTP Archive Customer (Archive Compression) SAP R/3 Application, Top Global Retailer: 28x Oracle E-Business Suite, Oracle Corp.: 23x Custom Call Center Application, Top Telco: 15x
Other Compression Capabilities
Advanced LOB Compression/Deduplication LOBS typically experience a reduction of 2x to 3x times in size Automatically avoids compressing data that would not benefit from compression Useful for content management, email applications and data archival applications No adverse impact on read operations Often improves read performance for cache data Enables storage of a single physical image for duplicate data Significantly reduces space consumption Dramatically improves writes and copy operations
DataPump Compression Metadata compression since Oracle Database 10g Compression for table data during exports No need to decompress before import Single step compression of both data and metadata Compressed data directly hits disk resulting in reduced disk space requirements Internal tests reduced dump file size up to 75% Application transparent Complete DataPump functionality available on compressed files
RMAN Backup Compression Dramatically reduce the storage requirements for backup data Backup data is compressed before it is written to disk or tape and doesn’t need to be uncompressed before recovery Three levels of RMAN Compression: LOW, MEDIUM, and HIGH Compression Level LOW Best suited when backup is constrained by CPU Compression Level MEDIUM Balance between CPU usage and compression ratio Compression LEVEL HIGH Best compression ratio and highest CPU utilization Best suited when backup is constrained by network or I/O
Data Guard Redo Transport Compression Compress network traffic between primary and standby databases Lower bandwidth networks (<100Mbps) 15-35% less time required to transmit 1 GB of data Bandwidth consumption reduced up to 35% High bandwidth networks (>100 Mbps) Compression will not reduce transmission time But will reduce bandwidth consumption up to 35%
Oracle Database Compression Benefits Transparent: 100% Application Transparent Smaller: Reduces Footprint Faster: Transactional, Analytics, DW End-to-end Cost / Performance Benefits across CPU, DRAM, Flash, Disk & Network New in Oracle Database 12c Automation: Heat Map and ADO Network compression Additional storage optimizations
Questions?