Presentation is loading. Please wait.

Presentation is loading. Please wait.

Compression and Storage Optimization IDS xC4 Kevin Cherkauer

Similar presentations


Presentation on theme: "Compression and Storage Optimization IDS xC4 Kevin Cherkauer"— Presentation transcript:

1 Compression and Storage Optimization IDS 11.50.xC4 Kevin Cherkauer

2 What Is Compression? Ability to store data rows in compressed format on disk Saves up to 90% of row storage space Ability to estimate possible compression ratio Fits more data onto a page Fits more data into buffer pool Reduces logical log usage

3 What Is Storage Optimization?
Ability to consolidate free space in a table or fragment to the end Ability to return this free space to the dbspace Space returned can then be used by any table in the dbspace

4 Compression Concepts Lempel-Ziv (LZ) based algorithm – static dictionary, built by random sampling Frequently repeating patterns replaced with 12-bit symbol numbers Any byte that does not match a pattern is also replaced with a 12-bit reserved symbol number Patterns can be up to 15 bytes long Max possible compression = 90% (15 bytes replaced with 1.5 bytes = 12 bits)

5 Compression Symbols 12-bits means 4,096 symbols
256 reserved symbols for bytes that match no pattern 3,840 pattern symbols Patterns > 7 bytes use up two symbol numbers Thus not all patterns can be compressed Dictionary tries to capture the “best” patterns (frequency x length) Non-matching bytes grow by 50% (8 bits replaced by 12 bits)

6 Data Affects on Compression
Data with frequently repeating long patterns is the most compressible Long runs of 0’s or blanks are very compressible Noise-like data is poorly or not at all compressible: Encrypted data Data already compressed by another algorithm Data without long repeating patterns Avoid putting a “noise-like” column between other columns that have frequent patterns – disrupts potential column-spanning patterns

7 Performance Impact of Compression
IO-bound workloads Compression may improve performance by reducing IOs (both data page and logical log) More data fits on a page, so more in buffer pool Log records are smaller, so less logging For CPU-bound workloads Additional CPU used to compress and expand rows Should not be a large impact

8 HDR, ER, CDC (DataMirror) and Compression
All are supported on compressed tables HDR Tables will be compressed on secondary iff they are compressed on primary ER Compression status of tables is independent between source and target, specified by user CDC Compression of targets is a function of what the target database supports and what use specifies

9 OAT Interface Compression and Storage Optimization can be managed via the OAT graphical interface Details in talk following this one

10 Things That Cannot Be Compressed
Out-of-row data (e.g. blobs) Indexes Catalog tables Temp tables Partition tables Dictionary tables Tables in the following databases: sysuser,sysmaster,sysutils,syscdr,syscdcv1

11 Dictionary Storage Each compressed (non-fragmented) table or table fragment has its own compression dictionary Dictionary consumes ~75K – 100K per fragment Thus compressing tiny tables is not recommended All dictionaries for tables/fragments in a given dbspace are stored in a special hidden dictionary table in that dbspace Union of all dictionary tables exposed by sysmaster syscompdicts_full table – includes binary dictionary; access restricted to user “informix” syscompdicts view – globally accessible; omits binary dictionary for security

12 Compression Operations
create_dictionary Creates compression dictionary Any rows inserted or updated after will be compressed Previously existing rows will not be compressed compress Does implicit create_dictionary if no dictionary exists Compresses all previously existing rows Table fully accessible to other queries estimate_compression Estimate compression ratio a brand-new dictionary could get If already compressed, estimate current compression ratio (else 0) Also shows the estimated gain to be had by making a new dictionary (difference between first and second estimates)

13 Compression Operations 2
uncompress, uncompress_offline Uncompress every row in the table/fragment Deactivate the compression dictionary “uncompress” – table is fully accessible “uncompress_offline” – table is XLOCKed, no query access purge_dictionary Delete old inactive dictionaries Separate command because ER, DataMirror might need old dictionaries

14 Storage Optimization Operations
repack, repack_offline Move rows within a table or fragment to consolidate free space at the end “repack” – table is fully accessible “repack_offline” – table is XLOCKed, no query access shrink Return any free space at end of table or fragment to the dbspace Normally done after a repack

15 Admin API Interface All compression and storage optimization operations are invoked via the IDS Admin API built-in UDRs execute function task(…); execute function admin(…); Example execute function task(“table compress repack shrink”, “table_name”, “database_name”, “owner_name”); Enables OAT graphical interface Enables remote execution (DBA does not need to log directly in to the target machine)

16 Summary Compression and Storage Optimization can save disk space and thus $$$ For IO-bound workloads Compression can also improve performance Compression reduces logging Compression fits more data into the buffer pool Storage Optimization allows space saved by compression to be reclaimed from tables and fragments of tables

17 Thanks!


Download ppt "Compression and Storage Optimization IDS xC4 Kevin Cherkauer"

Similar presentations


Ads by Google