Presentation is loading. Please wait.

Presentation is loading. Please wait.

© 2011 IBM Corporation June 22, 2016 Compression.

Similar presentations


Presentation on theme: "© 2011 IBM Corporation June 22, 2016 Compression."— Presentation transcript:

1 © 2011 IBM Corporation June 22, 2016 Compression

2 © 2011 IBM Corporation 2 Compression

3 © 2011 IBM Corporation 3 What Is Informix Compression?  Ability to store data rows in compressed format on disk.  Saves up to 90% of row storage space.  Ability to estimate a possible compression ratio.  Fits more data onto a page.  Fits more data into the buffer pool.  Reduces logical log usage.  AND – This feature may very well save our customers lots of money, by reducing the physical storage size of their data.

4 © 2011 IBM Corporation 4 What Is Storage Optimization?  Ability to consolidate free space in a table or fragment to the end.  Ability to return this free space to the dbspace.  Space returned can then be used by any table in the dbspace.

5 © 2011 IBM Corporation 5 Compression Concepts  Lempel-Ziv (LZ) based algorithm – static dictionary, built by random sampling.  Frequently repeating patterns replaced with 12-bit symbol numbers.  Any byte that does not match a pattern is also replaced with a 12-bit reserved symbol number.  Patterns can be up to 15 bytes long.  Max possible compression = 90% (15 bytes replaced with 1.5 bytes = 12 bits).

6 © 2011 IBM Corporation 6 Compression Symbols  12-bits means 4,096 symbols: –256 reserved symbols for bytes that match no pattern. –3,840 pattern symbols.  Patterns > 7 bytes use up two symbol numbers.  Thus not all patterns can be compressed.  Dictionary tries to capture the “best” patterns (frequency x length).  Non-matching bytes grow by 50% (8 bits replaced by 12 bits).

7 © 2011 IBM Corporation 7 Data Affects on Compression  Data with frequently repeating long patterns is the most compressible: –Long runs of 0’s or blanks are very compressible.  Noise-like data is poorly or not at all compressible: –Encrypted data. –Data already compressed by another algorithm. –Data without long repeating patterns.  Avoid putting a “noise-like” column between other columns that have frequent patterns – disrupts potential column-spanning patterns.

8 © 2011 IBM Corporation 8 Performance Impact of Compression  IO-bound workloads: –Compression may improve performance by reducing IOs (both data page and logical log). –More data fits on a page, so more in buffer pool. –Log records are smaller. So there is less logging.  For CPU-bound workloads: –Additional CPU used to compress and expand rows. –Should not be a large impact.  Backups of compressed objects will take less time.

9 © 2011 IBM Corporation 9 HDR, ER, CDC (DataMirror) and Compression  All are supported on compressed tables.  HDR, SDS, RSS –Tables will be compressed on the secondary if they are compressed on the primary. All compression initialization occurs only on the primary.  ER –Compression status of tables is independent between source and target, specified by DBA.  CDC –Compression of targets is a function of what the target database supports and what the use specifies.

10 © 2011 IBM Corporation 10 When is Data Compressed/Decompressed?  Compressed on insert: –It is not compressed at the client, over the wire or within the SQL layer.  The page data is decompressed at the time of retrieval.

11 © 2011 IBM Corporation 11 Things That Cannot Be Compressed  Out-of-row data (e.g. blobs).  Indexes.  Tables with less than 2000 rows.  Catalog tables.  Temp tables.  Partition tables.  Dictionary tables.  Tables in the following databases: sysuser,sysmaster,sysutils,syscdr,syscdcv1.

12 © 2011 IBM Corporation 12 Compression and Storage Optimization  The green shading in the figure below depicts un-compressed table data.  A COMPRESS operation creates a compression dictionary (if there is none) and compresses all the records that exist within the table page. All new data moved into the table will also be compressed.  REPACK consolidates the data on the page.  SHRINK consolidates all the packed data from multiple pages and releases the empty datapages that was obtained via consolidation back to the dbspace for further use.

13 © 2011 IBM Corporation 13 OAT Compression - General  Compression and Storage Optimization can be managed via the OAT graphical interface.

14 © 2011 IBM Corporation 14 Navigation to the Compression Interface  From the main menu of the Open Admin Tool: –Space Administration  Storage  Instance_name  Tables and Indexes  Database_name –Click Actions and the Tables and Indexes window opens up ….

15 © 2011 IBM Corporation 15 OAT – Table and Indexes Interface (1)  The Tables and Indexes Screen shows a list of available tables and their storage status. Usage in green is available storage, Usage in red is used storage. The display has been enhanced to reflect extent counts. Highlight a table and click the check box. Click Actions ……

16 © 2011 IBM Corporation 16 OAT – Table and Indexes Interface (2) ..and two choices appear on the drop down menu here. Click Optimize Space and a set of options will appear for you to examine….

17 © 2011 IBM Corporation 17 OAT – Table and Indexes Interface (3)  The Optimize Space wizard allows you to Compress, Uncompress, Repack, Shrink, or Defragment Extents. Click the operation(s) you want to perform on as many tables as appear in the Tables and Fragments Selected window. Click OK at bottom to start.

18 © 2011 IBM Corporation 18 OAT – Task Status Interface  The Task Status log records the results of compress, uncompress, repack, shrink and defragmentation operations on tables and fragments of tables.

19 © 2011 IBM Corporation 19 OAT – Server Optimization Policies  Click the check box under Policies, and enable the screen for data entry and changes. Clicking each operation allows policy changes for thresholds under Value for Compress, Repack, Shrink and Defragment operations. A scheduler accompanies the operations ……. Click Update Policies at bottom to change.

20 © 2011 IBM Corporation 20 OAT – Estimate Compression Operations  To estimate compression savings on a table: –Move the mouse over a table’s Space Usage in Tables and Indexes. –A yellow pop up box showing the estimated compression appears. –Check the box next to the table name. –On the Actions tab above, click Update Data. –A pop up window should appear showing the status of the operation. –Pink storage color should appear in the Space Usage column. If it does not, refresh the screen.

21 © 2011 IBM Corporation 21 Compress a Table (1)  Here is a table called itm_txn, with 111128 records on 3864 pages. What does it look like compressed……?

22 © 2011 IBM Corporation 22 Compress a Table (2)  Click the button next to Actions and a drop down menu appears. Click Optimize Space ……

23 © 2011 IBM Corporation 23 Compress a Table (3)  Check the Compress option below. Click OK when done.  Compression – the compression of table data pages

24 © 2011 IBM Corporation 24 Compress a Table (4)  The new page size of the compressed table is shown: 3864 pages after compression (compressed column for the table is checked in green). The data has been compressed within the original allocated table pages.

25 © 2011 IBM Corporation 25 Repack a Table  Sometimes, compression operations leave partial unfilled pages that need to be moved. Repack them to make partial pages full.  Repack moves data to the beginning of the table data pages, freeing those pages at the end. You can compress, repack and shrink in any combination simultaneously.

26 © 2011 IBM Corporation 26 Shrink a Table (1)  Shrink operations are performed in OAT similar to Compress and Repack; Click the box for Shrink and the Click OK at the bottom….. Shrink takes unused allocated table data pages at the end of the table and returns them to the dbspace for use by another object.

27 © 2011 IBM Corporation 27 Shrink a Table (2)  Partial pages were removed, and the table size shrank to 1213 pages a 68.61% storage reduction with this data pages set…. your mileage will vary.

28 © 2011 IBM Corporation 28 Decompress a Table  The data, once compressed, should decompress to the same or very close to the original size (3863 pages out of 3864 here), just in case you may want to move the data to another O/S, for example.

29 © 2011 IBM Corporation 29 Compression within OAT  Compression operations: –Compression – the compression of table data pages.  Decompression operation –Decompress – the decompression of table data pages: Do this when moving data across O/S, for example.  Estimate Compression –An estimated compression ratio of the total table storage occurs: Colorized in pink on the screen. Within popup text boxes when the cursor passes over the OAT Compression column.

30 © 2011 IBM Corporation 30 Server Optimization within OAT  Repack – moves freed table data pages to the end of the table or fragment.  Shrink – returns any freed table or fragment space to the dbspace.  Defragment – any table with multiple extents will have their extent counts reduced online, potentially to 1 in many cases.

31 © 2011 IBM Corporation 31 Dictionary Storage (1)  Each compressed table or table fragment has its own compression dictionary.  The dictionary consumes ~75K – 100K per fragment.  Thus compressing tiny tables is not recommended.  The union of all dictionary tables for all dbspaces is in sysmaster: –syscompdicts_full table – includes binary dictionary; access restricted to user “informix”. ** –syscompdicts view – globally accessible; omits binary dictionary for security. ** See Appendix

32 © 2011 IBM Corporation 32 Dictionary Storage (2)  All dictionaries for the tables/fragments in a given dbspace are stored in a special hidden dictionary table in that dbspace, which can be seen via oncheck –pe:

33 © 2011 IBM Corporation 33 Admin API Interface – Compression Operations  All compression and storage optimization operations are invoked via the Informix Admin API built-in UDRs: –execute function task(…); –execute function admin(…);  Example: –execute function task(“table compress repack shrink”, “table_name”, “database_name”, “owner_name”);  Enables remote execution (DBA does not need to log directly into the target machine).

34 © 2011 IBM Corporation 34 Admin API Compression Operations – Create Dictionary  create_dictionary –Creates compression dictionary. –Any rows inserted or updated after will be compressed. –Previously existing rows will not be compressed. –Example: execute function task(“table create_dictionary”, “test_data”, “stores”, “root”);

35 © 2011 IBM Corporation 35 Admin API Compression Operations – compress  compress –Does an implicit create_dictionary if no dictionary exists. –Compresses all previously existing rows. –Table fully accessible to other queries. –Example: execute function task(“table compress”, “test_data”, “stores”, ”root”);

36 © 2011 IBM Corporation 36 Admin API Compression Operations – estimate compression  estimate_compression –Estimates the compression ratio for a brand-new dictionary. –If already compressed, estimates current compression ratio (else 0). –Also shows the estimated possible gain by making a new dictionary (difference between first and second estimates). –Example: execute function task(“table estimate_compression”, “test_data”, “stores”, “root”);

37 © 2011 IBM Corporation 37 Admin API Compression Operations – uncompress  uncompress, uncompress_offline –Uncompresses every row in the table/fragment. –Deactivates the compression dictionary. –“uncompress” – table is fully accessible. –“uncompress_offline” – table is XLOCKed, no query access. –Example: execute function task(“table uncompress”, “test_data”, “stores”, “root”);

38 © 2011 IBM Corporation 38 Admin API Compression Operations – purge dictionary  purge_dictionary –Delete old inactive dictionaries. –Separate command because ER, DataMirror might need old dictionaries. –Example: execute function task(“table purge_dictionary”, “test_data”, “stores”, “root”);

39 © 2011 IBM Corporation 39 Admin API Storage Optimization Operations - repack  repack, repack_offline –Move rows within a table or fragment to consolidate free space at the end. –“repack” – table is fully accessible. –“repack_offline” – table is XLOCKed, no query access. –Example: execute function task(“table repack”, “test_data”, “stores”, “root”);

40 © 2011 IBM Corporation 40 Admin API Storage Optimization Operations - shrink  shrink –Returns any free space at end of table or fragment to the dbspace. –Normally done after a repack. –Frees entire extents. Last extent only partially. –Example: execute function task(“table shrink”, “test_data”, “stores”, “root”);

41 © 2011 IBM Corporation 41 Admin API Fragment operations  Table fragments may also be compressed or decompressed: –You must supply a partition number of the fragment you want to conduct the operations on. –execute function task("fragment uncompress","14680071");

42 © 2011 IBM Corporation 42 Auto Compress/Repack/Shrink and Defrag (1)  Beginning in 11.7, the purpose of this task, called auto_crsd, is to automatically compress, repack, shrink and defragment tables / fragments in the database server that meet qualifying criteria.  Each operation can be enabled, disabled and configured via specific entries in the sysadmin:ph_threshold table.  Task can be executed thru the Open Admin Tool as well.

43 © 2011 IBM Corporation 43 Auto Compress/Repack/Shrink & Defrag (2)  By default this task is disabled. To enable the task, simply update the sysadmin:ph_task table and set the tk_enable column to ‘T'.  Example: DATABASE sysadmin; UPDATE ph_task SET tk_enable = ‘ T' WHERE tk_name = 'auto_crsd';

44 © 2011 IBM Corporation 44 Auto Compress/Repack/Shrink and Defrag (3)  Compress  To enable auto compression, update the table sysadmin:ph_threshold and change the value to ‘T' for the AUTOCOMPRESS_ENABLED row. DATABASE sysadmin; UPDATE ph_threshold SET value = ‘ T' WHERE name = 'AUTOCOMPRESS_ENABLED';  Default value = ‘F’

45 © 2011 IBM Corporation 45 Auto Compress/Repack/Shrink and Defrag (4)  Compress (cont'd)  The criteria for auto compressing a table/fragment is the number of rows. The default is 50000.  The name of this sysadmin:ph_threshold parameter is 'AUTOCOMPRESS_ROWS'  To change the value, update sysadmin:ph_threshold DATABASE sysadmin; UPDATE ph_threshold SET value = '10000' WHERE name = 'AUTOCOMPRESS_ROWS';

46 © 2011 IBM Corporation 46 Auto Compress/Repack/Shrink and Defrag (5)  Repack  To enable auto repack, update the sysadmin:ph_threshold table and change the value to ‘T' for the AUTOREPACK_ENABLED row. DATABASE sysadmin; UPDATE ph_threshold SET value = ‘ T' WHERE name = 'AUTOREPACK_ENABLED';  Default value is ‘F’

47 © 2011 IBM Corporation 47 Auto Compress/Repack/Shrink and Defrag (6)  Repack (cont'd)  The criteria for auto repacking a table/fragment is when the percentage of non-contiguous storage exceeds the user defined threshold. The default is 50.  The name of this sysadmin:ph_threshold parameter is 'AUTOREPACK_SPACE'.  To change the percentage value, update sysadmin:ph_threshold. DATABASE sysadmin; UPDATE ph_threshold SET value = '90' WHERE name = 'AUTOREPACK_SPACE';

48 © 2011 IBM Corporation 48 Auto Compress/Repack/Shrink and Defrag (7)  Shrink  To enable auto shrink, update the sysadmin:ph_threshold table and change the value to ‘T' for the AUTOSHRINK_ENABLED row.  DATABASE sysadmin; UPDATE ph_threshold SET value = ‘ T' WHERE name = 'AUTOSHRINK_ENABLED';  Default is ‘F’

49 © 2011 IBM Corporation 49 Auto Compress/Repack/Shrink and Defrag (8)  Shrink (cont'd)  The criteria for auto shrinking a table/fragment is configured as the percentage of unused allocated space. The default is 50.  The name of this sysadmin:ph_threshold parameter is 'AUTOSHRINK_UNUSED'.  To change the percentage value, update sysadmin:ph_threshold DATABASE sysadmin; UPDATE ph_threshold SET value = '90' WHERE name = 'AUTOSHRINK_UNUSED';

50 © 2011 IBM Corporation 50 Auto Compress/Repack/Shrink and Defrag (9)  Defrag  To enable auto defrag, update the sysadmin:ph_threshold table and change the value to ‘T' for the AUTODEFRAG_ENABLED row. DATABASE sysadmin; UPDATE ph_threshold SET value = ‘ T' WHERE name = 'AUTODEFRAG_ENABLED';  The default is ‘F’

51 © 2011 IBM Corporation 51 Auto Compress/Repack/Shrink and Defrag (10)  Defrag (cont'd)  The criteria for auto defragging a table/fragment is when the number of extents exceeds the user defined value for sysadmin:ph_threshold threshold.  The name of this sysadmin:ph_threshold parameter is 'AUTODEFRAG_EXTENTS'.  To change the value update sysadmin:ph_threshold DATABASE sysadmin; UPDATE ph_threshold SET value = '100' WHERE name = 'AUTODEFRAG_EXTENTS';  Default is 100.

52 © 2011 IBM Corporation 52 Onstat –g ppd partnum – Table Compression Information  Partnum – Table partition number.  Version -- Version of the compression dictionary format (dictionary code). Always 1 in the first release. If we ever change a dictionary format, we would increment this.  DbsNum -- Dbspace number the dictionary is associated with.  CrTS -- Timestamp (# of seconds since Jan 1, 1970) since dictionary creation.  CrLogID -- Unique logical log ID of the log position at the time of dictionary creation.  CrLogPos -- Position within log CrLogID at the time of dictionary creation.  DrTS -- Timestamp (# of seconds since Jan 1, 1970) of the drop of this dictionary. This is NULL if the dictionary is still active.  DrLogID -- Unique logical log ID of the log position at the time of the dictionary drop. This is NULL if the dictionary is still active.  DrLogPos -- Position within log DrLogID at the time of the dictionary drop. This is NULL if the dictionary is still active.

53 © 2011 IBM Corporation 53 Onstat –g ppd partnum – OAT Table Compression Information  Partnum – Table partition number.  Version -- Version of the compression dictionary format (dictionary code). Always 1 in the first release. If we ever change a dictionary format, we would increment this.  DbsNum -- Dbspace number the dictionary is associated with.  CrTS -- Timestamp (# of seconds since Jan 1, 1970) since dictionary creation.  CrLogID -- Unique logical log ID of the log position at the time of dictionary creation.  CrLogPos -- Position within log CrLogID at the time of dictionary creation.  DrTS -- Timestamp (# of seconds since Jan 1, 1970) of the drop of this dictionary. This is NULL if the dictionary is still active.  DrLogID -- Unique logical log ID of the log position at the time of the dictionary drop. This is NULL if the dictionary is still active.  DrLogPos -- Position within log DrLogID at the time of the dictionary drop. This is NULL if the dictionary is still active.

54 © 2011 IBM Corporation 54 Onstat –g dsk – Compression Information (1)  This onstat presents information on active compression operations, shown here as onstat –gr dsk.

55 © 2011 IBM Corporation 55 Onstat –g dsk – Compression Information (2)  Column Headers  partnum – Partition number we are operating on.  op - Operation type: – RSC_CREATE_DICTIONARY = 0x00000001, /* “create dictionary" flag */ – RSC_COMPRESS = 0x00000002, /* "compress" flag */ – RSC_REPACK = 0x00000004, /* "repack" flag */ – RSC_REPACK_OFFLINE = 0x00000008, /* "repack_offline" flag */ – RSC_SHRINK = 0x00000010, /* "shrink" flag */ – RSC_UNCOMPRESS = 0x00000020, /* "uncompress" flag */ – RSC_UNCOMPRESS_OFFLINE = 0x00000040, /* "uncompress_offline" flag */ – RSC_ESTIMATE_COMPRESSION = 0x00000080,/* "estimate_compression" flag */ – RSC_PURGE_DICTIONARY = 0x00000100, /* "purge_dictionary" flag */  processed - How many rows/pages processed.  cur_page --> Current page we are operating on.  duration --> How long the operation has been running.  table ---> Table name.

56 © 2011 IBM Corporation 56 Onstat –g dsk – OAT Compression Information  OAT has the ability to run the onstats as well, in this case onstat –g dsk

57 © 2011 IBM Corporation 57 Sysmaster:Sysstoragemgr – Compression Status  Select * from sysstoragemgr  Run time status of: –Compress –Repack –Shrink –Uncompress –Estimate compression –Create Dictionary –Combinations of the above.

58 © 2011 IBM Corporation 58 Oncheck –pt – Compression Information  Compression dictionary summary at the bottom (if the table is compressed, at top right).

59 © 2011 IBM Corporation 59 Oncheck –pT – Compression Information  Compression information in two places:

60 © 2011 IBM Corporation 60 Message Log File  The message log file has been greatly enhanced to reflect the status of compression operations as they occur:

61 © 2011 IBM Corporation 61 Tuning – Whole Table Compression Operations  Physical Log very large.  Large Logical Logs and a lot of them: –Compression on whole tables uses a lot of logical log space.  Parallel compression operations possible: –Machines > 1 CPU. –Fragment tables on single CPU machines: Single fragment nptotal < BUFFERPOOL. Not too many tables at once – possible backup on CPU.  Heavy BUFFERPOOL buffers: –Excess memory should be available in SHMVIRTSIZE. –SHMADD minimum 4x default.

62 © 2011 IBM Corporation 62 Compression Estimator Tool  Downloadable – http://www-01.ibm.com/software/data/informix/compressionhttp://www-01.ibm.com/software/data/informix/compression

63 © 2011 IBM Corporation 63 References  One telecommunications provider obtained 50%+ storage improvements on a system with a mandated five year data retention storage requirement. This saved the customer huge money in hardware purchases no longer required, speeded up reporting by 30% and reduced database backup times by 50%.  A water distribution utility utilized compression with storage savings of 70%. Processing time on batch queries dropped by 50%. Overall performance on the system increased by 50%.

64 © 2011 IBM Corporation 64 Summary  Compression and Storage Optimization can save disk space and thus $$$.  For I/O-bound workloads, compression can also improve performance.  Compression reduces logging.  Compression fits more data into the buffer pool.  Storage Optimization allows extent space freed by the repack and shrink operations to be reclaimed from tables and fragments of tables.

65 © 2011 IBM Corporation 65 Questions?

66 © 2011 IBM Corporation 66 Logo


Download ppt "© 2011 IBM Corporation June 22, 2016 Compression."

Similar presentations


Ads by Google