1
OPS-8: Effective OpenEdge® Database Configuration
Richard Shulman Principal Support Engineer
2
Agenda Performance The Physical Layout Other Considerations
3
Performance What performance gains could I expect:
By just moving to OpenEdge
By going to Type I Storage Areas
By going to Type II Storage Areas
NOTE: YMMV (Your Mileage May Vary)
Just by moving to OpenEdge there are many improvements to the commands, the potential size of the database, and many characteristics of client performance. Type I Storage Areas make it easier to align the record sizes of an area so the space within each database block is used appropriately. Type II Storage Areas add the ability to organize blocks of similar data together, improving disk throughput for reads and writes in large sequential disk operations. As the slide says, YMMV.
4
Performance Real Customer Experience: Manufacturing Industry
The challenge was long-running major processes:
Customer Statement process was taking over 25 minutes each (interactive process)
Nightly MRP was taking over 4 hours
5
Real World Results (chart: Customer Statement run times, in minutes)
In this example the customer migrated their old database to a simple layout with everything in the Schema Area, yielding just over 25 minutes for the run. Moving to a separate Type I area dropped the time to about 15 minutes. Upgrading to OpenEdge dropped it to about 11 minutes. Re-architecting the database using Type II areas dropped it to about 4 minutes.
6
Real World Results (chart: nightly MRP run times, in minutes)
In this example the customer migrated their old database to a simple layout with everything in the Schema Area, yielding just over 4 hours for the nightly MRP run. Moving to a separate Type I area dropped the time to about 3 hours. Upgrading to OpenEdge dropped it to about 140 minutes (2 hours 20 minutes). Re-architecting the database using Type II areas dropped it to about 80 minutes.
7
Real World Results Why so big of a difference?
Is this typical of what I can expect? How fast can I get there?
For the timings discussed on the previous slides: the v9 baseline was a dump and load into a v9 Schema Area with standard records per block, with all data initially contained in the Schema Area. The move from the Schema Area to a separate Type I area dropped the times partly by allowing a different records-per-block definition. The shift from v9 to OpenEdge gave initial improvements from some of the engine's algorithms. The shift from Type I to Type II added the capabilities of clusters and improved data access due to the contiguity of blocks.
Is this typical? The percentages of improvement may not always be this great, but improvements are expected from each of these steps. For the greatest improvements, the typical course requires a dump and load into new areas. That is the limiter on how fast you can get there.
8
Agenda Performance The Physical Layout Other Considerations
9
Storage Areas? No adverse application effects
Physical reorganization does NOT change the application
Object location is abstracted from the language by an internal mapping layer
Different physical deployments can run with the same compiled r-code
10
Determining the Layout
What process is used to determine what the new layout should be?
Run a database analysis (dbanalys) on the production database, or on a recent full copy of it. The analysis is only valid against the full set of data, which is what lets it determine the number of records, average record size, scatter factor, and so on. Alternatively, a sample piece of code was written to facilitate some of the calculations. The sample is not bulletproof but should work with most databases on most platforms. See the PSDN Website:
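A minimal command-line sketch of producing that analysis, assuming a database named mydb (proutil with the -C dbanalys qualifier is the standard utility; the output file name is arbitrary):

    proutil mydb -C dbanalys > mydb.dba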
11
Determining the Layout
Every layout could be different
Not every company uses the same application the same way or with the same options, so the analysis from one customer may yield different results than from another. Often, customers adapt functionality that doesn't exactly meet their needs, and the volume of data in some fields may be vastly different from what the developer planned for. Therefore variations in tables, fields, etc. can yield different record sizes, different usage of areas, different records per block, and so on.
12
Determining the Layout
Things to consider:
Is your application from a Partner, or do you maintain it yourself?
If it is from a Partner: Have you asked for their recommendations? Would they support you if you changed the layout?
Do you have the capacity (and the capability) to re-org your database?
Remember: if the application partner has any special coding that looks at the areas, or makes definition-file changes that depend on area information, you may need to work with that partner to accommodate your changes.
13
Determining the Layout
To do a layout…
Step 1: Run a dbanalys report against the full database (or against a copy of it, if you have one)
14
Determining the Layout
Step 1: The beginning…
Run a dbanalys report against the full database (or a copy of it). What we look for:
Large tables by record count
Large tables by raw storage size
Unused tables (no records)
What is considered large? Sort the analysis by record count; an Excel spreadsheet works well for this, or see the command-line sketch below. Large is relative and subject to much discussion, but if a table contains either 5% of the total records of your database or 5% of its total size, it is probably large in anyone's book.
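If you prefer the command line to a spreadsheet, a rough sketch for pulling the per-table lines out of the analysis and ranking them. It assumes a Unix-like shell, the mydb.dba output file from the earlier sketch, and application tables prefixed PUB.; the column that holds the record count can vary by version, so adjust the sort key as needed:

    # top 20 tables by the second column (record count in most dbanalys layouts)
    grep "PUB\." mydb.dba | sort -k2 -rn | head -20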
15
Determining the Layout
Step 2: Initial separation…
For each of the large tables, make a separate data area for the table and a separate index area for its indexes. This adds 2 new storage areas FOR EACH large table.
For the tables with no records, make one small storage area for the tables and a separate storage area for their indexes. This adds 2 new storage areas total.
Define separate areas for the tables with the biggest record counts, and define a separate index area per table (one area for all the indexes of that table). Put all tables which have no records into one combined area, with a separate index area as well. Though these tables have no data now, they may be used later, and this makes it easy to keep an eye on them. A structure-file sketch follows below.
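A sketch of what Step 2 could look like in a structure (.st) file, assuming a hypothetical large table named Order, an 8k-block database, and illustrative area numbers, paths, sizes, and rpb/cluster values. The d "Name":number,rpb;cluster syntax and # comment lines are standard .st notation; every name and number here is made up:

    # data area for the large Order table: one fixed extent plus a variable extent
    d "OrderData":10,64;512 /db/areas/orderdata.d1 f 1024000
    d "OrderData":10,64;512 /db/areas/orderdata.d2
    # matching index area (rpb is irrelevant for index-only areas)
    d "OrderIdx":11,1;64 /db/areas/orderidx.d1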
16
Determining the Layout
Step 3: "Special" tables…
Every application has some of these:
Control tables (e.g. country codes)
High create/destroy activity (e.g. batch report queue)
Make a separate storage area for all control tables and a separate area for their indexes. Make a separate storage area for all high create/destroy tables and a separate area for their indexes.
For control tables, define one area where all the control tables will be housed. It shouldn't need to be a big area, because control tables are typically small and usually static.
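Continuing the same hypothetical .st sketch for Step 3, with cluster sizes anticipating the cluster-setting guidance later in this section (all names and numbers are illustrative):

    # small, mostly static control tables
    d "Control":20,128;8 /db/areas/control.d1
    d "ControlIdx":21,1;8 /db/areas/controlidx.d1
    # high create/destroy tables; big clusters suit burst activity
    d "Churn":22,64;512 /db/areas/churn.d1
    d "ChurnIdx":23,1;64 /db/areas/churnidx.d1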
17
Determining the Layout
Step 4: The rest…
Group the remaining tables by mean record size into 32, 64, 128, and 256 records-per-block groups. Make a separate storage area for each grouping and a separate area for their indexes.
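And a sketch of Step 4 in the same hypothetical .st file, one area per records-per-block grouping (again, all names and numbers are illustrative):

    d "Data32":30,32;64 /db/areas/data32.d1
    d "Data64":31,64;64 /db/areas/data64.d1
    d "Data128":32,128;64 /db/areas/data128.d1
    d "Data256":33,256;64 /db/areas/data256.d1
    # one shared index area for these groupings, or split further if needed
    d "DataIdx":34,1;64 /db/areas/dataidx.d1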
18
Determining the Layout
How to select the records per block and cluster settings?
19
Determining the Layout
How to select the records per block, and why do I care?
Incorrect settings waste recid pointers and can cause internal block fragmentation (less critical in 10.1B and later due to the introduction of 64-bit recids)
You have approximately 3900 usable bytes in a 4k db block, and roughly twice that in an 8k db block
In any version of Progress / OpenEdge prior to 10.1B, a 31-bit limit exists on the number of records which may be stored in one area. The limit is 31 bits (about 2 billion) because one of the 32 bits is used by Progress to quickly signify whether the record has been deleted. 10.1B and later has a limit of 63 bits (one bit is still used to signify that the record has been deleted).
20
Determining the Layout
RECORD BLOCK SUMMARY FOR AREA "Employee": 7
(dbanalys excerpt showing Records, Size, Min/Max/Mean record size, Fragment Count, and Scatter Factor for PUB.Benefits, PUB.Department, PUB.Employee, and PUB.Family)
So, in a 4k db block we could put 100 Benefits records; in an 8k db block we could put 200. However, neither of these is an allowed value (records per block must be a power of 2). WHAT DO WE DO?
21
Determining the Layout
Do we choose based on performance or storage requirements?
Choose the higher rpb setting for better performance
Choose the lower rpb setting for better storage capacity (it conserves recid address space, so the area can grow larger)
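A worked sketch using the Benefits example from the previous slide (the 39-byte mean record size is inferred from the 100-records-per-4k-block figure):

    usable space in a 4k block   ~3900 bytes
    mean Benefits record size    ~39 bytes
    records that physically fit  3900 / 39 = ~100
    nearest allowed rpb values   64 and 128
    rpb 128: blocks fill completely (~100 records), best read performance,
             but ~28 recid slots per block can never be used
    rpb 64:  recid address space is conserved, so the area can hold roughly
             twice as many blocks, but ~1400 bytes per block sit empty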
22
Determining the Layout
How to select the cluster setting?
Tables: 64 or 512 block clusters
Indexes: 8 or 64 block clusters
Typically we set 64 blocks per cluster for data tables as a moderate value. For systems which either have a high record create or delete rate in short bursts of time, or run large reports where sequential record read operations are common, 512 blocks per cluster may be better. Similar rules exist for Type II areas which contain indexes, but use the lower values of 8 and 64 respectively.
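The cluster value is the number after the semicolon in the .st syntax shown earlier. A hypothetical contrast (all names and numbers illustrative):

    # moderate default for a data area
    d "InvData":40,64;64 /db/areas/invdata.d1
    # burst create/delete activity or large sequential reports
    d "HistData":41,64;512 /db/areas/histdata.d1
    # index areas use the lower values, 8 or 64
    d "InvIdx":42,1;8 /db/areas/invidx.d1
    d "HistIdx":43,1;64 /db/areas/histidx.d1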
23
Agenda Performance The Physical Layout Other Considerations
24
Other Considerations The physical layout
RAID considerations
Separating files
Fixed or Fixed/Floating Extents
Combining Databases?
RAID is almost always preferred to non-RAID. There are many reasons to continue the practice of separating files (if the disk layout permits): to maximize performance, recoverability, or the ability to monitor activity. The traditional recommendation of one variable (floating) extent per area is often unnecessary now that OpenEdge can add extents while the database is online. However, if there is no administrator constantly available to add extents online, it may still be advisable to keep one variable data extent per area, because the database will shut down if it runs out of space.
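A sketch of adding extents to a live database, which is what makes an all-fixed layout workable. The prostrct addonline qualifier is available in later OpenEdge releases (10.2B and later, as I understand it); mydb and the add.st contents are illustrative:

    # add.st: one new fixed extent and a variable safety extent for an existing area
    d "OrderData":10,64;512 /db/areas/orderdata.d3 f 1024000
    d "OrderData":10,64;512 /db/areas/orderdata.d4

    prostrct addonline mydb add.st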
25
Other Considerations The Database rules of normalization and the impact on performance
Index considerations: many single-component keys, or fewer multi-component keys?
Though the rules of normalization are great for reducing duplication of data, they can make crafting, and potentially running, reports less efficient. So long as queries are written to match the index key order, it is often better to have multi-component keys. If queries are written that do not include the initial fields of the index, then multiple single-key indexes may be more appropriate.
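A sketch of the key-order point in OpenEdge SQL, using a made-up table PUB.OrderHist with a made-up two-component index:

    CREATE INDEX OrderHistCustOrd ON PUB.OrderHist (CustNum, OrderNum);

    -- can use the index: the leading component is constrained
    SELECT * FROM PUB.OrderHist
     WHERE CustNum = 5 AND OrderNum > 1000;

    -- cannot bracket on this index: the leading component is missing,
    -- so a separate single-component index on OrderNum may be warranted
    SELECT * FROM PUB.OrderHist
     WHERE OrderNum > 1000;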
26
Other Considerations Impact to startup parameters
-B is the number of db blocks
Impact to other environments: not just Production is impacted
SQL: verify your SQL width values using dbtool, and don't forget to UPDATE STATISTICS
If, during modifications to improve performance, the database blocksize is changed, be aware that this changes the amount of memory the database will use when it is started, because the value of -B is in database blocks. Increasing the blocksize from 1k to 4k will mean more than 4 times the amount of memory is used, even though the same value of -B is used.
10.1C changes the default client temp-table blocksize to a new value of 4k. This can make a big difference in the space and memory used by the client application. In some cases it might be desirable to adjust -Bt to limit the number of blocks in memory and / or change -tmpbsize back to the older default.
To check whether UPDATE STATISTICS has ever been run on a database, look within a recent database analysis for _Systblstat, _Sysidxstat, and _Syscolstat. If these tables do not exist or have zero values, update statistics has not been run for the database. UPDATE STATISTICS should be run after any significant change (greater than 25%) to the data.
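Two quick sketches of the SQL housekeeping mentioned above. The dbtool utility is real but its menu layout varies by release, so check your version; the UPDATE STATISTICS statement is standard OpenEdge SQL, with PUB.Customer standing in for your own tables:

    # interactive menu; choose the SQL Width & Date Scan option
    dbtool mydb

    -- run from sqlexp or any SQL client, once per table (or script it)
    UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS
       FOR PUB.Customer;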
27
In Summary Huge Performance Gains Possible Can be done in Phases
You can do this! Huge performance gains are possible, but they depend on a number of factors: the size of the database, the types of reads and writes, and the hardware. These migration operations can be done in stages, trading one long downtime for several smaller ones.
28
Questions?
29
Thank You