1
Walking Through A Database Health Check
QAD Midwest UG – Grand Rapids, MI
Mike Furgal, Director – DB and Pro2 Services, Progress
October 9th, 2017
2
Introduction
Progress Services
  QAD Global Alliance Partner
  50+ QAD Specialists
  Managed Database Services
  Much, much more
Mike Furgal
  Progress employee since 1989
  Architect of the OpenEdge database
  Director of DB Services
3
How Healthy Is Your QAD Environment?
Best Practices
Performance Metrics
  Reading Data
  Updating Data
  Memory Contention
  CPU Usage
Disaster Recovery Plan
4
Best Practices
Reviewing Log Files
Truncating Log Files
Database Structure
Base Configuration Parameters
5
Log Files
Log files should not be larger than 50 MB. Archive and truncate monthly.
Common errors:
  (5635) SYSTEM ERROR: -s exceeded
  (-----) TCP/IP write error occurred with errno 32
  (-----) vv_flush:I/O error 5 on fd 1
  (49) SYSTEM ERROR: Memory violation.
  (6072) SYSTEM ERROR: error writing, file (DBI File)
You should monitor the log file daily or weekly.
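A daily or weekly log review can be partly automated. The following is a minimal sketch, not a Progress tool: it assumes each log line carries its message number in parentheses, as in the errors listed on the slide, and the function names are hypothetical.

```python
import re

MAX_LOG_BYTES = 50 * 1024 * 1024          # the 50 MB guideline from the slide
WATCHED_MESSAGES = {"5635", "49", "6072"}  # numbered errors listed on the slide
MESSAGE_NUMBER = re.compile(r"\((\d+)\)")  # assumption: "(NNNN)" marks a message number


def log_too_large(size_bytes):
    """True when the log has grown past the 50 MB guideline."""
    return size_bytes > MAX_LOG_BYTES


def count_watched_messages(lines):
    """Count occurrences of the watched message numbers in the given log lines."""
    counts = {}
    for line in lines:
        for number in MESSAGE_NUMBER.findall(line):
            if number in WATCHED_MESSAGES:
                counts[number] = counts.get(number, 0) + 1
    return counts


sample = [
    "(5635) SYSTEM ERROR: -s exceeded",
    "(49) SYSTEM ERROR: Memory violation.",
    "(6072) SYSTEM ERROR: error writing, file (DBI File)",
    "(-----) TCP/IP write error occurred with errno 32",
]
print(count_watched_messages(sample))
print(log_too_large(60 * 1024 * 1024))
```

The unnumbered "(-----)" messages are not counted here; matching them would need a text search on the message itself.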
6
Database Structure Guidelines
5 to 10 extents per Storage Area
15%+ of free space to grow within each Storage Area
Each Storage Area needs a variable overflow extent, just in case
No user data (tables or indexes) should be in the Schema Area
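The structure guidelines above can be expressed as a simple check. This is a sketch only: the function name and its inputs are hypothetical, and the numbers in the usage lines are made up for illustration rather than taken from a real structure file.

```python
def check_area(name, extents, free_pct, has_variable_extent):
    """Return a list of guideline violations for one storage area."""
    findings = []
    if not 5 <= extents <= 10:
        findings.append(f"{name}: {extents} extents (guideline is 5 to 10)")
    if free_pct < 15:
        findings.append(f"{name}: {free_pct}% free (guideline is 15% or more)")
    if not has_variable_extent:
        findings.append(f"{name}: no variable overflow extent")
    return findings


# Illustrative values only.
print(check_area("FIN", extents=7, free_pct=20, has_variable_extent=True))
print(check_area("MFG", extents=3, free_pct=10, has_variable_extent=False))
```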
7
After Imaging Enabled
After Imaging is a requirement for all production systems
After Imaging provides Point in Time recovery
  Restored Backup + Applied AI files = Point in Time Recovery
Easy to turn on
Easy to maintain
Backup and AI Retention
Requires some decisions
8
Background Processes
After Image Writer (AIW)
  You need one
Before Image Writer (BIW)
Asynchronous Page Writer (APW)
  One should be sufficient
Watchdog (WDOG)
9
Miscellaneous
Database Blocksize
  8K is best
After Image and Before Image Blocksize
  16K is best
Large Files Enabled
After Image Memory and Before Image Memory
  Should match
  AI memory should NOT be 1.5 times BI memory. That is old information from Progress version 6.2
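As a sketch of how these settings might be reviewed in bulk, the check below compares a database's values against the recommendations on this slide. The dictionary keys and the function name are hypothetical; how the values are collected is left out.

```python
# Recommendations taken from the slide: 8K database blocks, 16K AI and BI blocks.
RECOMMENDED = {"db_blocksize": 8192, "bi_blocksize": 16384, "ai_blocksize": 16384}


def review_settings(settings):
    """Return a list of deviations from the slide's recommendations."""
    findings = []
    for key, wanted in RECOMMENDED.items():
        if settings[key] != wanted:
            findings.append(f"{key} is {settings[key]}, slide recommends {wanted}")
    if not settings["large_files"]:
        findings.append("large files are not enabled")
    if settings["ai_buffers"] != settings["bi_buffers"]:
        findings.append("AI and BI buffers do not match")
    return findings


# Illustrative values only.
example = {
    "db_blocksize": 8192,
    "bi_blocksize": 16384,
    "ai_blocksize": 16384,
    "large_files": True,
    "ai_buffers": 100,
    "bi_buffers": 200,
}
print(review_settings(example))
```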
10
Best Practices Summary
Database Size (MB) DB Blocksize Large Files BI Blocksize AI Enabled AI Blocksize AI Buffers BI Buffers Errors in Log Data in Schema Area
mfgprd 242,826 8192 Yes 16384 100 200
tmsprd 114,442 50
qsxprd 8,534 No
cpdprd 6,534
admprd 4,825
audprd 3,488
hlpprd 181 20
qxoprd 16
qxeprd 3
11
Performance Check CRUD vs DB Reads (Buffer Hit Ratio)
12,250 / 453 = 38 22 hour sample = too long
12
Performance Check Buffer Hit Ratio 10 Minute Sample
268,362 / 3,462 ≈ 78 over a 10 minute sample
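The ratio on this slide is a plain quotient taken over a short interval. The sketch below assumes, following the slide title, that the first number is record operations (CRUD) and the second is database reads, and that both come from cumulative counters sampled at the start and end of the interval. The function names are hypothetical.

```python
def hit_ratio(record_operations, db_reads):
    """Record operations per database read for one sample."""
    if db_reads == 0:
        raise ValueError("no database reads in this sample")
    return record_operations / db_reads


def interval_ratio(start, end):
    """Ratio between two snapshots of cumulative (operations, reads) counters."""
    return hit_ratio(end[0] - start[0], end[1] - start[1])


# The 10 minute sample from the slide.
print(round(hit_ratio(268_362, 3_462)))
```

Working from the difference between two snapshots is what keeps the sample short; a ratio computed from counters accumulated over many hours averages away the busy periods.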
13
Buffer Hit Ratio 10 Minute Samples
14
Database Reads Per Second
10 Minute Samples
15
What Affects the Buffer Hit Ratio
Memory Usage and Allocation
Database Scatter or Fragmentation
Database Queries
16
Memory Usage
Need to allocate enough memory to hold the “working set” of the database in memory.
  Difficult to determine exactly
  Allocating 10% of database size to memory is typically a good starting point
  Requires running 64-bit OpenEdge
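The 10% starting point translates into a buffer count once the database block size is known, since the buffer pool (-B) is sized in database blocks. A minimal sketch, using the mfgprd size and block size from the Best Practices Summary; the function name is hypothetical and the result is only a starting point to tune from.

```python
def starting_buffer_count(db_size_mb, block_size_bytes, percent=10):
    """Number of database buffers that hold `percent` of the database size."""
    target_bytes = db_size_mb * 1024 * 1024 * percent // 100
    return target_bytes // block_size_bytes


# mfgprd: 242,826 MB with 8192-byte blocks.
print(starting_buffer_count(242_826, 8192))
```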
17
Database Scatter
Comes in 3 forms
  Record Fragmentation
  Physical Scatter
  Logical Scatter
[Diagram labels: DB Block 1, DB Block 2, DB Block 3; Record frag 1, Record frag 2, Record frag 3]
18
Database Scatter Physical Scatter
19
Database Scatter
Logical Scatter
  How ordered are the records by the most used index?
20
Database Scatter
Area (# Tables) Table Records Size Frag Factor % Fragmented Scatter
FIN (381) PUB.fcInstance 35,879,568 101.7G 1.3 29.30 1.0
Audit_Data (7) _aud-audit-data 109,006,237 21.8G 0.00
PUB.DocumentStorage 3,252,064 12.2G 1.2 15.50
MFG (867) PUB.uusg_det 50,394,220 9.2G
PUB.PostingLine 22,331,654 4.1G
PUB.spt_det 35,894,620 2.9G
PUB.fcDaemonQueue 24,217,803 2.5G
PUB.Posting 9,806,709 2.4G
PUB.tr_hist 8,480,499 2.2G 0.40
PUB.fcSession 17,243,362 1.4G
21
CRUD
22
CRUD
To debug this issue, -tablerangesize and -indexrangesize need to be set high enough that CRUD statistics are collected for all tables and indexes through the _TableStat and _IndexStat Virtual System Tables.
23
Index Utilization
Table.Index Fields Levels Blocks Size Utilization
PUB.uusg_det.uusg_prod_date 4 428,908 3.0G 92
._Audit-time 1 3 123,369 951.9M 99
PUB.uusg_det.uusg_date 124,392 852.4M 88
PUB.spt_det.spt_sim_elem 5 175,340 777.5M 57
PUB.uusg_det.uusg_sid_user 147,398 770.2M 67
PUB.spt_det.spt_sim_part 159,144 675.9M 55
PUB.uusg_det.oid_uusg_det 79,136 533.8M 87
PUB.spt_det.oid_spt_det 52,575 403.7M
PUB.fcInstance.prim 2 58,706 358.5M 78
PUB.uusg_det.uusg_user_date 77,952 357.5M 59
Index spt_det.spt_sim_elem is 777 MB, but uses 175,340 x 8K blocks, or 1.40 GB of space on disk and, more importantly, in memory
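The arithmetic behind that note can be reproduced directly. The sketch below assumes 8 KB blocks and treats utilization as index data size divided by allocated block space; that definition is an assumption, though it reproduces the 57 shown for spt_sim_elem. The function names are hypothetical.

```python
def allocated_kb(blocks, block_kb=8):
    """Space occupied by the index blocks, in KB."""
    return blocks * block_kb


def utilization_pct(data_mb, blocks, block_kb=8):
    """Index data size as a percentage of the space its blocks occupy."""
    allocated_mb = allocated_kb(blocks, block_kb) / 1024
    return 100 * data_mb / allocated_mb


# spt_det.spt_sim_elem: 777.5 MB of index data in 175,340 blocks.
print(allocated_kb(175_340))                   # KB on disk and in memory
print(round(utilization_pct(777.5, 175_340)))  # percent of that space in use
```

A poorly utilized index costs more than disk space: every half-empty block still takes a full buffer in memory when it is read.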
24
This was all about Reading data. Writing data is equally important and will be covered next.
25
Database Updates
Speed of the disk
Waits on IO
Checkpoints
26
Speed of the disk
Do this test at home
# prodb x demo
# proutil x -C truncate bi -bi 16384
# time proutil x -C bigrow 2 -zextendSyncIO
Do this with both a variable extent (as described) and a fixed extent
Run multiple times to remove outliers (truncate in between runs)
If the time to “bigrow” > 10 seconds, you have an IO problem
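Once several timings have been collected, they can be reduced to a single verdict. This sketch uses the median to discount outliers, which is one reasonable reading of "run multiple times to remove outliers" rather than a method the slide prescribes. The function name is hypothetical and the timings are illustrative.

```python
from statistics import median

THRESHOLD_SECONDS = 10  # the slide's limit for the bigrow test


def has_io_problem(run_times_seconds):
    """True when the typical (median) bigrow time exceeds the threshold."""
    return median(run_times_seconds) > THRESHOLD_SECONDS


# Illustrative timings in seconds, one per run.
print(has_io_problem([4.1, 3.9, 25.0]))    # one slow outlier
print(has_io_problem([12.0, 11.5, 13.2]))  # consistently slow
```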
27
Waits on IO
Promon – R&D – 2 – 5 BI Log
28
Wait on IO
This is time waiting for Before Image IO to happen.
Increase -bibufs to make this go away.
After Imaging has the same waits.
Make sure -bibufs and -aibufs match.
The memory is measured in kilobytes, so be generous
29
Checkpoints
Checkpointing is the periodic synchronization of the data in memory with the data on disk.
It allows both smooth operation and predictable startup times.
Checkpoint too frequently and it is like a governor on the engine
Checkpoint too infrequently and recovery times may be long
Checkpoints between 1 and 5 minutes apart are desired
30
Promon – R&D – 3 – 4 Checkpoints
The Freq column tells you how often you are checkpointing. The value is in seconds.
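Because the value is in seconds, the 1 to 5 minute target corresponds to a range of 60 to 300. A small sketch of that comparison; the function name and the labels it returns are hypothetical.

```python
def classify_checkpoint_interval(seconds):
    """Compare a checkpoint interval in seconds with the 1 to 5 minute target."""
    if seconds < 60:
        return "too frequent"
    if seconds > 300:
        return "too infrequent"
    return "ok"


for interval in (5, 120, 900):
    print(interval, classify_checkpoint_interval(interval))
```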
31
Checkpoints
You can adjust the frequency by adjusting the Before Image Cluster Size
Common values are 8 MB to 32 MB. The default is 512 KB, which is too small
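A rough way to pick a new cluster size is to scale the current one by the ratio of the desired interval to the observed interval. This sketch assumes the update rate stays constant, so the checkpoint interval grows in proportion to the cluster size; real workloads vary, so treat the result as a first guess to verify in promon. The function name is hypothetical.

```python
import math


def estimate_cluster_kb(current_kb, observed_seconds, target_seconds):
    """Scale the BI cluster size linearly toward a target checkpoint interval."""
    if observed_seconds <= 0:
        raise ValueError("observed interval must be positive")
    return math.ceil(current_kb * target_seconds / observed_seconds)


# Default 512 KB cluster, checkpointing every 5 seconds, aiming for 2 minutes.
print(estimate_cluster_kb(512, 5, 120))  # KB; falls inside the 8 MB to 32 MB range
```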
32
Shared memory and CPU
Promon – R&D – 3 – 1 – Performance Indicators
33
Latch Timeouts
Goal is to be less than 10 per second.
Adjust the -spin setting.
If you have more than 20% idle CPU time, increase -spin to use it
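The two thresholds on this slide combine into a short decision. The sketch below assumes latch timeouts are read as a cumulative counter at the start and end of a sample; the function names and the wording of the advice are hypothetical.

```python
def latch_timeouts_per_second(start_count, end_count, elapsed_seconds):
    """Rate of latch timeouts between two readings of a cumulative counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return (end_count - start_count) / elapsed_seconds


def spin_advice(rate, idle_cpu_pct):
    """Apply the slide's goal (under 10 per second) and its 20% idle CPU rule."""
    if rate < 10:
        return "within goal"
    if idle_cpu_pct > 20:
        return "consider increasing -spin"
    return "over goal, little idle CPU to spend"


# Illustrative 10 minute sample.
rate = latch_timeouts_per_second(1_000, 13_000, 600)
print(rate, spin_advice(rate, idle_cpu_pct=35))
```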
34
There is much more to cover in latching
Object Manager
  Set it to 10,000 (-omsize)
LRU Chain Skips
  Set it to 100 (-lruskips)
Database Hot Spots
  Monitor to look for Alternative Buffer Pool opportunities
35
Disaster Recovery Plan
Terms
  Recovery Time Objective (RTO)
    How long it takes to recover from a disaster
  Recovery Point Objective (RPO)
    How much data you are willing to lose
Are daily backups enough?
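One way to answer that question is to compare the worst-case loss window with the RPO. This sketch assumes the window equals the time since the last recoverable copy left the machine: the backup interval when only backups exist, or the After Image archive interval when AI files are also shipped. The function names and the intervals used are hypothetical.

```python
def worst_case_loss_minutes(backup_interval_minutes, ai_archive_interval_minutes=None):
    """Longest stretch of work that could be lost, in minutes."""
    if ai_archive_interval_minutes is None:
        return backup_interval_minutes
    return min(backup_interval_minutes, ai_archive_interval_minutes)


def meets_rpo(rpo_minutes, backup_interval_minutes, ai_archive_interval_minutes=None):
    """True when the worst-case loss window fits inside the RPO."""
    loss = worst_case_loss_minutes(backup_interval_minutes, ai_archive_interval_minutes)
    return loss <= rpo_minutes


# Daily backups alone versus daily backups plus AI files archived every 15 minutes,
# both measured against a one hour RPO.
print(meets_rpo(60, 24 * 60))
print(meets_rpo(60, 24 * 60, 15))
```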
36
Disaster Recovery
Level 4: OpenEdge Replication
Level 3: Offsite Storage of Backups and AI
Level 2: After Imaging
Level 1: Daily Backups
37
Summary
This provides you a high-level overview of the Database Health Check
What was discussed here is typical of what we see in most QAD environments
As always, Your Mileage May Vary.