B9: Success with OpenEdge® Replication
What you don’t know can hurt you
David Eddy, Senior Solution Consultant
Obligatory (gratuitous) Quotation
- An ounce of prevention is worth a pound of cure.
- 28.35 grams of prevention is worth 0.45 kilograms of cure.
Success With Replication
- Understand the architecture
- Know your requirements
- Analyze your system
- Plan, configure, deploy, monitor
- Succeed
Agenda
- Terminology and architecture (visual)
- Replication performance hotspots
- Replication availability hotspots
- Source and target management
Replication Terminology
- Primary: original production machine
- Secondary: original recovery machine
- Source: production database
- Target: recovery database
- Replication server: rpserver (source)
- Replication agent: rpagent (target)
- DBService queue: stores IPC messages
- Network pipe: TCP/IP, rpserver -> rpagent
Architecture of Replication – Primary/Source
[Diagram: Primary Machine with R/Write Clients, AI Extents, DB Service Q, Repl Server, Source Database, and Database Brk/Server Processes]
Architecture of Replication – Secondary/Target
[Diagram: Secondary Machine with Repl Agent, Database Brk/Server Processes, Target Database, and R/O Clients]
Architecture of Replication
[Diagram: Primary Machine and Secondary Machine side by side, combining the components of the two diagrams above]
Performance Hotspots
[Diagram: the replication architecture (Primary Machine: R/Write Clients, AI Extents, DB Service Q, Repl Server, Source Database, Database Brk/Server Processes; Secondary Machine: Repl Agent, Database Brk/Server Processes, Target Database, R/O Clients) annotated with performance hotspots]
Source DB Performance Considerations
- Speed/power of the machine
- Replication Plus: offload read-only clients to the target database
- AI files: use fixed extents
- AI and BI block sizes should be the same
  - 16 KB is the most efficient
  - Need to truncate AI and BI to change the block sizes
Target DB Performance Considerations
- The system should not be underpowered compared with the source system
- No need for after-imaging
- Read-only clients
Network Performance
- WAN vs. LAN
- The bigger the pipe, the better
- Determine the required size of the pipe
- Whitepaper available on PSDN: http://tinyurl.com/6xqp78
Sample Network Bandwidth Calculation
- Hourly after-image blocks from 5 production databases = 713 MB
- By calculation: replication size = 1.5 * AI size, so 713 MB * 1.5 = 1069.5 MB
- Add replication overhead (1.1): 1069.5 MB * 1.1 = 1176.45 MB
- Throughput per second: 1176.45 MB / 3600 s = 0.32679 MB/s
- Change to kilobytes per second: 0.32679 MB/s * 1000 = 326.79 KB/s needed to sustain the transfer
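The bandwidth arithmetic on this slide can be sketched in a few lines of Python. The function and parameter names are illustrative; the 1.5 and 1.1 factors and the 1 MB = 1000 KB conversion are taken from the slide.

```python
def required_bandwidth_kb_per_s(hourly_ai_mb,
                                replication_factor=1.5,
                                overhead_factor=1.1):
    """Return the sustained throughput (KB/s) needed to ship one hour of AI data."""
    replication_mb = hourly_ai_mb * replication_factor  # replication size = 1.5 * AI size
    total_mb = replication_mb * overhead_factor         # add replication overhead
    mb_per_s = total_mb / 3600                          # one hour = 3600 seconds
    return mb_per_s * 1000                              # the slide uses 1 MB = 1000 KB


kb_per_s = required_bandwidth_kb_per_s(713)
print(f"{kb_per_s:.2f} KB/s")  # 326.79 KB/s
```

Feeding in the hourly AI volume measured during the busiest period, rather than an average hour, gives a figure the pipe must sustain at peak.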
Replication Performance – DBService Buffer
- The DBService buffer queue must be sized appropriately (Solution P121969)
- Failure to do so may result in source slowdown and the target falling behind
- Analyze AI activity prior to implementing
- Use PROMON -> R&D -> Status Display (#1) -> DB Service Manager (#16)
Determining Optimal -pica
- The larger the -pica, the further behind the target may get
- Monitor AI writes during the busiest period for one hour to get the total AI writes (TAIW)
- -pica = (TAIW / blockcount) * 1.25
- Blockcount depends on version
  - OpenEdge 10: blockcount is 9.16
  - Progress® 9.1x: blockcount is 18.2
Sample -pica Calculation
- Values up to 8192k for 10.1B01 and later
- 34560 TAIW over one hour
- Formula for OE 10: (34560 / 9.16) * 1.25 ≈ 4716.16
- Why not just use 8192?
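A minimal Python sketch of the -pica formula follows. The blockcount values and the 8192 limit come from the slides; rounding the result up to a whole number and capping it at 8192 for both versions are assumptions made for illustration, and the dictionary keys are hypothetical labels.

```python
import math

# Blockcount per version, as given on the slide.
BLOCKCOUNT = {"openedge10": 9.16, "progress9.1": 18.2}
MAX_PICA = 8192  # upper limit quoted for 10.1B01 and later (applied to all versions here as an assumption)


def suggested_pica(total_ai_writes, version="openedge10"):
    """Apply -pica = (TAIW / blockcount) * 1.25 to one hour of AI writes."""
    raw = (total_ai_writes / BLOCKCOUNT[version]) * 1.25
    # Assumption: round up to a whole value and never exceed the maximum.
    return min(math.ceil(raw), MAX_PICA)


pica = suggested_pica(34560)
print(pica)  # 4717
```

Computing the value rather than defaulting to the maximum follows the slide's warning that a larger -pica lets the target fall further behind.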
“Houston, we have a problem” – Promon
03/03/08        Status: Database Service Manager
17:15:02

Communication Area Size :   2049.00 KB
  Total Message Entries :   18733
  Free Message Entries  :   4
  Used Message Entries  :   18729

Registered Database Service Objects
Name                          Rdy  Status  Messages  Locked by
OpenEdge Replication Server   Y    RUN     18729
OpenEdge RDBMS                Y    REG     0
OpenEdge DB Agent             Y    RUN     0
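The symptom in this PROMON screen (4 free message entries out of 18733) is easy to check for automatically. The sketch below assumes the screen text has already been captured to a string; the function names and the 10% free-entry threshold are hypothetical choices, not values from the product documentation.

```python
import re

# Counters copied from the PROMON screen on this slide.
SAMPLE = """\
Communication Area Size : 2049.00 KB
Total Message Entries : 18733
Free Message Entries : 4
Used Message Entries : 18729
"""


def queue_usage(promon_text):
    """Extract total/free/used message entries from DB Service Manager output."""
    counts = {}
    for key in ("Total", "Free", "Used"):
        match = re.search(rf"{key} Message Entries\s*:\s*(\d+)", promon_text)
        if match is None:
            raise ValueError(f"'{key} Message Entries' not found")
        counts[key.lower()] = int(match.group(1))
    return counts


def queue_is_nearly_full(counts, min_free_ratio=0.10):
    """Flag the queue when fewer than min_free_ratio of the entries are free."""
    return counts["free"] / counts["total"] < min_free_ratio


counts = queue_usage(SAMPLE)
print(counts, queue_is_nearly_full(counts))
```

A check like this could run alongside other monitoring so that a filling DBService queue is noticed before the source slows down.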
Availability Hotspots
[Diagram: the replication architecture (Primary Machine: R/Write Clients, AI Extents, DB Service Q, Repl Server, Source Database, Database Brk/Server Processes; Secondary Machine: Repl Agent, Database Brk/Server Processes, Target Database, R/O Clients)]
Availability Hotspots – Loss of Database, TCP or Process Failure
[Diagram: the same architecture annotated with the failure labels "DOWN!", "Locked", and "Severed"]
Availability Hotspots – AI Management
- Replication DOES NOT manage AI files
- AI files must be emptied and backed up
- The AI archiver became available in 10.1A
- The database crashes when there are no empty AI extents
Availability Hotspots – Locked AI Files
- AI extents are "locked" while their AI notes have not been replicated to the target db
- Common issue, often caused by simple maintenance routines and failure to monitor the status of replication
- Configure the rpserver and rpagent to detect and handle outages
- All about the *.repl.properties file!
- Configure, test well, and sleep easier at night
Availability – Many Locked AI Files
- Use dsrutil monitor against the source and target db
- Check for replication shared memory
- If the rpagent is running, restart the replication server on the source
- If the rpagent is not running, restart the target database and restart the replication server
“Houston, we have a problem” – part 2
Extent  Status       Type             Path                       Size    Used    Start                     Seqno
1       Busy         Variable Length  C:\wrk101c\repl\source.a1  3192    3180    Wed May 14 14:09:34 2008  5
2       Locked       (not shown)      C:\wrk101c\repl\source.a2  230008  229674  Wed May 14 13:45:30 2008  2
3       Locked       Variable Length  C:\wrk101c\repl\source.a3  6264    6200    Wed May 14 14:08:04 2008  3
4       Locked       (not shown)      C:\wrk101c\repl\source.a4  25208   25063   Wed May 14 14:08:24 2008  4
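The situation in this extent listing (one busy extent, three locked, none empty) is the one to alert on before the source halts. A minimal sketch, assuming the extent statuses have already been collected into a dictionary; the function name and the message strings are hypothetical.

```python
from collections import Counter

# Extent number -> status, as listed on this slide.
extent_status = {1: "Busy", 2: "Locked", 3: "Locked", 4: "Locked"}


def ai_extent_risk(status_by_extent):
    """Classify the AI extent situation from a mapping of extent -> status."""
    counts = Counter(status_by_extent.values())
    if counts["Empty"] == 0:
        # No extent left to switch to once the busy extent fills.
        return "critical: no empty AI extent to switch to"
    if counts["Locked"] > 0:
        return "warning: locked AI extents present"
    return "ok"


print(ai_extent_risk(extent_status))
```

Treating any locked extent as a warning reflects the earlier point that locked AI files usually mean replication has stopped applying notes to the target.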
Availability – All AI Files Locked
- Source database activity comes to a halt
- Can you afford to bring the db down?
  - If not: disablesitereplication
  - If yes: stop the database, add new AI files (prostrct add), run prostrct reorder ai, and restart
prostrct add sourcedb addai.st
Extent  Status       Type             Path                       Size    Used    Start                     Seqno
1       Busy         Variable Length  C:\wrk101c\repl\source.a1  3192    3180    Wed May 14 14:09:34 2008  5
2       Locked       Variable Length  C:\wrk101c\repl\source.a2  230008  229674  Wed May 14 13:45:30 2008  2
3       (not shown)  (not shown)      C:\wrk101c\repl\source.a3  6264    6200    Wed May 14 14:08:04 2008  3
4       Locked       Variable Length  C:\wrk101c\repl\source.a4  25208   25063   Wed May 14 14:08:24 2008  4
5       Empty        Variable Length  C:\wrk101c\repl\source.a5  120     0       N/A                       0
6       (not shown)  (not shown)      C:\wrk101c\repl\source.a6  120     0       N/A                       0
prostrct reorder ai sourcedb
Extent  Status       Type             Path                       Size    Used    Start                     Seqno
1       Busy         Variable Length  C:\wrk101c\repl\source.a1  3192    3180    Wed May 14 14:09:34 2008  5
2       Empty        Variable Length  C:\wrk101c\repl\source.a2  120     0       N/A                       0
3       (not shown)  (not shown)      C:\wrk101c\repl\source.a3  120     0       N/A                       0
4       Locked       Variable Length  C:\wrk101c\repl\source.a4  230008  229674  Wed May 14 13:45:30 2008  2
5       Locked       Variable Length  C:\wrk101c\repl\source.a5  6264    6200    Wed May 14 14:08:04 2008  3
6       (not shown)  (not shown)      C:\wrk101c\repl\source.a6  25208   25063   Wed May 14 14:08:24 2008  4
Source and Target DBs Won’t Synchronize
- Try restarting a few times
- Do -Ma, -Mn, and -n match?
- Any changes to the .properties files?
- If it continues to fail, contact support
  - Source/target db log files
  - AI files
  - pmmgr.properties file (arguments=-logging 2)
Managing Replication – DSRUTIL Monitor and Status
- It is not immediately obvious when a failure occurs
- Proactively verify replication performance and status
DSRUtil Monitor
- DSRUTIL source/target -C monitor
- Attaches to replication shared memory
  - If it cannot attach: "Cannot connect to replication shared memory. Status = -1"
- Checks the status of the server and agent
  - Server status
  - Agent status
DSRUtil source -C monitor
OpenEdge Replication Monitor                                Page 1
Database: C:\wrk101c\repl\source
Database is enabled as OpenEdge Replication: Source
Server is: In Normal Processing
Number of configured agents: 1
Delay Interval (current / min / max): 5 / 5 / 500
Recovery information:
  State: No recovery being performed
  Agents needing recovery: 0
  Agents connected: 0
  Agents in synchronization: 0
Transition information:
  Type: Manual
DSRUtil target -C monitor (page 1)
Database: C:\wrk101c\repl\target
Database is enabled as OpenEdge Replication: Target
Agent:
  Name: agent1
  ID: 1
  Host name:
  State: Normal Processing
  Ready: Yes
  Critical: No
  Method: Asynchronous
  Agent is waiting for: Nothing
  Maximum bytes in TCP/IP message: 8500
  Server/Agent connection time: Wed May 14 13:48:43 2008
  Delay Interval (current / min / max): 5 / 5 / 500
Transition information:
  Type: Manual
The last block received at: Wed May 14 13:51:14 2008
Activity information:
  Blocks received: 2084
  Blocks processed: 2084
DSRUtil target -C monitor (page 2)
  Blocks acknowledged: 0
  Notes processed: 207774
  Transactions started: 10269
  Transactions ended: 10269
  Synchronization points: 33
AI Block Information:
  Source RDBMS Block (Seq / Block): 2 / 2516
  Last Processed Block (Seq / Block): 2 / 2494
Latency Information:
  Repl Server behind Source DB by: 1 second(s)
  Current Source Database Transaction: 13906
  Last Transaction Applied to Target: 13793
  Target Current as of (Target, Source): Wed May 14 13:51:13 2008, Wed May 14 13:51:13 2008 with delta of 000:00:00
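The block and transaction fields in this monitor page give a rough picture of how far the target trails the source. The sketch below parses those four fields from captured monitor text. The function names are hypothetical, and two interpretations are assumptions: that block numbers are only comparable within the same AI sequence number, and that the difference between the two transaction numbers approximates the transactions not yet applied.

```python
import re

# Fields copied from the monitor output on this slide.
SAMPLE = """\
Source RDBMS Block (Seq / Block): 2 / 2516
Last Processed Block (Seq / Block): 2 / 2494
Current Source Database Transaction: 13906
Last Transaction Applied to Target: 13793
"""


def _seq_block(label, text):
    """Return (sequence, block) for a 'label (Seq / Block): s / b' line."""
    m = re.search(re.escape(label) + r"\s*\(Seq / Block\):\s*(\d+)\s*/\s*(\d+)", text)
    if m is None:
        raise ValueError(f"{label!r} not found")
    return int(m.group(1)), int(m.group(2))


def _number(label, text):
    """Return the integer following 'label:'."""
    m = re.search(re.escape(label) + r":\s*(\d+)", text)
    if m is None:
        raise ValueError(f"{label!r} not found")
    return int(m.group(1))


def target_lag(text):
    src_seq, src_blk = _seq_block("Source RDBMS Block", text)
    tgt_seq, tgt_blk = _seq_block("Last Processed Block", text)
    tx_behind = (_number("Current Source Database Transaction", text)
                 - _number("Last Transaction Applied to Target", text))
    # Assumption: block numbers are only compared within one AI sequence.
    blocks_behind = src_blk - tgt_blk if src_seq == tgt_seq else None
    return {"blocks_behind": blocks_behind, "transactions_behind": tx_behind}


lag = target_lag(SAMPLE)
print(lag)  # {'blocks_behind': 22, 'transactions_behind': 113}
```

Tracking these numbers over time, rather than reading a single snapshot, shows whether the target is catching up or steadily falling behind.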
DSRUTIL Status
- DSRUTIL source/target -C status
- The return code indicates the current state of the replication server or agent
- Good for automated scripts: grep for the return code
- The Replication User Guide provides details
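For automated scripts, the return code can be captured directly instead of grepping for it. The sketch below is a thin wrapper around the `dsrutil <db> -C status` command shown on this slide. The function name and the injectable `run` parameter are illustrative, and the meaning of each return code is deliberately not hard-coded here because it is defined in the Replication User Guide rather than in these slides.

```python
import subprocess


def replication_status_code(db_path, run=subprocess.run):
    """Run `dsrutil <db> -C status` and return the process exit code.

    `run` is injectable so the function can be exercised without a real
    OpenEdge installation; by default it is subprocess.run.
    """
    result = run(["dsrutil", db_path, "-C", "status"],
                 capture_output=True, text=True)
    return result.returncode
```

On a machine where `dsrutil` is on the PATH, `replication_status_code("source")` returns the code for the source database; a scheduler or monitoring job can then compare it against the values listed in the user guide.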
Managing Replication – OE Management
- OpenEdge Management 10.1B02/3.1B02+ can remotely monitor log files
- Requires a 10.1B02 AdminServer running on the remote system
- Attach the OE_DB_Replication Log File Ruleset to the log file monitor
- An alert/email is raised when a message violating the rule set appears
OE Management – Replication Log File Rule Set
Summary
- Terminology and architecture
- Replication performance hotspots
- Replication availability hotspots
- Source and target management
Success With Replication
- Understand the architecture
- Know your requirements
- Analyze your system
- Plan, configure, monitor
- Succeed
For More Information, go to…
- PSDN
  - www.psdn.com/library/kbcategory.jspa?categoryID=21
  - www.psdn.com/library/kbcategory.jspa?categoryID=334
- Knowledge Centrum (esupport.progress.com)
  - 3.1B01/10.1B01 upgrade: P122926, P123418, P123420, P123424, P123426, P123427, P123676
  - Sizing -pica: P121969
- Documentation
  - OpenEdge Replication 10.1C docs (www.psdn.com)
  - OpenEdge Management 3.1C docs (www.psdn.com)
Questions?
Thank You