DAT 412:Transactional Replication Internals Matt Hollingsworth Program Manager Microsoft Corporation
Agenda Review General Agent Overview Snapshot Agent Log Reader Agent Distribution Agent Queue reader Agent Other Replication Agents Summary
Transactional Replication Snapshot Agent \\Snapshot Share Publishing database Subscribing database Distribution Agent DB log Distribution database Log Reader Agent Snapshot Agent writes initial data to Snapshot Share Publisher changes tracked and extracted by Log Reader Agent Distribution database (store) acts as reliable queue Distribution Agent forwards changes to Subscribers
Transactional Replication Updating subscribers Publisher Distributor Read-Only Subscriber Queued Updating Subscriber Queue NETWORK Queue Reader Agent Immediate Updating Subscriber 2PC, RPC
Agenda Review General Agent Overview Snapshot Agent Log Reader Agent Distribution Agent Queue reader Agent Other Replication Agents Summary
General Agent Overview Agents implement the core functions of replication Initiated via SQLAgent or from command line Maintain connections to repl server instances Connection info supplied on agent command line Can use integrated or SQL Server security SQL Server security: SecurityMode=0, login and password Integrated security: Security Mode=1, no login or password LoginTimeOut and QueryTimeOut configurable Can be scheduled or run on demand Continuous: sleeps for PollingInterval between loops Logging and History HistoryVerboseLevel [1|2|3] OutputVerboseLevel [0|1|2] Output ErrorFile
Agenda Review General Agent Properties Snapshot Agent Log Reader Agent Distribution Agent Queue reader Agent Other Replication Agents Summary
Snapshot Agent Algorithm Connects to the Publisher and Distributor For each article in the publication Reads publisher schema metadata Generates schema scripts into snapshot directory according to user article options (.sch, .idx, .dri files) Bulk copies table data into snapshot directory (.bcp files) according to user defined row and column filters Updates status for subscription to active
Snapshot Agent Parameters MaxBcpThreads Improves parallelism during BCP out Misc. ReplicationType =1 (transactional) RowDelimiter FieldDelimiter
Snapshot Sync Properties Controls how the snapshot is generated and applied sp_addpublication @sync_method sets snapshot format (character or native) sets snapshot concurrency behavior @immediate_sync true: subscriber can retrieve the initial data immediately after subscribing false: subscriber must wait for next snapshot to synchronize @pre/post_snapshot_script executes user defined custom scripts
Snapshot Scripting Properties How to configure sp_addarticle @pre_creation_cmd none, delete, drop or truncate sp_addarticle @schema_option 0x00 User specified creation script (@creation_script) 0x01 Object creation 0x02 Custom SP’s 0x10 Clustered indexes 0x20 Convert UDT’s to base data types 0x40 Nonclustered indexes 0x80 PK Indexes 0x100 Triggers 0x200 FK Constraints 0x800 Defaults 0x1000 Column-level collation 0x2000 Extended properties 0x4000 UK Constraints 0x8000 PK Constraints UI: configure via article properties page
Agenda Review General Agent Overview Snapshot Agent Log Reader Agent Distribution Agent Queue reader Agent Other Replication Agents Summary
Log Reader Agent Algorithm Connects to the publisher and distributor Scans the publisher log, building a list of committed tx which are marked for replication = sp_replcmds. For each command in the committed transactions For each publication receiving the command inserts data into the distributor using the user specified article format and column filter = sp_Msadd_repl_cmds27 if the row matches the specified user row filter Moves replication truncation mark in log = sp_repldone Commits transaction batch at distributor Logs history messages into the distribution database Note: Log Reader must run before a concurrent snapshot is available for subscribers
demo Log Reader Agent Profiler Trace
Log Reader Agent Parameters Batching ReadBatchSize – max number of transactions read from publisher log per processing loop always honors a tx boundary ReadBatchThreshold – max number of commands read from the publisher log per processing loop MaxCmdsInTran – max number of commands written to distributor as a transaction Warning: forces a commit when the limit is reached. This may result in transactional inconsistency at the subscriber in certain cases. Other Parameters PacketSize MessageInterval
Command Formats The Log Reader writes statements as “commands” into the distributor MSrepl_commands table The command format is defined by the user as an article property These commands are applied directly to the subscriber by the distribution agent This flexibility is valuable for application specific processing and performance tuning
“None” Command Format Log Reader skips the specified command type instead of inserting it into the distribution db When to use Great for auditing, use “None” for the delete command How to configure sp_addarticle @ins/upd/del_cmd = ‘None’ UI: configure via article properties page
“SQL” Command Format Commands stored in distribution DB as SQL inserts, updates, and deletes When to use Heterogeneous subscribers are present Lower distribution performance is acceptable Less SQL plan sharing occurs at the subscriber How to configure sp_addarticle @ins/upd/del_cmd = ‘SQL’ To include column names in insert statements, set @status = 8 ODBC parameterization improves performance. Set @status = 16 or 24 UI: configure via article properties page
Custom Command Formats Commands stored in distribution db as stored procedure calls = sp_Msins/upd/del<article> When to use Highest performance distribution Provides application flexibility since the stored procedure can be customized No heterogeneous subscribers are present How to configure Specify @schema_option = 0x02 to generate subscriber side stored procedures Configure @status = 16 to improve performance via parameterization UI: configure via article properties page
‘CALL’ Command Format How SP parameters are specified When to use Inserts all column values Updates initial pk column values all final column values for the updated row Delete parameters generated pk column values When to use for highest insert and delete performance when custom subscriber logic doesn’t require the values for columns in deleted rows How to configure sp_addarticle @ins/upd/del_cmd = ‘CALL’ UI: configure via article properties page
‘MCALL’ Command Format How SP parameters are specified Inserts not supported Updates initial pk column values final column values for the specific columns updated bitmask indicating which columns were updated Deletes When to use for highest update performance when custom subscriber logic doesn’t require the before values for columns How to configure sp_addarticle @ins/upd/del_cmd = ‘MCALL’ UI: configure via article properties page
‘XCALL’ Command Format How SP parameters are specified Inserts not supported Updates all initial and final column values Deletes all initial column values When to use For the most flexible subscriber side custom logic and transforms How to configure sp_addarticle @ins/upd/del_cmd = ‘XCALL’ UI: article properties page
demo Browsing Distributor Commands with sp_browsereplcmds
User Trigger Considerations Some publisher statements are applied as different sets of statements at the subscriber Concurrent snapshot processing applies compensating commands after snapshot is delivered to subscriber By default, SQL Server logs updates to unique constraints in replicated tables as as deletes followed by inserts SQL Server 2000 SP1 added new dbcc option that converts single row unique constraint updates from delete/insert pairs back into single update statements for the subscriber This trace flag is not available for queued and immediate updating subscribers How to configure dbcc traceon(8207,-1) Note: updates to columns in unique keys that affect multiple rows are always applied to the subscriber as deletes followed by inserts.
Stored Proc Considerations Two SP publishing methods exist: Replicate the rows affected by the SP Replicate the execution of the SP When to use Publish execution for SP’s that affect large numbers of rows. Provides significant performance improvement (10x). How to configure sp_addarticle @type value of ’proc exec’ replicates via RPC value of ’serializable proc exec’ replicates via RPC if publisher call is within a serializable transaction other values will replicate the individual rows affected by the stored procedure UI: article properties page
View Considerations Standard Views Indexed Views Replicated as scripted schema object definitions Requires publishing the underlying table(s) Indexed Views Replicated as a base object like a table only the changes to the indexed view data are replicated underlying table(s) do not need to be published
Text/Image Considerations Size Server option ‘max text repl size’ sets the largest text/image value on the replicated table Data beyond the max size is not allowed Error: Length of text data 1000 to be replicated exceeds configured maximum: 500." Text/Image Update Limitations with Repl No updates during concurrent snapshot Repeatable read tx isolation level or greater required when using text pointers XCALL before image not supported No updates at subscriber for text/image
Loopback Considerations Includes originating server instance and database with each distributor command to prevent looping command distribution scenarios When to use Necessary for Bi-directional (reciprocal) topologies where each database publishes and subscribes to the other How to Configure sp_addsubscription @loopback_detection = ‘true’ for both subscriptions
Backup Considerations Publisher restored to previous point-in-time Problem: distributor and subscriber may contain rows later than restored database! Solution: set “sync with backup” option on publisher to prevent commands from inserting into distributor until the log entries are backed up Side Effect: increases command distribution latency Distributor restored to previous point-in-time Problem: Distributor loses commands that have not been delivered to subscribers when they are already truncated from publisher log! Solution: set “sync with backup” option on distributor to prevent publisher log truncation until the distributor has backed up entries Side Effect: may increase size of publisher log Subscriber restored to previous point-in-time No issues if within the distributor retention interval since transaction timestamp on subscriber is also restored
Agenda Review General Agent Overview Snapshot Agent Log Reader Agent Distribution Agent Queue reader Agent Other Replication Agents Summary
Distribution Agent Algorithm Connects to the distributor, subscriber, and snapshot share Applies scripts and bcp’s snapshot data into subscriber. Depends on status and sp_addsubscription @sync_type Gets next batch of commands for the subscription from the distributor MSrepl_commands table = sp_msget_repl_commands Determines last applied xact id for subscription by querying MSreplication_subscriptions transaction_timestamp Applies the literal commands directly to the subscriber Commits the batch
demo Distribution Agent Profiler Trace
Distribution Agent Parameters Performance related parameters BcpBatchSize CommitBatchSize CommitBatchThreshold MaxBcpThreads MaxDeliveredTransactions UseInprocLoader Misc parameters MessageInterval TransactionsPerHistory NoTextInitOnSync ProfileName DefinitionFile Publication QuotedIdentifier [0|1|2] Note: FTP parameters are obsolete
Distribution Tuning Tips Offload distribution agent load Configure pull subscriptions Tune databases Minimize indexes on subscriber Tune I/O for subscriber and distributor: RAID 0+1 for data, separate log Increase Batching Increase CommitBatchSize, CommitBatchThreshold, and MaxBcpBatchSize parameters Increase agent parallelism Use independent agent: sp_addpublication @independent_agent =’true’ Configure multiple publications for each subscriber according to transactional requirements Increase MaxBcpThreads parameters
Scripting Replication Stored Procs To manually generate subscriber replication procs exec sp_helparticle (to get article id) exec sp_scriptinsproc exec sp_scriptupdproc exec sp_scriptdelproc To manually generate queued conflict tables and triggers sp_makeconflicttable sp_addsynctriggers
Scripting Replication Stored Procs(2) Useful for manually generating an update proc that is optimized for tables with large numbers of indexes Exec Sp_scriptdynupdproc Subscriber Procedure Indexes on subscriber table Throughput, cmds/sec Default MCALL 1 clustered index 1,570 1 clustered index, 10 nonclustered indexes 110 Dynamic MCALL 1,180
Agenda Review General Agent Overview Snapshot Agent Log Reader Agent Distribution Agent Queue reader Agent Other Replication Agents Summary
Queued Subscriber Trigger Logic SQL DML modification to subscription table occurs Replication trigger fires and inserts data into local subscriber queue Queue can be a SQL Server table (msreplication_queue) or MSMQ Queue Reader Agent will later collect these changes from the queue and apply to the publisher
Queue Reader Agent Algorithm Connects to the distributor, subscriber, and publisher For each subscription serviced by the distributor Gets next batch of commands from the subscriber queue = MSreplication_queue via stored procedure = sp_replsqlqgetrows Applies commands to the publisher table sp_MSsync_ins/upd/del_<table>_# Note: manages an internal thread pool to keep CPU’s busy and increase parallelization. Groups approximately 1000 commands per batch. Commits the publisher rows Conflicts detected via msrepl_tran_version uniqueidentifier column differences Note: ranged updates (update foo set x=x+1) to unique key columns will fail at the publisher since they are applied as individual update statements.
demo Queue Reader Agent Profiler Trace
Queue Reader Agent Parms You can configure who wins in case of conflicting updates ResolverState [1|2|3] Publisher wins conflict Publisher wins conflict with reinitialization of subscriber Subscriber overwrites existing Publisher values
Administering the Queues sp_getqueuedrows Returns the transactions that are stored in the queue sp_replqueuemonitor Returns the commands that are stored in the queue sp_setreplfailovermode Switches between immediate updating and queued updating modes
Agenda Review General Agent Overview Snapshot Agent Log Reader Agent Distribution Agent Queue reader Agent Other Replication Agents Summary
Distributor Cleanup Agents Distribution Cleanup Deletes rows from the distribution Msrepl_cmds and Msrepl_transactions table which are outside the distribution database retention period sp_adddistributiondb @max_distretention, sp_MSdistribution_cleanup History and Logging Cleanup Deletes rows from each distributor agent history table which are outside the history retention period sp_adddistributiondb @max_historycleanup, sp_MShistory_cleanup @history_retention Expired Subscription Cleanup Removes subscriptions which have not synchronized within the publication retention period sp_addpublication @retention sp_expired_subscription_cleanup
Other Replication Agents Replication Checkup Raises error 14151 for all agents that have not logged messages within the specified interval sp_replication_agent_checkup @heartbeat_interval Re-init Subs with Data Validation Failures Marks all subscriptions for re-initialization that have failed a validation check sp_MSreinit_failed_subscriptions
Agenda Review General Agent Overview Snapshot Agent Log Reader Agent Distribution Agent Queue reader Agent Other Replication Agents Summary
Replication Summary Fast Flexible Consistent Can deliver 2500 commands per second Latency is often a few seconds Flexible Modular agent approach maximizes application flexibility Stored procedures and DMO permit fine-grained control Consistent Data is kept transactionally consistent
Other Resources DAT411 Programming and Deploying Microsoft SQL Server 2000 Replication: Lessons Learned Thursday 15:00 - 16:15, Room 3 Transactional Replication Performance Tuning whitepaper http://www.microsoft.com/SQL/techinfo/administration/2000/ReplPerf.asp
Don’t forget to complete the on-line Session Feedback form on the Attendee Web site https://web.mseventseurope.com/teched/
© 2002 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.