Emil Pilecki
Credit: Luca Canali, Marcin Blaszczyk, Steffen Pade
Agenda
- About CERN
- Oracle and Data Guard at CERN
- DG perks and benefits
- Zero data loss over long distances (Far Sync)
- Far Sync testing results
3 About CERN
- European Organization for Nuclear Research
- Founded in 1954
- Member states, 2 candidate states, 6 observer states, plus UNESCO and the EU
- 60 non-member states collaborate with CERN
- 2500 staff members and scientists
4 LHC and Experiments
- Large Hadron Collider (LHC): a particle accelerator that collides beams at very high energy
- 27 km long circular tunnel, located ~100 m underground
- Protons travel at close to the speed of light
- Collisions are analysed using dedicated detectors and software in the experiments built for the LHC
- New particle discovered, consistent with the Higgs boson; announced on July 4th 2012
5 Oracle at CERN
- In use since 1982, starting with version 2.3
- Oracle DBs play a key role in the LHC production chains:
  - Accelerator logging and monitoring systems
  - Online acquisition, offline data (re)processing, data distribution, analysis
  - Grid infrastructure and operation services (monitoring, dashboards, etc.)
  - Data management services (file catalogues, file transfers, etc.)
  - Metadata and transaction processing for the tape storage system
  - Administrative services
6 CERN's Databases
- Over 100 Oracle databases, mostly RAC
- NAS storage plus some SAN with ASM
- ~400 TB of data files for production DBs
- Examples of CERN's critical DBs:
  - LHC logging database: ~170 TB, expected growth up to 70 TB / year
  - 13 production experiments' databases: ~140 TB in total
- 15 production systems protected with Data Guard
- Active Data Guard since 11g
7 Our Data Guard architecture
- 1. Low load: Primary Database plus one Active Data Guard standby used for both read-only workloads and disaster recovery
- 2. Busy & critical: Primary Database plus one Active Data Guard standby for disaster recovery and a separate one for read-only workloads
- Maximum performance mode, asynchronous redo transport:
  LOG_ARCHIVE_DEST_X='SERVICE=<tns_alias> OPTIONAL ASYNC NOAFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=<standby_db_unique_name>'
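A minimal sketch of what such a setup could look like, with a hypothetical TNS alias and DB_UNIQUE_NAME (acc_adg); the destination is set on the primary, and redo apply with the database open read-only makes the standby an Active Data Guard:

  SQL> -- On the primary: asynchronous shipping to the standby (names are illustrative)
  SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2=
         'SERVICE=acc_adg ASYNC NOAFFIRM
          VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
          DB_UNIQUE_NAME=acc_adg' SCOPE=BOTH SID='*';
  SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE SCOPE=BOTH SID='*';

  SQL> -- On the standby: open read only, then start real-time redo apply
  SQL> ALTER DATABASE OPEN READ ONLY;
  SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;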
8 (Active) Data Guard benefits
Features and functionalities we profit from:
- Data protection for disaster recovery
- Replication and offloading of read-only workload
- Database backups from the standby
- Safeguard against logical data corruptions with flashback
- Snapshot standby for testing
- Fast upgrades and hardware migrations
- Detection of lost writes
- Automatic block media recovery
9 Disaster recovery
- We have been using it for several years
- Switchover/failover is our first line of defence (a minimal switchover sketch follows below)
- It has already saved the day for production services
- Current disaster recovery site is 10 km away from our main data centre
- Remote site in Hungary to be used soon:
  - Over 1000 km away
  - Network latency of 25 ms is a challenge
  - Plan to move most of the standby databases there within 1 year
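When the configuration is managed by the Data Guard Broker, a planned switchover is essentially one command; the database names below are hypothetical:

  DGMGRL> CONNECT sys@cerndb_prim
  DGMGRL> SHOW CONFIGURATION;            -- check that redo transport and apply are healthy
  DGMGRL> SWITCHOVER TO 'cerndb_stby';   -- roles swap; the old primary becomes the new standby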
10 Offloading production databases
- Efficient replication of the whole database
- Workload distribution:
  - Transactional workload runs on the primary
  - Read-only workload can be moved to the ADG standby
  - Read-mostly workload: DMLs can be redirected to the primary over a database link
- Database backups from the standby:
  - Significantly reduce load on the primary by removing the sequential I/O of full backups
  - ADG allows block change tracking on the standby for fast incremental backups (see the sketch below)
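A minimal sketch of enabling block change tracking on the Active Data Guard standby and taking the incremental backup there; the file path is hypothetical:

  SQL> -- On the ADG standby (requires the Active Data Guard option)
  SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING
         USING FILE '/ORA/dbs03/CHANGE_TRACKING/bct.chg';

  RMAN> -- Fast incremental backup driven by the change tracking file
  RMAN> BACKUP INCREMENTAL LEVEL 1 DATABASE;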
11 Flashback and snapshot standby
- Flashback enabled on the standby only:
  - Recover from human errors and data corruptions
  - Avoids impacting the primary database with flashback log generation
- Snapshot standby:
  - Test changes before implementing them on the primary
  - Safe: redo is still sent to the standby
  - Very easy to use:
    SQL> ALTER DATABASE CONVERT TO SNAPSHOT STANDBY;
    SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
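Enabling flashback on the standby alone could look roughly like this; the recovery area location, size and retention are illustrative, and redo apply has to be paused while flashback is switched on:

  SQL> -- On the standby only (values are illustrative)
  SQL> ALTER SYSTEM SET DB_RECOVERY_FILE_DEST_SIZE=500G;
  SQL> ALTER SYSTEM SET DB_RECOVERY_FILE_DEST='/ORA/fra/PDBR';
  SQL> ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET=1440;  -- minutes
  SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
  SQL> ALTER DATABASE FLASHBACK ON;
  SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;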
12 Fast upgrades and migrations
- Start: primary on Clusterware 11g + RDBMS 11g (RW access); standby built on Clusterware 12c + RDBMS 11g, kept in sync via redo transport
- Switchover: RW access moves to the new Clusterware 12c cluster
- RDBMS upgrade to 12c on the new cluster: the only database downtime
- Upgrade complete: Clusterware 12c + RDBMS 12c
13 Fast upgrades and migrations
- Risk mitigation:
  - Fresh installation of the new clusterware
  - Old system stays untouched
  - Allows a full upgrade test
  - Allows stress testing of the new system
- Downtime reduction: ~1 h for the RDBMS upgrade
- Additional hardware required, unless a migration to new hardware is planned anyway
14 Lost write detection and ABMR
Alert log excerpt from a standby that detected a lost write:
  Slave exiting with ORA-752 exception
  Errors in file /ORA/dbs0a/PDBR_RAC50/diag/rdbms/pdbr_rac50/PDBR1/trace/PDBR1_pr0l_92600.trc:
  ORA-00752: recovery detected a lost write of a data block
  ORA-10567: Redo is inconsistent with data block (file# 67, block# , file offset is bytes)
  ORA-10564: tablespace STRMMON
  ORA-01110: data file 67: '/ORA/dbs03/PDBR_RAC50/datafile/STRMMON_67.dbf'
  ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object#
  Mon Apr 14 06:52: Recovery Slave PR0L previously exited with exception 752
- Redo apply stops when a lost write is detected
- The previous consistent block version is still on the standby
- Helps to diagnose and repair the error
- Automatic Block Media Recovery with ADG:
  - Fixes physical block corruptions
  - Works both ways: Primary <-> ADG
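Lost write detection depends on the DB_LOST_WRITE_PROTECT parameter being enabled on both the primary and the standby; a minimal sketch:

  SQL> -- On both primary and standby (TYPICAL records buffer cache reads in the redo)
  SQL> ALTER SYSTEM SET DB_LOST_WRITE_PROTECT=TYPICAL SCOPE=BOTH SID='*';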
15 Zero data loss replication
- Uses the synchronous redo transport method
- DML statements are impacted because each commit waits for acknowledgment from the standby
- LOG_ARCHIVE_DEST_X='SERVICE=<tns_alias> OPTIONAL SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=<standby_db_unique_name>'
- Primary Database -> Redo Transport -> Data Guard Standby, with a commit acknowledgment sent back to the primary
- Network latency matters!!!
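One way to gauge that impact, assuming access to the standard dynamic performance views, is to watch how commit-related waits on the primary change once SYNC transport is enabled; a rough sketch:

  SQL> -- Average commit wait as seen by foreground sessions on the primary
  SQL> SELECT event, total_waits,
              ROUND(time_waited_micro / NULLIF(total_waits, 0)) AS avg_wait_us
         FROM v$system_event
        WHERE event IN ('log file sync', 'log file parallel write');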
16 Far Sync concepts
- Long distances = high network latency = slow commit acknowledgment with SYNC redo transport
- A lightweight Far Sync instance close to the primary receives redo synchronously and forwards it asynchronously to the remote standby over the 25 ms link
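A rough sketch of the redo routing in such a setup; the instance and service names (cerndb_fs, cerndb_stby) are hypothetical, and the far sync control file is created from the primary:

  SQL> -- On the primary: create the control file for the far sync instance
  SQL> ALTER DATABASE CREATE FAR SYNC INSTANCE CONTROLFILE AS '/tmp/cerndb_fs.ctl';

  SQL> -- On the primary: ship redo synchronously to the nearby far sync instance
  SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2=
         'SERVICE=cerndb_fs SYNC AFFIRM
          VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
          DB_UNIQUE_NAME=cerndb_fs';

  SQL> -- On the far sync instance: forward redo asynchronously over the 25 ms link
  SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2=
         'SERVICE=cerndb_stby ASYNC
          VALID_FOR=(STANDBY_LOGFILES,STANDBY_ROLE)
          DB_UNIQUE_NAME=cerndb_stby';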
17 Far Sync testing at CERN
- Functional: does it work? Are there any bugs?
- Performance:
  - Simulated heavy DML workload with and without Far Sync
  - Oracle Real Application Testing: workload captured from production databases
18 Far Sync testing results
- Functional tests: it works well!!! but...
  - Bug: FRA not cleaned up automatically on the Far Sync instance
  - Bug: failover to an alternate destination does not work with Far Sync
  - Both bugs still present in production
- Some configuration issues with the Data Guard Broker
19 Far Sync testing results
- Performance tests with a simulated heavy DML workload (a sketch of one test session follows below)
- 256 parallel sessions inserting data in 500-row batches, 50 batches per session
- The target table is partitioned and indexed: 4 local b-tree indexes, 6 local bitmap indexes, and a global primary key index with reversed keys
- Each session inserts data into its own partition
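One such session could look roughly like the PL/SQL block below; the table dg_load_test, its columns, and the key generation scheme are illustrative, not the actual test schema:

  -- Hypothetical sketch of one of the 256 concurrent test sessions
  DECLARE
    v_part  PLS_INTEGER := 42;                            -- each session owns one partition
  BEGIN
    FOR batch IN 1 .. 50 LOOP                             -- 50 batches per session
      INSERT INTO dg_load_test (part_key, id, payload)
      SELECT v_part,
             v_part * 1000000 + (batch - 1) * 500 + LEVEL, -- unique key per row
             RPAD('x', 200, 'x')
        FROM dual
      CONNECT BY LEVEL <= 500;                            -- 500-row batch
      COMMIT;                                             -- commit waits on SYNC / Far Sync ack
    END LOOP;
  END;
  /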
20 Far Sync testing results
- Performance tests with the Oracle Real Application Testing framework
- Real production workload captured per schema
- Workload replayed with and without Far Sync over the 25 ms latency link
- Replay parameters: connect_time_scale=0, think_time_scale=0
- CMSR: DML-mostly workload; LCGR: read-only workload
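A sketch of how such a replay could be driven with DBMS_WORKLOAD_REPLAY; the replay name, directory path and wrc invocation are hypothetical, and capture preprocessing is assumed to be done already:

  SQL> -- Directory object pointing to the processed capture files (hypothetical path)
  SQL> CREATE OR REPLACE DIRECTORY replay_dir AS '/ORA/replay/cmsr_capture';
  SQL> EXEC DBMS_WORKLOAD_REPLAY.INITIALIZE_REPLAY(replay_name => 'cmsr_farsync_test', replay_dir => 'REPLAY_DIR');
  SQL> EXEC DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY(connect_time_scale => 0, think_time_scale => 0);

  $ # External replay clients drive the captured sessions
  $ wrc system/***** mode=replay replaydir=/ORA/replay/cmsr_capture

  SQL> EXEC DBMS_WORKLOAD_REPLAY.START_REPLAY;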
21 Far Sync summary
- Very promising for long-distance replication when data loss is not acceptable
- Up to 60% performance gain (DML-only workloads) with 25 ms network latency
- Lightweight and easy to deploy (can run in a virtual machine)
- If latency is below 5 ms, you most likely don't need Far Sync
- There are still bugs that need fixing
22 Discussion