WLCG Collaboration Workshop CMS online/offline replication

Slides:



Advertisements
Similar presentations
Replication solutions for Oracle database 11g Zbigniew Baranowski.
Advertisements

INTRODUCTION TO ORACLE Lynnwood Brown System Managers LLC Oracle High Availability Solutions RAC and Standby Database Copyright System Managers LLC 2008.
Acknowledgments Byron Bush, Scott S. Hilpert and Lee, JeongKyu
Oracle Architecture. Instances and Databases (1/2)
DB server limits (process/sessions) Carlos Fernando Gamboa, BNL Andrew Wong, TRIUMF WLCG Collaboration Workshop, CERN Geneva, April 2008.
INTRODUCTION TO ORACLE DATABASE ADMINISTRATION Lynnwood Brown System Managers LLC Introduction – Lecture 1 Copyright System Managers LLC 2007 all rights.
High Availability Group 08: Võ Đức Vĩnh Nguyễn Quang Vũ
Oracle Clustering and Replication Technologies CCR Workshop - Otranto Barbara Martelli Gianluca Peco.
Replication Technologies at WLCG Lorena Lobato Pardavila CERN IT Department – DB Group JINR/CERN Grid and Management Information Systems, Dubna (Russia)
Harvard University Oracle Database Administration Session 2 System Level.
CERN - IT Department CH-1211 Genève 23 Switzerland t Oracle and Streams Diagnostics and Monitoring Eva Dafonte Pérez Florbela Tique Aires.
Introduction History The principles of the relational model were first outlined by Dr. E.F Codd in a June 1970 paper is called “A Relational Model of Data.
Module 14: Scalability and High Availability. Overview Key high availability features available in Oracle and SQL Server Key scalability features available.
Oracle Architecture. Database instance When a database is started the current state of the database is given by the data files, a set of background (BG)
Chapter 9 Overview  Reasons to monitor SQL Server  Performance Monitoring and Tuning  Tools for Monitoring SQL Server  Common Monitoring and Tuning.
CERN IT Department CH-1211 Genève 23 Switzerland t Streams new features in 11g Zbigniew Baranowski.
1 Copyright © 2005, Oracle. All rights reserved. Introduction.
1 Data Guard Basics Julian Dyke Independent Consultant Web Version - February 2008 juliandyke.com © 2008 Julian Dyke.
1 Copyright © 2009, Oracle. All rights reserved. Exploring the Oracle Database Architecture.
Database monitoring and service validation Dirk Duellmann CERN IT/PSS and 3D
Oracle for Software Developers. What is a relational database? Data is represented as a set of two- dimensional tables. (rows and columns) One or more.
1 © 2006 Julian Dyke Streams Julian Dyke Independent Consultant juliandyke.com Web Version.
Oracle on Windows Server Introduction to Oracle10g on Microsoft Windows Server.
5 Copyright © 2009, Oracle. All rights reserved. Right-Time Data Warehousing with OWB.
By Lecturer / Aisha Dawood 1.  You can control the number of dispatcher processes in the instance. Unlike the number of shared servers, the number of.
CSE 781 – DATABASE MANAGEMENT SYSTEMS Introduction To Oracle 10g Rajika Tandon.
Oracle Tuning Considerations. Agenda Why Tune ? Why Tune ? Ways to Improve Performance Ways to Improve Performance Hardware Hardware Software Software.
1 Oracle Architectural Components. 1-2 Objectives Listing the structures involved in connecting a user to an Oracle server Listing the stages in processing.
Oracle Tuning Ashok Kapur Hawkeye Technology, Inc.
1 Data Guard. 2 Data Guard Reasons for Deployment  Site Failures  Power failure  Air conditioning failure  Flooding  Fire  Storm damage  Hurricane.
Siebel 8.0 Module 5: EIM Processing Integrating Siebel Applications.
Achieving Scalability, Performance and Availability on Linux with Oracle 9iR2-RAC Grant McAlister Senior Database Engineer Amazon.com Paper
Outline Introduction to Oracle Memory Structures SGA, PGA, SCA The Specifics of the System Global Area (SGA) Structures Overview of Program Global Areas.
Commercial RDBMSs Access and Oracle. Access DBMS Architchecture  Can be used as a standalone system on a single PC: -JET Engine -Microsoft Data Engine.
Page 1. Data Integration Using Oracle Streams A Case Study Session #:
LFC Replication Tests LCG 3D Workshop Barbara Martelli.
Oracle® Streams for Near Real Time Asynchronous Replication Nimar S. Arora Oracle USA.
DB Questions and Answers open session Carlos Fernando Gamboa, BNL WLCG Collaboration Workshop, CERN Geneva, April 2008.
11 Copyright © 2006, Oracle. All rights reserved. Checkpoint and Redo Tuning.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Implementation and performance analysis of.
INTRODUCTION TO ORACLE DATABASE ADMINISTRATION Lynnwood Brown President System Managers LLC Introduction – Lecture 1 Copyright System Managers LLC 2003.
CERN - IT Department CH-1211 Genève 23 Switzerland t High Availability Databases based on Oracle 10g RAC on Linux WLCG Tier2 Tutorials, CERN,
Windows Server 2003 系統效能監視 林寶森
HyperKVS Group Meeting Oracle Streams Dr. Volker Kuhr.
Database CNAF Barbara Martelli Rome, April 4 st 2006.
Status of tests in the LCG 3D database testbed Eva Dafonte Pérez LCG Database Deployment and Persistency Workshop.
1 11g NEW FEATURES ByVIJAY. 2 AGENDA  RESULT CACHE  INVISIBLE INDEXES  READ ONLY TABLES  DDL WAIT OPTION  ADDING COLUMN TO A TABLE WITH DEFAULT VALUE.
C Copyright © 2006, Oracle. All rights reserved. Integrating with Oracle Streams.
3 Copyright © 2004, Oracle. All rights reserved. Database Architecture Comparison.
20 Copyright © 2006, Oracle. All rights reserved. Best Practices and Operational Considerations.
1 Copyright © 2005, Oracle. All rights reserved. Oracle Database Administration: Overview.
Oracle Clustering and Replication Technologies UK Metadata Workshop - Oxford Barbara Martelli Gianluca Peco.
Marcin Bogusz CERN, PH-CMG WLCG Collaboration Workshop CMS online/offline replication Online/offline replication via Oracle Streams WLCG Collaboration.
Chapter 21 SGA Architecture and Wait Event Summarized & Presented by Yeon JongHeum IDS Lab., Seoul National University.
With Temporal Tables and More
Replication using Oracle Streams at CERN
Inside transaction logging
Streams Service Review
BizTalk Throttling and Threshold
Alejandro Álvarez on behalf of the FTS team
STREAMS failover and resynchronization
Oracle Database Monitoring and beyond
Some less known facts about log file sync and other LGWR-related waits
Case studies – Atlas and PVSS Oracle archiver
Oracle Streams Performance
Transaction Log Internals and Performance David M Maxwell
Data Definition Language
Lu Tang , Qun Huang, Patrick P. C. Lee
An Overview of GoldenGate Replication
Presentation transcript:

WLCG Collaboration Workshop CMS online/offline replication Online/offline replication via Oracle Streams WLCG Collaboration Workshop 22-27 January 2007 WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Agenda Online/offline replication setup Streams architecture Replication tests Performance Questions WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Setup OMDS @CMS ORCON @CMS Oracle Streams replication ORCOFF @IT O2O Transformation WLCG Collaboration Workshop CMS online/offline replication

TARGET DATABASE (replica) STREAMS Architecture capture staging consumption CAPTURE PROCESS LCRs SOURCE QUEUE capture changes propagate events log changes LCRs APPLY PROCESS apply changes REDO LOG DESTINATION QUEUE SOURCE DATABASE TARGET DATABASE (replica) user changes Slide by Eva Dafonte

WLCG Collaboration Workshop CMS online/offline replication Test setup Replication between two dual node RACs Capture process runs on instance #2 of d3r Apply process runs on instance #2 of test1 WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Test environment Test replicates ecalpedestals objects 7 objects have to be inserted in a single transaction Each object is 1,63 MB in size, after object-to-relational mapping 61200 st_ecalpedestals_item rows Objects are generated by PL/SQL procedure provided by Ricky Egeland We have 428,400 row inserts in a single commit, which corresponds to the number of LCRs LCR size is row-dependant WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Tables CREATE TABLE st_ecalpedestals ( iov_value_id NUMBER(10) NOT NULL, time NUMBER(38) ); alter table st_ecalpedestals add constraint st_ecalpedestals_pk primary key (iov_value_id); CREATE TABLE st_ecalpedestals_item ( pos NUMBER(10) NOT NULL, det_id NUMBER(10) NOT NULL, mean_x1 BINARY_FLOAT NOT NULL, mean_x12 BINARY_FLOAT NOT NULL, mean_x6 BINARY_FLOAT NOT NULL, rms_x1 BINARY_FLOAT NOT NULL, rms_x12 BINARY_FLOAT NOT NULL, rms_x6 BINARY_FLOAT NOT NULL alter table st_ecalpedestals_item add constraint st_ecalpedestals_item_pk primary key (iov_value_id, pos); alter table st_ecalpedestals_item add constraint st_ecalpedestals_item_fk foreign key (iov_value_id) references st_ecalpedestals (iov_value_id); WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Streams monitoring Tools strmmon Zbyszek’s monitor AWR Logminer Lemonweb Oracle Enterprise manager Parameters LCRs flow Streams pools size Redo generated on source and destination databases Replication time WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Performance Goal : to reduce replication time = increase throughput Seemed that replication time was dependant on interval between test runs Periodical time structure of the results Responsible for differences: KJC Wait events (AWR) Solution – 10.2.0.3 patch set WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Performance 10.2.0.3 Replication time – 300s With 11.4 MB payload 0.038 MB/s 1428 row inserts per second Average LCR apply rate – 2700/s AWR report Both CPUs utilized in ~50% 120 seconds of Streams „AQ: enqueue blocked due to flow control” wait events Lemonweb Very high I/O during test run (over 1 GB) WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Redo High I/O turned out to be generated by Log Writer (LGWR) process Further investigation has shown that the redo on the destination site was 12 times larger than on the source database Log mining has shown that the redo is generated due to inserting to STREAMS$_APPLY_SPILL_MSGS_PART table Typical redo entry for st_ecalpedestals_item table was 75 bytes vs. 590 bytes for spill queue Redo volume on the destination is ~700 MB, both tables insert-related redo is ~300 MB so we still have 400MB unaccounted for WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication LCRs Flow WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Redo destination WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Spilling Inserting data to STREAMS$_APPLY_SPILL_MSGS_PART table is related to the architecture of the apply process The messages are spilled to the disk from the memory Apply process’ TXN_LCR_SPILL_THRESHOLD user modifiable parameter can change the spill threshold Increasing the threshold over the number of LCRs in a transaction should disable the apply spilling However, increasing this parameter degrades the performance WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Streams pool Pool size is set to 1.3 GB Pool usage never exceeds 75% No matter how high the spill threshold is No matter how big the transaction is Pool de-allocation problem WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Queue spilling Spilling when messages enqueued for longer period of time probably not the case Spilling when there’s no memory available maybe the case, but 1.3 GB of streams pool! Higher spill thresholds lead to „propagation gaps” and degrade performance, default threshold is 10,000 Related to flow control Affects source database redo volume (queue spilling) While spill threshold set to values over 200K messages, queue spilling occurs – other than apply spilling Disk spilling inevitable given current constraints WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Strmmon - destination 2007-01-19 10:55:50 || test12-> | NET 4K 0 | PR01 0 | Q570584 0 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:55:52 || test12-> | NET 5K 0 | PR01 0 | Q570584 0 0 | - A003 0 0 19sec <100%I 0%F -> 2007-01-19 10:55:54 || test12-> | NET 5K 0 | PR01 0 | Q570584 0 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:55:56 || test12-> | NET 821K 0 | PR01 2526 | Q570584 2526 0 | - A003 0 0 19sec <100%I 0%F -> 2007-01-19 10:55:58 || test12-> | NET 1M 0 | PR01 3945 | Q570584 3940 0 | - A003 0 0 19sec <100%I 0%F -> 2007-01-19 10:56:00 || test12-> | NET 1M 0 | PR01 6005 | Q570584 6009 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:02 || test12-> | NET 356K 0 | PR01 1103 | Q570584 1099 0 | - A003 0 0 19sec <100%I 0%F -> 2007-01-19 10:56:04 || test12-> | NET 2M 0 | PR01 8197 | Q570584 8201 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:06 || test12-> | NET 316K 0 | PR01 1004 | Q570584 997 0 | - A003 0 0 19sec <100%I 0%F -> 2007-01-19 10:56:08 || test12-> | NET 1M 0 | PR01 4745 | Q570584 4753 0 | - A003 0 0 19sec <43%I 0%F -> 2007-01-19 10:56:10 || test12-> | NET 5K 0 | PR01 0 | Q570584 0 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:12 || test12-> | NET 5K 0 | PR01 0 | Q570584 0 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:14 || test12-> | NET 5K 0 | PR01 0 | Q570584 0 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:16 || test12-> | NET 5K 0 | PR01 0 | Q570584 0 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:18 || test12-> | NET 5K 0 | PR01 0 | Q570584 0 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:20 || test12-> | NET 5K 0 | PR01 0 | Q570584 0 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:22 || test12-> | NET 203K 0 | PR01 600 | Q570584 600 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:25 || test12-> | NET 884K 0 | PR01 2750 | Q570584 2750 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:27 || test12-> | NET 884K 0 | PR01 2750 | Q570584 2750 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:29 || test12-> | NET 868K 0 | PR01 2750 | Q570584 2750 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:31 || test12-> | NET 884K 0 | PR01 2750 | Q570584 2750 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:33 || test12-> | NET 884K 0 | PR01 2746 | Q570584 2741 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:35 || test12-> | NET 772K 0 | PR01 2510 | Q570584 2507 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:37 || test12-> | NET 882K 0 | PR01 2757 | Q570584 2757 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:39 || test12-> | NET 996K 0 | PR01 3046 | Q570584 3045 0 | - A003 0 0 19sec <0%I 0%F -> 2007-01-19 10:56:41 || test12-> | NET 756K 0 | PR01 2290 | Q570584 2298 0 | - A003 0 0 19sec <0%I 0%F -> WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication 2 x spilling WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication 2 x spilling WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication LCR performance Performance greatly depends on the number of LCRs in a transaction, rather than LCR size Test with BLOB objects instead of relational data The same payload but binary data Inserting 7 x 1.63 MB binary objects Such a transaction produces ~3600 LCRs Replication time ~30 seconds (factor 10 difference) No queue spilling Probably no apply spilling Redo ratio source/destination is 45/66 (MB) WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Questions? WLCG Collaboration Workshop CMS online/offline replication

WLCG Collaboration Workshop CMS online/offline replication Thank you WLCG Collaboration Workshop CMS online/offline replication