Replication and QoS in Grid-Oriented Database Management
Barbara Martelli, INFN - CNAF
2 Outline
- LCG 3D (Distributed Deployment of Databases) project status
- Oracle High Availability/Replication features
- MySQL High Availability/Replication features
- Databases in the GRID
- Oracle replication case study: LFC
- MySQL replication case study: VOMS
3 LCG 3D Service Architecture
[Architecture diagram]
- T0: autonomous, reliable service
- Online DB: autonomous, reliable service
- T1: database backbone; all data replicated; reliable service
- T2: local database cache; subset of the data; local service only
- Distribution technologies: Oracle Streams (T0 to T1, successfully implemented); http cache (SQUID), cross-DB copy and MySQL/SQLite files (T1 to T2, not implemented)
- R/O access at Tier 1/2 (at least initially)
Is it possible/interesting to investigate Oracle Heterogeneous Connectivity for Tier-1 to Tier-2 replication?
4 Oracle Building Blocks
Each cloud has to guarantee high availability, scalability and fault tolerance. At CNAF, high availability is achieved at different levels:
- Storage hardware level: RAID, Storage Area Network
- Storage logic level: logical volume manager, Automatic Storage Management (ASM)
- Database level: Real Application Clusters (RAC). The database is shared among different servers; load balancing, connection retries and failover are implemented in the Oracle drivers (quasi-transparent to applications)
- Disaster recovery: Recovery MANager (RMAN) backups. Retention policy on disk: 2 days; retention policy on tape: 31 days
Availability rate: 98.7% in 2007
Availability (%) = Uptime / (Uptime + Target Downtime + Agent Downtime)
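For intuition, a worked reading of this formula (the split of the downtime hours between the target and agent terms is illustrative; only the 98.7% total comes from the 2007 measurements): over a year of 8760 hours, 98.7% availability corresponds to about 114 hours, roughly 4.7 days, of cumulative downtime:

  Availability = 8646 / (8646 + 80 + 34) = 8646 / 8760 ≈ 0.987 → 98.7%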
5 Oracle Streams Replication
[Diagram: Master DB and Replica DB]
A capture process on the master reads the redo log and turns changes to the database objects into Logical Change Records (LCRs). The LCRs are enqueued, propagated over the network to a queue on the replica, and finally dequeued by an apply process that applies them to the replica's database objects. (A configuration sketch follows.)
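A minimal sketch of how such a one-way Streams replication is typically configured with Oracle's DBMS_STREAMS_ADM package (the schema name LFC_SCHEMA, the queue and Streams names, and the REPLICA_DB database link are illustrative assumptions, not taken from any production deployment; instantiating the replica and starting the capture/apply processes are omitted):

  -- On the master (source) database: create a Streams queue and a capture
  -- process that turns redo-log changes for one schema into LCRs.
  BEGIN
    DBMS_STREAMS_ADM.SET_UP_QUEUE(
      queue_table => 'strmadmin.capture_qt',   -- illustrative names
      queue_name  => 'strmadmin.capture_q');

    DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
      schema_name  => 'LFC_SCHEMA',            -- hypothetical schema
      streams_type => 'capture',
      streams_name => 'capture_strm',
      queue_name   => 'strmadmin.capture_q',
      include_dml  => TRUE,
      include_ddl  => TRUE);

    -- Propagate the captured LCRs to the queue on the replica, reachable
    -- through the REPLICA_DB database link (assumption).
    DBMS_STREAMS_ADM.ADD_SCHEMA_PROPAGATION_RULES(
      schema_name            => 'LFC_SCHEMA',
      streams_name           => 'prop_strm',
      source_queue_name      => 'strmadmin.capture_q',
      destination_queue_name => 'strmadmin.apply_q@REPLICA_DB',
      include_dml            => TRUE,
      include_ddl            => TRUE);
  END;
  /

  -- On the replica (destination) database, where a queue strmadmin.apply_q
  -- has been created the same way: an apply process dequeues the LCRs and
  -- applies them to the local database objects.
  BEGIN
    DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
      schema_name  => 'LFC_SCHEMA',
      streams_type => 'apply',
      streams_name => 'apply_strm',
      queue_name   => 'strmadmin.apply_q',
      include_dml  => TRUE,
      include_ddl  => TRUE);
  END;
  /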
6 MySQL High Availability and Replication Features
Master-slave replication (a minimal setup sketch follows this list):
- Referred to as asynchronous replication
- Available since 3.23; a stable and reliable feature
- Some examples of it in GRID production deployment (VOMS)
- The original database is managed by the master; the slave manages a copy of it
- Update queries (UPDATE, DELETE and INSERT in SQL jargon) must be executed only on the master host
- The SQL statements are replicated, not the changed data (statement-based replication)
Multi-master replication:
- Available since 5.0; a new and not fully tested feature
- Possible only under particular conditions which allow for simple conflict resolution policies
MySQL Cluster:
- Referred to as synchronous replication
- It does not seem to be a stable feature, as the MySQL 5.1 manual states: "This chapter represents a work in progress, and its contents are subject to revision as MySQL Cluster continues to evolve"
- I know of no MySQL production systems currently deployed as a cluster
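A minimal sketch of such a master-slave setup (host names, credentials and binlog coordinates are illustrative assumptions): the master writes its updates to a binary log, and the slave is pointed at that log with CHANGE MASTER TO.

  -- my.cnf on the master (shown as comments; values are assumptions):
  --   [mysqld]
  --   server-id = 1
  --   log-bin   = mysql-bin
  -- my.cnf on the slave:
  --   [mysqld]
  --   server-id = 2

  -- On the master: create an account the slave can replicate with.
  GRANT REPLICATION SLAVE ON *.*
    TO 'repl'@'slave.example.org' IDENTIFIED BY 'secret';

  -- On the slave, after loading a consistent dump of the master:
  CHANGE MASTER TO
    MASTER_HOST     = 'master.example.org',
    MASTER_USER     = 'repl',
    MASTER_PASSWORD = 'secret',
    MASTER_LOG_FILE = 'mysql-bin.000001',  -- binlog coordinates taken
    MASTER_LOG_POS  = 4;                   -- from the consistent dump
  START SLAVE;

  -- Check replication health (Slave_IO_Running, Slave_SQL_Running,
  -- Seconds_Behind_Master):
  SHOW SLAVE STATUS\G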
7 Databases in GRID Services
Databases are key building blocks of several GRID services (list not exhaustive):
FTS:
- Database used for data persistency
- MySQL and Oracle backends supported, but Oracle is recommended
- High availability through clusters
LFC:
- MySQL and Oracle backends supported
- Both MySQL and Oracle replication supported
VOMS:
- MySQL and Oracle backends supported
- Both MySQL and Oracle replication supported
8 Oracle Replication Case Study: LFC
LFC (LCG File Catalog) is a high performance file catalog which stores LFN-GUID-PFN mappings.
Oracle one-way Streams replication is used in WLCG in order to balance the load of LFC read-only requests among different catalogs residing in various Tier-1s.
The LFC code has been slightly modified in order to prevent a user from accidentally writing into a read-only catalog. The only thing an administrator has to do is set the variable RUN_READONLY="yes" in the /etc/sysconfig/lfcdaemon configuration file.
Database replication has to replicate all tables except CNS_USERINFO and CNS_GROUPINFO (a sketch of how such tables can be excluded follows below).
In case of a write attempt on the read-only LFC, you get an error:
$ lfc-mkdir /grid/dteam/hello
cannot create /grid/dteam/hello: Read-only file system
Replication speed requirements are not very strict:
- Update frequency ~ 1 Hz
- Replication latency < 10 min
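One possible way to express this exclusion in Streams, sketched here under the assumption that negative rule sets are used (schema, queue and Streams names are illustrative; this is not necessarily the actual WLCG configuration):

  -- Add the two user/group mapping tables to the capture process's
  -- NEGATIVE rule set, so their changes are discarded while the rest
  -- of the schema is replicated.
  BEGIN
    DBMS_STREAMS_ADM.ADD_TABLE_RULES(
      table_name     => 'LFC_SCHEMA.CNS_USERINFO',   -- schema name assumed
      streams_type   => 'capture',
      streams_name   => 'capture_strm',
      queue_name     => 'strmadmin.capture_q',
      include_dml    => TRUE,
      include_ddl    => TRUE,
      inclusion_rule => FALSE);   -- FALSE = add to the negative rule set

    DBMS_STREAMS_ADM.ADD_TABLE_RULES(
      table_name     => 'LFC_SCHEMA.CNS_GROUPINFO',
      streams_type   => 'capture',
      streams_name   => 'capture_strm',
      queue_name     => 'strmadmin.capture_q',
      include_dml    => TRUE,
      include_ddl    => TRUE,
      inclusion_rule => FALSE);
  END;
  /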
9 LHCb LFC Replication Deployment: CERN-CNAF
[Deployment diagram]
- CERN: 6-node cluster hosting the master Oracle DB; LFC R/W servers plus an R/O server; r/w and read-only clients
- CNAF: 2-node cluster hosting the replica Oracle DB; LFC R/O servers; read-only clients
- Oracle Streams replication over the WAN
Stress test: insertions at 900 Hz for 24 hours
- Max latency: 55 sec
- Mean latency: 15 sec
- Full consistency maintained
10 MySQL Replication Case Study: VOMS
The Virtual Organization Membership Service (VOMS) server manages authorization data:
- it provides a database of users, groups, roles and capabilities, grouped in Virtual Organizations (VOs)
- users query the VOMS server in order to get their VO grid credentials (proxy)
- read-only operations originate from various commands such as voms-proxy-info; they could be balanced across read-only VOMS replicas
- write operations originate from the mk-gridmap and voms-proxy-init commands
Expected write rate on the VOMS server:
- 1 Hz of voms-proxy-init
- peaks of 100 Hz of mk-gridmap (to be fixed)
A MySQL master-slave replication deployment can be useful for load balancing and failover of read-only operations.
VOMS supports MySQL one-way replication. Some examples of VOMS on replicated MySQL:
- LIP (Portugal)
- Fermilab
- CNAF - INFN Padova (CDF VOMS)
11 VOMS Replicated Deployment
The VOMS code has been adapted to MySQL replication: it provides a script which creates a slave MySQL replica, given a master MySQL server and a consistent dump.
Concurrent writes: the VOMS server has a web component, running in a web container provided by Tomcat, which serves the administration interface.
- Problem: the administration interface running on a slave host will update the seqnumber and realtime tables of each VO database.
- Solution: data from those tables must not be replicated to the slave hosts; in the slave's MySQL configuration:
  replicate-ignore-table=VOMS_seqnumber
  replicate-ignore-table=VOMS_realtime
Some stress tests performed by Fermilab:
- VOMS MySQL successfully queried at 125 Hz (10.8M queries/day)
- System load ~ 0.2, CPU ~ 10% (dual-core machine)
- Simulated failure of one VOMS server:
  - network disabled: new requests not routed to the failed server
  - network re-enabled: server added back to the pool for scheduling
  - open connections during the service failure are lost; the number of affected connections is very small (1-2)
- Simulated failure of the MySQL server: after re-enabling the server, transaction logs are replayed automatically
VOMS on Oracle replication is under test and will be available soon.
12 Conclusions
- Different high availability/redundancy techniques have been tested in the WLCG environment and allow for good availability of GRID database services.
- Both Oracle and MySQL replication solutions have been deployed in WLCG and offer different options to address different kinds of load.
- The LCG 3D project has developed Tier-0 to Tier-1 replication but has left the Tier-1 to Tier-2 distribution issues to the sites. Do we need to address them?