
1 Implementation and performance analysis of the LHCb LFC replica using Oracle Streams technology
BARANOWSKI Zbigniew (CERN), BONIFAZI Federico (INFN-CNAF), CARBONE Angelo (INFN-CNAF), DAFONTE PEREZ Eva (CERN), DUELLMANN Dirk (CERN), MARTELLI Barbara (INFN-CNAF), PECO Gianluca (INFN-Bologna), VAGNONI Vincenzo (INFN-Bologna)
2nd EGEE User Forum, Manchester – Angelo Carbone (CNAF)
EGEE-II INFSO-RI-031688 – Enabling Grids for E-sciencE – www.eu-egee.org – EGEE and gLite are registered trademarks

2 Overview
LHCb LFC replica implementation in collaboration with the 3D project and CNAF
– Oracle technologies deployed
Tests description and results
– Measure the latency between the source and destination database as a function of the writing rate
Conclusions

3 LHCb computing model
Every LHCb application running at T0/T1/T2 needs to read/write from the LFC
Every T1 and CERN will run reconstruction using information stored in the conditions database
The LHCb computing model foresees the replication of the LFC and of the conditions database at each T1
Database replication becomes quite important in order to ensure
– Scalability
– Geographic redundancy
– Fault tolerance

4 Oracle LFC and conditions database deployment
The Oracle LFC and conditions databases are implemented using high-availability technologies:
– Storage level: protection from disk failures is achieved using Oracle Automatic Storage Management (ASM) on a Storage Area Network
– Database level: Oracle Real Application Clusters allows a single database to be shared across multiple instances
– Replication level: Oracle Streams enables the propagation and management of data, transactions and events in a data stream from one database to another

5 Oracle Automatic Storage Management
ASM automatically
– optimizes storage to maximize performance across the disks that ASM manages
– distributes the storage load across all of the available disks
– partitions the total disk space requirements into uniformly sized units across all disks in a disk group
– mirrors data to prevent data loss
CNAF implementation: 1 TB (4 TB raw) of database-dedicated storage based on Fibre Channel technology, shared between the LFC and the conditions database
[diagram: ASM writes to a DATA disk group (~2 TB, RAID5/RAID1) and a RECOVERY disk group (~2 TB, RAID1)]
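The following sketch is an editorial addition, not part of the slides: it shows how the space usage and redundancy of such ASM disk groups (e.g. DATA and RECOVERY) could be inspected from Python through the standard V$ASM_DISKGROUP view; the cx_Oracle user, password and DSN are hypothetical placeholders.

# Hypothetical sketch: inspect ASM disk groups via V$ASM_DISKGROUP.
# User, password and DSN are placeholders, not the real CNAF setup.
import cx_Oracle

conn = cx_Oracle.connect("monitor_user", "monitor_pwd", "cnaf-lfc-db")
cur = conn.cursor()
cur.execute("SELECT name, type, total_mb, free_mb FROM v$asm_diskgroup")
for name, redundancy, total_mb, free_mb in cur:
    # TYPE reports the redundancy level (EXTERNAL/NORMAL/HIGH).
    used_pct = 100.0 * (total_mb - free_mb) / total_mb if total_mb else 0.0
    print("%-10s redundancy=%-8s total=%6d MB used=%5.1f%%"
          % (name, redundancy, total_mb, used_pct))
cur.close()
conn.close()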

6 Oracle Real Application Clusters
Goal: the Oracle Real Application Clusters technology allows a single database to be shared amongst several database servers
How: a modular architecture that can scale up to a large number of components
Scalability: nodes can be added dynamically to meet increased performance needs
High availability: if a node fails, the others remain open and active
[diagram: Oracle servers attached to a Storage Area Network]
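As another hedged illustration (again not from the presentation), the nodes of such a RAC cluster can be listed from any instance through the GV$INSTANCE view; the connection parameters are hypothetical.

# Hypothetical sketch: list the active instances serving a shared RAC database.
import cx_Oracle

conn = cx_Oracle.connect("monitor_user", "monitor_pwd", "lhcb-lfc-rac")
cur = conn.cursor()
cur.execute("SELECT inst_id, instance_name, host_name, status "
            "FROM gv$instance ORDER BY inst_id")
for inst_id, name, host, status in cur:
    # Each row is one cluster node; if a node fails the others stay OPEN.
    print("instance %d: %s on %s (%s)" % (inst_id, name, host, status))
conn.close()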

7 Replication with Streams (I)
The Streams technology enables information sharing
– The write operations performed on the master database are propagated by means of Streams to the replica database
– The Streams information flow has 3 phases: Capture, Staging and Consumption
[diagram: source (master) DB at CERN → Oracle Streams over the WAN → destination (replica) DB at CNAF; Capture → Staging → Consumption]

8 Replication with Streams (II)
CAPTURE: source database events, i.e. changes made to tables, schemas, or an entire database, are captured, filtered and stored as LCRs (Logical Change Records)
– An LCR can be considered an elementary database operation that corresponds, in most cases, to a single row modification
STAGING: Streams publishes the captured LCRs into a staging area
– Implemented as a queue
– Uses an in-memory buffer in order to access the queue quickly
– If the filling rate becomes too high, the buffer of the Streams queue fills up and Oracle has to write the LCRs to disk (the persistent part of the queue), which degrades performance (spill-over)
CONSUMPTION: staged events are consumed by subscribers at the destination database
– Apply process
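To make the spill-over mechanism concrete, the hedged sketch below (an editorial addition, not the 3D monitoring code) polls the buffered Streams queue through the V$BUFFERED_QUEUES view, which reports how many messages sit in memory and how many have spilled to disk; the connection details are hypothetical.

# Hypothetical sketch: watch for Streams queue spill-over on the source database.
import time
import cx_Oracle

conn = cx_Oracle.connect("strmadmin", "strmadmin_pwd", "lfc-master-db")
cur = conn.cursor()
for _ in range(10):                 # poll ten times, once per minute
    cur.execute("SELECT queue_schema, queue_name, num_msgs, spill_msgs "
                "FROM v$buffered_queues")
    for schema, queue, in_memory, spilled in cur:
        if spilled > 0:
            print("WARNING: %s.%s spilling LCRs to disk (%d spilled, %d buffered)"
                  % (schema, queue, spilled, in_memory))
        else:
            print("%s.%s: %d LCRs buffered in memory, no spill-over"
                  % (schema, queue, in_memory))
    time.sleep(60)
conn.close()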

9 LHCb LFC replication deployment CERN-CNAF
[diagram: at CERN, the master Oracle DB runs on a 6-node cluster, fed by LFC read-write servers (population clients) and serving LFC read-only servers (read-only clients); Oracle Streams propagates the changes over the WAN to CNAF, where the replica Oracle DB runs on a 2-node cluster serving LFC read-only servers (read-only clients)]

10 LHCb LFC usage and test
Monte Carlo simulation
– Transfer the output of an MC job to one or more Storage Elements and register the file in the catalogue (write)
Data processing (raw data reconstruction, analysis, stripping, etc.)
– Send the job to the T1 site where the data are available and produce an output to be registered (read/write)
Data transfer
– Find the replica to transfer, perform the transfer and register the new destination (read/write)
Etc.
In order to use a replicated database it is quite important that the master and replica databases are synchronized
– Measure the latency between the source and destination databases
– LHCb requirement: less than 30 minutes
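As a hedged sketch of how such a latency check could look (an assumption, not the Strmmon implementation described later), the apply latency can be sampled on the replica by comparing the creation time of the last dequeued LCR with its dequeue time in V$STREAMS_APPLY_READER; the connection parameters are hypothetical.

# Hypothetical sketch: estimate the current apply latency on the replica database.
# The latency is the gap between when the last dequeued LCR was created at the
# source and when the apply reader dequeued it at the destination.
import cx_Oracle

conn = cx_Oracle.connect("strmadmin", "strmadmin_pwd", "lfc-replica-db")
cur = conn.cursor()
cur.execute("SELECT apply_name, "
            "       (dequeue_time - dequeued_message_create_time) * 86400 "
            "FROM v$streams_apply_reader")
for apply_name, latency_s in cur:
    # The LHCb requirement quoted above is a latency below 30 minutes.
    ok = latency_s is not None and latency_s < 1800
    print("%s: latency %s s -> %s"
          % (apply_name, latency_s, "within requirement" if ok else "CHECK"))
conn.close()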

11 LHCb LFC replication tests
Two different tests have been performed to evaluate
– the time latency between the master and the replicated database
– the performance of the LFC front-end for writing/deleting operations as a function of an increasing number of clients
Python scripts using LFC API functions add files and replicas to
– lfc-lhcb.cern.ch (master database)
– Tests performed with an increasing number of simultaneous writing and deleting clients: 10, 20, 40, 76
– For each number of clients (10, 20, 40, 76):
 Test I: 8K files with 10 replicas per file added (similar to LHCb usage)
 Test II: 16K files with 25 replicas per file added (beyond LHCb usage)
 The load is uniformly distributed over the clients
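For illustration only, the sketch below shows what one of these writing clients could look like, assuming the standard lfc Python bindings shipped with the LCG middleware (lfc_creatg to register a file by GUID, lfc_addreplica to attach a replica) and the LFC_HOST environment variable pointing to the master catalogue; the directory, Storage Element names and counts are placeholders, not the actual test scripts.

# Hypothetical sketch of a single writing client, assuming the LCG "lfc" Python
# bindings. Run with LFC_HOST=lfc-lhcb.cern.ch exported in the environment.
import uuid
import lfc

N_FILES = 800        # e.g. 8K files spread over 10 clients
N_REPLICAS = 10      # Test I: 10 replicas per file
BASE_DIR = "/grid/lhcb/streams-test/client-00"   # placeholder LFC directory

for i in range(N_FILES):
    path = "%s/file-%05d" % (BASE_DIR, i)
    guid = str(uuid.uuid4())
    # Register the logical file in the catalogue (each call generates LCRs at the master).
    if lfc.lfc_creatg(path, guid, 0o664) != 0:
        raise RuntimeError("lfc_creatg failed for %s" % path)
    for r in range(N_REPLICAS):
        se = "se%02d.example.org" % r            # placeholder Storage Element
        sfn = "srm://%s/lhcb/streams-test/file-%05d.%d" % (se, i, r)
        # Attach one replica entry per Storage Element ("-" = available, "P" = permanent).
        if lfc.lfc_addreplica(guid, None, se, sfn, "-", "P", "", "") != 0:
            raise RuntimeError("lfc_addreplica failed for %s" % sfn)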

12 Strmmon web monitoring tool
All the measurements and plots shown are taken from the Strmmon web monitoring tool set up by the 3D project
–http://itrac315.cern.ch:4889/streams/streams.php
– The tool plots on the web the Streams monitoring quantities previously stored in a dedicated repository database
– Very useful to measure
 the total LCR latency (time elapsed between the creation of the LCR at the master and its apply at the destination database)
 the LCR rate (captured, queued, dequeued, applied)

13 Test results [Test I]
Test I: add and delete 8K files (10 replicas for each file) with 10, 20, 40, 76 parallel clients (~90K entries)
– Two tables are used
 CNS_FILE_METADATA
 CNS_FILE_REPLICA
[plots: LCR rate vs time for add entries (10 clients, scale up to 240 LCR/s) and delete entries (10 clients, scale up to 650 LCR/s); apply latency vs time for delete entries (10 clients, scale 0–20 s)]

14 Test results [Test I]
[plots: LCR rate vs time for add entries (76 clients, scale up to 550 LCR/s) and delete entries (76 clients, scale up to 1500 LCR/s); apply latency vs time for add entries (76 clients, scale 0–16 s) and delete entries (76 clients, scale 0–20 s)]

15 Test results [Test I]
Linear growth of the LCR rate as a function of the number of writing and deleting clients
Latency stable (12–13 s), independent of the number of clients
[plots: average LCR/s vs number of clients and average latency vs number of clients, for add entries (■) and delete entries (▲)]

16 Test results [Test II]
Test II: add and delete 16K files (25 replicas for each file) with 10, 20, 40, 76 parallel clients (~560K entries)
– Adding more replicas per file increases the LCR rate
[plots: LCR rate vs time for delete entries (76 clients), Test I (scale up to 1500 LCR/s) compared with Test II (scale up to 2000 LCR/s)]

17 Test results [Test II]
Linear increase of the LCR rate as a function of the number of clients
– but with 76 clients the LFC front-end becomes the limiting factor
At a rate of 900 LCR/s the replication starts to accumulate latency
– increased I/O at the source database due to the growing activity
– the latency is nevertheless still far below the LHCb requirement
[plots: average LCR/s vs number of clients and average latency vs number of clients, for add entries (■) and delete entries (▲)]

18 Conclusion
High availability is a key issue for database services and is well addressed by the present Oracle technologies.
The 3D project and CNAF have successfully deployed these technologies, achieving good stability and reliability of the service.
Writing 125 entries/s and deleting 375 entries/s, the replication shows an average latency below 14 seconds.
Increasing the rate by a factor of 2, the average latency stays below 80 seconds
– the LHCb requirements are still met.
This work is in progress
– LFC replication does not present problems, as expected
– The conditions database will have larger volume and throughput requirements.

