LCG 3D Project Status and Production Plans Dirk Duellmann, CERN IT On behalf of the LCG 3D project https://lcg3d.cern.ch CHEP 2006, 15th February, Mumbai.

Related Talks
– LHCb conditions database framework [168, M. Clemencic]
– Database access in the ATLAS computing model [38, A. Vaniachine]
– Software for a variable ATLAS detector description [67, V. Tsulaia]
– Optimized access to distributed relational database systems [331, J. Hrivnac]
– COOL Development and Deployment: Status and Plans [337, A. Valassi]
– COOL performance and distribution tests [338, A. Valassi, poster]
– CORAL relational database access software [329, I. Papadopoulos]
– POOL object persistency into relational databases [330, G. Govi, poster]

Distributed Deployment of Databases (= 3D)
LCG today provides an infrastructure for distributed access to file-based data and for file replication. Physics applications (and grid services) require similar services for data held in relational databases:
– physics applications and grid services use RDBMS
– LCG sites already have experience in providing RDBMS services
Goals for a common project as part of LCG:
– increase the availability and scalability of LCG and experiment components
– allow applications to access data in a consistent, location-independent way
– allow existing database services to be connected via data replication mechanisms
– simplify shared deployment and administration of this infrastructure during 24x7 operation
Scope set by the LCG PEB: Online – Offline – Tier sites

LCG 3D Service Architecture
– Online DB: autonomous, reliable service
– T0: autonomous, reliable service
– T1 (database backbone): all data replicated, reliable service
– T2 (local database cache): subset of the data, local service only
Distribution technologies in the diagram: Oracle Streams, http cache (Squid), and cross-DB copy to MySQL/SQLite files.
Read-only access at Tier 1/2 (at least initially).

Building Block for Tier 0/1 – Oracle Database Clusters
– Two or more dual-CPU nodes
– Shared storage (e.g. FibreChannel SAN)
– CPU and I/O capacity scale (independently)
– Transparent failover and s/w patches
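As a minimal sketch (not part of the slides) of how clients typically reach such a cluster, an Oracle net alias can list the virtual IPs of both nodes with load balancing and connect-time failover enabled; the alias, host names and service name below are hypothetical.

    LCG_3D_PROD =
      (DESCRIPTION =
        (LOAD_BALANCE = on)
        (FAILOVER = on)
        (ADDRESS = (PROTOCOL = TCP)(HOST = rac-node1-vip.cern.ch)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = rac-node2-vip.cern.ch)(PORT = 1521))
        (CONNECT_DATA =
          (SERVICE_NAME = lcg3d_prod.cern.ch)
          (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC))
        )
      )

With such an entry a session can be redirected to the surviving node if one cluster member is taken down, which is what makes rolling interventions largely transparent to applications.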

How to keep Databases up to date? Asynchronous Replication via Oracle Streams
Changes at the source (e.g. "insert into emp values (03, 'Joan', ...)") are captured from the redo log as Logical Change Records (LCRs), propagated through queues, and applied at the destination databases.
[Diagram: capture -> propagation -> apply, replicating LCRs from CERN to the Tier 1 sites CNAF, RAL, Sinica, FNAL, IN2P3 and BNL]
(Slide: Eva Dafonte Perez)
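For illustration only (not taken from the talk), a heavily abridged sketch of what a schema-level Streams setup looks like with Oracle's DBMS_STREAMS_ADM package; the queue, schema, database and site names are hypothetical, and the full procedure (instantiation SCN, database links, apply-user privileges) is omitted.

    -- On the source database: create a capture queue and schema-level capture/propagation rules.
    BEGIN
      DBMS_STREAMS_ADM.SET_UP_QUEUE(
        queue_table => 'strmadmin.capture_qt',
        queue_name  => 'strmadmin.capture_q');

      DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
        schema_name  => 'COOL_PROD',            -- hypothetical conditions schema
        streams_type => 'capture',
        streams_name => 'capture_cool',
        queue_name   => 'strmadmin.capture_q',
        include_dml  => TRUE,
        include_ddl  => TRUE);

      DBMS_STREAMS_ADM.ADD_SCHEMA_PROPAGATION_RULES(
        schema_name            => 'COOL_PROD',
        streams_name           => 'prop_to_ral',
        source_queue_name      => 'strmadmin.capture_q',
        destination_queue_name => 'strmadmin.apply_q@RAL_DB',  -- db link to the Tier 1
        include_dml            => TRUE,
        include_ddl            => TRUE,
        source_database        => 'CERN_DB');
    END;
    /

    -- On the destination database: schema-level apply rules against the local apply queue.
    BEGIN
      DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
        schema_name     => 'COOL_PROD',
        streams_type    => 'apply',
        streams_name    => 'apply_cool',
        queue_name      => 'strmadmin.apply_q',
        include_dml     => TRUE,
        include_ddl     => TRUE,
        source_database => 'CERN_DB');
    END;
    /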

Further Decoupling between Databases – Downstream Capture
Redo log files are copied from the source database (the CERN RAC) to a separate downstream database; the capture processes run there, and propagation jobs feed the destination sites (e.g. CNAF, FNAL).
Objectives:
1. Remove the impact of capture from the Tier 0 database
2. Isolate destination sites from each other
– one capture process + queue pair per target site
– large Streams pool size
– redundant events (x number of queues)
(Slide: Eva Dafonte Perez)
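Again purely illustrative: on the downstream database a capture process is created for a remote source database, so that log mining happens away from the Tier 0 production cluster. Names are hypothetical and the redo log transport configuration is omitted.

    -- On the downstream database: capture changes of the remote source without loading it.
    BEGIN
      DBMS_CAPTURE_ADM.CREATE_CAPTURE(
        queue_name         => 'strmadmin.capture_q',
        capture_name       => 'downstream_capture_cool',
        source_database    => 'CERN_DB',       -- hypothetical global name of the source
        use_database_link  => TRUE,            -- administrative calls back to the source
        logfile_assignment => 'implicit');     -- redo shipped via log transport services
    END;
    /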

Offline FroNTier Resources/Deployment
Tier-0: 2-3 redundant FroNTier servers. Tier-1: 2-3 redundant Squid servers. Tier-N: 1-2 Squid servers.
Typical Squid server requirements:
– CPU/MEM/DISK/NIC = 1 GHz / 1 GB / 100 GB / Gbit
– Network: visible to the worker LAN (private network) and the WAN (internet)
– Firewall: two ports open, for URI (FroNTier Launchpad) access and for SNMP monitoring (typically 8000 and 3401 respectively)
Squid non-requirements:
– special hardware (although high-throughput disk I/O is good)
– cache backup (if a disk dies or is corrupted, start from scratch and reload automatically)
Squid is easy to install and requires little ongoing administration.
[Diagram: DB -> FroNTier Launchpad (Tomcat) at Tier 0 via JDBC, with Squid caches at Tier 0, Tier 1 and Tier N accessed over http]
(Slide: Lee Lueking)
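A minimal sketch (not from the slides) of what the two-port firewall requirement could look like with iptables; the port numbers follow the typical values quoted above, while interface and policy details are site-specific.

    # Allow FroNTier/Squid URI access and SNMP monitoring of the squid
    iptables -A INPUT -p tcp --dport 8000 -j ACCEPT   # URI access (FroNTier Launchpad / Squid)
    iptables -A INPUT -p udp --dport 3401 -j ACCEPT   # SNMP monitoring (squid's SNMP port)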

Test Status: 3D Testbed
Replication tests progressing well.
– Offline -> T1:
  COOL ATLAS: Stefan Stonjek (CERN, RAL, Oxford)
  COOL LHCb: Marco Clemencic (CERN, RAL, GridKA?)
  FroNTier CMS: Lee Lueking (CERN and several Tier 1/2 sites)
  ARDA AMGA: Birger Koblitz (CERN -> CERN)
  AMI: Solveig Albrandt (IN2P3 -> CERN, setting up)
– Online -> Offline:
  CMS conditions: Saima Iqbal (functional testing)
  ATLAS: Gancho Dimitrov (server setup, pit network)
  LHCb: planning with LHCb online
Coordination during the weekly 3D meetings.

LCG Database Deployment Plan
After the October '05 workshop, a database deployment plan was presented to the LCG GDB and MB – two production phases:
March - September '06: partial production service
– production service (parallel to the existing testbed)
– h/w requirements defined by experiments/projects
– based on Oracle 10gR2
– subset of LCG Tier 1 sites: ASCC, CERN, BNL, CNAF, GridKA, IN2P3, RAL
September '06 onwards: full production service
– adjusted h/w requirements (defined at the summer '06 workshop)
– remaining Tier 1 sites join: PIC, NIKHEF, NDGF, TRIUMF

Proposed Tier 1 Hardware Setup
Propose to set up for the first 6 months:
– 2-3 dual-CPU database nodes with 2 GB of memory or more
  set up as a RAC cluster (preferably) per experiment
  ATLAS: 3 nodes with 300 GB storage (after mirroring)
  LHCb: 2 nodes with 100 GB storage (after mirroring)
  shared storage (e.g. FibreChannel) proposed to allow for clustering
– 2-3 dual-CPU Squid nodes with 1 GB of memory or more
  Squid s/w packaged by CMS will be provided by 3D
  100 GB storage per node
  need to clarify service responsibility (DB or admin team?)
Target s/w release: Oracle 10gR2
– RedHat Enterprise Server to ensure Oracle support

DB Readiness Workshop last week
Readiness of the production services at T0/T1:
– status reports from Tier 0 and Tier 1 sites
– technical problems with the proposed setup (RAC clusters)?
Readiness of experiment (and grid) database applications:
– application list, code release, data model and deployment schedule
– successful validation at T0 (and, if required, at T1)?
Review of site/experiment milestones from the database project plan:
– (re-)align with other work plans, e.g. experiment challenges and SC4
Detailed presentations from experiments and sites: see the workshop agenda.

CERN Hardware Evolution for 2006
[Table: current database cluster allocation vs. proposed structure for 2006, per service – ALICE, ATLAS, CMS, LHCb, Grid/3D, non-LHC and validation: from today's 2-node offline/online-test clusters and disk-server pilots to 2-4-node production RACs (including a PDB replacement) plus validation/test and pilot clusters; COMPASS and online allocation still open]
Linear ramp-up budgeted for hardware resources in 2006.
Planning the next major service extension for Q3 this year.
(Slide: Maria Girone)

FroNTier Production Configuration at Tier 0
Squid runs in http-accelerator mode (as a reverse proxy server).
(Slide: Luis Ramos)
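As an illustration (not taken from the slide), a reverse-proxy squid configuration of this kind might look roughly as follows in Squid 2.6-style syntax; host names, ports and cache sizes are hypothetical.

    # Listen on the FroNTier access port in accelerator (reverse proxy) mode
    http_port 8000 accel defaultsite=frontier.cern.ch

    # Forward cache misses to the FroNTier launchpad (Tomcat) as the origin server
    cache_peer frontier-launchpad.cern.ch parent 8080 0 no-query originserver name=launchpad

    # In-memory and on-disk cache sizes (tune towards the ~100 GB disk recommendation above)
    cache_mem 1024 MB
    cache_dir ufs /var/spool/squid 80000 16 256

    # Allow clients to reach the accelerated site only
    acl frontier_site dstdomain frontier.cern.ch
    http_access allow frontier_site
    http_access deny all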

Tier 1 Progress
Sites largely on schedule for a service start at the end of March:
– h/w either installed already (BNL, CNAF, IN2P3) or delivery of the order expected shortly (GridKA, RAL)
– some problems with Oracle cluster technology encountered – and solved!
– active participation from sites – a DBA community is building up
First DBA meeting, focusing on RAC installation, setup and monitoring, hosted by Rutherford and scheduled for the second half of March.
Need to involve the remaining Tier 1 sites now:
– establishing contact with PIC, NIKHEF, NDGF and TRIUMF so they can follow workshops and meetings

LCG Application s/w Status
Finished a major step towards distributed deployment:
– added common and configurable handling of server lookup, connection retry, failover and client-side monitoring via CORAL
– COOL and POOL have released versions based on the new CORAL package [talks by I. Papadopoulos and A. Valassi]
– FroNTier has been added as a plug-in to CORAL
CMS is working on a FroNTier caching policy; FroNTier applications need to implement this policy to avoid lookups of stale cached data.
LCG persistency framework s/w expected to be stable by the end of February for distributed deployment as part of SC4 or the experiment challenges.
Caveat: the experiment conditions data models may stabilise only later -> possible deployment issues.
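To illustrate the location-independent lookup that CORAL provides, here is an illustrative sketch (not the project's actual configuration): applications open a logical service name, and a local XML lookup file maps it to an ordered list of physical replicas (Oracle, FroNTier, SQLite file) that are tried in turn. The schema, server and file names below are hypothetical.

    <!-- dblookup.xml: logical service name mapped to replica candidates, tried in order -->
    <servicelist>
      <logicalservice name="COOL_CONDITIONS">
        <!-- primary: read-only account on the Tier 0/1 Oracle RAC -->
        <service name="oracle://lcg3d_prod.cern.ch/COOL_PROD" accessMode="read" authentication="password" />
        <!-- fallback: cached access through the local FroNTier/Squid -->
        <service name="frontier://frontier.cern.ch:8000/COOL_PROD" accessMode="read" />
        <!-- last resort: local SQLite snapshot file -->
        <service name="sqlite_file:conditions_snapshot.db" accessMode="read" />
      </logicalservice>
    </servicelist>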

Open Issues
Support for X.509 (proxy) certificates by Oracle?
– may need to study possible fallback solutions
Server and support licenses for Tier 1 sites.
Instant Client distribution within LCG.
In discussion with Oracle via the commercial contact at CERN.

Databases in Middleware & CASTOR
– already in place for the services used in SC3
– existing setups at the sites
– existing experience with SC workloads -> extrapolate to real production
LFC, FTS - Tier 0 and above:
– low volume, but high availability requirements
– CERN: run on a 2-node Oracle cluster; outside CERN on a single-box Oracle or MySQL server
CASTOR 2 - CERN and some Tier 1 sites:
– need to understand scaling up to LHC production rates
Currently not driving the requirements for the database service.
Need to consolidate database configurations and procedures:
– may reduce effort/diversity at CERN and the Tier 1 sites

Experiment Applications
Conditions data - driving the database service size at T0 and T1.
– Event TAGs may become significant - need replication tests and concrete experiment deployment models
Framework integration and DB workload generators exist:
– successfully tested in various COOL and POOL/FroNTier tests
– T0 performance and replication tests (T0 -> T1) look OK
Conditions: online -> offline replication only starting now
– may need additional emphasis on online tests to avoid surprises
– CMS and ATLAS are executing online test plans
Progress in defining concrete conditions data models:
– CMS showed the most complete picture (for the Magnet Test)
– still quite some uncertainty about data volumes and numbers of clients

Summary
Database deployment architecture defined:
– Streams-connected database clusters for online and Tier 0 (ATLAS, CMS, LHCb)
– Streams-connected database clusters for Tier 1 (ATLAS, LHCb)
– FroNTier/Squid distribution for Tier 1/Tier 2 (CMS)
– file snapshots (SQLite/MySQL) via CORAL/Octopus (ATLAS, CMS)
Database production service and schedule defined.
Setup proceeding well at Tier 0 and Tier 1 sites:
– a start at the end of March seems achievable for most sites
Application performance tests progressing:
– first larger-scale conditions replication tests with promising results for the Streams and FroNTier technologies
– concrete conditions data models still missing for key detectors

Conclusions
There is little reason to believe that a distributed database service will move into stable production any quicker than any of the other grid services.
We should start now to ramp up to larger-scale production operation, to resolve the unavoidable deployment issues.
We need the cooperation of experiments and sites to make sure that concrete requests can be quickly validated against a concrete distributed service.