Jean-Philippe Baud, IT-GD, CERN November 2007


LFC and DPM

Reliability Workshop: LFC-DPM

Agenda
- Goals for LFC and DPM
- DPM architecture
- Simple design
- Good coding practices
- Secure services
- Testing
- Operations

Goals for LFC and DPM
- LFC: LCG File Catalogue
  - Replaces the EDG RLS
  - Provides a hierarchical name space, access control lists, sessions and transactions
- DPM: Disk Pool Manager
  - Provides a scalable replacement for the Classic Storage Elements at Tier-2s
  - Focus on manageability:
    - Easy to install
    - Easy to configure
    - Low effort for ongoing maintenance
    - Easy to add/remove resources
  - Integrated security (authentication/authorization)

DPM architecture
- Clients: CLI, C API, SRM-enabled clients, etc.
- DPM head node:
  - DPM Name Server (DPNS): namespace, authorization, location of physical files
  - DPM Server: request queuing and processing, space management
  - SRM servers (v1.1, v2.1, v2.2)
- DPMCOPY node
- DPM disk servers: hold the physical files
- Data is transferred directly from/to the disk servers (no bottleneck at the head node)

DPM architecture (head node)
- DPM daemon (server side): request scheduler, space manager, persistency; asynchronous requests go through the database
- SRM v1 and v2 daemons: control interface for SRM clients; insert/select data to/from the DPM tables
- DPNS daemon: the maestro of metadata; insert/select data to/from the DPNS tables
- Database backend: holds both the DPM tables and the DPNS tables
- DPM clients (lcg-util/gfal): authenticate and issue synchronous requests; control and data paths are separate
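The "requests flow through the database" pattern above (an SRM daemon queues an asynchronous request as a row, and the DPM daemon later claims and processes it) can be sketched with a tiny SQLite queue. This is an invented, single-threaded illustration: the table, column names and statuses are not the real DPM schema.

```python
import sqlite3

# In-memory database standing in for the shared DPM backend.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE dpm_req (
                    r_token TEXT PRIMARY KEY,
                    surl    TEXT,
                    status  TEXT)""")

def queue_request(token, surl):
    """Called by the SRM daemon: persist the request, return immediately."""
    with conn:  # transaction commits on exit
        conn.execute("INSERT INTO dpm_req VALUES (?, ?, 'QUEUED')",
                     (token, surl))

def process_next():
    """Called by the DPM daemon: claim one queued request and mark it done."""
    row = conn.execute(
        "SELECT r_token FROM dpm_req WHERE status = 'QUEUED' LIMIT 1").fetchone()
    if row is None:
        return None
    with conn:
        conn.execute("UPDATE dpm_req SET status = 'DONE' WHERE r_token = ?", row)
    return row[0]

queue_request("req-1", "srm://example.org/dpm/file1")
print(process_next())   # -> req-1
```

Because the queue lives in the database rather than in daemon memory, either side can be restarted without losing requests, which is what makes the daemons stateless.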

Simple design (1)
- The DPM architecture is database centric:
  - Only 2 databases
  - Fairly simple schema
  - No complex queries (mostly key access)
  - Use of bind variables, indices, transactions and integrity constraints
  - Automatic reconnection to the database (allows transparent failover when using Oracle)
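The automatic-reconnection point above can be sketched as a wrapper that routes every query through a retry loop, reopening the connection on failure so a backend failover stays invisible to the caller. This is only an illustration of the idea: the class, retry policy and SQLite backend are invented stand-ins, not the actual DPM database layer (which talks to Oracle, as noted above).

```python
import sqlite3
import time

class ReconnectingDB:
    """Hypothetical wrapper: retry a query after reopening the connection."""
    def __init__(self, path, retries=3, delay=0.0):
        self.path, self.retries, self.delay = path, retries, delay
        self.conn = sqlite3.connect(path)

    def execute(self, sql, params=()):
        for _ in range(self.retries):
            try:
                # Bind variables (?) instead of string pasting, as on the slide.
                return self.conn.execute(sql, params)
            except sqlite3.OperationalError:
                time.sleep(self.delay)               # back off, then reconnect
                self.conn = sqlite3.connect(self.path)
        raise RuntimeError("database unavailable after %d attempts" % self.retries)

db = ReconnectingDB(":memory:")
db.execute("CREATE TABLE t (k INTEGER PRIMARY KEY)")
db.execute("INSERT INTO t VALUES (?)", (1,))
print(db.execute("SELECT k FROM t").fetchone()[0])   # -> 1
```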

Simple design (2)
- Few daemons:
  - Mainly communicate through the database
  - Stateless: configuration is kept in the database, so a given daemon can be restarted on a different server
- Scalability and high availability:
  - All servers (except the DPM one) can be replicated if needed (DNS load balancing)
  - Daemons can be restarted independently
  - Automatic retries in clients
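The client-side half of this availability story (DNS load balancing plus automatic retries) can be sketched as a call that walks a list of replicated front-ends until one answers. The host names and the flaky `rpc` stand-in are invented for the example; they are not the real LFC/DPM client API.

```python
# Hosts behind one logical alias, as DNS load balancing would provide.
REPLICAS = ["lfc01.example.org", "lfc02.example.org", "lfc03.example.org"]

def rpc(host, op):
    """Stand-in for a real client call; the first replica is down."""
    if host == "lfc01.example.org":
        raise ConnectionError("no route to host")
    return "%s ok on %s" % (op, host)

def call_with_retries(op, hosts=REPLICAS):
    """Try each replica in turn; raise only if all of them fail."""
    last = None
    for host in hosts:
        try:
            return rpc(host, op)
        except ConnectionError as exc:
            last = exc           # remember the failure, try the next replica
    raise last

print(call_with_retries("lfc_stat"))   # -> lfc_stat ok on lfc02.example.org
```

Because the daemons are stateless, any replica can serve the retried call, so a single front-end failure never surfaces to the user.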

Good coding practices
- For long-term maintainability of the code:
  - Portable code (compiled and tested on several platforms)
  - Modular code with enough comments
  - Protection against buffer overruns
  - Validity checks on all parameters
  - Checks for memory leaks
  - Mutexes avoided in multi-threaded code for performance reasons (requires careful design)
  - Code profiling
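As a small illustration of the "check validity of parameters" rule: validate input at the service boundary, before it ever reaches the database layer, instead of trusting the caller. The function, limits and error messages here are invented for the example; the real services enforce their own rules in C.

```python
MAX_PATH = 1024   # illustrative limit, not the real DPM/LFC constant

def validate_lfn(lfn):
    """Return the logical file name if acceptable, raise ValueError otherwise."""
    if not isinstance(lfn, str):
        raise ValueError("LFN must be a string")
    if not lfn.startswith("/"):
        raise ValueError("LFN must be an absolute path")
    if len(lfn) > MAX_PATH:
        raise ValueError("LFN longer than %d characters" % MAX_PATH)
    if "\0" in lfn:
        raise ValueError("LFN contains an embedded NUL")
    return lfn

print(validate_lfn("/grid/atlas/user/file.root"))
```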

Security
- All control and I/O services have security built in (GSI)
- Entries in the name space can be protected by POSIX Access Control Lists
- Privileged operations require a host certificate on a trusted host
- VOMS integration: groups, sub-groups and roles are supported
- DNs and VOMS FQANs are mapped to virtual ids (no pool accounts)
- All the groups present in the proxy are used for authorization in the namespace
- Only the primary group/role is used in disk pool selection
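The virtual-id idea above can be sketched as follows: each distinct DN (and each VOMS FQAN) is assigned a small integer on first sight and remembered, so no Unix pool accounts are needed. The dictionaries here stand in for the name-server tables; the allocation scheme is a simplification, not the actual DPNS implementation.

```python
# In-memory stand-ins for the persistent DN -> uid and FQAN -> gid tables.
_user_ids, _group_ids = {}, {}

def virtual_uid(dn):
    """Return a stable virtual uid for a certificate DN."""
    return _user_ids.setdefault(dn, len(_user_ids) + 1)

def virtual_gids(fqans):
    """Map every VOMS FQAN in the proxy to a virtual gid; all of them
    are used for namespace authorization (per the slide above)."""
    return [_group_ids.setdefault(f, len(_group_ids) + 1) for f in fqans]

uid = virtual_uid("/DC=ch/DC=cern/CN=Jane Doe")
gids = virtual_gids(["/atlas", "/atlas/Role=production"])
print(uid, gids)   # -> 1 [1, 2]
```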

Testing
- Unit tests:
  - Tests of new features
  - Tests after bug fixes
- Functional tests:
  - Full test suite
  - Interoperability testing (SRM)
- Stress tests:
  - Find the limits of the system
  - Discover timing and corner-case problems
- Pilot service (LFC only):
  - Test of bulk methods by ATLAS
  - Test of the new permission and ownership scheme by LHCb

Operations
- Common logging format with timestamps and user identity
- LFC upgrades are transparent if there is no database schema change and two front-ends are used
- Database schema updates are limited to about one per year
- The LFC and DPM databases do not need to run on the same machine as the front-end server
- Monitoring scripts (LFC): number of threads, response time, database errors
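In the spirit of the monitoring scripts above, a log scan can extract two of the listed metrics (database errors and response time) from the common logging format. The log lines and field names below are invented for the example; they do not reproduce the actual LFC log format.

```python
import re

# Fabricated sample in a timestamp + user-identity style log.
LOG = """\
2007-11-05 10:00:01 dn=/CN=Jane lfc_stat rc=0 ms=12
2007-11-05 10:00:02 dn=/CN=Bob lfc_mkdir rc=0 ms=30
2007-11-05 10:00:03 dn=/CN=Jane lfc_stat DB_ERROR ms=500
"""

def scan(log):
    """Count DB errors and average the logged per-request times."""
    db_errors, times = 0, []
    for line in log.splitlines():
        if "DB_ERROR" in line:
            db_errors += 1
        m = re.search(r"ms=(\d+)", line)
        if m:
            times.append(int(m.group(1)))
    return db_errors, sum(times) / len(times)

errors, avg_ms = scan(LOG)
print(errors, avg_ms)
```

A script like this, run periodically against the front-end logs, is enough to alarm on a rising error count or response time without touching the service itself.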

Conclusion
- LFC and DPM have become very popular: more than 100 sites are using them for many VOs
- The simple and robust design allows external site support with less than one FTE at CERN
- Documentation: https://twiki.cern.ch/twiki/bin/view/LCG/DataManagementDocumentation
  - Reference man pages
  - Admin guide
  - Troubleshooting