POOL File Catalog: Design & Status

Slides:



Advertisements
Similar presentations
DataTAG WP4 Meeting CNAF Jan 14, 2003 Interfacing AliEn and EDG 1/13 Stefano Bagnasco, INFN Torino Interfacing AliEn to EDG Stefano Bagnasco, INFN Torino.
Advertisements

21 Sep 2005LCG's R-GMA Applications R-GMA and LCG Steve Fisher & Antony Wilson.
Data Management Expert Panel - WP2. WP2 Overview.
1 CHEP 2000, Roberto Barbera Tests of data management services in EDG 1.2 ALICE Off-line Week,
Futures – Alpha Cloud Deployment and Application Management.
RLS Production Services Maria Girone PPARC-LCG, CERN LCG-POOL and IT-DB Physics Services 10 th GridPP Meeting, CERN, 3 rd June What is the RLS -
D. Düllmann - IT/DB LCG - POOL Project1 POOL Release Plan for 2003 Dirk Düllmann LCG Application Area Meeting, 5 th March 2003.
N° 1 LCG EDG Data Management Catalogs in LCG James Casey LCG Fellow, IT-DB Group, CERN
MCTS Guide to Microsoft Windows Server 2008 Network Infrastructure Configuration Chapter 8 Introduction to Printers in a Windows Server 2008 Network.
Magda – Manager for grid-based data Wensheng Deng Physics Applications Software group Brookhaven National Laboratory.
 2004 Prentice Hall, Inc. All rights reserved. Chapter 25 – Perl and CGI (Common Gateway Interface) Outline 25.1 Introduction 25.2 Perl 25.3 String Processing.
GRID job tracking and monitoring Dmitry Rogozin Laboratory of Particle Physics, JINR 07/08/ /09/2006.
LSC Segment Database Duncan Brown Caltech LIGO-G Z.
Data Management Kelly Clynes Caitlin Minteer. Agenda Globus Toolkit Basic Data Management Systems Overview of Data Management Data Movement Grid FTP Reliable.
Don Quijote Data Management for the ATLAS Automatic Production System Miguel Branco – CERN ATC
VOX Project Status T. Levshina. Talk Overview VOX Status –Registration –Globus callouts/Plug-ins –LRAS –SAZ Collaboration with VOMS EDG team Preparation.
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES Data Replication Service Sandeep Chandra GEON Systems Group San Diego Supercomputer Center.
DIRAC Review (13 th December 2005)Stuart K. Paterson1 DIRAC Review Exposing DIRAC Functionality.
MAGDA Roger Jones UCL 16 th December RWL Jones, Lancaster University MAGDA  Main authors: Wensheng Deng, Torre Wenaus Wensheng DengTorre WenausWensheng.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE middleware: gLite Data Management EGEE Tutorial 23rd APAN Meeting, Manila Jan.
Replica Management Services in the European DataGrid Project Work Package 2 European DataGrid.
David Adams ATLAS DIAL/ADA JDL and catalogs David Adams BNL December 4, 2003 ATLAS software workshop Production session CERN.
1 Web Servers (Chapter 21 – Pages( ) Outline 21.1 Introduction 21.2 HTTP Request Types 21.3 System Architecture.
1 LHCb File Transfer framework N. Brook, Ph. Charpentier, A.Tsaregorodtsev LCG Storage Management Workshop, 6 April 2005, CERN.
Developing Applications with the CSI Framework A General Guide.
Shell Interface Shell Interface Functions Data. Graphical Interface Graphical Interface Command-line Interface Command-line Interface Experiments Private.
Database authentication in CORAL and COOL Database authentication in CORAL and COOL Giacomo Govi Giacomo Govi CERN IT/PSS CERN IT/PSS On behalf of the.
D. Duellmann - IT/DB LCG - POOL Project1 The LCG Pool Project and ROOT I/O Dirk Duellmann What is Pool? Component Breakdown Status and Plans.
LCG Distributed Databases Deployment – Kickoff Workshop Dec Database Lookup Service Kuba Zajączkowski Chi-Wei Wang.
Hyperion Artifact Life Cycle Management Agenda  Overview  Demo  Tips & Tricks  Takeaways  Queries.
Data Management The European DataGrid Project Team
Some Notes on Logical File Names and Related Interfaces David Malon ATLAS Database Group LHC Persistence Workshop 5 June 2002.
DDM Central Catalogs and Central Database Pedro Salgado.
Developing GRID Applications GRACE Project
International Planetary Data Alliance Registry Development and Coordination Project Report 7 th IPDA Steering Committee Meeting July 13, 2012.
Magda Distributed Data Manager Torre Wenaus BNL October 2001.
Understand Names Resolution
Architecture Review 10/11/2004
Persistency Project (POOL) Status and Work Plan Proposal
GFAL Grid File Access Library
gLite Basic APIs Christos Filippidis
Real Time Fake Analysis at PIC
Oxana Smirnova, Jakob Nielsen (Lund University/CERN)
(on behalf of the POOL team)
z/Ware 2.0 Technical Overview
Data Bridge Solving diverse data access in scientific applications
Spitfire Overview Gavin McCance.
Marc-Elian Bégin ETICS Project, CERN
More Interfaces Workplan
Open Source distributed document DB for an enterprise
BOSS: the CMS interface for job summission, monitoring and bookkeeping
BOSS: the CMS interface for job summission, monitoring and bookkeeping
Middleware independent Information Service
POOL: Component Overview and use of the File Catalog
CRAB and local batch submission
BOSS: the CMS interface for job summission, monitoring and bookkeeping
POOL persistency framework for LHC
Dirk Düllmann CERN Openlab storage workshop 17th March 2003
First Internal Pool Release 0.1
DUCKS – Distributed User-mode Chirp-Knowledgeable Server
A Web-Based Data Grid Chip Watson, Ian Bird, Jie Chen,
Grid Data Integration In the CMS Experiment
Grid Data Replication Kurt Stockinger Scientific Data Management Group Lawrence Berkeley National Laboratory.
EGEE Middleware: gLite Information Systems (IS)
Architecture of the gLite Data Management System
Integrating SRB with the GIGGLE framework
POOL Status & Release Plan for V0.4
MIS2502: Data Analytics MySQL and MySQL Workbench
CORBA Programming B.Ramamurthy Chapter 3 5/2/2019.
Production Manager Tools (New Architecture)
Presentation transcript:

POOL File Catalog: Design & Status Zhen Xie CMS-Princeton/CERN 11/09/02 Zhen Xie

Outline Component overview Design Status Requirements Interface design Use cases Database schema Status Infrastructure Implementation Performance test (update by Maria) 11/09/02 Zhen Xie

Component Overview Goal: maintain mapping between FileID (e.g. GUID) and PFNs for file based persistency; Text/XML,MySQL,EDG-RLS based implementations are forseen for the on/off the grid production at different scales; A very simple design to start with. First internal release by the end of September. 11/09/02 Zhen Xie

Requirements Unique and immutable FileID, memorisable text string LFN ; Provide PFN-FileID,LFN-PFN mappings; Register files before,after or in the job; Available on and off the grid; Can append to the same file from different jobs; Can clean up registration when job crashes. 11/09/02 Zhen Xie

Design One common public interface IFileCatalog for two clients: POOL storage manager Command line tools for user Three implementations for three scopes of usage Trivial text/XML catalog : local laptop user Native MySQL catalog : production farm EDG-RLS catalog : virtual organization Insulated interface and implementation 11/09/02 Zhen Xie

Use Case 1: Register File 11/09/02 Zhen Xie

Off the Grid: the first PFN is returned Use Case 2: Lookup File Off the Grid: the first PFN is returned 11/09/02 Zhen Xie

Use Case 3: Isolated Production Production manager pre-register PFNs into MySQL catalog; Production runs and the catalog is populated; Production ends. Production manager cleans up the catalog deleting file registration by unsuccessful jobs; Production manager assigns LFNs to file; Production manager publishes MySQL catalog fragment to the grid based catalog. 11/09/02 Zhen Xie

Use Case 4: Jobs on Disconnected Laptop User extracts the catalog fragmentation to be used in the job from a grid based catalog into a local text/XML catalog; One job starts and ends successfully. Output PFNs are registered into another text/XML catalog; User publish selected text/XML catalogs into the grid based catalog. N jobs 11/09/02 Zhen Xie

Database Schema 11/09/02 Zhen Xie

Status: Infrastructure cvs directories created pool/FileCatalog External library /afs/cern.ch/sw/lcg/external/mysql++ Component documentation: up-to-date Work package web site: up-to-date http://lcgapp.cern.ch/project/persist/catalog 1 MySQL server, 1 RLS server, 10 client nodes 11/09/02 Zhen Xie

Status: Implementation Public interface Pure virtual IFilecatalog, IFileCatalogConnection, IFileCatalogFactory MySQL catalog implementation MySQLFileCatalog, MySQLFileCatalogConnection, MySQLFileCatalogFactory MySQL catalog can connect to MySQL server 11/09/02 Zhen Xie

Status: Performance Test (Updated by Maria Girone) Native MySQL catalog performance. Single process from single client via network: ~4ms/insert (autocommit on) see fig.1. Concurrent access: ~1.5 ms/insert (autocommit on) see fig.1. <1 ms/insert (autocommit off) see fig.2. EDG-RLS catalog performance. Always on autocommit mode. Single process from single client via network: ~2ms/insert. Concurrent access: server crash. 11/09/02 Zhen Xie

11/09/02 Zhen Xie

11/09/02 Zhen Xie