GRID DATA MANAGEMENT PILOT (GDMP)
Asad Samar (Caltech)
ACAT 2000, Fermilab, October 16-20, 2000


Introduction (outline)
– Mission Statement
– Use case (CMS) requirements
– Data Model
– Architecture and middleware
– Integration into the CMS environment
– Deliverables, milestones and status
– Performance results
– Conclusions

Mission Statement
– A prototype project for the data management modules of other projects such as DataGrid, PPDG and GriPhyN.
– Can be used to test new ideas, strategies and tools before they are adopted more widely.
– A project with low inertia; the team is open to everyone for collaboration.
– CMS is the first use case, so the implementation is currently CMS-specific. Discussions with BaBar, ROOT and LHCb are in progress.

Use Case (CMS) Requirements
Data Management Requirements:
– Data production (currently tens of Terabytes).
– Managing the data locally at regional centers.
– Replicating this data to other centers:
  - High-speed transfers.
  - Secure access.
  - Disk management facilities.
  - Minimizing human interference.
  - Fault tolerance and error recovery mechanisms.
– Data integration on the destination.
– Logging and book-keeping.
[Diagram: jobs are submitted at Sites A, B and C and executed locally or remotely; a job always writes its data locally, and the data is then replicated to the remote sites.]
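The replication requirements above (fault tolerance, error recovery, logging, minimal human interference) can be sketched as a retrying transfer loop. This is an illustrative sketch only: the function names, retry constant and local-copy stand-in are invented here, not GDMP's actual API, and the real system uses secure high-speed transfers (GridFTP) rather than a local copy.

```python
# Illustrative sketch of a fault-tolerant replication step with retries
# and book-keeping via logs. transfer() and MAX_RETRIES are hypothetical
# names; shutil.copy2 stands in for the real secure wide-area transfer.
import logging
import shutil
import time
from pathlib import Path

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("replicate")

MAX_RETRIES = 3

def transfer(src: Path, dst: Path) -> None:
    """Stand-in for the secure high-speed transfer used in practice."""
    shutil.copy2(src, dst)

def replicate(src: Path, dst_dir: Path) -> bool:
    """Replicate one file with retries; log every attempt."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / src.name
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            transfer(src, dst)
            log.info("replicated %s -> %s (attempt %d)", src, dst, attempt)
            return True
        except OSError as exc:
            log.warning("attempt %d failed: %s", attempt, exc)
            time.sleep(1)  # back off before retrying
    return False  # give up after MAX_RETRIES; leave recovery to the operator
```

The point of the sketch is that every transfer is retried and logged, so long-running productions do not need a human watching each file.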

Data Model
Assumptions:
– The replicated files are Objectivity files only (a current restriction, to be removed by December).
– All participating sites have their own Objectivity federations:
  - All sites are schema-compatible.
  - Database file names are globally unique.
  - Database ids are globally unique.
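The global-uniqueness assumptions can be made concrete with a small consistency check run before attaching replicated files to a local federation. The dict-of-name-to-id layout and the function name below are illustrative assumptions, not Objectivity's actual schema or API.

```python
# Hypothetical check of the data-model assumptions: file names and
# database ids must be globally unique, so an incoming file may neither
# reuse an existing name with a different id nor reuse an existing id
# under a new name.

def find_conflicts(local: dict[str, int], incoming: dict[str, int]) -> list[str]:
    """Return incoming file names that violate global uniqueness
    against the local federation; empty list means safe to attach."""
    local_ids = set(local.values())
    conflicts = []
    for name, dbid in incoming.items():
        if name in local and local[name] != dbid:
            conflicts.append(name)  # same name, different database id
        elif name not in local and dbid in local_ids:
            conflicts.append(name)  # new name reusing an existing id
    return conflicts
```

A site would run such a check on its import catalog and refuse (or rename) conflicting files instead of silently corrupting the federation.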

…Data Model
Subscription Model:
– All the sites that subscribe to a particular site get notified whenever there is an update in its catalog.
– The sites that don't subscribe have to poll themselves for any changes in the catalog.
– Both a push and a pull mechanism are supported.
[Diagram: sites keep subscriber lists; Site 2 subscribes to Site 1's export/import catalog, while Site 3 polls it for changes.]
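The push/pull distinction above can be sketched in a few lines: subscribed sites are notified on publish, unsubscribed sites must poll. The class and method names are invented for illustration and do not correspond to GDMP's real interfaces.

```python
# Minimal sketch of the subscription model: publish() pushes catalog
# updates to subscribers, poll() lets an unsubscribed site pull any
# entries it has not yet seen.

class CatalogSite:
    def __init__(self, name: str):
        self.name = name
        self.subscribers: list["CatalogSite"] = []  # the subscriber list
        self.export_catalog: list[str] = []
        self.inbox: list[str] = []  # notifications received via push

    def subscribe(self, other: "CatalogSite") -> None:
        """Register with another site to be notified of its updates."""
        other.subscribers.append(self)

    def publish(self, filename: str) -> None:
        """Add a file to the export catalog and push to all subscribers."""
        self.export_catalog.append(filename)
        for sub in self.subscribers:
            sub.inbox.append(filename)

    def poll(self, other: "CatalogSite") -> list[str]:
        """Pull: ask another site for catalog entries we have not seen."""
        return [f for f in other.export_catalog if f not in self.inbox]
```

Push keeps subscribed sites current with no polling traffic; pull lets a loosely coupled site catch up on its own schedule.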

…Data Model
Catalog Model:
– Export Catalog: contains information about the new files produced.
– Import Catalog: contains the information about the files which have been published by other sites but not yet transferred locally. As soon as a file is transferred locally, it is removed from the import catalog.
– It is possible to pull the information about new files into your import catalog.
[Diagram: Site 2 publishes new files in its export catalog (1); Sites 1 and 3 get the information about the new files into their import catalogs, transfer the files (2), and delete the entries (3).]
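The export/import catalog cycle just described (publish, build the import catalog, transfer and delete) can be sketched as three small functions. This is a toy model: catalogs are plain lists, the "transfer" is simulated by adding the name to a local set, and all names are invented.

```python
# Sketch of the export/import catalog cycle: a producing site publishes
# new files; a consuming site's import catalog is the set of published
# files it does not yet hold; each entry is removed as soon as the file
# has been transferred locally.

def publish(export_catalog: list[str], new_files: list[str]) -> None:
    """Step 1: record newly produced files in the export catalog."""
    export_catalog.extend(f for f in new_files if f not in export_catalog)

def build_import_catalog(export_catalog: list[str],
                         local_files: set[str]) -> list[str]:
    """Step 2: import catalog = published files not yet held locally."""
    return [f for f in export_catalog if f not in local_files]

def transfer_all(import_catalog: list[str], local_files: set[str]) -> None:
    """Step 3: transfer each file, then drop it from the import catalog."""
    while import_catalog:
        f = import_catalog.pop(0)
        local_files.add(f)  # stands in for the actual file transfer
```

Because entries leave the import catalog only after a successful transfer, the catalog doubles as a restart point after a failure.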

Architecture and Middleware
– Layered architecture: modular, flexible, extensible, re-usable.
– Globus solves most middleware problems.
[Diagram: layered architecture for distributed data management. An Application sits on a Request Manager, which drives the Replica Manager, Information Service, Security, Control Communication, Data Mover and DB Manager layers; these build on GIS, the Globus Replica Manager, GSSAPI, Globus-ftp, globus_io, globus-dc, Globus threads and the Objectivity API.]

Integration into the CMS Environment
[Diagram: the CMS/GDMP interface connects the CMS environment to the GDMP system across the WAN. At Site A, the physics software writes database files into the production federation; a CheckDB script runs a DB completeness check, copies files to the MSS, purges files, updates the catalog and generates a new catalog, which is published to the GDMP export catalog. The GDMP server, driven by the subscriber's list, generates the import catalog and replicates the files to Site B, where they are transferred and attached to the user federation and its catalog; Stage & Purge scripts copy files to the MSS, optionally stage them, and purge them.]

Performance Results
Comparison with plain manual FTPs.
[Figure: measured transfer rates (in kBytes/sec) between test machines at CERN (pccit1, cmsb21, cmsun1), FNAL (suncms66) and Caltech (jasper): 790 KB/sec, 1152 KB/sec, 802 KB/sec and 590 KB/sec on the links shown.]
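As a back-of-envelope illustration (this calculation is ours, not a figure from the talk): at per-link rates of roughly 600-1200 KB/sec, replicating the "tens of Terabytes" of a CMS production over a single stream takes months, which is why automated, restartable replication matters.

```python
# Rough time-to-replicate estimate at a single-stream transfer rate.
# Illustrative only; dataset size and rate are taken from the orders of
# magnitude quoted elsewhere in the talk.

def transfer_days(dataset_tb: float, rate_kb_per_s: float) -> float:
    """Days to move dataset_tb terabytes at rate_kb_per_s kilobytes/sec."""
    kbytes = dataset_tb * 1024**3          # TB -> KB (binary units)
    seconds = kbytes / rate_kb_per_s
    return seconds / 86400                 # seconds -> days
```

At the best measured rate above (1152 KB/sec), a 10 TB dataset needs on the order of a hundred days on one stream.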

Deliverables and Milestones
– First Prototype: released in September 2000.
– Second Prototype (Jan 2001):
  - Updated Replica Manager (Globus Replica Catalog).
  - File transfer updates (GridFTP libraries).
– Final Prototype (Jun 2001):
  - Information services: network monitoring (using NWS), data server loads.
  - Replica selection.

Conclusions
– A prototype project for other Data Grid projects.
– A production system for CMS.
– The design is flexible enough to incorporate extensions and/or modifications.
– Automates the replication process.
– Ready to be used… download from