ATLAS Magda Distributed Data Manager Torre Wenaus BNL PPDG Robust File Replication Meeting Jefferson Lab January 10, 2002.

JLAB, Jan 2002 Torre Wenaus, BNL PPDG ATLAS 2

- MAnager for Grid-based DAta
- Focused on the principal PPDG year 1 deliverable
  - Deploy to users a production distributed data management service
- Designed for rapid development of components to support users quickly, with components later replaced by Grid Toolkit elements
  - Deploy as an evolving production tool and as a testing ground for Grid Toolkit components
- Designed for ‘managed production’ and ‘chaotic end-user’ usage
- Adopted by ATLAS for 2002 ATLAS Data Challenges
- Developers: T. Wenaus and soon W. Deng (pdoc) and new hire
- Magda Info:
- The system:

Architecture & Schema

- MySQL database at the core of the system
- DB interaction via Perl, C++, Java, and CGI (Perl) scripts
- C++ and Java APIs autogenerated from the MySQL DB schema
- User interaction via web interface and command line
- Principal components:
  - File catalog covering any file types
  - Data repositories organized into sites, each with its locations
  - Computers with repository access: a host can access a set of sites
  - Logical files can optionally be organized into collections
  - Replication operations organized into tasks
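To make the relationships among sites, locations, hosts, and logical files concrete, here is a minimal relational sketch. It uses Python's built-in sqlite3 module in place of MySQL so that it runs standalone. Every table name, column name, and sample value is hypothetical and is not taken from the actual Magda schema.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not the real Magda tables.
SCHEMA = """
CREATE TABLE site (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE location (
    id INTEGER PRIMARY KEY,
    site_id INTEGER NOT NULL REFERENCES site(id),
    path TEXT NOT NULL
);
CREATE TABLE host (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
-- A host can access a set of sites.
CREATE TABLE host_site (
    host_id INTEGER NOT NULL REFERENCES host(id),
    site_id INTEGER NOT NULL REFERENCES site(id),
    PRIMARY KEY (host_id, site_id)
);
CREATE TABLE logical_file (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    vo TEXT NOT NULL,
    UNIQUE (name, vo)
);
-- One row per physical copy of a logical file at a location.
CREATE TABLE instance (
    logical_file_id INTEGER NOT NULL REFERENCES logical_file(id),
    location_id INTEGER NOT NULL REFERENCES location(id),
    size_bytes INTEGER,
    PRIMARY KEY (logical_file_id, location_id)
);
"""


def sites_for_host(db, host_name):
    """Return the names of the sites a host can access, sorted by name."""
    rows = db.execute(
        "SELECT s.name FROM site s "
        "JOIN host_site hs ON hs.site_id = s.id "
        "JOIN host h ON h.id = hs.host_id "
        "WHERE h.name = ? ORDER BY s.name",
        (host_name,),
    )
    return [row[0] for row in rows]


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executemany(
    "INSERT INTO site (id, name) VALUES (?, ?)",
    [(1, "site_a_disk"), (2, "site_a_mss"), (3, "site_b_disk")],
)
db.execute("INSERT INTO host (id, name) VALUES (1, 'host1')")
db.executemany("INSERT INTO host_site VALUES (?, ?)", [(1, 1), (1, 2)])
```

With this sample data, `sites_for_host(db, "host1")` lists the two sites that `host1` can reach, and an unknown host yields an empty list.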

Magda Architecture

[Diagram. Labels: sites and their locations (mass store, disk, cache); Host 1 and Host 2; MySQL, synch via DB; replication task; collection of logical files to replicate; source to cache stagein; source to dest transfer; scp, gsiftp; register replicas; spider; catalog updates; WAN]

Files and Collections

- Files & replicas
  - Logical name is an arbitrary string, usually but not necessarily the filename
    - In some cases with a partial path (e.g. for code, the path in the CVS repository)
  - Logical name plus virtual organization (=atlas.org) defines a unique logical file
  - File instances include a replica number
    - Zero for the master instance; N=locationID for other instances
  - Notion of master instance is essential for cases where replication must be done off of a specific (trusted or assured current) instance
    - Not currently supported by Globus replica catalog
- Several types of file collections
  - Logical collections: arbitrary user-defined set of logical files
  - Location collections: all files at a given location
  - Key collections: files associated with a key or SQL query
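The identity and replica-numbering rules above can be sketched in a few lines of Python. The class and function names are hypothetical, and the sketch assumes that location IDs start at 1, so that replica number zero is unambiguous for the master instance.

```python
from dataclasses import dataclass

MASTER_REPLICA = 0  # replica number reserved for the master instance


@dataclass(frozen=True)
class LogicalFile:
    """A logical file is identified by its logical name plus its VO."""
    name: str
    vo: str = "atlas.org"


@dataclass(frozen=True)
class Instance:
    """One physical copy of a logical file at a location (hypothetical model)."""
    logical_file: LogicalFile
    location_id: int  # assumed to be >= 1
    is_master: bool = False

    @property
    def replica_number(self) -> int:
        # Zero for the master instance; the location ID otherwise.
        return MASTER_REPLICA if self.is_master else self.location_id


def master_of(instances, logical_file):
    """Return the master instance of logical_file, or None if there is none."""
    for inst in instances:
        if (inst.logical_file == logical_file
                and inst.replica_number == MASTER_REPLICA):
            return inst
    return None
```

A replication that must start from a trusted copy would call `master_of` first and fall back to other instances only if the task allows it.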

Distributed Catalog

- Catalog of ATLAS data at CERN, BNL (also LBNL, ANL, BU, UTA)
  - Supported data stores: CERN Castor, CERN stage, BNL HPSS (rftp service), disk, code repositories, …
  - Current content: physics TDR data, test beam data, ntuples, …
  - About 200k files currently cataloged representing >6TB data
  - Has run without problems with ~1.5M files cataloged
- ‘Spider’ crawls data stores to populate and validate catalogs
  - Catalog entries can also be added or modified directly
- Single MySQL DB serves entire system in present implementation
  - ‘MySQL accelerator’ provides good catalog loading performance over WAN; 2k files in <1sec. Sends bunched actions and initiates remotely with cgi
  - Globus replica catalog ‘loader’ written for evaluation; not used yet
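A crawl-and-validate pass of the kind the spider performs might look like the following sketch for a plain disk location. It is not the Magda spider: the function names are hypothetical, the catalog is modeled as a simple dictionary of logical name to size, and the logical name is assumed to be the path relative to the location root.

```python
import os


def crawl(root):
    """Return {logical_name: size_in_bytes} for every file under root."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            full = os.path.join(dirpath, fname)
            # Assumption: logical name = path relative to the location root.
            logical = os.path.relpath(full, root).replace(os.sep, "/")
            found[logical] = os.path.getsize(full)
    return found


def reconcile(catalog, found):
    """Compare cataloged entries with the crawl result.

    added:   on disk but not yet cataloged (populate)
    missing: cataloged but no longer on disk (validate)
    changed: present in both but with a different size (validate)
    """
    added = sorted(set(found) - set(catalog))
    missing = sorted(set(catalog) - set(found))
    changed = sorted(
        name for name in set(found) & set(catalog)
        if found[name] != catalog[name]
    )
    return {"added": added, "missing": missing, "changed": changed}
```

Collecting all differences first and sending them in one batch is in the spirit of the bunched actions mentioned above, since it avoids one WAN round trip per file.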

Other Metadata

- Simple user-defined metadata support: ‘keys’ (strings) can be assigned to logical files
- Will integrate with external application metadata catalogs for ‘metadata about the data’ (e.g. physics generator, run type, …)
  - In ATLAS, a MySQL/phpMyAdmin based tool is being developed by Grenoble for DC1
- Parenthetically… not part of the PPDG work but benefiting from it…
  - New Magda derivative begun: Hemp, Hybrid Event store Metadata Prototype, for the RDBMS part of a ROOT/RDBMS event store
  - Close ties to data signature work (‘history info’) as well as file management
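String keys attached to logical files amount to a small inverted index, and a key collection is then a lookup in that index. The sketch below illustrates the idea in memory. The class name, method names, and sample key are hypothetical; in Magda the keys live in the MySQL catalog.

```python
from collections import defaultdict


class KeyIndex:
    """In-memory stand-in for user-defined keys on logical files."""

    def __init__(self):
        self._files_by_key = defaultdict(set)

    def assign(self, logical_name, key):
        """Attach a string key to a logical file."""
        self._files_by_key[key].add(logical_name)

    def collection(self, key):
        """Return the key collection: all logical files carrying this key."""
        return sorted(self._files_by_key.get(key, ()))
```

For example, after assigning a hypothetical key such as `"higgs_4mu"` to two files, `collection("higgs_4mu")` returns both names in sorted order.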

File Replication

- Replication operations organized as user-defined tasks specifying source collection and host, transfer tool, pull/push, destination host and location, and intermediate caches
- User-specified logical file collections are replicated
  - e.g. a set of files with a particular physics channel key
- Designed to support multiple file transfer tools, user-selectable, which are useful in different contexts (e.g. scp for transfers ‘outside the grid’)
- In use between CERN, BNL, and among US ATLAS testbed sites
  - CERN stage, Castor, HPSS → cache → scp → cache → BNL HPSS
  - BNL HPSS or disk → cache → gsiftp → testbed disk
  - ~270GB replicated to date
- GDMP integration just underway
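The fields of a replication task listed above map naturally onto a small record type. The sketch below is an illustration only: the field names and defaults are hypothetical, and the rule that a pull runs on the destination host while a push runs on the source host is an assumption based on the usual meaning of those terms.

```python
from dataclasses import dataclass

TRANSFER_TOOLS = {"scp", "gsiftp"}  # the two tools named on the slide


@dataclass(frozen=True)
class ReplicationTask:
    """Hypothetical record for a user-defined replication task."""
    source_collection: str
    source_host: str
    dest_host: str
    dest_location: str
    transfer_tool: str = "gsiftp"
    mode: str = "pull"          # "pull" or "push"
    source_cache: str = ""      # intermediate cache on the source side
    dest_cache: str = ""        # intermediate cache on the destination side

    def __post_init__(self):
        if self.transfer_tool not in TRANSFER_TOOLS:
            raise ValueError(f"unknown transfer tool: {self.transfer_tool}")
        if self.mode not in ("pull", "push"):
            raise ValueError(f"mode must be 'pull' or 'push': {self.mode}")

    def transfer_host(self) -> str:
        """Host that runs the transfer (assumed: destination pulls, source pushes)."""
        return self.dest_host if self.mode == "pull" else self.source_host
```

Validating the tool and mode when the task is created keeps bad definitions out of the task table before any file is staged.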

Replication Steps

- Replication steps for each logical file, coordinated via state info in DB:
  - Mark as ‘processing’ in DB collection
  - Find the least-cost replica instance accessible at source host (i.e. disk instance preferred over MSS); stage into cache if necessary
  - On stage complete, mark as available for transfer
  - Independent transfer script (running on source or destination side) transfers files as they become available, and marks as available on destination side
  - If final destination is MSS, transferred files are deposited in a cache, and an independent destination-side script archives them
- Caches have ‘maximum size’ to throttle to available space
- If any stage breaks, others wait until file flow resumes and then proceed
- File validation is by checking file size
- Failed transfers are re-queued
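Three of the rules above (least-cost source selection, validation by file size, and re-queuing of failed transfers) can be sketched as a single loop. This is a simplified, single-process illustration with hypothetical names. In Magda the stages are independent scripts coordinated through state in the database, and the `max_attempts` limit here is an assumption added so the loop always terminates.

```python
from collections import deque

# Assumed cost ordering: a disk instance is cheaper than a mass-store one.
COST = {"disk": 0, "mss": 1}


def least_cost(instances):
    """Pick the cheapest instance from a list of (storage_type, path) tuples."""
    return min(instances, key=lambda inst: COST[inst[0]])


def replicate(files, transfer, max_attempts=3):
    """Transfer each file, validate it by size, and re-queue failures.

    files:    {name: expected_size_in_bytes}
    transfer: callable taking a name and returning the size received
    Returns (state, attempts); state maps name -> 'done' or 'failed'.
    """
    state = {name: "processing" for name in files}
    attempts = {name: 0 for name in files}
    queue = deque(files)
    while queue:
        name = queue.popleft()
        attempts[name] += 1
        received = transfer(name)
        if received == files[name]:          # validation is by file size only
            state[name] = "done"
        elif attempts[name] < max_attempts:
            queue.append(name)               # failed transfers are re-queued
        else:
            state[name] = "failed"
    return state, attempts
```

Because a failed file goes to the back of the queue, one bad transfer does not block the files behind it.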

GDMP & Magda

- Integration as a data mover underway
- Characteristics of present implementation limit scope of its application in Magda
  - One root disk directory per site
  - Subscription updates bring in all new data for a site
  - File collections not used
  - LFN fixed as ‘dir/filename’ (RC constraint)
  - Doesn’t catalog or directly manage files in MSS
  - Write access to tmp, etc disk areas required for all GDMP users
  - System state info (in files) only available locally
- Will try it initially for managed-production transfers between large centers

Command Line Tools

- Deployed command line tools for data access:
  - magda_findfile
    - Search catalog for logical files and their instances
  - magda_getfile
    - Retrieve file via catalog lookup and (as necessary) staging from MSS or (still to come) remote replication into disk cache
    - Creates local soft link to disk instance, or a local copy
    - Usage count maintained in catalog to manage deletion
  - magda_releasefile
    - Removes local soft link, decrements usage count in catalog, deletes instance (optionally) if usage count goes to zero
  - magda_putfile
    - Archive files (e.g. in Castor or HPSS) and register them in catalog
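The usage-count bookkeeping shared by magda_getfile and magda_releasefile is a form of reference counting. The sketch below shows only that logic. It does not reproduce the real tools or their options, and the class and method names are hypothetical.

```python
class UsageCatalog:
    """In-memory stand-in for the usage counts kept in the catalog."""

    def __init__(self):
        self.usage = {}        # instance name -> current usage count
        self.deleted = set()   # instances removed after their count hit zero

    def get(self, name):
        """Record one more user of a cached instance; return the new count."""
        self.usage[name] = self.usage.get(name, 0) + 1
        return self.usage[name]

    def release(self, name, delete_if_unused=False):
        """Record one fewer user; optionally delete when nobody is left."""
        if self.usage.get(name, 0) == 0:
            raise ValueError(f"{name} is not in use")
        self.usage[name] -= 1
        if self.usage[name] == 0 and delete_if_unused:
            del self.usage[name]
            self.deleted.add(name)
        return self.usage.get(name, 0)
```

Deleting only when the count reaches zero means one user releasing a file cannot remove a cached copy that another user still has linked.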

[Figure: Data Grid Reference Architecture]

DGRA Components & Magda

| Component | What’s in Magda | Application of Grid Toolkit items |
| Planner | Task definitions | DAGman, SQL service |
| Executor | Scripts coordinated via MySQL for multi-stage replication and driven by task definition | DAGman, remote execution |
| Catalog services | MySQL with schema interfaces in perl, C++, Java | SQL service |
| Info services | ditto | ditto |
| Policy | Built in ‘least cost’ behaviour; command options | ? |
| Security | MySQL security, htpasswd | GSI as used in grid tools |
| Monitoring | Web presentation of MySQL-maintained info | Relevant monitoring tools that emerge |
| Replica management | MySQL | RC, RM, GDMP |
| Reliable transfer service | Transfer script driven by task def’n | gsiftp, GDMP, reliable file transfer |
| Storage resources | Scripts using command line tools for disk, HPSS, Castor, stagein | HSM |

Application of GT: green = currently used or in progress, brown = currently in mind to be looked at and tried out

Near Term Activity

- Application in DC0 (deployed)
  - File management in production; replication to BNL; CERN, BNL data access
- Interface with Grenoble application metadata catalog
- GDMP integration - ready for DC1 (in progress)
- Application in DC1 (beginning ~March)
  - As DC0, but add replication and end-user data access at testbed sites
- Interface with hybrid ROOT/RDBMS event store
- Evaluation/use of other grid tools: Globus RC, SQL service, HSM, …
- Athena (ATLAS offline framework) integration