Do the LHC experiments have anything in common? (The LCG HEPCAL RTAG). Claudio Grandi, INFN Bologna. CSN1, Perugia, 11/11/2002.

Presentation transcript:


Slide 2: What is HEPCAL
[Diagram: layered software stack, with these labels]
- OS & Net services
- Bag of Services (GLOBUS, Condor-G, …)
- DataGRID middleware (PPDG, GriPhyN, EU-DataGRID)
- VO common application layer: HEP Common Application Layer (ALICE, ATLAS, CMS, LHCb), other HEP, other apps, …
- Specific application layer: ALICE, ATLAS, CMS, LHCb, other HEP, other apps

Slide 3: How to proceed
[Diagram labels: CMS, ATLAS, ALICE, LHCb; Core common Use Cases]

Slide 4: Why Use Cases?
[Diagram: the software stack of slide 2, annotated with three labels: Physicist domain, Computer Scientist domain, Domains interface]
- OS & Net services
- Bag of Services (GLOBUS, Condor-G, …)
- DataGRID middleware (PPDG, GriPhyN, EU-DataGRID)
- HEP Common Application Layer, …
- ALICE, ATLAS, CMS, LHCb, other HEP, other apps

Slide 5: Domains interface
- The definition of the applications that analyze the data is in the physicist domain
- The definition of the tools for accessing the data and the resources in a transparent way is in the computer scientist domain
- An interface is needed so that physicists and computer scientists can collaborate!
- Use a syntax taken from the CS domain: use cases

Slide 6: Use cases
- What Use Cases are:
  - a standard technique for gathering requirements in software development methodologies
  - narrative documents that describe the sequence of events of an actor using a system [...] to complete processes (*)
- What Use Cases are NOT:
  - the description of an architecture
  - the representation of an implementation
(*) Jacobson, I., et al. Object-Oriented Software Engineering. Addison-Wesley, Reading, MA.

Slide 7: LCG HEPCAL RTAG
- Mandate: identify use cases for distributed (Grid) computing common to the LHC experiments
  - Focus on goals
  - Try to be implementation independent
- Two months' time: first meeting on April 3rd, delivered on May 24th
- About 10 full days of meetings

Slide 8: HEPCAL RTAG membership
- Chair: Federico Carminati
- Members:
  - Piergiorgio Cerello (ALICE)
  - Oxana Smirnova (ATLAS)
  - Claudio Grandi (CMS)
  - Eric van Herwijnen (LHCb)
  - Jeff Templon (DataGrid WP8)

Slide 9: RTAG Report

Slide 10: Datasets
- Collection of files treated as a whole
- Read-only once uploaded to the grid
- Identified by a unique Logical Dataset Name
- May be replicated in many physical locations
- May be a Virtual Dataset:
  - the algorithm to produce it is registered to the Grid along with the input data and/or parameters
- May contain references to objects in other datasets
- May be associated with a default remote access protocol
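The dataset properties listed on this slide can be pictured as a small immutable record. This is only an illustrative sketch: the class names `Dataset` and `Recipe` and all field names are assumptions made here, not part of HEPCAL or of any grid middleware.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Recipe:
    """Hypothetical description of how to produce a virtual dataset."""
    executable: str
    input_ldns: tuple[str, ...] = ()   # Logical Dataset Names of the inputs
    parameters: tuple[str, ...] = ()


@dataclass(frozen=True)                # frozen: read-only once created
class Dataset:
    ldn: str                                       # unique Logical Dataset Name
    files: tuple[str, ...] = ()                    # files treated as a whole
    replica_sites: frozenset[str] = frozenset()    # physical locations
    recipe: Optional[Recipe] = None                # set for virtual datasets
    default_protocol: Optional[str] = None         # default remote access protocol

    @property
    def is_virtual(self) -> bool:
        # Virtual: nothing materialized yet, but a recipe is registered.
        return not self.files and self.recipe is not None
```

Freezing the dataclass mirrors the "read-only once uploaded" rule: any attempt to assign to a field raises an exception.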

Slide 11: Catalogues
- The only read/write entities on the grid
- May be defined by applications, but:
  - Implementation not under application control
  - Replication not under application control
- Read/write datasets not discussed!
- Examples of grid-defined catalogues:
  - Dataset Metadata Catalogue: associates to each Logical Dataset Name a list of attributes (key=value pairs) of the dataset
  - Job Catalogue: associates to each Job Identifier a list of attributes

Slide 12: Jobs
- A single invocation of job submission
- May be composite (e.g. Directed Acyclic Graph)
- May be split
- Official productions are a special case of job submission
- Interactive jobs not discussed!
  - We actually discussed interactivity at length, but decided to leave it out for lack of time…
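A composite job expressed as a Directed Acyclic Graph has to be run in an order that respects the dependencies between its steps. A minimal sketch of that ordering, using Python's standard `graphlib` module; the function name `execution_order` and the step names in the usage note are assumptions chosen for illustration.

```python
from graphlib import TopologicalSorter


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the steps of a composite job in a valid execution order.

    `dependencies` maps each step to the set of steps that must finish
    before it can start. A cyclic graph raises graphlib.CycleError.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

For a three-step chain such as `{"simulate": set(), "reconstruct": {"simulate"}, "analyze": {"reconstruct"}}` the steps come out with `simulate` first and `analyze` last.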

Slide 13: Job splitting
- Automatic splitting:
  - Based on location of replicas of input datasets
  - Using application plug-ins for splitting and for joining the results
- If interactivity is supported, splitting may be done by the running application (spawning processes on the grid, à la PROOF)
- Job splitting not discussed in detail!
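One possible reading of "splitting based on the location of replicas" is sketched below. The greedy strategy, the function names and the site names are all assumptions made for illustration; the slide itself notes that splitting was not discussed in detail. The `process` and `join` arguments play the role of the application plug-ins.

```python
from collections import defaultdict
from typing import Callable


def split_by_replica_site(replicas: dict[str, list[str]]) -> dict[str, list[str]]:
    """Greedy split: repeatedly pick the site hosting most unassigned datasets.

    `replicas` maps each Logical Dataset Name to the sites holding a copy.
    Returns one sub-job per chosen site, as site -> list of LDNs.
    """
    remaining = set(replicas)
    sub_jobs: dict[str, list[str]] = {}
    while remaining:
        by_site: dict[str, list[str]] = defaultdict(list)
        for ldn in sorted(remaining):
            for site in replicas[ldn]:
                by_site[site].append(ldn)
        if not by_site:
            raise ValueError(f"no replica available for {sorted(remaining)}")
        # Most datasets first; ties are broken alphabetically by site name.
        site = min(by_site, key=lambda s: (-len(by_site[s]), s))
        sub_jobs[site] = by_site[site]
        remaining -= set(by_site[site])
    return sub_jobs


def run_split(replicas: dict[str, list[str]],
              process: Callable[[str, list[str]], object],
              join: Callable[[list], object]) -> object:
    """Run `process` once per sub-job and combine the partial results with `join`."""
    partial = [process(site, ldns) for site, ldns in split_by_replica_site(replicas).items()]
    return join(partial)
```

With `{"ds1": ["CERN", "CNAF"], "ds2": ["CNAF"], "ds3": ["RAL"]}` the split yields one sub-job at CNAF for `ds1` and `ds2` and one at RAL for `ds3`.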

Slide 14: Persistency layer
- Grid tools will be used to navigate from one dataset to objects in other datasets
- The application persistency layer provides a mapping between the target object identifier and the Logical Dataset Name
- The Grid provides the mapping between the LDN and the physical copy of the dataset files
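The two mappings described here compose into a single lookup: object identifier to LDN (application side), then LDN to a physical copy (Grid side). The sketch below uses plain dictionaries for both; the function name `locate`, the object identifier format and the rule "prefer a replica at the local site" are assumptions for illustration only.

```python
def locate(object_id: str,
           object_to_ldn: dict[str, str],
           replica_catalogue: dict[str, dict[str, str]],
           local_site: str) -> tuple[str, str]:
    """Resolve an object identifier to (LDN, physical file name).

    object_to_ldn      -- application persistency layer: object id -> LDN
    replica_catalogue  -- Grid: LDN -> {site: physical file name}
    """
    ldn = object_to_ldn[object_id]        # step 1: application mapping
    replicas = replica_catalogue[ldn]     # step 2: Grid mapping
    # Assumed policy: use the local replica if there is one,
    # otherwise fall back to the first site in alphabetical order.
    site = local_site if local_site in replicas else sorted(replicas)[0]
    return ldn, replicas[site]
```

Keeping the two dictionaries separate reflects the division of responsibility on the slide: the application never sees physical locations, and the Grid never sees object identifiers.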

Slide 15: Identifying datasets
- The Dataset Metadata Catalogue maps attributes to Logical Dataset Names
- The input dataset of a job may be specified as a query that acts on the dataset attributes, e.g.:
  - “give me all the datasets corresponding to events acquired during the period 22/11/2007 through 18/07/2008 using the XYZ trigger configuration”
- Applications may add attributes to datasets
- Special fields of the Dataset Metadata Catalogue may be used for virtual dataset materialization:
  - Executable=…; StdIn=…; etc…
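The example query on this slide can be sketched as a predicate over key=value attributes. The catalogue is modelled as an in-memory dictionary, and the attribute names `acquired` and `trigger`, like the function names, are assumptions; a real catalogue would define its own schema and query language.

```python
from datetime import date
from typing import Any, Callable

Attributes = dict[str, Any]


def select_datasets(catalogue: dict[str, Attributes],
                    predicate: Callable[[Attributes], bool]) -> list[str]:
    """Return the Logical Dataset Names whose attributes satisfy the predicate."""
    return sorted(ldn for ldn, attrs in catalogue.items() if predicate(attrs))


def add_attributes(catalogue: dict[str, Attributes], ldn: str, **attributes: Any) -> None:
    """Let an application attach further attributes to a registered dataset."""
    catalogue[ldn].update(attributes)


def in_period_with_trigger(attrs: Attributes) -> bool:
    # The query from the slide: acquired 22/11/2007 through 18/07/2008, XYZ trigger.
    return (date(2007, 11, 22) <= attrs["acquired"] <= date(2008, 7, 18)
            and attrs["trigger"] == "XYZ")
```

A job description would then carry the query instead of an explicit list of names, and the names would be resolved against the catalogue at submission time.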

Slide 16: Identifying jobs
- The Job Catalogue maps attributes to Job Identifiers
- A query to the job scheduler may be specified as a query that acts on the job attributes, e.g.:
  - “give me the status of all the jobs analyzing dataset XYZ using the application program version 1.2.3”
- Applications may add attributes to jobs
- Information used both for monitoring and for bookkeeping (à la BOSS)
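A job catalogue of this kind needs two operations: updating the attributes of a job and querying by attribute. The sketch below is a hypothetical in-memory version with exact-match queries; the class name, method names and attribute keys are assumptions and do not reproduce the BOSS interface.

```python
class JobCatalogue:
    """Hypothetical in-memory job catalogue: Job Identifier -> attributes."""

    def __init__(self) -> None:
        self._jobs: dict[str, dict[str, str]] = {}

    def update(self, job_id: str, **attributes: str) -> None:
        # Applications and the scheduler may both add or overwrite attributes.
        self._jobs.setdefault(job_id, {}).update(attributes)

    def query(self, **criteria: str) -> dict[str, dict[str, str]]:
        # Exact match on every requested attribute.
        return {job_id: dict(attrs)
                for job_id, attrs in self._jobs.items()
                if all(attrs.get(key) == value for key, value in criteria.items())}

    def statuses(self, **criteria: str) -> dict[str, str]:
        # Monitoring view: job id -> status for the jobs matching the criteria.
        return {job_id: attrs.get("status", "unknown")
                for job_id, attrs in self.query(**criteria).items()}
```

The query on the slide then becomes `catalogue.statuses(dataset="XYZ", app_version="1.2.3")`. Because finished jobs stay in the catalogue, the same records serve for monitoring while a job runs and for bookkeeping afterwards.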

Slide 17: Use cases
- Obtain Grid authorisation
- Ask for revocation of Grid authorisation
- Grid login
- Browse Grid resources
- DS metadata update
- DS metadata access
- Dataset registration to the Grid
- Virtual dataset declaration
- Virtual dataset materialization
- Dataset upload
- User-defined catalogue creation
- Data set access
- Dataset transfer to non-Grid storage
- Dataset replica upload to the Grid
- Data set access cost evaluation
- Data set replication
- Physical data set instance deletion
- Data set deletion (complete)
- User defined catalogue deletion (complete)
- Data retrieval from remote Datasets
- Data set verification
- Data set browsing
- Browse condition database
- Job catalogue update
- Job catalogue query
- Job submission
- Job Output Access or Retrieval
- Error Recovery for Aborted or Failing Production Jobs
- Job Control
- Steer job submission
- Job resource estimation
- Job environment modification
- Job splitting
- Production job
- Analysis 1
- Data set transformation
- Job monitoring
- Simulation Job
- Experiment software development for the Grid
- VO wide resource reservation
- VO wide resource allocation to users
- Condition publishing
- Software publishing

Slide 18: Example
A user needs to submit a job that analyzes a dataset and produces a file to be saved on the grid for further analysis
[Diagram labels: Input dataset, Job, Output file]

Slide 19: User operations sequence
- User logs into the grid
- User writes the job description:
  - executable and arguments
  - logical name of input dataset or complex query
  - name and metadata of output file to be saved
  - ...
- User submits the job
- Job output is made available to the user, including the logical file name of the output
- User uses the logical file name to access it
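The job description items listed above can be collected into a small record that is checked before submission. This is a sketch under stated assumptions: the class `JobDescription`, its field names and the validation rules are hypothetical and do not correspond to any actual job description language.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class JobDescription:
    executable: str
    arguments: list[str] = field(default_factory=list)
    input_ldn: Optional[str] = None      # logical name of the input dataset...
    input_query: Optional[str] = None    # ...or a query on dataset attributes
    output_name: str = ""                # name of the output file to be saved
    output_metadata: dict[str, str] = field(default_factory=dict)

    def validate(self) -> None:
        """Raise ValueError if the description is not ready for submission."""
        # Assumed rule: the input is given either by name or by query, not both.
        if (self.input_ldn is None) == (self.input_query is None):
            raise ValueError("give exactly one of input_ldn or input_query")
        if not self.output_name:
            raise ValueError("output_name is required")
```

Note that the description only ever mentions logical names: where the input physically resides is left to the grid.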

Slide 20: Running job operations sequence
- Information is made available to the job:
  - The physical file names for the input datasets
  - The physical file names are either the names of local files to be POSIX-opened (e.g. if they have been copied locally) or names that are accessible remotely via some protocol (e.g. AMS)
- The job is run
- The output file is uploaded to the grid
  - The output file is registered to the grid catalogues
  - The user-defined metadata of the output are stored in the dataset metadata catalogue
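The sequence on this slide can be traced end to end with in-memory stand-ins for the grid services. Everything in this sketch is an assumption made for illustration: the dictionaries standing in for storage and catalogues, the `site:/store/...` naming of physical files, and the choice of the first listed replica.

```python
from typing import Callable


def run_job(input_ldns: list[str],
            analyse: Callable[[list[str]], bytes],
            output_ldn: str,
            output_metadata: dict[str, str],
            replica_catalogue: dict[str, list[str]],
            metadata_catalogue: dict[str, dict[str, str]],
            storage: dict[str, bytes],
            site: str) -> str:
    """Simulate the running-job sequence; returns the logical name of the output."""
    # 1. The physical file names of the inputs are made available to the job.
    physical_names = [replica_catalogue[ldn][0] for ldn in input_ldns]
    # 2. The job is run on those physical files.
    output = analyse(physical_names)
    # 3. The output file is uploaded and registered in the grid catalogues.
    physical_output = f"{site}:/store/{output_ldn}"   # hypothetical naming scheme
    storage[physical_output] = output
    replica_catalogue[output_ldn] = [physical_output]
    # 4. The user-defined metadata go to the dataset metadata catalogue.
    metadata_catalogue[output_ldn] = dict(output_metadata)
    return output_ldn
```

After the call the user can find the output through its logical name alone, which is the last step of the user operations sequence on the previous slide.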

Slide 21: Formal representation

Slide 22: Joint EU-US HEPCAL response
- Draft available on November 1st
- Requested:
  - quantitative performance metrics
  - more details on VO management
  - more details on devious flows and error handling
  - use cases for site administrators
  - more details on software publishing and versioning
- Among other suggestions:
  - maintain ongoing collaboration among LCG, EDG, US grid projects and RTAG groups!
  - …this means having a permanent body with the same competence as the HEPCAL RTAG…

Slide 23: Joint response: use cases tables
Columns: Use Case | EDG 1.2 | VDT | EDG 2 planned | VDT Spring 2003 | Other US gridware | Compatibility between VDT and EDG | After EDG 2 | VDT later plans | Comments
Rows (non-empty cells, left to right):
AAA and VO
- Obtain Grid Authorization: Yes; Work with DOESG
- Revoke Grid Authorization: Yes
- Grid Login: Partial; Roles, expiration issues; Yes; Neither Extensions nor Additional Requirements supported by both
- Browse Grid Resources: Basic; Yes
Meta-Data, Data Mgmt. and Access
- DS Metadata Update: Yes?; Basic; Yes - [1]; Basic; EDG: User must do all the work
- DS Metadata Access: Yes?; Basic; Yes; Yes - [1]; Basic; EDG: User must do all the work! EDG: Primitive (J.T.)
- Dataset (DS) Registration: Yes?; Basic; Yes?; Yes - [1]; Basic

Slide 24: Conclusions
- Use cases drawn after about one year of experience with grid tools
- Some items need more thinking:
  - Read/write datasets
  - Interactive jobs
  - Job splitting
- First feedback from grid projects available:
  - The use cases are being used by computer scientists working in the grid projects to build tools useful for the physicists
  - Some use cases already implemented and used by the experiments (see Mario’s talk)