CDF Grid Status Stefan Stonjek 05-Jul-2005 13th GridPP meeting / Durham.



Tue 05-Jul-2005 CDF Grid status report (Stefan Stonjek)

Outline
- SAM: Sequential Access via Metadata
  - file catalogue
  - metadata
- CAF: Central Analysis Farm
- JIM: Job Information and Monitoring
- Lessons learned
- Summary

CDF is running
- CDF is an experiment currently taking data
  - For a limited time
- Stable offline computing is high priority
  - Limited resources for Grid development
  - Limited possibilities to introduce new software
  - New software is accepted if it provides new functionality
- CDF is using some Grid technology
  - Large parts of the software will stay non-Grid aware
- We can learn from the experience gained at CDF

SAM
- SAM is currently used by DØ, CDF and MINOS
  - SAM was originally developed for DØ
- SAM is used in production at CDF
  - Production output is going directly into SAM
- SAM is now the only supported data-handling system at CDF
  - Some users know how to circumvent SAM

SAM problems
- Performance problems with db-servers
  - db-server = CORBA to SQL bridge
  - Large queries (many files) consume much memory
  - Currently solved by creating multiple db-server instances; this is not optimal
- Recovering from failed projects
  - A project covers many input files in many jobs
  - SAM “thinks” file based
  - Several input files, one output file and a crash in the middle cause a problem
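
The recovery problem can be illustrated with a small sketch. It assumes each output file records which input files were merged into it; the `OutputFile` record and `files_to_reprocess` function are hypothetical and are not part of SAM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputFile:
    """Hypothetical record: one output file and the inputs merged into it."""
    name: str
    parents: frozenset
    stored: bool  # True only if the output was completely written and catalogued


def files_to_reprocess(project_inputs, outputs):
    """Return the project inputs not covered by any successfully stored output."""
    done = set()
    for out in outputs:
        if out.stored:
            done |= out.parents
    return set(project_inputs) - done


inputs = {"f1", "f2", "f3", "f4"}
outputs = [
    OutputFile("out_a", frozenset({"f1", "f2"}), stored=True),
    # This job read f3 and f4 but crashed before its output was stored.
    OutputFile("out_b", frozenset({"f3", "f4"}), stored=False),
]
print(sorted(files_to_reprocess(inputs, outputs)))
```

A purely file-based view cannot tell that f3 was fully read before the crash; only the link from output to inputs shows that both f3 and f4 must be processed again.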

SAM points of failure
- SAM strongly depends on central services
- Database is a single point of failure
  - SAM writes to the database for every action
  - To solve the problem
    - complete replication (with write access)
    - distributed database
  - No “off-the-shelf” solution
- CORBA naming service is a single point of failure
  - Needed by every client to talk to the rest of the SAM universe
  - To solve the problem
    - redundant naming service
    - distributed naming service
  - Not enough manpower
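
A redundant naming service only helps if clients try more than one replica. The sketch below shows client-side failover in general terms; the endpoint names, the `lookup` callable and `resolve_with_failover` are assumptions for illustration, not CORBA or SAM APIs.

```python
from typing import Callable, Sequence


class AllEndpointsFailed(Exception):
    """Raised when no naming-service replica could be reached."""


def resolve_with_failover(
    name: str,
    endpoints: Sequence[str],
    lookup: Callable[[str, str], str],
) -> str:
    """Ask each replica in turn and return the first successful answer."""
    errors = []
    for endpoint in endpoints:
        try:
            return lookup(endpoint, name)
        except ConnectionError as exc:
            errors.append(f"{endpoint}: {exc}")
    raise AllEndpointsFailed("; ".join(errors))


def fake_lookup(endpoint: str, name: str) -> str:
    """Stand-in for a real naming-service call; the first replica is down."""
    if endpoint == "ns1.example.org":
        raise ConnectionError("unreachable")
    return f"{name}@{endpoint}"


replicas = ["ns1.example.org", "ns2.example.org"]
print(resolve_with_failover("db-server", replicas, fake_lookup))
```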

SAM upload
- Tool to insert files into SAM from arbitrary nodes
- Important for the acceptance of SAM at CDF
- Intense use
  - Causes performance problems
  - Each client starts a thread in the db-server

Metadata
- SAM selects files based upon file metadata
- Two types of metadata
  - Physical file parameters (file size, checksum etc.)
  - Physics file parameters (run and event numbers, event information, time etc.)
- Only the physical file parameter schema is fixed
- The physics file parameter schema has to be dynamic (many changes required)
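
One way to picture the split between a fixed and a dynamic schema is sketched below. The class and field names are hypothetical and do not reflect the actual SAM schema: physical parameters are typed fields, physics parameters are an open key/value map.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class PhysicalParams:
    """Fixed schema: every file has exactly these fields."""
    size_bytes: int
    checksum: str


@dataclass
class FileMetadata:
    """Hypothetical record combining both kinds of metadata."""
    file_name: str
    physical: PhysicalParams
    # Dynamic schema: new physics parameters can be added without a schema change.
    physics: dict = field(default_factory=dict)


meta = FileMetadata(
    file_name="example_file.root",
    physical=PhysicalParams(size_bytes=1024, checksum="adler32:0001"),
    physics={"run_number": 1, "first_event": 1, "last_event": 500},
)
# A requirement that appears later only adds a key; no class needs to change.
meta.physics["trigger"] = "example_trigger"
print(sorted(meta.physics))
```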

Metadata (cont.)
- SAM uses a metadata query language
  - Called “dimensions”
  - Protects the user from SQL difficulties
  - Protects the database from user mistakes
  - Therefore less flexible than plain SQL
  - Requires constant adaptation to new requirements
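
The trade-off can be sketched with a toy restricted query language. This is not the real “dimensions” syntax; the field names, the `files` table and the `?` placeholder style are assumptions. Only whitelisted fields are accepted and values are passed as bind parameters, which protects the database but means every new field needs a change to the translator.

```python
import re

# Hypothetical whitelist: a new physics parameter must be added here first.
ALLOWED_FIELDS = {"run_number", "file_size", "data_tier"}
TERM = re.compile(r"^\s*(\w+)\s*(<=|>=|=|<|>)\s*(\S+)\s*$")


def translate(query: str):
    """Translate 'field op value and ...' into parameterized SQL plus its parameters."""
    clauses, params = [], []
    for term in query.split(" and "):
        match = TERM.match(term)
        if match is None:
            raise ValueError(f"cannot parse term: {term!r}")
        name, op, value = match.groups()
        if name not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {name}")
        clauses.append(f"{name} {op} ?")
        params.append(value)  # the value never becomes part of the SQL text
    sql = "SELECT file_name FROM files WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = translate("run_number >= 100 and data_tier = raw")
print(sql)
print(params)
```

A term such as `run_number = 1; DROP TABLE files` is rejected with `ValueError` because it does not match the single-value pattern.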

Lessons Learned (SAM, metadata)
- Avoid single points of failure
  - Not new, but difficult with a database
- Keep as much information as possible local
  - Minimizes the impact of problems in the central database
- Need a flexible metadata query language
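
As a minimal sketch of keeping information local, the hypothetical cache below answers repeated lookups without contacting the central service, so an outage only affects records that were never fetched. `LocalMetadataCache` and `central_lookup` are illustrative names, not SAM components.

```python
from typing import Callable


class LocalMetadataCache:
    """Cache-first lookup in front of a central metadata service."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch
        self._cache = {}

    def get(self, file_name: str) -> dict:
        # Serve from the local copy when possible; only misses reach the centre.
        if file_name not in self._cache:
            self._cache[file_name] = self._fetch(file_name)
        return self._cache[file_name]


central_up = {"value": True}
central_calls = []


def central_lookup(file_name: str) -> dict:
    """Stand-in for the central database; fails while it is marked as down."""
    if not central_up["value"]:
        raise ConnectionError("central database unavailable")
    central_calls.append(file_name)
    return {"size_bytes": 1024}


cache = LocalMetadataCache(central_lookup)
first = cache.get("example_file.root")
central_up["value"] = False          # simulate an outage
second = cache.get("example_file.root")  # still answered locally
print(first == second, central_calls)
```

The sketch ignores invalidation; a real system would need a policy for records that change centrally.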

CAF
- CAF: Central (or CDF) Analysis Farm
- Good sandbox technology
- Good graphical job submission interface
- Does job multiplication for the user
  - Submit once, execute multiple times
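
Job multiplication can be sketched as expanding one submission into numbered sections. The `{section}` placeholder, `JobSection` and `multiply_job` are assumptions for illustration and do not describe the actual CAF interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSection:
    """One of the many executions generated from a single submission."""
    section: int
    command: str


def multiply_job(command_template: str, first: int, last: int):
    """Expand one submitted command into one section per number in [first, last]."""
    if last < first:
        raise ValueError("last section must not be smaller than first")
    return [
        JobSection(n, command_template.format(section=n))
        for n in range(first, last + 1)
    ]


# The user submits once; the farm runs three independent sections.
sections = multiply_job("./run_mc.sh --seed {section}", 1, 3)
for job in sections:
    print(job.section, job.command)
```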

CAF (cont.)
- Distributed CAF (DCAF)
  - Many sites around the world
  - In use for Monte-Carlo production
  - Human-based resource brokering
- CondorCAF (Glide-ins)
  - New CAF version uses Condor
  - Allows Glide-ins
- GridCAF
  - “edg-*” compatible job submission
  - CAF-GUI submits to the grid, no job multiplication

JIM
- JIM: Job Information and Monitoring
- Together with SAM, the system which produces CDF Monte-Carlo
- Requires additional software to be installed on Grid sites
  - SAM
- Small differences in resource advertising
- Working towards interoperability between JIM and LCG-Grid sites

Summary
- CDF is using some Grid tools
- LHC experiments can learn from CDF experience
  - SAM
    - central database
    - metadata
  - CAF
    - submission GUI
    - job multiplication