Adapting SAM for CDF Gabriele Garzoglio Fermilab/CD/CCF/MAP CHEP 2003.


Overview
– Why SAM for CDF?
– Goals of the pilot project
– The path (and pitfalls...) to integration (summer 2002)
– Current status and future vision

History
– End of Pre-Pilot Project: the UK starts an evaluation of the needs of CDF for distributed computing: the Grid and SAM
– April: Pilot Project: adaptation of SAM for CDF
– August: Pilot SAM Deployment complete: move toward deployment of SAM for the whole collaboration

Why consider SAM? Collaborating institutions can provide local computing resources to process data. An example: the estimated size of the datasets UK institutes wish to access:

Physics                                  | Trigger Set                                           | No. Evts        | Secondary Data Size (GB)
B lifetime, B oscillations, CP violation | Central J/Psi; Displaced vertex                       | 12 M; 1,000 M   | 1,200; 100,000
W/Z, Higgs, SUSY, B physics              | High-Pt leptons; Inclusive electrons; Inclusive muons | 2 M; 50 M; 14 M | 200; 5,000; 1,400
SUSY, Calibrations                       | High-Et photons                                       | 58 M            | 5,800
Higgs                                    | Z0 ---> b bbar                                        | 6 M             | 600

SAM was, and still is, being actively developed and integrated with Grid middleware (SAM-Grid).

What SAM Provides
SAM is a Data Handling System (the project started in 1997) used in production by DZero (see Lee Lueking’s talk). Main characteristics:
– data movement and caching
– meta-data catalogue
– bookkeeping of analysis projects
– a set of tools for users and administrators

Highlights of the mapping between DFC and SAM
Problem: how do we map the CDF (DFC) and DZero (SAM) views of the data?
– DFC: files organized in Datasets, which contain Filesets
– SAM: provides virtual files plus metadata parameters (data streams, data tiers, applications, …)
– DFC has the concept of Books to implement group/user-specific metadata and resource management
The complete mapping between the DFC and SAM is documented in CDF note 6169.
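The Dataset/Fileset-to-virtual-file mapping above can be sketched in a few lines. This is a deliberately simplified illustration: the class and field names below are invented for the example and are not the actual DFC or SAM schema (see CDF note 6169 for the real mapping).

```python
from dataclasses import dataclass

@dataclass
class DFCFileset:
    name: str
    files: list          # file names belonging to this fileset

@dataclass
class DFCDataset:        # DFC view: Datasets contain Filesets
    name: str
    filesets: list

@dataclass
class SAMFile:           # SAM view: virtual files carrying metadata parameters
    name: str
    metadata: dict       # e.g. data stream, data tier, application

def dfc_to_sam(dataset: DFCDataset) -> list:
    """Flatten a DFC dataset into SAM virtual files, recording the
    DFC provenance as SAM metadata parameters."""
    sam_files = []
    for fs in dataset.filesets:
        for fname in fs.files:
            sam_files.append(SAMFile(
                name=fname,
                metadata={
                    "dfc_dataset": dataset.name,
                    "dfc_fileset": fs.name,
                    "data_tier": "secondary",   # illustrative value only
                },
            ))
    return sam_files
```

The key design point is that SAM keeps the catalogue flat (files plus queryable metadata), so the DFC's nested organization survives only as metadata values.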

Goals of the pilot phase (by summer 2002)
1. Support 5 “remote” groups doing data analysis
2. Enable access to datasets of interest: read access to secondary data, plus read/write access to higher-order data
3. Production-quality availability of the system: key machines maintained 24x7
4. Controllable, limited impact on the CDF Mass Storage System (Enstore)

Areas of work
– Designing and implementing an architecture
– Adding/adjusting features of SAM
– Designing/loading the CDF SAM database
– Enabling CDF clients to access SAM
– Installation/configuration
– Group coordination during development
– Support/shifter organization

Architecture
The architecture was designed with the pilot goals in mind. Two Mass Storage Systems:
– CDFen: primary/secondary data via dCache
– STKen: higher-order data; 5 TB tapes

Architecture
One routing station for data above the secondary level:
– 1 TB disk cache
– 1 Gb/s connectivity
One FNAL analysis station:
– dual 1 GHz Pentium
– 160 GB disk

Architecture
SAM services run on a Sun machine (as for DZero), supporting the development, integration, and production databases. The databases were hosted on the offline CDF Oracle machines.

Feature adjustment
– Integration of SAM with dCache: SAM transports files to local caches from the weakly authenticated FTP door of dCache
– Enabled the Enstore “discipline” module to limit SAM’s access (see Don Petravick’s talk for more on the FNAL MSS)
– Enabled direct SAM station-to-station file transfer: implemented better file routing in SAM
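The file-routing improvement above amounts to preferring a direct station-to-station copy over staging through the Mass Storage System. A minimal sketch of that decision, with an invented function signature (this is not the actual SAM routing API):

```python
def pick_source(filename, local_cache, peer_stations, mss):
    """Return (location, method) for fetching a file.
    Order of preference: local cache, then a direct transfer from a
    peer SAM station, then a stage from the MSS as a last resort."""
    if filename in local_cache:
        return ("local", "cache hit")
    for station, files in peer_stations.items():
        if filename in files:
            # direct station-to-station transfer, no MSS involved
            return (station, "station-to-station")
    if filename in mss:
        return ("mss", "stage from tape")
    raise FileNotFoundError(filename)
```

Avoiding the tape system whenever a disk replica exists at any station is what keeps the impact on Enstore controllable, per goal 4 of the pilot.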

Database
– Use development/integration/production databases to manage schema evolution
– Periodically read data from the DFC and translate it to the SAM schema: a Java-based program, Predator, runs every 3 hours to keep SAM up to date
– Loading the database through the SAM interface served as a proof of principle, but turned out to be too slow
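The periodic DFC-to-SAM translation can be sketched as a simple sync loop. This is a hedged illustration only: the real Predator is a Java program against the Oracle schemas, and every function name below is a placeholder.

```python
import time

def fetch_new_dfc_records(since):
    """Placeholder: query the DFC for records newer than `since`.
    In reality this would be a DB query against the DFC schema."""
    return []

def translate_to_sam(record):
    """Placeholder: map one DFC record onto the SAM schema."""
    return {"sam_file": record}

def sync_once(last_run, load_fn):
    """One sync cycle: read new DFC data, translate it, and hand the
    batch to a bulk loader. Bulk loading directly into the DB is used
    because loading through the SAM interface proved too slow."""
    records = fetch_new_dfc_records(last_run)
    load_fn([translate_to_sam(r) for r in records])
    return time.time()

# The real job re-runs every 3 hours, conceptually:
# while True:
#     last_run = sync_once(last_run, bulk_load_into_sam)
#     time.sleep(3 * 3600)
```

The incremental design (only records newer than the last run) is what keeps a 3-hourly cadence cheap even as the catalogue grows.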

Integration with the CDF analysis framework: AC++
– AC++ does not use exception handling; SAM was developed for DZero, whose analysis framework supports exceptions
– SAM communication is based on CORBA, and the IDL interfaces use exceptions
– To manage the communication, AC++ I/O modules fork a CDF Project Protocol Converter (CPPC). The CPPC is a finite state machine that communicates with SAM via CORBA and with AC++ via pipes
– The CPPC was generalized to allow communication with multiple processes
– The user interface requires a dataset definition plus a project name
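The fork-plus-pipes pattern behind the CPPC can be illustrated with a toy state machine. This is a simplification under stated assumptions: the CORBA side to SAM is omitted entirely, and the command names ("start", "next", "stop") are invented for the example, not the real CPPC protocol.

```python
import os

def run_converter(commands):
    """Fork a tiny state-machine child (standing in for the CPPC);
    the parent (standing in for an AC++ I/O module) sends it commands
    through one pipe and reads its replies from another."""
    to_child_r, to_child_w = os.pipe()
    from_child_r, from_child_w = os.pipe()
    pid = os.fork()
    if pid == 0:                        # child: the protocol converter
        os.close(to_child_w)
        os.close(from_child_r)
        state = "idle"
        with os.fdopen(to_child_r) as cmds, os.fdopen(from_child_w, "w") as out:
            for cmd in cmds:
                cmd = cmd.strip()
                if state == "idle" and cmd == "start":
                    state = "running"   # here the CPPC would open a SAM project
                elif state == "running" and cmd == "next":
                    pass                # here the CPPC would ask SAM for a file
                elif cmd == "stop":
                    state = "idle"      # here the CPPC would close the project
                out.write(state + "\n")
                out.flush()
        os._exit(0)
    # parent: the AC++ side, which never sees a CORBA exception
    os.close(to_child_r)
    os.close(from_child_w)
    replies = []
    with os.fdopen(to_child_w, "w") as w, os.fdopen(from_child_r) as r:
        for c in commands:
            w.write(c + "\n")
            w.flush()
            replies.append(r.readline().strip())
    os.waitpid(pid, 0)
    return replies
```

The point of the indirection is visible here: any exception raised on the child's side stays in the child process, and the exception-free AC++ parent only ever sees plain lines on a pipe.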

Installation and Configuration
– Version management turned out to be a sensitive issue
– DZero maintained the current set of cooperating product versions by tagging them as “current” in the UPS/UPD repository
– To keep software upgrades independent, CDF needed a different way of tagging consistent cooperating versions
– Cooperating versions were therefore hardcoded in the SAM installation script

Current status
– Today, CDF uses SAM for physics at Oxford and Karlsruhe
– SAM is currently being tested on the FNAL CAF (see Frank Wuerthwein’s talk)
– The next step is a tighter integration of SAM with dCache

Future vision: the SAM-Grid
– The integration of SAM with Grid middleware to enable job handling and information monitoring (JIM) was demonstrated at SC2002 in November (see Igor Terekhov’s talk on JIM, Fedor Ratnikov’s talks on JIM and DCAF, and Stefan Stonjek’s talk on SAM-Grid at SC2002)
– SAM was integrated with the CDF DCAF and deployed at ~10 sites around the world
– The production version of SAM-Grid is going to be deployed for DZero in April; we look forward to CDF following