Artem Trunov Computing Center IN2P3


CMS Grid
Artem Trunov, Computing Center IN2P3

Topics
- Numbers, policies (overview)
- Tools
- Site challenges
- Some flashbacks to BaBar and ALICE
- Emphasis on data management

CMS numbers, 2008 (credit: Dave Nebold, Bristol)
- Data reduction chain: RAW (1.5 MB) → RECO (0.25 MB) → AOD (0.05 MB)
- Events: 92 days of running at ~300 Hz trigger rate → 1.2 B events, 450 MB/s
- Sizes: FEVT (RAW+RECO) = 1.2 PB; ReReco = 300 TB × 3 = 0.9 PB; AOD = 60 TB × 4 = 240 TB
- Sim-to-real ratio is 1:1 (+30% size overhead)
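The slide's figures can be cross-checked with a few lines of arithmetic. This is a back-of-envelope sketch using only the numbers quoted above, not CMS bookkeeping code.

```python
# Back-of-envelope check of the 2008 CMS numbers quoted above.
RAW_MB, RECO_MB, AOD_MB = 1.5, 0.25, 0.05   # per-event sizes from the slide
TRIGGER_HZ = 300                             # trigger rate from the slide
EVENTS = 1.2e9                               # quoted total for 92 days

raw_rate_mb_s = TRIGGER_HZ * RAW_MB          # streaming rate out of the trigger
reco_pass_tb = EVENTS * RECO_MB / 1e6        # one ReReco pass, in TB (1 TB = 1e6 MB)
aod_tb = EVENTS * AOD_MB / 1e6               # one AOD version, in TB

print(f"{raw_rate_mb_s:.0f} MB/s, {reco_pass_tb:.0f} TB, {aod_tb:.0f} TB")
# 450 MB/s, 300 TB, 60 TB -- matching the slide
```

The 450 MB/s rate, 300 TB ReReco pass, and 60 TB AOD all follow directly from the per-event sizes and the 1.2 B event total.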

Policies
- RAW: 100% on disk; a copy on tape at CERN and at each T1
- RECO: 100% on disk; second-pass RECO on disk and tape at the production T1
- AOD: "current" versions (most recent + 1 older) 100% on disk; on tape at the production T1 site; replicated on disk at all other T1s
- SIM data: 10% on disk
Totals: ~6 PB of disk, ~10 PB of tape

Site roles
- Tier0: initial reconstruction; archive RAW + RECO from first reconstruction; analysis, detector studies, etc.
- Tier1: archive a fraction of RAW (2nd copy); subsequent reconstruction; "skimming" (off AOD); archiving sim data produced at T2s; serving AOD data to other T1s and T2s; analysis
- Tier2: simulation production

Transfers
- T0: RAW+RECO to T1s at ~250 MB/s aggregate; some services to "orphan" T2s
- T1 (numbers for a typical T1 like IN2P3): RAW+RECO from CERN at ~25 MB/s; AOD exchange at ~100 MB/s in and out; SIM data from T2s at ~15 MB/s; all data to T2s at ~100 MB/s
In CMS any T2 can transfer from any T1; it is not a rigid tier model.
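The per-site bandwidth budget above can be tallied in one place. A small sketch, assuming a 50/50 split of the "~100 MB/s in and out" AOD exchange (that split is our assumption, not stated on the slide):

```python
# Rough tally of the transfer flows for a typical CMS T1 (e.g. IN2P3),
# using the rates quoted above. (name, direction, MB/s)
flows = [
    ("RAW+RECO from CERN", "in", 25),
    ("AOD exchange in",    "in", 50),   # assumed half of ~100 MB/s in+out
    ("AOD exchange out",   "out", 50),  # assumed other half
    ("SIM from T2s",       "in", 15),
    ("All data to T2s",    "out", 100),
]
inbound = sum(rate for _, d, rate in flows if d == "in")
outbound = sum(rate for _, d, rate in flows if d == "out")
print(f"inbound ~{inbound} MB/s, outbound ~{outbound} MB/s")
# inbound ~90 MB/s, outbound ~150 MB/s
```

Note that outbound dominates, which is exactly the traffic a T1 does not control under the FTS deployment scheme discussed later.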

CMS Tools
[Diagram: UI, Oracle, WebInterface, FTS, CRAB, PhEDEx, srmcp, DBS/DLS, ProdAgent, LFC]
- Bookkeeping tools: description of data (datasets, provenance, production information, data location). DBS/DLS (Data Bookkeeping Service / Data Location Service)
- Production tools: manage scheduled or personal production of MC, reconstruction, skimming. ProdAgent
- Transfer tools: all scheduled inter-site transfers. PhEDEx
- User tools: for end users to submit analysis jobs. CRAB

Bookkeeping tools
A centrally set-up database with different scopes ("local", "global") to separate official data from personal/group data. Uses an Oracle server at CERN. The data location service uses either Oracle or LFC. A very complex system.

Production agent (ProdAgent)
Manages all production jobs at all sites. Instances are run by production operators (1 for OSG + 3 for EGEE sites). Merges production job output to create reasonably sized files. Registers new data in PhEDEx for transfer to its final destination (MC data produced at a T2 goes to a T1). All production info goes to DBS.

PhEDEx
The only tool that is required to run at all sites (work in progress to remove this requirement). A set of site-customizable agents that perform various transfer-related tasks. Uses Oracle at CERN to keep its state information. Can use FTS or srmcp to perform transfers, or other means like direct gridftp, but CMS requires SRM at sites. Uses a "pull" model: transfers are initiated at the destination site. One of the oldest and most stable software components of CMS. The secret of its success: development is carried out by CERN and site people who are or were involved in daily operations. Uses someone's personal proxy certificate.
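The "pull" model described above can be sketched in a few lines: the agent at the destination site drains a queue of transfers routed to it and initiates the copies itself. All names here are illustrative, not the real PhEDEx agent API.

```python
# Minimal sketch of a destination-side 'pull' transfer agent.
# In the real system, the queue comes from the central Oracle DB at CERN
# and `copy` wraps an FTS job submission or an srmcp invocation.
from typing import Callable, Iterable, Tuple

Transfer = Tuple[str, str]  # (source SURL, destination SURL)

def pull_once(queued: Iterable[Transfer],
              copy: Callable[[str, str], bool]) -> int:
    """Drain one batch of transfers routed to this site; return the
    number of successful copies."""
    ok = 0
    for src, dst in queued:
        if copy(src, dst):  # the destination initiates the copy
            ok += 1
    return ok

# Example with a stub copy backend (hypothetical SURLs):
batch = [("srm://cern.example/store/a.root", "srm://t1.example/store/a.root")]
done = pull_once(batch, lambda src, dst: True)
print(done)  # 1
```

The point of the pull model is visible in the sketch: the destination decides when to fetch, so a site can throttle its own incoming load.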

CRAB
A user tool to submit analysis jobs. The user specifies a dataset to be analyzed, his application and its configuration. CRAB then: locates the data with DLS; splits the work into jobs; submits them to the Grid; tracks the jobs; collects the results. It can be configured to upload job output to a Storage Element.
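The job-splitting step above can be illustrated with a toy version: given the files a dataset resolves to (via DLS) and a target number of events per job, group files into per-job lists. This is purely illustrative, not CRAB's internal algorithm.

```python
# Toy sketch of dataset splitting: pack files into jobs of roughly
# `events_per_job` events each. File names and counts are made up.
from typing import Dict, List

def split_jobs(files: Dict[str, int], events_per_job: int) -> List[List[str]]:
    jobs: List[List[str]] = []
    current: List[str] = []
    acc = 0
    for name, n_events in files.items():
        current.append(name)
        acc += n_events
        if acc >= events_per_job:   # job is full, start a new one
            jobs.append(current)
            current, acc = [], 0
    if current:                     # leftover files form the last job
        jobs.append(current)
    return jobs

dataset = {"f1.root": 500, "f2.root": 500, "f3.root": 700}
jobs = split_jobs(dataset, events_per_job=1000)
print(len(jobs))  # 2
```

Each resulting file list would become one Grid job running the user's application over its share of the dataset.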

Other CMS tools
- Software installation service: in CMS, software is installed in the VO area via grid jobs.
- Dashboard: collects monitoring information (mostly about jobs) and presents it through a web interface.
- ASAP: automatic submission of analysis jobs using a web interface.

CMS site services
"CMS does not require a VO Box", but it does need a machine at the site to run PhEDEx agents, and the VO Box is a very useful concept that CMS can take advantage of:
- gsissh access: passwordless, uses your grid proxy
- proxy-renewal service: automatically retrieves from the myproxy server the personal proxy required to make transfers
Other services can be run on the VO Box, like ProdAgent. At CC-IN2P3 we have migrated the CMS services to a standard VO Box, with a setup shared with the ALICE services, so a VO Box can be interchanged in case of failure.

CMS also requires a site support person: someone needs to install PhEDEx and look after transfers. It may seem too much, but this works for CMS: a big collaboration with a complex computing model, and someone needs to keep it all in order.

Having a support person on site pays! Such a person:
- usually has working experience with the experiment's computing
- understands the experiment's computing model and requirements
- is proactive in problem solving
- works to make colleagues happy
- brings mutual benefit to the site and the experiment
The current level of Grid and experiment middleware really requires a lot of effort!

CMS Tier1 people: who are they? A little survey of CMS computing support at sites:
- Spain (PIC): a dedicated person
- Italy (CNAF): a dedicated person
- France (IN2P3): yes
- Germany (FZK): no; CMS support is done by the University of Karlsruhe, ALICE support by a GSI person
- SARA/NIKHEF: no
- Nordic countries: terra incognita
- US (FNAL): yes
- UK (RAL): yes, but virtually no experiment support at UK Tier2 sites

Latitude defines attitude? [Map: less VO support → more VO support]

Storage story
Data management is a cornerstone of HEP experiments. It never seemed difficult:
BaBar (SLAC): ~1 PB of data; ~25 MC production centers; first reconstruction in Italy; transfers use ssh authentication (with keys) and bbcp or bbftp (multi-streaming); 6 servers in total for import and migration of data to MSS; the same tools used for MC, data, and skimming imports.
Maybe it was easy for BaBar because BaBar developed only what served this particular experiment, without pretending to make the whole world happy?

How did we get here?
- Choice of authentication model (GSI) → gridftp: but the protocol requires many open ports and is difficult to use from behind NAT
- Interoperability → SRM: did not initially target LHC; incompatible implementations; obscure semantics
- Transfer management → FTS (developed at CERN): does not fully address bandwidth management

FTS deployment scheme
In this scheme CERN manages its own traffic, in and out, but a T1 site only manages incoming traffic: your outgoing traffic is not under your control! This is a big issue: one has to account for a potential increase in traffic and take measures to guarantee incoming transfers:
- increase the number of servers in transfer pools
- make separate write pools (import) and read pools (export)
[Diagram: CERN ↔ T1 traffic managed by CERN's FTS; T1 ↔ T2 traffic managed by the T1's FTS]

SRM
SRM (Storage Resource Management) emerged at LBNL (NERSC) as a proposal for a common interface to managing storage resources. In LCG it was also viewed as a great tool to provide inter-site interoperability and give experiments a storage-independent way of managing their data at the numerous institutions participating in LHC and other experiments. It is now a baseline service. SRM v1 was specified and implemented in Castor, dCache, and DPM. Some interoperability issues were uncovered, and experiments raised concerns that the functionality of the SRM v1 interface did not cover their use cases. At the Mumbai SC4 workshop it was agreed to adopt certain conventions to carry out SC4, and then come up with a new standard proposal. The FNAL workshop in May produced the SRM v2.2 standard and introduced the concept of storage classes (see also the pre-GDB meetings in October and November for more details). We have finally started to speak the storage-classes formalism, but this concept and its implementations also introduce some inconveniences, and some experiments are not keen on using it.

ALICE: an xrootd alternative
Xrootd is a data access system developed at SLAC and first used in BaBar. Xrootd is thought to scale better with the number of clients (especially for simultaneous open requests) and also needs much less management effort. However, it doesn't have serious authorization and data management functions. A good question: do you really need them? ALICE has developed its own authorization plugin for xrootd and uses xrootd in production (both for jobs' data access and for inter-site transfers), though nowadays it is going to switch to gridftp/FTS for scheduled transfers from CERN to the T1s. At CC-IN2P3 we also experiment with xrootd and have deployed a solution suitable for all LHC experiments. More on this later, but first, our xrootd setup.

CMS storage at CC-IN2P3
[Diagram: T0/T1/T2 transfers arrive via srm://, managed by PhEDEx; an import/export pool in the DMZ (srmcp) talks to HPSS via rfcp; production and analysis pools in the internal zone serve jobs via dcap://; production and merging jobs write small and big files, analysis jobs read from the analysis pool]

Experiments with xrootd
We see the following use for xrootd: a large read-only disk cache in front of tape storage, with no security overhead, while transfers (SRM) and fancier data management functions are provided by dCache. For xrootd setup and deployment details, see my tutorial at the CERN T2 workshop in June 2006.
[Diagram: HPSS feeds, via rfcp, both a dCache analysis pool (dccp) and an xrootd analysis pool (root://) accessed by analysis jobs]

Other tests: PROOF
PROOF is part of ROOT and provides a mechanism to parallelize the analysis of ROOT trees and ntuples. It requires high-speed access to data, which is not often possible in current setups because of WN and server bandwidth limitations:
- WNs: 1 Gb/s per rack → ~1.5 MB/s per CPU core
- servers: usually have 1 Gb/s, but the disk subsystem may perform better
The solution endorsed by the PROOF team, large dedicated clusters with locally pre-loaded data, is not preferable at the moment.
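The per-core figure above follows from sharing a rack uplink across its cores. A back-of-envelope sketch; the 80-core rack size is our assumption, chosen to reproduce the slide's ~1.5 MB/s figure:

```python
# Back-of-envelope for the per-core bandwidth quoted above:
# a 1 Gb/s uplink shared by all CPU cores in one rack.
uplink_mb_s = 1e9 / 8 / 1e6     # 1 Gb/s -> 125 MB/s
cores_per_rack = 80             # assumed rack size, not from the slide
per_core = uplink_mb_s / cores_per_rack
print(f"~{per_core:.2f} MB/s per core")  # ~1.56 MB/s
```

At these rates, a PROOF session scanning data over the rack uplink is network-bound long before it is CPU-bound, which motivates the local-data solution in the next slide.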

Our solution under test
PROOF agents run on the xrootd cluster and take advantage of the following:
- free CPU cycles, due to the low overhead of xrootd
- a zero-cost solution: no new hardware involved
- direct access to data on disk, using the 1 Gb node interconnect only when inter-server access is required
- transparent access to the full dataset stored at our T1, for all experiments, via the xrootd-dCache link deployed on this xrootd cluster and dynamic staging
- management of the infrastructure fits conveniently into existing xrootd practices
- this setup is closer to a possible 2008 PROOF solution because of the 1 Gb node connection and large "scratch" space
- this is a kind of setup that T2 sites may also consider deploying

Common-sense advice to a site developing storage infrastructure
Big world, few choices.
- Tape management: Castor? HPSS! Used happily by large computing centers (SLAC, BNL, LBNL, IN2P3) and a major contributor to the overall success of those sites.
- Disk management: nothing is perfect, but if SRM is required, a hybrid dCache+xrootd solution seems to be the best.
- Decision making: work closely with the experiments!

The End! CMS will be ok!