CMS massive transfers Artem Trunov

CMS site roles
- Tier0
  - Initial reconstruction
  - Archive RAW + RECO from first reconstruction
  - Analysis, detector studies, etc.
- Tier1
  - Archive a fraction of RAW (2nd copy)
  - Subsequent reconstruction
  - “Skimming” (off AOD)
  - Archiving of Sim data produced at T2s
  - Serve AOD data to other T1s and T2s
  - Analysis
- Tier2
  - Simulation production

Data Distribution
- T0 – T1: RAW + first-pass RECO, AOD
- T1 – T1: subsequent RECO passes, AOD exchange
- T1 – T2: AOD and other data
- T2 – T1: MC upload
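Purely as an illustration (the data structure and names below are invented for this note, not taken from any CMS tool), the same link-to-payload mapping can be written down as a small lookup table:

    # Illustrative lookup table of the inter-tier flows listed above.
    CMS_FLOWS = {
        ("T0", "T1"): ["RAW", "first-pass RECO", "AOD"],
        ("T1", "T1"): ["subsequent RECO passes", "AOD exchange"],
        ("T1", "T2"): ["AOD and other data"],
        ("T2", "T1"): ["MC upload"],
    }
    print(CMS_FLOWS[("T1", "T2")])  # what a T2 pulls from a T1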

Storage at a typical T1 (tape)
Dave's estimate:
  total = SIM + tapefill * (AnaStore + Nstreams * (streamRAW + NReco * (sRECO + sAOD + AnaGroupSpace))) + Tape2007
  1027  = 249 + 1 * (50 + 5 * (36 + 3 * (6 + 1.2 + 10))) + 290
        = 249 + 50 + 180 + 108 + 150 + 290
Custodial (RAW + re-Reco + MC): 537 = Dave's total without Tape2007, AnaStore and AnaGroupSpace
  - SIM 249
  - RAW 180 (+30 + 6 of first reco & aod?)
  - RECO: 3 passes x 30 = 90
  - AOD: 3 passes x 6 = 18
Tape subtotal: custodial + 2007 tape = 827
Tape for AOD exchange?
  - AOD exchange worth of 1 year: 4 x 54 = 216
  - Or current RECO exchange: 270
With AOD exchange: 1043; with RECO exchange: 1097
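The arithmetic above is easy to recheck; here is a short Python version of it (parameter names follow the formula on the slide, and the values are presumably in TB since the slide does not label the units):

    # Recompute the tape estimate from the slide's formula (values presumably in TB).
    SIM, tapefill, AnaStore = 249, 1, 50
    Nstreams, streamRAW = 5, 36
    NReco, sRECO, sAOD, AnaGroupSpace = 3, 6, 1.2, 10
    Tape2007 = 290

    total = round(SIM + tapefill * (AnaStore + Nstreams *
                  (streamRAW + NReco * (sRECO + sAOD + AnaGroupSpace))) + Tape2007)
    # Custodial part only: drop Tape2007, AnaStore and AnaGroupSpace.
    custodial = round(SIM + Nstreams * (streamRAW + NReco * (sRECO + sAOD)))
    subtotal = custodial + Tape2007     # custodial + 2007 tape

    print(total)                        # 1027
    print(custodial)                    # 537 = SIM 249 + RAW 180 + RECO 90 + AOD 18
    print(subtotal)                     # 827
    print(subtotal + 4 * 54)            # 1043 with one year of AOD exchange
    print(subtotal + 270)               # 1097 with the current RECO exchange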

Nominal rates for a typical T1
- T0 – T1 RAW: ~30 MB/s (2100 TB, 92 days of running; 260 MB/s out of CERN during run days)
- T1 – T1 AOD exchange:
  - In: ~50 MB/s (60 TB sample from other T1s during 14 days)
  - Out: ~50 MB/s (6 TB x 7 sites for 14 days)
- T1 – T2: ~100 MB/s (60 TB x 30 T2s / 7 T1s / 30 days)
- T2 – T1 MC: in ~10 MB/s (250 TB for 356 days)
Summary: at most ~100 MB/s incoming, always to tape + disk; at most ~150 MB/s outgoing, mostly from disk.
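Each of these rates is just a volume divided by a time window; the helper below redoes that arithmetic as a sanity check (the 7-T1 share of the CERN export is the slide's own assumption, and the function name is invented for this example):

    # Back-of-envelope check of the nominal rates: rate (MB/s) = volume / time.
    def rate_mb_s(volume_tb, days):
        return volume_tb * 1e6 / (days * 86400)   # 1 TB = 1e6 MB

    print(rate_mb_s(2100, 92))           # ~264 MB/s out of CERN during run days
    print(rate_mb_s(2100, 92) / 7)       # ~38 MB/s RAW share per T1 (quoted as ~30)
    print(rate_mb_s(60, 14))             # ~50 MB/s T1 - T1 AOD import
    print(rate_mb_s(6 * 7, 14))          # ~35 MB/s T1 - T1 AOD export (quoted as ~50)
    print(rate_mb_s(60 * 30 / 7, 30))    # ~99 MB/s T1 - T2
    print(rate_mb_s(250, 356))           # ~8 MB/s T2 - T1 MC upload (quoted as ~10)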

PhEDEx, the CMS data transfer tool
- The only tool that is required to run at all sites
  - Work in progress to ease this requirement: with SRM, fully remote operation is possible; however, local support is still needed to debug problems.
- A set of site-customizable agents that perform various transfer-related tasks:
  - download files to the site
  - produce SURLs of local files for other sites to download
  - follow migration of files to the MSS
  - stage files from the MSS
  - remove local files
- Uses a 'pull' model of transfers, i.e. transfers are initiated at the destination site by the PhEDEx instance running there (sketched below).
- Uses Oracle at CERN to keep its state information
- Can use FTS to perform transfers, or srmcp, or other means such as direct gridftp, but CMS requires SRM at sites.
- One of the oldest and most stable SW components of CMS
  - Secret of success: development is carried out by the CERN + site people who are/were involved in daily operations
- Uses someone's personal proxy certificate
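As a rough illustration of the pull model only (this is not PhEDEx code or its API; the function names, the queue format and the direct srmcp call are invented for the sketch), a destination-site download agent conceptually does something like this:

    # Conceptual sketch of a pull-model download agent running at the destination site.
    # Real PhEDEx agents are Perl daemons that coordinate through the central Oracle
    # database at CERN; everything below is simplified for illustration.
    import subprocess
    import time

    def pending_transfers(site):
        """Placeholder: ask the central bookkeeping which files this site should fetch."""
        return []   # list of (source_surl, destination_surl) pairs

    def download_agent(site, poll_interval=60):
        while True:
            for src_surl, dst_surl in pending_transfers(site):
                # The destination initiates the copy, e.g. via srmcp or an FTS job.
                subprocess.run(["srmcp", src_surl, dst_surl], check=False)
            time.sleep(poll_interval)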

Results so far – SC4 and CSA06
[transfer-rate plots: from FNAL; from CERN]

IN2P3 transfer rate - CSA06
23 MB/s average. The goal was stability; there were problems with new HW.

T1 – T1 and T1 – T2
CMS has not yet demonstrated great performance in this area. So far, tests have involved simultaneous transfers at T2s and T1s, and broken links with hanging transfers sometimes "jam" sites, so real performance cannot be determined. Besides, the overwhelming number of errors simply prevents debugging. This year started with a new format of the LoadTest, where dedicated transfer links are centrally activated and subscribed; this will bring a much better understanding of individual link performance.

Problems – SRM instability
- very complex system
- poor implementation

How did we get there?
- Choice of authentication model (GSI) → gridftp
  - But the protocol requires so many open ports and is difficult to deal with from behind a NAT
- Interoperability → SRM
  - Did not initially target LHC; incompatible implementations, obscure semantics
- Transfer management → FTS (developed at CERN)
  - Does not fully address bandwidth management

Problems - FTS deployment scheme
- In this scheme CERN has the advantage of managing its own traffic, both in and out. But a T1 site only manages incoming traffic! Your outgoing traffic is not under your control!
- Big issue: one has to account for a potential increase in traffic and take measures to guarantee incoming traffic
  - Increase the number of servers in transfer pools
  - Make separate write pools (import) and read pools (export)
- [diagram: CERN to/from T1 traffic is managed by the CERN FTS; T1 to/from T2 traffic is managed by the T1 FTS]
- Nor does PhEDEx manage outgoing traffic!
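A toy model makes the asymmetry explicit (purely illustrative; the site names and the region mapping below are invented, and this is not how FTS is actually configured, just the scheme sketched above):

    # Toy model of the FTS deployment scheme: channels involving CERN live on the CERN
    # FTS; all other channels live on the FTS of the destination region's T1.
    REGION_T1 = {"T1_FR": "T1_FR", "T1_DE": "T1_DE",
                 "T2_FR_A": "T1_FR", "T2_DE_A": "T1_DE"}   # site -> regional T1

    def managing_fts(src, dst):
        if "CERN" in (src, dst):
            return "CERN FTS"               # CERN manages its own traffic, in and out
        return REGION_T1[dst] + " FTS"      # otherwise: the destination region's T1 FTS

    for link in [("CERN", "T1_FR"), ("T1_FR", "T1_DE"),
                 ("T2_FR_A", "T1_FR"), ("T1_FR", "T2_DE_A")]:
        print(link, "->", managing_fts(*link))
    # T1_FR controls its imports (T2_FR_A -> T1_FR), but its exports to T1_DE or to a
    # German T2 are handled by the T1_DE FTS: outgoing traffic is not under its control.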

Throughput tests T0 – T1
When it comes to records, PhEDEx-driven transfers reached 250 MB/s from CERN into a common SC buffer, in the absence of other VO transfers. The credit is shared between the Lyon storage admins, the CERN FTS people and the CMS coordination people.

Plans until Fall’07
- Transfer tests in line with LCG plans
  - Feb-March: 65% of 2008 rates, ~21 MB/s
  - April-May: same with SRM 2.2
- LoadTest for testing inter-site links - permanent
- CSA07 - July
  - 50 MB/s from CERN to tape
  - 50 MB/s aggregate from T1s to tape
  - 50 MB/s aggregate to T1s
  - 100 MB/s aggregate to (5) T2s in 8h bursts
  - 50 MB/s from (5) T2s to tape
  - Basically testing the 2008 numbers during July (the rate targets are converted to volumes in the sketch below)
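To get a feel for what the CSA07 targets mean in volume, the small conversion below turns the rates above into rough totals (plain arithmetic on the slide's numbers, assuming a 31-day July):

    # Convert the CSA07 rate targets into rough volumes.
    def volume_tb(rate_mb_s, hours):
        return rate_mb_s * hours * 3600 / 1e6   # MB -> TB

    print(volume_tb(50, 24))        # ~4.3 TB/day for each sustained 50 MB/s target
    print(volume_tb(100, 8))        # ~2.9 TB per 8-hour burst at 100 MB/s to the T2s
    print(volume_tb(50, 24) * 31)   # ~134 TB to tape over a 31-day July at 50 MB/s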