Project: COMP_01 R&D for ATLAS Grid computing


Project: COMP_01 R&D for ATLAS Grid computing
Tetsuro Mashimo, International Center for Elementary Particle Physics (ICEPP), The University of Tokyo, on behalf of the COMP_01 project team
2016 Joint Workshop of FKPPL and TYL/FJPPL, May 18-20, 2016 @ Korea Institute for Advanced Study, Seoul

COMP_01 "R&D for ATLAS Grid computing"
Cooperation between French and Japanese teams in R&D on ATLAS distributed computing for the LHC Run 2 era (2015~2018)
Goal: tackle the important challenges of the next years: new computing model, hardware, software, and networking issues
Partners: the International Center for Elementary Particle Physics (ICEPP), the University of Tokyo (the WLCG (Worldwide LHC Computing Grid) Japanese Tier-2 center for ATLAS) and the French Tier-2 centers and Tier-1 center (Lyon)

COMP_01: members

  French group        Lab.      Japanese group    Lab.
  E. Lançon*          Irfu      T. Mashimo*       ICEPP
  L. Poggioli         IN2P3     I. Ueda           KEK
  R. Vernet                     T. Nakamura
  M. Jouvin                     N. Matsui
  S. Jézéquel                   H. Sakamoto
  C. Biscarat                   T. Kawamoto
  J.-P. Meyer
  E. Vamvakopoulos

  * leader

LHC Run 2 (2015~2018)
Started in 2015 with a collision energy of 13 TeV, but the main goal of 2015 was to establish the important running parameters of the LHC for Run 2
Integrated luminosity delivered to ATLAS in 2015: ~4.2 fb-1
2016: ~25 fb-1 expected for ATLAS, which will put a heavier burden on computing than 2015
Run 3 (2021~): even bigger challenges
Cooperation must be strengthened on R&D for the basic day-to-day technical/operational aspects of the Grid

COMP_01 "R&D for ATLAS Grid computing"
Cooperation between the French and Japanese teams on ATLAS distributed computing has been going on for 10 years (previous projects "LHC_02" (2006~2012) and "LHC_07" (2013~2014))
The ICEPP Tier-2 provides resources for ATLAS only (one of the largest Tier-2 centers for ATLAS) and is "associated with" the Tier-1 center in Lyon
A main interest in the past was the efficient use of the wide area network, especially for transfers between the ICEPP Tier-2 and remote sites in Europe, etc.
e.g. Round Trip Time (RTT) ~300 msec between the ICEPP Tier-2 and the Tier-1 in Lyon

Network between Lyon and Tokyo (in an old era)
[Diagram: route from Tokyo via New York over SINET, GEANT and RENATER to Lyon; 10 Gb/s, RTT = 300 ms; other remote sites shown: ASGC (Taiwan), BNL (USA, Long Island), TRIUMF (Canada, Vancouver)]
Exploiting the bandwidth is not trivial: packet loss at various places, which leads to lower transfer speed; directional asymmetry in transfer performance; performance fluctuations in time; ...
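To illustrate why even a small packet-loss rate hurts so much at RTT ~ 300 ms, the per-stream TCP throughput can be roughly bounded with the Mathis approximation (throughput ~ MSS / (RTT * sqrt(loss))). A minimal sketch, with illustrative MSS and loss-rate values that are assumptions rather than measurements from this link:

    # Rough single-stream TCP throughput bound (Mathis et al. approximation):
    #   throughput ~ MSS / (RTT * sqrt(loss_rate))
    # MSS and loss rates are illustrative assumptions, not measurements.
    import math

    def tcp_throughput_mbps(mss_bytes, rtt_s, loss_rate):
        """Approximate upper bound on single-stream TCP throughput in Mbit/s."""
        return (mss_bytes * 8) / (rtt_s * math.sqrt(loss_rate)) / 1e6

    MSS = 1460   # bytes, typical Ethernet MSS
    RTT = 0.300  # seconds, Tokyo <-> Lyon in this era

    for loss in (1e-6, 1e-5, 1e-4):
        print(f"loss={loss:.0e}: ~{tcp_throughput_mbps(MSS, RTT, loss):6.1f} Mbit/s per stream")

Even a loss rate as low as 10^-6 limits a single stream to roughly 40 Mbit/s at this RTT, which is why many parallel streams and careful loss monitoring are needed to exploit a 10 Gb/s path.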

ATLAS Computing Model - Tiers

Implementation of the ATLAS computing model: tiers and clouds
Hierarchical tier organization based on the MONARC network topology
Sites are grouped into clouds for organizational reasons
Possible communications:
  Optical Private Network: T0-T1, T1-T1
  National networks: intra-cloud T1-T2
Restricted communications:
  General public network: inter-cloud T1-T2, inter-cloud T2-T2
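As a toy illustration of these rules, the sketch below encodes which transfers the strict MONARC-style model favours; the site names and cloud assignments are purely hypothetical, not the real ATLAS topology:

    # Toy sketch of the MONARC-style transfer rules listed above.
    # Site names and cloud assignments are purely illustrative.
    SITES = {
        # site: (tier, cloud)
        "T1_FR":  ("T1", "FR"),
        "T2_FR1": ("T2", "FR"),
        "T2_FR2": ("T2", "FR"),
        "T1_DE":  ("T1", "DE"),
        "T2_DE1": ("T2", "DE"),
    }

    def transfer_favoured(src, dst):
        """True if the strict model foresees a direct src -> dst transfer."""
        (t_src, c_src), (t_dst, c_dst) = SITES[src], SITES[dst]
        if t_src == "T1" and t_dst == "T1":
            return True                    # T1-T1 over the optical private network
        if c_src == c_dst and "T1" in (t_src, t_dst):
            return True                    # intra-cloud T1-T2 over national networks
        return False                       # inter-cloud T1-T2 and T2-T2 are restricted

    print(transfer_favoured("T1_FR", "T2_FR1"))   # True  (intra-cloud T1-T2)
    print(transfer_favoured("T2_FR1", "T2_DE1"))  # False (inter-cloud T2-T2)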

Detector Data Distribution
RAW and reconstructed data are generated at CERN (Tier-0) and dispatched to the T1s in O(2-4 GB) files (with exceptions)
Reconstructed data are further replicated downstream to the T2s of the SAME cloud
[Diagram: Tier-0 fanning out to the Tier-1s, each Tier-1 fanning out to the Tier-2s of its cloud]

Data distribution after reprocessing and Monte Carlo reconstruction
RAW data are re-processed at the T1s to produce a new version of the derived data
Derived data are replicated to the T2s of the same cloud
Derived data are also replicated to a few other T1s (or CERN) and, from there, to the T2s of the same cloud
[Diagram: O(2-4 GB) files (with exceptions) flowing from the Tier-1s to the Tier-2s of each cloud]

Monte Carlo production
Simulation (and some reconstruction) runs at the T2s
Input data hosted at the T1s are transferred to (and cached at) the T2s
Output data are copied and stored back at the T1s
For reconstruction, derived data are replicated to a few other T1s (or CERN) and, from there, to other T2s of the same cloud
[Diagram: input flowing from the Tier-1s to the Tier-2s and output flowing back to the Tier-1s]

Analysis
The paradigm is "jobs go to data", i.e. no WAN is involved:
Jobs are brokered to sites where the data have been pre-placed
Jobs access data only from the local storage of the site where they run
Jobs store their output in the storage of the site where they run
(by Simone Campana, ATLAS TIM, Tokyo, May 2013)
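A minimal sketch of the "jobs go to data" brokering idea; the dataset names, replica placement and free-slot numbers are hypothetical, and the real ATLAS brokerage (PanDA) weighs many more factors:

    # Minimal sketch of "jobs go to data" brokering.
    # Dataset placement, site names and slot counts are hypothetical.
    DATASET_REPLICAS = {
        "data16.physics_Main.AOD": {"T2_FR1", "T2_DE1"},
        "mc16.ttbar.DAOD":         {"T2_FR2"},
    }

    SITE_FREE_SLOTS = {"T2_FR1": 120, "T2_FR2": 10, "T2_DE1": 300}

    def broker(input_dataset):
        """Pick the site hosting the input data with the most free job slots."""
        candidates = DATASET_REPLICAS.get(input_dataset, set())
        if not candidates:
            return None  # would first require a data replication request
        return max(candidates, key=lambda site: SITE_FREE_SLOTS.get(site, 0))

    print(broker("data16.physics_Main.AOD"))  # -> "T2_DE1"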

Issues - I
You need data at some T2 (normally "your" T2), but the inputs are at some other T2 in a different cloud
Examples: outputs of analysis jobs, replication of particular samples on demand
According to the model, the transfer should be routed via the T1s of the two clouds (T2 -> T1 -> T1 -> T2) rather than going directly between the two T2s
(by Simone Campana, ATLAS TIM, Tokyo, May 2013)

Issues - II
You need to process data available only at a given T1, but all sites of that cloud are very busy
You assign the jobs to some T2 of a different cloud
According to the model, the input should reach that T2 via the T1 of its cloud, and the output should come back the same way, rather than being exchanged directly between the clouds
(by Simone Campana, ATLAS TIM, Tokyo, May 2013)

Evolution of the ATLAS computing model
ATLAS decided to relax the "MONARC model":
Allow T1-T2 and T2-T2 traffic between different clouds (thanks to the growth of network bandwidth)
Any site can exchange data with any other site if the system believes it is convenient
In the past ATLAS asked the (large) T2s:
  To be well connected to their T1
  To be well connected to the T2s of their cloud
Now ATLAS asks the large T2s:
  To be well connected to all T1s
  To foresee non-negligible traffic from/to other (large) T2s
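In the relaxed model, "convenient" is decided from link metrics rather than from cloud boundaries. The sketch below picks the cheapest replica for a transfer; the cost table is hypothetical, not the real ATLAS source-ranking logic:

    # Sketch of cost-based source selection in the relaxed (mesh) model.
    # The cost matrix is hypothetical; in reality such metrics are derived
    # from measured link performance (e.g. perfSONAR, transfer history).
    REPLICAS = {"mc16.ttbar.DAOD": ["T1_DE", "T2_FR1", "T2_US1"]}

    # Lower is better; any site can now serve as a source, whatever its cloud.
    LINK_COST = {
        ("T1_DE",  "TOKYO-LCG2"): 8,
        ("T2_FR1", "TOKYO-LCG2"): 5,
        ("T2_US1", "TOKYO-LCG2"): 3,
    }

    def best_source(dataset, destination):
        """Choose the replica with the lowest link cost to the destination."""
        return min(REPLICAS[dataset],
                   key=lambda s: LINK_COST.get((s, destination), float("inf")))

    print(best_source("mc16.ttbar.DAOD", "TOKYO-LCG2"))  # -> "T2_US1"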

COMP_01: R&D for ATLAS Grid computing
Networking therefore remains a very important issue, especially for a large Tier-2 like ICEPP, which now takes on various roles (some of which used to be mainly the responsibility of the Tier-1s): careful monitoring is necessary, e.g. with the perfSONAR tool
Other topics addressed by the collaboration:
  Use of virtual machines for operating WLCG services
  Improvement of the reliability of the storage middleware
  Evolution toward a new batch system
  Performance of data access from analysis jobs through various protocols (HTTP, FAX (Federated ATLAS storage systems using XRootD))
  Preparation of the evolution needed for Run 3
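One simple way to compare access protocols is to time the copy of the same test file over XRootD and over HTTP. This is only a sketch, assuming the xrdcp and davix-get command-line clients are installed, a valid grid proxy exists, and the endpoint and file path (placeholders below) are replaced with real ones:

    # Rough timing comparison of remote data access protocols.
    # Endpoint and file path are placeholders; assumes xrdcp and davix-get
    # are installed and a valid grid proxy is available.
    import subprocess
    import time

    TESTS = {
        "xrootd": ["xrdcp", "-f",
                   "root://se.example.org//atlas/test/file.root", "/tmp/test_xrootd.root"],
        "http":   ["davix-get",
                   "https://se.example.org:443/atlas/test/file.root", "/tmp/test_http.root"],
    }

    for proto, cmd in TESTS.items():
        start = time.time()
        result = subprocess.run(cmd, capture_output=True)
        status = "ok" if result.returncode == 0 else "failed"
        print(f"{proto}: {status} in {time.time() - start:.1f} s")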

New internet connections for the Tokyo Tier-2
"SINET" (Science Information Network), the Japanese academic backbone network provided by the National Institute of Informatics (NII), has been renewed: "SINET4" → "SINET5" (started in April 2016)
Backbone network inside Japan: "SINET4": 40 Gbps + 10 Gbps → "SINET5": 100 Gbps
International connections:
  Direct connection from Japan to Europe (10 Gbps x 2 via Siberia, instead of via the US as in SINET4)
  Round Trip Time (RTT) to Lyon: ~290 msec → ~190 msec
  Japan to US: 100 Gbps + 10 Gbps
The ICEPP Tier-2's connection to the outside will soon be upgraded from 10 Gbps to 20 Gbps
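The RTT reduction matters directly for TCP tuning: the bandwidth-delay product gives the amount of data that must be in flight to fill the link. A quick illustrative calculation (link speed and RTTs taken from the figures above):

    # Bandwidth-delay product: bytes "in flight" needed to fill the link.
    # Illustrates the benefit of the SINET5 RTT reduction (290 ms -> 190 ms).
    def bdp_megabytes(bandwidth_gbps, rtt_ms):
        return bandwidth_gbps * 1e9 / 8 * (rtt_ms / 1000) / 1e6

    for rtt_ms in (290, 190):
        print(f"10 Gb/s at RTT {rtt_ms} ms: "
              f"~{bdp_megabytes(10, rtt_ms):.0f} MB of TCP window needed")

At 10 Gb/s a single stream needs roughly 360 MB of window at 290 ms but only about 240 MB at 190 ms, so the shorter route via Siberia makes it easier to sustain high per-stream throughput.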

WAN for the TOKYO Tier-2 ("SINET4" era)
[Diagram: Tokyo connected via Osaka (40/20/10 Gbps domestic links) across the Pacific (Los Angeles) and the Atlantic (WIX) to Amsterdam and Geneva (dedicated line), reaching ASGC, BNL, TRIUMF, NDGF, RAL, CC-IN2P3, CERN, CNAF, PIC, SARA, NIKHEF; an additional 10 Gbps line in service since the end of March 2013]
LHCONE: new dedicated (virtual) network for Tier-2 centers, etc.
The "perfSONAR" tool has been put in place for network monitoring

Budget plan in the year 2016

  Item               Euro     Supported by   k Yen           Supported by
  Travel (unit)       1,000                  160
  Nb travels: 3       3,000   IN2P3          480             ICEPP
  Per-diem (unit)       235                   22.7
  Nb days: 15         3,525                  272 (12 days)
  Nb travels: 2       2,000   Irfu
  Nb days: 10         2,350
  Total              10,875                  752

Cost of the project
No cost for hardware is needed: the project uses the existing computing facilities at the Tier-1 and Tier-2 centers in France and Japan, and the existing network infrastructure provided by the NRENs and GEANT, etc.
Good communication is the key issue: e-mails and video-conferences are widely used, but face-to-face meetings are necessary, usually once per year (a small workshop), hence the cost for travel and stay