ALICE internal and external network requirements @T2s

ALICE internal and external network requirements @T2s
WLCG Workshop 2017, University of Manchester, 20 June 2017
Latchezar Betev

The ALICE Grid today
T0 + 8 T1s + 59 T2s:
- 54 in Europe
- 3 in North America
- 9 in Asia
- 1 in Africa
- 2 in South America

Average CPU count
- ~130K CPU cores (April 2017)
- 60% @T0+T1s, 40% @T2s

WAN @T2s
- 4PB = data traffic from external sites (to local SE and clients)
- 6PB = data traffic to external sites and off-Grid export (to remote SE + Grid and non-Grid clients)

WAN Numbers @T2s
- Total WAN traffic is 1/11 of LAN traffic
- WAN total for T2s = 650MB/sec => 12.5KB/sec/core (52K cores); a back-of-the-envelope check is sketched below
- Our 'advisory rule' for T2s is "100Mb of WAN network per 1K cores"
- Usually the available bandwidth is superior to the above
- ~Independent of T2 role – a diskless T2 has to export ~2x the produced data; a T2 with an SE exports one copy and accepts copies from other centres
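A quick consistency check of these per-core numbers, as a minimal Python sketch (the helper functions are illustrative, not part of any ALICE tool):

```python
# Back-of-the-envelope check of the T2 WAN figures quoted above.

def per_core_wan_kb_s(total_wan_mb_s: float, cores: int) -> float:
    """Average WAN throughput per core, in KB/sec."""
    return total_wan_mb_s * 1000.0 / cores

def advisory_rule_kb_s(mbit_s_per_1k_cores: float = 100.0) -> float:
    """Convert the advisory rule '100Mb WAN per 1K cores' into KB/sec per core."""
    mb_s_per_1k_cores = mbit_s_per_1k_cores / 8.0   # Mbit/s -> MB/s for a 1000-core block
    return mb_s_per_1k_cores * 1000.0 / 1000.0      # MB -> KB, then divide by the 1000 cores

observed = per_core_wan_kb_s(650.0, 52_000)   # 650 MB/s over 52K cores
advised  = advisory_rule_kb_s(100.0)          # "100Mb WAN per 1K cores"
print(f"observed {observed:.1f} KB/s/core vs advised {advised:.1f} KB/s/core")
```

Both come out at 12.5 KB/sec/core, i.e. the observed average sits right at the advisory figure, while the actually available bandwidth is usually higher.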

LAN @T2s
- 100PB = reading from local SE
- 7PB = writing to local SE
- Note the spiky structure of the traffic – these are analysis trains running during the night (most of the load on the SEs)

LAN Numbers @T2s
- LAN sustains by far the highest traffic (jobs go to data)
- About 2% of data is read remotely (for example, in case of a local SE issue)
- 90% of the LAN traffic is analysis; 10% is local + remote client writes (for a T2 with an SE)
- Diskless T2s: LAN = WAN traffic
- Figure of merit (for 85%+ CPU efficiency): 2.5MB/sec/core from the local SE
- Remote access penalty: RTT introduces an efficiency penalty of ~15% per 20ms (see the sketch below)
- Remote read is feasible for small data samples and as failover
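As an illustration of the remote-access penalty quoted above, a minimal sketch assuming a simple linear model (~15 percentage points of efficiency per 20ms of RTT); this is a reading of the slide numbers, not the ALICE accounting formula:

```python
# Illustrative linear model: ~15 percentage points of CPU efficiency lost per 20 ms of RTT.

def estimated_cpu_efficiency(base_efficiency: float, rtt_ms: float,
                             penalty_per_20ms: float = 0.15) -> float:
    penalty = penalty_per_20ms * (rtt_ms / 20.0)
    return max(0.0, base_efficiency - penalty)

print(estimated_cpu_efficiency(0.85, 0.5))   # local SE, sub-ms RTT: ~0.85
print(estimated_cpu_efficiency(0.85, 40.0))  # remote SE, 40 ms RTT: ~0.55
```

This is why remote reads are treated as a failover or reserved for small samples rather than as the default access mode.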

Role of tiers
- T0/T1s/T2s – the only differentiation is RAW data storage and processing (T0/T1s)
- All tiers with storage do MC and analysis
- Diskless T2s can do only MC; all their data is exported to nearby T0/T1s/T2s

Role of network monitoring
- ALICE uses an in-house measured network map
  - Built through all-to-all periodic bandwidth tests between the site VO-boxes
- Used to determine the best placement for data from every client (Storage Auto-discovery) in real time
- Based on network topology
  - Traceroute/tracepath for RTT (bidirectional)
  - Single TCP stream bandwidth tests (like jobs do)
=> Distance metric is a function of the above measurements plus storage availability and occupancy levels; a sketch of such a metric follows below
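A hypothetical sketch of how such a distance metric could combine these inputs (the weights, field names, and formula are illustrative assumptions, not the actual AliEn/MonALISA implementation):

```python
# Hypothetical distance metric over the measurements listed above; illustration only.
from dataclasses import dataclass

@dataclass
class StorageElement:
    name: str
    rtt_ms: float          # from traceroute/tracepath
    bandwidth_mbps: float  # from single-TCP-stream bandwidth tests
    available: bool        # SE functional status
    occupancy: float       # used/total space, in [0, 1]

def distance(se: StorageElement) -> float:
    """Lower is better; unavailable SEs sort last."""
    if not se.available:
        return float("inf")
    # Penalize latency and fullness, reward measured bandwidth.
    return se.rtt_ms / max(se.bandwidth_mbps, 1.0) + 10.0 * se.occupancy

def rank_storage(candidates: list[StorageElement]) -> list[StorageElement]:
    """Order candidate SEs for placing a new replica (auto-discovery style)."""
    return sorted(candidates, key=distance)

ses = [
    StorageElement("SE_A", rtt_ms=2.0,  bandwidth_mbps=900.0, available=True,  occupancy=0.80),
    StorageElement("SE_B", rtt_ms=35.0, bandwidth_mbps=400.0, available=True,  occupancy=0.40),
    StorageElement("SE_C", rtt_ms=5.0,  bandwidth_mbps=800.0, available=False, occupancy=0.10),
]
print([se.name for se in rank_storage(ses)])  # -> ['SE_B', 'SE_A', 'SE_C']
```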

Network paths

Network tuning
- A comprehensive WAN network picture is essential for efficient data transfers and placement
- ALICE depends on and strongly encourages the full implementation of LHCONE at all sites
  - Plus the associated tools (like perfSONAR), properly instrumented so they can be used in the individual Grid frameworks
- Huge progress so far (thank you, network experts!)
- A number of sites in well-connected parts of the world still have to join
- Entire regions with high local growth of computing resources, in particular Asia, require a lot of further work