Grid Operations in Germany – T1-T2 workshop 2015, Torino, Italy
Kilian Schwarz, WooJin Park, Christopher Jung

Table of contents: Overview, GridKa T1, GSI T2, Summary

Map of German Grid sites: T1: GridKa/FZK in Karlsruhe; T2: GSI in Darmstadt

Job contribution (last year)
- running jobs: GSI 1%, FZK 6.5%, i.e. 7.5% of ALICE in total
- FZK: 9.5% of all DONE jobs (> 9 million jobs), the largest contributor in ALICE

Storage contribution (total size)
- GridKa: 3 PB disk SE, including a 0.7 PB tape buffer
  - all 10 xrootd servers have been upgraded to v4.0.4
  - the duplicated name space problem was solved with the new xrootd version
  - 5.25 PB tape capacity, with a 700 TB disk buffer in front of the tape backend
- GSI: xrootd shows the capacity of the complete Lustre cluster. Can this be limited to the size of a directory or some quota space?
  - an extra quota for the ALICE SE has been introduced; now a 3.3 PB disk-based SE

Table of contents: Overview, GridKa T1, GSI T2, Summary

GridKa usage (annual statistics 2014)
● ALICE and CMS use more than their nominal share
● ATLAS and LHCb use about their nominal share
- the two largest users are ATLAS and ALICE
- the 4 LHC experiments together use > 90% of the total used wall time: … hours

Jobs at GridKa within the last year
- max number of concurrent jobs: 9100
- average number of jobs: 4000 (+500 since Tsukuba)
- nominal share: ca. … jobs

ALICE jobs at GridKa
- average job efficiency: 77% (see the sketch below for the efficiency definition assumed here)
- 3 major efficiency drops:
  - before April 2014: a bug in the SE selection algorithm created heavy load on the GridKa firewall (fixed by the ALICE Offline team in April 2014)
  - December 2014 and February 2015: raw data processing (a number of mismatched/missing files between the AliEn catalogue and the GridKa tape storage led to many jobs reading data from remote)
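The slide does not define the efficiency figure; the minimal sketch below assumes the usual convention of CPU time divided by wall-clock time.

```python
def job_efficiency(cpu_time_s: float, wall_time_s: float) -> float:
    """CPU efficiency of a job: CPU seconds consumed per wall-clock second."""
    return cpu_time_s / wall_time_s

# Illustrative example: a job that used 6.2 CPU-hours during 8 wall-clock hours
print(f"{job_efficiency(6.2 * 3600, 8 * 3600):.0%}")  # about 78%, close to the quoted average
```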

xrootd-based SEs at GridKa work fine and are used intensively
- the disk-based xrootd SE has a size of 2.16 PB (+ 660 TB tape buffer)
- 28 PB have been read from ALICE::FZK::SE in 2014, an average reading rate of ~1 GB/s (see the check below); almost twice as much as reported in Tsukuba
- tape usage increased again
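A quick consistency check of the quoted average rate, assuming the 28 PB were read roughly evenly over the whole of 2014 and using decimal petabytes:

```python
PB = 1e15                        # decimal petabyte in bytes
SECONDS_PER_YEAR = 365 * 24 * 3600

bytes_read = 28 * PB             # data read from ALICE::FZK::SE in 2014
rate_gb_per_s = bytes_read / SECONDS_PER_YEAR / 1e9
print(f"average reading rate: {rate_gb_per_s:.2f} GB/s")  # ~0.89 GB/s, i.e. roughly 1 GB/s
```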

XRootD architecture at GridKa
- 08/14: all 10 xrd servers of the disk SE upgraded to v4.0.4; flawless operation
  - duplicated name space problem solved
- ALICE::FZK::Tape
  - upgrade to v4.0.4 planned
  - plan to use the FRM (File Residency Manager) feature of xrootd
  - local scripts need to be modified/tested
    - staging, reading, writing: DONE
    - migrating, purging: tests still ongoing
  - file consistency check AliEn FC vs. FZK::Tape done
    - a number of inconsistencies were found; the list was handed over to ALICE Offline
- internal monitoring: an individual server monitoring page has been set up (see the sketch below)
  - number of active files, daemon status, internal reading/writing tests, network throughput, and internal/external data access are displayed
  - will be connected to the Icinga alarm system
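The slides do not show the actual GridKa monitoring scripts; as a rough illustration of the kind of per-server check that could feed such a page, here is a minimal Python sketch that probes the xrootd port and performs a small write/read test with xrdcp. The server names, port and test directory are placeholders, not the real GridKa setup.

```python
#!/usr/bin/env python
# Hedged sketch of a per-server health probe for an xrootd disk server.
import os
import socket
import subprocess
import tempfile

SERVERS = ["f01-xrd-01.example.gridka.de", "f01-xrd-02.example.gridka.de"]  # hypothetical
PORT = 1094                                                                 # default xrootd port
TEST_REMOTE_DIR = "/alice/monitoring"                                       # hypothetical test area

def daemon_reachable(host, port=PORT, timeout=5.0):
    """Check that the xrootd daemon accepts TCP connections."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def write_read_test(host):
    """Write a small file to the server with xrdcp and read it back."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"xrootd monitoring probe\n")
        local = f.name
    remote = f"root://{host}:{PORT}/{TEST_REMOTE_DIR}/probe_{host}.txt"
    try:
        subprocess.run(["xrdcp", "-f", local, remote], check=True)
        subprocess.run(["xrdcp", "-f", remote, local + ".back"], check=True)
        return True
    except subprocess.CalledProcessError:
        return False
    finally:
        os.unlink(local)
        if os.path.exists(local + ".back"):
            os.unlink(local + ".back")

if __name__ == "__main__":
    for host in SERVERS:
        up = daemon_reachable(host)
        rw = write_read_test(host) if up else False
        print(f"{host}: daemon={'up' if up else 'down'} read/write={'ok' if rw else 'failed'}")
```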

Various points of interest
● resources at GridKa:
  ● ALICE has requested (pledged number in WLCG REBUS) 4.01 PB of disk space in 2015
  ● GridKa will fulfil the pledges later this year
  ● the deadline (1st of April) will be missed
  ● WLCG has been informed
● IPv6
  ● GridKa-internal tests ongoing (currently with ATLAS services)
  ● GridKa can provide a testbed for ALICE services (see the sketch below), but due to lack of manpower not with high priority
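The slide only mentions that a testbed could be provided; as a possible first step for such IPv6 tests, here is a minimal sketch that checks whether a host publishes an IPv6 address and accepts TCP connections over it. The host name and port in the example are placeholders, not an existing GridKa endpoint.

```python
import socket

def ipv6_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """True if 'host' has an IPv6 address and accepts a TCP connection over it."""
    try:
        infos = socket.getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # no IPv6 address published for this host
    for family, socktype, proto, _, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(sockaddr)
                return True
        except OSError:
            continue
    return False

# Hypothetical example: probe an xrootd endpoint on the default port
print(ipv6_reachable("se.example.gridka.de", 1094))
```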

Table of contents: Overview, GridKa T1, GSI T2, Summary

GSI: a national research centre for heavy-ion research
FAIR: Facility for Antiproton and Ion Research, ~2018
- GSI computing today: ALICE T2/NAF, HADES; ~10500 cores, ~7.2 PB Lustre, ~9 PB archive capacity
- FAIR computing 2018: CBM, PANDA, NUSTAR, APPA, LQCD; … cores, 40 PB disk, 40 PB archive
- open source and community software, budget commodity hardware, support for different communities, scarce manpower
- view of the construction site: the GreenCube computing centre, currently being constructed

FAIR computing: T0/T1, MAN (Metropolitan Area Network) & Grid/Cloud (diagram; link bandwidths of the order of TB/s and 1 Tb/s)

GSI Grid Cluster – present status (diagram):
- vobox, GSI CE: Grid connections to CERN and GridKa, 10 Gbps
- Prometheus cluster: 380 nodes / 10500 cores, Infiniband, local batch jobs via GridEngine
- Icarus cluster: 100 nodes / 1600 cores (Grid)
- Hera cluster: 7.2 PB, ALICE::GSI::SE2, 120 Gbps
- LHCOne via Frankfurt; general Internet: 2 Gbps
- technical details later

GSI Darmstadt: 1/3 Grid, 2/3 NAF
- batch system: Grid Engine
- average: 800 T2 jobs, maximum: 1500 jobs
- about 1/3 Grid (red in the plot), 2/3 NAF (blue)
- intensive usage of the local GSI cluster through the ALICE NAF (ca. 30% of the cluster capacity)

GSI Storage Element
- ALICE::GSI::SE2 works well and is used intensively
- in order to fulfil new requirements the setup of the SE has been changed (technical details follow)
- a Lustre quota for the SE has been introduced, separating local and Grid usage (see the sketch below)
- in 2014 about 4 PB were written and 3 PB read (both about 0.5 PB more than in the Tsukuba report)
- remaining issue: ALICE SE tests fail once in a while (under investigation)
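The slide does not say how the quota is implemented; assuming a Lustre group quota is used for the separation, a minimal sketch for inspecting it could look as follows. The group name and mount point are placeholders, not the actual GSI configuration.

```python
#!/usr/bin/env python
# Hedged sketch: query the Lustre group quota that caps the space of the ALICE SE.
import subprocess

GROUP = "alice"             # hypothetical Lustre group owning the SE data
MOUNT = "/lustre/alice-se"  # hypothetical SE area on the Lustre cluster

def se_quota_report(group: str = GROUP, mount: str = MOUNT) -> str:
    """Return the raw 'lfs quota' report for the SE group.

    'lfs quota -g <group> <mountpoint>' is standard Lustre usage; the exact
    output format varies between Lustre versions, so the report is passed
    through unparsed instead of relying on specific columns.
    """
    result = subprocess.run(
        ["lfs", "quota", "-g", group, mount],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(se_quota_report())
```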

New setup of the GSI ALICE T2
● the old test cluster (Icarus) has been decommissioned
  – the cluster was a Grid-only cluster with almost no firewall protection
● the ALICE T2 needs to run in the GSI HPC environment (a completely firewall-protected Infiniband world)
  – it needs to fulfil the HPC requirements
    ● AliEn services will be tunnelled through the firewall
    ● data access needs to go through an xrootd proxy
    ● the redirector needs to distinguish between inside and outside access
  – advantage: parasitic use of local resources

ALICE T2 – new storage setup (diagram): Grid jobs run on the HPC cluster inside the Infiniband HPC network, behind the firewall; an xrootd proxy server handles data access to the outside world (LHCOne network); the xrootd redirector of the storage element distinguishes between external and internal requests.

GSI: next activities
● fix pending issues with the new ALICE T2 setup
  – a new set of AliEn code has been provided in order to be able to create xrootd proxy URLs when using remote SEs (illustrative sketch below); work in progress
● new manpower will be hired; the ALICE T2 will also profit from this
● plans for IPv6: first test environment by the end of 2015
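The actual AliEn changes are not shown here; the sketch below only illustrates the general idea of rewriting a remote SE URL so that the transfer passes through a local xrootd forwarding proxy. The proxy host name and the exact forwarding URL convention are assumptions that depend on how the proxy is configured.

```python
from urllib.parse import urlparse

PROXY = "alice-proxy.example.gsi.de:1094"  # hypothetical forwarding proxy at the site border

def via_proxy(remote_url: str, proxy: str = PROXY) -> str:
    """Rewrite a root:// URL of a remote SE into a URL routed via the local proxy.

    Uses the forwarding convention root://<proxy>//root://<origin>//<path>;
    whether this form applies depends on the local proxy configuration.
    """
    parsed = urlparse(remote_url)
    if parsed.scheme != "root":
        raise ValueError(f"not an xrootd URL: {remote_url}")
    return f"root://{proxy}//{remote_url}"

# Example: a file on a hypothetical remote ALICE SE
print(via_proxy("root://voalice.example.cern.ch:1094//alice/data/run/file.root"))
```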

Table of contents: Overview, GridKa T1, GSI T2, Summary

Summary
- German sites provide a valuable contribution to the ALICE Grid; thanks to the centres and to the local teams
- new developments are on the way
- pledges for 2015 can be fulfilled to a large extent
- FAIR will play an increasing role (funding, network architecture, software development and more ...)