Experiments and User Support

Experiments and User Support
D. Cesini, CNAF Review, May 2015

Outline
- Experiments activity at CNAF
  - Resource usage
  - Monitored availability for LHC
- User Support Team
  - Support activities
  - CNAF internal activities
  - Activities performed within the experiments
- Criticalities

Experiments Resource Usage

Experiments @CNAF
- CNAF is officially supporting 31 experiments: 4 LHC and 27 non-LHC.
- Ten Virtual Organizations use the centre opportunistically via Grid services.

Experiments per discipline (I)

Experiments per discipline (II)
- Accelerator: BaBar, Belle2, CDF, LHCf, KLOE, NA62
- Cosmic Ray: AMS-02, ARGO-YBJ, Auger, CTA, PAMELA, LHAASO, EEE
- Gamma Ray: Fermi/GLAST, MAGIC, AGATA
- Neutrino Physics: Borexino, GERDA, ICARUS, OPERA, CUORE, KM3NeT/NEMO, JUNO
- Dark Matter: XENON100, DarkSide-50
- Gravitational Waves: Virgo
- Bioinformatics: Biomed

Data center access
- Main access is via Grid services, for both storage and CPU.
- Some non-LHC experiments use local access.
- CPU is accessed in batch mode, but interactive access is becoming a common requirement, in particular for smaller VOs (a minimal sketch of both modes follows below).
- Small collaborations also request dedicated CPU for quick interactive analysis; some are also requesting interactive graphical access.
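As an illustration only, the sketch below contrasts the two CPU access modes mentioned above, assuming an LSF-like batch system on a login node; the queue name, script path and the wrapper itself are hypothetical and not CNAF's actual configuration.

```python
# Hedged sketch: batch vs interactive CPU access, assuming an LSF-like batch system.
import subprocess

def submit_batch(script_path: str, queue: str = "hypothetical_queue") -> None:
    """Submit a non-interactive batch job (the standard access mode)."""
    subprocess.run(["bsub", "-q", queue, "-o", "job_%J.out", script_path], check=True)

def open_interactive_session(queue: str = "hypothetical_queue") -> None:
    """Request an interactive slot, the mode increasingly asked for by small VOs."""
    # 'bsub -Is' asks the batch system for an interactive job attached to the terminal.
    subprocess.run(["bsub", "-q", queue, "-Is", "/bin/bash"], check=True)

if __name__ == "__main__":
    submit_batch("analysis_job.sh")  # hypothetical user script
```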

Usage Statistics (CPU-I)
[Chart: CPU usage 2014-2015, all VOs, in HS06 wall-clock time, monthly from Jan 2014 to Apr 2015]
- In March 2014 a large part of the overpledge was switched off.
- Part of the overpledge was kept online for the experiments' peak-testing activities.
- Overpledge is assigned dynamically via the batch-system fair share.
- Pledges for a given year are considered to start in April.

Usage Statistics (CPU-II)
[Chart: CPU usage 2014-2015, non-LHC VOs, in HS06 wall-clock time, monthly from Jan 2014 to Apr 2015]
- AMS-02 and Virgo are the main CPU users.
- CDF is phasing out; CTA activity has increased.
- Under-usage during the first months of 2014: burst activity is typical for these VOs.
- Pledges for a given year are considered to start in April.

Usage Statistics (DISK-I)
[Chart: disk usage 2014-2015, all VOs, in TB, monthly from Jan 2014 to Apr 2015]
- Pledges for a given year are considered to start in April.
- The under-pledge in 2015 is due to ALICE: +1.5 PB with respect to 2014.

Usage Statistics (DISK-II)
[Chart: disk usage 2014-2015, non-LHC VOs, in TB, monthly from Jan 2014 to Apr 2015]
- The under-pledge in 2014 was due to the DarkSide assignment: -370 TB.
- All non-LHC VOs are at pledge in 2015.
- AMS-02, Virgo and CDF are the main disk users.

Usage Statistics (TAPE)
[Chart: tape usage 2014-2015, all VOs, in TB]
- Pledges for a given year are considered to start in April.
- The increasing tape usage for the CDF Long Term Data Preservation activities is visible.

Experiments availability (I)
- Incident on a DDN storage system.

Experiments availability (II)
- 17-21/2: scheduled downtime to add CMS disk.

The User Support Team

User Support Team
It is the primary link between the users and the data center operations.
- On the users' side:
  - Helps and assists users in accessing the computing and storage resources at CNAF.
  - Collaborates with the users in creating their computing models and in adopting the most appropriate technologies for their needs.
  - Participates in experiment collaborations on specific tasks, mostly connected to computing.
- On the CNAF side:
  - Tracks user requests and takes care of communications with the experiments.
  - Collaborates in operating some of the Tier-1 components and services, those closest to the users.
  - Takes care of the documentation needed to access the center.

CNAF Run Coordinator
- Oversees the VOs' activities.
- Represents CNAF at the daily (now bi-weekly) WLCG calls.
- Reports on resource usage and problems to the Tier-1 management body (Comitato di Gestione, CdG).
- Edits the CdG monthly report.

People and experiments assignment
- 5 group members (post-docs):
  - 3 members, one per experiment, dedicated to ATLAS, CMS and LHCb.
  - 2 members dedicated to all the other experiments.
- 1 close external collaboration for ALICE.
- 1 group coordinator from the CNAF Tier-1 staff.

The sharing model
- Each group member is embedded in at least one experiment for 50% of his/her working time:
  - Creation of computing models in distributed and Cloud environments.
  - Development of code.
  - Operation and development of monitoring frameworks.
  - Porting experiment software to novel parallel architectures.
- The remaining 50% is spent according to the group mandate:
  - Day-by-day support activities.
  - CNAF internal activities.

Support activities
The group acts as the first level of support for the users:
- Initial incident analysis and escalation if needed.
- Provides information on how to access and use the data center.
- Takes care of communications between users and CNAF operations.
- Tracks middleware bugs if needed.
- Reproduces problematic situations: can create a proxy for all VOs or belong to local account groups (see the proxy sketch below).
- Provides consultancy to users on the creation of their computing models.
- Collects and tracks user requirements towards the data center.
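A minimal sketch of reproducing a user's problem under that user's VO membership, using the standard voms-proxy-init client; the VO names and proxy lifetime are illustrative, not the group's actual configuration.

```python
# Hedged sketch: create a VOMS proxy for a given VO before reproducing a reported issue.
import subprocess

VOS = ["atlas", "cms", "lhcb"]  # hypothetical subset of supported VOs

def make_proxy(vo: str, lifetime: str = "24:00") -> bool:
    """Create a VOMS proxy for the given VO; return True on success."""
    result = subprocess.run(
        ["voms-proxy-init", "--voms", vo, "--valid", lifetime],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        print(f"proxy creation failed for {vo}: {result.stderr.strip()}")
    return result.returncode == 0

if __name__ == "__main__":
    # Reproduce a problem with the same VO the user reported it under.
    make_proxy("atlas")
```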

Communication channels
- Channels in use: GGUS, mailing lists, direct mail, meetings, Skype, telephone, WhatsApp, smoke signals, ... plus the JIRA tracking system.
- [Diagram: communication flows between User Support, the experiments, the operations staff and the middleware developers, via GGUS, mailing lists and the JIRA tracking system.]
- Broadcasts go through GOCDB and cumulative mailing lists for non-LHC experiments.
- Many of the experiments have a dedicated CNAF mailing list.
- There is a zoo of communication channels to handle on the experiment side, but it is working fine; scalability could become an issue if the number of experiments increases significantly.
- Everything is tracked in the group's internal JIRA system (a tracking sketch follows below).
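A hedged sketch of how a user request could be recorded in an internal JIRA instance through its standard REST API; the base URL, project key and credentials are placeholders, not the group's actual setup.

```python
# Hedged sketch: open a JIRA issue for a user request via the JIRA REST API.
import requests

JIRA_URL = "https://jira.example.cnaf.infn.it"  # hypothetical instance
AUTH = ("support-bot", "secret")                # hypothetical credentials

def open_ticket(summary: str, description: str, project: str = "USERSUP") -> str:
    """Create a JIRA issue for a user request and return its key."""
    payload = {
        "fields": {
            "project": {"key": project},
            "summary": summary,
            "description": description,
            "issuetype": {"name": "Task"},
        }
    }
    resp = requests.post(f"{JIRA_URL}/rest/api/2/issue", json=payload, auth=AUTH)
    resp.raise_for_status()
    return resp.json()["key"]

if __name__ == "__main__":
    key = open_ticket("GGUS ticket follow-up", "Storage access failing for a VO")
    print("tracked as", key)
```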

CNAF internal activities
- StoRM Storage Element testing, including tests for deployment on VMs (a smoke-test sketch follows below).
- Recently involved in the development of the internal monitoring framework.
- Training and documentation:
  - Documentation, FAQ and knowledge base to be improved.
  - Plans to organize CNAF training events for users.
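A minimal smoke-test sketch of the kind that could be used when validating a StoRM Storage Element deployment (also on VMs), exercising its WebDAV interface; the endpoint, storage path and proxy location are hypothetical.

```python
# Hedged sketch: write/read/delete a small file through a StoRM WebDAV endpoint.
import requests

ENDPOINT = "https://storm-test.example.cnaf.infn.it:8443"  # hypothetical host
CERT = ("/tmp/x509up_u1000", "/tmp/x509up_u1000")          # proxy used as cert and key

def webdav_smoke_test(path: str = "/testvo/") -> bool:
    """Basic functional check: PUT, GET back, then DELETE a test file."""
    url = ENDPOINT + path + "us-smoke-test.txt"
    put = requests.put(url, data=b"hello", cert=CERT, verify=False)
    get = requests.get(url, cert=CERT, verify=False)
    delete = requests.delete(url, cert=CERT, verify=False)
    ok = put.ok and get.ok and get.content == b"hello" and delete.ok
    print("StoRM WebDAV smoke test:", "PASS" if ok else "FAIL")
    return ok

if __name__ == "__main__":
    webdav_smoke_test()
```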

Direct experiment collaboration (I)
- The approach of "sharing" group members with the experiments is also adopted for other CNAF units.
- It provides a very effective way of short-circuiting communications and debugging, with reduced time-to-solution for the detected issues.
- It results in higher involvement and motivation of CNAF personnel in the support activities.

Direct experiment collaboration (II)
Areas: application porting to computing accelerators and novel architectures, computing model creation, software development, virtualized infrastructure development and operations.
Examples:
- CMS virtual datacenter
- Extreme Energy Events project (EEE)
- KM3NeT TriDAS
- ATLAS SAM probe & monitoring
- LHCb event building on InfiniBand
- ATLAS track reconstruction
- X-ray tomography applications
- COSA project
- OPERA data management and Monte Carlo toolchain
- EEE data management and data store

Criticalities
- Heterogeneity in how the experiments use the resources: grid vs local, shared vs dedicated, POSIX vs SRM vs GridFTP vs XRootD vs WebDAV, batch vs interactive, etc. (see the protocol sketch after this list).
  - This is getting worse with the request for interactive graphical access from some experiments.
  - It is time consuming to acquire the competence needed to support all of them.
- Burst activity from non-LHC VOs to manage.
- A zoo of communication channels to handle.
- Documentation for end users to be improved.
- A closer integration of the User Support Team with the data center operations staff could be achieved, e.g. by actively collaborating in the operation of some services (UIs).
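An illustrative sketch of the protocol heterogeneity listed above: the same logical copy maps to a different client tool depending on how a given experiment accesses the storage. The tools named are real clients, but the host names and paths are hypothetical.

```python
# Hedged sketch: one logical file copy expressed with a different client per access protocol.
import subprocess

COPY_COMMANDS = {
    "posix":   ["cp", "/storage/testvo/file.dat", "/tmp/file.dat"],
    "gridftp": ["globus-url-copy",
                "gsiftp://storage.example.infn.it/testvo/file.dat",
                "file:///tmp/file.dat"],
    "xrootd":  ["xrdcp", "root://storage.example.infn.it//testvo/file.dat",
                "/tmp/file.dat"],
    "webdav":  ["davix-get", "https://storage.example.infn.it/testvo/file.dat",
                "/tmp/file.dat"],
    "srm":     ["gfal-copy", "srm://storage.example.infn.it/testvo/file.dat",
                "file:///tmp/file.dat"],
}

def copy_via(protocol: str) -> None:
    """Run the client command corresponding to the chosen access protocol."""
    subprocess.run(COPY_COMMANDS[protocol], check=True)

if __name__ == "__main__":
    copy_via("xrootd")
```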