Experiments and User Support

1 Experiments and User Support
D.Cesini CNAF Review, May 2015

2 Outline
- Experiments activity at CNAF
  - Resource Usage
  - Monitored Availability for LHC
- User Support Team
  - Support activities
  - CNAF internal activities
  - Activities performed within the experiments
- Criticalities
04/05/2015 CNAF Review 2015, D.Cesini

3 Experiments Resource Usage

4 Experiments @CNAF
- CNAF is officially supporting 31 experiments
  - 4 LHC
  - 27 non-LHC
- Ten Virtual Organizations in opportunistic usage via Grid services

5 Experiments per discipline (I)

6 Experiments per discipline (II)
- Accelerator: Babar, Belle2, CDF, LHCf, KLOE, NA62
- Cosmic Ray: AMS-02, ARGO-YBJ, Auger, CTA, PAMELA, LHAASO, EEE
- Gamma Ray: Fermi/GLAST, MAGIC, AGATA
- Neutrino Physics: Borexino, GERDA, ICARUS, OPERA, CUORE, KM3NeT/NEMO, JUNO
- Dark Matter: XENON100, DarkSide-50
- Gravitational Waves: Virgo
- Bioinformatics: Biomed

7 Data center access
- Main access is performed via Grid services
  - For both storage and CPU
- Some non-LHC experiments use local access
- CPU is accessed in batch mode, but interactive access is becoming a common requirement, in particular for smaller VOs
  - Small collaborations also request dedicated CPU for quick interactive analysis
  - Some of them are also requesting interactive graphical access

8 Usage Statistics (CPU-I)
[Plot: CPU usage, all VOs, in HS06 WCT, monthly from JAN14 to APR15]
- In March 2014 a large part of the overpledge was switched off
- Part of the overpledge was maintained online for experiments' peak testing activities
- Overpledge is assigned dynamically via the batch system fairshare
- Pledges of the year are considered to start in April
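
The dynamic assignment of overpledge can be pictured with a small sketch. It is only an illustration, assuming that spare capacity is split among the VOs with pending demand in proportion to their pledges. The function name, the proportional rule and all figures are hypothetical and do not describe the actual CNAF batch system configuration.

```python
def share_overpledge(spare_hs06, pledges, demand):
    """Split spare capacity among VOs in proportion to their pledges.

    spare_hs06: capacity (HS06) available beyond the pledges (hypothetical).
    pledges:    dict VO -> pledged HS06.
    demand:     dict VO -> extra HS06 the VO could use right now.
    A VO never receives more than it asks for; whatever it leaves unused
    is redistributed to the VOs that still have unmet demand.
    """
    grant = {vo: 0.0 for vo in pledges}
    active = {vo for vo in pledges if demand.get(vo, 0) > 0}
    remaining = float(spare_hs06)

    while active and remaining > 1e-9:
        total_pledge = sum(pledges[vo] for vo in active)
        if total_pledge <= 0:
            break
        saturated = set()
        distributed = 0.0
        for vo in active:
            # Offer each VO a slice proportional to its pledge.
            offer = remaining * pledges[vo] / total_pledge
            need = demand[vo] - grant[vo]
            given = min(offer, need)
            grant[vo] += given
            distributed += given
            if need <= offer:
                saturated.add(vo)
        remaining -= distributed
        if not saturated:
            # Every active VO absorbed its full slice: nothing left to share.
            break
        active -= saturated
    return grant


if __name__ == "__main__":
    # Hypothetical numbers: VO "B" has little demand, so "A" gets the rest.
    print(share_overpledge(200, {"A": 100, "B": 300}, {"A": 1000, "B": 30}))
```

In this toy model a VO with bursty demand can temporarily use capacity left idle by others, which is the behaviour the slide attributes to fairshare.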

9 Usage Statistics (CPU-II)
[Plot: CPU usage, non-LHC VOs, in HS06 WCT, monthly from JAN14 to APR15]
- AMS-02 and VIRGO are the main CPU users
- CDF phase out
- CTA activity increased
- Under-usage during the first months of 2014
- Typical burst activity for these VOs
- Pledges of the year are considered to start in April

10 Usage Statistics (DISK-I)
[Plot: disk usage, all VOs, in TB, monthly from JAN14 to APR15]
- Pledges of the year are considered to start in April
- Underpledge in 2015 is due to ALICE: +1.5PB with respect to 2015

11 Usage Statistics (DISK-II)
[Plot: disk usage, non-LHC VOs, in TB, monthly from JAN14 to APR15]
- Underpledge in 2014 due to DARKSIDE assignment: -370TB
- All non-LHC VOs at pledge in 2015
- AMS-02, VIRGO and CDF are the main disk users

12 Usage Statistics (TAPE)
[Plot: tape usage, all VOs, in TB]
- Pledges of the year are considered to start in April
- The increasing tape usage for the CDF Long Term Data Preservation activities is visible

13 Experiments availability (I)
Incident on a DDN system

14 Experiments availability (II)
17-21/2: Scheduled downtime to add CMS disk

15 The User Support Team

16 User Support Team
- It is the primary link between the users and the data center operations
- On the users' side:
  - Helps and assists users in accessing the computing and storage resources at CNAF
  - Collaborates with the users in creating their computing models and in adopting the most appropriate technologies for their needs
  - Participates in experiment collaborations on specific tasks
    - Mostly connected to computing
- On the CNAF side:
  - Tracks user requests and takes care of communications with the experiments
  - Collaborates in operating some of the Tier1 components and services, those closer to the users
  - Takes care of the documentation needed to access the center

17 CNAF Run Coordinator
- Oversees the VOs' activities
- Represents CNAF at the daily (now bi-weekly) WLCG calls
- Reports on resource usage and problems to the Tier1 management body (Comitato di Gestione, CdG)
- Edits the CdG monthly report

18 People and experiments assignment
- 5 group members (post-docs)
  - 3 group members, one per experiment, dedicated to ATLAS, CMS, LHCb
  - 2 group members dedicated to all the other experiments
- 1 close external collaboration for ALICE
- 1 group coordinator from the CNAF Tier1 staff

19 The sharing model
- Each group member is embedded for 50% of his/her working time in at least one experiment
  - Creation of computing models in distributed and Cloud environments
  - Development of code
  - Operation and development of monitoring frameworks
  - Porting experiment software to novel parallel architectures
- The remaining 50% is spent according to the group mandate
  - Day-by-day support activities
  - CNAF internal activities

20 Support activities
- The group acts as a first level of support for the users
  - Initial incident analysis and escalation if needed
  - Provides information to access and use the data center
  - Takes care of communications between users and CNAF operations
  - Tracks middleware bugs if needed
  - Reproduces problematic situations
    - Can create proxies for all VOs or belong to local account groups
- Provides consultancy to users for computing models creation
- Collects and tracks user requirements towards the datacenter

21 Communication channels
[Diagram: communication flows between User Support and the Experiments, the Operation Staff and Middleware Dev., through GGUS, mailing lists, direct mail, meetings, Skype, telephone, Whatsapp, smoke signals …… and the JIRA tracking system]
- Broadcast through GOCDB and cumulative mailing lists for non-LHCB experiments
- Many of the experiments have a CNAF dedicated mailing list
- There is a communication channels zoo to handle on the experiment side, but it is working fine
- Scalability could be an issue if the number of experiments increases significantly
- Everything is tracked in the group's internal JIRA system

22 CNAF internal activities
- StoRM Storage Element testing
  - Including tests for deployment on VMs
- Recently involved in the development of the internal monitoring framework
- Training and documentation
  - Documentation, FAQ, Knowledge base to be improved
  - Plans to organize CNAF training events for users

23 Direct experiment collaboration (I)
- The approach of "sharing" the group members with the experiments is also adopted for other CNAF units
- Provides a very effective way of short-circuiting communications and debugging
  - Reduced time-to-solution for the detected issues
- Results in higher involvement and motivation in the support activities for CNAF personnel

24 Direct experiment collaboration (II)
Activity areas:
- Application porting to computing accelerators and novel architectures
- Computing model creation
- Software Development
- Virtualized Infrastructure dev and ops
Projects:
- CMS virtual Datacenter
- Extreme Energy Events project (EEE)
- KM3-Net TriDAS
- ATLAS SAM probe & Monitoring
- LHCB event building on Infiniband
- ATLAS track reconstruction
- X-Ray Tomography applications
- COSA project
- OPERA data management and Montecarlo toolchain
- EEE Data management and data store

25 Criticalities
- Heterogeneity in how the experiments use the resources
  - grid vs local, shared vs dedicated, posix vs srm vs gridftp vs xrootd vs webdav, batch vs interactive, etc.
  - This is getting worse with the request for interactive graphical access from some experiments
  - Time consuming to acquire competence and support
- Burst activity from non-LHC VOs to manage
- Communication channels zoo to handle
- Documentation for end-users to be improved
- A closer integration of the User Support Team with the datacenter operations staff could be achieved
  - i.e. collaborating actively in the operations of some services (UIs)
