The Common Solutions Strategy of the Experiment Support Group
Experiment Support, CERN IT Department, CH-1211 Geneva 23, Switzerland (www.cern.ch/it)
The Common Solutions Strategy of the Experiment Support Group at CERN for the LHC Experiments
Maria Girone, CERN, on behalf of the CERN IT-ES group
CHEP, New York City, May 2012

Motivation
Despite their differences as experiments at the LHC, from a computing perspective many of their workflows are similar and can be served by common services.
While the collaborations are huge and highly distributed, the effort available for ICT development is limited and decreasing
– Effort is focused on analysis and physics
Common solutions are a more efficient use of effort and more sustainable in the long run

Anatomy of a Common Solution
Most common solutions can be diagrammed as an interface layer between common infrastructure elements and the truly experiment-specific components
– One of the successes of the grid deployment has been the use of common grid interfaces and local site service interfaces
– The experiments have environments and techniques that are unique
– Common solutions target the box in between: a lot of effort is spent in these layers, and there are big savings of effort in commonality, not necessarily in implementation, but in approach and architecture
– The LHC schedule presents a good opportunity for technology changes
[Diagram: higher-level services that translate between experiment-specific elements and common infrastructure components and interfaces]
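The layering described above can be sketched in code. This is a minimal illustration only, not IT-ES software: the class and method names are invented, and the experiment-specific naming rules are simplified stand-ins. The point is that the common layer holds the shared logic while each experiment supplies a small plug-in.

```python
# Illustrative sketch of the "box in between": a common service that
# delegates only the experiment-specific knowledge to plug-ins.
# All names here are hypothetical, not actual IT-ES code.

class ExperimentPlugin:
    """Experiment-specific layer: each experiment supplies its own mapping."""
    def site_name(self, grid_site):
        raise NotImplementedError

class AtlasPlugin(ExperimentPlugin):
    def site_name(self, grid_site):
        # Invented convention: ATLAS view of the grid site name
        return "ATLAS-" + grid_site

class CmsPlugin(ExperimentPlugin):
    def site_name(self, grid_site):
        # Invented convention: CMS-style site naming
        return "T2_" + grid_site.replace("-", "_")

class CommonMonitor:
    """Common layer: identical logic for every experiment."""
    def __init__(self, plugin):
        self.plugin = plugin

    def report(self, grid_site, jobs_ok, jobs_total):
        # The shared code computes the metric; only naming is delegated.
        return {"site": self.plugin.site_name(grid_site),
                "efficiency": jobs_ok / jobs_total}

atlas_view = CommonMonitor(AtlasPlugin()).report("CERN-PROD", 90, 100)
cms_view = CommonMonitor(CmsPlugin()).report("CERN-PROD", 90, 100)
```

Swapping the plug-in changes only the experiment-facing translation; the architecture and implementation of the common layer stay shared.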

The Group
IT-ES is a unique resource in WLCG
– The group is currently supported with substantial EGI-InSPIRE project effort
– Careful balance of effort embedded in the experiments and on common solutions
– Development of expertise in experiment systems and across experiment boundaries
– People uniquely qualified to identify and implement common solutions
This matches well with the EGI-InSPIRE mandate of developing sustainable solutions. A strong and enthusiastic team.

Activities
Monitoring and Experiment Dashboards
– Allow experiments and sites to monitor and track their production and analysis activities across the grid, including services for data popularity, data cleaning, data integrity and site stress testing
Distributed Production and Analysis
– Design and development of experiment workload management and analysis components
Data Management support
– Covers development and integration of experiment-specific and shared grid middleware
The LCG Persistency Framework
– Handles the event and detector conditions data from the experiments

Examples: Data Popularity
Experiments want to know which datasets are used, how much, and by whom
– A good candidate for a common solution
Data popularity exploits the fact that all experiments open files and access storage; the monitoring information can be collected in a common way using generic plug-ins. The experiments provide systems that identify how those files map onto logical objects such as datasets, reprocessing and simulation campaigns.
[Diagram: file opens and reads; files accessed, users and CPU used; experiment booking systems mapping files to datasets]
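The aggregation step can be illustrated with a short sketch. This is not the actual popularity service: the record format and the file-to-dataset mapping below are invented; the mapping is exactly the experiment-specific input the slide describes, while the counting is the common part.

```python
# Illustrative sketch: aggregate raw file-open records into per-dataset
# popularity, given an experiment-supplied file-to-dataset mapping.
from collections import Counter

def dataset_popularity(file_opens, file_to_dataset):
    """file_opens: list of (filename, user) records.
    Returns per-dataset access counts and unique-user counts."""
    accesses = Counter()
    users = {}
    for fname, user in file_opens:
        ds = file_to_dataset.get(fname, "unknown")
        accesses[ds] += 1
        users.setdefault(ds, set()).add(user)
    return {ds: {"accesses": n, "users": len(users[ds])}
            for ds, n in accesses.items()}

# Hypothetical input data
opens = [("f1.root", "alice"), ("f1.root", "bob"), ("f2.root", "alice")]
mapping = {"f1.root": "/data/RunA", "f2.root": "/data/RunA"}
pop = dataset_popularity(opens, mapping)
```

The common plug-ins would produce the `file_opens` stream from storage monitoring; only `mapping` comes from the experiment booking systems.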

Popularity Service
Used by the experiments to assess the importance of computing processing work, and to decide when the number of replicas of a sample needs to be adjusted up or down.
See D. Giordano et al., [176], Implementing data placement strategies for the CMS experiment based on a popularity model
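A replica-adjustment decision of the kind described here might look like the following sketch. The thresholds and limits are invented for illustration; the real placement model (see the Giordano et al. reference) is more sophisticated.

```python
# Hypothetical replica policy, not the actual CMS placement model:
# popular samples gain a replica, unused ones shrink toward a minimum.

def suggest_replicas(accesses_last_period, current_replicas,
                     hot_threshold=1000, min_replicas=1, max_replicas=5):
    """Return the suggested replica count for a sample."""
    if accesses_last_period >= hot_threshold:
        # Heavily used: add a replica, up to the cap
        return min(current_replicas + 1, max_replicas)
    if accesses_last_period == 0:
        # Unused: release a replica, but keep a custodial minimum
        return max(current_replicas - 1, min_replicas)
    return current_replicas
```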

Cleaning Service
The Site Cleaning Agent suggests obsolete or unused data that can be safely deleted without affecting analysis. The information about space usage is taken from the experiment's dedicated data management and transfer system.
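A minimal sketch of such a suggestion step, under assumptions: the replica record format and the target-occupancy policy below are invented, and the real agent combines this with popularity information rather than last-access time alone.

```python
# Illustrative only: propose the least-recently-accessed datasets for
# deletion until site usage would drop to a target occupancy.

def deletion_candidates(replicas, site_capacity_tb, target_fraction=0.8):
    """replicas: list of (dataset, size_tb, last_access_day).
    Returns dataset names to suggest for deletion, oldest-idle first."""
    used = sum(size for _, size, _ in replicas)
    target = site_capacity_tb * target_fraction
    candidates = []
    for name, size, _ in sorted(replicas, key=lambda r: r[2]):
        if used <= target:
            break
        candidates.append(name)
        used -= size
    return candidates

# Hypothetical site: 100 TB capacity, 120 TB used
site = [("dsA", 30, 10), ("dsB", 40, 300), ("dsC", 50, 150)]
suggested = deletion_candidates(site, 100)
```

Note the agent only *suggests*; the decision to delete stays with the experiment's data management operators.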

Dashboard Framework and Applications
Dashboard is one of the original common services
– All experiments execute jobs and transfer data
– Dashboard services rely on experiment-specific information for site names, activity mapping and error codes
– The job monitoring system centrally collects information from workflows about job status and success
The database, framework and visualization are common.
See D. Tuckett et al., [300], Designing and developing portable large-scale JavaScript web applications within the Experiment Dashboard framework
[Diagram: common framework and visualization on top of experiment-specific sites, activities, job submission and data transfers]
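The collection step can be sketched as a normalization from experiment-specific job states into a common vocabulary. All state names and the record format here are invented examples, not the Dashboard's actual schema.

```python
# Illustrative sketch: normalize experiment-specific job states into a
# common vocabulary so one framework can report for all experiments.
# The vocabulary below is hypothetical, not the real Dashboard schema.

COMMON_STATES = {"finished": "success", "done": "success",
                 "failed": "failure", "aborted": "failure",
                 "running": "running", "pending": "pending"}

def normalize(job_report):
    """Map a raw experiment job record onto the common schema."""
    return {"site": job_report["site"],
            "state": COMMON_STATES.get(job_report["state"].lower(), "unknown")}

def success_rate(reports):
    """Fraction of terminal jobs that succeeded, or None if none finished."""
    done = [r for r in map(normalize, reports)
            if r["state"] in ("success", "failure")]
    ok = sum(1 for r in done if r["state"] == "success")
    return ok / len(done) if done else None

reports = [{"site": "T2_IT_Rome", "state": "Finished"},
           {"site": "T2_IT_Rome", "state": "Aborted"},
           {"site": "T2_IT_Rome", "state": "Running"}]
rate = success_rate(reports)
```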

Site Status Board
Another example of a good common service
– Takes specific lower-level checks on the health of common services
– Combines them with experiment-specific workflow probes
– Includes links into the ticketing system
– Combines everything into a common view
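The combination of probe results into one site verdict can be sketched as follows; the probe names and the three-level verdict are assumptions for illustration, not the board's real metric definitions.

```python
# Sketch: fold low-level service checks and experiment workflow probes
# into a single per-site status. Probe names below are invented.

def combine(probes):
    """probes: dict of probe name -> True (ok), False (failed), None (no data)."""
    results = [v for v in probes.values() if v is not None]
    if not results:
        return "unknown"
    if all(results):
        return "ok"
    if any(results):
        return "degraded"
    return "down"

status = combine({"srm_check": True, "ce_check": True, "analysis_probe": False})
```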

HammerCloud
HammerCloud is a common testing framework for ATLAS (PanDA), CMS (CRAB) and LHCb (DIRAC)
– A common layer for functional testing of CEs and SEs from a user perspective
– Continuous testing and monitoring of site status and readiness
– Automatic site exclusion based on defined policies
Same development, same interface, same infrastructure: less workforce needed
[Diagram: a common testing and monitoring framework on top of the distributed analysis frameworks and the computing and storage elements]
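An exclusion policy of the kind HammerCloud automates could look like this sketch. The window size and efficiency threshold are invented, not HammerCloud's actual configuration.

```python
# Hypothetical auto-exclusion policy in the spirit of HammerCloud's
# site exclusion; thresholds here are assumptions, not the real settings.

def evaluate_site(test_results, min_tests=10, min_efficiency=0.7):
    """test_results: recent functional-test outcomes (True = passed).
    Return "online" or "excluded" for the distributed analysis system."""
    if len(test_results) < min_tests:
        return "online"  # not enough data to act on
    efficiency = sum(test_results) / len(test_results)
    return "online" if efficiency >= min_efficiency else "excluded"
```

Because the policy lives in the common layer, the same logic protects users of PanDA, CRAB and DIRAC alike.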

HammerCloud
See D. van der Ster et al., [283], Experience in Grid Site Testing for ATLAS, CMS and LHCb with HammerCloud

New Activities: Analysis Workflow
Up to now, services have generally focused on monitoring activities
– All of these are important, and commonality saves effort
– But they are not normally in the core workflows of the experiments
Success with these self-contained services has provided the confidence to move into core functionality
– Looking at the analysis workflow: a feasibility study for a common analysis framework between ATLAS and CMS
[Diagram: job tracking, resubmission and scheduling; data discovery, environment configuration and job splitting; job submission and pilots]

Analysis Workflow Progress
Looking at ways to make the workflow engine common between the two experiments
– Improving the sustainability of the central components that interface to low-level services: a thick layer that handles prioritization, job tracking and resubmission
– Maintaining experiment-specific interfaces: job splitting, environment configuration and data discovery would continue to be experiment specific
[Diagram: a common layer for job tracking, resubmission and scheduling, between experiment-specific data discovery, job splitting and packaging of the user environment, and common job submission and pilots]
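The "thick layer" responsibilities (tracking and transparent resubmission) can be sketched minimally. The class, states and retry limit below are assumptions for illustration, not the framework's design.

```python
# Minimal sketch of a common tracking/resubmission layer; states and the
# retry limit are invented, not the actual framework's choices.

class TaskTracker:
    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self.jobs = {}  # job id -> {"state": ..., "retries": ...}

    def submit(self, job_id):
        self.jobs[job_id] = {"state": "submitted", "retries": 0}

    def update(self, job_id, state):
        """Apply a status update; failed jobs are resubmitted transparently
        until the retry budget is exhausted."""
        job = self.jobs[job_id]
        if state == "failed" and job["retries"] < self.max_retries:
            job["retries"] += 1
            job["state"] = "submitted"  # resubmit without user intervention
        else:
            job["state"] = state

tracker = TaskTracker()
tracker.submit("job-1")
tracker.update("job-1", "failed")    # resubmitted by the common layer
tracker.update("job-1", "finished")
```

Experiment-specific job splitting and data discovery would sit above this layer and never need to know about the retries.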

Proof of Concept
The feasibility study found no show-stoppers to designing a common analysis framework. The next step is a proof of concept.
[Diagram: proof-of-concept architecture]

Even Further Ahead
As we move forward, we would also like to assess and document the process
– This should not be the only common project
The diagram for data management would look similar: a thick layer between the experiments' logical definitions of datasets and the service that moves files. It deals with persistent location information, tracks files in progress and validates file consistency. There are currently no plans for common services here, but the problem has the right properties.
[Diagram: a common layer managing file locations and files in transfer, between experiment-specific dataset-to-file mapping and the File Transfer Service (FTS)]
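The responsibilities named here (location tracking, in-flight transfers, consistency validation) can be sketched together. This is an illustration only; the class, the site names and the checksum handling are invented, and a real layer would sit on top of FTS rather than replace it.

```python
# Illustrative sketch of a data-placement "thick layer": record replica
# locations, track files in transfer, and validate consistency by
# comparing checksums before registering the new copy.

class ReplicaCatalog:
    def __init__(self):
        self.locations = {}    # filename -> set of sites holding a replica
        self.in_transfer = {}  # filename -> destination site

    def start_transfer(self, fname, dest):
        self.in_transfer[fname] = dest

    def finish_transfer(self, fname, source_checksum, dest_checksum):
        """Register the replica only if the copy is consistent."""
        dest = self.in_transfer.pop(fname)
        if source_checksum != dest_checksum:
            return False  # corrupt copy: do not register it
        self.locations.setdefault(fname, set()).add(dest)
        return True

cat = ReplicaCatalog()
cat.start_transfer("f1.root", "T1_DE_KIT")
ok = cat.finish_transfer("f1.root", "ad0be", "ad0be")
```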

Outlook
IT-ES has a good record of identifying and developing common solutions between the LHC experiments
– The setup and expertise of the group have helped
Several services, focused primarily on monitoring, have been developed and are in production use. As a result, more ambitious services closer to the experiments' core workflows are under investigation; the first is a feasibility study and proof of concept of a common analysis framework between ATLAS and CMS. Better and more sustainable solutions could result, with lower operational and maintenance costs.