U.S. ATLAS S&C Planning Meeting - June 2015
ATLAS Software Infrastructure: Requirements and Goals at Run 2 Period
Alex Undrus

Alex Undrus – U.S. ATLAS S&C Planning Meeting – June 2015

Outline
- Current status
- Plans for the next year
- Long-term perspectives

This presentation reports on the following WBS items:
- Software Validation
- Librarian and Infrastructure Services

Current Status
- Some statistical data
- Trends
- U.S. contribution to ATLAS infrastructure

Number of files in ATLAS offline release
Calculated by cloc-1.62 for the ATLAS nightly development release created on June 15, 2015 (cmt, InstallArea, NICOS, platform-specific, genConf, and dict areas excluded). External packages are not included.

Number of lines in ATLAS offline release
Calculated by cloc-1.62 for the ATLAS nightly development release created on June 15, 2015 (cmt, InstallArea, NICOS, platform-specific, genConf, and dict areas excluded; comments and blank lines excluded). External packages are not included.
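The counting rule in the caption above (skip listed areas, ignore comments and blank lines) can be illustrated with a minimal Python sketch. This is not how cloc works internally: it recognises only whole-line comments with a single prefix, and it assumes the excluded "areas" are plain directory names. The function names are hypothetical, and the platform-specific areas are left out because the slide does not name them.

```python
import os

# Directory names excluded from the count, taken from the slide caption.
# Assumption: each excluded "area" is a directory with exactly this name.
EXCLUDED_DIRS = {"cmt", "InstallArea", "NICOS", "genConf", "dict"}


def is_excluded(path):
    """Return True if any component of the path is an excluded directory."""
    parts = path.replace("\\", "/").split("/")
    return any(part in EXCLUDED_DIRS for part in parts)


def count_code_lines(lines, comment_prefix="#"):
    """Count lines that are neither blank nor whole-line comments.

    Simplification: only single-line comments starting with comment_prefix
    are recognised; block comments are not handled.
    """
    count = 0
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


def count_tree(root, suffixes=(".py",), comment_prefix="#"):
    """Walk a source tree and sum code lines over files with given suffixes."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not enter them.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            if name.endswith(suffixes):
                full_path = os.path.join(dirpath, name)
                with open(full_path, errors="replace") as handle:
                    total += count_code_lines(handle, comment_prefix)
    return total
```

For C++ sources one would call `count_tree(root, suffixes=(".cxx", ".h"), comment_prefix="//")`, keeping in mind that block comments would then be counted as code.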

Number of submitters and commits to offline SVN repository

New and persistent submitters to offline SVN repository (period of 05/16 – 06/15, 2015)
- number of unique submitters during 1 month (05/16/15 – 06/15/15)
- number of unique submitters during 1 year (06/16/14 – 06/15/15)

ATLAS Nightly System at a glance
[Figure; surviving labels: "59 in total", "NEW"]

ATLAS Nightly System at a glance (2)
Number of ATLAS nightly jobs. A record high of 100 daily jobs was registered on 08/01/2014. As of 06/22/2015 the Nightly System runs 83 daily jobs.

Statistics Interpretation
- Athena development releases contain 7 million lines
- The number of lines has not grown over the last 5 years, but the lines keep changing
- Software development activity has always been high and has been increasing since the start of Run 2
  - 70% increase in SVN commits
  - 50% increase in the developer community
  - Hundreds of new developers joined
- Many new nightly branches have been opened (the total number reached a record high of 67)
- Unprecedented rate of new stable releases: 1.4 per day (~250 stable releases, Jan. 1, 2015 – June 24, 2015)
- New kinds of releases for targeted use
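The quoted rate of 1.4 stable releases per day follows from the two figures on the slide. A short check in Python, assuming the period is counted inclusively in calendar days and taking the approximate count of 250 at face value:

```python
from datetime import date

# Figures from the slide; the release count is approximate (~250).
n_releases = 250
start = date(2015, 1, 1)
end = date(2015, 6, 24)

# Assumption: both the first and the last day belong to the period.
period_days = (end - start).days + 1

rate = n_releases / period_days
print(round(rate, 1))  # prints 1.4
```

Counting the period exclusively (174 days) gives the same value after rounding.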

Intensity of Run 2 Software Development ... is high
- It is considerably higher than during the LS1 period:
  - Some important LS1 software projects continue: CMAKE, ROOT 6 integration
  - New kinds of releases for targeted use
    - Athena Simulation
    - Athena Analysis
    - AtlasP1Mon (for Tier 1 online monitoring)
  - RootCore-based releases are actively developed
  - New compilers are being evaluated (gcc 4.9, clang)
  - Release configuration management is undergoing a major change (cmake)

Infrastructure Challenges and U.S. Contribution
- All areas (code configuration, documentation, externals, platforms and compilers, QA/QC, release building and distribution, code repositories, information protection, etc.) must cope with increased workloads while the workforce remains the same
- Infrastructure needs both support and development (tools must be updated in sync with the software they support)
- As of 04/02/2015, the ATLAS Software Infrastructure Team (SIT) includes 19 people contributing 7.5 FTE, a shortage of about 50%: an additional 3.7 FTE are needed for adequate support
- The U.S. ATLAS contribution to SIT is currently 1 FTE as recorded in OTP (Alex Undrus and Shuwei Ye)
- The U.S. contributes to critical areas: the Nightly System (Alex Undrus), environment setup (Shuwei Ye), LXR service

Rationale of U.S. Participation in Infrastructure Projects
- Expertise gain
- Influence on ATLAS-wide policies and decisions
- Parallel and effective user support for U.S.-based physicists
- Librarian and User services at the U.S. Analysis Center, BNL PROOF farm and Tier 1 Center
- Capture innovations and new ideas

Plans for Next Year
- Permanent goal: create a supportive environment for code development, data processing and analysis jobs across all ATLAS sites and file systems (local, afs, cvmfs)
- Key areas:
  - Nightly builds (details in the next slides)
  - Build and run-time (details in Shuwei's presentation)
- Criteria of success: user satisfaction and absence of complaints
- Explore innovative technologies and tools

Plans for the Nightly System
- Further improve the ATLAS Nightly web and database services introduced by the successfully completed Nightly System LS1 upgrade
- Add new on-demand functionality to the System
  - The current system is designed for daily builds at fixed times
  - Software coordinators increasingly request urgent nightly builds
  - Up to 100 nightly jobs are manually restarted monthly
  - Some branches need irregular builds separated by a few days
  - The new on-demand functionality will be demonstrated on July 9 at the Annual Nightlies Workshop

New Admin Panel
- Administrative functions for privileged users, authentication via CERN SSO
- Main task buttons (restart, cancel, etc.)
- Current and detailed progress information

Plans for the Nightly System (2)
- Complete projects according to the schedule
  - New Nightlies CVMFS server
  - Hot spare for the Nightlies CVMFS server (greatly improves system reliability)
  - ATLAS Nightly Mail Facility (personalized e-mails about nightlies results)
- Adapt the System for cmake builds
  - Experimental cmake nightlies support compilation, no tests yet
- Assess recent requests
  - Expand doxygen documentation builds
  - git repository support
- Continue to encourage users to use the ATLAS Nightlies DB for customized views (successful experience with Trigger developers); example on the next slide

New panel by Yasu Okumura summarizing Trigger-related problems across major nightly branches

Plans for the Nightly System (technical work)
- Update NICOS for the new Tag Collector 3
- Builds on CC7 (CERN CentOS 7)
- Optimization and testing of new machines on the nightly farm
  - CERN IT plans to replace all real hardware machines with VMs
  - VMs perform differently and need a lot of testing and optimization
  - Key problem: an I/O bottleneck prevents full CPU usage
  - New VMs with SSD disks reach 60% CPU utilization (a success, given that some VM types could not be loaded above 20%)

Long Term Perspectives
- Rising concerns:
  - Unclear relationship between releases
  - Confusion about where essential parts of the software are located (simulation, digitization, reconstruction, derivations)
  - Dissatisfaction with standard release coordination bureaucracy and with the ways new algorithms and techniques are developed
  - Single platform support, essentially no software portability
  - All-inclusive offline releases (installation size 12 GB, with a few tens of externals)

Long Term Perspectives (2)
- Increasing requests for smaller releases with targeted purpose (Simulation, Athena Analysis)
- Ideas about software restructuring
[Diagram; surviving block labels: Core, Event, Reco, Analysis, Core, Simulation, HLT, Analysis 2, Analysis 1, Reco, ???]

Long Term Targets
- Improved software structure
- Compact releases with targeted purpose
- Expanded multiplatform support
- Decrease of centralized tag validation bureaucracy and effort
- Discussion on July 2 at 13: :00 – "Release build: technical session"

Perspectives for the Nightly System
- Expansion to new platforms (e.g. PowerPC)
  - 90% of the Nightly System is already portable
- Moving some operations to the GRID
  - Making and testing releases on the sites where jobs run would bring better results and save human and machine effort
  - Nightly testing is already available on the GRID but has attracted few tests so far
- Desirable: developing commonality with the CMS (CMSSW) and LCG (Jenkins) nightly builds
- Keeping users and management happy with the System is always a priority

Summary
- The U.S. contributes to key areas of ATLAS software infrastructure
- Despite thin manpower, the ATLAS Nightly System and Environment Setup procedures successfully sustain increasing load and demand
- Keeping abreast of new technology trends is at the top of the U.S. contributors' priorities
- The Nightly System LS1 upgrade brought new database and web technologies and greatly improved the user experience
- The Nightly System is proactively prepared to meet the demands of new platforms (PowerPC), build tools (cmake) and software restructuring