U.S. ATLAS Computing Facilities
Bruce G. Gibbard, Brookhaven National Laboratory
Mid-year Review of U.S. LHC Software and Computing Projects
NSF Headquarters, Arlington, Virginia
July 8, 2003

Mission of US ATLAS Computing Facilities

- Supply capacities to the ATLAS Distributed Virtual Offline Computing Center
  - At levels agreed to in a computing resource MoU (yet to be written)
- Guarantee the computing required for effective participation by U.S. physicists in the ATLAS physics program
  - Direct access to and analysis of physics data sets
  - Simulation, re-reconstruction, and reorganization of data as required to support such analyses

ATLAS Facilities Model

- ATLAS computing will employ the ATLAS Virtual Offline Computing Facility to process and analyze its data
- A "cloud"-mediated set of resources including:
  - CERN Tier 0
  - All regional facilities (Tier 1's), typically ~200 users each
  - Some national facilities (Tier 2's)
- All members of the ATLAS Virtual Organization (VO) must contribute in funds or in kind (personnel, equipment), proportional to author count (see the sketch below)
- All members of the ATLAS VO will have defined access rights
- Typically only a subset of the resources at a regional or national center is integrated into the Virtual Facility
  - The non-integrated portion, over which regional control is retained, is expected to be used to augment resources supporting analyses of regional interest
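
The proportional-contribution rule above is simple arithmetic; the following is a minimal Python sketch of it, in which the author counts and the total capacity target are invented placeholders, not ATLAS figures.

    # Illustration of contribution proportional to author count.
    # All numbers below are invented placeholders, not ATLAS figures.

    def fair_share(authors_by_member, total_capacity_ksi2k):
        """Split a total capacity target across VO members in proportion to author count."""
        total_authors = sum(authors_by_member.values())
        return {member: total_capacity_ksi2k * count / total_authors
                for member, count in authors_by_member.items()}

    if __name__ == "__main__":
        shares = fair_share({"US": 400, "Member A": 350, "Member B": 250}, 1000.0)
        for member, share in shares.items():
            print(f"{member}: {share:.0f} kSI2k")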

Analysis Model: All ESD Resident on Disk

- Enables ~24-hour selection/regeneration passes (versus ~a month if tape stored): faster, better tuned, more consistent selection (see the sketch below)
- Allows navigation for individual events (to all processed, though not raw, data) without recourse to tape and the associated delay: faster, more detailed analysis of larger, consistently selected data sets
- Avoids contention between analyses over ESD disk space and the need to develop complex algorithms to optimize management of that space: better results with less effort
- Complete set on disk at the US Tier 1; cost impact discussed later
- Reduced sensitivity to the performance of multiple Tier 1's, the intervening (transatlantic) network, and middleware: improved system reliability, availability, robustness, and performance; cost impact discussed later
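
To make the ~24-hour versus ~month comparison concrete, here is a rough back-of-envelope sketch; the ESD volume and the aggregate disk/tape read rates are assumptions chosen only to illustrate the scale of the effect, not measured facility figures.

    # Rough illustration of why disk-resident ESD enables fast selection passes.
    # The dataset size and throughput numbers below are assumed for illustration.
    TB = 1e12  # bytes

    esd_volume_bytes = 100 * TB     # assumed ESD sample size
    disk_read_rate   = 1e9          # assumed aggregate farm read rate, ~1 GB/s
    tape_read_rate   = 60e6         # assumed aggregate tape read rate, ~60 MB/s

    def pass_time_days(volume_bytes, rate_bytes_per_s):
        """Time to read the full sample once at the given aggregate rate."""
        return volume_bytes / rate_bytes_per_s / 86400.0

    print(f"disk pass: {pass_time_days(esd_volume_bytes, disk_read_rate):.1f} days")
    print(f"tape pass: {pass_time_days(esd_volume_bytes, tape_read_rate):.1f} days")

With these assumed rates the disk-resident pass completes in about a day while the tape-resident pass takes a few weeks, consistent with the ~24-hour versus ~month comparison above.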

US ATLAS Facilities

- A coordinated grid of distributed resources including:
- Tier 1 Facility at Brookhaven – Rich Baker / Bruce Gibbard
  - Currently operational at ~1% of required 2008 capacity
- 5 permanent Tier 2 facilities – Saul Youssef
  - Scheduled for selection beginning in 2004
  - Currently there are 2 prototype Tier 2's:
    - Indiana U – Fred Luehring / University of Chicago – Rob Gardner
    - Boston U – Saul Youssef
- 7 currently active Tier 3 (institutional) facilities
- WAN coordination activity – Shawn McKee
- Program of Grid R&D activities – Rob Gardner
  - Based on Grid projects (PPDG, GriPhyN, iVDGL, EU Data Grid, EGEE, etc.)
- Grid production & production support effort – Kaushik De / Pavel Nevski

Facilities Organization Chart

WBS 2.3 Personnel Increase for FY '04
(Important request, not fully funded)

Tier 1 Facility

- Functions:
  - Primary U.S. data repository for ATLAS
  - Programmatic event selection and AOD & DPD regeneration from ESD
  - Chaotic high-level analysis by individuals
    - Especially for large data set analyses
  - Significant source of Monte Carlo
  - Re-reconstruction as needed
  - Technical support for smaller US computing resource centers
- Co-located and operated with the RHIC Computing Facility
  - To date a very synergistic relationship
  - Some recent increased divergence
  - Substantial benefit from cross use of idle resources (2000 CPU's)

Tier 1 Facility Evolution for FY '04

- No staff increase nor equipment procurement for FY '03
  - The only new equipment for FY '02 was based on a DOE end-of-year funding supplement: a 10 TByte disk addition and an upgrade of a single tape drive
- The result has been capacities lower than expected and needed
  - Compute capacities applied to ATLAS Data Challenge 1 (DC1) were ~2x less than expected by ATLAS based on US author count
  - Only very efficient facility utilization and supplemental production at Tier 2's & 3's resulted in an acceptable level of US contribution
- Modest equipment upgrades planned for FY '04 (for DC2); see the sketch below
  - Disk: 12 TBytes → 25 TBytes (factor of 2)
  - CPU farm: 30 kSPECint2000 → 130 kSPECint2000 (factor of 4)
    - First processor farm upgrade since FY '01 (3 years)
  - Robotic tape storage: 30 MBytes/sec → 60 MBytes/sec (factor of 2)
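
The growth factors quoted in parentheses follow directly from the before/after capacities listed above; a trivial sketch that reproduces them:

    # Reproduce the FY '04 Tier 1 upgrade factors from the capacities listed above.
    upgrades = {
        "disk (TBytes)":           (12, 25),
        "CPU farm (kSPECint2000)": (30, 130),
        "tape storage (MBytes/s)": (30, 60),
    }
    for name, (before, after) in upgrades.items():
        print(f"{name}: {before} -> {after}  (factor of {after / before:.1f})")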

Capital Equipment

Need for Tier 1 Facility Staff Increase

- Procurement, installation, and operation of additional equipment
- Need for an ATLAS-specific Linux OS: RH 7.3 versus RHIC's RH 9
- Investigation of alternate disk technologies
  - In particular, CERN Linux disk server-like approaches
- Increased complexity of cyber security and AAA for the Grid
- Major increases in user base and level of activity in 2004
  - Grid3/PreDC2, a Grid demonstration exercise in preparation for DC2
  - DC2, ATLAS Data Challenge 2
  - LHC Computing Grid (LCG) deployment (LCG-0 → LCG-1)

Cost Impact of All ESD on Local Disk

- Assumptions (see the parametric sketch below):
  - Increase from 480 TB to 1 PB of total disk
  - Some associated increase in CPU and infrastructure
  - Simple extension of current technology
    - Using a conservative technology, so the cost may be overestimated
  - Personnel requirement unchanged
    - The alternative is effort spent optimizing transfer and caching schemes
- Tier 1 Facility cost differential through 2008 (the first full year of LHC operation)
  - Since the facility cost is not dominated by hardware, reduction to the "1/3 disk model" certainly reduces cost, but not dramatically
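
A minimal parametric sketch of the disk cost differential implied by the first assumption; the 480 TB and ~1 PB figures come from the slide, while the unit cost is an explicit placeholder, since the actual cost profile is presented elsewhere in this review.

    # Parametric sketch of the 'all ESD on disk' disk cost differential.
    # unit_cost_k_per_tb is a placeholder assumption, not a quoted figure.
    baseline_disk_tb = 480     # total disk in the baseline plan (from the slide)
    all_esd_disk_tb  = 1000    # ~1 PB total disk with all ESD resident

    unit_cost_k_per_tb = 5.0   # assumed $k per TB of installed, served disk

    delta_tb = all_esd_disk_tb - baseline_disk_tb
    delta_cost_k = delta_tb * unit_cost_k_per_tb
    print(f"additional disk: {delta_tb} TB, ~${delta_cost_k:,.0f}k at the assumed unit cost")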

Tier 2 Facilities

- 5 permanent Tier 2 facilities
  - Primary resource for simulation
  - Empower individual institutions and small groups to do autonomous analyses using more directly accessible and locally managed resources
- 2 prototype Tier 2's, selected for their ability to rapidly contribute to Grid development
  - Indiana University / (effective FY '03) University of Chicago
  - Boston University
- Permanent Tier 2's will be selected to leverage strong institutional resources
  - Selection of the first two is scheduled for spring 2004
- Currently 7 active Tier 3's in addition to the prototype Tier 2's; all are candidate Tier 2's
- The aggregate of the 5 permanent Tier 2's will be comparable to the Tier 1 in CPU

Tier 2 Facilities Evolution

- First significant iVDGL-funded equipment procurements are now underway (Moore's law → don't buy it until you need it)
- Second round scheduled for summer FY '04
- At the time of DC2, aggregate Tier 2 capacities will be comparable to those of the Tier 1; later in 2004, very significantly more

Networking

- Responsible for:
  - Specifying both the national and international WAN requirements of US ATLAS
  - Communicating requirements to the appropriate network infrastructure suppliers (ESnet, Internet2, etc.)
  - Monitoring the extent to which WAN requirements...
    - ...are currently being met
    - ...will continue to be met as they increase in the future
- Small base-program-supported effort includes:
  - Interacting with ATLAS facility site managers and technical staff
  - Participating in HENP networking forums
  - Adopting/adapting/developing, deploying, and operating WAN monitoring tools
- WAN upgrades are not anticipated during the next year
  - Currently Tier 1 & 2 sites are at OC12, except UC, which is now planning an OC3 → OC12 upgrade by Fall
  - Upcoming exercises require ~1 TByte/day (~15% of OC12 theoretical capacity; see the check below)
  - Competing RHIC utilization at BNL is currently also in the ~15% range
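
The ~15% figure is easy to check: an OC12 has a nominal line rate of about 622 Mbit/s, and ~1 TByte/day averages to roughly 93 Mbit/s. A small sketch of that arithmetic (only the OC12 line rate is supplied here; the 1 TByte/day requirement is from the slide):

    # Check the quoted OC12 utilization for ~1 TByte/day of transfers.
    OC12_BITS_PER_S = 622.08e6      # nominal OC12 line rate
    SECONDS_PER_DAY = 86400

    daily_volume_bits = 1e12 * 8    # ~1 TByte/day, from the slide

    average_rate_bps = daily_volume_bits / SECONDS_PER_DAY
    utilization = average_rate_bps / OC12_BITS_PER_S

    print(f"average rate: {average_rate_bps / 1e6:.0f} Mbit/s")
    print(f"OC12 utilization: {utilization:.0%}")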

Grid Tools & Services

- Responsible for the development, evaluation, and creation of an integrated Grid-based system for distributed production processing and user analysis
- Primary point of contact and coordination with Grid projects (PPDG, GriPhyN, iVDGL, EDG, EGEE, etc.)
  - Accept, evaluate, and integrate tools & services from Grid projects
  - Transmit requirements and feedback to Grid projects
- Responsible for supporting the integration of ATLAS applications with Grid tools & services

Grid Production

- Responsible for deploying, production-scale testing & hardening, operating, monitoring, and documenting the performance of systems for production processing & user analysis
- Primary point of contact to ATLAS production activities, including the transmission of...
  - ...production requests to, and facility availability from, the rest of US ATLAS computing management
  - ...requirements to ATLAS production for optimal use of US resources
  - ...feedback to the Tools & Services effort regarding production-scale issues
- Responsible for integration, on an activity-by-activity basis, of US ATLAS production contributions into overall ATLAS production
- An increase of 2.65 Project-supported FTE's was requested for FY '04 to address growing production demands, but the budget supports only 1.65

Increasing Grid Production

- Two significant production activities in FY '04 (only DC1 in FY '03)
  - Grid3/PreDC2 exercise
  - DC2
  - While each is anticipated to be a few months in duration, experience from DC1 indicates that near-continuous ongoing production is more likely
- Production is moving from being Facility centric to being Grid centric
  - In its newness, Grid computing is a more complex and less stable production environment and currently requires more effort
- Level of effort
  - During DC1 (less than 50% Grid, using 5 sites): 3.35 FTE's (0.85 Project)
  - For Grid3/PreDC2/DC2 (~100% Grid, using 11 sites): minimum of 6 FTE's
  - Reductions below this level (forced by budget constraints)
    - Will reduce the efficiency of resource utilization
    - Will force some fallback from Grid- to Facility-type production to meet commitments

3 Major Near Term Milestones

- LCG deployment, including the US Tier 1
  - LCG-0, exercise of deployment mechanisms – completed May '03
    - Substantial comment on the mechanisms was offered and seemed well received
  - LCG-1, initial deployment beginning – July '03
  - LCG-1, full-function, reliable, manageable service – Jan '04
- PreDC2/Grid3 exercise – Nov '03
  - Full geographic chain (Tier 2 → Tier 1 → Tier 0 → Tier 1 → Tier 2) + analysis
  - Goals: test the DC2 model, forge the Tier 0 / Tier 1 staff link, initiate Grid analysis
- ATLAS DC2 – April '04 (slippage by ~3 months is not unlikely)
  - DC1 scale in number of events (~10^7) but 2x in CPU & storage for Geant4 (see the sketch below)
  - Exercising the complete geographic chain (Tier 2 → Tier 1 → Tier 0 → Tier 1 → Tier 2)
  - Goal: use of LCG-1 for Grid computing as input to the Computing Model Document
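
The DC2 sizing above is stated only relative to DC1 (the same ~10^7 events but roughly twice the CPU and storage per event because of Geant4); here is a sketch of that scaling, in which the DC1 per-event figures are purely illustrative placeholders, not measured DC1 numbers.

    # Scale DC2 resource needs from DC1 as described above: same ~1e7 events,
    # ~2x CPU and storage per event for Geant4. Per-event DC1 figures are
    # illustrative placeholders.
    N_EVENTS = 1e7
    GEANT4_FACTOR = 2.0

    dc1_cpu_ksi2k_s_per_event = 0.5   # placeholder CPU cost per event
    dc1_mbytes_per_event      = 2.0   # placeholder storage per event

    dc2_cpu_ksi2k_s = N_EVENTS * dc1_cpu_ksi2k_s_per_event * GEANT4_FACTOR
    dc2_storage_tb  = N_EVENTS * dc1_mbytes_per_event * GEANT4_FACTOR / 1e6

    print(f"DC2 CPU:     {dc2_cpu_ksi2k_s:.2e} kSI2k-seconds")
    print(f"DC2 storage: {dc2_storage_tb:.0f} TB")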

Near Term Schedule

A U.S. ATLAS Physics Analysis Center at BNL

- Motivation:
  - Position the U.S. to ensure active participation in ATLAS physics analysis
  - Builds on the existing Tier 1 ATLAS Computing Center, CORE software leadership at BNL, and theorists who are already working closely with experimentalists
  - This BNL center will become a place where U.S. physicists come with their students and post-docs
- Scope and timing:
  - Hire at least 1 key physicist per year starting in 2003, adding to the excellent existing staff, to cover all aspects of ATLAS physics analysis: tracking, calorimetry, muons, trigger, simulation, etc.
  - Expect the total staff, including migration from D0, to reach ~25 by 2007
  - The first hire will arrive on August 26, 2003
  - The plan is to have a few of the members in residence at CERN for 1-2 years on a rotating basis
- Cost: base funding
  - Will need a DOE increment to the declining BNL HEP base program; additional base funding of ~$200k/year in FY03 rising to $1.5M in FY07

(H. Gordon, BNL DOE Annual HEP Program Review, April 22, 2002)