1
U.S. ATLAS Computing Facilities (Overview)
Bruce G. Gibbard, Brookhaven National Laboratory
US ATLAS Computing Advisory Panel Meeting
Argonne National Laboratory, October 30-31, 2001
2
US ATLAS Computing Facilities Mission
Facilities procured, installed and operated...
...to enable effective participation by US physicists in the ATLAS physics program:
- Direct access to and analysis of physics data sets
- Simulation, re-reconstruction, and reorganization of data as required to complete such analyses
...to meet U.S. "MOU" obligations to ATLAS:
- Direct IT support (Monte Carlo generation, for example)
- Support for detector construction, testing, and calibration
- Support for software development and testing
3
US ATLAS Computing Facilities Overview
A hierarchy of Grid-connected distributed resources including:
- Tier 1 Facility, located at Brookhaven – Rich Baker / Bruce Gibbard
  - Operational at ~0.5% level
- 5 permanent Tier 2 facilities (to be selected in April '03)
  - 2 prototype Tier 2's selected earlier this year and now active:
    - Indiana University – Rob Gardner
    - Boston University – Jim Shank
- Tier 3 / institutional facilities
  - Several currently active; most are candidates to become Tier 2's:
    Univ. of California at Berkeley, Univ. of Michigan, Univ. of Oklahoma, Univ. of Texas at Arlington, Argonne Nat. Lab.
- Distributed IT Infrastructure – Rob Gardner
- US ATLAS Grid Testbed – Ed May
- HEP Networking – Shawn McKee
- Coupled to Grid projects with designated liaisons:
  - PPDG – Torre Wenaus
  - GriPhyN – Rob Gardner
  - iVDGL – Rob Gardner
  - EU Data Grid – Craig Tull
4
Evolution of US ATLAS Facilities Plan
In response to changes, or potential changes, in:
- Schedule
- Requirements / computing model
- Technology
- Budgetary guidance
Changes in schedule:
- LHC start-up projected to be a year later: 2005/2006 → 2006/2007
- ATLAS Data Challenges (DC's) have, so far, stayed fixed (a rough scale check of the event fractions follows below):
  - DC0 – Nov/Dec 2001 – 10^5 events – continuity test
  - DC1 – Feb/Jul 2002 – 10^7 events (~1%)
  - DC2 – Jan/Sep 2003 – 10^8 events (~10%) – a serious functionality/capacity exercise
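The ~1% and ~10% figures for DC1 and DC2 can be cross-checked against an assumed nominal data-taking year. In the Python sketch below, the ~10^9 events/year figure (100 Hz trigger rate over ~10^7 live seconds) is an assumption for illustration, not a number taken from these slides.

```python
# Rough scale check of the Data Challenge fractions quoted above.
# Assumption (not from these slides): one nominal year of ATLAS data taking is
# ~1e9 recorded events (100 Hz trigger rate x 1e7 s of live time).
NOMINAL_YEAR_EVENTS = 100 * 1e7            # ~1e9 events per year (assumed)

data_challenges = {"DC0": 1e5, "DC1": 1e7, "DC2": 1e8}

for name, n_events in data_challenges.items():
    fraction = n_events / NOMINAL_YEAR_EVENTS
    print(f"{name}: {n_events:.0e} events ~ {fraction:.2%} of a nominal year")
```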
5
Changes in Computing Model and Requirements
Requirements are defined by the International ATLAS Computing Model.
Nominal model and requirements for a Tier 1 (expect there to be ~6):
- Raw → ESD/AOD/TAG pass done at CERN, result shipped to Tier 1's
- TAG/AOD/~25% of ESD on disk; tertiary storage for the remainder of the ESD
- Selection passes through ESD monthly
- Analysis of TAG/AOD/selected ESD/etc. (n-tuples) on disk within 4 hours by ~200 users requires …
6
Changes in Computing Model and Requirements (2)
Revised model and requirements for a Tier 1 (under consideration):
- Raw → ESD/AOD/TAG pass done at CERN, result shipped to Tier 1's
- TAG/AOD/33% of ESD on disk at each Tier 1 (the 3 sites in aggregate contain 100% of the ESD on disk)
- Selection passes through ESD daily, using data resident on disk locally and at 2 complementary Tier 1's (a rough I/O sketch follows below)
- Analysis of TAG/AOD/all ESD/etc. (n-tuples) on disk within 4 hours by ~200 users requires …
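To see what a daily selection pass implies for local disk I/O, here is a minimal sketch. The data volumes are assumptions for illustration only (~10^9 events/year at ~0.5 MB of ESD per event, i.e. ~500 TB of ESD), not numbers from these slides.

```python
# Rough I/O estimate behind "selection passes through ESD daily".
# Assumed (illustrative) volumes: ~1e9 events/year, ~0.5 MB of ESD per event.
events_per_year = 1e9
esd_mb_per_event = 0.5
total_esd_tb = events_per_year * esd_mb_per_event / 1e6      # ~500 TB in total
local_share_tb = total_esd_tb / 3                            # ~33% held per Tier 1

seconds_per_day = 24 * 3600
# Sustained disk read rate for one Tier 1 to stream its ESD share in one day
local_scan_mb_per_s = local_share_tb * 1e6 / seconds_per_day

print(f"Local ESD share: {local_share_tb:.0f} TB")
print(f"Daily selection pass needs ~{local_scan_mb_per_s:.0f} MB/s of sustained disk reads")
```

Rates of this order (a couple of GB/s) are plausible for disk farms (the RAID procurement on slide 8 already delivers 1400 MB/s) but not for selection driven from tape, which is the point of the comparison on the next slide.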
7
Comparing Models
All ESD, AOD, and TAG data on disk greatly speeds/improves analyses:
- Enables one-day selection passes (rather than one month) and reduces the tape requirement imposed by selection processing – better/faster selection
- Allows navigation of individual events (for all processed, but not Raw, data) without recourse to tape and the associated delay – more detailed/faster analysis
- Avoids contention between analyses over ESD disk space and the need for complex algorithms to optimize use of that space – less effort for a better result
But there are potentially significant cost and operational drawbacks:
- Additional disk is required to hold 1/3 of the ESD (a rough footprint comparison follows below)
- Additional CPU is required to support more frequent selection passes
- It introduces major dependencies between Tier 1's
- It increases sensitivity to the performance of the network and associated Grid middleware (particularly when separated by a "thin" pipe across an ocean)
What is optimal for US ATLAS computing?
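A rough footprint comparison makes the disk cost of the options concrete. The per-year data volumes below (~500 TB ESD, ~10 TB AOD, ~1 TB TAG) are assumptions for illustration, not slide numbers; the $/TB figure is the 2001 RCF price point from slide 8 and would fall substantially before deployment. The 100% option is the augmentation adopted in the revised US Tier 1 plan on slide 11.

```python
# Rough disk-footprint comparison of the ESD-on-disk options discussed here.
# Assumed (illustrative) per-year volumes: ~500 TB ESD, ~10 TB AOD, ~1 TB TAG.
ESD_TB, AOD_TB, TAG_TB = 500.0, 10.0, 1.0
DOLLARS_PER_TB_2001 = 27_000               # 2001 RCF price point (slide 8)

options = {
    "nominal model    (25% ESD)": 0.25,
    "revised model    (33% ESD)": 1.0 / 3.0,
    "US Tier 1 plan  (100% ESD)": 1.00,
}

for name, esd_fraction in options.items():
    disk_tb = TAG_TB + AOD_TB + esd_fraction * ESD_TB
    print(f"{name}: {disk_tb:5.0f} TB on disk, "
          f"~${disk_tb * DOLLARS_PER_TB_2001 / 1e6:.1f}M at the 2001 price point")
```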
8
Changes in Technology
No dramatic new technologies; previously assumed technologies are tracking Moore's Law well.
Recent price/performance points from the RHIC Computing Facility:
- CPU: IBM procurement – $33/SPECint95
  - 310 dual 1 GHz Pentium III nodes @ 97.2 SPECint95/node
  - Delivered Aug 2001, now fully operational
  - $1M fully racked, including cluster management hardware & software
- Disk: OSSI/LSI procurement – $27k/TByte
  - 33 usable TB of high-availability Fibre Channel RAID 5 @ 1400 MBytes/sec
  - Delivered Sept 2001, first production use this week
  - $887k including SAN switch
Strategy is to project, somewhat conservatively, from these points for facilities design and costing, using somewhat longer than the observed <18 month price/performance halving time (sketched below) – detailed capacity & costing will be presented by Rich Baker.
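The quoted price/performance points can be reproduced directly from the procurement figures, and the same numbers feed the conservative Moore's Law projection used for costing. In the sketch below the 24-month halving time is an assumption standing in for "somewhat longer than the observed <18 months"; the authoritative projection is Rich Baker's detailed costing, not this.

```python
# Sanity check of the RCF price/performance points, plus a simple
# Moore's-Law-style projection of the kind used for facility costing.

# CPU: 310 dual 1 GHz Pentium III nodes @ 97.2 SPECint95/node for ~$1M
cpu_capacity_si95 = 310 * 97.2
print(f"CPU: ${1_000_000 / cpu_capacity_si95:.0f} per SPECint95")      # ~$33

# Disk: 33 usable TB of Fibre Channel RAID 5 for ~$887k
print(f"Disk: ${887_000 / 33 / 1000:.1f}k per TByte")                  # ~$26.9k

def project_unit_cost(cost_today, years_ahead, halving_months=24):
    """Assume price/performance halves every `halving_months` months."""
    return cost_today * 0.5 ** (12.0 * years_ahead / halving_months)

# Example: expected $/SPECint95 roughly 5 years out, near the '06/'07 start-up
print(f"CPU ~5 years out: ~${project_unit_cost(33.0, 5):.1f} per SPECint95")
```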
9
Changes in Budgetary Assumptions
Assumed Funding Profiles ($K)
For the revised LHC start-up schedule, the new profile is better:
- In the new profile, funding for each year generally matches or exceeds that for one year earlier in the old profile
- Funds are more effective when spent 1 year later (Moore's Law; see the sketch below)
For ATLAS DC2, which stayed fixed in '03, the new profile is worse:
- Hardware capacity goals of DC2 cannot be met
- Personnel-intensive facility development may be up to 1 year behind
- Again, Rich Baker will discuss details
Hope/expectation is that another DC will be added, allowing validation of a more nearly fully developed Tier 1 and US ATLAS facilities Grid.
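The "more effective when spent 1 year later" point is just the Moore's Law exponent at work; the short sketch below shows the capacity multiplier for a one-year delay, using the observed <18-month halving time and an assumed, more conservative 24-month value.

```python
# Capacity gained per dollar by delaying a purchase, assuming price/performance
# halves every `halving_months` months.
def delayed_spend_multiplier(delay_years=1.0, halving_months=18):
    return 2 ** (12.0 * delay_years / halving_months)

for halving_months in (18, 24):    # observed vs an assumed conservative value
    gain = delayed_spend_multiplier(1.0, halving_months)
    print(f"halving time {halving_months} months: "
          f"x{gain:.2f} capacity per dollar after a 1-year delay")
```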
10
Capacities of US ATLAS Facilities for Nominal Model
11
Revised US ATLAS Tier 1 Model
Stretch-out of the LHC start-up schedule, combined with the DOE's late funding ramp-up, allows for a significantly improved US ATLAS Tier 1 facility in '07 (rather than '06) while staying within budget. (Unfortunately it does not help for DC2.)
It is based on the Revised International ATLAS Model, with augmentation to address the operational drawbacks:
- Increase disk to hold 100% of the ESD
  - Removing dependency on other Tier 1's
  - Reducing dependency on the network across the Atlantic
- Add sufficient CPU to exploit the greatly improved data access
- Retain the tape storage volume of one STK silo; reduce tape I/O bandwidth to only that required in the new model (selection from disk, not tape; a rough illustration follows below)
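The "selection from disk, not tape" point can be illustrated with the same assumed ESD volume used earlier (again, ~500 TB is an illustrative assumption, not a slide number): in the nominal model, a monthly selection pass through the tape-resident ~75% of the ESD would by itself demand sustained tape reads at roughly the rate below, which the revised plan removes entirely.

```python
# Rough illustration of the tape-bandwidth reduction from keeping all ESD on disk.
# Assumed (illustrative): ~500 TB of ESD, ~75% of it tape-resident in the
# nominal model, one selection pass per month.
total_esd_tb = 500.0
tape_resident_fraction = 0.75
seconds_per_month = 30 * 24 * 3600

tape_read_mb_per_s = total_esd_tb * tape_resident_fraction * 1e6 / seconds_per_month
print(f"Nominal model: ~{tape_read_mb_per_s:.0f} MB/s of sustained tape reads "
      "just for one selection stream")

# Revised US Tier 1 plan: the full ESD is on disk, so tape I/O is limited to
# archiving and occasional recalls, hence the reduced tape bandwidth requirement.
```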
12
Revised US ATLAS Tier 1 Model (2)
Impact on the overall US ATLAS computing model:
- The high availability of the complete ESD set at the Tier 1 makes possible more prompt and detailed analyses by users at coupled Tier 2 and Tier 3 sites, as well as those running directly at the Tier 1
- Increased CPU capacity to exploit this possibility at these sites is desirable and may be feasible given the 1-year delay in delivery date, but such an expansion remains to be studied
- Exploitation of this capability would increase the network load between Tier 2/3 sites and the Tier 1, and thus the network requirement, but again the added year should help and further study is required
Conclusions:
- It is currently our intent to make this revised plan the default US ATLAS Tier 1 plan, and to determine what changes in the overall US ATLAS facilities plan should and can efficiently follow from this
13
Capacities of US ATLAS Facilities for Revised Model
14
Tier 1 Ramp-up Profile (* DC 2)
17
STATUS of Tier 1 Facility Evolution
Goal of planned technical evolution in FY '01 was to establish US ATLAS scalability & independence (from RCF):
- User Services – 100 registered users
  - Accounts, passwords, CTS, etc.
  - Documentation
- Infrastructure Services
  - NIS, DNS, etc. servers
  - SSH gateways
- SUN/Solaris Services
  - Server
  - NFS disk
- AFS Service
  - AFS servers
  - AFS disk
- Network
- HPSS Service
  - Server
  - Tape/Cache disk
19
STATUS of Tier 1 Facility Evolution
Goal of planned technical evolution in FY '01 was to establish US ATLAS scalability & independence (from RCF):
- User Services – 100 registered users
  - Accounts, passwords, CTS, etc.
  - Documentation
- Infrastructure Services
  - NIS, DNS, etc. servers
  - SSH gateways
- SUN/Solaris Services
  - Server *
  - NFS disk *
- AFS Service
  - AFS servers *
  - AFS disk *
- Network *
- HPSS Service
  - Server *
  - Tape/Cache disk *