US-ATLAS Grid Efforts
John Huth, Harvard University
Agency Review of LHC Computing
Lawrence Berkeley Laboratory, January 14-17, 2003

Outline
- Goals
- Data Challenge experience and plans
- Relation to external groups
- US ATLAS grid management

Goals
- Primary: establish a robust and efficient platform (fabric and software) for US LHC and International LHC simulation, production, and data analysis
  - Data challenge support
  - User analysis support
  - Production support
  - Ready for data taking on "day one"
- Realizing common grid goals
  - Local control
  - Democratic access to data
  - Support of autonomous analysis communities ("private grids")
- Value added to other disciplines
- Serving as a knowledgeable user community for testing middleware

US ATLAS Testbed
- Develop a tiered, scalable computing platform
- Test grid middleware integrated with ATLAS software
- Test interoperability with
  - CMS
  - EDG
  - Other groups
- Support physics efforts
  - Data Challenges
  - Production
  - User analysis

Testbed Fabric
- Production gatekeepers at ANL, BNL, LBNL, BU, IU, UM, OU, UTA (a basic query-and-submit smoke test is sketched after this list)
  - Large clusters at BNL, LBNL, IU, UTA, BU
  - Heterogeneous batch systems: Condor, LSF, PBS
  - Currently > 100 nodes available; capacity could be doubled quickly if needed
- Plus multiple R&D gatekeepers:
  - gremlin@bnl - iVDGL GIIS
  - heppc5@uta - ATLAS hierarchical GIIS
  - atlas10/14@anl - EDG testing
  - heppc6@uta + gremlin@bnl - Glue schema
  - heppc17/19@uta - GRAT development
  - a few sites - Grappa portal
  - bnl - VO server
  - a few sites - iVDGL testbed

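To make the fabric concrete, the sketch below shows how a testbed user might query the iVDGL GIIS and smoke-test a production gatekeeper using the Globus Toolkit 2 command-line clients. This is an illustrative sketch, not testbed software: the hostnames are hypothetical, and the MDS object class and attribute names are assumptions based on the GT2 core schema.

```python
# Minimal sketch: query the testbed GIIS and smoke-test a gatekeeper.
# Assumes the Globus Toolkit 2 CLI tools (grid-info-search, globus-job-run)
# are on the PATH; hostnames and MDS schema names are illustrative.
import subprocess

GIIS_HOST = "gremlin.usatlas.bnl.gov"   # hypothetical FQDN for gremlin@bnl
GATEKEEPER = "atlas.dpcc.uta.edu"       # hypothetical production gatekeeper

def list_resources():
    """Ask the iVDGL GIIS (MDS/LDAP, port 2135) for its registered hosts."""
    cmd = ["grid-info-search", "-x", "-h", GIIS_HOST, "-p", "2135",
           "-b", "mds-vo-name=ivdgl, o=grid",
           "(objectclass=MdsHost)", "Mds-Host-hn"]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

def smoke_test(gatekeeper):
    """Run a trivial fork job through the gatekeeper to verify it is alive."""
    cmd = ["globus-job-run", gatekeeper, "/bin/hostname"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0, result.stdout.strip()

if __name__ == "__main__":
    print(list_resources())
    ok, host = smoke_test(GATEKEEPER)
    print("gatekeeper alive:" if ok else "gatekeeper failed:", host)
```
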
Grid Testbed Sites
- Lawrence Berkeley National Laboratory
- Brookhaven National Laboratory
- Indiana University
- Boston University
- Argonne National Laboratory
- University of Michigan
- University of Texas at Arlington
- Oklahoma University
The US-ATLAS testbed was launched in February 2001; two new sites are joining: UNM and SMU.

Testbed Tools
Many tools have been developed by the U.S. ATLAS testbed group during the past two years:
- GridView - simple tool to monitor the status of the testbed (Kaushik De, Patrick McGuigan)
- Gripe - unified user accounts (Rob Gardner)
- Magda - MAnager for Grid DAta (Torre Wenaus, Wensheng Deng; see the Gardner & Wenaus talks)
- Pacman - package management and distribution tool (Saul Youssef); a usage sketch follows this list
- Grappa - web portal using active notebook technology (Shava Smallen, Dan Engh)
- GRAT - GRid Application Toolkit
- Gridsearcher - MDS browser (Jennifer Schopf)
- GridExpert - knowledge database (Mark Sosebee)
- VO Toolkit - site AA (Rich Baker)

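As a concrete illustration of the Pacman workflow, here is a minimal sketch assuming the usual cache:package invocation style. The cache and package names are hypothetical, and the wrapper itself is not part of the testbed tools.

```python
# Minimal sketch: install an ATLAS grid package with Pacman.
# Assumes 'pacman' is installed; cache and package names are illustrative.
import subprocess

def pacman_install(cache, package, target_dir="."):
    """Fetch and install a package (plus dependencies) from a Pacman cache."""
    # Pacman resolves '<cache>:<package>', downloads it, and runs its
    # setup scripts in the current working directory.
    return subprocess.run(["pacman", "-get", f"{cache}:{package}"],
                          cwd=target_dir).returncode == 0

if __name__ == "__main__":
    # e.g. pull a (hypothetical) DC1 production kit from a BU-hosted cache
    if pacman_install("BU", "AtlasProduction"):
        print("installation complete")
```
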
Recent Accomplishments
May 2002:
- Globus 2.0 beta RPM developed at BNL
- Athena-atlfast grid package developed at UTA
- Installation using Pacman developed at BU
- GRAT toolkit for job submission on the grid developed at UTA & OU
June 2002:
- Tested interoperability: successfully ran ATLAS MC jobs on CMS & D0 grid sites
- ANL demonstrated that the U.S. testbed package can run successfully at EDG sites
July 2002:
- New production software released & deployed
- Two-week Athena-atlfast MC production run using GRAT & Grappa
- Generated 10 million events, with a thousand files catalogued in Magda and all sites participating

Accomplishments (contd.)
August/September 2002:
- Three-week DC1 production run using GRAT
- Generated 200,000 events using ~30,000 CPU hours; 2,000 files, 100 GB of storage (per-event cost is worked out below)
October/November 2002:
- Prepared demos for SC2002
- Deployed a VO server at the BNL Tier 1 facility
- Deployed the new VDT 1.1.5 on the testbed
- Tested the iVDGL packages WorldGrid and ScienceGrid
- Interoperability tests with EDG
December 2002:
- Developed a software evolution plan during a meeting with the Condor/VDT team at UW-Madison
- Generated 75k SUSY and Higgs events for DC1
- Total DC1 files generated and stored > 500 GB; total CPU used > 1,000 CPU days in 4 weeks

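For scale, the per-event cost implied by the August/September run follows directly from the figures quoted above; the short script below just does the arithmetic, with no new data.

```python
# Per-event resource cost implied by the DC1 grid run (figures from above).
events = 200_000
cpu_hours = 30_000
storage_gb = 100
files = 2_000

print(f"{cpu_hours * 60 / events:.1f} CPU-minutes per event")  # 9.0
print(f"{storage_gb * 1024 / events:.2f} MB per event")        # 0.51
print(f"{storage_gb * 1024 / files:.0f} MB per file")          # 51
```
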
Lessons Learned
- Globus, Magda, and Pacman make grid production easy! "On the grid - submit anywhere, run anywhere, store data anywhere" really works.
- Error reporting, recovery, and cleanup are very important - some jobs will always be lost or hang (a recovery wrapper of the kind needed is sketched after this list)
- Found many unexpected limitations, hangs, and software problems; next time a larger team is needed to quantify these problems and provide feedback to the Globus, Condor, and other middleware teams
- A large pool of hardware resources is available on the testbed: BNL Tier 1, LBNL (PDSF), the IU & BU prototype Tier 2 sites, UTA (new $1.35M NSF-MRI), the OU & UNM CS supercomputing clusters...
- The testbed production effort is suffering from a severe shortage of human resources; people are needed to debug middleware problems and provide feedback to middleware developers

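To make the recovery point concrete, here is a minimal sketch of the resubmit-with-timeout wrapper a production script needs. It is hypothetical (the real GRAT scripts are not reproduced here): it assumes the GT2 globus-job-run client, and the gatekeeper names are illustrative.

```python
# Minimal sketch of job recovery: resubmit lost or hung grid jobs.
# Hypothetical helper in the spirit of the testbed production scripts;
# assumes the GT2 'globus-job-run' client; gatekeeper names illustrative.
import subprocess

GATEKEEPERS = ["atlas.dpcc.uta.edu", "gremlin.usatlas.bnl.gov"]  # illustrative

def run_with_recovery(executable, args, max_attempts=3, timeout_s=3600):
    """Try each gatekeeper in turn; kill hung jobs and resubmit elsewhere."""
    for attempt in range(max_attempts):
        gk = GATEKEEPERS[attempt % len(GATEKEEPERS)]
        try:
            result = subprocess.run(
                ["globus-job-run", gk, executable, *args],
                capture_output=True, text=True, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt + 1}: job hung at {gk}, resubmitting")
            continue  # cleanup of any remote state would go here
        if result.returncode == 0:
            return result.stdout
        print(f"attempt {attempt + 1}: failed at {gk}: {result.stderr.strip()}")
    raise RuntimeError(f"job failed after {max_attempts} attempts")
```
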
DC1 Phase II
- Provide data for high-level trigger studies
- Data analysis from production at BNL
- Unified set of grid tools for International ATLAS: Magda, Pacman, cook-book database, VO management
- There may still be divergences among the grids (US / EU / NorduGrid)

ATLAS DC1 Phase 2
- Pile-up production: ongoing!
  - Both grid- and non-grid-based
  - Add minimum-bias events to the DC1 Phase 1 sample
- Simulation in the U.S.
  - Higgs re-simulation - 50k events
  - SUSY simulation - 50k events
- Athena reconstruction of the complete DC1 Phase 2 sample
- Analysis / user access to data
  - Magda already provides access to ~30k catalogued DC1 files from/to many grid locations (the ATLAS VO is needed to make this universal); see the sketch below
  - Higher-level (web-based?) tools are needed to give physicists easy access
  - DIAL is being developed at BNL: http://www.usatlas.bnl.gov/~dladams/dial/

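A minimal sketch of locating and fetching a catalogued DC1 file through Magda from the command line. The tool names (magda_findfile, magda_getfile), their options, and the dataset name are assumptions for illustration, not verified Magda interfaces.

```python
# Minimal sketch: locate and fetch a catalogued DC1 file via Magda's CLI.
# Tool names and options are assumptions for illustration; the dataset
# pattern is hypothetical.
import subprocess

def find_files(pattern):
    """List catalogued logical files whose names match a pattern."""
    out = subprocess.run(["magda_findfile", pattern],
                         capture_output=True, text=True).stdout
    return out.splitlines()

def fetch(logical_name, dest="."):
    """Copy one replica of a logical file to a local directory."""
    return subprocess.run(["magda_getfile", logical_name, dest]).returncode == 0

if __name__ == "__main__":
    for lfn in find_files("dc1.002000.simul*"):  # hypothetical dataset name
        fetch(lfn)
```
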
DC2 Specifics
From the new WBS (we will review the situation in early 2003):
- Test LCG-1 (software and hardware) - in particular POOL
- ATLAS tests:
  - simulation: status of G4 (validation, hits, digits)
  - pile-up in Athena
  - relative role of G3/G4
  - calibration and alignment
  - detector description and EDM
  - distributed analysis

Integration
Coordination with other grid efforts and software developers - a very difficult task!
Project-centric:
- Intl. ATLAS - K. De
- GriPhyN/iVDGL - Rob Gardner (J. Schopf, CS contact)
- PPDG - John Huth
- LCG - John Huth (POB), V. White (GDB), L. Bauerdick (SC2)
- EDG - Ed May, Jerry Gieraltowski
- ATLAS/LHCb - Rich Baker
- ATLAS/CMS - Kaushik De
- ATLAS/D0 - Jae Yu
Fabric/middleware-centric:
- AFS software installations - Alex Undrus, Shane Canon, Iwona Sakrejda
- Networking - Shawn McKee, Rob Gardner
- Virtual and real data management - Wensheng Deng, Sasha Vaniachin, Pavel Nevski, David Malon, Rob Gardner, Dan Engh, Mike Wilde
- Security/site AA/VO - Rich Baker, Dantong Yu

NLV Analysis Tool
[Screenshot: the NLV analysis tool plotting time vs. event name, with annotated UI elements - menu bar, events legend, zoom box and zoom-window controls, playback controls and speed, summary line, and time axis.]

Grid Planning
- For Data Challenge 1, Phase 2: generation of High Level Trigger data, using a unified format for International ATLAS computing in a grid configuration
- For R&D purposes: increased integration of iVDGL/GriPhyN tools (VDT, Chimera), working with tools in common with US CMS
- Issue of divergences with EDG software and configurations

External Groups
- Trillium (GriPhyN, iVDGL, PPDG)
- EU initiatives
- LCG
- Intl. ATLAS
- U.S. CMS (large ITR pre-proposal)
- Other groups/experiments (HEP/LIGO/SDSS...)
- Funding agencies

Trillium Group
PPDG (J. Huth):
- Metadata catalog (Magda) - Wensheng Deng, BNL
- Interoperability, middleware evaluation - Jerry Gieraltowski, ANL
- Virtual organization and monitoring in a grid environment - Dantong Yu, BNL
- Distributed analysis - David Adams, BNL
iVDGL (R. Gardner):
- Package deployment/installation (Pacman) - S. Youssef, BU - adopted by VDT and CMS
- Incorporation of VDT and Chimera in the next prototype round
- Hardware support - prototype Tier 2s (Indiana, Boston University)
NB: a tremendous amount of support comes from base efforts at the labs and universities (NetLogger - LBNL; GRAT - K. De, UTA; support - H. Severini, Oklahoma; S. McKee, Michigan; E. May, ANL; BNL, LBNL, BU).

EU Initiatives
- Linkages via International ATLAS (using EDG tools/testbed)
- LCG
- ITR initiative
- Middleware representatives (Foster/Kesselman/Livny/Gagliardi)
Issues:
- The EDG mandate is larger than the LHC
- Divergences in middleware and approaches
- "Best in show" concept - David Foster (LCG)
- Interoperability of testbeds

LCG
- More than just grids - the applications group is very large
- Linkages via the international experiments
- Middleware representatives
- Architects' forum
- Applications (T. Wenaus)
- Representation: L. Bauerdick (SC2), V. White (GDB), J. Huth (POB)
Issues:
- Communications about LCG-1
- What are the requirements for Tier 1s?
- Divergences in site requirements?
- Linkage of the international experiments with LCG
- DC1 Phase 2 versus LCG-1

International ATLAS
Main focus: Data Challenges - G. Poulard (DC coordinator), A. Putzer (NCB chair)
Linkages:
- National Computing Board (resources, planning) - J. Huth
- Grid planning - R. Gardner
- Technical Board - K. De
Issues:
- Divergence in US-EU grid services
- Commonality vs. interoperability
- Probably the tightest coupling of the external groups (nothing focuses the mind like data)
- Middleware selection
- Facilities planning

US CMS
- Commonality in many tools (Pacman, VDT, Chimera)
- Discussions in common forums (LCG, Trillium, CS groups)
- Interoperability tests
- Large ITR proposal:
  - Coordination with EDG, LCG, US CMS, and the CS community
  - Use the existing management structures of US ATLAS + US CMS
  - Coordinating group
  - Physics-based goal of "private grids" for analysis groups
  - Coordination among funding agencies, LCG, and the experiments
- Further funding initiatives coordinated (EGEE, DOE)

Funding Agencies
- Recent initiative to pull together the funding agencies involved in grids and HEP applications
  - Nov. 22nd meeting of NSF, DOE, computing, HEP, EU, and CERN representatives
  - Follow-up date proposed: Feb. 7th
- November ITR workshop with US CMS, US ATLAS, computer scientists, and funding-agency representatives
  - Follow-up: M. Kasemann
- Goal: coordinated funding of EU/US/CERN efforts on the experiments and grid middleware
- Beyond the large ITR:
  - Medium ITRs
  - DOE initiatives
  - EU funding
  - Question: UK and other funding

Other Groups
Experiments:
- Direct linkage via GriPhyN, iVDGL: SDSS, LIGO
- Contacts and discussions with D0 and the RHIC experiments
Grid organizations:
- PPARC
- CrossGrid
- HICB
- Interoperability initiatives

US ATLAS Grid Organization
- Present status: WBS projection onto the grid topics of software and facilities
- Efforts have focused on the SC2002 demos and ATLAS production - near-term milestones
- I have held off until now on reorganizing the management structure because of the rapid evolution:
  - Gaining experience with the US ATLAS testbed
  - Creation of new organizations
  - Shake-out of the LHC schedule
- We are now discussing the changes to the management structure needed to coordinate these efforts

Proposed New Structure
- Creation of a level-2 management slot for distributed computing applications
  - Interaction with the above list of groups (or delegated liaison)
- Three sub-tasks:
  - Architecture - series of packages created for ATLAS production or prototyping
  - Components
    - Testing and support of grid deliverables (e.g. high-level ATLAS-specific interfaces)
    - Grid deliverables associated with IT initiatives
  - Production
    - Running Intl. ATLAS production on the US ATLAS fabric
    - Contributions to Intl. ATLAS production

Next Steps in Organization
- Bring this new structure to the US ATLAS Computing Coordination Board and seek advice on level-2 and level-3 managers
- Coordinate with the "liaison" groups

Comments/Summary
- US ATLAS has made tremendous progress in the last year in testing grid production:
  - SC2002 demo
  - Interoperability tests
  - Data Challenges
- Management of diverse sources of effort is challenging!
- Mechanisms have been started to coordinate further activities, without creating divergences or a proliferation of new groups
- A new US ATLAS management structure is under discussion