
Slide 1: US-ATLAS Grid Efforts
John Huth, Harvard University
Agency Review of LHC Computing, Lawrence Berkeley Laboratory, January 14-17, 2003

Slide 2: Outline
- Goals
- Data Challenge experience and plans
- Relation to external groups
- US ATLAS grid management

Slide 3: Goals
- Primary: establish a robust and efficient platform (fabric and software) for US LHC and International LHC simulation, production and data analysis
  - Data Challenge support
  - User analysis support
  - Production support
  - Ready for data taking on "day one"
- Realizing common grid goals
  - Local control
  - Democratic access to data
  - Support of autonomous analysis communities (private grids)
- Value added to other disciplines
- Serving as a knowledgeable user community for testing middleware

Slide 4: US ATLAS Testbed
- Develop a tiered, scalable computing platform
- Test grid middleware integrated with software
- Test interoperability
  - CMS
  - EDG
  - Other groups
- Support physics efforts
  - Data Challenges
  - Production
  - User analysis

Slide 5: Testbed Fabric
- Production gatekeepers at ANL, BNL, LBNL, BU, IU, UM, OU, UTA
- Large clusters at BNL, LBNL, IU, UTA, BU
- Heterogeneous system - Condor, LSF, PBS
- Currently > 100 nodes available; could double capacity quickly, if needed
- Plus multiple R&D gatekeepers:
  - gremlin@bnl - iVDGL GIIS
  - heppc5@uta - ATLAS hierarchical GIIS
  - atlas10/14@anl - EDG testing
  - heppc6@uta + gremlin@bnl - Glue schema
  - heppc17/19@uta - GRAT development
  - a few sites - Grappa portal
  - bnl - VO server
  - a few sites - iVDGL testbed
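
The mix of Condor, LSF and PBS clusters behind these gatekeepers is workable because Globus GRAM puts the same submission interface in front of every local batch system, so a client only needs the gatekeeper contact string. The following is a minimal sketch of that idea, assuming a Globus Toolkit 2 client installation with globus-job-run on the path and a valid grid proxy; the hostnames and jobmanager names are placeholders, not the real testbed contact strings.

```python
import subprocess

# Hypothetical gatekeeper contact strings: the real testbed hostnames and
# jobmanager names (jobmanager-condor / jobmanager-pbs / jobmanager-lsf)
# are not given in the slides, so these are placeholders.
GATEKEEPERS = [
    "gatekeeper.example-bnl.gov/jobmanager-condor",
    "gatekeeper.example-uta.edu/jobmanager-pbs",
    "gatekeeper.example-lbl.gov/jobmanager-lsf",
]

def run_everywhere(executable="/bin/hostname"):
    """Run a trivial test job on every gatekeeper via the GT2 GRAM client."""
    for contact in GATEKEEPERS:
        # globus-job-run <contact> <executable> is the basic GT2 client call;
        # it presumes grid-proxy-init has already been run by this user.
        result = subprocess.run(
            ["globus-job-run", contact, executable],
            capture_output=True, text=True,
        )
        print(contact, "->", (result.stdout or result.stderr).strip())

if __name__ == "__main__":
    run_everywhere()
```

GRAT and Grappa, listed on the next slide, build on this kind of GRAM submission, adding the staging, bookkeeping and error handling that production needs.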

Slide 6: Grid Testbed Sites
- US-ATLAS testbed launched February 2001
- Sites: Argonne National Laboratory, Brookhaven National Laboratory, Lawrence Berkeley National Laboratory, Boston University, Indiana University, U. Michigan, Oklahoma University, University of Texas at Arlington
- Two new sites joining - UNM, SMU

Slide 7: Testbed Tools
- Many tools developed by the U.S. ATLAS testbed group during the past 2 years:
  - GridView - simple tool to monitor the status of the testbed (Kaushik De, Patrick McGuigan)
  - Gripe - unified user accounts (Rob Gardner)
  - Magda - MAnager for Grid DAta (Torre Wenaus, Wensheng Deng; see Gardner & Wenaus talks)
  - Pacman - package management and distribution tool (Saul Youssef)
  - Grappa - web portal using active notebook technology (Shava Smallen, Dan Engh)
  - GRAT - GRid Application Toolkit
  - Gridsearcher - MDS browser (Jennifer Schopf)
  - GridExpert - knowledge database (Mark Sosebee)
  - VO Toolkit - site AA (Rich Baker)
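
Pacman appears throughout these slides as the distribution mechanism for testbed software (the Pacman-based installation at BU on slide 8, and its adoption by VDT and CMS noted on slide 18). Conceptually an install is just: name a trusted cache and a package, and let Pacman fetch, unpack and configure it in the current directory. The sketch below assumes that picture; the cache and package names are invented, and the exact pacman -get syntax differed between Pacman releases of that era, so treat the command line as illustrative only.

```python
import subprocess

def pacman_install(cache, package, install_dir):
    """Sketch of installing a grid software kit with Pacman.

    The cache and package names passed in are placeholders, and the
    '-get cache:package' form shown here is the commonly cited Pacman
    usage; the 2002-era releases on the testbed may have differed.
    """
    # Pacman installs into the directory it is run from, so run it
    # inside the target installation area.
    subprocess.run(
        ["pacman", "-get", f"{cache}:{package}"],
        cwd=install_dir,
        check=True,
    )

# Example with invented names:
# pacman_install("SomeAtlasCache", "AtlasTestbedKit", "/opt/grid-software")
```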

Slide 8: Recent Accomplishments
- May 2002
  - Globus 2.0 beta RPM developed at BNL
  - Athena-atlfast grid package developed at UTA
  - Installation using Pacman developed at BU
  - GRAT toolkit for job submission on the grid developed at UTA & OU
- June 2002
  - Tested interoperability - successfully ran ATLAS MC jobs on CMS & D0 grid sites
  - ANL demonstrated that the U.S. testbed package can run successfully at EDG sites
- July 2002
  - New production software released & deployed
  - 2-week Athena-atlfast MC production run using GRAT & Grappa
  - Generated 10 million events; a thousand files catalogued in Magda; all sites participating

Slide 9: Accomplishments contd.
- August/September 2002
  - 3-week DC1 production run using GRAT
  - Generated 200,000 events using ~30,000 CPU-hours, 2,000 files, 100 GB of storage
- October/November 2002
  - Prepared demos for SC2002
  - Deployed VO server at the BNL Tier 1 facility
  - Deployed new VDT 1.1.5 on the testbed
  - Tested iVDGL packages WorldGrid and ScienceGrid
  - Interoperability tests with EDG
- December 2002
  - Developed a software evolution plan during a meeting with the Condor/VDT team at UW-Madison
  - Generated 75k SUSY and Higgs events for DC1
  - Total DC1 files generated and stored > 500 GB; total CPU used > 1,000 CPU-days in 4 weeks
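
For scale, the August/September figures work out to roughly 100 events per file, about half a megabyte per event, and around nine CPU-minutes per event, while the December total of more than 1,000 CPU-days in four weeks corresponds to keeping roughly 35 CPUs busy continuously. The short calculation below just recomputes those rates from the numbers quoted on the slide.

```python
# Back-of-the-envelope rates from the DC1 figures quoted on this slide.
events     = 200_000   # events generated, Aug/Sep 2002
cpu_hours  = 30_000    # approximate CPU time used
files      = 2_000     # files produced
storage_gb = 100       # storage used

print(f"events per file     : {events / files:.0f}")              # ~100
print(f"MB per event        : {storage_gb * 1024 / events:.2f}")  # ~0.5
print(f"CPU-minutes / event : {cpu_hours * 60 / events:.1f}")     # ~9

# December run: > 1,000 CPU-days over roughly 4 weeks
print(f"average CPUs busy   : {1_000 / 28:.0f}")                  # ~36
```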

Slide 10: Lessons Learned
- Globus, Magda and Pacman make grid production easy!
- On the grid - submit anywhere, run anywhere, store data anywhere - really works!
- Error reporting, recovery & cleanup are very important - some jobs will always be lost or hang
- Found many unexpected limitations, hangs and software problems - next time we need a larger team to quantify these problems and provide feedback to the Globus, Condor and other middleware teams
- Large pool of hardware resources available on the testbed: BNL Tier 1, LBNL (PDSF), IU & BU prototype Tier 2 sites, UTA (new $1.35M NSF-MRI), OU & UNM CS supercomputing clusters...
- Testbed production effort is suffering from a severe shortage of human resources; we need people to debug middleware problems and provide feedback to the middleware developers
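
The error-handling lesson is worth spelling out: because some fraction of jobs always fails or hangs, production scripts must time out, clean up and resubmit rather than assume success. The sketch below illustrates that pattern generically; it is not the actual GRAT logic, the command is whatever launches one job (for example a globus-job-run invocation as sketched earlier), and the timeout and retry counts are arbitrary.

```python
import subprocess

def run_with_retries(cmd, attempts=3, timeout_s=3600):
    """Run one production job, retrying on failure and killing hangs.

    Generic sketch only: the real GRAT bookkeeping (logging, cleanup of
    partial output, catalogue updates) is not reproduced here.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout_s,
                                    capture_output=True, text=True)
            if result.returncode == 0:
                return result.stdout                      # success
            print(f"attempt {attempt}: exit code {result.returncode}, retrying")
        except subprocess.TimeoutExpired:
            # A hung job: subprocess.run kills the local client on timeout,
            # but any remote job state would still need site-side cleanup.
            print(f"attempt {attempt}: timed out after {timeout_s}s")
    raise RuntimeError(f"job failed after {attempts} attempts: {cmd}")
```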

Slide 11: DC1 Phase II
- Provide data for high-level trigger studies
- Data analysis from production at BNL
- Unified set of grid tools for International ATLAS
  - Magda, Pacman, cook-book database, VO management
- There may still be divergences among grids (US / EU / NorduGrid)

Slide 12: ATLAS DC1 Phase 2
- Pile-up production:
  - Ongoing! Both grid- and non-grid-based
  - Add minimum-bias events to the DC1 Phase 1 sample
- Simulation in the U.S.
  - Higgs re-simulation - 50k events
  - SUSY simulation - 50k events
- Athena reconstruction of the complete DC1 Phase 2 sample
- Analysis / user access to data
  - Magda already provides access to ~30k catalogued DC1 files from/to many grid locations (need the ATLAS VO to make this universal)
  - Need higher-level (web-based?) tools to provide easy access for physicists
  - DIAL being developed at BNL: http://www.usatlas.bnl.gov/~dladams/dial/
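
Magda's role in the analysis bullets above is essentially that of a replica catalogue: given a logical file name it reports where physical copies live, so a job or user can fetch a nearby one. The toy lookup below illustrates only that concept; it is not Magda's actual interface, and every logical name, site and path in it is invented.

```python
# Toy replica catalogue illustrating, conceptually, what Magda provides:
# a logical file name mapped to one or more physical replica locations.
# Every name, site and path below is invented for illustration.
CATALOG = {
    "dc1.simul.example._00001.root": [
        ("BNL",  "/usatlas/dc1/simul/dc1.simul.example._00001.root"),
        ("LBNL", "/pdsf/atlas/dc1/dc1.simul.example._00001.root"),
    ],
}

def find_replica(logical_name, preferred_site=None):
    """Return one (site, path) replica for a logical file name."""
    replicas = CATALOG.get(logical_name)
    if not replicas:
        raise KeyError(f"no replicas catalogued for {logical_name}")
    for site, path in replicas:
        if site == preferred_site:
            return site, path
    return replicas[0]   # fall back to the first catalogued copy

print(find_replica("dc1.simul.example._00001.root", preferred_site="LBNL"))
```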

Slide 13: DC2 Specifics
- From new WBS (we will review the situation in early 2003)
- Test LCG-1 (software and hardware) - in particular POOL
- ATLAS tests
  - simulation
  - status of G4 (validation, hits, digits)
  - pile-up in Athena
  - relative role of G3/G4
  - calibration and alignment
  - detector description and EDM
  - distributed analysis

Slide 14: Integration
- Coordination with other grid efforts and software developers - a very difficult task!
- Project-centric:
  - Intl. ATLAS – K. De
  - GriPhyN/iVDGL – Rob Gardner (J. Schopf, CS contact)
  - PPDG – John Huth
  - LCG – John Huth (POB), V. White (GDB), L. Bauerdick (SC2)
  - EDG – Ed May, Jerry Gieraltowski
  - ATLAS/LHCb – Rich Baker
  - ATLAS/CMS – Kaushik De
  - ATLAS/D0 – Jae Yu
- Fabric/middleware-centric:
  - AFS software installations – Alex Undrus, Shane Canon, Iwona Sakrejda
  - Networking – Shawn McKee, Rob Gardner
  - Virtual and real data management – Wensheng Deng, Sasha Vaniachin, Pavel Nevski, David Malon, Rob Gardner, Dan Engh, Mike Wilde
  - Security / site AA / VO – Rich Baker, Dantong Yu

Slide 15: NLV Analysis Tool
[Screenshot of the NLV analysis tool plotting time vs. event name; the original slide annotates the interface elements (menu bar, legend, zoom window and playback controls, summary line, time axis).]

Slide 16: Grid Planning
- For Data Challenge 1, Phase 2: generation of High Level Trigger data, using a unified format for International ATLAS computing in a grid configuration
- For R&D purposes: increased integration with iVDGL/GriPhyN tools (VDT, Chimera), working with tools in common with US CMS
- Issue of divergences with EDG software and configurations

Slide 17: External Groups
- Trillium (GriPhyN, iVDGL, PPDG)
- EU initiatives
- LCG
- Intl. ATLAS
- U.S. CMS (large ITR pre-proposal)
- Other groups/experiments (HEP/LIGO/SDSS...)
- Funding agencies

Slide 18: Trillium Group
- PPDG (J. Huth)
  - Metadata catalog (Magda) (Wensheng Deng, BNL)
  - Interoperability, middleware evaluation (Jerry Gieraltowski, ANL)
  - Virtual organization, monitoring in the grid environment (Dantong Yu, BNL)
  - Distributed analysis (David Adams, BNL)
- iVDGL (R. Gardner)
  - Package deployment/installation (Pacman) (S. Youssef, BU) - adopted by VDT and CMS
  - Incorporation of VDT, Chimera in the next prototype round
  - Hardware support - prototype Tier 2s (Indiana, Boston University)
- NB: a tremendous amount of support comes from base efforts at the labs and universities (NetLogger – LBNL; GRAT – K. De, UTA; support – H. Severini, Oklahoma; S. McKee, Michigan; E. May, ANL; BNL, LBNL, BU)

Slide 19: EU Initiatives
- Linkages via
  - International ATLAS (using EDG tools/testbed)
  - LCG
  - ITR initiative
  - Middleware representatives (Foster/Kesselman/Livny/Gagliardi)
- Issues:
  - EDG mandate is larger than the LHC
  - Divergences in middleware, approaches
  - "Best in show" concept – David Foster (LCG)
  - Interoperability of testbeds

Slide 20: LCG
- More than just grids - the applications group is very large
- Linkages via
  - Intl. experiments
  - Middleware representatives
  - Architects' forum
  - Applications (T. Wenaus)
- Representation: L. Bauerdick (SC2), V. White (GDB), J. Huth (POB)
- Issues:
  - Communications about LCG-1
  - What are the requirements for Tier 1s?
  - Divergences in site requirements?
  - Linkage of Intl. experiments with LCG
  - DC1 Phase 2 versus LCG-1

Slide 21: International ATLAS
- Main focus: Data Challenges
  - G. Poulard (DC coordinator), A. Putzer (NCB chair)
- Linkages
  - National Computing Board (resources, planning) – J. Huth
  - Grid planning – R. Gardner
  - Technical Board – K. De
- Issues
  - Divergence in US-EU grid services is an issue
  - Commonality vs. interoperability
  - Probably the tightest coupling (nothing focuses the mind like data)
  - Middleware selection
  - Facilities planning

Slide 22: US CMS
- Commonality in many tools (Pacman, VDT, Chimera)
- Discussions in common forums (LCG, Trillium, CS groups)
- Interoperability tests
- Large ITR proposal:
  - Coordination with EDG, LCG, US CMS, CS community
  - Use existing management structure of US ATLAS + US CMS
  - Coordinating group
  - Physics-based goal of "private grids" for analysis groups
- Coordination among funding agencies, LCG, experiments
- Further funding initiatives coordinated (EGEE, DOE)

Slide 23: Funding Agencies
- Recent initiative to pull together the funding agencies involved in grids and HEP applications
  - Nov. 22nd meeting of NSF, DOE, computing, HEP, EU and CERN representatives
  - Follow-up date proposed: Feb. 7th
- November ITR workshop, with US CMS, US ATLAS, computer scientists and funding agency representatives
  - Follow-up: M. Kasemann
- Goal: coordinated funding of EU/US/CERN efforts on the experiments and grid middleware
- Beyond the large ITR:
  - Medium ITRs
  - DOE initiatives
  - EU funding
  - Question: UK, other funding

Slide 24: Other Groups
- Experiments
  - Direct linkage via GriPhyN, iVDGL: SDSS, LIGO
  - Contacts, discussions with D0, RHIC experiments
- Grid organizations
  - PPARC
  - CrossGrid
  - HICB
- Interoperability initiatives

Slide 25: US ATLAS Grid Organization
- Present status: WBS projection onto the grid topics of software and facilities
- Efforts have focused on the SC2002 demos and ATLAS production - near-term milestones
- I have held off until now on reorganizing the management structure because of the rapid evolution:
  - gaining experience with the US ATLAS testbed
  - creation of new organizations
  - shake-out of the LHC schedule
- We are now discussing the changes to the management structure needed to coordinate the efforts

Slide 26: Proposed New Structure
- Creation of a level-2 management slot for distributed computing applications
  - Interaction with the above list of groups (or delegate liaison)
- Three sub-tasks:
  - Architecture
    - Series of packages created for ATLAS production or prototyping
  - Components
    - Testing and support of grid deliverables (e.g. high-level ATLAS-specific interfaces)
    - Grid deliverables associated with IT initiatives
  - Production
    - Running Intl. ATLAS production on the US ATLAS fabric
    - Contributions to Intl. ATLAS production

Slide 27: (no text captured in the transcript)

Slide 28: Next Steps in Organization
- Bring this new structure to the US ATLAS Computing Coordination Board; seek advice on level-2 and level-3 managers
- Coordinate with "liaison" groups

Slide 29: Comments/Summary
- US ATLAS has made tremendous progress over the past year in testing grid production
  - SC2002 demo
  - Interoperability tests
  - Data Challenges
- Management of diverse sources of effort is challenging!
- Mechanisms have been started to coordinate further activities
  - Goal: do this without creating divergences or a proliferation of new groups
- New US ATLAS management structure under discussion

