ATLAS Computing on Grid3 and the OSG. Rob Gardner, University of Chicago. Mardi Gras Conference, February 4, 2005.


1 ATLAS Computing on Grid3 and the OSG Rob Gardner University of Chicago Mardi Gras Conference February 4, 2005

2 New Physics Discoveries with ATLAS. At CERN, Geneva, Switzerland: the Large Hadron Collider (LHC), the ATLAS detector, and the 27 km long LHC tunnel.

3

4 Scale of the detector: diameter 25 m, barrel toroid length 26 m, end-cap end-wall chamber span 46 m, overall weight 7000 tons.

5 ATLAS superimposed on a 5 story building at CERN

6

7 ATLAS Collaboration

8

9 The Higgs Particle. To understand the Higgs mechanism, imagine that a room full of physicists chattering quietly is like space filled with the Higgs field... A well-known scientist walks in, creating a disturbance as he moves across the room and attracting a cluster of admirers with each step... this increases his resistance to movement; in other words, he acquires mass, just like a particle moving through the Higgs field. If instead a rumor crosses the room, it creates the same kind of clustering, but this time among the scientists themselves; in this analogy, these clusters are the Higgs particles. Credit: the ATLAS Outreach Team and Prof. David J. Miller of University College London.

10 ATLAS

11 The Higgs Particle: Discovery. If we can start up at 1/10th of design luminosity, we will discover a Higgs with mass greater than 130 GeV within one year. One year at design luminosity will cover the entire theoretically allowed mass range.

12 New Physics. The Higgs mechanism for generating mass has a problem: it explains the masses of the known particles, but it develops a mathematical problem (a divergence) at high energies. To fix this, there must be new particles, and these new particles must show up at the energies we will explore at the LHC.

13 Supersymmetric Signatures We will discover supersymmetry if it is what stabilizes the Higgs mass. Dramatic event signatures mean we will discover it quickly.

14 [Figure: electronic channel counts for the ATLAS subsystems, read out at 40 MHz.]

15 Distributed Computing Centers. The ATLAS computing model is tiered: data flow from the detector (~PB/sec) through the Event Builder (10 GB/sec) to the Event Filter farm (~7.5 MSI2k), which writes ~320 MB/sec to the Tier 0 centre at CERN (~5 MSI2k, ~5 PB/year, no simulation). Tier 0 feeds the Tier 1 regional centres (US at BNL, UK at RAL, French and Dutch centres, and others) at roughly 75 MB/s per Tier 1 for ATLAS; each Tier 1 provides ~2 MSI2k and ~2 PB/year. Tier 2 centres (~200 kSI2k and ~200 TB/year each, connected by 622 Mb/s links) do the bulk of the simulation; each hosts ~20 physicists working on one or more channels and holds the full AOD, TAG and relevant physics-group summary data as a physics data cache. Below them sit Tier 3 facilities (~0.25 TIPS) and desktop workstations; for scale, a 2004 PC is ~1 kSpecInt2k.
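As a quick consistency check on these rates, the short Python sketch below converts the slide's nominal yearly volumes into average transfer rates; the volumes and link speed are the slide's own figures, and the arithmetic is the only thing added.

# Back-of-the-envelope check of the nominal data rates on this slide.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def avg_rate_mb_per_s(bytes_per_year: float) -> float:
    """Average transfer rate in MB/s implied by a yearly data volume."""
    return bytes_per_year / SECONDS_PER_YEAR / 1e6

tier1_volume = 2e15          # ~2 PB/year per Tier 1
tier2_volume = 200e12        # ~200 TB/year per Tier 2
link_622 = 622e6 / 8 / 1e6   # a 622 Mb/s link expressed in MB/s (~78 MB/s)

print(f"Tier 1 average: {avg_rate_mb_per_s(tier1_volume):.0f} MB/s "
      f"(slide quotes ~75 MB/s per T1 for ATLAS)")
print(f"Tier 2 average: {avg_rate_mb_per_s(tier2_volume):.1f} MB/s "
      f"on a {link_622:.0f} MB/s (622 Mb/s) link")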

16 US common Grid infrastructure: a collection of Grid services via the VDT and other providers.

17 Prototype Tier2 Center in 2004. The Tier2 prototype cluster sits on a private network behind public/Internet-facing service nodes: tier2-01 is the Grid3/OSG gatekeeper and Condor master, tier2-02 runs GridFTP, and tier2-03 runs SRM and GridFTP. The nodes tier2-u1, tier2-u2 and tier2-u3 serve interactive/local users for local job analysis; tier2-mgt is the Rocks frontend with the web server, Ganglia and MonALISA monitoring; se1 through se4 provide the Grid3 /tmp, /app and /data areas and the ATLAS storage element; a home server holds the home directories; and compute01 through compute64 are the worker nodes.
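The same layout, restated as a structured configuration in Python (node roles copied from this slide; a readability aid only, not an actual deployment file):

# Prototype Tier 2 node roles as described on this slide (illustrative only).
tier2_nodes = {
    "tier2-01":  ["Grid3/OSG gatekeeper", "Condor master"],
    "tier2-02":  ["GridFTP"],
    "tier2-03":  ["SRM", "GridFTP"],
    "tier2-u1":  ["local/interactive job analysis"],
    "tier2-u2":  ["local/interactive job analysis"],
    "tier2-u3":  ["local/interactive job analysis"],
    "tier2-mgt": ["Rocks frontend", "web server", "Ganglia", "MonALISA"],
    "se1-se4":   ["Grid3 /tmp, /app, /data", "ATLAS storage element"],
    "home":      ["home directories"],
    "compute01-compute64": ["worker nodes on the private network"],
}

for node, roles in tier2_nodes.items():
    print(f"{node:22s} {', '.join(roles)}")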

18 Simulation Framework for 3 Grids USATLAS GTS

19 System Architecture for US ATLAS. Windmill (the production supervisor) works from the ProdDB and hands jobs to the Capone executor, which uses Chimera, Pegasus and the Virtual Data Catalog (VDC) to plan them, and Condor-G (schedd plus GridManager) to submit them via GRAM to the compute elements and worker nodes of the Grid3 sites. Data move by gsiftp to the storage elements, with RLS providing replica lookup and DonQuijote handling data management, while MonALISA monitoring servers, GridCat and MDS supply monitoring and information services.
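To make the Condor-G leg of this picture concrete, here is a minimal sketch of the kind of Globus-universe submit description Condor-G consumed in that era; the gatekeeper host, jobmanager and file names are placeholders, not the actual DC2 configuration.

# Illustrative only: write a Condor-G (Globus universe) submit description of
# the general form used to reach a GT2 gatekeeper on a Grid3 site.
# Hostname, jobmanager and file names below are placeholders.
submit_description = """\
universe            = globus
globusscheduler     = gate.example.edu/jobmanager-condor
executable          = run_atlas_job.sh
transfer_executable = true
output              = capone_job.out
error               = capone_job.err
log                 = capone_job.log
queue
"""

with open("capone_job.sub", "w") as f:
    f.write(submit_description)
# The file would then be passed to condor_submit; the Condor-G GridManager
# talks GRAM to the site gatekeeper shown in the diagram.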

20 Capone Production on Grid3. Capone provides the ATLAS environment for Grid3 (VDT based): it accepts ATLAS DC2 jobs from Windmill, manages all steps in the job life cycle (prepare, submit, monitor, output and register), manages workload and data placement, processes messages from the production supervisor, provides useful logging information to the user, and communicates executor and job state information.
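A minimal sketch of that life cycle (stage names taken from this slide; the handler functions and reporting are placeholders, not Capone's real code):

# Drive one job through the life-cycle stages listed on this slide,
# reporting state back to the supervisor after each step. Illustrative only.
LIFECYCLE = ["prepare", "submit", "monitor", "output", "register"]

def report_state(job_id: str, stage: str, ok: bool) -> None:
    # Stand-in for the status messages sent back to Windmill.
    print(f"{job_id}: {stage} -> {'ok' if ok else 'error'}")

def run_job(job_id: str, handlers: dict) -> str:
    for stage in LIFECYCLE:
        ok = handlers[stage](job_id)
        report_state(job_id, stage, ok)
        if not ok:
            return f"failed at {stage}"
    return "done"

# Trivial handlers for demonstration:
handlers = {stage: (lambda _job: True) for stage in LIFECYCLE}
print(run_job("dc2-000001", handlers))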

21 Capone System Elements. GriPhyN Virtual Data System: the VDC is the catalog containing all US ATLAS transformations, derivations and job records. A transformation is a workflow that accepts input data (datasets) and parameters and produces output data (datasets); a derivation is a transformation whose formal parameters have been bound to actual parameters. Directed acyclic graphs (DAGs): the abstract DAG (DAX) is created by Chimera, with no reference to concrete elements in the grid; the concrete DAG (cDAG) is created by Pegasus, with CE, SE and PFNs assigned. Globus, RLS and Condor-G are all used.
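The transformation/derivation distinction can be shown with a small illustrative sketch; the classes and the ATLAS-like file names below are stand-ins, not Chimera's actual VDL API.

from dataclasses import dataclass, field

@dataclass
class Transformation:
    """A workflow step with formal (unbound) input/output parameters."""
    name: str
    formal_params: list

@dataclass
class Derivation:
    """A Transformation whose formal parameters are bound to actual values."""
    transformation: Transformation
    bindings: dict = field(default_factory=dict)

# Hypothetical Geant4 simulation step and one bound instance of it:
sim = Transformation("atlas_g4sim", ["input_events", "random_seed", "output_hits"])
job = Derivation(sim, {"input_events": "dc2.evgen._00042.pool.root",
                       "random_seed": "12345",
                       "output_hits": "dc2.simul._00042.pool.root"})

Chimera records derivations of this kind in the VDC and emits the abstract DAG (DAX) over logical names only; Pegasus then binds CEs, SEs and PFNs to produce the concrete DAG.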

22 Capone Architecture. Message interface: Web Service and Jabber. Translation layer: Windmill schema. CPE (Process Engine). Processes: Grid3 (GCE interface), Stub (local shell testing), and DonQuijote (future). The architecture diagram shows the corresponding layers: message protocols (Jabber, Web Service), translation (ADA, Windmill), process execution (PXE, CPE-FSM), and the Stub, Grid (GCE-Client) and DonQuijote process back ends.
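One way to picture the layering (message interface, translation of the Windmill schema, process engine) is the dispatch sketch below; the verbs, fields and class names are illustrative, not Capone's actual message schema.

# Illustrative dispatch mirroring the layers on this slide:
# message interface -> translation (Windmill schema) -> CPE (process engine).
def translate_windmill_message(raw: dict) -> dict:
    """Map a Windmill-schema message onto an internal request (made-up fields)."""
    return {"verb": raw["command"], "jobs": raw.get("jobspecs", [])}

class ProcessEngine:
    """Stand-in for the CPE that executes translated requests."""
    def handle(self, request: dict) -> str:
        if request["verb"] == "executejob":
            return f"submitting {len(request['jobs'])} job(s) via the Grid3/GCE process"
        if request["verb"] == "getstatus":
            return "reporting executor and job state"
        return f"unknown verb: {request['verb']}"

# A message arriving over the Web Service or Jabber interface:
raw = {"command": "executejob", "jobspecs": ["dc2-000001"]}
print(ProcessEngine().handle(translate_windmill_message(raw)))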

23 Effective Access for DC2: ~150K ATLAS jobs from July 1 to December 8, 2004; Grid3 sites with >100 successful DC2 jobs: 21.

24 U.S. ATLAS Grid Production (G. Poulard, 9/21/04). [Figure: number of validated jobs per day.] 3M Geant4 events produced for ATLAS, roughly 1/3 of International ATLAS; over 150K 20-hour jobs executed; competitive with the peer European grid projects LCG and NorduGrid.

25 Data Challenge Summary by Grid. LCG: included some non-ATLAS sites, and used the LCG-Grid-Canada interface. NorduGrid: Scandinavian resources plus sites in Australia, Germany, Slovenia and Switzerland. Grid3: used computing resources that are not dedicated to ATLAS.

26 Global Production by facility 69 sites ~ Jobs

27 Job Failure Analysis on Grid3. [Table: failure counts, cumulative through end of November 2004 and broken out for September 2004 and October–November 2004, for the categories submission, executable check, run-end, stage-out, RLS registration, Capone host interruption, Windmill, other, and total.] Not all failures are “equal” – some are more costly than others…

28 Summary and Outlook. ATLAS has made good use of Grid3 and other grid infrastructures, and will continue to accumulate lessons and experience for OSG. Challenges for 2005: address submit-host scalability issues and job recovery; focus on data management, policy-based scheduling, and user access; support on-going production while developing new services and adapting to changes in the infrastructure.

29 References and Acknowledgements. PPDG, GriPhyN and iVDGL Collaborations; US ATLAS Software and Computing; US ATLAS Grid Tools and Services; UC Prototype Tier 2. iVDGL: “The International Virtual Data Grid Laboratory”. Grid3: “Application Grid Laboratory for Science”. OSG: Open Science Grid Consortium.

30 A job in Capone (1, submission). Reception: job received from Windmill. Translation: un-marshalling, ATLAS transformation. DAX generation: Chimera generates the abstract DAG. Input file retrieval from the RLS catalog: check RLS for input LFNs (retrieval of GUID, PFN). Scheduling: CE and SE are chosen. Concrete DAG generation and submission: Pegasus creates the Condor submit files, and DAGMan is invoked to manage the remote steps.
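The submission path can be summarized as a short pipeline; in the sketch below every function is a trivial stub standing in for the corresponding component (Chimera, the RLS client, Pegasus, DAGMan), and all names, hosts and file names are placeholders rather than Capone's real modules.

# Illustrative walk-through of the submission steps on this slide.
def translate(msg):            return {"id": msg["id"], "lfns": msg["input_lfns"]}
def chimera_generate_dax(job): return f"dax-{job['id']}.xml"   # abstract DAG, logical files only
def rls_lookup(lfns):          return {lfn: f"gsiftp://se.example.edu/{lfn}" for lfn in lfns}
def schedule(job, pfns):       return ("ce.example.edu", "se.example.edu")
def pegasus_plan(dax, ce, se): return f"cdag-for-{dax}"        # concrete DAG: Condor submit files
def dagman_submit(cdag):       print("condor_submit_dag", cdag)

def submit_job(windmill_message):
    job  = translate(windmill_message)         # un-marshal the Windmill schema
    dax  = chimera_generate_dax(job)           # Chimera: abstract DAG (DAX)
    pfns = rls_lookup(job["lfns"])             # RLS: input LFN -> GUID/PFN
    ce, se = schedule(job, pfns)               # choose compute and storage elements
    dagman_submit(pegasus_plan(dax, ce, se))   # Pegasus plans, DAGMan manages remote steps
    return job["id"]

submit_job({"id": "dc2-000001", "input_lfns": ["dc2.evgen._00042.pool.root"]})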

31 A job in Capone (2, execution). Remote job running / status checking: stage-in of input files, creation of the POOL FileCatalog, Athena (ATLAS code) execution. Remote execution check: verification of output files and exit codes, recovery of metadata (GUID, MD5sum, exe attributes). Stage-out: transfer from the CE site to the destination SE. Output registration: registration of the output LFN/PFN and metadata in RLS. Finish: the job has completed successfully, and Capone communicates to Windmill that the job is ready for validation. Job status is sent to Windmill throughout execution; Windmill/DQ validate and register the output in ProdDB.
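A matching sketch for the execution and finish phase, again with stub names only (hosts, GUIDs and file names are placeholders, not Capone code):

# Illustrative stubs for the execution half of the job: status polling,
# output verification, stage-out and RLS registration, then notification
# of Windmill.
def poll_remote_status(job_id):  return "finished"      # Condor-G / DAGMan job status
def verify_outputs(job_id):      return {"lfn": "dc2.simul._00042.pool.root",
                                         "guid": "placeholder-guid",
                                         "md5": "placeholder-md5sum"}
def stage_out(meta, dest_se):    print("copy", meta["lfn"], "->", dest_se)
def rls_register(meta, dest_se): print("register", meta["lfn"], "at", dest_se)
def notify_windmill(job_id, s):  print(job_id, "->", s)

def finish_job(job_id, dest_se="se.example.edu"):
    if poll_remote_status(job_id) != "finished":
        notify_windmill(job_id, "running")               # status flows back during execution
        return
    meta = verify_outputs(job_id)                        # exit codes, GUID, MD5sum, attributes
    stage_out(meta, dest_se)                             # CE site -> destination SE
    rls_register(meta, dest_se)                          # output LFN/PFN + metadata into RLS
    notify_windmill(job_id, "ready for validation")      # Windmill/DQ then register in ProdDB

finish_job("dc2-000001")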