Accelerator Based Physics: ATLAS, CDF, CMS, DØ, STAR
Amber Boehnlein, FNAL
OSG Consortium Meeting, January 24, 2006

Particle Physics
- These five experiments are physics facilities with the intent of testing the Standard Model
- The questions:
  - What causes electroweak symmetry breaking?
  - Does Quantum Chromodynamics precisely describe the behavior of quarks and gluons?
  - What is the mechanism of CP violation?
  - What is the wave function of the proton? Of a heavy nucleus? ...
- What we measure:
  - The production and decay of particles and their associated properties
  - Cross sections, spectra (E, pT, eta, ...), angular distributions, particle correlations
  - The top quark mass and properties
  - Properties of the electroweak bosons
  - Flavor physics; mixing
  - ...
- What we seek:
  - The Higgs boson
  - SUSY and other new phenomena beyond the Standard Model

The Road to Physics
[Flow diagram: luminosity passes through software and computing to become physics]
- Raw Data: detector, trigger system, data acquisition, ...
- Calibration: pedestals, gains, linearity, ...
- Reconstruction (RECO): detector algorithms, particle identification, production farm, user-ready data format, ...
- Monte Carlo: event generation, Geant detector simulation, fast simulations, ...
- Physics Analysis: event selections, efficiencies & backgrounds, ...
- Supporting all of it: databases, network, releases, operation, data handling & access, trigger simulations

OSG is a road to Physics
[Same flow diagram, annotated with the experiments using OSG at each step]
- Raw Data (detector, trigger system, data acquisition): ATLAS, CMS, CDF, DØ, STAR
- Calibration: pedestals, gains, linearity, ...
- Reconstruction (RECO): DØ reconstruction from raw and from derived data; STAR
- Monte Carlo: ATLAS, CDF, CMS, DØ
- Physics Analysis: ATLAS, CMS, CDF, STAR

Implications
- Calibration database connectivity, via some mechanism, is essential for reconstruction
- "User" application code/macros are distributed as self-contained tarballs or as an advertised local installation of the code distribution
- Computations can be compute intensive
  - ALPGEN simulates multi-parton processes well, but is much slower than other standard packages
  - Flagship analysis: CDF estimates 84 GHz-years for top mass and cross section analyses (manipulating about 10 TB of data)
- Computations can be data intensive
  - Reconstruction jobs typically process GBs of input data and produce output at the GB scale
  - Runs cover terabytes of input data, clustered into datasets of hundreds of GB for bookkeeping purposes
  - Job management is shaped around this clustering, resulting in bursts of hundreds of local jobs submitted at the same time (see the sketch after this list)
  - Jobs typically run for several hours and typically require external network connectivity
  - For efficient storage, output files might require merging
- OSG provides a maturing infrastructure to run within this paradigm
  - Resources are made available via standard interfaces for job and data management
  - Operational issues remain, such as time synchronization for security and local scratch management
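The burst-and-merge workflow described above can be illustrated with a minimal sketch. All names here (list_dataset_files, submit_job, merge_outputs) are hypothetical placeholders for experiment-specific tooling, not SAM, CRAB, or Condor-G code; only the shape of the workflow follows the slide.

```python
"""Illustrative sketch of the burst-submission / output-merge pattern.

The helper names are hypothetical placeholders; the pattern is:
split a multi-hundred-GB dataset into file groups, submit a burst of
jobs (one per group), then merge the small outputs for storage.
"""

FILES_PER_JOB = 10   # group input files so each job reads a few GB

def list_dataset_files(dataset):
    # Placeholder: in practice the data handling system returns the
    # file list for a dataset of a few hundred GB.
    return [f"{dataset}/file_{i:04d}.raw" for i in range(500)]

def submit_job(job_id, input_files):
    # Placeholder for a grid submission (e.g. a Condor-G submit).
    print(f"submitting job {job_id} with {len(input_files)} input files")

def merge_outputs(output_files, merged_name):
    # Small per-job outputs are merged into a few large files for
    # efficient storage, as noted on the slide.
    print(f"merging {len(output_files)} outputs into {merged_name}")

files = list_dataset_files("example_dataset")
chunks = [files[i:i + FILES_PER_JOB] for i in range(0, len(files), FILES_PER_JOB)]

# Burst: hundreds of local jobs submitted at (roughly) the same time.
for job_id, chunk in enumerate(chunks):
    submit_job(job_id, chunk)

# Later, once the jobs finish:
merge_outputs([f"out_{j}.root" for j in range(len(chunks))], "merged_output.root")
```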

Operations
- CDF, DØ, STAR
  - Mature experiments accumulating ~1 PB/year, billions of events, millions of files, ...
  - Well established and stable applications
  - Anticipating upgrades in detectors and luminosity
  - All depend on distributed computing
- ATLAS, CMS
  - Use MC data challenges and test-beam data to test infrastructure and prepare for physics
  - Cosmic ray commissioning
  - Computing scales up dramatically compared to current experiments in all dimensions, including the number of collaborators
- My thanks to all those who contributed to this talk!

CDF Operational Modes
- OSG for MC production
  - Targeting other production-chain tasks, such as generating user-level ntuples
  - Condor-G submission
  - Self-contained tarball for production applications
  - DB access via a squid server or a connection to FNAL (see the sketch after this list)
- Pursuing user analysis using the "glide CAF"
  - Provides a familiar user environment
  - Investigating user-level mounting of a remote filesystem over HTTP, with local squid servers for caching, to provide the flexibility of the full CDF software distribution
  - Will rely on SAM for data handling
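A minimal sketch of the caching-proxy idea mentioned above: route read-only HTTP requests for calibration data through a site-local squid cache rather than hitting the central server directly. The proxy host and calibration URL below are invented placeholders, not CDF's actual endpoints.

```python
"""Sketch: read calibration data through a local caching HTTP proxy (squid).

The proxy address and calibration URL are hypothetical; the point is only
the pattern: identical read-only requests from many grid jobs are served
from the site-local cache instead of the central database front end.
"""
import urllib.request
from urllib.error import URLError

SITE_SQUID = "http://squid.example-site.edu:3128"                     # hypothetical local squid
CALIB_URL = "http://calib-frontend.example.org/run/204567/pedestals"  # hypothetical endpoint

proxy_handler = urllib.request.ProxyHandler({"http": SITE_SQUID})
opener = urllib.request.build_opener(proxy_handler)

try:
    with opener.open(CALIB_URL, timeout=60) as response:
        pedestals = response.read()   # squid caches this for the next job at the site
    print(f"fetched {len(pedestals)} bytes of calibration data via the site squid")
except URLError as err:
    print(f"calibration fetch failed (placeholder hosts): {err}")
```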

CMS Operations
- CMS relies on OSG for two significant activities
  - Centralized production of simulated events in the US
    - CMS performs both opportunistic submission to non-CMS sites and centralized submission by a dedicated team to US-CMS sites
  - Remote submission of user simulation on the US-CMS Tier-2 sites
    - User submission of jobs to access data published as being available at the site
- CMS Simulated Event Production
  - Over the last 4 years CMS has successfully submitted simulated-production jobs to distributed computing sites using ever-improving grid middleware
  - CMS-dedicated infrastructure initially, followed by Grid3, followed by OSG
  - In 5 months in 2006 CMS expects to generate 50M events for the next challenge; the OSG share is 15M-20M
- CMS Analysis Activities
  - During the Worldwide LCG (WLCG) service challenge, CMS submitted analysis jobs to access local data
  - Thousands of jobs, tens of TB of data accessed
  - During the challenge, only dedicated expert users
  - The next step will include normal users

CMS Simulation
- Submitted centrally from UFL by a dedicated team
  - Adds up to 1 FTE of effort spread over three people
- In a relatively quiet period for CMS over the final quarter of 2005, CMS ran 5M events with three processing steps on OSG resources
  - Represents about 40 CPU-years of computing
- During the ramp for DC04, CMS utilized several hundred years of CPU
  - More than 100 years of opportunistic resources
- CMS expects to generate a simulated sample roughly the size of the raw data at the start of running
  - The US-CMS contribution is roughly 30% of this
  - 800 TB per year of simulation by the start of high-luminosity running

CMS Analysis Experience
- In Service Challenge 3, CMS ran over 18k jobs on OSG-connected Tier-2 resources
  - Completed 14k jobs; the total data read was ~20 TB
  - Data were preloaded at the sites using PhEDEx
  - Submission and completion efficiency still need to be improved
  - Many of the failures were uniquely attributable to CMS
- This was the first large-scale analysis attempt for CMS on OSG
- Extending OSG analysis to the whole collaboration and improving the user experience are part of the 2006 program of work

STAR Operations
- SUMS based (STAR Unified Meta-Scheduler)
  - A high-level user JDL describes the task, the code needed, and the dataset; SUMS submits to appropriate sites depending on user resource requirements or hints (an illustrative sketch follows this list)
  - Software assumed installed; input transferred using GRAM input (archive/tarball); output transferred using GRAM output
  - Integrated cataloging possible via RRS (Replica Registration Service), making this fully automated
- MC
  - All MC is SUMS based; MC jobs only, with nightly tests (QA) moved to the Grid
  - PACMAN packages available for STAR software for one OS (Linux)
  - Use an archive sandbox for the specific codes [mostly used]
  - Assumes DB connectivity and outbound connections
  - More recently: SRM transfer of output
  - Job submission is Condor-G based
- Plan to migrate all MC to OSG
  - Offload from the Tier-0 and Tier-1 centers to ANY resource
  - Allow Tier-2 sites to submit R&D simulations (RHIC-II detector simulation)
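To illustrate the "describe the task, let the meta-scheduler pick the site" pattern, here is a minimal, hypothetical sketch. It is not actual SUMS JDL syntax; the field names and the submit function are invented for illustration.

```python
"""Sketch of a SUMS-style high-level job description.

The user states *what* to run, on *which* dataset, plus resource hints,
and a meta-scheduler decides *where* the jobs go. Field names and
`submit_to_best_site` are hypothetical, not the real SUMS interface.
"""

job_description = {
    "task": "star_mc_qa",                        # what the user wants to run
    "code": "star_software_archive.tar.gz",      # archive/tarball shipped with the job
    "dataset": "catalog://star/mc/pp200_sample", # resolved via the file catalog
    "events_per_job": 1000,
    "hints": {                                   # user resource requirements / hints
        "needs_db_connectivity": True,
        "needs_outbound_network": True,
    },
}

def submit_to_best_site(description, sites):
    """Pick the first site advertising every capability the job hints at."""
    needed = [k for k, v in description["hints"].items() if v is True]
    for site in sites:
        if all(site["capabilities"].get(k, False) for k in needed):
            print(f"submitting {description['task']} to {site['name']} (Condor-G)")
            return site["name"]
    raise RuntimeError("no site matches the resource hints")

sites = [
    {"name": "tier2.example.edu",
     "capabilities": {"needs_db_connectivity": True, "needs_outbound_network": True}},
]
submit_to_best_site(job_description, sites)
```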

STAR Analysis Experience
- STAR has very positive user analysis experience, with ~10K jobs per user
  - User analysis is currently "expert" only
  - STAR has a strong incentive to encourage generic users; users are already severely constrained
  - Opportunistic computing for user analysis makes more sense at this stage (jobs are smaller, and their time and input can adapt to even the smallest site)
  - RHIC-II running will require more resources
- Generic user analysis would require:
  - Data moved/relocated/managed on demand (in the background)
  - A mechanism to locate "hot" datasets
  - SE-enabled sites and an asynchronous CPU / data transfer mechanism (like SRM now)
  - An RRS-like service, essential for automating data mining and registration on arrival (immediate access and exploitation)
- Concern: user needs may mismatch the available QoS and "help desk" support; OSG is our best hope

DØ Operational Modes
- DØ depends on distributed computing for MC and production-chain activities
  - Uses SAMGrid to submit jobs; SAMGrid can broker jobs or forward them
  - Data handling via SAM; datasets delivered to local cache
  - Self-contained tarball distributed via SAM
  - DB access via proxy servers
- Next steps will be towards targeted ID activities, such as jet energy scale determination to improve the systematic error
  - M(top) = … ± 3.0 (stat) ± 3.2 (JES) ± 1.7 (other) GeV (see the quadrature note below)
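For context, combining the quoted uncertainty components in quadrature (a simplifying assumption of uncorrelated terms) gives a total of roughly 4.7 GeV, of which the jet energy scale is the largest single contribution:

\sqrt{3.0^2 + 3.2^2 + 1.7^2}\,\text{GeV} = \sqrt{22.1}\,\text{GeV} \approx 4.7\,\text{GeV}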

DØ Operations
- Monte Carlo production
- Reprocessing
  - Improved tracking, EM calorimeter calibration
  - ~1 B event effort using 4000 GHz CPU equivalents for 9 months at 12 sites (3 OSG sites); a back-of-envelope sketch of the scale follows this list
  - Would have taken ~5 years on FNAL DØ dedicated resources
  - Calibration DB access via proxy servers
- Refixing
  - DØ applied a new hadronic calorimeter calibration as post-processing on FNAL dedicated analysis resources, found a problem, and is doing so again
  - Six-week target using remote facilities
  - Fixed some skims for immediate use
  - QCD sample processed on the CMS farm (an OSG site)
  - The full effort is ramping up; CPU needs are on the same scale as reprocessing
  - Moving aggressively to use 1000 GHz equivalents on OSG!
- Every DØ publication depends on Grid computing
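A rough back-of-envelope check of the reprocessing scale quoted above, using only the slide's figures (~1 B events, 4000 GHz equivalents, 9 months) and assuming continuous running:

```python
"""Back-of-envelope check of the DZero reprocessing scale quoted on the slide.

Only the slide's numbers are used; everything derived here is an
order-of-magnitude estimate assuming continuous running.
"""
events = 1.0e9                     # ~1 B events
capacity_ghz = 4000.0              # 4000 GHz CPU equivalents
months = 9
seconds = months * 30 * 24 * 3600  # ~2.3e7 s

ghz_seconds_total = capacity_ghz * seconds
ghz_seconds_per_event = ghz_seconds_total / events

print(f"total compute : {ghz_seconds_total:.2e} GHz-seconds")
print(f"per event     : {ghz_seconds_per_event:.0f} GHz-seconds "
      f"(~{ghz_seconds_per_event:.0f} s on a 1 GHz CPU)")
# -> roughly 90-100 GHz-seconds per event, i.e. on the order of a minute
#    per event on the CPUs of that era.
```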

ATLAS Production Runs
[Table: worldwide and U.S. jobs (k) and events (M), plus the percentage of U.S. jobs done by the three U.S. Tier-2 sites, for Data Challenge 2 (DC2) and the Rome Physics Workshop production; the numeric entries did not survive transcription]
- The U.S. Tier-2 role was critical to the success of ATLAS production
- Over 400 physicists attended the Rome workshop; 100 papers were presented based on the data produced during DC2 and Rome production
- The U.S. provided resources on an appropriate scale for U.S. physicists (60k CPU-days, >50 TB of data) and provided leadership roles in the organization of the challenges, in key software development, and in production operations
- Production during DC2 and Rome established a hardened Grid3 infrastructure benefiting all participants in Grid3

Next ATLAS Production
- Formerly DC3, now Computing System Commissioning
- Simulate 10^7 events (same order as DC2)
- Full software commissioning: calibration and alignment
- Will need ~2000 CPUs in the U.S. continuously in 2006
  - OSG opportunistic resources will provide an important part of this capacity
- Started last week

ATLAS MC Analysis
- Running ALPGEN is possible with OSG resources

Conclusions
- OSG is providing a progressively more mature infrastructure
- Increased use is leading to positive feedback from the perspective of both users and providers of middleware and facilities
- The accelerator-based experiments are relying on OSG to deliver their physics programs