BaBar Computing
Gregory Dubois-Felsmann, SLAC, BaBar Computing Coordinator
SLAC DOE High Energy Physics Program Review, 7 June 2006

Slide 2: Overview

- The BaBar Computing Task
  - Data Volume
  - The Data Model
  - Distributed Computing Infrastructure
  - SLAC-Provided Resources
- Data Processing Pipeline
  - Online
  - Calibration, Reconstruction
  - Skimming
  - Simulation
  - Data Distribution
  - Data Access
  - Use of Grid Technologies
- Current Activities
- Future Plans

Slide 3: The BaBar Computing Task

What must be accomplished:
- Accumulate physics data from the detector
  - Acquire it, filter it, and record it while monitoring its quality
  - Calibrate and reconstruct it
- Generate corresponding simulated data
  - Using the recorded history of the condition of the detector...
  - Generate and reconstruct simulated events (globally distributed)
- Divide all the data into skims for convenient access
- Distribute it to many sites
- Provide an analysis environment
  - A software environment to be run at all sites
  - Substantial physical resources at major sites

Slide 4: Data Volume

Rates:
- Output from the detector:
  - Typically ~… events/s at the present ~1x10^34 luminosity
  - Recently ~33-40 kB/event (BaBar custom binary format)
  - ~… events/s selected by the Level 3 trigger (~1/7)
- Output of reconstruction:
  - ~1/3 of events selected as "AllEvents" data for analysis
  - ~3-3.5 kB "micro" & ~… kB "mini" (separate files)

Bulk size averaged over the history of the experiment (see the sketch after this slide):
- Micro: 44 GB/fb^-1; Mini: 81 GB/fb^-1

Simulation:
- Larger: events include truth data; BBbar generated at a 3x multiple
- Micro: 89 GB/fb^-1; Mini: 126 GB/fb^-1
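The per-luminosity figures above translate directly into bulk storage estimates. Below is a minimal Python sketch of that arithmetic; the 44/81/89/126 GB/fb^-1 constants are the ones quoted on this slide, while the 350 fb^-1 integrated luminosity is only an illustrative input, not a number from the talk.

```python
# Back-of-the-envelope data-volume estimate from the per-luminosity
# figures quoted above (GB per fb^-1).  The integrated luminosity used
# in the example is hypothetical.
SIZES_GB_PER_FB = {
    ("data", "micro"): 44,
    ("data", "mini"): 81,
    ("simulation", "micro"): 89,
    ("simulation", "mini"): 126,
}

def bulk_size_tb(integrated_lumi_fb):
    """Return estimated bulk sizes in TB for each (sample, component)."""
    return {key: gb_per_fb * integrated_lumi_fb / 1024.0
            for key, gb_per_fb in SIZES_GB_PER_FB.items()}

if __name__ == "__main__":
    lumi = 350.0  # fb^-1, illustrative only
    for (sample, component), size_tb in bulk_size_tb(lumi).items():
        print(f"{sample:10s} {component:5s}: ~{size_tb:6.1f} TB")
```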

Slide 5: The Data Model

After Computing Model 2 (CM2) re-engineering:
- Reconstructed (beam & simulated) data:
  - ROOT trees, composed of BaBar-specific data objects
  - Optionally striped across multiple files by component (micro, mini, truth, ...); see the sketch after this slide
  - Copies of events can be made by value or by reference (selectable by component)
  - ~140 TB by the end of this year
- Data skimmed for convenience of use & distribution
  - Currently ~215 skims (170 "deep copy", 35 pointer)
    - Pointer skims used for the largest selection fractions
    - Deep-copy skims can be exported to small sites
  - Convenience comes at a cost: the total size of the skims is about 5.1x larger than the micro
  - ~300 TB by the end of this year
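To illustrate the "striping by component" idea, here is a hedged PyROOT sketch: micro and mini quantities live in separate files but are read together event by event as friend trees. The file, tree, and branch names are invented for illustration; the real CM2 event store is a C++ framework with its own I/O layer, not this toy.

```python
# Sketch of "striping by component": micro and mini quantities live in
# separate ROOT files but can be read together event-by-event.
# File, tree, and branch names below are invented for illustration.
import ROOT

micro_file = ROOT.TFile.Open("run12345-micro.root")
micro = micro_file.Get("microTree")

# Attach the mini component as a "friend" tree: same event ordering,
# different file, so analyses that need only the micro never touch it.
micro.AddFriend("miniTree", "run12345-mini.root")

for event in micro:
    # Branches from either component are now visible on one object,
    # e.g. a micro quantity together with a mini quantity.
    pass
```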

Slide 6: Distributed Computing Infrastructure

BaBar computing resources are distributed:
- SLAC
- Four "Tier A" centers supporting both central production and user analysis
  - IN2P3 (Lyon, France)
  - RAL (Rutherford Lab, UK)
  - INFN (Padova and CNAF/Bologna, Italy)
  - GridKa (Karlsruhe, Germany)
- A total of up to ~40 laboratory and university sites running the BaBar simulation
- "Tier C" sites ranging from large departmental clusters to users' laptops

Slide 7: Tier-A Computing Resource Commitments

Proportions of commitments...
[Pie charts: 2006 MOU CPU and 2006 MOU Disk commitments]

Slide 8: SLAC and Tier A Resources

Some numbers: 2006 commitments
[Table: CPU (SLAC CPU-weeks) and disk (TB = 2^40 bytes) commitments for IN2P3, INFN, UK, GridKa, SLAC, and the total]

Funding for SLAC BaBar computing hardware is shared between DOE and the collaborating national agencies:
- The International Finance Committee mechanism supervises this
- Offsetting credit is given for national contributions to Tier A centers

Slide 9: Predicted Evolution of Computing Budget

[Chart: predicted evolution of the computing budget, SLAC and total]
The dip in the budget is caused by the difference between the anticipated and delivered luminosity in 2005.

Slide 10: Installed Computing Capacity

[Charts: installed computing capacity, SLAC and total]

Slide 11: SLAC-Provided Staffing

[Chart: SLAC staffing for the B Factory (from D. MacFarlane's presentation at the 2006 Operations Review)]

Slide 12: Data Processing Pipeline

- Online
- Calibration, Reconstruction
- Skimming
- Simulation
- Data Distribution

Slide 13: Online

Online computing group now mostly from SLAC:
- Management as well as most development and operational staffing

The online system provides:
- Data acquisition: front-end readout, feature extraction, event building, software triggering (Level 3), data logging (a conceptual sketch of this dataflow follows the slide)
  - Rates and volumes as mentioned above
- Data quality monitoring
- Detector operation (slow control and run control)
- Configuration and operational status and conditions databases

Infrastructure:
- O(100) computers (compute, file, database, console servers, workstations, ...)
- Gigabit Ethernet networks for event building, the file service backbone, and the link to SCCS; switched 100 Mb for the rest of the systems
- Release management, user environment for development

Logged data are ultimately transferred to HPSS at SCCS.
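As a purely conceptual illustration of the data-acquisition chain listed above (readout, feature extraction, event building, Level 3 triggering, logging), the Python sketch below models each stage as a generator. Every name, the toy event content, and the random stand-in for the Level 3 decision are assumptions; the real system is a set of distributed C++ processes running on O(100) machines.

```python
# Conceptual sketch of the online dataflow: front-end readout ->
# feature extraction -> event building -> Level 3 trigger -> logging.
# Each stage is a generator here; in reality these are separate
# distributed processes.  All names and values are illustrative.
import random

def read_front_ends(n_events):
    for i in range(n_events):
        yield {"id": i, "fragments": [random.random() for _ in range(4)]}

def extract_features(events):
    for ev in events:
        ev["energy_sum"] = sum(ev["fragments"])
        yield ev

def build_events(events):
    # Event building: fragments from many crates assembled over the
    # gigabit network in the real system; a pass-through in this toy.
    yield from events

def level3_trigger(events, accept_fraction=1.0 / 7.0):
    for ev in events:
        if random.random() < accept_fraction:  # stand-in for L3 algorithms
            yield ev

def log_events(events):
    return sum(1 for _ in events)  # stand-in for writing to disk/HPSS

if __name__ == "__main__":
    n_logged = log_events(level3_trigger(build_events(
        extract_features(read_front_ends(10_000)))))
    print(f"logged {n_logged} of 10000 events")
```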

Slide 14: Tier A Centers in Calibration and Reconstruction

- Fast calibration pass at SLAC
- Data sent via network to Padova for reconstruction
- Data reconstructed and returned via network to SLAC
[Map: SLAC, RAL, Karlsruhe, Padova/Bologna, Lyon]

Slide 15: Calibration, Reconstruction

Calibration:
- Calibrations for a run are needed before full reconstruction
  - Some timing and beam position parameter change rates require this
- Based on a small subset of simple, clean events
  - Bhabhas, dimuons, ...
  - Selected at an approximately constant, luminosity-independent rate (~7 Hz) to provide sufficient events (currently 2-3% of L3)
  - Selected in the online system and written to a separate file
- Processed by the full reconstruction executable
  - Single-threaded as constants are fed forward ("rolling"); see the sketch after this slide
- Possible to execute on a subset of the SLAC CPU farm
  - Typically … nodes
  - Additional subfarms maintained for reprocessing and testing
- Normally completed within hours after completion of DAQ
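The "rolling" calibration is inherently sequential: the constants derived from one run seed the processing of the next, which is why this pass is single-threaded. A minimal Python sketch of that feed-forward structure follows; the run contents, the 'beam_y' constant, and the update rule are invented placeholders, not BaBar's actual calibration algorithms.

```python
# Sketch of the feed-forward ("rolling") calibration pass: constants
# derived from each run are the starting point for the next run, so
# runs must be processed strictly in order.  The update rule and the
# run contents are invented placeholders.
def calibrate_run(run_events, constants):
    """Return updated constants derived from one run's calibration events."""
    updated = dict(constants)
    mean_y = sum(ev["y"] for ev in run_events) / len(run_events)
    # Placeholder update: nudge a beam-position parameter toward this run.
    updated["beam_y"] = 0.9 * constants["beam_y"] + 0.1 * mean_y
    return updated

def rolling_calibration(runs, initial_constants):
    """Process runs sequentially, feeding constants forward run to run."""
    constants = initial_constants
    history = []
    for run_events in runs:  # no run can start before the previous finishes
        constants = calibrate_run(run_events, constants)
        history.append(constants)
    return history

if __name__ == "__main__":
    runs = [[{"y": 0.1 * r + 0.01 * i} for i in range(10)] for r in range(3)]
    for consts in rolling_calibration(runs, {"beam_y": 0.0}):
        print(consts)
```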

Slide 16: Calibration, Reconstruction

Reconstruction:
- Raw data are exported continuously to the Padova site
  - Permanently archived there as a redundant copy
- Constants resulting from the calibration pass are exported in batches
- Multiple runs can be processed in parallel (see the sketch after this slide)
  - No feed-forward dependencies: full event independence
  - Allows use of multiple, smaller farms and alleviates scaling problems
- Plenty of capacity for reprocessing
  - Current maximum is ~2 fb^-1/day (cf. the record to date of … fb^-1 in … hours)
  - Overcapacity planned to be shifted to skimming shortly
- The full reconstruction executable applies filtering
  - Reduces the processing power required as well as the output sample size
  - Retains ~1/3 of the input events from Level 3
  - Output in ROOT files in the CM2 data format
- Typically completed in ~1.5 days, but very tolerant of outages
- Outputs exported continuously back to SLAC
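Because the constants are already fixed when reconstruction starts, runs (and events) are fully independent and can be farmed out freely, which is what allows several smaller farms to be used. The toy Python sketch below illustrates that independence with a process pool; reconstruct_run() and its numbers are stand-ins for the real C++ executable, with only the ~1/3 retention taken from this slide.

```python
# Toy illustration of why reconstruction scales across multiple farms:
# once calibration constants are fixed, every run is independent, so a
# simple process pool can fan the work out.  reconstruct_run() is a
# stand-in for the real full-reconstruction executable; the ~1/3
# retention mimics the filtering described above.
from multiprocessing import Pool

def reconstruct_run(run_number):
    n_input = 100_000        # placeholder number of L3 events in the run
    n_kept = n_input // 3    # filter retains roughly 1/3 of the events
    return run_number, n_kept

if __name__ == "__main__":
    runs = range(2000, 2016)
    with Pool(processes=4) as pool:  # several small farms, not one big one
        for run, kept in pool.map(reconstruct_run, runs):
            print(f"run {run}: kept {kept} events")
```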

Slide 17: Skimming

Data skimmed for convenience of use & distribution:
- Currently ~215 skims (170 "deep copy", 35 pointer); see the sketch after this slide
  - Pointer skims used for the largest selection fractions
  - Deep-copy skims can be exported to small sites
- Convenience comes at a cost: the total size of the skims is about 5.1x larger than the micro
  - ~300 TB by the end of this year

Skim cycles 3-4 times/year:
- Want to keep latency low to enable new analyses to start quickly
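In ROOT terms, the two skim flavours can be pictured roughly as follows: a deep-copy skim writes the selected events into a new, self-contained file (cheap to export, expensive in disk space), while a pointer skim records only which entries of the parent dataset were selected. The PyROOT sketch below is a hedged illustration with invented file, tree, and branch names and an invented selection cut; the real skims are produced by BaBar's C++ framework, not by this toy.

```python
# Sketch of deep-copy vs pointer skims in PyROOT terms.  All names and
# the selection cut are invented for illustration.
import ROOT

parent_file = ROOT.TFile.Open("allevents-micro.root")
parent = parent_file.Get("microTree")

# Deep-copy skim: clone the tree structure, then fill it with selected events.
deep_file = ROOT.TFile("skim-deepcopy.root", "RECREATE")
deep_skim = parent.CloneTree(0)

# Pointer skim: just an entry list referring back to the parent dataset.
pointer_skim = ROOT.TEntryList("BFlavSkim", "illustrative pointer skim")

for i in range(parent.GetEntries()):
    parent.GetEntry(i)
    if parent.nTracks > 4:       # invented selection cut
        deep_skim.Fill()
        pointer_skim.Enter(i)

deep_file.Write()
deep_file.Close()

pointer_file = ROOT.TFile("skim-pointer.root", "RECREATE")
pointer_skim.Write()
pointer_file.Close()
```

A deep-copy skim like this can be shipped to a small site on its own, whereas the entry list is only useful alongside the parent dataset, which is why pointer skims are reserved for the largest selection fractions.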

Slide 18: Simulation

Full resimulation pass under way:
- GEANT4 v6-based core
- Distributed over a network of universities and labs
- All data returned to SLAC for skimming
- GridKa for uds continuum
- Distributed to sites by AWG assignment

Slide 19: Data Distribution to Tier A Computing Centers

- Data skims are uniquely assigned to Tier A centers (see the sketch after this slide)
- Disk space for the corresponding Analysis Working Group is located at the same Tier A
[Map: SLAC, RAL, Karlsruhe, Padova/Bologna, Lyon; maximum transfer rates for IN2P3, RAL, GRIDKA, CNAF of 3 Tb/day, 2 Tb/day, and 1 Tb/day]
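The routing rule described above can be pictured as a simple lookup: each skim belongs to an Analysis Working Group (AWG), and each AWG's disk lives at exactly one Tier A centre. The Python sketch below illustrates that mechanism only; the AWG names, skim names, and AWG-to-site mapping are invented examples, not BaBar's real assignments.

```python
# Sketch of the unique skim-to-site routing: every skim is owned by one
# AWG, and every AWG's disk lives at one Tier A centre.  The mappings
# below are purely illustrative.
AWG_HOME_SITE = {
    "charm": "IN2P3 (Lyon)",
    "charmonium": "RAL",
    "breco": "GridKa (Karlsruhe)",
    "tau-qed": "INFN (Padova/CNAF)",
}

SKIM_OWNER_AWG = {
    "DstarSkim": "charm",
    "JpsiKsSkim": "charmonium",
    "BSemiExclSkim": "breco",
    "TauToEGammaSkim": "tau-qed",
}

def destination_site(skim_name):
    """Return the single Tier A centre that receives the given skim."""
    return AWG_HOME_SITE[SKIM_OWNER_AWG[skim_name]]

if __name__ == "__main__":
    for skim in SKIM_OWNER_AWG:
        print(f"{skim:18s} -> {destination_site(skim)}")
```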

Slide 20: Tier A assignments

Slide 21: Use of Grid Technologies

Grid technologies are beginning to be used to support BaBar.

Established: simulation
- ~25-30 Mevents/week of the total ~200 Mevents/week capacity is provided by Grid systems in the UK, Italy, and Canada
- Planning to increase utilization in the future, especially when the Objectivity phase-out is complete

Nearing operational status: skimming
- Developing the capability to skim simulated data on the Grid
- UK "Tier B" resources: Manchester, 2000 nodes; possibly others later

Not expecting to provide general user analysis support.

Slide 22: Current Activities

Reprocessing:
- In the final stages of completing a full reprocessing of all data
  - First all-CM2 processing cycle
  - Reprocessing essentially complete
  - Generated the corresponding ~7.3 billion simulated events
  - Finishing the last ~1% of skimming

Data-taking:
- In the midst of delivering June 1 data to analysts by next week to meet internal ICHEP deadlines
- Additional data-taking through about August 21

Skimming:
- Starting two new skim cycles, one now and one in ~2-3 months, with a test cycle starting now

Slide 23: Future Plans (through end of data)

Online farm upgrade:
- Current online farm (worker nodes and event-build switch):
  - Supports the Level 3 trigger and online data quality monitoring
  - Reaching the end of its hardware lifetime cycle
  - Likely to reach its processing capacity limit around BaBar's highest luminosities
- The IFC (1/2006) approved an online farm upgrade ($400K)
  - New high-capacity network switch, all gigabit Ethernet
  - O(50) current-model AMD Opteron dual-CPU, dual-core 1U workers
  - Miscellaneous server and disk capacity improvements
- Will install during the upcoming shutdown
  - Switch acquisition in progress
  - Farm node acquisition linked with the next round of CPU acquisitions for the SLAC computing center

Slide 24: Future Plans (through end of data)

Completion of the phase-out of Objectivity:
- Non-event-store applications
- Configuration, Ambient, and Calibration (Spatial, Temporal) databases all migrated to the ROOT data format and the ROOT (and mySQL) database framework
  - Will switch over to the non-Objectivity versions in the 2006 shutdown
- Conditions database partially migrated
  - All conditions needed for data analysis are available
    - Preparing to put this into production
  - Remaining conditions still require significant engineering effort
    - Required for simulation, reconstruction, and online
    - On a tight timetable to still be able to put it into production for Run 6
    - Additional engineering effort (physicist or computing professional) would be helpful

Slide 25: Future Plans (through end of data)

Replacement of the mass storage system at SLAC:
- The current STK multi-silo system has a ~15-year history
  - Regular upgrades to drives and control software
  - Finally reaching the end of its support lifetime
- Replacement is needed
  - For BaBar's needs alone, as well as for SLAC's further projects
- Planning to move to the STK … series system
  - Acquisition of the first silo late this year or early next
  - Still thinking about whether to recopy all data to the latest high-capacity tapes
    - Would probably take O(1 year) to interleave with other demands on the mass storage system

Slide 26: Future Plans (after end of data)

- Expect ~two years of full-scale analysis effort with extensive central computing activities
  - Planning to maintain 3-4 skim cycles/year to keep latency low for new analyses
  - Need to accommodate well-motivated extensions of the filter to look at classes of events not originally envisioned
  - Simulation capability must be maintained
  - Planning for the possibility of one post-shutdown full reprocessing cycle if sufficiently compelling improvements in reconstruction are developed
    - A full reprocessing would require a full resimulation pass
  - Should all be doable with the infrastructure in place at the end of data
    - Assumes retention of resources at Tier-A sites and the network of simulation hosts
- Continued funding will be needed to
  - Purchase tapes to store the output of skimming and user-generated datasets
  - Replace hardware on its ordinary ~4-year lifetime cycle
- Expect to support a substantially lesser effort for several more years
  - Will need to think about long-term archiving of datasets