Object Orientation & Other New Experiences at the Tevatron for Run II Experiments
Wyatt Merritt, CHEP 2000, 7 Feb 2000

The Experiments (DØ and CDF):
- Significant restructuring of software & computing for Run II

                     DØ            CDF
  Trigger rates      50 Hz         75 Hz
  Event size         250 kB        250 kB
  Data storage       ~300 TB/yr    ~450 TB/yr
  [Run I storage     ~60 TB        ~40 TB]

The Computing Division:
- Greater involvement with planning and a more formal role in reviewing the experiments

Common choices
- Moved to the C++ language for reconstruction code -- and chose a common (3rd-party) compiler!
- Moved to a common release tool (SRT)
- Moved to common C++ libraries of utilities: ZOOM and CLHEP
- Did NOT move to a commercial object database for event storage
- Moved to a commercial RDBMS for event/file/calibration cataloging
- Chose ROOT as an end-game analysis tool
- Using GEANT 3.21 as the full simulation tool
- Moved to drop VMS and include Linux -- a mix of large central systems and workstations on desktops
- Using a common hardware procurement process (even though some details of system architecture differ); same choice of robot & central system
- Enables joint packages; leverages Lab support
(Related CHEP 2000 contributions: F175, A282, A272, E248)

Differing choices
- Amount of legacy Fortran included (in reconstruction)
  - For CDF, originally 70% but now 30% and falling
  - For DØ, none
- Data storage format
  - DØ uses the EVPACK format (evolved from DSPACK)
  - CDF will use ROOT I/O
- Data handling systems
  - CDF's philosophy: networked disk, local tape drives, no event-level or process bookkeeping
  - DØ's philosophy: local disk, networked tape drives, and a large effort to make bookkeeping a serious tool for global optimization of data access
- Program framework
- Event data classes
Note: common choices win, 9 to 5.
(Related CHEP 2000 contributions: C241, E176, C366, C367, C368; DØ: A230; CDF: C201)

Experiences: Education
- Fermilab CD arranged C++ and OOAD classes from well-qualified computer science instructors
- Early differences
  - DØ: emphasis on formal classes
  - CDF: emphasis on good references, web communication
- Both may have converged to the usual state of user-to-user transference?
- Bottom line, though, is that both experiments have retrained a substantial community, but by no means all of their Run II users
- Doing better would be a big effort
  - Both experiments are always resource-limited when it comes to people; training and communication projects tend to be at the end of the line after the very early period

Experiences: Development Environments & Tools
- Quest for a standards-compliant compiler
  - The state of C++ compilers at the time was a BIG problem
  - We both chose the KAI compiler, with a much better approximation to standards compliance than native compilers or gcc (fortunately, it was available early on for Linux)
- Bringing in third-party products
  - Open Inventor for KAI commissioned by the Run II project
  - Debugging complicates the issue -- not a good experience
- How many platforms is too many?
  - Run II has 2 offline platforms (IRIX and Linux) and 1 compiler for both platforms -- different SWITCH combinations alone mean that 2 × 6-20 different ZOOM & ROOT libraries are built (and tested) for Run II
  - DØ uses NT for its Level 3 platform: an additional complication for the release system

Experiences: Language and Design
- Physical design
  - Importance very clear for making working releases. It would be really nice if the release system could provide more tools, more help for physical design: layered releases are on the wish list
- General C++ design
  - Have an expert look over the design before starting to write: a plea from our OO experts to get first crack!
  - Portability is an issue with good and bad sides
    - From a ZOOM developer: porting to ONE different compiler finds enough code problems to be well worth the effort
  - Design and code reviews become a must
    - For the most part, reviews have been welcomed by developers
  - Memory management is very difficult for ex-Fortran programmers to master (current reconstruction still very sloppy; see the sketch below)
  - From an L3 filter meeting: "I'm coding the xxx; it's going much more quickly than I thought, thanks to the beauty of C++ which lets me reuse all the code from yyy."
(Related CHEP 2000 contribution: A245)
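The memory-management point above is the classic pitfall for programmers coming from Fortran's static allocation: raw new/delete pairs that leak whenever a routine returns early or throws. A minimal sketch, not code from either experiment (the Hit type and function names are invented), contrasting the leak-prone style with the RAII style that C++ offers:

    #include <cstddef>
    #include <memory>
    #include <vector>

    struct Hit { double energy; };

    // Leak-prone style often seen in code converted from Fortran:
    // the early return skips the delete, and an exception would too.
    double sumEnergyLeaky(const std::vector<Hit>& hits) {
        double* total = new double(0.0);
        if (hits.empty()) return 0.0;                 // leaks *total
        for (std::size_t i = 0; i < hits.size(); ++i) *total += hits[i].energy;
        double result = *total;
        delete total;
        return result;
    }

    // RAII style: ownership lives in an object whose destructor cleans up,
    // so every exit path is safe. (In the Run II era this meant std::auto_ptr
    // or plain stack/container storage; today it is std::unique_ptr.)
    double sumEnergySafe(const std::vector<Hit>& hits) {
        std::unique_ptr<double> total(new double(0.0));
        if (hits.empty()) return 0.0;                 // destructor releases the memory
        for (std::size_t i = 0; i < hits.size(); ++i) *total += hits[i].energy;
        return *total;
    }

The second version has no delete to forget, which is exactly what a reconstruction job looping over millions of events needs.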

Experiences: Language and Design
- Operational C++ infrastructure
  - Event Data Models
    - DØ: EDM
    - CDF: TRYBOS to EDM2 (1st release in use by the collaboration)
  - Frameworks
    - CDF: AC++ (shared with BaBar, now diverging due to the stability requirement from BaBar as a running experiment)
    - DØ: framework
  - Management of algorithm-defining parameters using a database: RCP
    - Being used now by DØ, tested for use by CDF (see the sketch below)
- Learning to manage infrastructure changes is a big piece of making the systems successful as a whole.
  - DØ RCP change: 3-week disruption
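The RCP idea, as described above, is to keep algorithm-defining parameters in a database rather than hard-coded in the reconstruction program. The sketch below is purely illustrative and assumes nothing about the real DØ RCP interface: the ParamSet and JetFinder classes and the parameter names are invented, and the map stands in for values that would really be fetched from the database.

    #include <map>
    #include <stdexcept>
    #include <string>

    // Hypothetical stand-in for a database-backed parameter set (not the DØ RCP API).
    class ParamSet {
    public:
        explicit ParamSet(const std::map<std::string, double>& values) : values_(values) {}

        double getDouble(const std::string& key) const {
            std::map<std::string, double>::const_iterator it = values_.find(key);
            if (it == values_.end()) throw std::runtime_error("missing parameter: " + key);
            return it->second;
        }

    private:
        std::map<std::string, double> values_;
    };

    // An algorithm configures itself from the named set; changing a cut means
    // storing new parameters, not editing and recompiling the algorithm.
    class JetFinder {
    public:
        explicit JetFinder(const ParamSet& pars)
            : coneSize_(pars.getDouble("coneSize")),
              seedEt_(pars.getDouble("seedEt")) {}

        double coneSize() const { return coneSize_; }
        double seedEt() const { return seedEt_; }

    private:
        double coneSize_;
        double seedEt_;
    };

    int main() {
        std::map<std::string, double> stored;   // in real life: fetched from the database
        stored["coneSize"] = 0.7;
        stored["seedEt"] = 1.0;
        ParamSet pars(stored);
        JetFinder finder(pars);                 // cuts come from data, not from code
        return finder.coneSize() > 0 ? 0 : 1;
    }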

Experiences: Modularity
- Switching in different external packages and toolkits
  - I/O format (DØ has an ORACLE/msql option implemented; CDF has switched between YBOS and ROOT formats)
  - HepTuple: a clean interface beneath which HBOOK and ROOT can be switched (see the sketch below)
  - Graphics: not demonstrated yet, but possible in principle
- Switching algorithms
  - Examples: CLHEP random number generators, jet algorithms
- Toward a more modular ROOT
  - A strong request from the Run II Joint Project led to greater modularity in the ROOT architecture, allowing less heavyweight use of its pieces, such as I/O, without dragging in graphics packages, etc.
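The HepTuple role described above is the standard abstract-interface pattern: analysis code books and fills ntuples through a small interface, and the concrete HBOOK or ROOT backend is chosen separately. The sketch below is illustrative only and is not the actual HepTuple API; the NTuple interface and the printing backend are invented, and a real backend would forward the calls to HBOOK routines or to a ROOT TTree.

    #include <iostream>
    #include <string>

    // Abstract ntuple-filling interface: analysis code sees only this.
    class NTuple {
    public:
        virtual ~NTuple() {}
        virtual void fillColumn(const std::string& name, double value) = 0;
        virtual void commitRow() = 0;
    };

    // One concrete backend per persistency package. A real implementation would
    // call HBOOK or ROOT; this stub just prints what it is asked to store.
    class PrintingNTuple : public NTuple {
    public:
        void fillColumn(const std::string& name, double value) {
            std::cout << name << " = " << value << "  ";
        }
        void commitRow() { std::cout << "<end of row>\n"; }
    };

    // Physics code depends only on the interface, so swapping HBOOK for ROOT
    // means swapping the concrete class, not touching the analysis.
    void analyzeEvent(NTuple& nt, double mass, double pt) {
        nt.fillColumn("mass", mass);
        nt.fillColumn("pt", pt);
        nt.commitRow();
    }

    int main() {
        PrintingNTuple nt;
        analyzeEvent(nt, 91.2, 35.0);
        return 0;
    }

The same pattern is behind the other switching items on this slide: as long as physics code talks only to the interface, the choice of implementation becomes a build or configuration decision.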

Experiences: The Joint Project & Reviews
- Jan 96: discussion of potential Joint Working Groups; listed 10 potential areas for common solutions
- Finished with 5 areas of significant joint effort including both experiments:
  - Configuration management -- working, but could certainly be improved; biggest need: more tools to help with management of physical dependency
  - Software tools -- could use more FTEs than we can spare
  - Support databases -- successful implementation of ORACLE for the MCCs
  - Farm management -- very much in common
  - Physics analysis software -- good leverage of lab support
- Two areas with some common support -- Simulation, Visualization
- Joint hardware procurement process -- very successful
- Two data handling projects, each joint between CD and one of the experiments
(Related CHEP 2000 contribution: E191)

Experiences: The Joint Project & Reviews
- Bi-yearly reviews from Jun 97 to Jun 99
  - Validated the scope for hardware budgets and personnel requests -- very important in this role
  - Evolved from reviewing Joint status to (Joint + Experiment) status
  - Pointed out critical needs and gave some leverage for getting them addressed
    - Hiring OO expertise was a direct outcome of these reviews
  - A valuable checkpoint and stimulus for progress
- The continuation of the Joint Project: an operational phase?
  - The final piece of the exercise is an operations plan for both experiments and for the Joint Project -- much work still to be done here
(Related CHEP 2000 contribution: A67)

Experiences: Hardware Integration -- DØ
- Operational: 1/3 of its central CPU, all the robot towers (though not with final tape drives), 1/5 of the final farm nodes, the farm I/O node, two database servers, the big network switches (not with final routing)
- Central CPU configured as a 48-processor production system (in use since last summer) and a 16-processor test system
- Robot configured with 1 side for users, 1 side for tests
- Being used both in explicit tests of the design and in the Monte Carlo Challenge activity (which also includes offsite Monte Carlo production facilities)
- Current status: full network performance from 3 Fast Ethernets to 1 Gigabit demonstrated on d0test; farm throughput demonstrated on the 50-node system; user load on the central system at 70% of capacity; stress tests of robot storage ramping up

Experiences: Hardware Integration -- CDF
- 64-processor O2000, database server, 4 robot towers (not with final tape drives), 50 farm nodes (I/O node on order), 2 Network Appliance NFS file servers
- Being used in the MDC (the farm used was the earlier 14-node prototype)
- Central CPU released to users this month
- Current status: rate tests to start in April; testing in coordination with online

CDF Mock Data Challenge
- Generated 500K events at LBL (~100 GB); transferred over the network and stored in the robot
- Exercised the chain from Level 3 nodes into the robot store
  - Not using the FiberChannel link yet; used an alternate path
  - Continuity test, not a rate test
- Data moved from the robot to the prototype farm and reconstructed; output streamed and stored in the robot
- Will begin the analysis phase next week
- Goals: continuity of the full chain, a high-volume test for reconstruction, look for design flaws and assess the current state of the software systems
- Rate tests start April 2000
- MDC-II (rate tests + full L3, production, data handling): mid-May 2000
(Related CHEP 2000 contribution: E70)

DØ Monte Carlo Challenge
- Phase 1:
  - Dec 98 - Jan 99: production of 90K events
  - May 99: reconstruction of 90K events
  - Test of the prototype farm; small-scale test for SAM
- Phase 2:
  - Nov 99 - Jan 00: production of 500K events (FNAL: 240K, Lyon: 210K, Prague: 20K, NIKHEF: 30K)
    - Test of remote production capability: network import!
    - Large-scale test of SAM: almost 2 TB of data stored
  - Jan 00 - Feb 00: reconstruction of 500K events
    - Large-scale test of the farm: using 50 nodes
  - Feb 00 - Mar 00: analysis phase
    - Goal is feedback on physics performance as well as reco & MC
- Online tests of data logging into the robot are underway
(Related CHEP 2000 contributions: E311, E60)

Experiences: DØ Results
[Figures: a Z event from the DØ MCC; an Open Inventor geometry display]

Experiences: CDF Results
[Figures: SVX/ISL event display; efficiency from ttbar MC shows little falloff out to η ≈ 1.8, pT ≈ 400 MeV]

Are There Lessons to Be Learned?
- Commonality of needs for infrastructure vs. divergence of tastes, interests, timescales: not everything that could be done in common will be, but effort saved in a few areas is still worthwhile
- Common choice of compiler and release system enables joint work
  - e.g. the development of RCP
- Make infrastructure first
  - Do it early to enable development, but don't rule out redesign
- Pay attention to physical design
- Develop mechanisms for both little changes and big changes
  - If you plan for big changes, they are NOT too disabling to be contemplated
  - Release strategy plays a big part

Are We There Yet?
- Yes, we have successfully built large C++ systems
  - CDF: 1.3 million lines of code
  - DØ: 285 CVS packages
  - Will the larger community find them highly usable or barely usable?
- Yes, we are building data handling systems that approach LHC sizes
  - PB storage capacity (per experiment) will be available
  - Data movements of > 1 TB/day demonstrated with ENSTORE
  - The DØ farm has seen 15 MB/s data flow
  - CDF has exercised the full online-offline chain, L3 to reconstruction
- Yes, we are keeping attention on integration and operation
  - ... and this is already paying off! A remark I hear frequently from members of both experiments: "I'm glad we are finding this out now and not a year from now!"