Object Orientation & Other New Experiences at the Tevatron for Run II Experiments
Wyatt Merritt, CHEP 2000, 7 February 2000 (presentation transcript)

Slide 1: Object Orientation & Other New Experiences at the Tevatron for Run II Experiments
The Experiments (DØ and CDF): significant restructuring of software & computing for Run II

                     DØ            CDF
  Trigger rate       50 Hz         75 Hz
  Event size         250 kB        250 kB
  Data storage       ~300 TB/yr    ~450 TB/yr
  [Run I storage     ~60 TB        ~40 TB]

The Computing Division: greater involvement with planning and a more formal role in reviewing the experiments
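As a rough consistency check on these volumes (my arithmetic; the live time is inferred from the quoted numbers, not stated on the slide), the yearly storage follows from the trigger rate and event size assuming about 2.4 x 10^7 seconds of data taking per year:

  DØ:  50 Hz x 250 kB ≈ 12.5 MB/s;   12.5 MB/s  x 2.4e7 s ≈ 300 TB/yr
  CDF: 75 Hz x 250 kB ≈ 18.75 MB/s;  18.75 MB/s x 2.4e7 s ≈ 450 TB/yr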

Slide 2: Common choices
- Moved to the C++ language for reconstruction code, and chose a common (3rd-party) compiler!
- Moved to a common release tool (SRT)
- Moved to common C++ libraries of utilities: ZOOM and CLHEP
- Did NOT move to a commercial object database for event storage
- Moved to a commercial RDBMS for event/file/calibration cataloging
- Chose ROOT as an end-game analysis tool
- Using GEANT 3.21 as the full simulation tool
- Dropped VMS and moved to include Linux: a mix of large central systems and desktop workstations
- Using a common hardware procurement process (even though some details of system architecture differ); same choice of robot & central system
These common choices enable joint packages and leverage Lab support.
(F175, A282, A272, E248)

Slide 3: Differing choices
- Amount of legacy Fortran included (in reconstruction)
  - For CDF, originally 70%, now 30% and falling
  - For DØ, none
- Data storage format
  - DØ uses the EVPACK format (evolved from DSPACK)
  - CDF will use ROOT I/O
- Data handling systems
  - CDF's philosophy: networked disk, local tape drives, no event-level or process bookkeeping
  - DØ's philosophy: local disk, networked tape drives, and a large effort to make bookkeeping a serious tool for global optimization of data access
- Program framework
- Event data classes
Note that common choices win, 9 to 5.
(C241, E176, C366, C367, C368; DØ: A230, CDF: C201)

Slide 4: Experiences: Education
- Fermilab CD arranged C++ and OOAD classes from well-qualified computer science instructors
- Early differences:
  - DØ: emphasis on formal classes
  - CDF: emphasis on good references and web communication
  - Both may have converged to the usual state of user-to-user transference?
- The bottom line, though, is that both experiments have retrained a substantial community, but by no means all of their Run II users
- Doing better would be a big effort
  - Both experiments are always resource-limited when it comes to people; training and communication projects tend to be at the end of the line after the very early period

Slide 5: Experiences: Development Environments & Tools
- Quest for a standards-compliant compiler
  - The state of C++ compilers in 1997-98 was a BIG problem
  - Both experiments chose the KAI compiler, with a much better approximation to standards compliance than the native compilers or gcc (fortunately, it was available early on for Linux)
- Bringing in third-party products
  - Open Inventor for KAI was commissioned by the Run II project
  - Debugging complicates the issue; not a good experience
- How many platforms is too many?
  - Run II has 2 offline platforms (IRIX and Linux) and 1 compiler for both platforms; the different switch combinations alone mean that 2 x (6-20) different ZOOM & ROOT libraries are built (and tested) for Run II
  - DØ uses NT for its Level 3 platform: an additional complication for the release system

Slide 6: Experiences: Language and Design
- Physical design
  - Importance very clear for making working releases. It would be really nice if the release system could provide more tools and more help for physical design; layered releases are on the wish list.
- General C++ design
  - Have an expert look over the design before starting to write: a plea from our OO experts to get first crack!
  - Portability is an issue with good and bad sides. From a ZOOM developer: porting to ONE different compiler finds enough code problems to be well worth the effort.
  - Design and code reviews become a must. For the most part, reviews have been welcomed by developers.
  - Memory management is very difficult for ex-Fortran programmers to master (the current reconstruction is still very sloppy); a small illustrative sketch follows this slide.
  - From a L3 filter meeting: "I'm coding the xxx; it's going much more quickly than I thought, thanks to the beauty of C++ which lets me reuse all the code from yyy."
(A245)
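To illustrate the memory-management point, here is a generic C++98-era sketch (my own example, not code from either experiment): the usual trap for programmers coming from Fortran is returning raw heap pointers with no clear owner, and making ownership explicit removes most of the ambiguity.

```cpp
#include <memory>

// Illustrative only -- not DØ or CDF code.  In Fortran, common blocks and
// bank systems meant the programmer rarely owned raw storage; in C++ every
// `new` needs exactly one matching `delete` on every code path.

struct Track { double pt; };

// Problematic style: the caller cannot tell from the signature that it now
// owns the Track, and a forgotten delete (or an exception) leaks it.
Track* fitTrackRaw() {
    Track* t = new Track;
    t->pt = 0.0;
    return t;            // who calls delete?
}

// Ownership made explicit (C++98): std::auto_ptr documents that the caller
// takes ownership and deletes the Track automatically at scope exit.
std::auto_ptr<Track> fitTrack() {
    std::auto_ptr<Track> t(new Track);
    t->pt = 0.0;
    return t;
}

int main() {
    std::auto_ptr<Track> trk = fitTrack();   // freed automatically
    Track* raw = fitTrackRaw();
    delete raw;                              // easy to forget -> memory leak
    return 0;
}
```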

Slide 7: Experiences: Language and Design
- Operational C++ infrastructure
  - Event Data Models
    - DØ: EDM
    - CDF: TRYBOS to EDM2 (1st release in use by the collaboration)
  - Frameworks
    - CDF: AC++ (shared with BaBar, now diverging due to the stability requirement from BaBar as a running experiment)
    - DØ: framework
  - Management of algorithm-defining parameters using a database
    - RCP: being used now by DØ, tested for use by CDF (a sketch of the general idea follows this slide)
- Learning to manage infrastructure changes is a big piece of making the systems successful as a whole.
  - DØ RCP change: a 3-week disruption
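As a rough sketch of the idea behind database-managed algorithm parameters (class and method names here are hypothetical; this is not the actual DØ RCP interface), an algorithm pulls its cuts from a named parameter set instead of hard-coding them, so the values can live in, and be versioned by, the database:

```cpp
#include <iostream>
#include <map>
#include <string>

// Hypothetical parameter set: in a real system it would be filled from the
// parameter database rather than by hand.
class ParameterSet {
public:
    void set(const std::string& key, double value) { d_[key] = value; }
    double getDouble(const std::string& key, double def) const {
        std::map<std::string, double>::const_iterator it = d_.find(key);
        return it == d_.end() ? def : it->second;
    }
private:
    std::map<std::string, double> d_;
};

// An algorithm takes its defining parameters from the set at construction
// time, so changing a cut means changing the database, not the code.
class JetFinder {
public:
    explicit JetFinder(const ParameterSet& pset)
        : coneSize_(pset.getDouble("ConeSize", 0.7)),
          seedEt_(pset.getDouble("SeedEt", 1.0)) {}
    void print() const {
        std::cout << "cone=" << coneSize_ << " seedEt=" << seedEt_ << "\n";
    }
private:
    double coneSize_;
    double seedEt_;
};

int main() {
    ParameterSet pset;               // stand-in for a database lookup
    pset.set("ConeSize", 0.5);
    JetFinder finder(pset);
    finder.print();
    return 0;
}
```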

Slide 8: Experiences: Modularity
- Switching in different external packages and toolkits
  - I/O format (DØ has an ORACLE/msql option implemented; CDF has switched between YBOS and ROOT formats)
  - HepTuple: a clean interface beneath which HBOOK and ROOT can be switched (see the interface sketch after this slide)
  - Graphics: not demonstrated yet, but possible in principle
- Switching algorithms
  - Examples: CLHEP random number generators, jet algorithms
- Toward a more modular ROOT
  - A strong request from the Run II Joint Project led to greater modularity in the ROOT architecture, allowing less heavyweight use of its pieces, such as I/O, without dragging in graphics packages, etc.
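The switching idea can be illustrated schematically (the class names below are invented; the real HepTuple API differs): analysis code is written against an abstract ntuple interface, so an HBOOK-backed or ROOT-backed implementation can be chosen without touching the analysis code.

```cpp
#include <iostream>
#include <string>

// Abstract ntuple interface; analysis code depends only on this.
class NtupleManager {
public:
    virtual ~NtupleManager() {}
    virtual void fill(const std::string& name, double value) = 0;
};

class HbookNtupleManager : public NtupleManager {
public:
    void fill(const std::string& name, double value) {
        // A real backend would call the HBOOK filling routines here.
        std::cout << "HBOOK fill " << name << " = " << value << "\n";
    }
};

class RootNtupleManager : public NtupleManager {
public:
    void fill(const std::string& name, double value) {
        // A real backend would fill a ROOT TTree branch here.
        std::cout << "ROOT fill " << name << " = " << value << "\n";
    }
};

// Analysis code never mentions the concrete backend.
void analyzeEvent(NtupleManager& nt, double jetEt) {
    nt.fill("jetEt", jetEt);
}

int main() {
    RootNtupleManager backend;   // swap in HbookNtupleManager at will
    analyzeEvent(backend, 42.5);
    return 0;
}
```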

Slide 9: Experiences: The Joint Project & Reviews
- Jan 96: discussion of potential Joint Working Groups
  - Listed 10 potential areas for common solutions
  - Finished with 5 areas of significant joint effort involving both experiments:
    - Configuration management: working, but could certainly be improved; biggest need is more tools to help with management of physical dependency
    - Software tools: could use more FTEs than we can spare
    - Support databases: successful implementation of ORACLE for MCCs
    - Farm management: very much in common
    - Physics analysis software: good leverage of lab support
  - Two areas with some common support: simulation, visualization
- Joint hardware procurement process: very successful
- Two data handling projects, each joint between CD and one of the experiments
(E191)

Slide 10: Experiences: The Joint Project & Reviews
- Bi-yearly reviews from Jun 97 to Jun 99
  - Validated the scope for hardware budgets and personnel requests; very important in this role
  - Evolved from reviewing Joint status to (Joint + Experiment) status
  - Pointed out critical needs and gave some leverage for getting them addressed
    - Hiring OO expertise was a direct outcome of these reviews
  - A valuable checkpoint and a stimulus for progress
- The continuation of the Joint Project: an operational phase?
  - The final piece of the exercise is an operations plan for both experiments and for the Joint Project; much work still to be done here
(A67)

Slide 11: Experiences: Hardware Integration (DØ)
- Operational: 1/3 of its central CPU, all the robot towers (though not with final tape drives), 1/5 of the final farm nodes, the farm I/O node, two database servers, and the big network switches (not with final routing)
- Central CPU configured as a 48-processor production system (in use since last summer) plus a 16-processor test system
- Robot configured with 1 side for users, 1 side for tests
- Being used both in explicit tests of the design and in the Monte Carlo Challenge activity (which also includes offsite Monte Carlo production facilities)
- Current status: full network performance from 3 Fast Ethernets to 1 Gigabit demonstrated on d0test; farm throughput demonstrated on a 50-node system; user load on the central system at 70% of capacity; stress tests of robot storage ramping up

Slide 12: Experiences: Hardware Integration (CDF)
- 64-processor O2000, database server, 4 robot towers (not with final tape drives), 50 farm nodes (I/O node on order), 2 Network Appliance NFS file servers
- Being used in the MDC (the farm used was the earlier 14-node prototype)
- Central CPU released to users this month
- Current status: rate tests to start April 1, 2000; testing in coordination with online

Slide 13: CDF Mock Data Challenge
- Generated 500K events at LBL (~100 GB; see the per-event arithmetic after this slide); transferred over the network and stored in the robot
- Exercised the chain from Level 3 nodes into the robot store
  - Not using the Fibre Channel link yet; used an alternate path
  - A continuity test, not a rate test
- Data moved from the robot to the prototype farm and reconstructed; output streamed and stored in the robot
- Will begin the analysis phase next week
- Goals: continuity of the full chain, a high-volume test of reconstruction, looking for design flaws, and assessing the current state of the software systems
- Rate tests start April 2000
- MDC-II (rate tests + full L3, production, data handling): mid-May 2000
(E70)
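A quick unit check on that sample size (my arithmetic, not from the slide): ~100 GB for 500K events corresponds to roughly 200 kB per event, comparable to the ~250 kB Run II event size quoted on slide 1.

  100 GB / 500,000 events ≈ 200 kB/event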

Slide 14: DØ Monte Carlo Challenge
- Phase 1:
  - Dec 98 - Jan 99: production of 90K events
  - May 99: reconstruction of 90K events
  - Test of the prototype farm; small-scale test of SAM
- Phase 2:
  - Nov 99 - Jan 00: production of 500K events (FNAL: 240K, Lyon: 210K, Prague: 20K, NIKHEF: 30K)
    - Test of remote production capability: network import!
    - Large-scale test of SAM: almost 2 TB of data stored
  - Jan 00 - Feb 00: reconstruction of 500K events
    - Large-scale test of the farm, using 50 nodes
  - Feb 00 - Mar 00: analysis phase
    - Goal is feedback on physics performance as well as on reconstruction and MC
- Online tests of data logging into the robot are underway
(E311, E60)

Slide 15: Experiences: DØ Results
[Figures: a Z candidate event from the DØ MCC; an Open Inventor geometry display]

Slide 16: Experiences: CDF Results
[Figures: SVX/ISL event display; efficiency from ttbar MC, showing little falloff out to |η| ≈ 1.8 and pT ≈ 400 MeV]

Slide 17: Are There Lessons to Be Learned?
- Commonality of needs for infrastructure vs. divergence of tastes, interests, and timescales: not everything that could be done in common will be, but the effort saved in a few areas is still worthwhile
- A common choice of compiler and release system enables joint work (the development of RCP, for example)
- Make infrastructure first: do it early to enable development, but don't rule out redesign
- Pay attention to physical design
- Develop mechanisms for both little changes and big changes
  - If you plan for big changes, they are NOT too disabling to be contemplated
  - The release strategy plays a big part

Slide 18: Are We There Yet?
- Yes, we have successfully built large C++ systems
  - CDF: 1.3 million lines of code
  - DØ: 285 CVS packages
  - Will the larger community find them highly usable or barely usable?
- Yes, we are building data handling systems that approach LHC sizes
  - 0.75-1.0 PB of storage capacity (per experiment) will be available
  - Data movements of > 1 TB/day demonstrated with ENSTORE
  - The DØ farm has seen a 15 MB/s data flow (see the consistency check after this slide)
  - CDF has exercised the full online-offline chain, L3 to reconstruction
- Yes, we are keeping attention on integration and operation
  - ...and this is already paying off! A remark I hear frequently from members of both experiments: "I'm glad we are finding this out now and not a year from now!"
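A quick consistency check on those throughput figures (my arithmetic, not from the slide): a sustained 15 MB/s over a full day is

  15 MB/s x 86,400 s/day ≈ 1.3 TB/day,

in line with the > 1 TB/day ENSTORE demonstration.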

