7 Feb 2000, Wyatt Merritt, CHEP 2000
Object Orientation & Other New Experiences at the Tevatron for Run II Experiments
The Experiments (DØ and CDF): Significant restructuring of software & computing for Run II
                   DØ            CDF
  Trigger rates    50 Hz         75 Hz
  Event size       250 kB        250 kB
  Data storage     ~300 TB/yr    ~450 TB/yr
  [Run I storage   ~60 TB        ~40 TB]
The Computing Division: Greater involvement with planning and a more formal role in reviewing the experiments
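As a rough cross-check of the trigger rate, event size and storage figures above: the raw data rate is trigger rate times event size, and dividing the quoted yearly storage by that rate gives the seconds of data-taking per year that the figures imply. The sketch below assumes decimal units (1 kB = 10^3 bytes, 1 TB = 10^12 bytes) and treats the "~" values as exact. The implied live time is not stated on the slide, and the function names are illustrative only.

```python
# Back-of-envelope check of the Run II figures quoted on the slide.
# Assumptions (not from the slide): decimal units, "~" values taken as exact.

KB = 1e3   # bytes per kB
MB = 1e6   # bytes per MB
TB = 1e12  # bytes per TB


def raw_rate_mb_per_s(trigger_hz, event_kb):
    """Raw data rate in MB/s: trigger rate times event size."""
    return trigger_hz * event_kb * KB / MB


def implied_live_seconds(storage_tb_per_yr, trigger_hz, event_kb):
    """Seconds of data-taking per year implied by the yearly storage figure."""
    bytes_per_s = trigger_hz * event_kb * KB
    return storage_tb_per_yr * TB / bytes_per_s


experiments = {
    "D0": {"trigger_hz": 50, "event_kb": 250, "storage_tb_per_yr": 300},
    "CDF": {"trigger_hz": 75, "event_kb": 250, "storage_tb_per_yr": 450},
}

for name, p in experiments.items():
    rate = raw_rate_mb_per_s(p["trigger_hz"], p["event_kb"])
    live = implied_live_seconds(
        p["storage_tb_per_yr"], p["trigger_hz"], p["event_kb"]
    )
    print(f"{name}: {rate:.2f} MB/s raw, implied live time {live:.2e} s/yr")
```

Under these assumptions both experiments come out at the same implied live time, about 2.4e7 seconds per year, so the two storage estimates are consistent with each other.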
Common choices
- Moved to the C++ language for reconstruction code and chose a common (3rd party) compiler!
- Moved to a common release tool (SRT)
- Moved to common C++ libraries of utilities: ZOOM and CLHEP
- Did NOT move to commercial object data base for event storage
- Moved to commercial RDBMS for event/file/calibration cataloging
- Chose ROOT as an end-game analysis tool
- Using GEANT 3.21 as full simulation tool
- Moved to drop VMS and include Linux -- mix of large central systems and workstations on desktops
- Using a common hardware procurement process (even though some details of system architecture differ); same choice of robot & central system
Enables joint packages
Leverages Lab support
F175 A282 A272 E248
Differing choices
- Amount of legacy Fortran included (in reconstruction)
  - For CDF, originally 70% but now 30% and falling
  - For DØ, none
- Data storage format
  - DØ uses EVPACK format (evolved from DSPACK)
  - CDF will use ROOT I/O
- Data handling systems
  - CDF’s philosophy is networked disk, local tape drives, no event-level or process bookkeeping
  - DØ’s philosophy is local disk, networked tape drives, large effort to make bookkeeping a serious tool for global optimization of data access
- Program framework
- Event data classes
Note common choices win: 9 to 5
C241 E176 C366 C367 C368
DØ: A230  CDF: C201
Experiences: Education
- Fermilab CD arranged C++ and OOAD classes from well-qualified Computer Science instructors
- Early differences
  - DØ emphasis on formal classes
  - CDF emphasis on good references, web communication
  - Both may have converged to usual state of user-to-user transference?
- Bottom line, though, is that both experiments have retrained a substantial community, but not by any means all of their Run II users
- Doing better would be a big effort
  - Both experiments are always resource-limited when it comes to people; training and communication projects tend to be at the end of the line after the very early period
Experiences: Development Environments & Tools
- Quest for a standards-compliant compiler
  - The state of C++ compilers in 97-98 was a BIG problem
  - We both chose the KAI compiler, with a much better approximation to standards compliance than native compilers or gcc (fortunately, it was available early on for Linux)
- Bringing in third party products
  - Open Inventor for KAI commissioned by Run II project
  - Debugging complicates the issue -- not a good experience
- How many platforms is too many?
  - Run II has 2 offline platforms (IRIX and Linux) and 1 compiler for both platforms -- different SWITCH combinations alone mean that 2 6-20 different ZOOM & ROOT libraries are built (and tested) for Run II
* DØ uses NT for its Level 3 platform: an additional complication for the release system
Experiences: Language and Design
- Physical design
  - Importance very clear for making working releases.
  - Really nice if the release system could provide more tools, more help for physical design: layered releases on the wish list
- General C++ design
  - Have an expert look over the design before starting to write: Plea from our OO experts to get first crack!
  - Portability is an issue with good and bad sides
    - From a ZOOM developer: porting to ONE different compiler finds enough code problems to be well worth the effort
  - Design and code reviews become a must
    - For the most part, reviews have been welcomed by developers
  - Memory management is very difficult for ex-Fortran programmers to master (current reconstruction still very sloppy)
- From a L3 filter meeting: “I’m coding the xxx; it’s going much more quickly than I thought, thanks to the beauty of C++ which lets me reuse all the code from yyy.”
A245
Experiences: Language and Design
- Operational C++ infrastructure
  - Event Data Models
    - DØ EDM
    - CDF TRYBOS to EDM2 (1st release in use by collaboration)
  - Frameworks
    - CDF AC++ (shared with BaBar, now diverging due to stability requirement from BaBar, as a running experiment)
    - DØ framework
  - Management of algorithm-defining parameters using a database
    - RCP
    - Being used now by DØ, tested for use by CDF
- Learning to manage infrastructure changes is a big piece of making the systems successful as a whole.
  - DØ RCP change: 3 week disruption
Experiences: Modularity
- Switching in different external packages and toolkits
  - I/O format (DØ has ORACLE/msql option implemented; CDF has switched between YBOS and ROOT formats)
  - HepTuple: a clean interface beneath which HBOOK and ROOT can be switched
  - Graphics: not demonstrated yet, but possible in principle
- Switching algorithms
  - Examples: CLHEP random number generators, jet algorithms
- Toward a more modular ROOT
  - A strong request from the Run II Joint Project led to greater modularity in the ROOT architecture, allowing less heavy-weight use of its pieces, such as I/O, without dragging in graphics packages, etc.
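The idea of "a clean interface beneath which HBOOK and ROOT can be switched" can be illustrated with a small sketch. This is not the HepTuple API: every name below (HistogramBackend, book, fill, contents, and the two in-memory back ends) is a hypothetical stand-in, and the sketch is written in Python for brevity rather than in the C++ used by the experiments. It only shows the pattern: user code talks to an abstract interface, and the concrete back end is chosen in one place.

```python
from abc import ABC, abstractmethod


def bin_index(value, nbins, lo, hi):
    """Return the bin number for value, or None if it is outside [lo, hi)."""
    if not (lo <= value < hi):
        return None
    width = (hi - lo) / nbins
    # Guard against floating-point rounding at the upper edge.
    return min(int((value - lo) / width), nbins - 1)


class HistogramBackend(ABC):
    """Hypothetical abstract interface; user code depends only on this."""

    @abstractmethod
    def book(self, name, nbins, lo, hi):
        """Create an empty 1-D histogram."""

    @abstractmethod
    def fill(self, name, value):
        """Add one entry; out-of-range values are ignored."""

    @abstractmethod
    def contents(self, name):
        """Return the bin counts as a list."""


class ListBackend(HistogramBackend):
    """Stand-in for one storage package: dense bin counts in a list."""

    def __init__(self):
        self._hists = {}

    def book(self, name, nbins, lo, hi):
        self._hists[name] = (nbins, lo, hi, [0] * nbins)

    def fill(self, name, value):
        nbins, lo, hi, bins = self._hists[name]
        i = bin_index(value, nbins, lo, hi)
        if i is not None:
            bins[i] += 1

    def contents(self, name):
        return list(self._hists[name][3])


class DictBackend(HistogramBackend):
    """Stand-in for a second package: sparse bin counts in a dict."""

    def __init__(self):
        self._hists = {}

    def book(self, name, nbins, lo, hi):
        self._hists[name] = (nbins, lo, hi, {})

    def fill(self, name, value):
        nbins, lo, hi, counts = self._hists[name]
        i = bin_index(value, nbins, lo, hi)
        if i is not None:
            counts[i] = counts.get(i, 0) + 1

    def contents(self, name):
        nbins, _, _, counts = self._hists[name]
        return [counts.get(i, 0) for i in range(nbins)]


def analyse(backend, values):
    """User code: written once, unaware of which back end is in use."""
    backend.book("pt", 4, 0.0, 4.0)
    for v in values:
        backend.fill("pt", v)
    return backend.contents("pt")


values = [0.5, 1.5, 1.7, 3.9, 4.0, -1.0]
result_list = analyse(ListBackend(), values)
result_dict = analyse(DictBackend(), values)
print(result_list, result_dict)
```

Swapping the back end changes only the object passed to `analyse`; the analysis function itself is untouched, which is the property the slide attributes to the interface layer.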
Experiences: The Joint Project & Reviews
- Jan 96 - Discussion of potential Joint Working Groups
  - Listed 10 potential areas for common solutions
- Finished with 5 areas of significant joint effort including both experiments
  - Configuration management -- working but could certainly be improved
    - Biggest need: more tools to help with management of physical dependency
  - Software tools -- could use more FTEs than we can spare
  - Support databases -- successful implementation of ORACLE for MCC’s
  - Farm management -- very much in common
  - Physics analysis software -- good leverage of lab support
- Two areas with some common support -- Simulation, Visualization
- Joint hardware procurement process -- very successful
- Two data handling projects joint between CD and each experiment
E191
Experiences: The Joint Project & Reviews
- Bi-yearly reviews from Jun 97 - Jun 99
  - Validated the scope for hardware budgets and personnel requests -- very important in this role
  - Evolved from reviewing Joint to (Joint + Experiment) status
  - Pointed out critical needs and gave some leverage for getting them addressed
    - Hiring OO expertise was a direct outcome of these reviews
  - Valuable checkpoint, stimulus for progress
- The continuation of the Joint Project: operational phase?
  - The final piece of the exercise is an operations plan for both experiments and for the Joint Project -- much work still to be done here
A67
Experiences: Hardware Integration
DØ
- Operational: 1/3 of its central CPU, all the robot towers (though not with final tape drives), 1/5 of the final farm nodes, the farm I/O node, two database servers, the big network switches (not with final routing)
  - Central CPU configured as 48 processor production (in use since last summer), 16 processor test system
  - Robot configured with 1 side for users, 1 side for tests
- Being used in both explicit tests of the design and in the Monte Carlo Challenge activity (which also includes offsite Monte Carlo Production Facilities)
- Current status: full network performance from 3 fast ethernets to 1 Gigabit demonstrated on d0test; farm throughput demonstrated on 50-node system; user load on central system at 70% of capacity; stress tests of robot storage ramping up
Experiences: Hardware Integration
CDF
- 64-processor O2000, database server, 4 robot towers (not with final tape drives), 50 farm nodes (I/O node on order), 2 Network Appliances NFS file servers
- Being used in MDC (farm was earlier 14-node prototype)
- Central CPU released to users this month
- Current status: rate tests to start April 1 2000; testing in coordination with online
CDF Mock Data Challenge
- Generated 500K events at LBL (~100 GB); transferred over network and stored in robot
- Exercised chain from Level 3 nodes into robot store
  - Not using FiberChannel link yet, used alternate path
  - Continuity test, not rate test
- Data moved from robot to prototype farm and reconstructed, output streamed and stored in robot
- Will begin analysis phase next week
- Goals: Continuity of full chain, high volume test for reconstruction, look for design flaws and assess current state of software systems
- Rate tests start April 2000
- MDC-II (rate tests + full L3, production, data handling): mid-May 2000
E70
DØ Monte Carlo Challenge
- Phase 1:
  - Dec 98 - Jan 99: production of 90K events
  - May 99: reconstruction of 90K events
  - Test of prototype farm, small scale test for SAM
- Phase 2:
  - Nov 99 - Jan 00: production of 500K events (FNAL: 240K, Lyon: 210K, Prague: 20K, NIKHEF: 30K)
    - Test of remote production capability: network import!
    - Large scale test of SAM: almost 2 TB data stored
  - Jan 00 - Feb 00: reconstruction of 500K events
    - Large scale test of farm: using 50 nodes
  - Feb 00 - Mar 00: analysis phase
    - Goal is feedback to physics performance as well as reco & MC
- Online tests of data logging into the robot are underway
E311 E60
Experiences: DØ Results
[Figure: Z in DØ MCC]
[Figure: Open Inventor Geometry display]
Experiences: CDF Results
[Figure: SVX / ISL: Event Display]
[Figure: Efficiency from ttbar MC: little falloff out to 1.8, p_T 400 MeV]
Are There Lessons to Be Learned?
- Commonality of needs for infrastructure vs divergence of tastes, interests, timescales: not everything that could be done in common will be, but effort saved in a few areas is still worthwhile
  - Common choice of compiler and release system enables joint work
    - development of RCP, e.g.
- Make infrastructure first
  - do it early to enable development
  - but don’t rule out redesign
- Pay attention to physical design
- Develop mechanisms for both little changes and big changes
  - if you plan for big changes, they are NOT too disabling to be contemplated
  - release strategy plays a big part
Are We There Yet?
- Yes, we have successfully built large C++ systems
  - CDF: 1.3 million lines of code
  - DØ: 285 cvs packages
  - Will the larger community find them highly usable or barely usable?
- Yes, we are building data handling systems that approach LHC sizes
  - 0.75 - 1.0 PB storage capacity (per exp’t) will be available
  - Data movements of > 1 TB/day demonstrated with ENSTORE
  - DØ farm has seen 15 MB/sec data flow
  - CDF has exercised full online-offline chain, L3 to reconstruction
- Yes, we are keeping attention on integration and operation
  - ...and this is already paying off! A remark I hear frequently from members of both experiments: “I’m glad we are finding this out now and not a year from now!”
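To put the quoted throughput figures (> 1 TB/day, 15 MB/sec) on a common scale, the conversion between TB/day and MB/s is sketched below. It assumes decimal units and a rate sustained over a full 24 hours, neither of which the slide states; the helper names are illustrative.

```python
# Unit conversion between daily data volume and sustained rate.
# Assumptions (not from the slide): 1 TB = 1e12 bytes, 1 MB = 1e6 bytes,
# and the rate is held constant for a full 24-hour day.

SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def tb_per_day_to_mb_per_s(tb_per_day):
    """Sustained MB/s needed to move the given number of TB in one day."""
    return tb_per_day * 1e12 / 1e6 / SECONDS_PER_DAY


def mb_per_s_to_tb_per_day(mb_per_s):
    """TB moved in one day at the given sustained MB/s."""
    return mb_per_s * 1e6 * SECONDS_PER_DAY / 1e12


print(f"1 TB/day  = {tb_per_day_to_mb_per_s(1.0):.1f} MB/s sustained")
print(f"15 MB/sec = {mb_per_s_to_tb_per_day(15.0):.2f} TB/day if sustained")
```

On these assumptions 1 TB/day corresponds to roughly 11.6 MB/s, and 15 MB/s held for a day would move about 1.3 TB.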