Published by Myron Randall. Modified over 9 years ago.
1
Issues in managing HEP Software Development in a distributed environment
Elizabeth Buckley-Geer, Fermilab
CHEP 2000, Padova, Italy
2/8/00, E. Buckley-Geer, CHEP 2000
2
Contents
- Characterizing the problem
- Key issues and solutions from CDF/D0 Collider Run II
- Some thoughts on the development process
- Conclusions
3
Characterizing the problem
- Developer community of about 150 people (both collaborations) from North and South America, Europe, Asia, India, Russia
- Widely varying quality of network connections between FNAL and remote locations
- Widely varying abilities of groups to afford to purchase commercial tools
4
Characterizing the problem
- One common denominator since mid-1997:
  - Everyone can buy a cheap PC and run Linux on it
  - No more $10-20K workstations. Every member of the group can have a PC
  - They don’t want to rely on connecting to a central machine at FNAL to do code development
  - They want to make use of these PCs at their own location to do their code development
  - First release of CDF code for Linux was January 1998 – several years after the basic development environment was designed
5
The situation during Run I (CDF - but similar for D0)
- Highly centralized code development. Could only realistically develop code on the central machine at FNAL (VMS cluster) – no distributed development was supported, even on other VMS systems
- Code was ported to run on IRIX and AIX, but only frozen releases were available on these platforms
- Frozen releases were distributed to remote sites as tar files or VMS save sets
- Development version of the code was available to desktop VMS nodes at FNAL from 1993 onwards, but code could not be committed to the repository from these machines
6
Run I development tools
- Code was mostly Fortran with some small amounts of C. About 50 packages.
- Used proprietary VMS tools for version control and package building (CMS and MMS)
- Used vendor compilers and debuggers. Only UNIX vendors who supported VMS extensions were considered. Luckily the list was sufficiently long!
- No serious use of design tools – some early attempts at D0 but they didn’t survive
- No tools to locate memory leaks due to the nature of the memory management packages in use – YBOS and ZEBRA
7
Goals for Run II development environment – early 1996
- Obviously needed to migrate from VMS as a primary platform
- Provide ability to do remote development – recognized as important even before the Linux revolution
- Reduce the need for proprietary tools for base system
- Handle move from Fortran to C++
- Identify useful software engineering tools
8
Configuration Management Joint Project
- Formed joint D0, CDF, FNAL Computing Division working group to study configuration management in early 1996 (see E248 for more on Run II joint projects)
- Charge was to find and implement a common solution for CDF and D0 for software management
  - Version control
  - Package and release organization
  - Building packages
  - Distribution
  - Validation
9
Configuration Management Joint Project
- Group looked at existing tools in use in HEP and elsewhere
- Chose
  - CVS for version control with customizations from Sloan Digital Sky Survey (SDSS)
  - SoftRelTools from BaBar for package organization and building
  - UPS/UPD from FNAL for product setup and distribution tools
10
CVS
- Run in client/server mode – adopted from SDSS
- Repository on server + cvsuser pseudo account running a restricted shell CVSH that only allows cvs commands to be executed
- Local and remote access are identical so users do not need to be on a FNAL computer to access repository – necessary condition for remote development
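The restricted-shell idea on this slide can be illustrated with a minimal sketch. This is not the actual CVSH; the function name `is_allowed`, the allowlist, and the set of rejected characters are all assumptions made for illustration. The point is only that the pseudo account refuses every command line that is not a plain cvs invocation.

```python
import shlex

# Hypothetical allowlist: the only program the pseudo account may run.
ALLOWED_PROGRAMS = {"cvs"}

# Characters that could chain a second command or redirect I/O (assumed list).
FORBIDDEN_CHARS = ";|&<>`$\n"


def is_allowed(command_line: str) -> bool:
    """Return True only for a plain invocation of an allowed program."""
    # Reject shell metacharacters before attempting to parse anything.
    if any(ch in command_line for ch in FORBIDDEN_CHARS):
        return False
    try:
        argv = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    # An empty command line is rejected; otherwise check the program name.
    return bool(argv) and argv[0] in ALLOWED_PROGRAMS
```

With this sketch, `is_allowed("cvs server")` is accepted, while `is_allowed("ls -l")` and `is_allowed("cvs server; rm -rf /")` are both refused.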
11
SoftRelTools (SRT)
- Adapted from BaBar experiment
- Uses cpp to create dependencies and gmake to build libraries and binaries
- BaBar and FNAL agreed to diverge on development
- It was becoming difficult to add new features given the original structure of the package
- Have since done a re-write (Spring 1999) of the package at FNAL to make it more maintainable
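How preprocessor-generated dependencies drive a make-style rebuild can be sketched roughly as follows. This is not SRT code. It assumes the dependencies arrive as a make-style rule (the kind GNU cpp prints with `-M`), and the function names, file names, and timestamps are hypothetical.

```python
def parse_dep_rule(rule: str):
    """Split a rule 'target: dep1 dep2 ...' into (target, [deps]).

    Backslash-newline continuations are joined first, as make does.
    """
    joined = rule.replace("\\\n", " ")
    target, _, deps = joined.partition(":")
    return target.strip(), deps.split()


def needs_rebuild(target, deps, mtimes):
    """A target is stale if it is missing or older than any prerequisite.

    mtimes maps file name -> modification time. A prerequisite that is
    absent from mtimes is treated as newer than everything (a simplifying
    assumption; real make would report an error instead).
    """
    if target not in mtimes:
        return True
    return any(mtimes.get(dep, float("inf")) > mtimes[target] for dep in deps)


# Hypothetical rule and timestamps for illustration.
rule = "Track.o: Track.cc Track.hh \\\n  Hit.hh"
target, deps = parse_dep_rule(rule)
stale = needs_rebuild(
    target, deps, {"Track.o": 10, "Track.cc": 5, "Track.hh": 12, "Hit.hh": 1}
)
```

Here `Track.hh` is newer than `Track.o`, so `stale` is true and the object file would be rebuilt.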
12
UPS – Unix Product Setup
- FNAL product in use since 1991
- Supports existence of multiple versions of a product. Choice is made using a ‘setup’ command.
- Rewritten for Run II
  - Completed in summer 1998
  - In use by both CDF and D0
13
Use of these tools at CDF
- ~65 code developers
- 1.3 million lines of code
  - 71% C++, 20% Fortran, 8% C, 0.6% Java + external packages
  - 144 packages
- Development release built every night on IRIX, TRU64, SUN, Linux
- Daily build logs scanned for errors and reported to developers. Build logs are posted on the web
- Development builds lead to timely detection and fixing of bugs
- Create frozen releases about every 2 months. Also create releases to capture code used for certain milestones.
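Scanning a nightly build log for errors, as described on this slide, might look like the sketch below. It does not reproduce the CDF scripts. The log line format `<package>: <message>`, the error pattern, and the package names are assumptions for illustration.

```python
import re
from collections import defaultdict

# Assumed pattern: a message counts as an error if it contains the whole
# word "error" or "fatal", in any letter case.
ERROR_PATTERN = re.compile(r"\b(error|fatal)\b", re.IGNORECASE)


def scan_build_log(lines):
    """Group error messages by package.

    Each line is assumed to look like '<package>: <message>'; lines
    without that separator are ignored.
    """
    errors = defaultdict(list)
    for line in lines:
        package, sep, message = line.partition(": ")
        if sep and ERROR_PATTERN.search(message):
            errors[package].append(message.strip())
    return dict(errors)


# Hypothetical log excerpt.
log = [
    "TrackingObjects: compiling Track.cc",
    "TrackingObjects: Track.cc:42: error: 'x' undeclared",
    "CalorObjects: linking libCalor.a",
    "banner line without a package prefix",
]
report = scan_build_log(log)
```

The resulting `report` maps each failing package to its error lines, which is the information needed to notify the responsible developers.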
14
Use of these tools at CDF
- Success of the development rebuild varies. It is somewhat correlated with the number of files changed
15
Use of these tools at D0
- ~60 code developers have write access to repository
  - Essentially 100% C++ except for external packages
  - 280 packages – but big variation in size
- Test release of entire package weekly on IRIX and Linux. Goal is to have an operational reconstruction executable at the end of every release. Currently 80% success rate.
- Production releases occur at intervals determined by the management. Used to capture important milestones and provide stable working versions. 5 production releases to date
16
Code Distribution
- CDF has a set of custom scripts to distribute code to remote sites. Both frozen releases and development are distributed
- Fairly straightforward to get distribution. Currently fairly manpower intensive for development release on remote nodes – ½ FTE devoted to fixing problems
- Working on switching to UPD for ease of maintenance
- No significant automatic code distribution happening in D0 yet
17
Code Distribution
- Majority of distribution is to Linux machines

Platforms: Linux, IRIX, TRU64, Solaris
Development: 44732
Frozen Release: 1151362
18
Compilers
- We wanted to write code that adhered to the C++ ANSI standard – not get into the Fortran extensions quagmire!
- GCC and vendor compilers were not thought sufficiently compliant in summer 1997
- Chose KAI compiler from Kuck and Associates
- Compiler was available on the relevant platforms – including Linux
- Has led to issues with availability of KAI versions of external products that must be built with the CDF/D0 software – e.g. we paid for a port of Open Inventor
- We still believe it was the right choice at the time but expect to use EGCS and vendor compilers in the future
19
Debuggers and other tools
- Quality of the debugging tools has left a lot to be desired
  - This was one of the few downsides of choosing KAI. Things have been particularly problematic on Linux
- Have purchased TotalView, which is in use on IRIX and will shortly be available for Linux – seems to improve the situation
- CASE tools – used GDPro and Rational Rose
  - Mostly used to document design – did not use automatic code generation features
- Purify and Insure++ used to look for memory leaks – but not currently available for Linux
20
Licensed products
- Has been very beneficial to negotiate license agreements that cover use of a product by all Run II developers independent of their location
- Have done this with KAI, Open Inventor
- Get better price - all licenses must be ordered through Fermilab
21
Thoughts on the development process
- Borrowing from the terminology and observations presented in “The Cathedral and the Bazaar” by Eric Raymond – O’Reilly Books
- Our code is clearly Open Source because (by and large) it is freely available to anyone who wants to use it from another experiment
- However, both CDF and D0 software projects are run using the traditional “cathedral” style of software development
- This is necessitated by the requirements to provide schedules, obtain manpower resources from a limited pool, meet milestones and convince review committees that you know what you are doing
- We can make some comparisons between aspects of the Open Source (aka Linux) model and what we are doing in HEP
22
Thoughts on the development process
- “Treat your users as co-developers”
  - Two user communities in an experiment
    - Those working on the software project – programmers and physicists
    - The rest of the experiment – the physicist-user
  - The first group tends to be like the Linux community – working on the project because they are interested in the problem and want to improve the product
  - The second group just wants to use the software to get physics results – they want to improve their physics analysis software but not the infrastructure
23
Thoughts on the development process
- “Release early, release often”
  - CDF has shown that this leads to more timely bug fixes and shorter integration time and is very desirable for the project developers
  - However, it drives the physicist-user to distraction because he/she just wants something that works!
  - Have to have stable frozen releases in addition
24
Thoughts on the development process
- Some of the skills necessary to co-ordinate a successful Open Source project are relevant to managing an HEP computing project
  - Must have good people and communication skills
  - Need to be able to attract people to the project and keep them interested and happy
  - These can often be more important than possessing great technical prowess
  - It often feels like we are in a bazaar rather than a cathedral!
25
Conclusions
- CDF and D0 are successfully managing their software development projects with ~60–70 developers per experiment and 1 million lines of C++ each
- We are expected to have schedules, milestones and reviews, which makes it unlikely that we can ever manage a project using the bazaar model
- However, some of the Open Source concepts are applicable to HEP projects
26
Use of these tools at CDF
- On days when the development release builds, we create a rawhide release. This satisfies developers who need the up-to-date code but also need the whole release to actually build