October 21, 2010. David Lawrence, JLab. RootSpy -- CHEP10, Taipei. Parallel Session 53: Software Engineering, Data Stores, and Databases


1 GlueX Offline Software: Preparing for Big Data Volumes on a Small Manpower Budget
David Lawrence, JLab, October 21, 2010
Parallel Session 53: Software Engineering, Data Stores, and Databases
CEBAF at JLab is (soon to be) a 12 GeV e- continuous wave* beam facility in Newport News, Virginia
*2 ns bunch structure
Oct. 21, 2010 RootSpy -- CHEP10, Taipei -- David Lawrence, JLab

2 Data Rates in Some Modern Experiments

Lab    Experiment   Front End DAQ Rate   Event Size   L1 Trigger Rate   Bandwidth to Mass Storage
JLab   GlueX        3 GB/s               15 kB        200 kHz           300 MB/s
JLab   CLAS12       100 MB/s             20 kB        10 kHz            100 MB/s
LHC    ALICE        500 GB/s             2.5 MB       200 kHz           200 MB/s
LHC    ATLAS        113 GB/s             1.5 MB       75 kHz            300 MB/s
LHC    CMS          200 GB/s             1 MB         100 kHz           100 MB/s
LHC    LHCb         40 GB/s              40 kB        1 MHz             100 MB/s
BNL    STAR         50 GB/s              80 MB        600 Hz            450 MB/s
BNL    PHENIX       900 MB/s             ~60 kB       ~15 kHz           450 MB/s

Sources: * CHEP2007 talk, Sylvain Chapelin, private comm.; * Jeff Landgraff, private comm., Feb. 11, 2010; ** CHEP2006 talk, Martin L. Purschke
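The columns of the table are related: as a rough cross-check, the front-end rate should be about the event size times the L1 trigger rate. The sketch below recomputes that product from the table values. It assumes decimal prefixes (1 kB = 1000 bytes), and the function name is made up for this illustration. For GlueX the product reproduces the quoted 3 GB/s. For a couple of rows (CLAS12, CMS) it differs from the quoted figure by a factor of two, which may simply reflect different definitions of the quantities involved.

```python
# Cross-check: front-end rate ~ event size x L1 trigger rate.
# Values are copied from the table; decimal prefixes are an assumption.
KB, MB, GB = 1e3, 1e6, 1e9

# (experiment, event size [bytes], L1 rate [Hz], quoted front-end rate [bytes/s])
TABLE = [
    ("GlueX",  15 * KB,  200e3, 3 * GB),
    ("CLAS12", 20 * KB,  10e3,  100 * MB),
    ("ALICE",  2.5 * MB, 200e3, 500 * GB),
    ("ATLAS",  1.5 * MB, 75e3,  113 * GB),
    ("CMS",    1 * MB,   100e3, 200 * GB),
    ("LHCb",   40 * KB,  1e6,   40 * GB),
    ("STAR",   80 * MB,  600.0, 50 * GB),
    ("PHENIX", 60 * KB,  15e3,  900 * MB),
]


def implied_front_end_rate(event_size_bytes, trigger_rate_hz):
    """Bytes per second leaving the front end if every L1 accept is read out."""
    return event_size_bytes * trigger_rate_hz


if __name__ == "__main__":
    for name, size, rate, quoted in TABLE:
        implied = implied_front_end_rate(size, rate)
        print(f"{name:7s} implied {implied / GB:8.3f} GB/s   quoted {quoted / GB:8.3f} GB/s")
```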

3 CW beam → CD data
(continuous wave) → (continuous digitization)
Electronics
– All digitization electronics are fully pipelined
– VME64x-VXS crates
– F1TDC (60 ps, 32 ch. or 115 ps, 48 ch.)
– 125 MHz fADC (12 bit, 72 ch.)
– 250 MHz fADC (12 bit, 16 ch.)
– Maximum trigger latency ~3 µs
– 3 GB/s readout from front end
– 300 MB/s to mass storage
– 3 PB/yr to tape
Total digital sum of roughly 4000 calorimeter channels presented to L1 trigger every 4 ns!
[Photos: Crate Trigger Processor, F1TDC, Signal distribution board]
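A few of the numbers on this slide can be tied together with simple arithmetic. The sketch below is an inference from the quoted figures, not something stated on the slide: 300 MB/s to storage and 3 PB/yr to tape together imply roughly 10^7 seconds of data taking per year, a 250 MHz clock has a 4 ns sample period (matching the "every 4 ns" trigger sum), and a ~3 µs latency at that clock corresponds to about 750 samples held in the pipeline. Variable names are chosen for this illustration only.

```python
# Back-of-the-envelope relations between the quoted electronics numbers.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

front_end_rate = 3e9    # bytes/s read out from the front end (from the slide)
storage_rate = 300e6    # bytes/s written to mass storage (from the slide)
tape_per_year = 3e15    # bytes/yr written to tape (from the slide)

# Seconds of running per year needed to fill 3 PB at 300 MB/s (inferred).
live_seconds = tape_per_year / storage_rate
duty_factor = live_seconds / SECONDS_PER_YEAR

# Reduction between front-end readout and storage.
reduction_factor = front_end_rate / storage_rate

# Sample period of a 250 MHz flash ADC, and samples spanned by a 3 us latency.
fadc_clock_hz = 250e6
sample_period_ns = 1e9 / fadc_clock_hz
trigger_latency_s = 3e-6
pipeline_depth_samples = round(trigger_latency_s * fadc_clock_hz)

if __name__ == "__main__":
    print(f"implied running time : {live_seconds:.2e} s/yr ({duty_factor:.0%} of a year)")
    print(f"front end -> storage : factor {reduction_factor:.0f}")
    print(f"250 MHz sample period: {sample_period_ns:.0f} ns")
    print(f"pipeline depth       : {pipeline_depth_samples} samples")
```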

4 Software Manpower
Estimating manpower for software is notoriously difficult in a field where the developers are also users. Time spent developing and time spent using are often interspersed, making it hard to estimate the overall time spent on either.
Models have been developed using source lines of code. This is subject to individual programming style, language, and nature of the code. Of course, that doesn’t stop physicists from using it!

Project                    Estimated man-years
CMS (LHC)                  1020
BaBar (SLAC)               926
CDF (FermiLab)             918
CLEO (Cornell)             319
CLAS (JLab)                53
GlueX (JLab, projected)    40

Estimates based on lines of source code from a survey done in 2006-2007 (BaBar, CDF, CLEO, and CMS courtesy L. Sexton-Kennedy 1/11/2007)
Number of major detector systems including Trigger and DAQ: BaBar = 8, CLAS = 7
Why would BaBar need 18 times as much manpower for software as CLAS?
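The slide does not say which line-count model produced these estimates. One widely used model of this kind is basic COCOMO, which maps thousands of source lines (KLOC) to effort through a power law. The sketch below uses the standard basic-COCOMO coefficients. Its use here is an assumption for illustration, and the 500 KLOC input is a hypothetical figure, not a line count from any of the experiments listed.

```python
# Basic COCOMO: effort [person-months] = a * KLOC ** b
# Coefficients are the standard basic-COCOMO values for each project class.
COCOMO_COEFFICIENTS = {
    "organic":       (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded":      (3.6, 1.20),
}


def cocomo_person_months(kloc, mode="organic"):
    """Estimated effort in person-months for `kloc` thousand lines of code."""
    a, b = COCOMO_COEFFICIENTS[mode]
    return a * kloc ** b


def cocomo_person_years(kloc, mode="organic"):
    """Same estimate expressed in person-years (12 person-months each)."""
    return cocomo_person_months(kloc, mode) / 12.0


if __name__ == "__main__":
    hypothetical_kloc = 500  # illustrative only, not a measured code size
    for mode in COCOMO_COEFFICIENTS:
        years = cocomo_person_years(hypothetical_kloc, mode)
        print(f"{mode:14s} {years:8.1f} person-years")
```

Because the exponent is greater than one, doubling the line count more than doubles the estimate, which is one reason such numbers are sensitive to coding style and language.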

5 Size doesn’t really matter … (right?)

Collaboration   Approx. Size
ATLAS           3000
CMS             3000
ALICE           1000
LHCb            700
BaBar           600
STAR            515
PANDA           450
PHENIX          430
CLAS            200
MINERvA         85
GlueX           65

“… you go to war with the army you have, not the army you might want or wish to have at a later time.” - Donald Rumsfeld (former US Secretary of Defense… and jerk)
[Figure: axes labeled “Coder’s Ambition” and “Time to complete project”]

6 Software is the manpower expansion tank
What do you do with lots and lots of collaborators?
People interested in participating in an experiment must be allowed to contribute to it in some way.
If more people are involved than there are hardware projects available, then they must find some other way to contribute. This is not such a bad thing. Software tends to be a flexible source of projects.
On the other hand, if fewer people are involved, then manpower for software is limited and more code is borrowed (or licensed) from other places. This is fine too, since it allows experiments to “adopt” code that took tens or hundreds of (wo)man-hours to develop by investing only a few of their own.

7 Recycling
Reduce the need for development by borrowing code or ideas wherever appropriate
– Successful ideas minimize R&D and the risk of failure (or worse, being stuck with an inefficient system)
– Recycling code can translate directly into (wo)man-hours saved
– Caveat: must take over maintenance for life of experiment
Examples from GlueX:
– Incorporated reconstruction code from KLOE for barrel calorimeter
   – Gave early results for simulation studies
   – Eventually converted to C/C++ (from FORTRAN) and optimized
– Adopted KTKinematicData class from CLEO(-II?)
   – Switched from using CLHEP to ROOT linear algebra
– Developed HDDS based on ATLAS AGDD XML-based geometry description
   – Modified to accommodate repeating structures
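To illustrate what "repeating structures" in an XML-based geometry description can mean, here is a minimal sketch. The element and attribute names (`place`, `repeat_phi`, `ncopy`, and so on) and all numeric values are invented for this example. They are not the HDDS or AGDD schema. The idea shown is only that one XML element can stand for many placed copies of the same volume, which a parser expands at load time.

```python
import xml.etree.ElementTree as ET

# Hypothetical geometry fragment; names and numbers are made up for this sketch.
GEOMETRY_XML = """
<geometry>
  <place volume="endcap" z_cm="400.0"/>
  <repeat_phi volume="barrel_module" ncopy="48"
              phi0_deg="0.0" dphi_deg="7.5" r_cm="65.0"/>
</geometry>
"""


def expand_placements(xml_text):
    """Turn single and repeated placement elements into a flat list of copies."""
    placements = []
    for elem in ET.fromstring(xml_text):
        if elem.tag == "place":
            # A single explicit placement.
            placements.append({
                "volume": elem.get("volume"),
                "copy": 0,
                "z_cm": float(elem.get("z_cm")),
            })
        elif elem.tag == "repeat_phi":
            # One element expands into ncopy placements stepped in azimuth.
            ncopy = int(elem.get("ncopy"))
            phi0 = float(elem.get("phi0_deg"))
            dphi = float(elem.get("dphi_deg"))
            for i in range(ncopy):
                placements.append({
                    "volume": elem.get("volume"),
                    "copy": i,
                    "phi_deg": phi0 + i * dphi,
                    "r_cm": float(elem.get("r_cm")),
                })
    return placements


if __name__ == "__main__":
    for p in expand_placements(GEOMETRY_XML)[:4]:
        print(p)
```

Describing the repetition once keeps the geometry file short and makes a change to the module count or spacing a one-line edit.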

8 Fears for Tiers (or “Who wants the headache of Ruling the World?”)
Motivation for a Tier-based Distribution System
– Ameliorates a technical problem
– Solves a political one
Both of these arise due to large numbers of collaborators
GlueX has no formal plan for a tier-based distribution system
GRID-based tools for doing Partial Wave Analysis using summary files* as input
*First pass reconstruction done with uninteresting events filtered out at JLab

9 Summary
The GlueX experiment will operate with a collaboration that is 7 to 47 times smaller than experiments with similar data rates to tape.
This is only possible by focusing limited software (wo)manpower on necessity and borrowing code as needed. (Unexpectedly longer lead time helps too!)
n.b. Collaborators wishing to have a big impact on a very fundamental physics experiment by contributing software are welcome: http://www.gluex.org
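The size comparison in the summary can be reproduced approximately from the two tables shown earlier in the talk. The sketch below takes "similar data rates to tape" to mean a storage bandwidth at least as large as the 300 MB/s quoted for GlueX. That selection rule is an assumption made here, not a definition given on the slides. With the table values, the resulting ratios run from a little under 7 to a little over 46, close to the range quoted above.

```python
# Collaboration sizes (slide 5) and bandwidth to mass storage in MB/s (slide 2).
COLLABORATION_SIZE = {
    "ATLAS": 3000, "CMS": 3000, "ALICE": 1000, "LHCb": 700,
    "STAR": 515, "PHENIX": 430, "GlueX": 65,
}
STORAGE_MB_PER_S = {
    "GlueX": 300, "ALICE": 200, "ATLAS": 300, "CMS": 100,
    "LHCb": 100, "STAR": 450, "PHENIX": 450,
}


def size_ratios(reference="GlueX"):
    """Ratio of collaboration sizes for experiments whose storage bandwidth
    is at least that of the reference experiment (assumed selection rule)."""
    ref_rate = STORAGE_MB_PER_S[reference]
    ref_size = COLLABORATION_SIZE[reference]
    return {
        name: COLLABORATION_SIZE[name] / ref_size
        for name, rate in STORAGE_MB_PER_S.items()
        if name != reference and rate >= ref_rate
    }


if __name__ == "__main__":
    for name, ratio in sorted(size_ratios().items(), key=lambda kv: kv[1]):
        print(f"{name:7s} is {ratio:5.1f} times larger than GlueX")
```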

10 Backup Slides
