Presentation is loading. Please wait.

Presentation is loading. Please wait.

October 19, 2010 David Lawrence JLab Oct. 19, 20101RootSpy -- CHEP10, Taipei -- David Lawrence, JLab Parallel Session 18: Software Engineering, Data Stores,

Similar presentations


Presentation on theme: "October 19, 2010 David Lawrence JLab Oct. 19, 20101RootSpy -- CHEP10, Taipei -- David Lawrence, JLab Parallel Session 18: Software Engineering, Data Stores,"— Presentation transcript:

1 October 19, 2010 David Lawrence JLab Oct. 19, 20101RootSpy -- CHEP10, Taipei -- David Lawrence, JLab Parallel Session 18: Software Engineering, Data Stores, and Databases CEBAF at JLab is (soon to be) a 12 GeV e - continuous wave * beam facility in Newport News, Virginia *2ns bunch structure Multi-threaded Event Reconstruction with JANA

2 Data Rates in Some Modern Experiments Front End DAQ Rate Event Size L1 Trigger Rate Bandwidth to mass Storage GlueX3 GB/s15 kB200 kHz300 MB/s CLAS12100 MB/s20 kB10 kHz100 MB/s ALICE500 GB/s2.5 MB200 kHz200 MB/s ATLAS113 GB/s1.5MB75 kHz300 MB/s CMS200 GB/s1 MB100kHz100 MB/s LHCb40 GB/s40 kB1 MHz100 MB/s STAR50 GB/s80 MB600 Hz450 MB/s PHENIX900 MB/s~60 kB~ 15 kHz450 MB/s Oct. 19, 2010RootSpy -- CHEP10, Taipei -- David Lawrence, JLab2 LHC JLab BNL * CHEP2007 talk Sylvain Chapelin private comm. * Jeff Landgraff private Comm. Feb. 11, 2010 ** CHEP2006 talk MartinL. Purschke **

3 Why Multi-threading? 5/25/10JANA - Lawrence - CLAS12 Software Workshop3 2002 - 2014 Multi-core processors are already here and commonly used. Industry has signaled that this will be the trend for the next several years. Consequence: Parallelism is required Maintaining a fixed memory capacity per core will become increasingly expensive due to limitations on the number of controllers that can be placed on a single die (#pins). Example: Memory accounts for about 10%-25% of system cost today but that will increase by as much as 5%/year over the next several years leading to 50% of system cost going toward RAM

4 Factory Model 5/25/10JANA - Lawrence - CLAS12 Software Workshop4 STOCK MANUFACTURE in stock? ORDER PRODUCT YES NO FACTORY (algorithm) STOCK MANUFACTURE in stock? YES NO FACTORY STOCK MANUFACTURE in stock? YES NO FACTORY Data on demand = Don’t do it unless you need it Stock = Don’t do it twice Conservation of CPU cycles!

5 Complete Event Reconstruction 5/25/10JANA - Lawrence - CLAS12 Software Workshop5 Event Loop Event Processo r Event Source HDDM File EVIO File ET system Web Service User supplied code Fill histograms Write DST L3 trigger Framework has a layer that directs object requests to the factory that completes it This allows the framework to easily redirect requests to alternate algorithms specified by the user at run time Multiple algorithms (factories) may exist in the same program that produce the same type of data objects

6 Multi-threading 5/25/10JANA - Lawrence - CLAS12 Software Workshop6 Event Processo r Event Source thread o Each thread has a complete set of factories making it capable of completely reconstructing a single event o Factories only work with other factories in the same thread eliminating the need for expensive mutex locking within the factories o All events are seen by all Event Processors (multiple processors can exist in a program)

7 5/25/107JANA - Lawrence - CLAS12 Software Workshop janadot plugin: Auto-generate algorithm dependency graph

8 A closer look at janadot 5/25/108JANA - Lawrence - CLAS12 Software Workshop

9 Testing on a 48-core “Magny Cours” Oct. 19, 2010RootSpy -- CHEP10, Taipei -- David Lawrence, JLab9 Event reconstruction using 48 processing threads on a CPU with 48 cores generally scales quite well. Eventually, an I/O limit will be encountered Occasionally some problems with inexplicably lower rates. Program appears to simply run slower while not operating any differently. Unclear if this is due to hardware or Linux kernel Memory Usage vs. time while repeatedly running the 35 thread test. The marked area indicates one test where the program ran slower.

10 Summary The JANA framework is a multi-threaded event reconstruction framework – Modified factory model minimizes CPU usage for a single thread – Processing event completely within a single thread minimizes expensive mutex locking – Multiple threads require less memory than multiple, independent processes – Tests of event rate scaling with a 48-core machine looks very promising Oct. 19, 2010RootSpy -- CHEP10, Taipei -- David Lawrence, JLab10 doi: 10.1088/1742-6596/219/4/042011 doi: 10.1088/1742-6596/119/4/042018 http://www.jlab.org/JANA

11 Backup Slides Oct. 19, 2010RootSpy -- CHEP10, Taipei -- David Lawrence, JLab11

12 Associated Objects 5/25/10JANA - Lawrence - CLAS12 Software Workshop12 Cluster (calorimeter) Cluster (calorimeter) Hit track MC generated Hit object associate d objects o A data object may be associated with any number of other data objects having a mixture of types vector clusters; loop->Get(clusters); for(uint i=0; i<clusters.size(); i++) { vector hits; clusters[i]->Get(hits); // Do something with hits … } o Each data object has a list of “associated objects” that can be probed using a similar access mechanism as for event-level object requests

13 Plugins JANA supports plugins: pieces of code that can be attached to existing executables to extend or modify its behavior Plugins can be used to add: – Event Processors – Event sources – Factories (additional or replacements) Examples: – Plugins for creating DST skim files Reconstruction is done once with output to multiple files hd_ana --PPLUGINS=kaon_skim,ppi+pi-_skim run012345.evio – Plugins for producing subsystem histograms Single ROOT file has histograms from several pieces of code hd_root --PPLUGINS=bcal_hists,cdc_hists,tof_hists ET:GlueX 5/25/10JANA - Lawrence - CLAS12 Software Workshop13

14 ?? Oct. 19, 2010RootSpy -- CHEP10, Taipei -- David Lawrence, JLab14 Occasionally, the event processing ran slower with the rate falling on a different curve. The exact cause for this is unknown. The problem seemed to ameliorate if a lot of time had passed since booting with other programs being run in the interim, but the evidence was inconclusive. Some PMU data was taken, but has not been fully analyzed.


Download ppt "October 19, 2010 David Lawrence JLab Oct. 19, 20101RootSpy -- CHEP10, Taipei -- David Lawrence, JLab Parallel Session 18: Software Engineering, Data Stores,"

Similar presentations


Ads by Google