Multi-threaded Event Processing with JANA David Lawrence – Jefferson Lab Nov. 3, 2008 11/3/08 Multi-threaded Event Processing with JANA - D. Lawrence JLab.


1 Multi-threaded Event Processing with JANA. David Lawrence, Jefferson Lab. Nov. 3, 2008.

2 Thomas Jefferson National Accelerator Facility (JLab): A 6 GeV electron accelerator user facility funded by the US Dept. of Energy for basic research into the quark structure of nuclear matter. Located in Newport News on the east coast of Virginia, USA. One of the two major nuclear physics research labs in the U.S. The 12 GeV upgrade received CD-3 approval in Sept. 2008, with data planned in 2014. [Figure: accelerator site diagram; labels: CHL 2, 12 GeV, 11 GeV]

3 The GlueX Experiment: [Figure: detector schematic; labels: real γ (photon) beam, 2 Tesla solenoid magnet, 30 cm LH2 target, forward EM calorimeter and forward TOF wall downstream, cylindrical and planar drift chambers inside magnet, barrel EM calorimeter inside magnet] The “continuous wave” 12 GeV electron beam at JLab has a beam bunch every 2 ns. A conventional meson has quantum numbers determined only by its constituent quarks. A hybrid meson has some quantum properties due to contributions from the “glue”.

4 Data Rates in 12GeV era

Lab  | Experiment | Front End DAQ Rate | Event Size | L1 Trigger Rate | Bandwidth to Mass Storage
JLab | GlueX      | 3 GB/s             | 15 kB      | 200 kHz         | 300 MB/s
JLab | CLAS12     | 100 MB/s           | 20 kB      | 10 kHz          | 100 MB/s
LHC  | ALICE      | 500 GB/s           | 2.5 MB     | 200 kHz         | 200 MB/s
LHC  | ATLAS      | 113 GB/s           | 1.5 MB     | 75 kHz          | 300 MB/s
LHC  | CMS        | 200 GB/s           | 1 MB       | 100 kHz         | 100 MB/s
LHC  | LHCb       | 40 GB/s            | 40 kB      | 1 MHz           | 100 MB/s
BNL  | STAR       | 8 GB/s             | 80 MB      | 100 Hz          | 30 MB/s
BNL  | PHENIX     | 900 MB/s           | ~60 kB     | ~15 kHz         | 450 MB/s

Sources: CHEP2007 talk, Sylvain Chapelin private comm.; NIM A499, Mar. 2003, pp. 762-765; CHEP2006 talk, Martin L. Purschke.

5 CPU development in the coming years: CPU development has shifted from increased clock speed to multiple cores. Dual and quad core CPUs are common today. [Figure from “Platform 2015: Intel Platform Evolution for the Next Decade”] Expect more than 100 cores in a box by 2014! Some type of parallelization must be done to use all of the power in a next generation CPU.

6 Multi-threading vs. Multiple Processes for a Single Input File: [Figure comparing option 1, “Multiple Processes”, with option 2, “Multiple Threads”; labels: FILE, dispatcher, single threaded program (several), multi-threaded program, accumulator, merger, file output] Bookkeeping overhead is reduced with multiple threads.

7 Threading benefits small scale processing (individual developer cycle): [Figure: cores vs. processing time on a single workstation, with single-threaded and multi-threaded jobs shown between edit/compile steps; total CPU power is proportional to area] Multi-threading leads to a more rapid turnaround time when developing. The relevant measure of CPU “power” now includes the number of cores used.

8 The JANA Factory Model: Traditional factory models pass ownership of created objects to the caller. In JANA, only const pointers are passed out and ownership stays with the factory. Passing out only const pointers guarantees that only the factory may modify the objects. Subsequent requests get the same const pointers. Example request (the element type DTrack is shown here for illustration): vector<const DTrack*> tracks; loop->Get(tracks); The templated Get() method helps ensure type safety. The framework itself is responsible for telling factories to delete objects at the end of the event. A persistent flag marks factories that should not auto-delete objects.
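The ownership rules on this slide can be illustrated with a minimal C++ sketch. It is not JANA's actual code: the names `Track`, `Factory`, `TrackFactory`, `EventLoop`, `AddFactory`, and `evaluations()` are hypothetical and chosen only for this example. The sketch assumes one factory per data type, a templated `Get()` that fills a vector of const pointers, and an end-of-event call that deletes objects unless the factory is marked persistent.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

struct Track { double momentum; };  // hypothetical data class

class FactoryBase {
public:
    virtual ~FactoryBase() = default;
    virtual void EndOfEvent() = 0;
};

// Illustrative factory: creates objects at most once per event and keeps ownership.
template <typename T>
class Factory : public FactoryBase {
public:
    explicit Factory(bool persistent = false) : persistent_(persistent) {}

    // Callers only ever see const pointers; repeated calls return the same ones.
    const std::vector<const T*>& Get() {
        if (!evaluated_) {
            Evaluate();
            ++evaluations_;
            evaluated_ = true;
        }
        return view_;
    }

    // Called by the framework at the end of each event.
    void EndOfEvent() override {
        if (persistent_) return;  // persistent factories keep their objects
        owned_.clear();
        view_.clear();
        evaluated_ = false;
    }

    int evaluations() const { return evaluations_; }

protected:
    virtual void Evaluate() = 0;
    void Add(std::unique_ptr<T> obj) {
        view_.push_back(obj.get());
        owned_.push_back(std::move(obj));
    }

private:
    std::vector<std::unique_ptr<T>> owned_;  // the factory is the sole owner
    std::vector<const T*> view_;             // what callers receive
    bool persistent_;
    bool evaluated_ = false;
    int evaluations_ = 0;
};

class TrackFactory : public Factory<Track> {
protected:
    void Evaluate() override {
        Add(std::unique_ptr<Track>(new Track{1.5}));
        Add(std::unique_ptr<Track>(new Track{2.5}));
    }
};

class EventLoop {
public:
    template <typename T>
    void AddFactory(std::unique_ptr<Factory<T>> factory) {
        factories_[std::type_index(typeid(T))] = std::move(factory);
    }

    // The requested type is deduced from the vector's element type.
    template <typename T>
    void Get(std::vector<const T*>& out) {
        auto it = factories_.find(std::type_index(typeid(T)));
        assert(it != factories_.end() && "no factory registered for this type");
        out = static_cast<Factory<T>&>(*it->second).Get();
    }

    void EndOfEvent() {
        for (auto& entry : factories_) entry.second->EndOfEvent();
    }

private:
    std::map<std::type_index, std::unique_ptr<FactoryBase>> factories_;
};
```

With this layout, a caller writes `std::vector<const Track*> tracks; loop.Get(tracks);` and cannot modify or delete what it receives, while a second request within the same event returns the same pointers without re-running the factory.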

9 Threads in JANA: Each thread in JANA is composed of its own event processing loop and a complete set of factories. Reconstruction of a given event is done entirely inside a single thread. No mutex locking is required by authors of reconstruction code. Threads work asynchronously to maximize rates, at the expense of not maintaining the event order on output. [Figure labels: raw data read in; reconstructed values written out (e.g. ROOT tree)]
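A small sketch of the threading layout described on this slide, using only the C++ standard library. It is an illustration under stated assumptions, not JANA's implementation: `EventSource`, `Factories`, `Result`, `ProcessEvents`, and `SortByEventNumber` are hypothetical names, and the "reconstruction" is a placeholder calculation. The locks live only in the shared input and output stages; the per-thread reconstruction code takes none.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical event source shared by all threads. The lock lives here, in
// the framework layer, so reconstruction code never has to take one.
class EventSource {
public:
    explicit EventSource(int n_events) : n_events_(n_events) {}

    // Hands out the next event number; returns false when none are left.
    bool Next(int& event_number) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (next_ >= n_events_) return false;
        event_number = next_++;
        return true;
    }

private:
    std::mutex mutex_;
    int next_ = 0;
    int n_events_;
};

// Stand-in for one thread's private, complete set of factories.
struct Factories {
    int Reconstruct(int event_number) const { return 2 * event_number; }
};

struct Result {
    int event_number;
    int value;
};

// Runs n_threads event loops over n_events events. Output order is not
// guaranteed to match input order.
std::vector<Result> ProcessEvents(int n_events, int n_threads) {
    EventSource source(n_events);
    std::vector<Result> output;
    std::mutex output_mutex;

    auto event_loop = [&]() {
        Factories factories;  // one set per thread, never shared
        int event_number = 0;
        while (source.Next(event_number)) {
            // The whole event is reconstructed inside this thread.
            Result r{event_number, factories.Reconstruct(event_number)};
            std::lock_guard<std::mutex> lock(output_mutex);
            output.push_back(r);
        }
    };

    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) threads.emplace_back(event_loop);
    for (auto& t : threads) t.join();
    return output;
}

// A consumer that needs the original order can restore it afterwards.
void SortByEventNumber(std::vector<Result>& results) {
    std::sort(results.begin(), results.end(),
              [](const Result& a, const Result& b) {
                  return a.event_number < b.event_number;
              });
}
```

Calling `ProcessEvents(1000, 4)` returns all 1000 results, generally in a different order from run to run, which mirrors the trade-off the slide describes between rate and event ordering.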

10 Multi-threading when CPU limited: CPU intensive jobs are the ideal application for multi-threading. [Plot title: Reconstruction of MC data, CPU bound jobs only] Blue circles are reconstruction of data from a Monte Carlo simulation. Red triangles are from a CPU-hungry speed testing plugin. Both show very good scaling of the event processing rate with the number of threads. The overall event processing rate scales linearly with the number of threads.

11 Multi-threading when I/O limited: Multiple processes trying to access different locations on the same disk compete with one another, causing the read head to physically move back and forth from one location on the disk to another. A multi-threaded application accesses a single file in sequence, reducing the number of moves the read head must make. [Plot title: No processing of event data, I/O bound jobs only] Blue circles: one multi-threaded process reading from a single file. Red triangles: multiple single-threaded processes reading different files from the same disk.

12 Features of JANA: The event processing framework JANA includes the following features. C++, object-oriented, STL. Multi-threaded: a reconstruction program can launch any number of processing threads, with each event being seen by only one thread. Plug-ins: an existing, compiled program can dynamically load other modules that extend or modify its behavior at run-time; these can supply reconstruction algorithms, event (data) sources, and event processors (i.e. the top-level “conductor”). Data on demand: modules are not “activated” unless the data they produce is requested for that particular event.
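The "data on demand" feature can be sketched as lazy evaluation with a per-event cache. The code below is only an illustration of that idea, not JANA's API: the `Event` class, its `Register`, `Get`, and `Activations` methods, the string keys, and the example modules "hits", "clusters", and "tracks" are all assumptions made for this example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-event registry of named producers ("modules").
class Event {
public:
    using Producer = std::function<std::vector<double>(Event&)>;

    void Register(const std::string& name, Producer producer) {
        producers_[name] = std::move(producer);
    }

    // Runs the producer the first time `name` is requested for this event,
    // then serves the cached result on later requests.
    const std::vector<double>& Get(const std::string& name) {
        auto cached = cache_.find(name);
        if (cached != cache_.end()) return cached->second;
        ++activations_[name];
        std::vector<double> data = producers_.at(name)(*this);
        return cache_.emplace(name, std::move(data)).first->second;
    }

    // How many times the named module actually ran.
    int Activations(const std::string& name) const {
        auto it = activations_.find(name);
        return it == activations_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, Producer> producers_;
    std::map<std::string, std::vector<double>> cache_;
    std::map<std::string, int> activations_;
};

// Three illustrative modules: "clusters" and "tracks" both depend on "hits".
Event MakeExampleEvent() {
    Event event;
    event.Register("hits", [](Event&) {
        return std::vector<double>{1.0, 2.0, 3.0};
    });
    event.Register("clusters", [](Event& e) {
        double sum = 0.0;
        for (double hit : e.Get("hits")) sum += hit;
        return std::vector<double>{sum};
    });
    event.Register("tracks", [](Event& e) {
        return std::vector<double>(e.Get("hits").size(), 0.0);
    });
    return event;
}
```

Requesting only "clusters" runs the "hits" and "clusters" producers once each and never touches "tracks", which is the behavior the slide describes as modules not being activated unless their data is requested.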

13 Summary: In the 12 GeV era, JLab expects to produce more than 5 PB/yr. Performance improvements have been shown for both CPU and I/O limited jobs using a multi-threaded event processing framework. Taking advantage of multi-core architectures requires very little effort from reconstruction code authors in a multi-threaded framework. Other JANA features not covered: automatic TTree creation; internal profiling and call graphing; Calib./Cond. DB API; …

14 Backup Slides

15 The janaroot plugin (for automatic creation of ROOT TTrees): Each data object implements a toStrings() method, which provides an expression of the data object that may not be a full representation of the object. The toStrings() mechanism was developed to allow a simple, low-level dump of objects from single events to the screen. This mechanism is leveraged by janaroot to provide a similar expression as TTrees. Each leaf is an array of size “N” to represent the N objects of this type in the event. A leaf named “N” is automatically added to each tree. An empty event tree is also created with all other trees listed as friends, so that leaves from multiple objects can be used together in expressions. Limitations mean this is not suitable for every application, but it does provide a quick, easy way for less experienced users to make plots of some reconstructed values.
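To make the toStrings() idea concrete, here is a hedged sketch of a data object that describes itself as name/value string pairs, plus a generic dump routine built on top of it. The slide only names the method; the signature used here, the `Items` alias, the `Cluster` class, and the `Dump` helper are assumptions for illustration, and no ROOT calls are shown.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Name/value pairs describing one object (assumed representation).
using Items = std::vector<std::pair<std::string, std::string>>;

// Hypothetical data object. toStrings() exposes a simple view of the object,
// which need not be a full representation of it.
struct Cluster {
    double energy;
    int n_hits;

    void toStrings(Items& items) const {
        items.emplace_back("energy", std::to_string(energy));
        items.emplace_back("n_hits", std::to_string(n_hits));
    }
};

// Generic low-level dump that works for any type providing toStrings().
// The leading "N=" line plays the role of the per-event object count.
template <typename T>
std::string Dump(const std::vector<const T*>& objects) {
    std::ostringstream out;
    out << "N=" << objects.size() << "\n";
    for (const T* obj : objects) {
        Items items;
        obj->toStrings(items);
        for (const auto& item : items) {
            out << item.first << "=" << item.second << " ";
        }
        out << "\n";
    }
    return out.str();
}
```

Because the dump code depends only on the name/value pairs, the same pairs could in principle be used to name and fill the leaves of a tree, which is the kind of reuse the slide attributes to janaroot.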

16 The janadot plugin (for creating a factory call graph): The number of calls and the amount of time spent satisfying each is reported. Objects at the bottom of the graph are (mostly) supplied by the event source. Arrows indicate the calling sequence; data flow is in the opposite direction.
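A call graph of this kind can be sketched by recording caller/callee pairs with a call count and accumulated time, then writing them out in Graphviz DOT syntax. This is a minimal illustration rather than the janadot implementation: `CallGraph`, `CallStats`, `Record`, and `ToDot` are hypothetical names, and timings are passed in by hand instead of being measured.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>

struct CallStats {
    int calls = 0;
    double ms = 0.0;  // total time spent satisfying these calls
};

// Hypothetical recorder for "who requested data from whom".
class CallGraph {
public:
    void Record(const std::string& caller, const std::string& callee, double ms) {
        CallStats& stats = edges_[std::make_pair(caller, callee)];
        ++stats.calls;
        stats.ms += ms;
    }

    // Emits one DOT edge per caller/callee pair. Arrows point from caller to
    // callee, so data flows against the arrow direction.
    std::string ToDot() const {
        std::ostringstream out;
        out << "digraph calls {\n";
        for (const auto& edge : edges_) {
            out << "  \"" << edge.first.first << "\" -> \"" << edge.first.second
                << "\" [label=\"" << edge.second.calls << " calls, "
                << edge.second.ms << " ms\"];\n";
        }
        out << "}\n";
        return out.str();
    }

private:
    std::map<std::pair<std::string, std::string>, CallStats> edges_;
};
```

The resulting text can be rendered with any Graphviz-compatible tool; the edge labels carry the same two quantities the slide mentions, the number of calls and the time spent.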

17 Important Roles of the Event Processing Framework: The framework should provide: a clear structure for modular building of reconstruction code; an easy means for swapping out modules (e.g. replacing one calorimeter clustering algorithm with another); a mechanism for moving data between modules; a standard interface to event sources (i.e. reconstruction is agnostic as to whether an event came from a file, socket, web service, etc.); a standard interface to the Calibrations and Conditions DB; and a centralized area for run-time settings with a simple access mechanism (i.e. a user can modify a setting at run-time and all modules can see it). JANA has been designed to provide all of these!

18 Threading benefits large scale processing: [Figure: cores vs. processing time on a single farm node, comparing multi-threaded and single-threaded jobs]

19 Threading benefits large scale processing: One year of GlueX data corresponds to 10k to 20k files if one file is written every 10 min.
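Reading this estimate backwards gives a feel for the data-taking time it implies. The short calculation below assumes files are written back to back at one per 10 minutes while the experiment is running; the symbols N_files and T_run are introduced here only for the arithmetic.

```latex
T_{\text{run}} = N_{\text{files}} \times 10\ \text{min}
\\[4pt]
N_{\text{files}} = 10{,}000:\quad T_{\text{run}} = 100{,}000\ \text{min} \approx 69\ \text{days}
\\[4pt]
N_{\text{files}} = 20{,}000:\quad T_{\text{run}} = 200{,}000\ \text{min} \approx 139\ \text{days}
```

Under that assumption, the quoted range corresponds to roughly 70 to 140 days of continuous data taking per year. For comparison, a full calendar year (525,600 min) of uninterrupted running would give 52,560 files.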

