Multi-threaded Event Reconstruction with JANA
David Lawrence, JLab
CHEP10, Taipei, Oct. 19, 2010
Parallel Session 18: Software Engineering, Data Stores, and Databases

CEBAF at JLab is (soon to be) a 12 GeV e- continuous wave* beam facility in Newport News, Virginia. (*2 ns bunch structure)

Data Rates in Some Modern Experiments

        Experiment   Front End    Event    L1 Trigger   Bandwidth to
                     DAQ Rate     Size     Rate         Mass Storage
JLab:   GlueX        3 GB/s       15 kB    200 kHz      300 MB/s
        CLAS12       100 MB/s     20 kB    10 kHz       100 MB/s
LHC:    ALICE        500 GB/s     2.5 MB   200 kHz      200 MB/s
        ATLAS        113 GB/s     1.5 MB   75 kHz       300 MB/s
        CMS          200 GB/s     1 MB     100 kHz      100 MB/s
        LHCb         40 GB/s      40 kB    1 MHz        100 MB/s
BNL:    STAR*        50 GB/s      80 MB    600 Hz       450 MB/s
        PHENIX**     900 MB/s     ~60 kB   ~15 kHz      450 MB/s

Sources: CHEP2007 talk; Sylvain Chapelin, private comm.
*  Jeff Landgraff, private comm., Feb. 11, 2010
** CHEP2006 talk, Martin L. Purschke

Why Multi-threading?
(5/25/10, JANA - Lawrence - CLAS12 Software Workshop)

Multi-core processors are already here and commonly used, and industry has signaled that this will be the trend for the next several years. Consequence: parallelism is required.

Maintaining a fixed memory capacity per core will become increasingly expensive due to limitations on the number of memory controllers that can be placed on a single die (pin count). Example: memory accounts for about 10-25% of system cost today, but that fraction could grow by as much as 5% per year over the next several years, eventually putting 50% of system cost toward RAM.

Factory Model

ORDER -> in stock? -> YES: deliver PRODUCT from STOCK
                   -> NO:  MANUFACTURE in the FACTORY (algorithm), then deliver

Data on demand = don't do it unless you need it.
Stock = don't do it twice.
Conservation of CPU cycles!
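The factory model above can be sketched in a few lines of C++. This is an illustrative toy, not the real JANA API: the class name, members, and the `std::vector<double>` "product" are stand-ins chosen for the example.

```cpp
#include <vector>

// Toy sketch of a JANA-style factory: the algorithm runs only on the
// first request ("data on demand"), and later requests are served from
// the cached result ("stock"), so nothing is ever computed twice.
class TrackFactory {                  // hypothetical factory
public:
    const std::vector<double>& Get() {
        if (!in_stock) {              // the "in stock?" check
            Manufacture();            // run the algorithm only when needed
            in_stock = true;          // never manufacture twice
        }
        return stock;
    }
    int times_manufactured = 0;       // instrumentation for illustration only
private:
    void Manufacture() {
        ++times_manufactured;
        stock = {1.2, 3.4};           // stand-in for reconstructed tracks
    }
    bool in_stock = false;
    std::vector<double> stock;
};
```

Calling `Get()` repeatedly returns the same cached product; only the first call pays the CPU cost.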

Complete Event Reconstruction

Event sources (HDDM file, EVIO file, ET system, web service) feed the event loop; event processors hold user-supplied code (fill histograms, write DST, L3 trigger).

The framework has a layer that directs each object request to the factory that completes it. This allows the framework to easily redirect requests to alternate algorithms specified by the user at run time. Multiple algorithms (factories) producing the same type of data objects may exist in the same program.
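The request-routing layer described above can be sketched roughly as follows. This is a toy model, not the real JANA interface: requests name a data type plus an optional tag, and the tag selects among multiple registered factories that produce the same type.

```cpp
#include <functional>
#include <map>
#include <string>

// Toy routing layer: object requests are looked up by "type:tag", so the
// user can select an alternate algorithm at run time without recompiling.
struct EventLoop {
    std::map<std::string, std::function<int()>> factories;  // "type:tag" -> algorithm
    int Get(const std::string& type, const std::string& tag = "") {
        return factories.at(type + ":" + tag)();  // redirect to the chosen factory
    }
};
```

Registering a default and a tagged alternate for the same type lets a command-line switch (or config value) pick the algorithm per request.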

Multi-threading

o Each thread has a complete set of factories, making it capable of completely reconstructing a single event.
o Factories only work with other factories in the same thread, eliminating the need for expensive mutex locking within the factories.
o All events are seen by all event processors (multiple processors can exist in a program).
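A minimal sketch of this threading model, simplified and not the real JANA code: worker threads pull event numbers from one shared lock-free counter, but all per-event state (the "factories") is private to each thread, so no mutex is ever taken inside the event-processing loop.

```cpp
#include <atomic>
#include <thread>
#include <vector>

std::atomic<int> next_event{0};    // the only shared state, lock-free
const int kTotalEvents = 100;

// Each worker owns its output vector; no locking inside the loop.
void worker(std::vector<int>& out) {
    for (;;) {
        int evt = next_event.fetch_add(1);  // claim the next event
        if (evt >= kTotalEvents) break;
        out.push_back(evt * 2);             // stand-in for reconstruction
    }
}

// Run nthreads workers to completion; return total events processed.
int process_all(int nthreads) {
    std::vector<std::vector<int>> results(nthreads);
    std::vector<std::thread> threads;
    for (int i = 0; i < nthreads; ++i)
        threads.emplace_back(worker, std::ref(results[i]));
    for (auto& t : threads) t.join();
    int n = 0;
    for (const auto& r : results) n += static_cast<int>(r.size());
    return n;
}
```

Because threads share only read-mostly inputs and an atomic counter, adding cores scales the event rate until I/O becomes the bottleneck, which matches the 48-core results shown later.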

janadot plugin: auto-generates the algorithm dependency graph.

A closer look at janadot

Testing on a 48-core "Magny Cours"

Event reconstruction using 48 processing threads on a CPU with 48 cores generally scales quite well; eventually an I/O limit will be encountered. Occasionally there were problems with inexplicably lower rates: the program appears to simply run slower while not operating any differently. It is unclear whether this is due to the hardware or the Linux kernel.

[Figure: memory usage vs. time while repeatedly running the 35-thread test. The marked area indicates one test where the program ran slower.]

Summary

The JANA framework is a multi-threaded event reconstruction framework.
- The modified factory model minimizes CPU usage within a single thread.
- Processing an event completely within a single thread minimizes expensive mutex locking.
- Multiple threads require less memory than multiple independent processes.
- Tests of event-rate scaling on a 48-core machine look very promising.

doi: / /219/4/   doi: / /119/4/

Backup Slides

Associated Objects

o A data object may be associated with any number of other data objects having a mixture of types (e.g. a calorimeter cluster associated with hits, a track, and MC-generated hit objects).
o Each data object has a list of "associated objects" that can be probed using an access mechanism similar to that for event-level object requests:

    vector<const Cluster*> clusters;
    loop->Get(clusters);
    for(unsigned int i=0; i<clusters.size(); i++){
        vector<const Hit*> hits;
        clusters[i]->Get(hits);
        // Do something with hits ...
    }

(The template arguments were lost in the transcript; the type names Cluster and Hit are inferred from the slide's diagram labels.)

Plugins

JANA supports plugins: pieces of code that can be attached to existing executables to extend or modify their behavior.

Plugins can be used to add:
- Event processors
- Event sources
- Factories (additional or replacements)

Examples:
- Plugins for creating DST skim files (reconstruction is done once, with output to multiple files):
    hd_ana --PPLUGINS=kaon_skim,ppi+pi-_skim run.evio
- Plugins for producing subsystem histograms (a single ROOT file gets histograms from several pieces of code):
    hd_root --PPLUGINS=bcal_hists,cdc_hists,tof_hists ET:GlueX
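A plugin mechanism like the one above is typically built on POSIX `dlopen`/`dlsym`: the executable loads a shared object at startup and calls a well-known entry point that registers the plugin's processors, sources, or factories. The entry-point name "InitPlugin" and its signature here are assumptions for illustration, not the documented JANA interface.

```cpp
#include <dlfcn.h>   // POSIX dynamic loading
#include <cstdio>

typedef void (*InitPlugin_t)(void);   // assumed entry-point signature

// Load a plugin shared object and invoke its registration entry point.
// Returns false if the file cannot be opened or lacks the entry point.
bool attach_plugin(const char* path) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (!handle) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return false;
    }
    auto init = reinterpret_cast<InitPlugin_t>(dlsym(handle, "InitPlugin"));
    if (!init) {
        dlclose(handle);   // no entry point: not a valid plugin
        return false;
    }
    init();  // plugin registers its processors, sources, and factories
    return true;
}
```

With this pattern the list given to --PPLUGINS can simply be split on commas and each name passed to attach_plugin() at startup.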

??

Occasionally, the event processing ran slower, with the rate falling on a different curve. The exact cause is unknown. The problem seemed to ameliorate if a lot of time had passed since booting, with other programs being run in the interim, but the evidence was inconclusive. Some PMU data was taken but has not been fully analyzed.