John Marshall, University of Cambridge. LCD WG6 Meeting, April 18 2011.


Overview

Reconstructing events with overlaid background is a challenge for our reconstruction software. Even intrinsically efficient functions are affected by the huge increase in the number of combinations of tracks and calorimeter hits.

CPU time is clearly important, so efforts have been made to address specifically the problems caused by overlaid γγ → hadrons background, with the default NumberBackground=3.2.

André has been focusing on improving the performance of processors in the MarlinReco library, whilst I have examined PandoraPFANew. There are some differences between the CPU times we report, but we are now satisfied that these are due to machine specifications rather than build configurations, etc.

André has been able to use the Intel VTune Amplifier XE 2011 package, which provides an impressive amount of profiling information. It can report the actual CPU time required by each function and can even provide a line-by-line breakdown of CPU time.

We start with a reminder of performance at the time of the previous meeting.

Status at last meeting

Processor name          Seconds per event (10-event sample), 01/04/2011*
MarlinPandora
FullLDCTracking
LCIOOutputProcessor     7.964
LEPTrackingProcessor    7.039
SiliconTrackingCLIC     3.579
TPCDigiProcessor        2.107
KinkFinder              0.325
V0Finder                0.244
RecoMCTruthLinker       0.197
ILDCaloDigi             0.097
Total

Division of total CPU time between Marlin processors for ten 91 GeV Z → uds events with overlaid γγ → hadrons background and NumberBackground=3.2. Without background, the total CPU time is just 0.565 s per event.

* MarlinReco revision 2151, PandoraPFANew revision 1100

Costly functions

Function                                                 CPU time 01/04/2011
ClusterHelper::GetTrackClusterDistance                   42.839s
IsolatedHitMergingAlgorithm::GetDistanceToHit            32.070s
ConeClusteringAlgorithm::GetGenericDistanceToHit         29.802s
ConeClusteringAlgorithm::GetDistanceToHitInSameLayer     20.070s
CartesianVector::GetZ                                    16.702s
ClusterHelper::GetDistanceToClosestHit                   15.620s
ClusterContact::HitDistanceComparison                    11.077s
Cluster::GetCentroid                                     8.950s
CaloHitHelper::GetDensityWeightContribution              6.739s
operator- (CartesianVector)                              6.562s
ConeClusteringAlgorithm::FindHitsInSameLayer             5.399s
TrackClusterAssociationAlgorithm::Run                    4.279s
CaloHitHelper::IsolationCountNearbyHits                  4.139s
ConeClusteringAlgorithm::GetConeApproachDistanceToHit    3.870s
ClusterHelper::GetDistanceToClosestCentroid              3.661s
ConeClusteringAlgorithm::GetConeApproachDistanceToHit    3.330s
FragmentRemovalHelper::GetClusterContactDetails          2.591s
ClusterHelper::GetTrackClusterDistance                   2.461s
CartesianVector::GetUnitVector                           2.362s

The TestPandora application was used with input Pandora binary files to perform standalone Pandora reconstruction and concentrate purely on PandoraPFANew. MarlinPandora is not considered.

Reduce function calls

The most costly function is GetTrackClusterDistance, used to help identify track-cluster associations. With background, this function is called for many track-cluster combinations. For each combination, it examines hits in the first n cluster layers to find the closest perpendicular distance between a straight line (defined by the track state at the calorimeter) and a hit in the cluster.

After basic C++ optimization, it is difficult to reduce the CPU time further without changing the function's behaviour. Instead, try to avoid comparing tracks and clusters with very different “expected directions”. Similar cuts are implemented in the cone-based clustering algorithms and in the SoftClusterMerging, IsolatedHitMerging and FragmentRemoval algorithms.

This is potentially dangerous, but the cut values are configurable and the default cut, cos(angle) > 0, should be safe. Validation is crucial.

[Figure: track direction and parallel distance region; find the smallest perpendicular distance to a hit within the parallel distance region]
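
The direction cut and the perpendicular-distance search described on this slide can be sketched roughly as follows. This is a minimal illustration, not the PandoraPFANew code: the `Vec3` type, the function names, the flat list of hits (standing in for "hits in the first n cluster layers") and the choice of [0, maxParallelDistance] as the parallel distance region are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical minimal 3-vector; not the Pandora CartesianVector class.
struct Vec3 { float x, y, z; };

inline Vec3 Subtract(const Vec3 &a, const Vec3 &b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float Dot(const Vec3 &a, const Vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 Cross(const Vec3 &a, const Vec3 &b)
{
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
inline float Magnitude(const Vec3 &a) { return std::sqrt(Dot(a, a)); }

// Cheap pre-cut: true if cos(angle between the two directions) exceeds minCosAngle.
// The default of 0 mirrors the "cos(angle) > 0" cut quoted on the slide.
inline bool PassesDirectionCut(const Vec3 &trackDir, const Vec3 &clusterDir, float minCosAngle = 0.f)
{
    return Dot(trackDir, clusterDir) > minCosAngle * Magnitude(trackDir) * Magnitude(clusterDir);
}

// Smallest perpendicular distance from the straight line (trackPos, trackDir) to any hit whose
// parallel distance along the line lies in [0, maxParallelDistance] (assumed region definition).
// Returns +infinity if the direction cut fails or if no hit lies in the region.
float TrackClusterDistance(const Vec3 &trackPos, const Vec3 &trackDir, const Vec3 &clusterDir,
    const std::vector<Vec3> &hits, float maxParallelDistance, float minCosAngle = 0.f)
{
    const float infinity = std::numeric_limits<float>::infinity();
    const float dirMagnitude = Magnitude(trackDir);
    assert(dirMagnitude > 0.f);

    // Skip the expensive hit loop for track-cluster pairs with very different directions.
    if (!PassesDirectionCut(trackDir, clusterDir, minCosAngle))
        return infinity;

    float best = infinity;
    for (const Vec3 &hit : hits)
    {
        const Vec3 displacement = Subtract(hit, trackPos);
        const float parallel = Dot(displacement, trackDir) / dirMagnitude;
        if (parallel < 0.f || parallel > maxParallelDistance)
            continue;

        const float perpendicular = Magnitude(Cross(displacement, trackDir)) / dirMagnitude;
        if (perpendicular < best)
            best = perpendicular;
    }
    return best;
}
```

The point of the sketch is that the cut costs one dot product per track-cluster pair, whereas the loop it avoids costs a cross product and a square root per hit.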

Change approach

Another costly function is used by the IsolatedHitMerging algorithm, which matches isolated hits to nearby clusters based on the distance to the nearest hit in the cluster. This algorithm matters, but it is still a rather small part of Pandora reconstruction. That it is one of the most time-consuming processes justifies a change in approach.

Isolated hits are now matched to clusters based upon the distance to the nearest layer centroid position. This avoids the nested loop over the hits in each layer. It is a small change to behaviour, and it is not obvious whether the result is better or worse. Again, validation is crucial.

[Figure: get distance to nearest layer centroid (new) versus get distance to nearest hit (old)]
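
The difference between the two matching strategies can be illustrated with a small sketch. The types and function names below are hypothetical, and the centroids are plain unweighted means, which is an assumption; the slides do not say how Pandora computes layer centroids.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <map>
#include <vector>

// Hypothetical hit position; not a Pandora type.
struct Point { float x, y, z; };

inline float DistanceSquared(const Point &a, const Point &b)
{
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Hypothetical cluster representation: hits grouped by layer index.
using LayerToHits = std::map<unsigned int, std::vector<Point>>;

// Old approach: nested loop over every hit in every layer of the cluster.
float DistanceToNearestHit(const Point &isolatedHit, const LayerToHits &cluster)
{
    float bestSquared = std::numeric_limits<float>::infinity();
    for (const auto &layer : cluster)
        for (const Point &hit : layer.second)
            bestSquared = std::min(bestSquared, DistanceSquared(isolatedHit, hit));
    return std::sqrt(bestSquared);
}

// One (unweighted) centroid per layer; computed once per cluster and then reused.
std::vector<Point> GetLayerCentroids(const LayerToHits &cluster)
{
    std::vector<Point> centroids;
    for (const auto &layer : cluster)
    {
        assert(!layer.second.empty());
        Point sum{0.f, 0.f, 0.f};
        for (const Point &hit : layer.second)
        {
            sum.x += hit.x;
            sum.y += hit.y;
            sum.z += hit.z;
        }
        const float nHits = static_cast<float>(layer.second.size());
        centroids.push_back(Point{sum.x / nHits, sum.y / nHits, sum.z / nHits});
    }
    return centroids;
}

// New approach: a single loop over layer centroids, with no inner loop over hits.
float DistanceToNearestCentroid(const Point &isolatedHit, const std::vector<Point> &centroids)
{
    float bestSquared = std::numeric_limits<float>::infinity();
    for (const Point &centroid : centroids)
        bestSquared = std::min(bestSquared, DistanceSquared(isolatedHit, centroid));
    return std::sqrt(bestSquared);
}
```

For an isolated hit at (1, 3, 0) and a layer containing hits at (0, 0, 0) and (2, 0, 0), the nearest-hit distance is √10 while the nearest-centroid distance is 3, which shows why the change alters behaviour slightly and needs validation.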

Change approach

The CartesianVector class is crucial to Pandora reconstruction and is used extensively throughout all algorithms. Even small efficiency improvements to this class can help.

Previously, this class offered a default constructor:

    inline CartesianVector::CartesianVector() : m_x(0.f), m_y(0.f), m_z(0.f), m_isInitialized(false) { }

This meant that each instance needed an initialization flag, set to true only when explicit component values were assigned. The flag needed to be checked in most member functions.

Removal of the default constructor means that the fully qualified constructor must be used:

    inline CartesianVector::CartesianVector(float x, float y, float z) : m_x(x), m_y(y), m_z(z) { }

There is no longer any need for the initialization flag, which removes checks from many important functions: GetDotProduct, GetCrossProduct, GetMagnitude, GetOpeningAngle, GetX,Y,Z,...
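
As an illustration of the design, a vector class with only the three-argument constructor might look like the sketch below. The class name `CartesianVectorSketch` and the method bodies are assumptions for the example and are not taken from PandoraPFANew; only the idea (no default constructor, hence no initialization flag to check in the accessors) comes from the slide.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Illustrative sketch only; not the Pandora CartesianVector implementation.
class CartesianVectorSketch
{
public:
    // The only constructor: every instance is fully initialized on creation.
    CartesianVectorSketch(float x, float y, float z) : m_x(x), m_y(y), m_z(z) {}

    // Accessors return directly, with no "is initialized?" check.
    float GetX() const { return m_x; }
    float GetY() const { return m_y; }
    float GetZ() const { return m_z; }

    float GetDotProduct(const CartesianVectorSketch &rhs) const
    {
        return m_x * rhs.m_x + m_y * rhs.m_y + m_z * rhs.m_z;
    }

    CartesianVectorSketch GetCrossProduct(const CartesianVectorSketch &rhs) const
    {
        return CartesianVectorSketch(m_y * rhs.m_z - m_z * rhs.m_y,
            m_z * rhs.m_x - m_x * rhs.m_z, m_x * rhs.m_y - m_y * rhs.m_x);
    }

    float GetMagnitude() const { return std::sqrt(this->GetDotProduct(*this)); }

    CartesianVectorSketch GetUnitVector() const
    {
        const float magnitude = this->GetMagnitude();
        assert(magnitude > 0.f);
        return CartesianVectorSketch(m_x / magnitude, m_y / magnitude, m_z / magnitude);
    }

private:
    float m_x, m_y, m_z; // no m_isInitialized member is needed
};

// With no default constructor, an uninitialized vector cannot be created by accident.
static_assert(!std::is_default_constructible<CartesianVectorSketch>::value,
    "CartesianVectorSketch must be constructed from explicit components");
```

A side effect of this design is that code such as `CartesianVectorSketch v;` fails at compile time, so the check moves from every accessor call at run time to a single check by the compiler.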

General optimization

Two of the functions badly affected by the increased calorimeter occupancies are those used to calculate the “density weight” and “surrounding energy” values for each hit. These quantities are intended for use with digital calorimeters and are not actually used in CLIC_CDR reconstruction. Entries can be added to PandoraSettings to skip these calculations:

    … false …

Finally, a general optimization of the remaining costly functions was attempted: avoid square roots, avoid trigonometric functions and simply avoid unnecessary instructions. However, not much was gained here, as the functions were already designed for efficiency.

Some further savings are still possible, but they now need more aggressive changes. Such changes are likely to make the code less readable and maintainable (e.g. repeated code to avoid function calls) and/or to change the physics output (requiring step-by-step validation).
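
Two common ways of avoiding square roots and trigonometric functions in distance and angle cuts are sketched below. The slides do not show which rewrites were actually made in Pandora, so the functions and names here are purely illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal 3-vector; not a Pandora type.
struct Vec3 { float x, y, z; };

inline float Dot(const Vec3 &a, const Vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Distance cut without a square root: compare squared quantities instead of
// computing std::sqrt(distanceSquared) <= maxDistance.
inline bool IsWithinDistance(const Vec3 &a, const Vec3 &b, float maxDistance)
{
    assert(maxDistance >= 0.f);
    const Vec3 d{a.x - b.x, a.y - b.y, a.z - b.z};
    return Dot(d, d) <= maxDistance * maxDistance;
}

// Opening-angle cut without acos or sqrt: cos(angle) >= minCosAngle is equivalent to
// dot >= 0 and dot^2 >= minCosAngle^2 * |a|^2 * |b|^2 when 0 <= minCosAngle <= 1.
// Assumes both vectors are non-zero.
inline bool IsWithinOpeningAngle(const Vec3 &a, const Vec3 &b, float minCosAngle)
{
    assert(minCosAngle >= 0.f && minCosAngle <= 1.f);
    const float dot = Dot(a, b);
    if (dot < 0.f)
        return false;
    return dot * dot >= minCosAngle * minCosAngle * Dot(a, a) * Dot(b, b);
}
```

Both rewrites leave the decision unchanged apart from floating-point rounding at the boundary, which is why this kind of change is usually safe for physics output while still needing a validation pass.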

Impact of changes

Function                                                 CPU time 01/04/2011   CPU time 15/04/2011*
ConeClusteringAlgorithm::GetGenericDistanceToHit         29.802s               28.010s
IsolatedHitMergingAlgorithm::GetDistanceToHit            32.070s               15.310s
ClusterHelper::GetTrackClusterDistance                   42.839s               10.450s
ClusterHelper::GetDistanceToClosestHit                   15.620s               10.150s
ConeClusteringAlgorithm::GetDistanceToHitInSameLayer     20.070s               9.742s
ClusterContact::HitDistanceComparison                    11.077s               9.089s
Cluster::GetCentroid                                     8.950s                8.730s
CaloHitHelper::GetDensityWeightContribution              6.739s                (6.230s)
CartesianVector::GetCosOpeningAngle                      0.190s                5.540s
ConeClusteringAlgorithm::FindHitsInSameLayer             5.399s                5.158s
CaloHitHelper::IsolationCountNearbyHits                  4.139s                4.040s
ClusterHelper::GetDistanceToClosestCentroid              3.661s                3.019s
CartesianVector::GetUnitVector                           0.070s                2.430s
ConeClusteringAlgorithm::GetConeApproachDistanceToHit    3.870s                2.160s
FragmentRemovalHelper::GetClusterContactDetails          2.591s                1.870s
ConeClusteringAlgorithm::FindHitsInPreviousLayers        2.360s                1.691s
CaloHitHelper::MipCountNearbyHits                        1.480s                1.660s
TrackClusterAssociationAlgorithm::Run                    4.279s                1.600s
CaloHitHelper::GetSurroundingEnergyContribution          1.781s                (1.529s)

Analysis of PandoraPFANew after the efficiency improvements: it is interesting to see how the load has been redistributed. There is a large overall decrease in CPU time.

* MarlinReco revision 2179, PandoraPFANew revision 1137

MarlinReco (A. Sailer)

Since the previous meeting, André’s examination of MarlinReco has focused on FullLDCTracking and, in particular, the assignment of TPC hits to tracks:

MarlinReco (A. Sailer)

[Figures: MarlinReco revision 2161; MarlinReco revision 2162]

Current status

Processor name          Seconds per event, 01/04/2011   Seconds per event, 15/04/2011
MarlinPandora
FullLDCTracking
LCIOOutputProcessor
LEPTrackingProcessor
SiliconTrackingCLIC
TPCDigiProcessor
KinkFinder
V0Finder
RecoMCTruthLinker
ILDCaloDigi
Total

Great success in improving the efficiency of the reconstruction software in the presence of background. For Pandora, the “first pass” of efficiency improvements is declared complete. There are still some gains to be made, but further changes are becoming difficult.

Validation

E_j                  45 GeV   100 GeV   250 GeV   500 GeV
Status at 01/04/     ±        ±         ±         ± 0.06
Status at 15/04/     ±        ±         ±         ± 0.06

Efficiency changes were carefully implemented to avoid affecting physics output. Have confirmed that Pandora jet energy reconstruction performance is unaffected. Have also examined a number of low- and high-energy single particle files to help confirm that particle id performance is unaffected. Jacopo has performed a full validation of particle id and reported good results.

All efficiency improvements are now in Ilcsoft v01-11 pre-release 04. Remember to use up-to-date steering files. In particular, there are changes to the PandoraSettings.xml file. No other changes for CLIC_ILD or CLIC_SiD.

Have hopefully saved many CPU cycles!