1
AliRoot - Pub/Sub Framework Analysis Component Interface
Matthias Richter, Sebastian Kalcher, Jochen Thäder & Timm M. Steinbeck
2
Overview
The system consists of three main parts:
- A C++ shared library with a component handler class
  - Compiled and callable directly from AliRoot
- A number of C++ shared libraries with the actual reconstruction components themselves
  - Compiled as part of AliRoot and directly callable from it
- A C wrapper API that provides access to the component handler and the reconstruction components (see the sketch below)
  - Contained in the component handler shared library
  - Called by the Pub/Sub wrapper component
  - Makes Pub/Sub and AliRoot compiler independent: binary compatibility, no recompile of the reconstruction code
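The exact wrapper functions are not shown on the slides; the following is a minimal sketch of what such a compiler-independent C wrapper API could look like (all function and type names are illustrative assumptions):

```cpp
// Hypothetical C wrapper API exposed by the component handler shared
// library. Plain C types and extern "C" linkage avoid C++ name
// mangling, so Pub/Sub and AliRoot can be built with different
// compilers (binary compatibility).
#ifdef __cplusplus
extern "C" {
#endif

typedef void* ComponentHandle;  /* opaque handle to a component object */

/* Load a shared component library; its global components register. */
int HLT_LoadComponentLibrary(const char* libraryPath);

/* Look up a registered component by its ID string. */
int HLT_GetComponent(const char* componentId, ComponentHandle* handle);

/* Initialize a component, process one event, and clean up. */
int HLT_InitComponent(ComponentHandle handle, int argc, const char** argv);
int HLT_ProcessEvent(ComponentHandle handle,
                     const void* inputBlocks, unsigned nInputBlocks,
                     void* outputBuffer, unsigned long* outputSize);
int HLT_DeinitComponent(ComponentHandle handle);

#ifdef __cplusplus
}
#endif
```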
3
Overview
[Architecture diagram: the HLT TPC shared library registers its global Clusterfinder and Tracker objects with the component handler. AliRoot drives the component handler C++ class directly (load component library; initialize components; get components) and then calls Initialize/ProcessEvent on the Clusterfinder and Tracker C++ classes. The Pub/Sub framework's wrapper processing component instead goes through the component handler C wrapper functions (load component library; get component; initialize component; process event).]
4
Components
Components have to implement a set of abstract functions:
- Return ID (string)
- Return set of required input data types
- Return produced output data type(s)
- Process one event
This is close to what is needed for Pub/Sub components, but simplified. One global instance of each component has to be present in the shared component library; it registers itself automatically with the global component handler object (see the sketch below).
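The class names AliHLTComponentHandler and AliHLTComponent appear on the later sequence-diagram slides; a minimal sketch of what the component interface described here could look like, with all member-function names and signatures assumed for illustration:

```cpp
#include <string>
#include <vector>

// Sketch of the component interface; the member functions below are
// illustrative, not the actual AliRoot HLT interface.
class AliHLTComponent {
public:
  virtual ~AliHLTComponent() {}

  // The abstract functions every component has to implement:
  virtual const char* GetComponentID() = 0;                  // ID string
  virtual std::vector<std::string> GetInputDataTypes() = 0;  // inputs
  virtual std::string GetOutputDataType() = 0;               // output(s)
  virtual int ProcessEvent(const void* input, unsigned long inputSize,
                           void* output, unsigned long& outputSize) = 0;

protected:
  // Registering in the constructor makes one global instance per
  // component in the shared library sufficient: registration happens
  // automatically when the library is loaded.
  AliHLTComponent() { Register(this); }
  static void Register(AliHLTComponent* comp) {
    (void)comp;  // real code would add 'comp' to the global handler
  }
};

// Example component with its global registration instance.
class TPCClusterFinderComponent : public AliHLTComponent {
public:
  const char* GetComponentID() override { return "TPCClusterFinder"; }
  std::vector<std::string> GetInputDataTypes() override {
    return {"TPC raw data"};
  }
  std::string GetOutputDataType() override { return "TPC clusters"; }
  int ProcessEvent(const void*, unsigned long, void*,
                   unsigned long& outputSize) override {
    outputSize = 0;  // real code would run the cluster finder here
    return 0;
  }
};

TPCClusterFinderComponent gTPCClusterFinder;  // global instance
```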
5
AliRoot
- AliRoot code accesses the classes in the component handler shared library
- Obtains component objects from the handler class
- Accesses the component objects directly
[Diagram: AliRoot drives the component handler C++ class (load component library; initialize components; get components); the handler loads the HLT TPC shared library, whose global Clusterfinder and Tracker objects register themselves; AliRoot then calls Initialize and ProcessEvent on the Clusterfinder and Tracker C++ classes directly.]
A sketch of this access pattern follows.
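Building on the component sketch above, a hedged sketch of the AliRoot-side usage; the handler method names are assumptions, not the actual AliRoot API:

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical component handler, using the AliHLTComponent interface
// sketched earlier; method names are illustrative.
class AliHLTComponentHandler {
public:
  int LoadComponentLibrary(const char* path) {
    // Real code would dlopen() the library; its global component
    // objects then register themselves with this handler.
    (void)path;
    return 0;
  }
  void RegisterComponent(const std::string& id, AliHLTComponent* comp) {
    fComponents[id] = comp;
  }
  AliHLTComponent* FindComponent(const std::string& id) {
    auto it = fComponents.find(id);
    return it == fComponents.end() ? nullptr : it->second;
  }
private:
  std::map<std::string, AliHLTComponent*> fComponents;
};

// AliRoot-side usage: load the library, get the component object from
// the handler, then call it directly (no C wrapper involved).
int RunFromAliRoot(AliHLTComponentHandler& handler) {
  if (handler.LoadComponentLibrary("libAliHLTTPC.so") != 0) return -1;

  AliHLTComponent* finder = handler.FindComponent("TPCClusterFinder");
  if (!finder) return -1;

  unsigned long outputSize = 0;
  finder->ProcessEvent(nullptr, 0, nullptr, outputSize);
  std::printf("output size: %lu bytes\n", outputSize);
  return 0;
}
```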
6
Publisher/Subscriber
The Pub/Sub framework uses ONE wrapper component:
- Accesses the handler and the components via the C wrapper API
- Can call multiple components in different libraries
- One component per wrapper instance
[Diagram: the Pub/Sub wrapper processing component calls the component handler C wrapper functions (load component library; get component; initialize component; process event); the handler C++ class loads the HLT TPC shared library, whose global Clusterfinder and Tracker objects register themselves, and Initialize/ProcessEvent are then invoked on the Clusterfinder and Tracker C++ classes.]
A sketch of the wrapper-side calls follows.
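A minimal sketch of the wrapper component's calls, reusing the illustrative C API from the overview sketch (the real Pub/Sub wrapper interface is not shown on the slides):

```cpp
// Hypothetical Pub/Sub wrapper: one component instance per wrapper,
// driven entirely through the C wrapper API, so the wrapper has no
// dependency on the component library's C++ ABI.
int WrapperProcessEvents(const char* library, const char* componentId) {
  if (HLT_LoadComponentLibrary(library) != 0) return -1;

  ComponentHandle component = nullptr;
  if (HLT_GetComponent(componentId, &component) != 0) return -1;
  if (HLT_InitComponent(component, 0, nullptr) != 0) return -1;

  // Event loop: in the real framework the Pub/Sub machinery delivers
  // the input blocks and collects the output buffers.
  unsigned long outputSize = 0;
  HLT_ProcessEvent(component, nullptr, 0, nullptr, &outputSize);

  return HLT_DeinitComponent(component);
}
```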
7
Publisher/Subscriber
Initialization Sequence
[Sequence diagram; participants: AliRootWrapperSubscriber, C Wrapper, AliHLTComponentHandler, AliHLTComponent]
8
Publisher/Subscriber
Processing Sequence
[Sequence diagram; participants: AliRootWrapperSubscriber, C Wrapper, AliHLTComponentHandler, AliHLTComponent]
9
Publisher/Subscriber
Termination Sequence
[Sequence diagram; participants: AliRootWrapperSubscriber, C Wrapper, AliHLTComponentHandler, AliHLTComponent]
10
Current Status
- Basic implementation done
  - Base library with ComponentHandler and Component base class implemented
  - Pub/Sub wrapper component done and working
  - HLT TPC reconstruction code ported and working
- Basic AliRoot HLT configuration scheme implemented
- Ongoing work on integrating the ComponentHandler into the data processing scheme of AliRoot
11
ClusterFinder Benchmarks
pp events, 14 TeV, 0.5 T
Number of events: 1200; iterations: 100
Test bench: SimpleComponentWrapper (see the timing-loop sketch below)
Test nodes:
- HD cluster nodes e304, e307 (PIII, 733 MHz)
- HD cluster nodes e106, e107 (PIII, 800 MHz)
- HD gateway node alfa (PIII, 1.0 GHz)
- HD cluster node eh001 (Opteron, 1.6 GHz)
- CERN cluster node eh000 (Opteron, 1.8 GHz)
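The benchmark driver itself is not shown on the slides; a minimal sketch of a timing loop in the spirit of the SimpleComponentWrapper test bench, reusing the illustrative C API from above (event feeding omitted):

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical timing loop: process the same set of events for a
// number of iterations and report the average wall-clock time per
// event in milliseconds.
double BenchmarkComponent(ComponentHandle component,
                          int nEvents, int nIterations) {
  using clock = std::chrono::steady_clock;
  const auto start = clock::now();
  for (int it = 0; it < nIterations; ++it) {
    for (int ev = 0; ev < nEvents; ++ev) {
      unsigned long outputSize = 0;
      // Real code would feed the recorded data for event 'ev'.
      HLT_ProcessEvent(component, nullptr, 0, nullptr, &outputSize);
    }
  }
  const std::chrono::duration<double, std::milli> elapsed =
      clock::now() - start;
  return elapsed.count() / (double(nEvents) * nIterations);  // ms/event
}
```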
12
Cluster Distribution
13
Signal Distribution
14
File Size Distribution
15
Total Distributions
16
Padrows & Pads per Patch
17
Basic Results

Patch     # Clusters   # Signals   Filesize [Byte]
0         60           4313        12892
1         61           5683        17525
2         44           3985        12233
3         35           4264        13437
4         29           4210        13384
5         23           4149        13312
Average   42           4434        13797   (per patch, per event)
18
Timing Results

Cluster finder processing time per event [ms]:

CPU               Patch 0   Patch 1   Patch 2   Patch 3   Patch 4   Patch 5   Average
PIII 733 MHz      6.57      8.82      6.14      6.67      6.61      6.54      6.90
PIII 800 MHz      6.04      8.10      5.64      6.12      6.06      6.01      6.33
PIII 1.0 GHz      4.95      6.65      4.51      4.90      4.87      4.81      5.11
Opteron 1.6 GHz   –         3.92      2.73      2.96      2.93      2.90      3.06
Opteron 1.8 GHz   3.96      5.32      3.66      3.98      3.94      3.99      4.13
Xeon IV 3.2 GHz   2.11      2.79      1.98      2.14      2.13      2.21      –
19
Timing Results
20
Timing Results
Memory streaming benchmarks:
- 1.6 GHz Opteron system: ca. 4.3 GB/s
- 1.8 GHz Opteron system: ca. 3 GB/s
- Reason for the performance drop of the 1.8 GHz system compared to the 1.6 GHz system
- Cause of the memory performance difference unknown, currently being investigated
- Maybe related to NUMA parameters (cf. slide 23)
A sketch of such a streaming measurement follows.
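The streaming benchmark code is not part of the slides; a minimal sketch of how such a GB/s figure can be measured with a simple copy kernel (buffer size and repeat count are arbitrary assumptions):

```cpp
#include <chrono>
#include <cstdio>
#include <cstring>
#include <vector>

// Measure sustained memory bandwidth with a large copy: the buffers
// are far bigger than the caches, so the loop is bound by main memory.
int main() {
  const size_t bytes = 256u * 1024 * 1024;  // 256 MiB per buffer
  const int repeats = 10;
  std::vector<char> src(bytes, 1), dst(bytes, 0);

  using clock = std::chrono::steady_clock;
  const auto start = clock::now();
  for (int i = 0; i < repeats; ++i)
    std::memcpy(dst.data(), src.data(), bytes);
  const std::chrono::duration<double> elapsed = clock::now() - start;

  // Each memcpy reads and writes 'bytes', i.e. moves 2*bytes of data.
  const double gbPerS = 2.0 * bytes * repeats / elapsed.count() / 1e9;
  std::printf("memory bandwidth: %.2f GB/s\n", gbPerS);
  return 0;
}
```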
21
Tracker Timing Results
Slice tracker, average times per slice (a factor is the runtime relative to a single process; e.g. factor 1.83 with 4 processes means each process runs about 1.83× slower, for an overall throughput gain of roughly 4/1.83 ≈ 2.2):
Opteron 1.8 GHz (dual MP, dual core):
- 1 process: ca. 3.6 ms/slice, independent of CPU
- 2 processes, different chips: ca. factor 1 (no slowdown)
- 2 processes, same chip, different cores: ca. factor 1.75
- 4 processes, all cores: ca. factor 1.83
Xeon 3.2 GHz (dual MP, HyperThreading; mapping to CPUs unknown for more than 1 process):
- 1 process: ca. … ms/slice
- 2 processes: ca. factor 2 slower
- 3 processes: ca. factor 3.5 slower
22
Timing Results – Opteron Memory
Floating point/memory microbenchmarks (runtime factors are for two processes on the same chip, two processes on different chips, and four processes on all cores, each relative to a single process; a sketch of such kernels follows below):
- CPU loop: no effect with multiple processes
- Linear memory read: almost no effect
- Random memory read: runtime factors 1.33, 1.01, 1.43
- Linear memory read and write: runtime factors 1.57, 1.12, 2.31
- Random memory read and write: runtime factors 1.91, 1.92, 2.78
- Linear memory write: runtime factors 1.71, 1.72, 3.48
- Random memory write: runtime factors 1.97, 1.90, 3.76
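The microbenchmark kernels themselves are not shown on the slides; a hedged sketch of linear vs. random write passes over a cache-exceeding buffer (the multi-process factors would be obtained by running several copies of the binary concurrently):

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

// Time one pass over a buffer much larger than the caches, writing
// either linearly or at pre-generated random positions.
static double TimeWrites(std::vector<uint64_t>& buf,
                         const std::vector<size_t>& order) {
  using clock = std::chrono::steady_clock;
  const auto start = clock::now();
  for (size_t i = 0; i < order.size(); ++i)
    buf[order[i]] = i;  // write access at the given position
  const std::chrono::duration<double> dt = clock::now() - start;
  return dt.count();
}

int main() {
  const size_t n = 64u * 1024 * 1024 / sizeof(uint64_t);  // 64 MiB
  std::vector<uint64_t> buf(n, 0);

  std::vector<size_t> linear(n);
  std::iota(linear.begin(), linear.end(), 0);  // 0, 1, 2, ...

  std::vector<size_t> random = linear;
  std::shuffle(random.begin(), random.end(), std::mt19937_64(42));

  std::printf("linear writes: %.3f s\n", TimeWrites(buf, linear));
  std::printf("random writes: %.3f s\n", TimeWrites(buf, random));
  return 0;
}
```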
23
Timing Results – Opteron Memory
Floating point/memory microbenchmarks:
- Memory results, in particular memory writes, are the likely explanation for the tracker behaviour
- Tasks:
  - Examine system memory parameters (e.g. BIOS, Linux kernel)
  - One critical parameter already found: kernel NUMA awareness was not activated
  - Re-evaluate/optimize tracker code with respect to memory writes
  - Likely problem already found: conformal mapping uses a large memory array with widely (quasi-randomly) distributed write and read accesses
- Lesson for system procurement: if possible, evaluate systems/architectures with respect to pure performance AND scalability
24
Price Comparison
Opteron 1.8 GHz:
- Single core: ca. 180,- €
- Dual core: ca. 350,- €
Xeon 3.2 GHz:
- Single core: ca. 330,- €
Mainboard prices comparable: ca. 350-450,- € for dual-MP-, dual-core-capable boards.
For Opterons, per-core prices for full systems (assumption: 1 GB memory per core):
- 1.8 GHz single/dual core, dual MP: ca. 800/600,- € per core
- 2.4 GHz single/dual core, dual MP: ca. 1000/880,- € per core
- 2.4 GHz single/dual core, quad MP: ca. 1700/1250,- € per core