Fast Parallel Event Reconstruction
Ivan Kisel, GSI, Darmstadt
CERN, 06 July 2010
Tracking Challenge in CBM (FAIR/GSI, Germany)
- Fixed-target heavy-ion experiment: 10^7 Au+Au collisions/s, ~1000 charged particles/collision.
- Non-homogeneous magnetic field.
- Double-sided strip detectors (85% combinatorial space points).
- Track reconstruction in the STS/MVD and displaced-vertex search are required at the first trigger level.
- Reconstruction packages: track finding with a Cellular Automaton (CA), track fitting with a Kalman Filter (KF), vertexing with KF Particle.
Many-Core HPC: Cores, Threads and SIMD
- HEP: cope with high data rates!
- Cores and threads realize the task level of parallelism; vectors (SIMD = Single Instruction, Multiple Data) realize the data level of parallelism (a minimal illustration follows below).
[Diagram: growth of cores, threads per core and SIMD width per CPU from 2000 to 2015; performance comes from all three levels]
- A fundamental redesign of traditional approaches to data processing is necessary.
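As a minimal illustration of the two levels (not from the slides; the function names and the use of C++11 threads are purely illustrative), the same array addition can be written scalar, 4-wide with SSE intrinsics, and split over threads:

```cpp
// Minimal illustration of data-level (SIMD) and task-level (threads) parallelism.
// Assumes n is a multiple of 4; names are illustrative only.
#include <xmmintrin.h>   // SSE intrinsics
#include <thread>
#include <vector>
#include <cstddef>

// Scalar version: one addition per instruction.
void add_scalar(const float* a, const float* b, float* c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) c[i] = a[i] + b[i];
}

// SIMD version: four additions per instruction.
void add_sse(const float* a, const float* b, float* c, std::size_t n) {
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(c + i, _mm_add_ps(va, vb));
    }
}

// Task level: split the data between threads, each running the SIMD kernel.
void add_parallel(const float* a, const float* b, float* c, std::size_t n, unsigned nthreads) {
    std::vector<std::thread> pool;
    std::size_t chunk = (n / nthreads) & ~std::size_t(3);   // keep chunks multiples of 4
    for (unsigned t = 0; t < nthreads; ++t) {
        std::size_t begin = t * chunk;
        std::size_t end   = (t + 1 == nthreads) ? n : begin + chunk;
        pool.emplace_back(add_sse, a + begin, b + begin, c + begin, end - begin);
    }
    for (auto& th : pool) th.join();
}
```

A vector-class layer such as Vc (later slides) hides the intrinsics so that the SIMD version keeps the scalar notation.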
Our Experience with Many-Core CPU/GPU Architectures
- Intel/AMD CPU (2x4 cores, since 2005): 6.5 ms/event (CBM).
- NVIDIA GPU (512 cores, since 2008): 63% of the maximal GPU utilization (ALICE).
- Intel MIC (32 cores, since 2008): cooperation with Intel (ALICE/CBM).
- IBM Cell (1+8 cores, since 2006): 70% of the maximal Cell performance (CBM).
- Future systems are heterogeneous.
CPU/GPU Programming Frameworks
- Intel Ct (C for throughput): extension to the C language; Intel CPU/GPU specific; SIMD exploitation for automatic parallelism.
- NVIDIA CUDA (Compute Unified Device Architecture): defines the hardware platform; generic programming; explicit memory management; programming on the thread level.
- OpenCL (Open Computing Language): open standard for generic programming; supposed to work on any hardware; specific hardware capabilities used via extensions.
- Vector classes (Vc), Uni-Frankfurt/FIAS/GSI: overload C operators with SIMD/SIMT instructions; uniform approach to all CPU/GPU families; cooperation with the Intel Ct group.
Vector Classes (Vc)
- Vector classes overload scalar C operators with SIMD/SIMT extensions (scalar: c = a + b; SSE intrinsics: vc = _mm_add_ps(va, vb)).
- They provide full functionality for all platforms and support conditional operators, e.g. phi(phi < 0) += 360;
- Vc increases the speed by a factor of 4x (SSE2-SSE4), 8x (future CPUs), 16x (MIC/Larrabee); NVIDIA Fermi support is under research.
- Vector classes enable easy vectorization of complex algorithms (see the sketch below).
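As a rough sketch of how such a layer can be built (illustration only; this is not the actual Vc interface), overloading the arithmetic and comparison operators of a small wrapper type makes SIMD code read like the scalar original, including the masked conditional assignment:

```cpp
// Minimal sketch of the operator-overloading idea behind SIMD vector classes
// (illustration only, not the Vc interface).
#include <xmmintrin.h>   // SSE intrinsics

struct float_v {                         // 4 packed floats
    __m128 data;
    float_v() : data(_mm_setzero_ps()) {}
    float_v(float x) : data(_mm_set1_ps(x)) {}
    float_v(__m128 d) : data(d) {}
};

// c = a + b compiles to one packed instruction instead of four scalar ones.
inline float_v operator+(float_v a, float_v b) { return float_v(_mm_add_ps(a.data, b.data)); }

// Comparison yields a per-lane mask ...
inline __m128 operator<(float_v a, float_v b) { return _mm_cmplt_ps(a.data, b.data); }

// ... which allows masked (conditional) updates, as in phi(phi < 0) += 360:
inline void add_where(float_v& x, __m128 mask, float_v inc) {
    __m128 add = _mm_and_ps(mask, inc.data);   // keep the increment only in selected lanes
    x.data = _mm_add_ps(x.data, add);
}

void normalize_phi(float_v& phi) {
    add_where(phi, phi < float_v(0.0f), float_v(360.0f));   // phi(phi < 0) += 360;
}
```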
Kalman Filter Track Fit on Cell
- Comp. Phys. Comm. 178 (2008) 374-383.
- 10000x faster on each CPU (Intel P4 to Cell).
- The KF speed was increased by 5 orders of magnitude.
- blade11bc4 @ IBM, Böblingen: 2 Cell Broadband Engines with 256 kB Local Store at 2.4 GHz.
- Motivated by, but not restricted to, Cell! (a schematic sketch of the pattern follows below)
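Much of the speed-up comes from single precision and SIMD. As a toy illustration of the portability pattern (a minimal 1D filter with made-up numbers, not the CBM fitter), writing the filter once over a generic value type T lets the same code run with a scalar float or with a SIMD vector type such as the float_v sketched above, fitting several tracks per instruction:

```cpp
// Toy 1D Kalman filter templated on the value type (illustration only).
#include <cstdio>

template <typename T>
struct State1D {
    T x;   // estimated parameter
    T C;   // its covariance
};

// One Kalman filter measurement update with measurement m and variance V.
template <typename T>
void filter(State1D<T>& s, T m, T V) {
    T K = s.C / (s.C + V);          // gain
    s.x = s.x + K * (m - s.x);      // update estimate
    s.C = (T(1) - K) * s.C;         // update covariance
}

int main() {
    State1D<float> s{0.f, 100.f};                 // vague initial estimate
    const float meas[] = {1.2f, 0.8f, 1.1f};      // toy measurements
    for (float m : meas) filter(s, m, 0.5f);      // measurement variance 0.5
    std::printf("x = %f, C = %f\n", s.x, s.C);
}
```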
Performance of the KF Track Fit on CPU/GPU Systems
[Plot: scalability of the KF track fit (time/track in ms vs. number of cores/threads) on different CPU architectures: Woodcrest (2 cores), Clovertown (4), Dunnington (6), 2x Cell (16 SPEs); going from scalar double to single precision and SIMD gives data-stream parallelism (10x), cores and threads give task-level parallelism (100x)]
- Real-time performance on different Intel CPU platforms and on NVIDIA GPU graphics cards.
- The Kalman Filter algorithm performs at the ns level (see the schematic below).
- CBM Progr. Rep. 2008.
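A schematic of how the two levels combine in such a fit (hypothetical data layout and names, not the CBM package): each thread takes a portion of tracks, and inside a portion four tracks are packed side by side so that one SIMD word carries one parameter of four tracks:

```cpp
// Schematic combination of task-level (threads) and data-level (SIMD) parallelism
// for a track fit; types and functions are hypothetical.
#include <vector>
#include <cstddef>

struct Track  { float x, C; };          // per-track data (toy)
struct Track4 { float x[4], C[4]; };    // 4 tracks side by side ("structure of arrays")

// Fit 4 tracks at once; with a vector class the loop becomes one SIMD statement per step.
inline void fit_simd(Track4& g) {
    for (int k = 0; k < 4; ++k) { /* Kalman filter steps per lane */ g.C[k] *= 0.5f; }
}

void fit_all(std::vector<Track>& tracks) {
    const std::ptrdiff_t n4 = tracks.size() / 4;   // assume a multiple of 4 for brevity
    #pragma omp parallel for schedule(static)      // task level: cores and threads
    for (std::ptrdiff_t i = 0; i < n4; ++i) {
        Track4 g;
        for (int k = 0; k < 4; ++k) {              // data level: pack 4 tracks per SIMD word
            g.x[k] = tracks[4 * i + k].x;
            g.C[k] = tracks[4 * i + k].C;
        }
        fit_simd(g);
        for (int k = 0; k < 4; ++k) tracks[4 * i + k].C = g.C[k];   // scatter back
    }
}
```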
CBM Cellular Automaton Track Finder
[Event display: a central Au+Au collision with 770 reconstructed tracks, top and front views]
- Fixed-target heavy-ion experiment: 10^7 Au+Au collisions/s, ~1000 charged particles/collision.
- Non-homogeneous magnetic field; double-sided strip detectors (85% combinatorial space points).
- Full on-line event reconstruction.
- Intel X5550, 2x4 cores at 2.67 GHz.
[Plots: track-finding efficiency and scalability with the number of cores]
- Highly efficient reconstruction of 150 central collisions per second (a schematic sketch of the CA idea follows below).
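The slide reports results rather than the algorithm itself. As a rough, hypothetical sketch of the Cellular Automaton idea on a toy event (straight-line tracks, no field, not the CBM implementation): short segments ("cells") between hits on neighbouring stations act as automaton cells, each cell's counter grows to the length of the best chain of compatible left neighbours, and track candidates are read off the highest counters:

```cpp
// Rough sketch of Cellular Automaton track finding on a toy event (illustration only).
#include <cmath>
#include <cstdio>
#include <vector>

struct Hit  { int station; double y; };          // one 1D measurement per station
struct Cell { int from, to, counter; };          // segment between two hits

int main() {
    // Toy event: two straight tracks crossing 4 stations, plus one noise hit.
    std::vector<Hit> hits = {
        {0, 0.0}, {0, 5.0}, {1, 1.0}, {1, 6.0}, {2, 2.0}, {2, 7.0}, {3, 3.0}, {3, 8.0}, {2, 4.5}
    };

    // 1) Build cells: connect hits on neighbouring stations with a loose compatibility cut.
    std::vector<Cell> cells;
    for (int i = 0; i < (int)hits.size(); ++i)
        for (int j = 0; j < (int)hits.size(); ++j)
            if (hits[j].station == hits[i].station + 1 &&
                std::fabs(hits[j].y - hits[i].y) < 1.5)
                cells.push_back({i, j, 1});

    // 2) Cellular automaton: a cell's counter becomes the length of the best chain
    //    of compatible left neighbours; iterate until nothing changes.
    bool changed = true;
    while (changed) {
        changed = false;
        for (Cell& c : cells)
            for (const Cell& left : cells)
                if (left.to == c.from &&                                   // shares the middle hit
                    std::fabs((hits[c.to].y - hits[c.from].y) -
                              (hits[left.to].y - hits[left.from].y)) < 0.5 &&  // similar slope
                    left.counter + 1 > c.counter) {
                    c.counter = left.counter + 1;
                    changed = true;
                }
    }

    // 3) Collect track candidates from cells with the highest counters
    //    (a real finder also resolves competition for shared hits).
    for (const Cell& c : cells)
        if (c.counter == 3)   // chains spanning all 4 stations in this toy setup
            std::printf("track candidate ending with hits %d -> %d (counter %d)\n",
                        c.from, c.to, c.counter);
}
```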
Parallelization is now a Standard in the CBM Reconstruction
[Table: status of the Vector/SIMD, multi-threading, NVIDIA CUDA and OpenCL implementations (done, 2009, 2010 or future) and time per PC for each stage: STS detector 6.5 ms, muon detector 1.5 ms, TRD detector, RICH detector 3.0 ms, vertexing 10 μs, open charm analysis, user reco/digi, user analysis]
- The CBM reconstruction is at the ms level.
- Intel X5550, 2x4 cores at 2.67 GHz.
International Tracking Workshop
- 45 participants from Austria, China, Germany, India, Italy, Norway, Russia, Switzerland, UK and USA.
Workshop Program
Software Evolution: Many-Core Barrier
[Timeline 1990-2010: from scalar, single-core OOP software to the many-core HPC era]
Consolidate the efforts of:
- Physicists
- Mathematicians
- Computer scientists
- Developers of parallel languages
- Many-core CPU/GPU producers
Software redesign can be synchronized between the experiments.
Track Reconstruction in CBM and ALICE
- CBM (FAIR/GSI): fixed-target, forward geometry, 10^7 collisions/s, Intel CPU with 8 cores (CBM Reco Group).
- ALICE (CERN): collider, cylindrical geometry, 10^4 collisions/s, NVIDIA GPU with 240 cores (ALICE HLT Group).
- Different experiments have similar reconstruction problems.
- Track reconstruction is the most time-consuming part of the event reconstruction, hence the move to many-core CPU/GPU platforms.
- In both cases track finding is based on the Cellular Automaton method and track fitting on the Kalman Filter method.
Stages of Event Reconstruction: To-Do List
- Track finding: time consuming (combinatorics), detector dependent; to do: generalized track finder(s), geometry representation, interfaces, infrastructure.
- Track fitting: Kalman Filter, track model dependent; to do: Kalman Filter, Kalman Smoother, Deterministic Annealing Filter, Gaussian Sum Filter, field representation (3D).
- Vertex finding/fitting: detector/geometry independent; to do: mathematics, adaptive filters, functionality, physics analysis.
- Ring finding (PID): RICH specific; to do: ring finders.
Consolidate Efforts: Common Reconstruction Package
- GSI: algorithms development, many-core optimization.
- Uni-Frankfurt/FIAS: vector classes, GPU implementation.
- OpenLab (CERN): many-core optimization, benchmarking.
- HEPHY (Vienna)/Uni-Gjovik: Kalman Filter track fit, Kalman Filter vertex fit.
- Intel: Ct implementation, many-core optimization, benchmarking.
Host experiments: ALICE (CERN), CBM (FAIR/GSI), STAR (BNL), PANDA (FAIR/GSI).
Follow-up Workshop
- November 2010 – February 2011, at GSI, CERN or BNL?