L1 Tracking – Status CBMROOT And Realisation
Christian Steinle, University of Mannheim, Institute of Computer Engineering

1 L1 Tracking – Status CBMROOT And Realisation
Christian Steinle, Andreas Kugel, Reinhard Männer
Computer Engineering, University of Mannheim
Contents
– Status CBMROOT
– Realisation in hardware
– Outlook

2 Status CBMROOT
– The code works in the current CBMROOT framework
– cbmroot/parameters/htrack contains all table files; the table files transform hit signatures into priority classes
– cbmroot/macro contains two simulation and two reconstruction macros
– cbmroot/htrack contains all source code files

3 Status CBMROOT
cbmroot/macro
Important macro entries:
– Load the library:
    gSystem->Load("libHTrack");
– Create the objects:
    CbmStsFindTracks* findTracks = new CbmStsFindTracks(iVerbose, NULL, kFALSE, "STS Track Finder");
    CbmHoughStsTrackFinder* trackFinder = new CbmHoughStsTrackFinder();
– Set the task:
    fRun->AddTask(findTracks);
– Use the finder for track finding:
    findTracks->UseFinder(trackFinder);

4 Status CBMROOT
Documentation
– Doxygen documentation is available in the source code
– A how-to document is in review. It contains:
  – Main class description with constructors
  – Algorithm configuration via an ASCII configuration file: parameter name, meaning, standard value, value range, value format, links to related parameters
  – Signature definition via ASCII table files
  – Automated generation algorithms
  – Major algorithm definitions in the source code
  – Peak finding definitions, for example the window type or size
  – Enabling/disabling of analysis
  – Example scripts

5 Realisation in hardware
Environment
– Data: 10^7 events/s with 20000 hits each give 2*10^11 hits/s; each hit is encoded in 32 bits, so 32 bit/hit
  => Data rate = 2*10^11 hits/s * 32 bit/hit = 6.4*10^12 bit/s = 6.4 Tbit/s
– Network: 10 Gbit/s per link
  => Number of links = (6.4 Tbit/s) / (10 Gbit/s per link) = 640 links
– FPGA: process 1 hit per clock cycle with 10 Gbit/s per link and 32 bit/hit
  => Clock = (10 Gbit/s) / (32 bit/hit) / (1 hit per cycle) = 312.5*10^6 1/s = 312.5 MHz
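The arithmetic on this slide can be reproduced with a short C++ sketch. The constant and function names are illustrative and not part of CBMROOT; the input figures are the ones quoted on the slide, and integer arithmetic is used so that the results are exact.

```cpp
#include <cassert>
#include <cstdint>

// Input figures quoted on the slide. All names here are illustrative.
constexpr std::uint64_t kEventsPerSecond   = 10000000ULL;     // 10^7 events/s
constexpr std::uint64_t kHitsPerEvent      = 20000ULL;
constexpr std::uint64_t kBitsPerHit        = 32ULL;
constexpr std::uint64_t kLinkBitsPerSecond = 10000000000ULL;  // 10 Gbit/s per link

constexpr std::uint64_t hitsPerSecond() { return kEventsPerSecond * kHitsPerEvent; }

constexpr std::uint64_t dataRateBitsPerSecond() { return hitsPerSecond() * kBitsPerHit; }

// Number of links needed to carry the full data rate, rounded up.
constexpr std::uint64_t linkCount() {
    return (dataRateBitsPerSecond() + kLinkBitsPerSecond - 1) / kLinkBitsPerSecond;
}

// Clock frequency needed to consume one 32-bit hit per cycle from one link.
constexpr std::uint64_t clockHz() { return kLinkBitsPerSecond / kBitsPerHit; }

// Cross-check against the values printed on the slide.
inline void checkAgainstSlide() {
    assert(hitsPerSecond() == 200000000000ULL);           // 2*10^11 hits/s
    assert(dataRateBitsPerSecond() == 6400000000000ULL);  // 6.4 Tbit/s
    assert(linkCount() == 640ULL);                        // 640 links
    assert(clockHz() == 312500000ULL);                    // 312.5 MHz
}
```

Calling checkAgainstSlide() from a test or from main() confirms that the three derived figures follow from the four inputs.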

6 Realisation in hardware
Up to now: single-chip FPGA implementation
(Block diagram; labels: HBuffer, LBuffer, Histogram Layer)

7 Realisation in hardware
Planned: multi-chip FPGA implementation
(Block diagram; label: Multi Chip)

8 Realisation in hardware
Planned: multi-chip FPGA implementation
– The HBuffer is just relocated

9 Realisation in hardware
Planned: multi-chip FPGA implementation
– No HBuffer is needed if there are enough processors for all histogram layers

10 Realisation in hardware
Up to now: single-chip FPGA timing
(Timing diagram)

11 Realisation in hardware
Planned: multi-chip FPGA timing
(Timing diagram; values shown: 248 (speed-up: 19), 312 (speed-up: 4), 245 (speed-up: 4))

12 Realisation in hardware
Up to now: single-chip FPGA resources
– PRELUT: input: 20 bits (xy: 17, z: 3); output: γ_min and γ_max (2 x 8 bit)
  => 1 x (1M x 16) bits external RAM
– LUT: input: 20 bits (xy: 17, z: 3); output: startPos and houghCmd (7 + 29 bit)
  => 2 x (1M x 18) bits external RAM
– HBuffer: entry: γ_max, inputLUT and previousListAddress (8 + 20 + 15 bit); memory for 32k entries with 45 bits each due to Blockram scalability
  => 80 Blockram, 500 + about 5000 logic cells
– Histogram: 30000 logic cells
– Peak finding: estimated 5000 logic cells
– LUT access: estimated 5000 logic cells
=> Resources: 45500 logic cells, 80 dual-ported Blockram and 7 MB RAM
=> 1 x Xilinx Virtex 4 XC4VFX60
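The totals on this slide can be checked with a short C++ sketch. The names are illustrative. One point is an assumption rather than something the slide states: the Blockram count is reproduced by assuming 18 Kbit block RAMs used as 2048 x 9 bit, which is one reading of "Blockram scalability" and also explains why a 43-bit entry is stored in 45 bits.

```cpp
#include <cassert>
#include <cstdint>

// External RAM: both tables are addressed by the 20-bit hit code (1M entries).
constexpr std::uint64_t kTableEntries = 1ULL << 20;
constexpr std::uint64_t prelutBits() { return 1 * kTableEntries * 16; }  // 1 x (1M x 16)
constexpr std::uint64_t lutBits()    { return 2 * kTableEntries * 18; }  // 2 x (1M x 18)
constexpr std::uint64_t externalRamBytes() { return (prelutBits() + lutBits()) / 8; }

// HBuffer entry: gamma_max + inputLUT + previousListAddress.
constexpr unsigned hbufferPayloadBits() { return 8 + 20 + 15; }

// ASSUMPTION (not on the slide): 18 Kbit block RAMs configured as 2048 x 9 bit.
constexpr unsigned kBramDepth = 2048;
constexpr unsigned kBramWidth = 9;

constexpr unsigned ceilDiv(unsigned a, unsigned b) { return (a + b - 1) / b; }

// Entry width after rounding up to a multiple of the block RAM width.
constexpr unsigned hbufferStoredBits() {
    return ceilDiv(hbufferPayloadBits(), kBramWidth) * kBramWidth;
}

// Block RAMs needed: (depth slices) x (width slices).
constexpr unsigned hbufferBlockRams(unsigned entries) {
    return ceilDiv(entries, kBramDepth) * ceilDiv(hbufferPayloadBits(), kBramWidth);
}

// Logic cells: HBuffer (500 + about 5000), histogram, peak finding, LUT access.
constexpr unsigned totalLogicCells() { return 500 + 5000 + 30000 + 5000 + 5000; }

// Cross-check against the values printed on the slide.
inline void checkAgainstSlide() {
    assert(externalRamBytes() == 6815744ULL);    // 6.5 MiB, quoted as 7 MB
    assert(hbufferPayloadBits() == 43u);
    assert(hbufferStoredBits() == 45u);          // 45 bits per entry
    assert(hbufferBlockRams(32u * 1024u) == 80u);
    assert(totalLogicCells() == 45500u);
}
```

Under these assumptions the external RAM comes to 6.5 MiB, which the slide rounds up to 7 MB, and the 32k-entry HBuffer needs 16 x 5 = 80 block RAMs.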

13 Realisation in hardware
Planned: multi-chip FPGA resources
Version 1 (histogram with registers)
– MasterIn: PRELUT, LUT
  => 7 MB RAM and 5000 logic cells
– Processing Units: Histogramming, Encoding, Diagonalization, 2D Peak finding
  => 30000 logic cells per histogram layer
– MasterOut: 3D Peak finding
  => 5000 logic cells
=> MasterIn: 1 x Xilinx Virtex 4 XC4VFX12
=> Processing Units: 64 x Xilinx Virtex 4 XC4VFX100
=> MasterOut: 1 x Xilinx Virtex 4 XC4VFX12

14 Realisation in hardware
Planned: multi-chip FPGA resources
Version 2 (histogram with Blockrams)
– MasterIn: PRELUT, LUT
  => 7 MB RAM and 5000 logic cells
– Processing Units: Histogramming, Encoding, Diagonalization, 2D Peak finding
  => 31 x 2 kB Blockram per layer
– MasterOut: 3D Peak finding
  => 5000 logic cells
=> MasterIn: 1 x Xilinx Virtex 4 XC4VFX12
=> Processing Units: 16 x Xilinx Virtex 4 XC4VFX100
=> MasterOut: 1 x Xilinx Virtex 4 XC4VFX12

15 Realisation in hardware
Planned: multi-chip FPGA resources
Version 3 (histogram with Blockrams and registers)
– MasterIn: PRELUT, LUT
  => 7 MB RAM and 5000 logic cells
– Processing Units: Histogramming, Encoding, Diagonalization, 2D Peak finding
  => 31 x 2 kB Blockram per layer or 30000 logic cells
– MasterOut: 3D Peak finding
  => 5000 logic cells
=> MasterIn: 1 x Xilinx Virtex 4 XC4VFX12
=> Processing Units: 14 x Xilinx Virtex 4 XC4VFX100
=> MasterOut: 1 x Xilinx Virtex 4 XC4VFX12

16 Realisation in hardware
Estimation
– Data rate: 6.4 Tbit/s with 20000 hits/event
– Network: 640 links at 10 Gbit/s each
– FPGA: 32 bit/hit at 312.5 MHz
– Single chip:
  – Minimal pipeline stall: 76400 clock cycles
  – No streamlined processing is possible
  – Five Hough transform units per data link lead to 3200 units
– Multi chip:
  – Minimal pipeline stall: #(histogram dim2) = 31 clock cycles
  – Accept just 19969 hits and discard the leading or trailing 31 hits
  – Direct streamlined processing is possible
  – One Hough transform unit per data link leads to 640 units
  – Processing time speed-up: 19
  – Hardware: at least 16 chips (14 x XC4VFX100 and 2 x XC4VFX12)
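The unit counts in this estimate are consistent with a simple occupancy model, sketched below in C++. The model is an assumption, not something the slides state: a Hough transform unit is taken to be busy for (processed hits + pipeline stall) clock cycles per event, while a link delivers a new event every 20000 cycles at one hit per cycle. All names are illustrative.

```cpp
#include <cassert>

constexpr unsigned kLinks        = 640;
constexpr unsigned kHitsPerEvent = 20000;  // also the cycles between events on one link

// ASSUMED model: units per link = ceil((processed hits + stall) / cycles per event).
constexpr unsigned unitsPerLink(unsigned processedHits, unsigned stallCycles) {
    return (processedHits + stallCycles + kHitsPerEvent - 1) / kHitsPerEvent;
}

constexpr unsigned totalUnits(unsigned perLink) { return kLinks * perLink; }

// Cross-check against the values printed on the slide.
inline void checkAgainstSlide() {
    // Single chip: all 20000 hits plus a 76400-cycle stall.
    assert(unitsPerLink(20000, 76400) == 5u);
    assert(totalUnits(5) == 3200u);

    // Multi chip: 31 hits are discarded to make room for the 31-cycle stall.
    assert(20000u - 31u == 19969u);
    assert(unitsPerLink(19969, 31) == 1u);
    assert(totalUnits(1) == 640u);

    // Chip count for version 3: 14 processing chips plus MasterIn and MasterOut.
    assert(14 + 2 == 16);
}
```

Under this model, discarding 31 hits per event is exactly what lets a single unit keep pace with its link, which matches the "direct streamlined processing" statement. The quoted speed-up of 19 is not derived here.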

17 Realisation in hardware
Multi-chip FPGA vs. Cell implementation
(Diagram; labels: Multi-chip FPGA, Cell Processor)
– A Cell processor can be used to develop concepts for a multi-chip FPGA implementation
– Cheap and rapid prototyping with a Sony PlayStation 3

18 Realisation in hardware
Cell Processor
– 1 PowerPC
  – 64 bit architecture
  – 32 kB L1 Cache
  – 512 kB L2 Cache
  – Handles the LUT processing, the job distribution and the 3D peak finding
– 8 Synergistic Processing Elements (SPE)
  – 128 registers with 128 bit
  – ALU with 128 bit SIMD
  – 256 kB local memory
  – Memory Flow Controller (MFC) with DMA transfer possibility
  – Handle the Histogramming, Encoding, Diagonalization and 2D peak finding
– 1 XDR-RAM with up to 4.5 GB
  – Memory for the LUTs, the HBuffer unit and the LBuffer unit

19 Outlook
– A manual is in progress
– A thesis is in progress
– Additional software analyses are in progress
– Development of the PS3 (Cell) source code is in progress
– single-chip FPGA concept + Cell concepts = multi-chip FPGA

