Slide 1: The Hardware Implementation of the Cluster Finding Algorithm and the Burst Building Network (M. Tiemens)

Slide 2: Recap
(diagram labels: Map, EMC, Data Concentrator)

Slide 3: Distributed Cluster Finding
(event display with particle labels p, h_c, η_c, η, π0, γ; simulation sample: 5000 events @ 200 kHz)
1. Form preclusters.
2. Merge preclusters (if needed).
3. Set properties of completed clusters.
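As a rough software illustration of steps 1 and 2 (the actual implementation is in VHDL on the Data Concentrator), the following sketch forms preclusters from hits. The Hit and Precluster types and the neighbour criterion (adjacent crystal in the 2D map, within a time window) are assumptions for illustration, not the firmware logic.

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

// Hypothetical types; field names and the neighbour criterion are assumptions.
struct Hit { int x, y; double t, e; };

struct Precluster {
    std::vector<Hit> hits;
    // Neighbour criterion (assumed): adjacent crystal, close in time.
    bool acceptsNeighbour(const Hit& h, double dt) const {
        for (const Hit& o : hits)
            if (std::abs(o.x - h.x) <= 1 && std::abs(o.y - h.y) <= 1 &&
                std::abs(o.t - h.t) < dt)
                return true;
        return false;
    }
};

// Step 1: attach each hit to an adjacent, in-time precluster, or open a new one.
std::vector<Precluster> formPreclusters(const std::vector<Hit>& hits, double dt) {
    std::vector<Precluster> out;
    for (const Hit& h : hits) {
        bool placed = false;
        for (Precluster& p : out)
            if (p.acceptsNeighbour(h, dt)) { p.hits.push_back(h); placed = true; break; }
        if (!placed) out.push_back({{h}});
    }
    return out;
}

// Step 2 (if needed): merge two preclusters that grew into each other.
void mergeInto(Precluster& a, Precluster& b) {
    a.hits.insert(a.hits.end(), b.hits.begin(), b.hits.end());
    b.hits.clear();
}
```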

Slide 4: Distributed Cluster Finding: performance (5000 events @ 200 kHz)
Relative number of events reconstructed: Default 29.4%, Online 28.7%, Distributed 23.9%.
CPU processing time relative to Default: Default 1, Online 1.19, Distributed 0.93 (0.05 + 0.88).

Slide 5: VHDL Implementation Test Setup
(diagram labels: data from 4 DCs, DC, SFP, Kintex-7 test board, SFP, Compute Node, Gb Ethernet, PC, USB)

Slide 6: Comparison Between PandaRoot and VHDL
(figure: mapped X-position; same test-setup diagram as slide 5)

Slide 7: Data Structure (64-bit words)
X, Y: position in the 2D map; D: diameter of the precluster; t: time; E: energy.
PRECLUSTER1 | X | Y | D | t | NR_OF_HITS
HIT1 | t | E | ch. nr
HIT2 | t | E | ch. nr
...
CLUSTER1 | x | y | z | t | E | NR_OF_HITS
HIT1 | t | E | ch. nr
HIT2 | t | E | ch. nr
...
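To make the word layout concrete, here is a hypothetical packing routine for a HIT word. The slide fixes the fields (t, E, channel number) but not their bit widths, so the widths below (8-bit tag, 28-bit time, 16-bit energy, 12-bit channel) are pure assumptions.

```cpp
#include <cstdint>

// Assumed layout: bits 63..56 tag, 55..28 time, 27..12 energy, 11..0 channel.
constexpr uint64_t TAG_HIT = 0x02;  // word-type tag value is an assumption

uint64_t packHit(uint32_t t, uint16_t e, uint16_t ch) {
    return (TAG_HIT << 56)
         | (uint64_t(t  & 0x0FFFFFFF) << 28)
         | (uint64_t(e)                << 12)
         | (uint64_t(ch & 0x0FFF));
}

void unpackHit(uint64_t w, uint32_t& t, uint16_t& e, uint16_t& ch) {
    t  = uint32_t((w >> 28) & 0x0FFFFFFF);
    e  = uint16_t((w >> 12) & 0xFFFF);
    ch = uint16_t( w        & 0x0FFF);
}
```

The PRECLUSTER and CLUSTER header words would pack their fields (X, Y, D, t, NR_OF_HITS, etc.) the same way, each with its own tag.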

Slide 8: What if a Photon Hits the Edge?
Detection: the energy-weighted position is close to the edge of the map.
Options:
1. Discard these clusters.
2. Apply a correction factor to E.
3. Use a complicated neighbour relation to the edge of the other map.
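The detection step can be sketched as follows; the map dimensions and the edge margin are illustrative assumptions.

```cpp
#include <vector>

struct Hit { double x, y, e; };  // position in the 2D map, deposited energy

// A cluster is flagged when its energy-weighted position falls within
// 'margin' of the map boundary (margin and map size are assumptions).
bool nearEdge(const std::vector<Hit>& hits,
              double xMax, double yMax, double margin) {
    double e = 0, xw = 0, yw = 0;
    for (const Hit& h : hits) { e += h.e; xw += h.e * h.x; yw += h.e * h.y; }
    if (e <= 0) return false;
    const double x = xw / e, y = yw / e;
    return x < margin || x > xMax - margin ||
           y < margin || y > yMax - margin;
}
```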

Slide 9: All right, so what does this all mean for the data flow, and the processing thereof?

Slide 10: Data Collection, EMC
BwEndcap + Barrel:
SADCs <10 kHz: 572 devices, 0.05 Gbps/device, 28.6 Gbps total
SADCs >10, <50 kHz: 512 devices, 0.27 Gbps/device, 138.24 Gbps total
SADCs >50 kHz: 112 devices, 0.6 Gbps/device, 67.2 Gbps total
Total: 1196 devices, 234.04 Gbps
FwEndcap:
SADCs 300 kHz: 217 devices, 1.2 Gbps/device, 260.4 Gbps total
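A quick cross-check of the table's arithmetic (C++ used here purely as a calculator):

```cpp
#include <cstdio>

int main() {
    // BwEndcap + Barrel rows from the table above: device count, rate/device.
    const struct { int n; double gbps; } rows[] = {
        {572, 0.05},   // SADCs <10 kHz
        {512, 0.27},   // SADCs >10, <50 kHz
        {112, 0.60},   // SADCs >50 kHz
    };
    int devices = 0; double total = 0;
    for (auto& r : rows) { devices += r.n; total += r.n * r.gbps; }
    std::printf("BwEndcap + Barrel: %d devices, %.2f Gbps\n", devices, total);
    // prints: 1196 devices, 234.04 Gbps
    std::printf("FwEndcap: %.1f Gbps\n", 217 * 1.2);  // prints: 260.4 Gbps
}
```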

Slide 11: Assuming the worst case, where each hit becomes a cluster
BwEndcap + Barrel: 1196 digitisers, total rate 234 Gbps, multiplexed 3 → 1 into 25 Data Concentrators (16 inputs each); total DC output rate 468 Gbps.
FwEndcap: 217 digitisers, total rate 260 Gbps, into 14 Data Concentrators; total DC output rate 520 Gbps.
(The detector rate table from slide 10 is repeated on this slide.)

Slide 12: Output Produced by the Data Concentrators (DCs)
BwEndcap + Barrel: 25 DCs, total rate 468 Gbps, over 50 optical links @ 10 Gbps.
FwEndcap: 14 DCs, total rate 520 Gbps, over 42 optical links @ 12 Gbps.
Now what?

Slide 13: Processing the DC Output (inputs as on slide 12)
Three options for the BBN*:
1. Throw everything into one big data collection network, which may or may not do more advanced processing (at the least, collect and sort the data).
2. Do some data merging first, then do 1.
3. Collect everything in a switch, and let one big FPGA board do the rest.
*Burst Building Network, PANDA's data collection and processing network.

Slide 14: Processing the DC Output, Option 1
The 50 + 42 DC links are collected in a switch and leave on 63 optical links @ 16 Gbps into a smart network, which 1. sorts the data, 2. merges preclusters, and 3. kicks low-E clusters*, repeating until the entire EMC is covered.
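Step 3, kicking low-E clusters, amounts to a simple energy filter. A sketch, where the Cluster type and the threshold are placeholders:

```cpp
#include <algorithm>
#include <vector>

struct Cluster { double t, e; };  // illustrative: time and total energy

// Drop all clusters below an energy threshold (threshold value is an assumption).
void kickLowE(std::vector<Cluster>& clusters, double eMin) {
    clusters.erase(std::remove_if(clusters.begin(), clusters.end(),
                                  [eMin](const Cluster& c) { return c.e < eMin; }),
                   clusters.end());
}
```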

Slide 15: Topology of the Network (Option 1)
Input: 63 optical links @ 16 Gbps into the smart network (sort, merge preclusters, kick low-E clusters*).
Output: 32 optical links at 508 Gbps if all three steps (1+2+3) are applied; 63 optical links at 1 Tbps if nothing or only step 1 (sorting) is done.

Slide 16: Topology of the Network (Option 1, continued)
The network output (32 or 63 links) feeds the Compute Nodes. Procedure (after sorting), sketched below:
1. Make time bunches.
2. Send the 1st n time bunches to the n FPGAs on the 1st CN.
3. Send the 2nd n time bunches to the n FPGAs on the 2nd CN.
4. Etc., until the mth CN; then send to the 1st CN again and repeat from step 2.
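This round-robin routing over m Compute Nodes with n FPGAs each reduces to two index computations; a sketch, not the actual firmware:

```cpp
#include <cstddef>

struct Target { std::size_t cn, fpga; };

// Bunches 0..n-1 go to CN 0, n..2n-1 to CN 1, ...,
// wrapping back to CN 0 after CN m-1 (n, m > 0 assumed).
Target route(std::size_t bunch, std::size_t n, std::size_t m) {
    return { (bunch / n) % m,   // which Compute Node
              bunch % n };      // which FPGA on that node
}

// e.g. with n = 4 FPGAs per CN and m = 3 CNs,
// bunch 13 goes to CN (13/4) % 3 = 0, FPGA 13 % 4 = 1.
```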

Slide 17: Processing the DC Output / Topology of the Network
(recap: the three BBN options from slide 13, shown next to the Option 1 topology and time-bunch procedure from slide 16)

Slide 18: Processing the DC Output, Option 2 (merge first)
BwEndcap + Barrel: the 25 DCs (2 links per device, 50 links @ 10 Gbps) are combined 4 DCs at a time (8 inputs, 8 outputs) into 7 network DCs; total rate 293 Gbps over 19 optical links @ 16 Gbps.
FwEndcap: the 14 DCs (3 links per device, 42 links @ 12 Gbps) are combined 3 DCs at a time (9 inputs, 7 outputs) into 5 network DCs; total rate 325 Gbps over 21 optical links.
(Uncombined: 63 links at 988 Gbps.)
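What a network DC does when it combines several time-sorted DC streams is essentially a k-way merge by timestamp. A software sketch, with an assumed word layout:

```cpp
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

struct Word { uint64_t timestamp, payload; };  // layout is an assumption

// Merge k time-sorted input streams into one time-sorted output stream.
std::vector<Word> mergeStreams(const std::vector<std::vector<Word>>& in) {
    using Item = std::pair<uint64_t, std::pair<std::size_t, std::size_t>>;
    // min-heap keyed on timestamp; value is (stream index, position in stream)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    for (std::size_t s = 0; s < in.size(); ++s)
        if (!in[s].empty()) pq.push({in[s][0].timestamp, {s, 0}});
    std::vector<Word> out;
    while (!pq.empty()) {
        auto [s, i] = pq.top().second; pq.pop();
        out.push_back(in[s][i]);                       // emit earliest word
        if (i + 1 < in[s].size())                      // advance that stream
            pq.push({in[s][i + 1].timestamp, {s, i + 1}});
    }
    return out;
}
```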

Slide 19: Processing the DC Output, Option 2
(recap: the Option 2 merging scheme from slide 18, shown next to the three BBN options from slide 13)

Slide 20: Processing the DC Output, Option 3 (one big FPGA board)
All DC outputs (50 links @ 10 Gbps + 42 links @ 12 Gbps) are collected in a 32 Gbps switch, which feeds a Xilinx UltraScale+ VU35P (64x 32 Gbps transceivers) over 31 optical links @ 32 Gbps (992 Gbps, just above the 988 Gbps uncombined total from slide 18). A VU31P takes the overflow over 16 optical links @ 32 Gbps if the VU35P is occupied.

Slide 21: Processing the DC Output, Option 3 (continued)
As on slide 20, with the FPGA output collected in a second 32 Gbps switch over 32 optical links @ 16 Gbps.

Slide 22: Summary and Outlook
1. A hardware implementation of the clustering has been made:
a. The results agree with the PandaRoot simulation.
b. Check against the FPGA specifications to explore the possibilities.
2. Several options for the BBN are being explored.
Recommendation:
1. Use Option 3 (the big FPGA board).
2. If that is not possible (see 1b), use Option 2 (merge the data first, then feed it to the BBN).

Slide 23: Open Question to the Collaboration
What do other subsystems require of the BBN?