

Possible DAQ Upgrades: DAQ1k… DAQ2k… DAQ10k!?
Tonko Ljubičić, STAR/BNL (for the “3L Group” — Landgraf, LeVine & Ljubičić; Lange would fit nicely too)
–Increase the event rate into Level 3
–Increase the event rate onto storage
… but make it cheap (unlikely)
… and make it simple (unlikely)
… and do it without additional manpower (ridiculous)
… and do it while STAR is taking data (problematic)

Assumptions…
We have the TPC (or similar), i.e. a tracking device with many channels
We want a Level 3 trigger (based upon tracks)
We have a good cluster finder, so we save only the 2D hitpoints
The final storage (tapes) is under RCF’s control
Assumed Requirements:
–At least 1000 Hz Level 3 rate (central Au+Au)
–At least 100 Hz storage rate (central Au+Au)

DAQ Components
Event Builder (EVB) and event buffer
Level 3 CPU farm
DAQ frontend (Cluster Finder, Formatter)
Detector Frontend (FEE)
Network interconnect:
–Between DAQ frontend, L3 and EVB
–Between FEE and DAQ frontend

DAQ Components (cont’d) – current system:
EVB: 1 Sun, 70 MB/s, 700 GB buffer → 10 Hz central AuAu raw, 50 Hz clusters only
L3: 500 MHz Alphas → 50 Hz central AuAu
DAQ RB: 144 × 3 slow i960 CPUs → 50 Hz central AuAu
TPC FEE: 100 Hz
Network:
–Main: Myrinet, 100 MB/s/link
–FEE → DAQ: 1.25 Gb/s → 100 evts/s

Upgrades (EVB)
Cluster of Linux PCs connected via a Gigabit Ethernet switch to RCF
Each node has:
–Large (and cheap) disk buffers (e.g. 4 × 120 GB IDE)
–512 MB memory (not that much)
–1 Gigabit Ethernet card (cheap)
–1 Myrinet card (for internal DAQ) (1 k$)
–1 CPU of any slow variety (not CPU-intensive)
–Good, fast motherboard (I/O intensive)
Need about 5–10 of them
Advantages:
–Scalability: adding more nodes increases rates linearly
–Parallelism is simple: round robin on an event-by-event basis, all nodes are equal (see the sketch below)
–Robustness: all nodes are identical, so automatic recovery in case of failure is trivial
–Cost: IDE disks are much cheaper than SCSI
Cost: 4 k$ per cluster node (nicely equipped). Now!
–Compare to the current 50 k$ for a single Sun workstation: for the cost of one Sun we get 10 × (!) the throughput
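The round-robin, all-nodes-equal scheme above is simple enough to show in a few lines. Below is a minimal sketch in C, assuming a hypothetical evb_node_t structure and a send_to_node() stub in place of the real Myrinet transfer; none of these names come from the existing STAR DAQ code.

```c
/* Round-robin event dispatch across identical EVB nodes (illustrative sketch).
 * All names (evb_node_t, send_to_node) are hypothetical. */
#include <stdio.h>

#define N_NODES 10          /* "need about 5-10 of them" */

typedef struct {
    int  id;
    long events_taken;      /* per-node bookkeeping */
} evb_node_t;

static evb_node_t nodes[N_NODES];

/* stand-in for shipping one event buffer over Myrinet to a node */
static void send_to_node(evb_node_t *n, long event_id)
{
    n->events_taken++;
    printf("event %ld -> EVB node %d\n", event_id, n->id);
}

int main(void)
{
    for (int i = 0; i < N_NODES; i++)
        nodes[i].id = i;

    /* every node is equal: event k simply goes to node k mod N,
     * so adding nodes scales the rate and a failed node can be skipped */
    for (long event_id = 0; event_id < 25; event_id++)
        send_to_node(&nodes[event_id % N_NODES], event_id);

    return 0;
}
```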

Upgrades (TPC FEE)
ALICE developed a FEE chip (ALTRO) for their own TPC
8-channel analog/digital hybrid with ADCs and DSP on chip
Pedestal subtraction, gain correction, baseline restoration, zero-suppression and event buffering (8 buffers) on chip
Sampling clock of (up to) 20 MHz; decoupled readout clock of (up to) 40 MHz
Available now (?)
Needs more evaluation, but looks promising!
Expect more details from the Berkeley group (Bieser, Crawford) in the near future
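To make the on-chip processing steps concrete, here is a minimal sketch in C (not ALTRO firmware) of pedestal subtraction followed by threshold-based zero-suppression on one channel's samples; the threshold, sample count and data values are illustrative assumptions.

```c
/* Pedestal subtraction + zero-suppression on one channel (illustrative only;
 * threshold, sample count and output format are assumptions, not ALTRO specs). */
#include <stdio.h>

#define N_SAMPLES 16
#define THRESHOLD 5          /* ADC counts above pedestal needed to keep a sample */

int main(void)
{
    /* raw ADC samples and the stored pedestal for this channel */
    int adc[N_SAMPLES] = { 51, 50, 52, 49, 50, 80, 140, 210, 160, 90,
                           55, 50, 51, 49, 50, 50 };
    int pedestal = 50;

    /* zero-suppressed output: (time bin, amplitude) pairs only */
    for (int t = 0; t < N_SAMPLES; t++) {
        int amp = adc[t] - pedestal;           /* pedestal subtraction */
        if (amp > THRESHOLD)                   /* zero-suppression cut */
            printf("t=%2d  amp=%3d\n", t, amp);
    }
    return 0;
}
```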

Upgrades (DAQ frontend)
Inputs data from the detector FEE, finds clusters, formats them, calculates pedestals, buffers data, ships to L3/EVB, etc. – versatile
Works on an M × N (2D) plane suitable for most detectors (e.g. a TPC padrow is 182 × 512, SVT is 240 × 128, etc.) – “detector blind” (see the cluster-finder sketch below)
Current example:
–Intel i960HD CPU, 66 MHz internal, 33 MHz external bus, takes ~7 ms per padrow for a central Au+Au event → need a speedup of ~10× (but hope for more)
Possible choices:
–DSPs (“easy” to program; many, many to choose from)
–FPGAs (tough to program, fast!, many to choose from)
–Embedded FPGA cores or hybrids (e.g. Xilinx Virtex-II Pro): a combination of FPGA & CPU, versatile, and many have fast (Gb/s) links on chip! Extremely complex! Expensive! (at least for now…)
A lot of R&D:
–Evaluate the possible hardware choices (above)
–Adapt the cluster finder software to the different hardware
–Need very specific manpower – possible cooperation with the Instrumentation Division
–Very critical item – need to start work NOW! (R&D funding)
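As a concrete picture of what a “detector blind” cluster finder does on an M × N plane, here is a much-simplified sketch in C: it scans a toy padrow (pads × time bins), finds contiguous above-threshold runs along the time direction and prints their charge-weighted centroids. Sizes, threshold and data are assumptions; the real cluster finder also merges runs across neighbouring pads, deconvolutes overlapping hits, etc.

```c
/* Simplified cluster finder on an M x N (pads x time bins) plane.
 * Finds contiguous above-threshold runs along the time direction for each pad
 * and prints their charge-weighted time centroids. All numbers are toy values. */
#include <stdio.h>

#define N_PADS  8            /* M: pads in this (toy) padrow             */
#define N_TBINS 16           /* N: time bins                             */
#define THRESH  5            /* ADC threshold after pedestal subtraction */

int main(void)
{
    static int adc[N_PADS][N_TBINS];   /* zero-initialized */

    /* fake one small hit spanning pads 3-4 around time bin 7 */
    adc[3][6] = 12; adc[3][7] = 30; adc[3][8] = 10;
    adc[4][6] =  8; adc[4][7] = 22; adc[4][8] =  6;

    for (int p = 0; p < N_PADS; p++) {
        int t = 0;
        while (t < N_TBINS) {
            if (adc[p][t] <= THRESH) { t++; continue; }

            /* start of a run: accumulate total charge and charge-weighted time */
            long q_sum = 0, qt_sum = 0;
            int  t0 = t;
            while (t < N_TBINS && adc[p][t] > THRESH) {
                q_sum  += adc[p][t];
                qt_sum += (long)adc[p][t] * t;
                t++;
            }
            printf("pad %d: run [%d,%d), charge %ld, <t> = %.2f\n",
                   p, t0, t, q_sum, (double)qt_sum / q_sum);
        }
    }
    return 0;
}
```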

DAQ Interconnects
A complex issue that depends on:
–Where will the Cluster Finder be? On the detector? In the DAQ room?
–What is done in the FEE vs. the Cluster Finder? Does the FEE zero-suppress (à la the ALICE FEE), or is that left to the DAQ frontend (as now)?
–Data aggregation and scheduling? How does one pack this data? Multiplexing scheme? Data routing?
–How many fibers does one need? At which speed? With which topology? (A rough estimate is sketched below.)
–Does one use commercially available switches/protocols (e.g. Gigabit Ethernet, 10 Gb???) or custom-built links (as we do now)?
–One needs to ship a Sector’s worth of data to a single L3 node – how? Which network? Which topology?
–Cost!?
–Need to start thinking NOW!
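For the fiber-count question, a back-of-the-envelope sketch using only numbers already on these slides: the current EVB figures (70 MB/s for 10 Hz raw or 50 Hz clusters-only central Au+Au) imply roughly 7 MB per raw event and ~1.4 MB per clusters-only event, and the 1000 Hz Level 3 target comes from the assumptions slide. The link speeds and everything else below are illustrative, not a design.

```c
/* Rough aggregate-bandwidth / link-count estimate for the FEE -> DAQ -> L3 path.
 * Event sizes are inferred from the current-EVB numbers on the earlier slide;
 * link speeds are examples only. */
#include <stdio.h>

int main(void)
{
    const double raw_event_mb      = 70.0 / 10.0;   /* ~7.0 MB/event, raw           */
    const double clusters_event_mb = 70.0 / 50.0;   /* ~1.4 MB/event, clusters only */
    const double l3_rate_hz        = 1000.0;        /* assumed Level 3 input rate   */

    const double agg_clusters_mb_s = clusters_event_mb * l3_rate_hz;

    printf("into L3 (clusters, %.0f Hz): %.0f MB/s aggregate\n",
           l3_rate_hz, agg_clusters_mb_s);
    printf("to storage at 100 Hz: %.0f MB/s if clusters only, %.0f MB/s if raw\n",
           clusters_event_mb * 100.0, raw_event_mb * 100.0);

    const double link_mb_s[] = { 100.0, 125.0, 1250.0 };
    const char  *link_name[] = { "Myrinet (100 MB/s)", "1 Gb/s (~125 MB/s)",
                                 "10 Gb/s (~1250 MB/s)" };
    for (int i = 0; i < 3; i++)
        printf("%-22s -> %.1f links' worth of bandwidth for the L3 path\n",
               link_name[i], agg_clusters_mb_s / link_mb_s[i]);

    return 0;
}
```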

Level 3 (tracking)
This is tough:
–Currently takes 40 ms/sector with a pretty fast CPU (500 MHz Alpha) → need to speed up at least 50 times!
–How to get 50× (some ideas; the arithmetic is worked out below):
Faster CPUs in 6 years (~4×)
Concentrate on primary tracks (~2×)
Know the vertex (~2×). Need a vertex detector!!!
Tune the code (~2×)
Only tracks that exit the volume, i.e. pass through the last padrow in the TPC (an implied rapidity-pT cut) (~2×)
Use hits in other detectors (EMC? TOF? RPC?) as track seeds (~2×)
Parallelize, parallelize, parallelize!
–i.e. each CPU node is a 4-way SMP with each CPU working on one track in parallel (~4×)
Could be done! (With a lot of magic wand waving…)
Cost!? Assume 4 × 4-way SMPs per sector × 24 sectors = 96 4-way SMP machines; at ~10 k$ per machine that’s ~1 M$. Doable.
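A quick sanity check of the speedup budget and the cost estimate, written out as a small C snippet so the arithmetic is explicit. The individual factors are the slide's own rough guesses, and the ~10 k$ per machine figure is inferred from the quoted 96 machines ≈ 1 M$ total.

```c
/* Multiply out the rough speedup factors listed above and check the cost estimate.
 * All factors are the slide's own guesses; the per-machine price is inferred. */
#include <stdio.h>

int main(void)
{
    const double factors[] = { 4.0,   /* faster CPUs in ~6 years        */
                               2.0,   /* primary tracks only            */
                               2.0,   /* known vertex (vertex detector) */
                               2.0,   /* tuned code                     */
                               2.0,   /* only tracks exiting the volume */
                               2.0,   /* seeds from other detectors     */
                               4.0 }; /* 4-way SMP parallelism          */
    double total = 1.0;
    for (int i = 0; i < 7; i++)
        total *= factors[i];

    printf("combined speedup if every guess works: %.0fx (need ~50x)\n", total);

    const int    machines   = 4 * 24;   /* 4 x 4-way SMP per sector, 24 sectors */
    const double price_kusd = 10.0;     /* inferred from ~1 M$ / 96 machines    */
    printf("%d machines at ~%.0f k$ each -> ~%.2f M$\n",
           machines, price_kusd, machines * price_kusd / 1000.0);
    return 0;
}
```

Multiplying all the factors gives ~512×, comfortably above the required ~50×, so not every one of these guesses has to pan out.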

Level 3 (cont’d)
How to reduce the cost and make it sweeter? Let’s compare the Offline and Level 3 CPU farms.
Similarities:
–Both need super fast CPUs
–A lot of them!
–Offline needs a fast connection to the data source (i.e. HPSS tapes), but Level 3 already has (or can easily be made to have) a connection to HPSS!
Differences:
–Offline needs disks and a lot of memory – L3 doesn’t
–Offline needs different code structures and perhaps a different OS setup
Skin Changing Local Grid (a role-selection sketch follows below):
–Level 3 nodes “become” reconstruction nodes when not in use by DAQ (“change skin”)
–Level 3 generally boots diskless (for L3) and that system is under complete control of the L3 Group. The L3 code doesn’t even need to know that there are disks in the node!
–Offline needs the disks, and all the code (kernel/OS/reconstruction) images on those disks are under complete control of Offline.
–Switching from the Level 3 “skin” to Reconstruction is done via a reboot command with an appropriate parameter (i.e. “boot –l3” or “boot –offl”). (The simplest, cleanest, but slowest way.)
Advantages:
–Major cost saving
Disadvantages:
–Can’t run the whole system at the same time (but one could run certain partitions depending on the required load!)
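As an illustration of the reboot-with-parameter mechanism, here is a minimal sketch in C of a node picking its “skin” at startup by looking for a role flag on the Linux kernel command line (/proc/cmdline). The flag names follow the slide's “boot –l3” / “boot –offl” example, but the flag, the program and the whole mechanism are assumptions, not an existing STAR facility.

```c
/* "Skin changing" sketch: decide at startup whether this node runs as an L3 node
 * or as an Offline reconstruction node, based on a flag passed by the boot
 * command (e.g. "role=l3" or "role=offl" on the kernel command line).
 * The flag name and the whole mechanism are illustrative assumptions. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char cmdline[1024] = "";
    FILE *f = fopen("/proc/cmdline", "r");
    if (f) {
        if (!fgets(cmdline, sizeof(cmdline), f))
            cmdline[0] = '\0';
        fclose(f);
    }

    if (strstr(cmdline, "role=l3")) {
        /* diskless Level 3 skin: start DAQ/L3 services, never touch the disks */
        printf("booting as Level 3 node (diskless, controlled by the L3 group)\n");
    } else if (strstr(cmdline, "role=offl")) {
        /* offline skin: mount the local disks, start reconstruction jobs */
        printf("booting as Offline reconstruction node (disks controlled by Offline)\n");
    } else {
        printf("no role flag found; staying idle\n");
    }
    return 0;
}
```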

Summary
EVB rates are no problem (up to 500 MB/s) for STAR DAQ; however, the RCF side is a different issue (see M. Messer’s talk)
Detector FEE + DAQ frontend + Level 3 need a complete overhaul, and we must start from scratch
If we keep any of the existing systems we cannot go above 50 Hz
1000 Hz (or more) into Level 3 is doable, but a lot of work needs to be done to optimize it
We need to know what we are looking for in L3, since completely general and exhaustive tracking will probably not be possible
Most of the Level 3 cost could be shared with Offline Reconstruction if we use the Skin Changing scheme

Conclusion
Doable.
Need an R&D effort (funding, manpower) immediately for:
–TPC FEE overhaul
–DAQ frontend studies; hardware and software adaptations
–Interconnect/network studies for the FEE → DAQ data transfer as well as DAQ → L3
Need strong support from the collaboration – the effort needed is too large to be done in “our spare time”
We should change the name to SuperSTAR