Possible DAQ Upgrades DAQ1k… DAQ2k… DAQ10k!? Tonko Ljubičić, STAR/BNL (for the “3L Group”: Landgraf, LeVine & Ljubičić; Lange would fit nicely too)


1 Possible DAQ Upgrades DAQ1k… DAQ2k… DAQ10k!?
Tonko Ljubičić, STAR/BNL (for the “3L Group”: Landgraf, LeVine & Ljubičić; Lange would fit nicely too)
–Increase the event rate into Level 3
–Increase the event rate onto storage
… but make it cheap (unlikely)
… and make it simple (unlikely)
… and do it without additional manpower (ridiculous)
… and do it while STAR is taking data (problematic)

2 Assumptions…
We have the TPC (or similar), i.e. a tracking device with many channels
We want a Level 3 trigger (based upon tracks)
We have a good cluster finder, so we save only the 2D hitpoints
The final storage (tapes) is under RCF’s control
Assumed Requirements:
–At least 1000 Hz Level 3 rate (central, Au+Au)
–At least 100 Hz storage rate (central, Au+Au)

3 DAQ Components
Event Builder and event buffer
Level 3 CPU farm
DAQ frontend (Cluster Finder, Formatter)
Detector Frontend (FEE)
Network interconnect:
–Between DAQ frontend, L3, EVB
–Between FEE and DAQ frontend

4 DAQ Components (cont’d) (current)
EVB: 1 Sun, 70 MB/s, 700 GB buffer → 10 Hz central AuAu raw, 50 Hz clusters only
L3: 48 Alphas (500 MHz) → 50 Hz central AuAu
DAQ RB: 144 X 3 slow I960 CPUs → 50 Hz central AuAu
TPC FEE: 100 Hz
Network:
–Main: Myrinet, 100 MB/s/link
–FEE → DAQ: 1.25 Gb/s → 100 evts/s
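The current EVB figures allow a rough back-of-the-envelope estimate of event sizes and of the storage bandwidth needed for a 100 Hz clusters-only rate. The sketch below assumes (this is not stated in the slides) that the quoted 10 Hz and 50 Hz limits are set purely by the 70 MB/s EVB bandwidth; all variable names are illustrative.

```python
# Rough estimate only. Assumption: the EVB rate limits quoted on the slide
# are bandwidth-limited, so event size = bandwidth / rate.
EVB_BANDWIDTH_MB_S = 70.0      # current EVB throughput (from the slide)
RAW_RATE_HZ = 10.0             # central AuAu, raw data
CLUSTER_RATE_HZ = 50.0         # central AuAu, clusters only
TARGET_STORAGE_RATE_HZ = 100.0 # storage-rate goal for the upgrade

raw_event_mb = EVB_BANDWIDTH_MB_S / RAW_RATE_HZ          # implied raw event size
cluster_event_mb = EVB_BANDWIDTH_MB_S / CLUSTER_RATE_HZ  # implied clusters-only size
needed_mb_s = TARGET_STORAGE_RATE_HZ * cluster_event_mb  # bandwidth for the target rate

print(f"raw event      ~ {raw_event_mb:.1f} MB")
print(f"cluster event  ~ {cluster_event_mb:.1f} MB")
print(f"needed for {TARGET_STORAGE_RATE_HZ:.0f} Hz ~ {needed_mb_s:.0f} MB/s")
```

Under that assumption, a 100 Hz clusters-only stream needs roughly twice the bandwidth of the current single-Sun EVB.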

5 Upgrades (EVB)
Cluster of Linux CPUs connected via Gigabit Ethernet switch to RCF
Each has:
–Large (and cheap) disk buffers (e.g. 4 X 120 GB IDE)
–512 MB memory (not that much)
–1 Gigabit Ethernet card (cheap)
–1 Myrinet card (for internal DAQ) (1 k$)
–1 CPU of any slow variety (not CPU-intensive)
–Good, fast motherboard (I/O intensive)
Need about 5-10 of them
Advantages:
–Scalability – adding more nodes increases rates linearly
–Parallelism is simple – round robin on an event-by-event basis, all nodes are equal
–Robustness – all are the same, trivial automatic recovery in case of failure
–Cost – IDE disks are soooo much cheaper than SCSI
Cost: 4 k$ per cluster node (nicely equipped). Now!
–Compare to current 50 k$ for a single Sun workstation: for the cost of one Sun we get 10 X (!) the throughput!
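The event-by-event round robin and the "trivial automatic recovery" can be illustrated with a minimal sketch. Node names such as `evb00` and both function names are hypothetical; a real event builder would also have to handle events already in flight on a failed node, which this sketch ignores.

```python
from collections import Counter
from itertools import cycle


def assign_round_robin(event_ids, nodes):
    """Assign each event to the next node in turn (all nodes are equal)."""
    node_cycle = cycle(nodes)
    return {event_id: next(node_cycle) for event_id in event_ids}


def assign_skipping_failed(event_ids, nodes, failed):
    """Same round robin, but simply leave failed nodes out of the rotation."""
    alive = [node for node in nodes if node not in failed]
    if not alive:
        raise RuntimeError("no event-builder nodes available")
    return assign_round_robin(event_ids, alive)


# Hypothetical five-node cluster and 1000 events.
nodes = [f"evb{i:02d}" for i in range(5)]
load = Counter(assign_round_robin(range(1000), nodes).values())
load_after_failure = Counter(
    assign_skipping_failed(range(1000), nodes, failed={"evb03"}).values()
)
print(load)
print(load_after_failure)
```

Because every node is interchangeable, removing one only changes the share each survivor receives, which is the sense in which the rates scale linearly with node count.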

6 Upgrades (TPC FEE)
ALICE developed a FEE chip for their own TPC (ALTRO)
8-channel analog/digital hybrid with ADCs and DSP on chip
Pedestal subtraction, gain correction, baseline restoration, zero-suppression and event buffering (8 buffers) on chip
(Up to) 20 MHz sampling clock
Decoupled readout clock of (up to) 40 MHz
Available now (?)
Needs more evaluation but looks promising!
Expect more details from the Berkeley guys in the near future (Bieser, Crawford)
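To make two of the on-chip steps concrete, here is a generic sketch of pedestal subtraction followed by zero-suppression on one channel's time samples. It is not the ALTRO algorithm: the fixed per-channel pedestal, the simple threshold, and the function name `zero_suppress` are all assumptions made for illustration.

```python
def zero_suppress(samples, pedestal, threshold):
    """Subtract a fixed pedestal and keep only samples above threshold.

    Returns a list of (time_bin, amplitude) pairs; everything else is dropped,
    which is what reduces the data volume.
    """
    kept = []
    for time_bin, raw in enumerate(samples):
        amplitude = raw - pedestal
        if amplitude > threshold:
            kept.append((time_bin, amplitude))
    return kept


# Made-up ADC samples for a single channel with a pulse around time bin 4.
samples = [50, 51, 49, 80, 120, 75, 50, 52]
hits = zero_suppress(samples, pedestal=50, threshold=5)
print(hits)
```

Doing this in the FEE rather than in the DAQ frontend shifts the data reduction upstream of the FEE-to-DAQ links.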

7 Upgrades (DAQ frontend)
Inputs data from detector FEE, finds clusters, formats them, calculates pedestals, buffers data, ships to L3/EVB, etc. – versatile
Works on an M X N (2D) plaque suitable for most detectors (e.g. TPC padrow is 182 X 512, SVT is 240 X 128, etc.) – “detector blind”
Current example:
–Intel I960HD CPU, 66 MHz internal, 33 MHz external bus; takes ~7 ms per padrow for a central Au+Au event → need speedup of ~10 X (but hope for more)
Possible choices:
–DSPs (“easy” to program; many, many to choose from)
–FPGAs (tough to program, fast!, many to choose from)
–Embedded FPGA cores or hybrids (e.g. Xilinx Virtex II Pro)
  Combination of both FPGA & CPU
  Versatile – many have fast links (e.g. 3.25 Gb/s!) on chip!
  Extremely complex!
  Expensive! (at least now…)
A lot of R&D:
–Evaluate possible hardware choices (above)
–Adapt the cluster finder software to the different hardware
–Need very specific manpower – possible cooperation with Instrumentation Division
–Very critical item – need to start work NOW! (R&D funding)
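The "detector blind" idea, a cluster finder that only sees an M X N array of ADC values, can be sketched as follows. This is a generic threshold plus connected-region search with a charge-weighted centroid, chosen for illustration; it is not the STAR cluster finder, and the function name and threshold are assumptions.

```python
from collections import deque


def find_clusters(adc, threshold):
    """Find clusters in a 2D (M x N) array of ADC values.

    Cells above threshold that touch horizontally or vertically are grouped.
    Returns (row_centroid, col_centroid, total_charge) per cluster.
    """
    n_rows = len(adc)
    n_cols = len(adc[0]) if n_rows else 0
    seen = [[False] * n_cols for _ in range(n_rows)]
    clusters = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c] or adc[r][c] <= threshold:
                continue
            # Breadth-first search over the connected above-threshold region.
            queue = deque([(r, c)])
            seen[r][c] = True
            charge = sum_r = sum_c = 0
            while queue:
                i, j = queue.popleft()
                q = adc[i][j]
                charge += q
                sum_r += q * i
                sum_c += q * j
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < n_rows and 0 <= nj < n_cols
                            and not seen[ni][nj] and adc[ni][nj] > threshold):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            clusters.append((sum_r / charge, sum_c / charge, charge))
    return clusters


# Tiny made-up plaque; a TPC padrow would be 182 x 512.
plaque = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 10, 5, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 8, 8],
]
clusters = find_clusters(plaque, threshold=2)
print(clusters)
```

Nothing in the function depends on what the two axes mean physically, which is why the same code could in principle serve a TPC padrow or an SVT plaque.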

8 DAQ Interconnects
Complex issue, depends on:
–Where will the Cluster Finder be? On the detector? In the DAQ room?
–What is done in FEE vs. Cluster Finder? Does FEE zero-suppress (à la ALICE FEE) or is it left to the DAQ frontend (like now)?
–Data aggregation and scheduling? How does one pack this data? Multiplexing scheme? Data routing?
–How many fibers does one need? At which speed? Which topology?
–Does one use commercially available switches/protocols (e.g. Gigabit, 10 Gb???) or custom built (like we do now)?
–One needs to ship a Sector’s worth of data to a single L3 Node. How? Which network? Which topology?
–Cost!?
–Need to start thinking NOW!

9 Level 3 (tracking)
This is tough:
–Currently takes 40 ms/sector with a pretty fast (500 MHz 21264 Alpha) CPU → need to speed up at least 50 times!
–How to get 50 X (some ideas):
  Faster CPU in 6 years (~4 X)
  Concentrate on primary tracks (~2 X)
  Know the vertex (~2 X). Need vertex detector!!!
  Tune the code (~2 X)
  Only tracks that exit the volume, i.e. pass through the last padrow in the TPC (implied rapidity-Pt cut) (~2 X)
  Use track hits in other detectors (EMC? TOFRPC?) as seeds (~2 X)
  Parallelize, parallelize, parallelize!
  –e.g. each CPU node is a 4-way SMP with each CPU working on one track in parallel (~4 X)
Could be done! (With a lot of magic wand waving…)
Cost!? Assume 4 X 4-way SMP per sector @ 24 sectors: that’s 96 4-way SMP machines. @ 10 k$ per machine that’s ~1 M$. Doable.
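The speedup budget and the cost estimate are simple arithmetic, sketched below. The sketch assumes the individual gains are independent and multiply, which is optimistic and not claimed by the slide; the dictionary keys are just labels for the listed ideas.

```python
import math

# Estimated gain per idea, taken from the list above.
# Assumption: gains are independent, so they multiply.
speedup_factors = {
    "faster CPU in 6 years": 4,
    "primary tracks only": 2,
    "known vertex": 2,
    "tuned code": 2,
    "tracks exiting the volume only": 2,
    "seeds from other detectors": 2,
    "4-way SMP parallelism": 4,
}
REQUIRED_SPEEDUP = 50

combined = math.prod(speedup_factors.values())
print(f"combined speedup ~ {combined} X (need {REQUIRED_SPEEDUP} X)")

# Farm cost: 4 SMP machines per sector, 24 sectors, 10 k$ per machine.
MACHINES_PER_SECTOR = 4
SECTORS = 24
COST_PER_MACHINE_KUSD = 10

machines = MACHINES_PER_SECTOR * SECTORS
total_cost_kusd = machines * COST_PER_MACHINE_KUSD
print(f"{machines} machines, ~{total_cost_kusd} k$")
```

If every factor were fully realized the product would far exceed 50 X, so the target leaves room for several of the ideas to underdeliver.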

10 Level 3 (cont’d)
How to reduce cost and make it sweeter?
Let’s look at Offline vs. Level 3 CPU farms.
Similarities:
–Both need super fast CPUs
–A lot of them!
–Offline needs a fast connection to the data source (e.g. HPSS tapes), but Level 3 already has (or can easily be made to have) a connection to HPSS!
Differences:
–Offline needs disks and a lot of memory; L3 doesn’t
–Offline needs different code structures and perhaps OS setup
Skin Changing Local Grid
–Level 3 nodes “become” reconstruction nodes when not in use in DAQ (“change skin”)
–Level 3 generally boots diskless (for L3) and this system is under complete control of the L3 Group. L3 code doesn’t even need to know that there are disks in the node!
–Offline needs disks, and all the code (kernel/OS/reconstruction) images on those disks are under complete control of Offline.
–The switch from the Level 3 “skin” to Reconstruction is done via a reboot command with an appropriate parameter (e.g. “boot –l3” or “boot –offl”). (The simplest, cleanest but slowest way)
Advantages:
–Major cost saving
Disadvantages:
–Can’t run the whole system at the same time (but one could run certain partitions depending on the required load!)

11 Summary
EVB rates are no problem (up to 500 MB/s) for STAR-DAQ; however, the RCF side is a different issue (see M. Messer’s talk)
Detector FEE + DAQ Frontend + Level 3 need a complete overhaul and we must start from scratch
If we maintain any of the existing systems we cannot go above 50 Hz
1000 Hz (or more) into Level 3 is doable but a lot of work needs to be done to optimize it
We need to know what we are looking for in L3, since completely general and exhaustive tracking will probably not be possible
Most of the Level 3 cost could be shared with Offline Reconstruction if we use the Skin Changing scheme

12 Conclusion
Doable
Need R&D effort (funding, manpower) immediately for:
–TPC FEE overhaul
–DAQ frontend studies; hardware and software adaptations
–Interconnect/network studies for the FEE → DAQ data transfer as well as DAQ → L3
Need strong support from the collaboration – the effort needed is too large to be done in “our spare time”
We should change the name to SuperSTAR

