Future Dataflow Bottlenecks Christopher O’Grady with A. Perazzo and M. Weaver Babar Dataflow Group
View of the DAQ System A series of parallel assembly lines. The system runs as fast as the slowest worker on the assembly line. Trigger rate projections tell us how fast workers should work; event size projections tell us how long they actually take. We use software written by Amedeo Perazzo and James Swain to record system performance information, and we use this information to project into the future. All the following plots were made by Matt Weaver.
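The assembly-line picture can be sketched in a few lines of Python. This is only an illustration, not the group's projection software: the stage names and per-event times are hypothetical, and the model ignores buffering and deadtime.

```python
def slowest_stage(stage_times_us):
    """Return (name, time_us) of the stage that takes longest per event."""
    name = max(stage_times_us, key=stage_times_us.get)
    return name, stage_times_us[name]


def max_sustainable_rate_hz(stage_times_us):
    """Rate the line can sustain: one event per slowest-stage time."""
    _, slowest_us = slowest_stage(stage_times_us)
    return 1e6 / slowest_us


# Hypothetical per-event times in microseconds (illustrative only).
example_stages_us = {
    "fiber transfer": 120.0,
    "feature extraction": 200.0,
    "VME": 100.0,
}
```

With these made-up numbers, feature extraction is the slowest worker and the whole line is limited to 5000 Hz, however fast the other stages are.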
Projections On June 25, 2001 I predicted the DCH readout would be a bottleneck in Also, when we saw unexplained deadtime from GLT, the projection system “told” us that there was a 90us delay shipping data (which we then saw on the scope). This projection system works well.
Projection Improvements Matt has split all occupancy projections into HER/LER/LUM components (previously done only for DCH), using Jan 2004 background runs. The projection now looks at all ROMs individually; previously it looked at just the worst. Projecting through (3*10**34)
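One way to picture a per-component projection is a linear model in which each ROM's occupancy has a constant term plus terms proportional to the HER current, the LER current, and the luminosity. The linear form, the names `RomModel` and `project_all`, and every coefficient below are assumptions for illustration; the slides do not specify the fit actually used.

```python
from dataclasses import dataclass


@dataclass
class RomModel:
    """Hypothetical linear occupancy model for one ROM."""
    constant: float      # occupancy with no beam
    her_per_amp: float   # HER term, per ampere of HER current
    ler_per_amp: float   # LER term, per ampere of LER current
    lum_per_e33: float   # LUM term, per 10**33 of luminosity

    def project(self, her_amps, ler_amps, lum_e33):
        """Sum the constant, HER, LER and LUM contributions."""
        return (self.constant
                + self.her_per_amp * her_amps
                + self.ler_per_amp * ler_amps
                + self.lum_per_e33 * lum_e33)


def project_all(models, her_amps, ler_amps, lum_e33):
    """Project every ROM individually instead of only the worst one."""
    return {rom: m.project(her_amps, ler_amps, lum_e33)
            for rom, m in models.items()}


# Made-up coefficients for two ROMs, evaluated at 3*10**34 (lum_e33 = 30).
models = {
    "rom_a": RomModel(1.0, 2.0, 0.5, 0.25),
    "rom_b": RomModel(0.5, 1.0, 1.0, 0.5),
}
projected = project_all(models, her_amps=2.0, ler_amps=4.0, lum_e33=30.0)
worst_rom = max(projected, key=projected.get)
```

Keeping the per-ROM dictionary, rather than only `worst_rom`, shows which ROM becomes the worst as the beam conditions change.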
General Observations HER worse than 2002. LER better than 2002. Now see a luminosity term in sizes. Event size is 75kb in 2007 (3*10**34).
Trigger Rate Projections From the Trigger Group: Need <140us!
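The time limit follows from the trigger rate: at a steady rate f, each stage has on average 1/f per event. The sketch below assumes that the ~140us figure corresponds to the 7kHz L1 rate quoted in the summary, which the slides do not state explicitly. The function names and the stage times in the usage are hypothetical.

```python
def time_budget_us(trigger_rate_hz):
    """Average time available per event, in microseconds."""
    return 1e6 / trigger_rate_hz


def over_budget(stage_times_us, trigger_rate_hz):
    """Names of stages whose per-event time exceeds the budget."""
    budget = time_budget_us(trigger_rate_hz)
    return sorted(name for name, t in stage_times_us.items() if t > budget)
```

For example, `time_budget_us(7000.0)` is about 143us and `time_budget_us(5000.0)` is 200us, so a stage that takes 150us per event keeps up at 5kHz but not at 7kHz.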
Fiber Transfer Bottlenecks DCH/SVT the largest. GLT also important. 140us
Behaviour of Fiber Deadtime Worse than other deadtime, since “earliest” buffering in the system.
Plan of Attack for Fiber DCH: in progress. GLT/DCT: Sudong, in progress; should be straightforward. SVT: (1) try running the system at 60MHz, or (2) reduce occupancy (and efficiency) with thresholds. EMT: straightforward.
Feature Extraction Bottleneck DCH/DRC/EMC/EMT/SVT ideally need work 140us
Plan of Attack for FEX DCH FEX taken care of with electronics upgrade. DRC and SVT FEX relatively easy (don’t “do” anything). EMT FEX requires data format change (some work but doable, in principle). Amedeo already did one pass. EMC FEX hard! Already a lot of work on that by Matt.
EMC FEX Need a new idea (like Walt had) or new CPUs. New CPUs won't necessarily work easily: mechanical, electrical, and software issues mean significant work and money. Maybe 20% gain from nbr bits, but the hardware is untested and the corners may not see a gain.
VME Bottleneck Currently overestimated. EMC/DRC/SVT 140us
Plan of Attack for VME Many bottlenecks, but maybe not a problem. Could imagine going to all-network event build. For this would likely need ~150 Gbit network cards ($45K) + fibers + network switch ($60K?). Maybe more L3 nodes.
Summary Up until now we have been able to reduce big bottlenecks: EMT FEX, EMC FEX, network stack, DCH data transfer (in progress), GLT/DCT data transfer (Sudong, in progress). With the above work we should be able to sustain 5kHz in Not good enough for predicted 7kHz L1. EMC FEX and SVT fiber transfer are the hardest. Bottlenecks are getting varied and difficult: ~$100K + significant code mods to eliminate VME; ~$500K for new CPUs, and new CPUs may be tough. There are deadtime periods we don’t quantitatively understand.
My Intuition It’s going to be a little rough by Continue improving the system piece by piece with manpower and money we have, BUT Tightening the trigger in 2007 will likely be necessary.