Sizing of the front-end derandomizer: simulation of the hardware Verilog model
Dominique Breton, Jihane Maalmi
Jihane Maalmi –Dominique Breton- Elba - May 2011
Aim of this study
- This study was triggered by discussions between Steffen, Umberto and Dominique, in the wake of the CERN ETD meeting last February.
- Its goal is to get a first flavour of the derandomizer depth necessary to keep the dead-time limited.
- At the last workshop in Frascati, Steffen presented the first simulations, performed with SimPy.
  - These ran a simplified model of the derandomizer elements.
  - They produced clean and useful preliminary results.
- In parallel, we started working on Verilog simulations based on the front-end implementation presented at the previous workshops.
- In this talk, the first simulations of this actual hardware implementation are shown.
- The environment and all effective parameters are described.
General Architecture of SuperB Electronics
Constraints concerning the Trigger
Trigger window:
- Long latency (~6 µs) plus jitter, due to machine and detector constraints => potentially large trigger window (1 µs max) for data readout.
- The trigger window will be adjusted per sub-detector in order to optimize the dataflow: it will be fixed but programmable in the FEE.
Consecutive triggers:
- No minimum distance is fixed at the architecture level.
- A minimum of ~70 ns is highly probable, due to the time precision of the trigger.
- No limit is fixed on their number in a burst.
=> These constraints should depend only on the trigger system itself.
Problems:
- Two consecutive physics events may reside within the same trigger time window (overlapping).
- The FEE should be able to deal with close triggers (overlapping) and send data accordingly, reducing the size of the later events.
Simulation of synchronous model
- The FCTS sends an L1 trigger command, optionally associated with a value corresponding to a time window.
- The FEE sends to the DAQ (ROM) the data contained inside a readout window, embedded in a frame including status, trigger tag and time, and length of the data field.
A trigger is defined by three parameters:
- The latency: L (fixed in the FEE)
- The readout window: W (fixed in the FEE and sub-detector dependent)
- The time distance between triggers: D (measured in the FEE)
Constraints:
- Minimum dead time in data processing
- Triggers with potentially overlapping windows
Parameter Definition
[Timing diagram; labels: t0, L1 Trigger #0, Data to keep, Data to dump, L, W, M, Time]
- Baseline: the latency pipeline always provides the oldest relevant data
- L: fixed latency
- W: window containing the relevant data for trigger #0
- M: data sent to ROM
Synchronous Model with a fixed readout window
L: Latency; W: Window; D: Distance between triggers; M: data sent to ROM
Case 1: D ≥ W
- Non-overlapping latencies with two different windows (green in the diagram): no problem.
- M1 = W
Case 2: D < W
- Overlapping latencies with overlapping windows: trickier ...
- The window W1 is then shortened: M1 = W - (W - D) = D
[Timing diagrams for both cases; labels: Trigger #0, Trigger #1, L, D, D ≥ L, W, M0, M1]
Synchronous model: dealing with overlapping
- Case 1: Dn ≥ W: Mn = W
- Case 2: Dn < W: Mn = Dn
- Mn: amount of data to send to ROM for trigger #n
[Block diagram; labels: Trigger input, Counter Dn, comparator "Dn ≥ W?", Clock 56 MHz, Registers (W, L), Fifo "M" (!empty), MUX, Mn, FSM (W end, enable), Counter W, Start_flag and Mn to serializer, Data input, Latency Pipeline (L), Wr_en, Derandomizer]
All operations are pipelined and synchronous at 56 MHz.
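As an illustration (not part of the original slides), the two cases above reduce to Mn = min(W, Dn). A minimal Python sketch, assuming trigger times are expressed in 56 MHz clock periods and that the first trigger of a sequence gets the full window; the function name readout_lengths is hypothetical:

```python
def readout_lengths(trigger_times, window):
    """Return Mn (in clock periods) for each trigger of an increasing sequence.

    trigger_times: trigger arrival times in clock periods (assumed sorted).
    window: readout window W in clock periods.
    """
    lengths = []
    previous = None
    for t in trigger_times:
        if previous is None:
            # Assumption: no earlier trigger, so the full window is read out.
            lengths.append(window)
        else:
            distance = t - previous              # Dn
            # Case 1 (Dn >= W): Mn = W; case 2 (Dn < W): Mn = Dn.
            lengths.append(min(window, distance))
        previous = t
    return lengths


print(readout_lengths([0, 100, 110, 200], window=20))  # [20, 20, 10, 20]
```

The third trigger arrives 10 clock periods after the second one, so its readout is shortened from 20 to 10 clock periods.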
Parameter definition for derandomizer simulation (1)
- The derandomizer is the buffer located just behind the latency buffer and in front of the readout serializer.
- Its role is to absorb the difference in dataflow rate between the gated output of the latency buffer (exponential distribution of the distance between events) and the input of the serializer (fixed dataflow).
This part of the system is defined by three parameters:
- The average trigger rate: R [events/s] (150 kEvents/s by default)
- The readout window: W [number of 56 MHz clock periods] (fixed in the FEE and sub-detector dependent)
- The number of channels multiplexed at the output of the derandomizer to feed the link: N
Default assumption:
- The serializer sends (or does the equivalent of sending) 32-bit words at a rate of 56 MHz => 1.8 Gbits/s
Model of the derandomizer environment
[Block diagram; labels: L1 Trigger, Clock 56 MHz, Registers (L, W), Derandomizer Input State Machine, Ch0 ... ChN-1 Latency Pipeline, Wr_en, Wr, Derandomizer, Almost_full, Rd, Derandomizer Output State Machine, MUX, 32-bit SERDES, Serial link]
All operations are pipelined and synchronous at 56 MHz.
Parameter definition for derandomizer simulation (2)
- The other important parameter is the mean link occupancy.
- It is defined as the ratio between the average link payload and the link's nominal capacity.
In our case:
- The nominal capacity is 1.8 Gbits/s
- The payload is: 32 [bits] . R [kHz] . W . N
- For instance, with a payload of 1.44 Gbits/s, the ratio is 80%
Note that:
- R is a constant at the experiment level (150 kHz)
- W is sub-detector dependent
- N has to be defined by each sub-detector
To get a feel for the influence of this ratio, think of a derandomizer where the ratio is equal to 100%, fill it up with a burst, and wonder how long it may take to empty it ...
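A short worked example of this ratio (an illustration, not from the original slides). The values W = 30 and N = 10 are hypothetical, chosen only because W . N = 300 reproduces the 1.44 Gbits/s payload quoted above; the function name link_occupancy is also an assumption.

```python
WORD_BITS = 32              # serializer word width, from the slides
NOMINAL_CAPACITY_BPS = 1.8e9  # nominal link capacity quoted in the slides


def link_occupancy(rate_hz, window, n_channels,
                   capacity_bps=NOMINAL_CAPACITY_BPS):
    """Mean link occupancy = average payload / nominal capacity.

    rate_hz: average trigger rate R in events/s.
    window: readout window W in 56 MHz clock periods (one word per period).
    n_channels: number of channels N multiplexed on the link.
    """
    payload_bps = WORD_BITS * rate_hz * window * n_channels
    return payload_bps / capacity_bps


# Hypothetical sub-detector: W = 30, N = 10 at the default R = 150 kHz.
print(link_occupancy(150e3, window=30, n_channels=10))   # about 0.8
# Same W and N with the trigger rate doubled to 300 kHz.
print(link_occupancy(300e3, window=30, n_channels=10))   # about 1.6
```

With W and N unchanged, doubling R doubles the occupancy, so a link sized for 80% at 150 kHz ends up above 100% at 300 kHz.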
Introducing the dead time ...
- The goal of these simulations is to get a first flavour of the derandomizer depth necessary to keep the dead-time limited.
- Because the minimum distance between triggers is not zero, the minimum achievable dead-time is not zero either. For instance, there is a constant term of:
  - ~0.45% at 30 ns
  - ~0.75% at 50 ns
  - ~1.05% at 70 ns
  - ~1.5% at 100 ns
- In the following simulations, we took 50 ns as an arbitrary hypothesis and looked for the derandomizer depth necessary to end up with a dead-time of 1%.
The main variable parameters are:
- The window length (W): that way, the effect of pile-up can be studied
- The link occupancy ratio: different values are studied
The other parameter remains constant: R = 150 kEvents/s
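The constant terms listed above are numerically consistent with the product R . Dmin (trigger rate times minimum inter-trigger distance) at R = 150 kHz. This reading is inferred from the numbers and is not stated in the slides; the sketch below only checks the arithmetic, and the function name is hypothetical.

```python
def min_dead_time_fraction(rate_hz, min_distance_s):
    """First-order estimate of the dead-time floor: R * Dmin.

    Assumption: for Poisson triggers, the chance that a trigger falls within
    Dmin of the previous one is 1 - exp(-R * Dmin), about R * Dmin when small.
    """
    return rate_hz * min_distance_s


RATE_HZ = 150e3  # default trigger rate from the slides

for distance_ns in (30, 50, 70, 100):
    fraction = min_dead_time_fraction(RATE_HZ, distance_ns * 1e-9)
    print(f"{distance_ns} ns -> {100 * fraction:.2f} %")
```

The loop gives 0.45 %, 0.75 %, 1.05 % and 1.50 %, matching the values on the slide.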
Simulation Result
- Trigger mean rate: 150 kHz; minimum inter-trigger distance: 50 ns
- Required dead time: between 0.9% and 1.1%
- W varying between 10 and 55 (180 ns to 990 ns)
- Mean link occupancy (or ratio): between 0.5 and 0.95
- Result: derandomizer depth [number of full events]
[Plot: derandomizer depth versus ratio (mean link occupancy)]
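For readers who want to experiment with this kind of scan, here is a toy Monte Carlo in Python. It is neither the Verilog model of this talk nor Steffen's SimPy model, and it reproduces none of the results shown. Its assumptions are all mine: a single aggregate buffer of depth . W . N words, a link draining one word per 56 MHz clock period, a trigger lost when it arrives closer than the minimum distance to the last accepted trigger or when a full event no longer fits, and Mn = min(W, Dn) for accepted triggers.

```python
import random

CLOCK_NS = 1e3 / 56.0  # one 56 MHz clock period in ns


def simulate_dead_time(depth_events, window, n_channels, rate_hz=150e3,
                       min_distance_ns=50.0, n_triggers=200_000, seed=1):
    """Return the fraction of lost triggers in a toy derandomizer model."""
    rng = random.Random(seed)
    mean_gap = 1e9 / rate_hz / CLOCK_NS      # mean trigger spacing [clocks]
    min_gap = min_distance_ns / CLOCK_NS     # minimum distance [clocks]
    event_words = window * n_channels        # size of a full event [words]
    capacity = depth_events * event_words    # buffer size [words]

    fill = 0.0                               # words currently buffered
    since_accepted = float("inf")            # clocks since last accepted trigger
    lost = 0
    for _ in range(n_triggers):
        gap = rng.expovariate(1.0 / mean_gap)    # exponential spacing
        fill = max(0.0, fill - gap)              # link drains 1 word per clock
        since_accepted += gap
        too_close = since_accepted < min_gap
        no_room = fill + event_words > capacity  # stand-in for almost_full
        if too_close or no_room:
            lost += 1
            continue
        # Overlapping windows shorten the event: Mn = min(W, Dn).
        fill += min(window, since_accepted) * n_channels
        since_accepted = 0.0
    return lost / n_triggers


# Hypothetical configuration: W = 30, N = 10 (occupancy about 0.8).
for depth in (1, 4, 16):
    print(depth, simulate_dead_time(depth, window=30, n_channels=10))
```

Scanning the depth until the returned fraction falls inside the required dead-time band mimics the procedure described on this slide, within the limits of the toy model.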
Simulation Result (2)
- Trigger mean rate: 150 kHz; minimum inter-trigger distance: varying (30, 50, 70 ns)
- Required dead time: between 0.9% and 1.1%
- Mean link occupancy (or ratio): 0.8
- W varying between 10 and 55 (180 ns to 990 ns)
- Result: derandomizer depth [number of full events]
[Plot: derandomizer depth versus window W]
Simulation Results (3)
Dead time as a function of W, with W varying between 10 and 40:
- Plot 1: trigger mean rate 150 kHz, minimum inter-trigger distance 30 ns, for a link with mean link occupancy = 0.8
- Plot 2 (upgrade), with the same derandomizer depth as obtained for plot 1 and without changing the multiplexing factor (N): trigger mean rate 300 kHz, minimum distance 30 ns
  - Mean link occupancy > 100%: dead time increases by a factor 35!!
[Plots: dead time versus W]
Remarks
- The problem is very different if one looks at detectors with opposite behaviors:
  - short time window, high multiplexing factor and reduced pile-up (and possibly variable event size) => ~SVT, Forward and Barrel PID, Backward EMC, IFR
  - long fixed time window, low multiplexing factor and high pile-up probability => DCH, Forward and Barrel EMC
- In the case of variable event size, what should we base the derandomizer depth on?
  - The average size? (less depth but more potential pile-up)
  - The maximum size? (minimum dead-time but maximum depth)
- Importance of a fast throttle
  - Problem: how to implement a model of the derandomizer (to build the throttle) if the event size is random?
  - The worst case will depend on the sub-detector implementation
  - We may need a fast direct throttle between FEE and FCTS
  - The return path of the clock and control links could be used for this purpose
Conclusion
- We started simulating the FEE hardware Verilog model in order to estimate the derandomizer depth necessary to keep the dead-time at a reasonable level.
- To that end, we defined all the parameters linked to the derandomizer environment.
- First results show that the link occupancy ratio has a great influence on the derandomizer depth.
- Pile-up helps! ...
- We have to optimize the code to extract the required information more easily.
- We need to compare our results with Steffen's.
- The final goal of the study is to give the sub-detectors a table of the necessary derandomizer depth and link occupancy ratio as a function of the width of their own trigger time window.