(Fast) flow control to the FE LHCb Electronics Upgrade Meeting 13 June 2013 F. Alessio, K. Wyllie and conversations with many of you.

1 (Fast) flow control to the FE LHCb Electronics Upgrade Meeting 13 June 2013 F. Alessio, K. Wyllie and conversations with many of you

2 Scope & Outline
Converge on minimal common solutions for the digital part of the FE.
- Flow control at the FE
  - How and at which stage to use TFC commands?
  - Minimal specifications for an FE channel
  - Addresses the main problems connected to TFC commands at the FE: how and where to use them in a synchronous way
  - SYNCH command and SYNCH pattern
- Data packing format
  - Aim at keeping the TELL40 data encoding independent from TFC commands
  - Flexibility and simplicity
  - Different data packing formats for different data readout modes
First attempt to simulate a generic FE channel: LHCb-PUB-2013-015

3 Generic FE channel as in specs
- The FE channel contains a buffer: no trigger at the FE!
- A derandomizer is used to pipe data at 40 MHz to be packed and sent over the GBT link.
- TFC commands synchronize the FE readout.
- Data coming out on the GBT link:
  - No empty spaces, no unexpected 0s
  - Fully dynamic packing algorithm across the GBT frame width
  - Ideally, data should be in order…
  - If the buffer is empty and a new event cannot be sent, send an idle frame
- But what’s in the HEADER?
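The continuous packing described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' firmware: the name `pack_frames`, the bit-string representation of packets, and the choice to send an all-zero IDLE frame (Data Valid low) whenever fewer than a full frame of bits is pending are all hypothetical. The 80-bit frame width is the one used in the simulation scenarios later in the talk.

```python
from collections import deque

FRAME_BITS = 80  # GBT frame payload width used in the talk's simulation


def pack_frames(packets, n_cycles, frame_bits=FRAME_BITS):
    """Pack variable-length packets back to back into fixed-width frames.

    packets: iterable of '0'/'1' strings, one per event, already in order.
    Returns one (data_valid, frame) tuple per clock cycle.
    """
    pending = deque(packets)
    residue = ""  # bits of a packet that straddles a frame boundary
    frames = []
    for _ in range(n_cycles):
        # Pull packets until a full frame is available or the buffer is empty.
        while len(residue) < frame_bits and pending:
            residue += pending.popleft()
        if len(residue) >= frame_bits:
            frames.append((True, residue[:frame_bits]))
            residue = residue[frame_bits:]
        else:
            # Assumption: a partially filled frame is held back and IDLE is sent.
            frames.append((False, "0" * frame_bits))
    return frames
```

Two 50-bit packets fill one 80-bit frame with no gap between them; the remaining 20 bits wait for more data while IDLE frames go out.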

4 FE flow control scheme
A few comments to start with:
- BX Veto and Header Only commands are identical from the FE point of view → ORed.
- TFC commands are synchronous with respect to BXID Reset → once we align BXID Reset with the beam, TFC commands ALWAYS come at the same latency (with respect to BXID Reset, hence BXID)!
  → Compression/suppression logic should act according to the TFC command (why would you want to compress/suppress if that crossing is rejected a priori? Especially if your pre-processing is dynamic…)
- Data is filtered according to TFC commands and the FE buffer status.
- Data is packed onto the GBT link in a continuous fashion.

5 Generic FE data flow scheme
- Compression/suppression logic can have dynamic or static latency; it applies changes to the data.
- FE buffer for data.
- Tag data with TFC commands and pipe them across the compression/suppression logic block.
- Modify data according to TFC commands + BufferFull, then pack continuously onto the GBT.
- "Data available" is needed only if compression/suppression is dynamic.

6 Generic FE data flow scheme
Append and align TFC commands and BXID to the raw data before the compression/suppression stage.

7 Generic FE data flow scheme
After compression/suppression, add other information:
- Big Event
- Data available (if applicable)
- Data Length

8 Generic FE data flow scheme
The data filter generates the data packets to write to the buffer.
Typical case:
- Synch = 0
- (Header_Only or BX Veto) = 0 and BufferFull = 0
- NZS = 0
The truncation bit is mandatory! The BXID length is customizable. The data length is needed for TELL40 decoding.

9 Generic FE data flow scheme
Special case 1:
- Synch = 1
- (Header_Only or BX Veto) = 0 and BufferFull = 0
- NZS = 0
The Synch Pattern (and its length) is programmable! The BXID needs 12 bits!

10 Generic FE data flow scheme
Special case 2:
- Synch = 0
- (Header_Only or BX Veto) = 1 or BufferFull = 1
- NZS = 0
No need for a Length field or Data field (or anything else). Truncation bit = 1 indicates that data was discarded!

11 Generic FE data flow scheme
Special case 3:
- Synch = 0
- (Header_Only or BX Veto) = 0 and BufferFull = 0
- NZS = 1
Trunc = 0 + a special code (0x1111…) indicates that it is an NZS packet.
→ The length of the NZS data packet is configured per sub-detector.
→ Never allow 0x1111… in the typical case (at most 0x1111…0).

12 Summarizing
(Summary table not reproduced in this transcript.)
- X = don’t care
- "Optional" means the field can be added if wanted.
- Order of priority
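The typical case and the three special cases can be read as a small decision function. The sketch below is a hedged illustration only: the summary table with the actual order of priority is not reproduced in this transcript, so the order used here (SYNCH first, then veto/buffer-full, then NZS, then typical) is an assumption, as are the names `TfcFlags` and `select_packet` and the dictionary layout of the returned fields. Truncation of oversized events is not modelled.

```python
from dataclasses import dataclass


@dataclass
class TfcFlags:
    """Per-crossing flags seen by the data filter (names are hypothetical)."""
    synch: bool = False
    header_only: bool = False
    bx_veto: bool = False
    buffer_full: bool = False
    nzs: bool = False


def select_packet(flags, bxid, data_bits=""):
    """Return (kind, fields) for one bunch crossing.

    Assumed priority: SYNCH > veto / buffer full > NZS > typical.
    """
    if flags.synch:
        # Special case 1: 12-bit BXID, the synch pattern itself is added later.
        return "synch", {"bxid": bxid & 0xFFF}
    if flags.header_only or flags.bx_veto or flags.buffer_full:
        # Special case 2: header only, truncation bit flags discarded data.
        return "header_only", {"bxid": bxid, "trunc": 1}
    if flags.nzs:
        # Special case 3: fixed-length NZS payload, no Length field needed.
        return "nzs", {"bxid": bxid, "trunc": 0, "data": data_bits}
    # Typical case: zero-suppressed data with an explicit length.
    return "typical", {"bxid": bxid, "trunc": 0,
                       "length": len(data_bits), "data": data_bits}
```

Because Header Only and BX Veto are ORed at the FE, both flags lead to the same header-only packet in this sketch.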

13 Generic FE data flow scheme
Buffer and packing control manages:
- Read and write pointers to the buffer
- Buffer occupancy (Buffer Full flag)
- Data valid flag for the GBTX
See the TELL40 firmware specs, coming soon…
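A minimal software model of such a buffer controller might look like the following. It is a sketch under assumptions, not the firmware presented in the talk: the class name `FeBuffer`, its methods, and the rule that Data Valid is high whenever the buffer holds at least one packet are all hypothetical choices made for illustration.

```python
class FeBuffer:
    """Fixed-depth circular buffer with explicit read and write pointers."""

    def __init__(self, depth):
        self.depth = depth
        self.mem = [None] * depth
        self.rd = 0      # read pointer
        self.wr = 0      # write pointer
        self.count = 0   # current occupancy

    @property
    def full(self):
        """Buffer Full flag."""
        return self.count == self.depth

    @property
    def data_valid(self):
        """Assumed Data Valid rule: something is available to send."""
        return self.count > 0

    def write(self, packet):
        """Store a packet; return False (packet dropped) when full."""
        if self.full:
            return False
        self.mem[self.wr] = packet
        self.wr = (self.wr + 1) % self.depth
        self.count += 1
        return True

    def read(self):
        """Return the oldest packet, or None when the buffer is empty."""
        if not self.data_valid:
            return None
        packet = self.mem[self.rd]
        self.rd = (self.rd + 1) % self.depth
        self.count -= 1
        return packet

    def clear(self):
        """Reset pointers and occupancy, discarding the content."""
        self.rd = self.wr = self.count = 0
```

A rejected `write` is the point where a real FE would fall back to a header-only packet with the truncation bit set.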

16 What to do on a SYNCH command?
The SYNCH command is meant to make sure that the system is synchronized… in a synchronous way!
Double usage (in AND or in OR):
1. Periodically: i.e. a SYNCH command sent at a rate of n Hz
2. Asynchronously: i.e. when a desynchronization is detected, e.g. the TELL40 detects wrong frames or wrong packing, or fast diagnostics in the sub-detector-specific TELL40 code raise a problem
   → it makes sense to clear the FE buffer
   → it could be sent only to a local sub-detector from the SOL40, i.e. it could be triggered quickly either by the ECS or by the TELL40 via the SOL40…
What happens:
→ FEs send the Synch Pattern for the same BXIDs everywhere; the TELL40 can align to the corresponding frame and BXID
→ the FE frees its memory: it deletes its content and sets the read and write pointers back to empty
→ the FE sends the Synch Pattern
→ the TELL40 closes all previous events and sends them out truncated
The Synch Pattern is programmable!
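On the receiving side, aligning to the Synch Pattern amounts to searching the incoming bit stream for a known packet. The sketch below assumes, purely for illustration, that a synch packet is a 12-bit BXID immediately followed by the programmable pattern; the actual field layout is defined in the specifications, not here. The function names and the example pattern are hypothetical.

```python
BXID_BITS = 12  # the talk states that the BXID needs 12 bits in the synch case


def make_synch_packet(bxid, pattern):
    """Build an assumed synch packet: 12-bit BXID followed by the pattern."""
    return format(bxid & 0xFFF, "012b") + pattern


def find_synch(stream, pattern, bxid):
    """Return the bit offset of the synch packet for `bxid`, or -1 if absent.

    stream and pattern are strings of '0'/'1' characters.
    """
    return stream.find(make_synch_packet(bxid, pattern))
```

Since the FE clears its buffer before sending the pattern, the data after the returned offset starts from a known state. A short pattern can also occur by chance inside ordinary data, which is one reason for keeping pattern and length programmable.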

17 Data Valid
The GBT can accept a DATA or an IDLE frame:
→ Send an IDLE frame whenever a GBT frame is not ready to be sent!
→ An IDLE frame can contain whatever your sub-detector wants to send. See the TELL40 firmware specs, coming soon…
The Data Valid signal distinguishes between DATA and IDLE frames.

18 Data Valid
Be careful to raise the Data Valid signal on the right rising edge when using the 80 MHz clock (or 160 or 320…): otherwise the GBTX would split the frame! Synchronize your DV signal to the beginning of the GBT frame!

19 Data Valid
Some sub-detectors will connect more than one FE to the same GBT transmitter:
- Each FE has its own memory
- It can happen that one FE can send DATA while the other cannot! (IDLE vs DATA in the same packet)
Keep DV always high! You HAVE to indicate whether the packet was DATA or IDLE by sacrificing one bit of your DATA/IDLE frame.
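One way to picture the sacrificed bit is a frame split into fixed slots, one per FE, where the first bit of each slot says DATA (1) or IDLE (0). This is only a sketch of the idea: the slot layout, the flag position, the all-zero IDLE filler and the function names are assumptions, and real sub-detectors may organize the shared frame differently.

```python
def build_shared_frame(fe_payloads, slot_bits):
    """Concatenate one slot per FE, each led by a DATA(1)/IDLE(0) flag bit.

    fe_payloads: one entry per FE, either a bit string of exactly
    slot_bits - 1 bits, or None when that FE has nothing to send.
    """
    frame = ""
    for payload in fe_payloads:
        if payload is None:
            frame += "0" * slot_bits          # flag = 0, IDLE filler
        elif len(payload) != slot_bits - 1:
            raise ValueError("payload must fill the slot exactly")
        else:
            frame += "1" + payload            # flag = 1, DATA
    return frame


def split_shared_frame(frame, slot_bits):
    """Inverse operation: return the payload of each slot, or None for IDLE."""
    slots = [frame[i:i + slot_bits] for i in range(0, len(frame), slot_bits)]
    return [s[1:] if s[0] == "1" else None for s in slots]
```

With this layout the frame-level Data Valid can stay high all the time, and the receiver decides per slot whether the content is meaningful.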

21 Conclusion
Updated Electronics specifications documents ready: LHCb-PUB-2011-011

22 Qs & As?

23 Reminder
First attempt to simulate the behaviour of a generic FE channel with TFC commands.
- Previous electronics meeting: https://indico.cern.ch/conferenceDisplay.py?confId=225750
- First simulation effort: LHCb-INT-2013-015
A few questions arose:
- Can we live with fewer BXID bits?
- Can we find some dynamic packing rules to allow sending fewer bits of header?
- Can we pack data fully dynamically at the FE?
- Can you use and apply TFC commands at the right time?
- Can we send ZS and NZS dynamically?
- Etc… (if only it were so simple…)

24 System and functional requirements
1. Bidirectional communication network
2. Clock jitter, and phase and latency control
   - At the FE, but also at the TELL40 and between S-TFC boards
3. Partitioning to allow running with any ensemble and parallel partitions
4. LHC interfaces
5. Event rate control
6. Low-Level-Trigger input
7. Support for the old TTC-based distribution system
8. Destination control for the event packets
9. Sub-detector calibration triggers
10. S-ODIN data bank
    - Information about transmitted events
11. Test-bench support

25 The S-TFC system at a glance
S-ODIN is responsible for controlling the upgraded readout system:
- Distributes timing and synchronous commands
- Manages the dispatching of events to the EFF
- Regulates the rate of the system
- Supports the old TTC system: hybrid system!
SOL40 is responsible for interfacing the FE+TELL40 slice to S-ODIN:
- Fans out TFC information to the TELL40
- Fans in THROTTLE information from the TELL40
- Distributes TFC information to the FE
- Distributes ECS configuration data to the FE
- Receives ECS monitoring data from the FE

26 S-TFC concept reminder

27 The upgraded physical readout slice
Common electronics board for the upgraded readout system: Marseille’s ATCA board with 4 AMC cards
- S-ODIN → AMC card
- LLT → AMC card
- TELL40 → AMC card
- LHC Interfaces → specific AMC card

28 Latest S-TFC protocol to TELL40
«Extended» TFC word to the TELL40 via the SOL40:
→ 64 bits sent at 40 MHz = 2.56 Gb/s (on backplane)
→ packed with the 8b/10b protocol (i.e. a total of 80 bits)
→ no dedicated GBT buffer; use the ALTERA GX simple 8b/10b encoder/decoder
THROTTLE information from each TELL40 to the SOL40: no change, 1 bit for each AMC board + the BXID for which the throttle was set
→ 16 bits in the 8b/10b encoder
→ same GX buffer as before (as same decoder!)
Constant latency after BXID.
We will provide the TFC decoding block for the TELL40: a VHDL entity with inputs/outputs.
MEP accept command when the MEP is ready:
→ take the MEP address and pack to the FARM
→ no need for a special address, dynamic
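The link-rate figures on this slide follow from simple arithmetic, which the short Python sketch below makes explicit. The helper names are hypothetical; the only facts used are the word widths and the 40 MHz clock quoted on the slide, plus the standard 8b/10b property that every 8 payload bits become 10 line bits.

```python
def link_rate_gbps(bits_per_word, clock_mhz=40.0):
    """Throughput in Gb/s for one word of `bits_per_word` per clock cycle."""
    return bits_per_word * clock_mhz / 1000.0


def encoded_bits_8b10b(payload_bits):
    """Number of line bits after 8b/10b encoding of a whole number of bytes."""
    if payload_bits % 8:
        raise ValueError("8b/10b encodes whole bytes")
    return payload_bits // 8 * 10


tfc_payload = 64                             # «extended» TFC word
tfc_line = encoded_bits_8b10b(tfc_payload)   # 80 bits on the line
throttle_line = encoded_bits_8b10b(16)       # 20 bits on the line
```

Under these numbers the 64-bit word corresponds to 2.56 Gb/s of payload, and the 80 encoded bits per 40 MHz cycle correspond to a 3.2 Gb/s line rate.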

29 S-TFC protocol to FE, no change
TFC word on the downlink to the FE via the SOL40, embedded in the GBT word:
→ 24 bits in each GBT frame at 40 MHz = 0.96 Gb/s
→ all commands are associated to the BXID in the TFC word
Put local configurable delays for each TFC command:
- The GBT does not support individual delays for each line
- Need for «local» pipelining: detector delays + cables + operational logic (e.g. laser pulse?)
→ DATA SHOULD BE TAGGED WITH THE CROSSING TO WHICH IT BELONGS!
The TFC word will arrive before the actual event takes place:
- To allow the use of commands/resets for a particular BXID
- Accounting of delays in S-ODIN: for now, 16 clock cycles earlier + time to receive
- Aligned to the furthest FE (simulation, then in situ calibration!)
The TFC protocol to the FE has implications on the GBT configuration and on ECS to/from the FE: see the specs document!
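The «local» pipelining can be pictured as one shift register per TFC command, each with its own configurable depth. The following Python model is a sketch only: the class name `CommandDelay`, the command names and the delay values in the usage note are hypothetical, and a real FE would implement this in hardware.

```python
from collections import deque


class CommandDelay:
    """Delay one TFC command bit by a configurable number of clock cycles."""

    def __init__(self, delay):
        # Pre-fill with zeros so the first `delay` outputs are inactive.
        self.pipe = deque([0] * delay)

    def clock(self, bit):
        """Shift in the current command bit and return the delayed one."""
        self.pipe.append(bit)
        return self.pipe.popleft()
```

For example, `delays = {"bx_veto": CommandDelay(3), "synch": CommandDelay(5)}` (illustrative values) would let each command reach the FE logic on the crossing it belongs to, even though all bits arrive together and early in the TFC word.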

30 SODIN firmware v1r0 – block diagram

31 Timing distribution
From the TFC point of view, we ensure constant:
- LATENCY: alignment with BXID
- FINE PHASE: alignment with the best sampling point
Some resynchronization mechanisms are envisaged:
→ within TFC boards
→ with GBT
No impact on the FE itself.
Loopback mechanism:
→ re-transmit the TFC word back
→ allows for latency measurement + monitoring of TFC commands and synchronization

32 How to decode TFC in FE chips?
The use of TFC+ECS GBTs in the FE is 100% common to everybody!
→ Dashed lines indicate the detector-specific interface parts.
→ Please take particular care with the clock transmission: the TFC clock must be used by the FE to transmit data, i.e. low jitter!
Diagram labels: Kapton cable, crate, copper between FE ASICs and GBTX; FE electronic block.

33 The TFC+ECS GBT
These clocks should be the main clocks for the FE:
- 8 programmable phases
- 4 programmable frequencies (40, 80, 160, 320 MHz)
Used to:
- sample TFC bits
- drive Data GBTs
- drive FE processes

34 The TFC+ECS GBT protocol to FE
→ The TFC protocol has direct implications on the way in which the GBT should be used everywhere.
- 24 e-links @ 80 Mb/s are dedicated to the TFC word:
  - use the 80 MHz phase-shifter clock to sample the TFC parallel word
  - TFC bits are packed in the GBT frame so that they all come out on the same clock edge
  - we can also repeat the TFC bits on the consecutive 80 MHz clock edge if needed
→ The leftover 17 e-links are dedicated to GBT-SCAs for ECS configuring and monitoring (see later).

35 Words come out from GBT at 80 Mb/s
In simple words:
- Odd bits of the GBT protocol come out on the rising edge of the 40 MHz clock (first, msb)
- Even bits of the GBT protocol come out on the falling edge of the 40 MHz clock (second, lsb)
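A small Python model can make the two-bits-per-e-link picture concrete. It assumes, for illustration only, that the frame field is written msb first as consecutive (first, second) bit pairs, one pair per e-link, with the first bit of each pair on the rising edge and the second on the falling edge. The function names and this bit ordering are assumptions, not the GBT specification.

```python
def split_edges(field_bits):
    """Split a frame field into the bits seen on each 40 MHz clock edge.

    field_bits: '0'/'1' string of even length, one (first, second) pair per
    e-link. Returns (rising, falling), each with one bit per e-link.
    """
    if len(field_bits) % 2:
        raise ValueError("field must hold whole bit pairs")
    return field_bits[0::2], field_bits[1::2]


def place_tfc_word(tfc_bits, repeat=False):
    """Interleave the TFC word so every TFC bit lands on the rising edge.

    The falling-edge positions carry a repeat of the word or zeros.
    """
    filler = tfc_bits if repeat else "0" * len(tfc_bits)
    return "".join(a + b for a, b in zip(tfc_bits, filler))
```

With `repeat=True` the same word is available on both edges, which corresponds to the idea of repeating the TFC bits on consecutive 80 MHz clock edges.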

36 TFC decoding at FE after GBT
This is crucial!
→ We can already specify where each TFC bit will come out on the GBT chip.
→ This is the only way in which FE designers still have minimal freedom with the GBT chip:
  - if the TFC info were packed to come out on only 12 e-links (first odd, then even), then decoding in the FE ASIC would be mandatory,
  - which would mean that the GBT bus would have to go to each FE ASIC for decoding of the TFC command.
→ There is also the idea to repeat the TFC bits on even and odd bits in the TFC protocol. Would that help? The FE could tie logical blocks directly to GBT pins…

37 Now, what about the ECS part?
- Each pair of bits from the ECS field inside the GBT can go to a GBT-SCA.
- One GBT-SCA is needed to configure the Data GBTs (the EC one, for example?).
- The rest can go to either FE ASICs or DCS objects (temperature, pressure) via other GBT-SCAs.
- The GBT-SCA chip already has everything for us: interfaces, e-link ports…
  → No reason to go for something different!
- However, «silicon for SCA will come later than silicon for GBTX»…
  → We need something while we wait for it!

38 SOL40 encoding block to FE!
- Protocol drivers build GBT-SCA packets with the addressing scheme and bus type for the associated GBT-SCA user buses to the selected FE chip.
  → Basically, each block will build one of the GBT-SCA supported protocols.
- Memory map with the internal addressing scheme for GBT-SCA chips + FE chip addressing, e-link addressing and bus type: the content of the memory is loaded from the ECS.

39 Usual considerations … (LHCb Electronics Upgrade Meeting, 26/07/12, F. Alessio, R. Jacobsson)
The TFC+ECSInterface carries the ECS load of an entire FE cluster for configuring and monitoring:
→ 34 bits @ 40 MHz = 1.36 Gb/s on a single GBT link
→ ~180 Gb/s for a full TFC+ECSInterface (132 links)
A single CCPC might become a bottleneck… (Clara & us, December 2011)
→ How long does it take to configure an FE cluster? How many bits per FE? How many FEs per GBT link? How many FEs per TFC+ECSInterface?
→ Numbers to be pinned down soon, together with GBT-SCA interfaces and protocols.

40 Old TTC system support and running two systems in parallel
We already suggested the idea of a hybrid system (reminder: the L0 electronics rely on the TTC protocol):
→ part of the system runs with the old TTC system
→ part of the system runs with the new architecture
How?
1. A (bidirectional) connection between S-ODIN and ODIN is needed
   → use a dedicated RTM board on the S-ODIN ATCA card
2. In an early commissioning phase ODIN is the master and S-ODIN is the slave
   → S-ODIN’s task would be to distribute new commands to the new FE and the new TELL40s, and to run processes in parallel to ODIN
   → ODIN’s tasks are the ones of today + S-ODIN controls the upgraded part
   In this configuration, the upgraded slice will run at 40 MHz, but positive triggers will come at a maximum of only 1.1 MHz…
   Great test bench for development + tests + apprenticeship…
   By-product: improve the LHCb physics programme in 2015-2018…
3. In the final system, S-ODIN is the master and ODIN is the slave
   → ODIN’s task is only to interface the L0 electronics path to S-ODIN and to provide clock resets on the old TTC protocol

41 S-ODIN on Marseille’s ATCA board

42 TFC+ECSInterface on Marseille’s ATCA board

43 The code: FE data generator

44 The code: FE buffer manager

45 The code: GBT dynamic packing
Very important to analyze the simulation output bit-by-bit and clock-by-clock!

46 The code: configuration
The generic FE data generator is fully programmable:
- Number of channels associated to a GBT link
- Width of each channel
- Derandomizer depth
- Mean occupancy of the channels associated to a GBT link
- Size of the GBT frame (80 bits or WideBus + 4 bits of GBT header)
Extremely flexible and easy to configure with parameters. Covers almost all possibilities (almost…):
- Including flexible transmission of NZS and ZS
- Including TFC commands as defined in the specs
- Study the dependency of the FE buffer behaviour on TFC commands
- Study the effect of the packing algorithm on the TELL40
- Study the synchronization mechanism at the beginning of a run
- Study the re-synchronization mechanism when de-synchronized
- Etc…
And it is fully synthesizable…

47 Simulation results
Simulated 11 different scenarios:
- fixed GBT size to 80 bits + 4 bits of GBT header
- fixed width of data header to 24 bits in three fields (12 for BXID, 8 for data size, 4 for info)
- fixed width of data channel to 5 bits as a practical example

Scenario  Mean occupancy  Channels/GBT link  Derandomizer depth  Notes
1         10%             50 (x 5 bits)      75                  0% truncated
2         25%             50                 75                  0% truncated
3         40%             50                 75                  18.8% truncated
4         40%             50                 105                 18.5% truncated
5         40%             50                 135                 18.1% truncated
6         40%             50                 165                 17.9% truncated
7         40%             40                 165                 8.4% truncated
8         40%             32                 165                 0% truncated
9         40%             32                 165                 NO BX VETO, 1.5% truncated
10        35%             32                 165                 0% truncated
11        35%             32                 165                 NO BX VETO, 0% truncated

Numbers scale relative to each other: a lower occupancy allows a larger number of channels.

48 Simulation results
- Scenario 1: 10% occupancy, 50 x 5-bit channels, derandomizer depth 75
- Scenario 2: 25% occupancy, 50 x 5-bit channels, derandomizer depth 75

49 Simulation results
- Scenario 8: 40% occupancy, 32 x 5-bit channels, derandomizer depth 165
- Scenario 9: 40% occupancy, 32 x 5-bit channels, derandomizer depth 165 + NO BX VETO sent from TFC

50 Synthesis results

Stratix IV GX 230KF40C
Resource                        Usage
Estimated ALUTs used            3228 (<2% total)
Dedicated logic registers       896 (<1% total)
Memory ALUTs used               0 (0% total)
Estimated ALUTs unavailable     719
Total combinatorial functions   3228
  7 input functions             18
  6 input functions             947
  5 input functions             379
  4 input functions             430
  <= 3 input functions          1454
Combinatorial ALUTs by mode
  normal mode                   2904
  extended LUT mode             18
  arithmetic mode               274
  shared arithmetic mode        32
Total registers                 896 (no I/Os)
Clock speed: Fmax registered (only main clock) 172.71 MHz

ARRIA GX 35DF78
Resource                        Usage
Estimated ALUTs used            2636 (10% total)
Dedicated logic registers       846 (3% total)
Memory ALUTs used               0 (0% total)
Estimated ALUTs unavailable     457
Total combinatorial functions   2635
  7 input functions             17
  6 input functions             895
  5 input functions             387
  4 input functions             252
  <= 3 input functions          1084
Combinatorial ALUTs by mode
  normal mode                   2312
  extended LUT mode             17
  arithmetic mode               274
  shared arithmetic mode        32
Total registers                 896 (no I/Os)

Using Altera Quartus 12.1 SP1.
- No synthesis optimization done, fitter left free, no pinout defined, no timing constraint
- No memory cells used
Doable, but it can be further improved.

51 FYI, simulation outlook
Simulation should be a coordinated effort.
- Personal drive in order to be able to produce a (complex) code for TFC on time
- FE generic code + TFC code should be merged with the TELL40 effort:
  - to test both the FE packing algorithm and the FE buffer management
  - to test decoding at the TELL40 and investigate consequences/solutions
  - to analyze the effects of TFC commands on the global system (including the TELL40)
  - effort already ongoing between me and Guillaume to do so
- We would very much appreciate having the code (emulation) of each sub-detector:
  - a generic FE code is useful to study things on paper, but real code is something different
  - the proposal is to use this simulation effort to validate FE code
  - simulation performed by me and Guillaume to investigate solutions and issues in the FE

52 Conclusions
The packing mechanism as specified in our document is feasible.
- It will be used temporarily to emulate FE-generated data in the global readout and TFC simulation.
However, very big open questions remain:
- Is your FE compatible with such a scheme? What about such code in an ASIC?
- The behaviour of the FE derandomizer will strongly depend on your compression or suppression mechanism:
  - if it is dynamic, it could create big latencies
  - if your data does not come out in order, it can become quite complicated…
- The behaviour of the FE derandomizer will strongly depend on TFC commands:
  - the FE buffer depth should not rely on having a BX VETO! Aim at a bandwidth for full 40 MHz readout → BX VETO solely to discard events synchronously.
- What about the SYNCH command? When do you think you can apply it?
  - Ideally after the derandomizer and after suppression/compression, but…
- How many clock cycles do you need to recover from an NZS event? Can you handle consecutive NZS events?

