Published by Harold Oliver. Modified over 9 years ago.
1
Communication & Data Flow. Marlon Barbero, Bonn University. FE-I4 Review, CERN, Nov. 3rd-4th 2009.
2
Contents
10:00 Communication and data flow (1h00')
- Input clock and command
- Clock multiplier
- Data output
- Readout architecture overview, simulations & calculations
3
Talk Overview
[Block diagram of the FE-I4 chip. Pixel array: 80×336 digital pixels organized in 40 double-columns (DC), with pixel config. Periphery: End of Column (EoC) logic with EoC token, L1T, token and read signals; data formatting / compression; asynchronous FIFO (Hamming code); 'LVDS' output at 160 Mb/s; PLL (40 MHz in, 160 MHz out); clock select with aux input; 40 MHz digital control block interface; global register bank and global config; trigger FIFO; monitoring; powering.]
Questions addressed in this talk:
– Pixel array: what drove the choice of region architecture? Efficiency of the region? Extra features?
– Data flow in the DC: coding, format.
– Data formatting block: how? Why the chosen format?
– Storage in FIFO: how? Why?
– 8b10b encoder unit: why? Specs / protocol?
– PLL (40 MHz → 160 MHz): input clock, MUX, clock multiplier, high-speed serializer…
– Command decoder: configuration, reset & L1T. Protocol chosen.
4
A- Inputs
Plan:
A- Inputs.
B- Readout Architecture & Data Flow.
C- Other means of communication -I/O.
A- Inputs:
A1- Clocks & clock multiplication.
A2- Command decoder.
5
A1- Clock Input
Main clock input: LVDS 40_MHz_clock_in. This clock is:
– sent to the PLL for higher-frequency clock generation.
– sent to and used by 'all' blocks: Command Decoder (CMD), End Of Chip Logic (EOCHL), End Of Double-Column Logic (EODCL), Pixel Digital Region (PDR)…
– Note that the Data Output Block (DOB) uses a higher-frequency clock to stream out data at 160 Mb/s. We'll come back to that.
Auxiliary clock input: LVDS AUX_clock_in. This clock:
– goes to the PLL block. Bypassing the PLL and using AUX for data streaming is possible.
– might be used somewhere else? (stop mode?)
Callouts: clock & timing distribution (Abder's talk); Andre; Tomek. Implementation in progress.
6
CLKGEN: Clock Multiplier
For IBL, data must be transmitted out at a bandwidth of 160 Mb/s. 2 options:
– Send an 80 MHz CLK to the FE and use both edges to transmit. This needs a modification of BOC / ROD to produce a higher-speed TTC, a synchronization protocol on the FE between the 80 MHz clock & beam crossing, and a new DORIC to decode CLK at twice the frequency.
– Send a 40 MHz CLK to the FE and multiply the clock on the FE. This needs a clock multiplier on chip. Note: synergy with what the strip MCC needs.
In FE-I4, we have an FE clock multiplier + AUX clock input:
– Clock multiplier from the 40 MHz input clock.
– AUX: possibility to send "your choice of clock" to the FE.
Reference: I/O choices for ATLAS IBL, ATLAS Pixel System Design Task Force. Callout: Andre.
7
CLKGEN specs
I/O:
– CLKGEN input: REFCLK, 40 MHz input clock. PLL: 640 MHz, divided down to 320 / 160 / 80 / 40 MHz.
– CLKGEN output: 2 single-ended clocks selected from the internal clocks (not 640 MHz), 40 MHz in, or AUX in.
Why 640 MHz:
– Good duty cycle for divided-down clocks (dual-edge serializing initially intended).
– Higher-frequency VCO → smaller loop filter (LF) capacitor, reduced area.
– Synergy with other projects.
– Drawbacks: power, switching noise.
Area 236×281 μm², I_average_nominal ~ 3-4 mA, settling time ~1.2 μs, loss-of-lock detect.
8
PLL Overview
[Block diagram: Phase Frequency Detector, Charge Pump, Loop Filter, Voltage Controlled Oscillator and Frequency Divider, plus conversion, enabling and buffering. IN: 40 MHz. OUT: 640 MHz, with 40, 80, 160 and 320 MHz available. The divided CLK is fed back.]
9
CLKGEN Overview
[Block diagram: PLL fed by Ref (40M In), running at 640 MHz and providing 320 / 160 / 80 / 40 MHz. Two MUXes select CLK0_out and CLK1_out among AUX, 160, 320, 80, 40 and 40 In; one output is used for data stream-out in the DOB / serializer, the other in the 4-to-1 MUX. Enable registers (EN_40M, EN_80M, EN_160M, EN_320M, EN_PLL) and MUX selections come from configuration registers. ICP and Ibias come from DACs; each current is controlled by an 8-bit DAC register. Other labels: Ref2Fast, Fb2Fast.]
10
CLKGEN
The MUX scheme allows:
1- serializing data with various clocks.
2- tests of 4-chip modules for sLHC in star configuration: 3 FEs each send data at 80 Mb/s; the fourth FE has a special role, accepting the 80 Mb/s streams from the 3 FEs and streaming out at 320 Mb/s.
11
A2- Command Decoder
There is a dual path for configuration:
– Test mode: CMOS pins + shift register, à la FE-I4_proto1.
– "Standard input" (our focus here): command decoder.
Selection between Command Decoder & Bypass is made from a bond pad (InMUX_select=1).
CmdDec inputs: LVDS command in, LVDS clock in.
Similar to the FE-I3 command decoder. 3 classes of commands: trigger, fast, slow.
Issuing commands during running? There is no automatic exit from RunMode anymore; exiting is the user's choice (a slow control command is needed to exit). If RunMode is off, fast commands and triggers are NOT accepted.
Callouts: Maurice (test overview tomorrow); Roberto.
12
Main feature
Robust vs. SEU:
– All logic triplicated + majority vote (Address / WrRegData corrected each clock cycle).
– State machine returns to idle quickly by construction (no need of reset, FE-I3 like).
– Error detection provided (XOR of all triplicated outputs); increments a counter, stored in the CmdBitFlip[4:0] configuration register.
– Trigger 11101 is single-bit-flip safe (bit flip flagged, but trigger issued).
– Various error counters (e.g. invalid field 1, 2, 3).
Fully scannable: 3 ports, TST_SE (Enable), TST_SI (Scan In), TST_Out (Scan Out).
More details provided tomorrow.
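The triplication scheme listed above can be illustrated with a minimal Python sketch. It is a behavioural model only, not the chip's logic: the function names are hypothetical, and the slide only says that an XOR of the triplicated outputs is used for error detection, so the pairwise XOR below is an assumption.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three copies of the same register."""
    return (a & b) | (a & c) | (b & c)


def bit_flip_flags(a: int, b: int, c: int) -> int:
    """Non-zero wherever the three copies disagree (pairwise XOR, assumed)."""
    return (a ^ b) | (b ^ c)


# One copy of a 16-bit WrRegData value is hit by a single-event upset.
good = 0b1010_0000_1111_0101
upset = good ^ (1 << 7)          # bit 7 flipped in the third copy

corrected = majority(good, good, upset)
flags = bit_flip_flags(good, good, upset)
```

Here `corrected` equals the original value and `flags` marks bit 7, which is the kind of event a counter such as CmdBitFlip would record.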
13
Commands
3 classes: trigger, fast, slow.
– Trigger: only the LV1 command.
– Fast: 3 commands.
– Slow: allows 6 commands (16 possible).
Type | Name | Field 1 | Field 2 | Field 3 | Description
Trigger | LV1 | 11101 | - | - | Level 1 trigger
Fast | BCR | 10110 | 0001 | - | Bunch Counter Reset
Fast | ECR | 10110 | 0010 | - | Event Counter Reset
Fast | CAL | 10110 | 0100 | - | Calibration pulse
Slow | CMD | 10110 | 1000 | Command | Slow command header
Field 1 answers: trigger OR fast / slow? Field 2 answers: which of the 3 fast commands OR slow?
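As an illustration of the field values above, here is a small Python sketch that classifies a command from its leading bits. The bit patterns come from the slide; the function name, the string representation and the MSB-first ordering are assumptions, and the real decoder is a triplicated state machine rather than a lookup.

```python
TRIGGER_FIELD1 = "11101"      # LV1
COMMAND_FIELD1 = "10110"      # fast or slow command
FIELD2 = {"0001": "BCR", "0010": "ECR", "0100": "CAL", "1000": "SLOW"}


def classify(bits: str) -> str:
    """Return LV1, BCR, ECR, CAL, SLOW or INVALID for a bit string."""
    if bits.startswith(TRIGGER_FIELD1):
        return "LV1"
    if bits.startswith(COMMAND_FIELD1) and len(bits) >= 9:
        return FIELD2.get(bits[5:9], "INVALID")
    return "INVALID"


def hamming(a: str, b: str) -> int:
    """Number of differing bit positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


# The two Field 1 patterns differ in 3 bits, so one flipped bit cannot
# turn a trigger into a fast/slow header or vice versa.
field1_distance = hamming(TRIGGER_FIELD1, COMMAND_FIELD1)
```

The Field 2 codes are one-hot, so any two of them differ in 2 bits and a single flip always yields a code outside the table.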
14
Commands
Trigger:
– LV1: in RunMode only. Triggers acquisition of an event. Only Field 1 (11101) needed. OK with the ATLAS requirement (1 trigger per 5 clock cycles).
Fast: in RunMode only. Field 2 (not 1000):
– BCR: Bunch Counter set to 0.
– ECR: Event Counter reset. Clears the data path (all memory pointers, data structures, pending events). Interrupts data transmission if in progress.
– CAL: calibration pulse sent in response. Delay (bx granularity) up to 64 bx. Width (1-256). Digital & analog hits.
Slow: accepted at all times. No automatic taking of RunMode. Field 2 is 1000.
Callouts: cal. injection (Abder); dig. injection (Tomek).
15
Slow Commands
Name | Field 3 | Field 4 | Field 5 | Field 6 | Bits | Description
RdRegister | 0001 | ChipId | Address | - | - | Read addressed FE register.
WrRegister | 0010 | ChipId | Address | Data | 16 | Write into addressed FE register.
WrFrontEnd | 0100 | ChipId | - | Data | 672 | Write configuration data to enabled DC (1 to 40 DC at a time).
GlobalReset | 1000 | ChipId | - | - | - | Reset command. Puts the chip in its idle state.
GlobalPulse | 1001 | ChipId | - | - | 1-64 | Has variable pulse width. Used to latch / read data, inject digital hit, enable clock in stop mode. Reset command too.
EnDataTake | 1010 | ChipId | - | - | - | Sets the FE in RunMode.
Notes:
– Field 3 identifies the slow command.
– Field 4: 3+1 bits chip ID; broadcast is 1xxx.
– Field 5: 6-bit address, for WrRegister & RdRegister.
– Field 6: used by all write operations, into the register bank OR the FE configuration (40 DC shift registers, 672 bits).
– A DisableRunMode command will be added.
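To make the field layout concrete, the following Python sketch assembles a WrRegister bit string from the field widths given above (5-bit Field 1, 4-bit Field 2, 4-bit Field 3, 3+1-bit chip ID, 6-bit address, 16-bit data). The function name, the MSB-first ordering inside each field and the placement of the broadcast bit as the leading chip ID bit (read from "1xxx") are assumptions for illustration.

```python
SLOW_HEADER = "10110" + "1000"          # Field 1 + Field 2 (slow command)
FIELD3 = {
    "RdRegister": "0001", "WrRegister": "0010", "WrFrontEnd": "0100",
    "GlobalReset": "1000", "GlobalPulse": "1001", "EnDataTake": "1010",
}


def wr_register(chip_id: int, address: int, data: int,
                broadcast: bool = False) -> str:
    """Build a WrRegister command as a string of '0'/'1' (MSB first, assumed)."""
    if not 0 <= chip_id < 8:
        raise ValueError("chip_id must fit in 3 bits")
    if not 0 <= address < 64:
        raise ValueError("address must fit in 6 bits")
    if not 0 <= data < (1 << 16):
        raise ValueError("data must fit in 16 bits")
    chip_field = f"{int(broadcast)}{chip_id:03b}"     # 3+1 bits, broadcast = 1xxx
    return (SLOW_HEADER + FIELD3["WrRegister"] + chip_field
            + f"{address:06b}" + f"{data:016b}")


command = wr_register(chip_id=2, address=5, data=0x00FF)
```

With these widths a WrRegister command is 39 bits long, so it occupies 39 cycles of the 40 MHz command clock if one bit is sent per cycle.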
16
Notes on slow commands -1-
List of global registers (8-16b): doc FEI4_Global_Register_vX.
– These are SEU-hard latches:
  Analog pixel tuning, e.g. PrmpVbpf, DisVbnA, Amp2Vbn… (~20).
  FE config.: PxStrobes (13 bits / 13 latches), PxSRSetup (write to which DC, S0, S1, … global communication to/from analog DC).
  LVDS (bias) / PLL (bias, clk config.) / VCAL (internal calibration, 10 bits + delay setting 6 bits, LSB ~1 ns).
  DIG mode (DC clock source in stop mode, 8b/10b disable, …).
  ColMask / ErrorMask / Trigger (latency setting, self trigger, # consecutive…).
  Empty Record (empty pattern when 8b10b disabled).
  ANAsel 1/2/3: MUX for test analog buffer.
  CMOSout 1/2: sel for InMUX.
Callouts: see data out protocol; see InMUX; Abder; Michael; Andre; Tomek / J-D.
17
Notes on slow commands -2-
Global Register:
– Also EFuse shadow register: for the redundant SR of Double-Columns, trimming of references (CREF, VREF), chip serial number.
Grand total: ~50 Global Registers, either 8b or 16b wide.
18
Notes on slow commands -3-
WrFrontEnd: writes configuration to the FE, 672 bits at a time (1 Double-Column granularity).
Which DC is addressed is set by the PxSRSetup register (0-40 possible).
13b / pixel.
Configuration of the complete FE takes ~9 ms.
Note on WrFrontEnd: writing into the shift register also shifts its content out.
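The ~9 ms figure can be cross-checked with a short calculation. The sketch below assumes one configuration bit per 40 MHz clock cycle and ignores command headers and latch strobes, neither of which is quantified on the slide, so it is a lower bound on the payload time rather than the exact procedure.

```python
BITS_PER_DC_WRITE = 672      # one WrFrontEnd fills one double-column shift register
BITS_PER_PIXEL = 13          # configuration bits per pixel
N_DOUBLE_COLUMNS = 40
COMMAND_CLOCK_HZ = 40e6      # assumption: one bit per 40 MHz clock cycle

total_bits = BITS_PER_DC_WRITE * BITS_PER_PIXEL * N_DOUBLE_COLUMNS
config_time_ms = total_bits / COMMAND_CLOCK_HZ * 1e3

print(f"{total_bits} bits, about {config_time_ms:.2f} ms")
```

This gives 349440 bits and about 8.7 ms of payload, which is consistent with the quoted ~9 ms once command overhead is added.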
19
Notes on slow commands -4-
GlobalReset: resets the whole FE to its initial state.
GlobalPulse: reset command of variable length. Selective reset based on length (à la FE-I3). Also used to latch / read data, inject a digital hit, control stop mode.
EnDataTake: sets the FE in RunMode. The FE can then decode L1T and fast commands. No automatic exit.
20
B- Readout Architecture, Data Flow
B1- 4-pixel digital region.
B2- Data transfer through the Double-Column.
B3- Compression / formatting at End of Column.
B4- Storage in FIFO.
B5- 8b10b coder and protocol out.
21
B1- 4-pixel digital region
The choice was based on 3 ways of checking the performance of the architecture:
– C++ description of the chip: flexible framework with a time-based description of pixel region / DC / chip / communication protocol. All first studies: coupling pixels in phi / z / z & phi, various region sizes… Based on physics hits (see backup). Identified the 4-pixel region architecture.
– Verilog model and test bench: towards implementation. Other sources of inefficiency + power.
– Analytical model: mathematical cross-check of inefficiency (not time-based, no protocol).
Coherent picture: for 3×LHC full luminosity & 3.7 cm layer, the 4-pixel region tied in phi & z is the winner!
Callouts: David Arutinov; Tomek.
22
4-pixel region specs
– Storage of up to five 4-pixel + neighbor events.
– Small / big hit discrimination, 3 programmable modes (running with no discrimination is of course available too).
– 2 BX association for small hits.
– Analog info = 4b ToT.
– Neighbor Logic (small hits in adjacent pixels, in phi): 4 bits.
– Records up to 16 consecutive triggers.
– Programmable latency up to 255 BX.
23
Digital Pixel: Regional Architecture
Local storage: hits are stored locally in the region until L1T. Only 0.25% of pixel hits are shipped to EoC → DC bus traffic is "low".
Consequences of the regional architecture:
– Each pixel is tied to its neighbors (time info), matching the clustered nature of real hits.
– Small hits are close to large hits! To record small hits, use position instead of time. Handle on time walk (TW).
– Spatial association of digital hits recovers lower analog performance.
– Lowers digital power consumption (below 10 μW / pixel at IBL occupancy).
Physics simulation → efficient architecture.
[Diagram: Digital Region, 4-Pixel Unit. Four discriminators (top left, bottom left, top right, bottom right); 5 ToT memories / pixel; 5 latency counters / region; hit processing (TS / small / big / ToT); Read & Trigger; Neighbor; Token, L1T and Read signals.]
24
Performance / Efficiency
IBL: charge sharing in Z comparable to phi.
Regional buffer overflow:
Memories | Simulation IBL | Simulation 10×LHC | Analytical IBL | Analytical 10×LHC
5 | 0.047% | 2.19% | 0.029% | 2.25%
6 | 0.011% | 0.65% | 0.003% | 0.57%
7 | <0.01% | 0.16% | <0.01% | 0.13%
Plot labels: η=0, mean ToT = 4, 0.6%.
At IBL rate, pile-up inefficiency is the dominant source of inefficiency.
Inefficiency:
– Pile-up inefficiency (related to pixel cross-section and return-to-baseline behavior of the analog pixel) ~0.5%.
– Regional buffer overflow ~0.05%.
Inefficiency is under control for IBL occupancy.
25
Digital Power
4-pixel region.
Digital power: at IBL occupancy, digital power < 10 μW / pixel.
Drop on Vdd: <7 mV for 21 regions.
Tomek gives a more recent estimate in his PDR talk.
26
B2- Digital DC / Data transfer
– Made of 168 4-pixel digital regions.
– In the DC, token-based readout (dual token scheme DC / EoC, with triple redundancy + majority voting).
– A group of 21 4-pixel digital regions is the base structure for clock / buffering:
  skew-compensated clock routing (~0.8 ns skew for all pixels of the array?);
  buffering of read / L1T.
– Data transferred to FIFO asap. All controlled by EOCHL.
– Address transfer with a minimal number of gates for yield enhancement (thermal encoder scheme).
– Data + address are Hamming coded, then decoded and corrected before the data compression block.
Callouts: Tomek; Jan-David.
27
B3- Formatting at EoC
Reducing bandwidth is an issue, both at IBL & sLHC.
Data rates were estimated with the same tools as previously described, physics-based (MC data from Vadim Kostyukhin), with extrapolation to various radii and various possibilities to reduce rates.
Studied clustering possibilities, proximity algorithms, formatting. See the backup formatting section.
Formatting must also fit FIFO / 8b10b coding needs.
Callout: Tomek.
28
e.g.: 10×LHC (50ns bx) / sLHC
[Figure: detector layout in r [mm] vs. z [mm] with η lines from 0 to 3.5, showing hit rates per layer in pixel hits.bx⁻¹.cm⁻². Layer radii shown: 37, 50.5, 70, 88.5, 122.5, 131, 150, 201 and 210 mm. Mean rates shown: 60, 34, 13.4, 8.4 and 3.9. Sample values: 55.10, 38.67, 59.15, 60.12, 60.02, 58.74, 61.18. Legend: FE-I4, 50μm×250μm; FE-I4 simul., 50μm×250μm; FE-I4 Nigel, 50μm×250μm; FE-I4 sdtf 220908, 50×250μm².]
29
Pixel occupancy → Data bandwidth
Pixel hit rate → FE output bandwidth:
– # bits / pixel transmitted? Address 7+9 bits, analog info 4+2 bits → 22b?
– Data output protocol?
Reduce data output by taking into account the clustered nature of physics hits / geometry.
[Plots: number of pixels, for FE-I4 central module at the 21 cm layer, at the 3.7 cm layer 10×LHC, and at the 3.7 cm layer 3×LHC.]
30
Formatting considered
Clustering (Z-clustering):
– Can have logic in EoC to calculate the Z-cluster size (above a certain # of pixels adjacent in Z), discard the analog info (for long clusters in Z the info is not useful), and ship out pixel ID + size of cluster.
– At η~2.0, 0.6 BW? BUT very dependent on FE location, and throws away analog info. NO.
Proximity algorithm:
– Send out relative addresses: pix: 7+9b → 1 + 3b address (8 "next pixels" coded this way).
– 0.8 BW? BUT variable data format, error prone.
Fixed format clustered data transfer.
[Histogram of distance between consecutive hit addresses, distance: count. 2: 37975; 1: 27978; 655: 1921; 3: 1527; 4: 929; 653: 878; 6: 482; 5: 352; 8: 345; 10: 303; 12: 280; 11: 278; 14: 267; 7: 262; 9: 257.]
[Numbering scheme sketch: pixels numbered 0, 1, 2, 3, 4, 5, … 653, 654, 655, with 656, 657, 658, 659, … alongside.]
31
Fixed format clustered data
Compression factor (all at 3×LHC), 3.7 cm layer (21 cm layer in parentheses), η=0. Rates in 10⁶ count.FE⁻¹.s⁻¹, preliminary.
Format | Rate | Bits per word | Relative bandwidth [A.U.]
indiv. pixels | 4.09 (0.25) | 7+9+4+2 | 1.00 (1.00)
static 1×2 | 3.45 (0.18) | 7+8+2×4+2 | 0.96 (0.83)
dynamic 1×2 | 3.02 (0.15) | 7+9+2×4+2 | 0.87 (0.74)
static 1×4 | 2.86 (0.17) | 6+8+4×4+4 | 1.08 (1.08)
dyn. in-DC 1×4 | 2.43 (0.15) | 6+9+4×4+4 | 0.95 (0.95)
dynamic 1×4 | 2.13 (0.14) | 7+9+4×4+4 | 0.85 (0.94)
Word fields: DC (×40) or column address, row (×336) address, ToT, NL.
Choice: dynamic phi-pairing (dynamic 1×2), which merges neighbours and small hits in the process. Compression OK, simple to do and good format, 24 bits (nice for FIFO and 8b10b).
Note that Hamming decoding is needed before the formatter.
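The relative bandwidth figures for the 3.7 cm layer follow directly from rate × bits per word, normalized to the individual-pixel format. A short Python sketch of that arithmetic, using only the numbers quoted on the slide (the dictionary and function names are illustrative):

```python
# name: (word rate in 1e6 words per FE per second, bits per word), 3.7 cm layer
FORMATS = {
    "indiv. pixels":  (4.09, 7 + 9 + 4 + 2),
    "static 1x2":     (3.45, 7 + 8 + 2 * 4 + 2),
    "dynamic 1x2":    (3.02, 7 + 9 + 2 * 4 + 2),
    "static 1x4":     (2.86, 6 + 8 + 4 * 4 + 4),
    "dyn. in-DC 1x4": (2.43, 6 + 9 + 4 * 4 + 4),
    "dynamic 1x4":    (2.13, 7 + 9 + 4 * 4 + 4),
}


def relative_bandwidth(formats: dict) -> dict:
    """Bandwidth of each format relative to sending individual pixels."""
    ref_rate, ref_bits = formats["indiv. pixels"]
    reference = ref_rate * ref_bits
    return {name: rate * bits / reference
            for name, (rate, bits) in formats.items()}


relative = relative_bandwidth(FORMATS)
for name, value in relative.items():
    print(f"{name:15s} {value:.2f}")
```

The absolute bandwidth of the reference format is 4.09 × 22 ≈ 90 Mb per FE per second, before any header, trailer, DC balancing or error correction is added.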
32
B4- Data storage FIFO
In the FIFO, record words are stored as 3×8b words.
At the beginning of a data event, EOCHL stores a 24-b header in the FIFO.
Then data words are stored: address (16-b) + ToTs (8-b) for 2 pixels.
Also stored in the FIFO:
– Read back from configuration.
– Service messages.
More in the summary data format below.
Callout: Jan-David.
33
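From DC to FIFO
[Block diagram: data from the columns (DC, 6b region address, 8b data, 20b) pass through a Hamming decoder to the event builder. A data switch under readout control selects between event data, header, service and read-back words. A Hamming encoder turns each 8-bit word into 12 bits; words 0, 1 and 2 form a 36-bit entry. The FIFO has 8 places of 3 × 12 bits. Control signals: Read, Busy, Write, Full.]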
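The diagram indicates that each 8-bit word is protected with 4 extra bits before storage (8 bits → 12 bits, 3 × 12 bits per FIFO entry). The slides do not give the bit layout, so the Python sketch below uses the textbook Hamming(12,8) single-error-correcting code with parity bits at positions 1, 2, 4 and 8; this layout is an assumption and may differ from the chip's.

```python
PARITY_POSITIONS = (1, 2, 4, 8)
DATA_POSITIONS = tuple(p for p in range(1, 13) if p not in PARITY_POSITIONS)


def hamming_encode(byte: int) -> list:
    """Encode 8 data bits into a 12-bit codeword (list of 0/1, position 1 first)."""
    code = [0] * 13                                   # index 0 unused
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (byte >> i) & 1
    for p in PARITY_POSITIONS:
        # Even parity over every position whose index has bit p set.
        code[p] = sum(code[k] for k in range(1, 13) if k & p) % 2
    return code[1:]


def hamming_decode(codeword: list) -> tuple:
    """Return (corrected byte, syndrome); syndrome 0 means no error seen."""
    code = [0] + list(codeword)
    syndrome = 0
    for p in PARITY_POSITIONS:
        if sum(code[k] for k in range(1, 13) if k & p) % 2:
            syndrome |= p
    if 1 <= syndrome <= 12:
        code[syndrome] ^= 1                           # flip the faulty bit back
    byte = sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return byte, syndrome


word = hamming_encode(0xA5)
word[6] ^= 1                                          # upset at position 7
recovered, syndrome = hamming_decode(word)
```

In this example `recovered` is 0xA5 again and `syndrome` is 7, the position of the flipped bit. A code of this type corrects one flipped bit per 12-bit word; double errors would need an extra overall parity bit to be detected.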
34
B5- 8b10b encoder and protocol
Normal mode is 8b10b coded.
Test mode is 8b10b off. Good for testing the link (requirement of the off-detector group).
35
8b10b
For IBL, data must be transmitted out at a bandwidth of 160 Mb/s.
At BOC / ROD:
– Data rate is 4 times the clock rate.
– Phase adjustment is needed.
→ Use a Clock Data Recovery (CDR) mechanism. CDR requires an output data stream with good engineering properties.
8b10b:
– adequate for this purpose, enough transitions for reliable CDR.
– widely used, easy to implement.
– provides some level of error detection.
– provides commas for frame identification & synchronization.
Reference: I/O choices for ATLAS IBL, ATLAS Pixel System Design Task Force.
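The cost of 8b10b coding on the 160 Mb/s link is easy to quantify. The sketch below assumes a 24-bit record sent as three 8b10b symbols and ignores SOF / EOF / idle symbols, so it gives an upper bound on the usable rate rather than a measured figure.

```python
LINE_RATE_BPS = 160e6            # serial output rate
CODE_RATE = 8 / 10               # 8 payload bits carried per 10 line bits
RECORD_BITS = 24                 # one record word = 3 bytes (assumed 3 symbols)

payload_rate_bps = LINE_RATE_BPS * CODE_RATE
line_bits_per_record = RECORD_BITS * 10 // 8
record_time_ns = line_bits_per_record / LINE_RATE_BPS * 1e9
records_per_second = LINE_RATE_BPS / line_bits_per_record

print(f"payload rate     : {payload_rate_bps / 1e6:.0f} Mb/s")
print(f"time per record  : {record_time_ns:.1f} ns")
print(f"records per sec  : {records_per_second / 1e6:.2f} M")
```

This gives 128 Mb/s of payload, 30 line bits (187.5 ns) per record and about 5.33 million records per second at most.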
36
Control symbols & Commas
Control symbols: a set of 12 extra valid 10-bit sequences. Can be used as commands.
K28.1, K28.5, K28.7: commas. Their 11111 or 00000 runs cannot be found anywhere else in the data stream.
37
K.28.7 & bit flip
– A single bit flip in K.28.7 cannot transform the stream into another meaningful stream (only into K.28.1 & K.28.5).
– K.28.1, K.28.5 and K.28.7 can be used for re-synchronization of the 10-bit streams in case of loss of synchronization (they are the only streams having a running disparity of +/-5).
– In the synchronized state, a flip in regular data cannot generate K.28.7.
– Note, restriction: never send 2 K.28.7 in a row; use K.28.5 / K.28.1 for fillers.
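The relation between the three comma symbols can be checked on their standard 10-bit encodings. The sketch below lists the negative-running-disparity forms in the usual abcdei fghj bit order (the positive-disparity forms are the bitwise complements); the helper names are illustrative only.

```python
# Standard 8b10b comma symbols, RD- form, bit order abcdei fghj.
COMMAS = {
    "K.28.1": "0011111001",
    "K.28.5": "0011111010",
    "K.28.7": "0011111000",
}


def hamming(a: str, b: str) -> int:
    """Number of differing bit positions."""
    return sum(x != y for x, y in zip(a, b))


def complement(bits: str) -> str:
    """RD+ form of a symbol given its RD- form."""
    return "".join("1" if b == "0" else "0" for b in bits)


def has_run_of_five(bits: str) -> bool:
    """True if the symbol contains the 11111 or 00000 comma signature."""
    return "11111" in bits or "00000" in bits


d_7_to_1 = hamming(COMMAS["K.28.7"], COMMAS["K.28.1"])
d_7_to_5 = hamming(COMMAS["K.28.7"], COMMAS["K.28.5"])
d_1_to_5 = hamming(COMMAS["K.28.1"], COMMAS["K.28.5"])
```

K.28.7 is one bit flip away from both K.28.1 and K.28.5, while K.28.1 and K.28.5 are two flips apart, which matches the statement that a flipped K.28.7 can only turn into one of the other two commas.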
38
Frame
Frames are built up (meaningful frames + empty records) such that the protocol:
– detects single bit flips in the synchronized state.
– is tolerant to loss of sync (data slipping): re-sync on the next commas.
The state machine in the receiving part:
– can do CDR.
– can search for the 11111 or 00000 stream and re-synchronize to the commas.
– needs to check for violations of the 8b10b protocol.
39
Format implemented -1
Three 8b10b commas used:
– SOF: K.28.7.
– EOF: K.28.5.
– Idle state: K.28.1.
Record words (all that follows is shown before 8b10b coding, for clarity):
– 24 bits long.
– all start with 11101 (except the data record DR & the empty record ER).
– Data Header (DH): | 11101 | 001 | xxxx | [3:0]trigID | [7:0]bcID |
  Header for transmission of regular data.
  001: a 1-b flip gives an invalid code.
  xxxx: for later use.
  trigID is the trigger ID as received by the ROD; bcID is the bunch crossing ID, needed for internal consecutive triggers (up to 16 triggers depending on RunMode) and stop mode (up to 255 triggers).
40
Format implemented -2
– Data Record (DR): | [6:0]Column | [8:0]Row | [3:0]ToTtop | [3:0]ToTbot |
  Column numbering goes from 0000001 to 1010000.
– Address Record (AR): | 11101 | 010 | Type | [14:0]Address |
  Address of a global register, or the position of the shift register.
  010: flags the Address Record; a 1-b flip gives an invalid code.
  Type: 1 bit of information. 0 = Global Register; 1 = Shift Register.
  [14:0]Address: if Type is 0, the address gives the Global Register ID. If Type is 1, the address gives the Shift Register position.
  Note that the transmission of an Address Record always requires the transmission of an associated Value Record.
– Value Record (VR): | 11101 | 100 | [15:0]Value |
  Value of a global register, or value contained in the shift register.
  Note that the use of 11101 followed by 100 for a Value Record also allows sending Value Records with no Address Record before.
41
Format implemented -3
– Service Record (SR): | 11101 | 111 | [15:0]Message |
  A service message (e.g. error message).
  111: flags the Service Record; a 1-b flip gives an invalid code.
  [15:0]Message: service message.
  Note that an SR can belong to a data stream; only a single SR is then allowed.
– Empty Record (ER): | 3×ER[xxxx.xxxx] |
  When 8b10b coding is turned off, to fit the 24-bit record word requirement and to ease the recognition of the end of a data stream, Empty Records are simply made of as many 24-bit programmable words as needed. These can be 0-frames, but also 11001100 for example (that is the 40 MHz clock sent back).
  When 8b10b is on and no data / SR / configuration read-back is pending, ERs are transmitted out, made of as many K.28.1 commas as needed.
42
Summary data format
24-bit Record Word | Acronym | Field 1 | Field 2 | Field 3 | Field 4 | Field 5 | Comments
Data Header | DH | 11101 | 001 | xxxx | [3:0] trigID | [7:0] bcID | xxxx reserved for later use
Data Record | DR | [6:0] Column | [8:0] Row | [3:0] ToTtop | [3:0] ToTbot | - | Column numbering: 0000001 to 1010000
Address Record | AR | 11101 | 010 | Type | [14:0] Address | - | Type 0: Global Register / Type 1: Shift Register position
Value Record | VR | 11101 | 100 | [15:0] Value | - | - | Value Record without previous Address Record allowed
Service Record | SR | 11101 | 111 | [15:0] Message | - | - | Service message (e.g. error codes)
Empty Record | ER | ERvalue | - | - | - | - | Idle = K.28.1 commas (8b10b coding case)
e.g.: SOF | DH | DR | DR | DR | SR | EOF | Idle | Idle | AR | VR | AR | VR | Idle…
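The record layouts above translate directly into bit packing. The Python sketch below packs a Data Header and a Data Record into 24-bit integers and classifies a received word from its leading bits. Field widths and codes are taken from the table; the function names, the MSB-first packing and the value checks are assumptions, and the reserved xxxx field is set to 0 here.

```python
PREFIX = 0b11101
KINDS = {0b001: "DH", 0b010: "AR", 0b100: "VR", 0b111: "SR"}


def data_header(trig_id: int, bc_id: int) -> int:
    """| 11101 | 001 | xxxx | trigID[3:0] | bcID[7:0] | with xxxx = 0."""
    return (PREFIX << 19) | (0b001 << 16) | ((trig_id & 0xF) << 8) | (bc_id & 0xFF)


def data_record(column: int, row: int, tot_top: int, tot_bot: int) -> int:
    """| Column[6:0] | Row[8:0] | ToTtop[3:0] | ToTbot[3:0] |"""
    if not 1 <= column <= 80:
        raise ValueError("column numbering goes from 1 to 80")
    if not 0 <= row < 512:
        raise ValueError("row must fit in 9 bits")
    return (column << 17) | (row << 8) | ((tot_top & 0xF) << 4) | (tot_bot & 0xF)


def classify(word: int) -> str:
    """Identify a 24-bit record word from its leading bits."""
    if (word >> 19) == PREFIX:
        return KINDS.get((word >> 16) & 0b111, "INVALID")
    return "DR"        # columns 1..80 never start with 11101


header = data_header(trig_id=3, bc_id=0x5A)
record = data_record(column=80, row=336, tot_top=4, tot_bot=15)
as_bytes = record.to_bytes(3, "big")    # three bytes for the 8b10b encoder
```

Because the largest column value is 1010000, the first five bits of a Data Record can never be 11101, which is why the DR is the one regular record that does not need the prefix.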
43
Output rates (sLHC)
Assumptions:
– Trigger rate: 100 kHz.
– Interactions per crossing: 400.
– Sensor model: 260 um planar, unirradiated.
– Comparator threshold: 4000e.
– Output format for analog data: fixed frame, dynamic 2-pixel phi pairing.
– Bits / pixels per analog output frame: 26 / 2.
– Output format for binary data: fixed frame, dynamic 2-pixel phi pairing (L2, L3); fixed frame, dynamic 4-pixel group (L0, L1).
– Bits / pixels per binary output frame: 24 / 4 (L2, L3); 20 / 2 (L0, L1).
– Encoding, parity, redundancy or headers: none.
– Design margin: factor of 2.
Layer (~rad.) [cm] | Comp. firing per cm² per BX | Required bandwidth per chip [Mb/s] (analog / binary) | Chips / module | 320 Mb/s LVDS outputs / module (analog / binary) | EOS card data volume [Gb/s] (analog / binary) | FE-I4 chip data losses (*) ×10⁻⁴
3.7 | 60.0 | 749 / 454 | 1 | 3 / 2 | 12.0 / 7.3 | n/a + 5
7 | 18.4 | 230 / 140 | 4 | 3 / 2 | 5.5 / 3.4 | n/a + 2
16 | 6.6 | 75 / 58 | 4 | 1 / 1 | 2.4 / 1.8 | 18 + 1
20 | 3.9 | 42 / 32 | 4 | 1 / 1 | 2.7 / 2.1 | 10 + 0
disks | - | 80 max? | 4 | 1 | 2.9? | 10 to 20?
Now: dynamic phi-pairing, 24 bits / 2 pixels.
Source: Requirements for SLHC pixel electrical system (system design task force).
"The simulations for an IBL at 3.7 cm radius and a luminosity of ×3 LHC indicate a data rate of at least 86 Mbps per FE." I/O recommendations for IBL, Dec. 08 (system design task force).
44
C- Other means of communication -I/O
C1- Stop mode.
C2- External control of DC.
C3- No 8b10b.
C4- InMUX.
45
C1- Stop mode
Stop standard data acquisition and read all hits from the chip.
How:
– clock gating.
– L1T and clock controlled externally by the user (or logic).
– e.g. procedure: set latency to the proper value (max?). StopMode on. Clock control gated. Send one trigger, send one clock, read all FE.
Implementation details are still being worked on.
This will be an important test mode: test of the PDR!
46
C2- External control of DC
A feature implemented in the test submission of the digital region (3D Tezzaron-Chartered).
Control of the DC externally with few signals:
– L1T, read, peripheral trigger_counter value (and clock) to be provided from outside.
– Off-chip, sense the token (sent out) to know if data is available.
Might turn out to be an interesting test feature too.
Needed? Still debatable. Implementation not yet done.
47
C3- No 8b10b
– Link test: if a clock-like signal sent back from the FE is wanted (need?), turn off 8b10b and send empty frames with an appropriate Empty Record value to mimic a clock.
– Convenient to check output data more directly.
– Speed.
48
C4- InMUX
Multipurpose slow control access:
– 4 configurable CMOS inputs and 4 outputs (their function depends on the 3 configuration pins InMUX_select).
– InMUX 1: direct control of global registers and pixel SR.
– InMUX 2: manual control of EODCL.
– InMUX 3: control of end of chip logic.
– InMUX 4: scan chain for CMD.
– InMUX 5: scan chain for EOCHL.
49
THE END
More info in the back-up slides (organized by topic) if needed.
50
BACKUP: CLOCK
51
Star Config. Timing
52
Why f_out,max = 640 MHz
– f_out = 640 MHz is not a big challenge in 130 nm CMOS.
– Frequency division easily possible.
– More precise duty cycle handling @ 160 MHz.
– Smaller capacitance values in the LF, less area.
– Potential need of a higher clock in the future.
– Synergy with other projects.
53
BACKUP: 4-pixel region
54
Introduction
Consequences of the pixel hit rate on the digital architecture and on the FE data bandwidth (data output protocol, module concept, EoS…).
Events (Pythia generator):
– WH (120 GeV); H → bb.
– overlaid with 24 / 75 / 240 / 400 events pileup: "LHC" / "3×LHC" / "sLHC" (25 ns / 50 ns bx).
Geometry (Geant3 simulation package):
– pixel size: FE-I3: 400×50 μm²; FE-I4: 250×50 μm².
– first: 4 barrels, 3.7 cm radius (FE-I4) & 5.05 / 8.85 / 12.25 cm radius (FE-I3).
– new: 6 barrels, 3.7 / 5.05 / 8.85 / 12.25 / 16 / 21 cm radius (FE-I4).
Threshold: first 3750 e⁻. New: down to 1000 e⁻.
Files contain a list of pixels that record a digital hit, on a bunch-crossing basis.
55
Foreword: Minimal Bias events
FE-I4 for:
- b-layer upgrade: luminosity? radius? → 75 events pile-up & 3.7 cm.
- s-LHC: luminosity? radius? → 240 / 400 events pile-up & outer layer.
Extrapolation to LHC energy: uncertainty of the extrapolation @ 14 TeV ~30%? (The first years of operation are crucial to feed back into the simulation.)
[Plot label: / interaction at η=0.]
56
Option for the region (history) Different pixel organizations 2x2 (truncation) 1x4 (truncation) 1x4 1x8 Timing 40MHz Clock 20MHz Clock BCID (8bit gray timing)
57
Hit processing (HC3 mode)- schematic Receives comparator output BC resolution Generates Leading Edge (LE) Generates Small hit Leading Edge (sLE) Generates Trailing Edge (TE) Generates ToT counter reset and enable (rst_cnt, en_cnt)
58
Hit processing timings (HC=1)
1. Signal from comparator too short (no positive edge of clock).
2. Signal from comparator is recorded.
3. Small leading edge (sLE): signal to neighbor (always 2 BC).
4. Leading edge (LE).
5. Trailing edge (TE).
6. Reset (rst_cnt) and enable counter.
[Timing diagram with markers 1 to 6 and signals: comp, clk, LE, sLE, TE, rst_cnt, cnt_clk, tot_cnt, mem_clk.]
59
ToT processing - schematic 59 Start ToT Counter Global LE generation (orLE) Reset memory signal generation (rst_mem) Memory pointer selection (freeAddr) Record reset/small in memory Record neighbor Record TOT value in memory
60
Memory Management - schematic 60 Selects free memory Token management Selects triggered memory during read Enables outputs Design: x5 latency cell
61
Latency Cell /Trigger- schematic 61 Start/Reset latency counter Indicate status (full) Trigger (triggered) Store/Recognize trigger ID
62
BACKUP: formatting
63
e.g.: 3×LHC
[Figure: detector layout in r [mm] vs. z [mm] with η lines from 0.1 to 3.5, layers at r = 37, 50.5, 88.5 and 122.5 mm. Rates given in 10⁶ pixel hits.module⁻¹.s⁻¹ (FE-I3, 50μm×400μm) and 10⁶ pixel hits.FE⁻¹.s⁻¹ (FE-I4 simul., 50μm×250μm); assumption: 100 kHz L1T, 336×80 pixels FE-I4. Series of values shown: 6.24, 6.40, 5.98, 5.80, 5.87, 6.40, 6.06; 2.53, 2.54, 2.52, 2.53, 2.62, 2.63, 2.62; 1.40, 1.23, 1.25, 1.36, 1.33, 1.32; 4.07, 3.87, 4.04, 3.98, 3.94, 4.07, 2.69. For reference, 3×LHC at the same radius with FE-I4: 3.98, 6.06, 2.55, 1.30. A version with rates in pixel hits.bx⁻¹.cm⁻² is also referenced.]
64
Extrapolations to other radius
[Two plots of Hits/mm² vs. r [cm].]
sLHC, 50 ns bx / 400 events pileup: reasonable fit with exp(1.34-0.57*R)+0.15-0.0053*R.
sLHC, 25 ns bx / 240 events pileup: reasonable fit with exp(0.86-0.58*R)+0.088-0.0031*R.
Radius of layer [mm] | sLHC (25 ns) [pix.bx⁻¹.cm⁻²] | sLHC (50 ns) [pix.bx⁻¹.cm⁻²]
37 | 35 | 60
50.5 | 19.5 | 34
70 | 10.6 | 18.4
88.5 | 7.8 | 13.4
122.5 | 4.7 | 8.4
131 | 4.7 | 8.3
150 | 4.2 | 7.1
201 | 2.5 | 4.4
210 | 2.3 | 3.9
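The two fits can be evaluated directly to reproduce the tabulated rates. In the Python sketch below, R is taken in cm and the fit result in hits per mm² per bunch crossing, as the plot axes suggest; the factor 100 converts to the per-cm² unit of the table. The function names are illustrative.

```python
import math


def hits_per_mm2_50ns(r_cm: float) -> float:
    """sLHC, 50 ns bx / 400 events pileup fit."""
    return math.exp(1.34 - 0.57 * r_cm) + 0.15 - 0.0053 * r_cm


def hits_per_mm2_25ns(r_cm: float) -> float:
    """sLHC, 25 ns bx / 240 events pileup fit."""
    return math.exp(0.86 - 0.58 * r_cm) + 0.088 - 0.0031 * r_cm


def per_cm2(hits_per_mm2: float) -> float:
    """Convert hits/mm^2 to hits/cm^2."""
    return 100.0 * hits_per_mm2


for radius_mm in (37, 50.5, 70, 88.5, 122.5, 210):
    r_cm = radius_mm / 10
    print(f"{radius_mm:6.1f} mm  25 ns: {per_cm2(hits_per_mm2_25ns(r_cm)):5.1f}"
          f"  50 ns: {per_cm2(hits_per_mm2_50ns(r_cm)):5.1f}")
```

At 3.7 cm the fits give about 35 and 59 hits per cm² per bx, close to the tabulated 35 and 60; at 21 cm they give about 2.3 and 3.9.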
65
Pixel occupancy → Data bandwidth
Example 1: pixels clustered in z.
– no useful analog info.
– can have logic in EoC to calculate the Z-cluster size → ship out pixel ID + size of cluster.
[Plot: reduction in BW at EoC vs. cluster size ≥ 1, 3, 5, 7, 9. FE-I4, module 4, 3.7 cm layer.]
67
Pixel occupancy → Data bandwidth
Example 1: pixels clustered in z.
– no useful analog info.
– can have logic in EoC to calculate the Z-cluster size → ship out pixel ID + size of cluster.
[Plot: reduction in BW at EoC vs. cluster size ≥ 1, 3, 5, 7, 9. FE-I4, central module, 3.7 cm layer.]
Very dependent on FE location! And throws away analog info.
68
Pixel occupancy → Data bandwidth
Example 2: proximity algorithms.
Send out relative addresses: pix: 7+9b → 0 + 16b address, or 1 + 3b address (8 "next pixels" coded this way).
Compression efficiency: ~0.66 (address only!). [Plot series: 3×LHC, 10×LHC.]
But: variable data format length → error prone.
[Histogram of distance between consecutive hit addresses, FE-I4, central module, 3.7 cm layer, distance: count. 2: 37975; 1: 27978; 655: 1921; 3: 1527; 4: 929; 653: 878; 6: 482; 5: 352; 8: 345; 10: 303; 12: 280; 11: 278; 14: 267; 7: 262; 9: 257.]
[Numbering scheme sketch: pixels numbered 0, 1, 2, 3, 4, 5, … 653, 654, 655, with 656, 657, 658, 659, … alongside.]
69
Pixel occupancy → Data bandwidth
Example 3: clustered data out with fixed format (preliminary).
Bit count / pixel (all at 3×LHC), 3.7cm (vs. 21cm), η=0.
Each line: count rate [10^6.count.FE^-1.s^-1] × (bits per word) = bandwidth [Mb.FE^-1.s^-1]
– indiv pixels: 4.09 (0.25) × (7+9+4+2) = 90.0 (5.49)
– static 1×2: 3.45 (0.18) × (7+8+2×4+2) = 86.2 (4.58)
– dynamic 1×2: 3.02 (0.15) × (7+9+2×4+2) = 78.5 (4.08)
– static 1×4: 2.86 (0.17) × (6+8+4×4+4) = 97.2 (5.92)
– dyn. in-DC 1×4: 2.43 (0.15) × (6+9+4×4+4) = 85.3 (5.23)
– dynamic 1×4: 2.13 (0.14) × (7+9+4×4+4) = 76.5 (5.15)
Bit fields: DC (×40) or column, row (×336), ToT, NL.
Assumption: 100kHz L1T, 336×80 pixels FE-I4.
Disclaimer: no header, trailer, DC-balancing, error correction…
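The bandwidth figures on this slide are the product of a count rate and a word length. The sketch below reproduces that arithmetic for the 3.7cm layer and also normalizes each format to the individual-pixel format. The dictionary and function names are illustrative. Because the count rates are quoted to two decimals, some products differ from the slide's values in the last digit.

```python
# (count rate in 10^6 counts per FE per second, bits per word), 3.7cm layer, from the slide
FORMATS = {
    "indiv pixels":   (4.09, 7 + 9 + 4 + 2),
    "static 1x2":     (3.45, 7 + 8 + 2 * 4 + 2),
    "dynamic 1x2":    (3.02, 7 + 9 + 2 * 4 + 2),
    "static 1x4":     (2.86, 6 + 8 + 4 * 4 + 4),
    "dyn. in-DC 1x4": (2.43, 6 + 9 + 4 * 4 + 4),
    "dynamic 1x4":    (2.13, 7 + 9 + 4 * 4 + 4),
}

def bandwidth_mb_per_s(name):
    """Bandwidth in Mb per FE per second: count rate times word length."""
    rate, bits = FORMATS[name]
    return rate * bits

def compression_factor(name):
    """Bandwidth relative to sending individual pixels (1.00 by construction)."""
    return bandwidth_mb_per_s(name) / bandwidth_mb_per_s("indiv pixels")

if __name__ == "__main__":
    for name in FORMATS:
        print(f"{name:15s} {bandwidth_mb_per_s(name):6.1f} Mb/s  x{compression_factor(name):.2f}")
```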
70
Pixel occupancy → Data bandwidth
Example 3: clustered data out with fixed format (preliminary).
Compression factor (all at 3×LHC), 3.7cm (vs. 21cm), η=0.
Each line: count rate [10^6.count.FE^-1.s^-1] × (bits per word), then bandwidth relative to individual pixels [A.U.]
– indiv pixels: 4.09 (0.25) × (7+9+4+2) → 1.00 (1.00)
– static 1×2: 3.45 (0.18) × (7+8+2×4+2) → 0.96 (0.83)
– dynamic 1×2: 3.02 (0.15) × (7+9+2×4+2) → 0.87 (0.74)
– static 1×4: 2.86 (0.17) × (6+8+4×4+4) → 1.08 (1.08)
– dyn. in-DC 1×4: 2.43 (0.15) × (6+9+4×4+4) → 0.95 (0.95)
– dynamic 1×4: 2.13 (0.14) × (7+9+4×4+4) → 0.85 (0.94)
Bit fields: DC (×40) or column, row (×336), ToT, NL.
Assumption: 100kHz L1T, 336×80 pixels FE-I4.
Disclaimer: no header, trailer, DC-balancing, error correction…
dyn. 1×4 better at small R? (larger η!) dyn. 1×2 at large R?
For reference in backup slides: same at higher η.
71
BACKUP: FIFO
72
After FIFO
[Block diagram: FIFO (8 places, 3 × 12 bit), Hamming decoder, multiplexer, 8 bit / 10 bit, serializer; signals: FIFO empty, Read]
73
BACKUP: 8b10b and CDR
74
Main characteristics of 8b10b coding
– 8 bits data → 10 bits data.
– DC-balance: same number of 0’s and 1’s. Disparity of 10b word: -2, 0 or +2.
– Maximum run length without transitions: 5 bits.
– DC balancing.
– Frequent transitions in data stream: essential for clock recovery from the data stream (allows CDR).
– Some low level of error detection.
– 256 data symbols (Dx.y) + 12 specific control symbols (Kx.y).
75
8b10b code mapping: data -1-
Dx.y: 8 bits → 10 bits. Splits the 8 bits into 3 MSBs (y) and 5 LSBs (x).
– y = 3 MSBs → 4 bits.
– x = 5 LSBs → 6 bits.
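The split into x and y is a pair of bit operations. The sketch below only derives the symbol name for a given byte. It does not perform the 5b/6b and 3b/4b table lookups, and the function names are illustrative.

```python
def split_8b(byte):
    """Split an 8-bit value into x (5 LSBs) and y (3 MSBs)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    x = byte & 0x1F   # 5 LSBs, encoded to 6 bits
    y = byte >> 5     # 3 MSBs, encoded to 4 bits
    return x, y

def symbol_name(byte, control=False):
    """Return the Dx.y name (or Kx.y when the byte is sent as a control symbol)."""
    x, y = split_8b(byte)
    return f"{'K' if control else 'D'}{x}.{y}"

if __name__ == "__main__":
    print(symbol_name(0x4A))                # D10.2
    print(symbol_name(0xBC, control=True))  # K28.5
```

For example, 0xBC is 101 11100 in binary, so y = 5 and x = 28, which is why the comma K28.5 corresponds to that byte value.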
76
8b10b code mapping: data -2-
[Tables: 3 bits → 4 bits and 5 bits → 6 bits code mappings (…)]
77
8b10b code mapping: data -2-
[Tables: 3 bits → 4 bits and 5 bits → 6 bits code mappings (…)]
Choice made from value of 4b6b stream to respect some properties (disparity, uniqueness of some bit string).
78
8b10b code mapping: disparity counter
Disparity D = #1’s - #0’s.
– 4 bits: 16 values. Only 6 are disparity neutral (need 2^3).
– 6 bits: 64 values. Only 20 are disparity neutral (need 2^5).
– 4 bits and 6 bits: only even disparity possible. Therefore allow transmission of values with disparity -2, 0 and +2.
Track the running disparity (RD) counter value and compensate! RD = -1
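The counts on this slide can be checked by brute force. The sketch below enumerates all n-bit words and histograms their disparity. The helper names are illustrative.

```python
from collections import Counter
from itertools import product

def disparity(bits):
    """Disparity D = number of 1's minus number of 0's in a bit string."""
    return bits.count("1") - bits.count("0")

def disparity_histogram(n_bits):
    """Count how many n-bit words have each disparity value."""
    return Counter(disparity("".join(p)) for p in product("01", repeat=n_bits))

if __name__ == "__main__":
    for n in (4, 6):
        hist = disparity_histogram(n)
        print(n, dict(sorted(hist.items())))
```

With only 6 neutral 4-bit words for 2^3 inputs and 20 neutral 6-bit words for 2^5 inputs, some inputs have to be sent with disparity -2 or +2, which is what the running disparity counter compensates.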
79
8b10b code mapping
80
8b10b code mapping
– RD+: data w. D = -2 or 0 transmitted.
– RD-: data w. D = 0 or +2 transmitted.
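The rule on this slide can be illustrated with a single symbol. The sketch below uses the standard 10-bit encodings of K28.5 for the two running-disparity states. The `transmit` helper and its bookkeeping (RD stored as -1 or +1) are illustrative assumptions, not the FE-I4 encoder.

```python
# Standard K28.5 encodings, keyed by the current running disparity (-1 or +1).
K28_5 = {-1: "0011111010", +1: "1100000101"}

def disparity(word):
    """Disparity D = number of 1's minus number of 0's."""
    return word.count("1") - word.count("0")

def transmit(rd, encodings):
    """Pick the encoding for the current RD and return (word, new RD)."""
    word = encodings[rd]
    d = disparity(word)
    allowed = (0, +2) if rd < 0 else (-2, 0)   # RD-: D = 0 or +2, RD+: D = -2 or 0
    if d not in allowed:
        raise ValueError("encoding violates the running-disparity rule")
    return word, rd + d   # -1 + 2 -> +1, +1 - 2 -> -1, D = 0 leaves RD unchanged

if __name__ == "__main__":
    rd = -1
    for _ in range(3):
        word, rd = transmit(rd, K28_5)
        print(word, rd)
```

Sending K28.5 repeatedly therefore alternates between the two encodings, and the running disparity never leaves the set {-1, +1}.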
81
Error detection & Control symbols
– Not real error detection: out of 1024 10-bit sequences, only ~1/2 + 12 are allowed; the remaining ones produce an error flag.
– Control symbols: a set of 12 extra valid 10-bit sequences. Can be used as commands.
– K28.1, K28.5, K28.7: commas. 11111 or 00000 cannot be found anywhere else in the data stream.
82
Robustness?
– Stuff bits for error detection (parity check? CRC?)?
– Max 5 consecutive identical bits. Furthermore, in commas only!
– Keep event synchronization: robustness against single bit flip in header. Use commas in the header? Unique stream in K28.1, K28.5, K28.7: data / monitoring / configuration.
– Some level of burst error protection too.
83
Clock and Data Recovery -CDR- 1
In the receiver, a PLL with an approximate frequency reference performs the phase alignment (alignment to the transitions in the data stream).
84
Clock and Data Recovery -CDR- 2
– PLL needs to lock to 8b10b stream: not a periodic signal!
– PFD needs to be flexible enough to allow no transition in data stream during clock period.
– Ex: use tapped delay line, designed to be more than 1 data bit period, but less than 2.
Transition in data stream:
– centered: correct phase.
– elsewhere: correct the VCO.
– not present: do nothing.
85
Clock and Data Recovery -CDR- 3