Uneven Bin Width Digitization and a Timing Calibration Method Using Cascaded PLL Wu, Jinyuan Fermilab May. 2014.


2 Uneven Bin Width Digitization and a Timing Calibration Method Using Cascaded PLL Wu, Jinyuan Fermilab May. 2014

3 May. 2014, Wu Jinyuan, Fermilab jywu168@fnal.gov Uneven Bin Width Digitization & Cascade PLL 2 Introduction Digitization with uneven bins is needed in FPGA based TDC. The differential nonlinearity is acceptable in many cases. A value called equivalent bin width is defined. A scheme of generating calibration pulses with cascaded PLL circuits is presented. The same scheme can be used for clock phase measurement.

4 TDC Using FPGA Logic Chain Delay. Convenient and low cost, but the bin widths are not uniform. (Diagram signals: IN, CLK.)

5 Delay Line Speed vs. Core Voltage. (Figure: 16 patterns @ 400 MHz, captured at VCCINT = 1.20 V and VCCINT = 1.18 V.)

6 Adjusting Bin Widths? Feb. 2014, Wu Jinyuan, Fermilab jywu168@fnal.gov PPS TDC 5 Compensation: adjusting the bin width to a certain value. Slowing down the delay chain? Linearization: fine-tuning the width of each bin. Cost?

7 Nonlinearity = Something Bad? Nonlinear scales are commonly used. Sometimes the markers can be at arbitrary (but known) positions, as in the solar spectrum. (Image: solar spectrum, file solar-spectrum-from-www-mao-kiev-ua--sol_ukr--terskol--bmv_m; Association of Universities for Research in Astronomy Inc. (AURA).)

8 The Equivalent Width. (Figure: a total range W divided into bins 0, 1, 2, ..., n-1, n, n+1 with widths w0, w1, w2, w3, w4, ..., together with the definition of the equivalent width w_eq.) Digitizers with non-uniform bin widths can make precise measurements as long as they are calibrated appropriately. An equivalent bin width w_eq is defined in the figure. The calibration can be done offline and/or online.
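The formula for w_eq was part of the slide's figure and is not in this transcript. The sketch below assumes one common definition: a hit spread uniformly over the range W lands in bin i with probability w_i/W and carries a quantization variance of w_i^2/12, so the total variance is the sum of (w_i/W)(w_i^2/12), and w_eq is the width of a uniform bin with the same variance, i.e. w_eq = sqrt(sum(w_i^3)/W). The function name and the sample widths are illustrative, not taken from the talk.

```python
import math


def equivalent_bin_width(widths):
    """Equivalent bin width under the assumed definition w_eq = sqrt(sum(w^3) / W)."""
    total = sum(widths)  # W, the full range covered by the bins
    if total <= 0:
        raise ValueError("bin widths must sum to a positive range")
    # Probability of landing in bin i is w_i / W; its variance is w_i^2 / 12.
    variance = sum((w / total) * (w * w / 12.0) for w in widths)
    # A uniform bin of width w has variance w^2 / 12, so invert that relation.
    return math.sqrt(12.0 * variance)


if __name__ == "__main__":
    print(equivalent_bin_width([50.0] * 10))   # uniform bins: equals the bin width
    print(equivalent_bin_width([10.0, 30.0]))  # uneven bins: larger than the mean width
```

Under this definition, uniform bins give w_eq equal to the bin width, and any unevenness pushes w_eq above the average width, which is why a few ultra-wide bins dominate the result.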

9 Auto Calibration: Histogram Booking. In the auto calibration process, a bin width histogram (DNL histogram) is first booked from 16K to 32K events. More counts are accumulated in wider bins. (Diagram: In (bin) -> DNL Histogram -> LUT -> Out (ps).)

10 Auto Calibration: Summing Lookup Table. Bin widths are summed up into the calibration lookup table. Note that the values represent the times of the centers of the bins. (Diagram: In (bin) -> DNL Histogram -> LUT -> Out (ps).)
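The two calibration steps (booking a DNL histogram, then summing widths into a lookup table of bin centers) can be sketched as follows. This is a minimal software illustration, not the firmware from the talk: it assumes the calibration hits are spread evenly over one 4000 ps clock period, so each bin's width is proportional to its count. The function name `build_calibration_lut` is hypothetical.

```python
def build_calibration_lut(counts, clock_period_ps=4000.0):
    """Turn a DNL histogram into a lookup table of bin-center times in ps.

    counts[i] is the number of calibration hits that landed in bin i.
    Assumes hits are evenly spread over one clock period, so the width of
    bin i is clock_period_ps * counts[i] / sum(counts).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    lut = []
    left_edge = 0.0  # running sum of the widths of all earlier bins
    for count in counts:
        width = clock_period_ps * count / total
        lut.append(left_edge + width / 2.0)  # time at the center of this bin
        left_edge += width
    return lut


if __name__ == "__main__":
    # A narrow bin followed by a wide bin: centers at 500 ps and 2500 ps.
    print(build_calibration_lut([1, 3]))
```

Storing the bin center rather than the bin edge halves the worst-case error within each bin, which is what makes uneven bins usable.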

11 Calibration Pulse Generation: Random != Uniform. (Figure: histogram of 16384 events.) When the number of events is finite, random hits have large fluctuations. Pulses with timing evenly spread relative to the TDC clock are desirable.
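A small simulation illustrates the point. It is a sketch under stated assumptions: an idealized digitizer with 64 equal bins (a hypothetical choice, so that any spread in the counts comes purely from the hit source), 16384 events as on the slide, and an evenly spread source that steps through 4096 phases per scan.

```python
import random

N_EVENTS = 16384  # number of calibration events, as on the slide
N_BINS = 64       # hypothetical ideal digitizer with equal-width bins
STEPS = 4096      # phases per scan for the evenly spread source


def spread(counts):
    """Difference between the fullest and emptiest bin."""
    return max(counts) - min(counts)


def random_hits(seed=1):
    """Histogram of hits arriving at uniformly random times."""
    rng = random.Random(seed)
    counts = [0] * N_BINS
    for _ in range(N_EVENTS):
        counts[rng.randrange(N_BINS)] += 1
    return counts


def swept_hits():
    """Histogram of hits that step evenly through all phases, scan after scan."""
    counts = [0] * N_BINS
    for k in range(N_EVENTS):
        step = k % STEPS
        counts[step * N_BINS // STEPS] += 1
    return counts


if __name__ == "__main__":
    print("random spread:", spread(random_hits()))
    print("swept spread: ", spread(swept_hits()))
```

With random hits each bin holds about 256 counts with statistical scatter of roughly sqrt(256) = 16, which would be misread as width differences; the swept source fills every equal bin with exactly 256 counts.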

12 Cascaded PLL Circuits. Two stages of PLL circuits are cascaded together. f(CK250a) = 250 MHz. f(CK251c) = (4096/4095)*f(CK250a) = 250.06 MHz. T(CK250a) - T(CK251c) = 0.97 ps. (Diagram signals: CLOCK_50, CK250a, CK251c.)

13 Phase Differences. The relative timing differences between CK250a and CK251c cover the entire range of 4000 ps in 4096 cycles. The power-of-two number 4096 (2^N) is chosen for easy implementation of the calibration sequencing functional block.
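The numbers on these two slides follow from the 4096/4095 frequency ratio, as the short calculation below shows. It uses exact fractions to avoid rounding; the helper name `phase_offset_ps` is illustrative. The exact period difference is 4000/4096 ps, about 0.977 ps, which the slides quote as 0.97 ps.

```python
from fractions import Fraction

F_CK250A_HZ = Fraction(250_000_000)  # 250 MHz TDC clock
RATIO = Fraction(4096, 4095)         # frequency ratio produced by the cascaded PLLs

f_ck251c_hz = F_CK250A_HZ * RATIO
t_ck250a_ps = Fraction(10**12) / F_CK250A_HZ  # 4000 ps
t_ck251c_ps = Fraction(10**12) / f_ck251c_hz  # slightly shorter period
step_ps = t_ck250a_ps - t_ck251c_ps           # phase slip per CK251c cycle


def phase_offset_ps(cycle):
    """Accumulated phase slip of CK251c against CK250a after `cycle` cycles."""
    return (cycle * step_ps) % t_ck250a_ps


if __name__ == "__main__":
    print(f"f(CK251c) = {float(f_ck251c_hz) / 1e6:.2f} MHz")
    print(f"step      = {float(step_ps):.4f} ps")
    print("offset after 4096 cycles:", phase_offset_ps(4096), "ps")
```

After 4096 cycles the slip has accumulated to exactly one 4000 ps period and the pattern repeats, so a 12-bit counter is enough to sequence one full scan.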

14 Test Result in an Oscilloscope Screen Capture. A total of 16384 calibration edges are collected. The entire 4000 ps range is scanned 4 times (4*4096 = 16384). The histogram (with 50 ps/bin) serves as a demonstration of the calibration lookup table. (Screen capture traces: trigger edges by CK250a; calibration edges by CK251c; calibration lookup table.)

15 Clock Phase Measurement, Another Application. Two clocks from the same source but with different phases are multiplied in PLLs. The CK251c clock scans the entire 4000 ps range, and the correctness of the captures by the DFFs driven by the two clocks is checked. (Diagram labels: CK50a, CK50b, cascaded PLL circuits, CK250a, CK250b, CK251c, and two D flip-flops, each with a "Correctly Captured?" check.)

16 Oscilloscope Screen Capture. The phase difference between CK250a and CK250b can be measured after CK251c scans through. The 0-1 and 1-0 transitions have different setup times. Each 4 ns step corresponds to 0.97 ps of phase. (Traces: captured correctly by CK250a; captured correctly by CK250b; captured 0-1 transition by CK250b; captured 1-0 transition by CK250b.)
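One way to read such a capture, sketched under assumptions: if the "captured correctly" flag for CK250a changes state at scan step `step_a` and the flag for CK250b changes at `step_b`, the separation in steps times the per-step phase slip (4000 ps / 4096) gives the phase difference. The function and argument names are hypothetical, and comparing like transitions (0-1 with 0-1) is assumed so that the different setup times cancel.

```python
def phase_difference_ps(step_a, step_b, steps_per_scan=4096, period_ps=4000.0):
    """Phase of clock B relative to clock A, from the scan steps where each flag flips.

    step_a, step_b: scan step index (0..steps_per_scan-1) at which the capture
    flag of each clock changes state for the same kind of data transition.
    """
    step_ps = period_ps / steps_per_scan  # about 0.977 ps per step
    return ((step_b - step_a) % steps_per_scan) * step_ps


if __name__ == "__main__":
    # Flags flipping 512 steps apart correspond to a 500 ps phase difference.
    print(phase_difference_ps(100, 612))
```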

17 The End Thanks

18 Good, However. Auto calibration solves some problems. However, it does not eliminate the ultra-wide bins.

19 Concern: Dead Time?

20 Wave Union Launcher A. A regular TDC records only one transition. A Wave Union TDC records multiple transitions. (Diagram labels: Wave Union Launcher A; In; CLK; 1: Unleash; 0: Hold.)

21 Wave Union Launcher A: 2 Measurements/hit. (Diagram label: 1: Unleash.)

22 Sub-dividing Ultra-wide Bins. Device: EP2C8T144C6. Plain TDC: max. bin width 160 ps; equivalent bin width 60 ps. Wave Union TDC A: max. bin width 65 ps; equivalent bin width 30 ps. (Diagram labels: 1: Unleash; edges 1 and 2.)

23 Time Measurement Errors Due to Power Supply Noise. Typical RMS resolution is 25-30 ps. Measurements with cleaner power (diamonds) are better than those with noisy power (squares). (Plot series: switching power supply; linear power supply.)

24 Pipeline Structure of TDC Time Sensing Block. The front end of the TDC is designed with a pipeline structure. There is nearly no dead time in this section: a hit can be digitized every clock cycle (@250 MHz). However, we introduce some dead time by using a slower clock to save power. (Block diagram labels: In; CLK; Hit Detect Logic; Coarse Time Counter; Fine Time Encoder; ENA; Fine Time; Coarse Time; Data Ready.)

25 Concern: Low-power?

26 Low-Power Design Practice: Clock Speed. The Sampling Register Arrays are clocked at 250 MHz. All other stages are clocked at 62.5 MHz. When a valid hit is sampled, the Sampling Register Array is disabled so that the registered pattern is stable for 64 ns. The Data Load/Transfer Registers are enabled to load input once every 64 ns, so that a valid hit is guaranteed to be loaded once and only once. (Block diagram labels: IN0; Delay Line & Sampling Register Array; CK250; Clock Disable; Data Load/Transfer Register; CK62; Load; Sequencer; Encoder; Buffer w/ Zero Suppression; 250 MHz and 62.5 MHz clock domains.)

27 Low-Power Design Practice: Resource Sharing. The Data Load/Transfer Registers are enabled to load input once every 64 ns (i.e., 4 clock cycles at 62.5 MHz). They transfer data from other channels when they are not enabled to load. Four channels share an Encoder and a Buffer with Zero Suppression. (Block diagram labels: IN0, IN1, IN2, IN3; Delay Line & Sampling Register Array; CK250; Clock Disable; Data Load/Transfer Register; CK62; Load; Sequencer; Data Merging Register; Encoder; Buffer w/ Zero Suppression; 250 MHz and 62.5 MHz clock domains.)

28 Low-Power Design Practice: Wave Union. Intrinsically, the Wave Union TDC is a low-power scheme. Multiple measurements are made with one set of delay line, register array, encoder, etc., yielding a finer resolution that would otherwise need several regular TDC blocks to achieve.

29 Concern: Data Packing?

30 Data Packaging: Block Diagram. For each straw, 2 TDCs and 1 ADC are implemented. Time and charge data are grouped and sent out together. (Block diagram, 1 straw = 2 TDC + 1 ADC: two chains of Carry Chain -> Reg. Array -> Encoder, plus ADC data, feed Buffer & Data Packing -> Output Buffer -> Parallel-to-Serial Converter -> Data.)

31 Data Packing: A Real Design for a Similar Project. TDC and ADC data packaging for OpenPET of LBL.

32 Data Bit Layout. (Figure: 32-bit words, bits 0 to 31. Fields shown: Header; Hit Header & Count; Hit Count; header bit patterns 001 and 0010; CH: 0-15; TDC Fine Time, LSB = 15.625 ps; TDC Coarse Time, LSB = 4 ns; ADC 0 through ADC 11.) Data layout for full ADC resolution. This scheme uses 256 bits/hit. There could be another layout with 128 bits/hit.

33 Connection Between Digitizer and ROC. Clock and frame signals are provided along with the data links. The data links run at 200 Mbits/s. (Block diagram labels: Digitizer; ROC; TX; RX; Clock; Data; Clock Generator; Frame Generator.)

34 Data Rate: Is 200 Mb/s Enough? Assumptions: 1695 ns micro-bunch length; 900 ns data-taking window; 1 LVDS data output pair for every 4 straws. The 300 kHz hit rate in the TDR is likely an overestimate. As long as the actual hit rate is < 200 kHz, a data link of 200 Mb/s per LVDS pair should be sufficient.
Hit/Straw    256 bits/hit    128 bits/hit
300 kHz      253 Mb/s        126 Mb/s
200 kHz      169 Mb/s        85 Mb/s
100 kHz      85 Mb/s         42 Mb/s
30 kHz       25 Mb/s         12 Mb/s

35 Test Results

36 The Test Hardware. 2011 Altera Cyclone III Starter Kit ($211+$50). FPGA: EP3C25F324C6N ($73.90). 32 channels: 30 ps (25 ps with linear power supply). 27 mW/channel. www.altera.com

37 Test Setup. (Photo labels: NIM to LVDS Converter; TDC Module.)

38 Output Raw Data and Typical Delta T Histogram Between Two Channels. RMS of this histogram is 25 ps. Raw data: 00003C C064A6 F064B8 C07CA4 F07CB4 C094A0 F094B0 C0AC9C F0ACAC C0C497 F0C4A8 C0DC91 F0DCA2

39 Delta T Between NIM Inputs. TDC channels internally ganged together have the smallest standard deviation of time differences. Typical channel pairs sharing the same fan-out unit have 30 ps RMS. Timing jitter of the fan-out units adds to the measurement errors. (Diagram labels: Pulse Gen.; LeCroy 429A NIM Fan-out (two units); NIM to LVDS (two units); FPGA TDC; configurations A, B, C.)

40 Measurement Precisions. Analyzed by Woon-Seng Choong, LBNL.

41 Performance Degrading in CPU/GPU, ASIC & FPGA. Imperfect designs considerably degrade the performance of ICs, including CPUs/GPUs. ASIC devices are built using older technology and suffer similar design degradation. The FPGA internal structure causes extra performance degradation in addition to design degradation. Design modification in an FPGA is easier, so design degradation can be minimized. A carefully designed FPGA may have better performance than a typical ASIC. (Chart, performance axis: CPU/GPU, degraded by design from the theoretical limit of current technology; ASIC, degraded by design from the theoretical limit of older technology; FPGA, degraded by structure and by design.)

42 Specifications
RMS resolution (Delta T between two channels): 25 to 30 ps
Same-channel re-hit time interval: 64 ns
Temporary buffer capacity: 128 hits/(4 ch)/(16 us)
LVDS output port rate: 250 Mbits/s/port
Output capacity in each LVDS output port: 128 hits/(16 ch)/(16 us)
Number of LVDS output ports: 1, 2, 3, 4/(16 ch)
Power consumption (core only): 9.3 mW/channel
Power consumption (total): 27 mW/channel

43 Test Result. BNC adapters are used to add delays in 140 ps steps. (Figure labels: NIM Inputs; LeCroy 429A NIM Fan-out; NIM/LVDS (two units); Wave Union TDC B; 140 ps; RMS 10 ps.)

44 Other Applications: Single Slope ADC. (Schematic labels: FPGA TDC; R, R, C; R1; 4xR2; V_REF+, V_REF-; V_IN1+, V_IN1-; V_IN2+, V_IN2-.)

45 If You Want to Try. The FPGA on the Starter Kit is fairly powerful. More than 16 pairs of LVDS I/O can be accessed via the daughter card. The FPGA can fit 32 channels, but implementing 16 channels is more practical given the I/O pairs. TDC data are stored in the RAM on the board and can be read out via USB. A good solution for small experiment systems as well as student labs. DK-START-3C25N Cyclone III FPGA Starter Kit, $211 (www.altera.com). THDB-H2G (HSMC to GPIO Daughter Board), $50 (www.altera.com).

46 Timing Uncertainty Confinement

47 Historical Implementation in ASIC TDC. HIT is used as the CK input, which creates unnecessary challenges. Deadtime is unavoidable. Coarse time recording needs special care. Two array + encoder sets are needed for the rising edge and the falling edge. The register array must be reset for the next event. The encoder must be re-synchronized with the system clock in order to interface with the readout stage. Unnecessary Challenges = Extra Efforts + Reduced Performance. (Block diagram labels: DLL Clock Chain; c0, c1; HIT; Encoder; Coarse Time Counter; Coarse Time Register; Coarse Time Selection Logic.)

48 Unnecessary Challenges. Historically, Gray code counters, double counters, and dual registers + MUX have been used in ASIC TDC coarse time counter schemes. These are unnecessary if the TDC is designed appropriately. In an FPGA, a plain binary counter is sufficient. (Diagram: coarse time counter variants, marked "Unnecessary for FPGA TDC"; Gray code counter sequence 000 001 011 010 110 111 101 100.)
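For reference, the eight values listed on this slide are the 3-bit reflected Gray code, in which consecutive values differ in exactly one bit. The sketch below reproduces the sequence with the standard binary-to-Gray conversion n XOR (n >> 1); the function names are illustrative.

```python
def gray_code(n):
    """Reflected binary Gray code of the non-negative integer n."""
    return n ^ (n >> 1)


def gray_sequence(bits):
    """All Gray code values for the given width, as zero-padded bit strings."""
    return [format(gray_code(n), f"0{bits}b") for n in range(1 << bits)]


if __name__ == "__main__":
    print(" ".join(gray_sequence(3)))
```

The one-bit-per-step property is what protects a counter sampled by an asynchronous HIT signal; when the counter is only ever read in its own clock domain, as in the FPGA scheme, that protection is not needed.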

49 A Better Implementation. HIT is used as the D input. Deadtimeless operation is possible. No special care is needed for coarse time. Both rising and falling edges are digitized with a single array + encoder set. No resetting is needed for the register array. The output is synchronized with the system clock and is ready to interface with the readout stage. (Block diagram labels: DLL Clock Chain; HIT; Multi-Sampling Register Array; OR + Register; Clock Domain Transfer; 16-bit Encoder with Registered Outputs; Coarse Time Counter; outputs DV, EG, T4..T0, TC.)

50 Coarse Time Counter. The timing uncertainty between HIT and CLK is confined to the sampling register array. All the remaining logic is driven by the CLK signal. No special care, such as a Gray code counter, is needed for the coarse time counter. (Block diagram labels: HIT; CLK; Hit Detect Logic; Coarse Time Counter; Fine Time Encoder; ENA; Fine Time; Coarse Time; Data Valid.)

51 Comparison
Historical scheme: HIT -> CK; (c0..c31) -> D. Preferable scheme: HIT -> D; (c0..c31) -> CK.
Historical: Deadtime is unavoidable. Preferable: Deadtimeless operation is possible.
Historical: Coarse time recording needs special care. Preferable: No special care is needed for coarse time.
Historical: Two array + encoder sets are needed for the rising edge and the falling edge. Preferable: Both rising and falling edges are digitized with a single array + encoder set.
Historical: The register array must be reset for the next event. Preferable: No resetting is needed for the register array.
Historical: The encoder must be re-synchronized with the system clock in order to interface with the readout stage. Preferable: The output is synchronized with the system clock and is ready to interface with the readout stage.

52 Wave Union? Photograph: Qi Ji, 2010

53 Typical Block Diagram. The carry chain and register array capture the arrival time of the input transition. The position of the transition is encoded as a time code. Data buffers at various stages are used to store data temporarily. Digitized time data are sent out of the chip through data ports. (Block diagram: several TDC channels, each Carry Chain -> Reg. Array -> Encoder -> Buffer & LUT, feeding Output Buffer -> Parallel-to-Serial Converter -> Data.)
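The encoding step can be pictured with a small software model. This is a hedged sketch rather than the actual encoder logic: it assumes the register array snapshot is a list of 0/1 values ordered from the start of the carry chain, that taps the transition has already passed hold the new input level, and that the pattern has no bubbles (real encoders usually need to handle those). The function name `encode_transition` is hypothetical.

```python
def encode_transition(snapshot):
    """Locate the first transition in a register-array snapshot.

    Returns (position, kind), where position is the index of the first tap
    the transition has not yet reached and kind is "rising" or "falling",
    or None if the snapshot contains no transition.
    """
    for i in range(1, len(snapshot)):
        if snapshot[i] != snapshot[i - 1]:
            # Taps before i already show the new level of the input.
            kind = "rising" if snapshot[i - 1] == 1 else "falling"
            return i, kind
    return None


if __name__ == "__main__":
    print(encode_transition([1, 1, 1, 0, 0, 0, 0, 0]))  # rising edge, 3 taps deep
    print(encode_transition([0, 0, 1, 1]))              # falling edge, 2 taps deep
    print(encode_transition([0, 0, 0, 0]))              # no hit in this clock cycle
```

The returned position is the raw bin number that the calibration lookup table later converts to a time in ps.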

54 Example of an Actual Design of a 16-Channel TDC. The hit time for each of the 16 channel inputs is digitized and encoded. Data from 4 channels are buffered, and data from 4 groups of 4 channels are merged together. Raw hit times are converted to fine time through the automatic calibration block. Data from all 16 channels are buffered and sent out via 4 pairs of LVDS ports @250 Mbits/s. (Block diagram stages: TDC + Encoder; Data Buffer + Concentration; Automatic Calibration; Output Buffer; Serialization.)

55 Temperature/PS Voltage Effects. The power supply voltage changes from 2.5 V to 1.8 V (about the same effect as 100 °C to 0 °C). The delay speed changes by 30%. The difference of the two TDC numbers reflects the delay speed. (Plot series: 1st TDC; 2nd TDC.)

56 FPGA TDC in Fermilab. The card is a 6U VME board. An Altera Cyclone III FPGA device (EP3C40F484C6) is used to implement the TDC. Up to 64 channels can be implemented. A multi-threshold discriminator daughter card can be attached, as shown on the right. The project is supported by detector R&D/test beam task codes.

57 FPGA TDC in Flash Based FPGA. TDC module for SeaQuest. Hardware made in Taiwan. Firmware development effort: (0.2 EE + 0.75 graduate student)*9 months. 1M-gate Actel ProASIC3 FPGA. From Su-Yin Wang's slides.

58 FPGA TDC: a Single Chip Solution. In modern HEP systems, it is often necessary to put an FPGA after a TDC to handle the generated data. It is convenient to include the TDC function inside the FPGA. (Diagram labels: In; V_TH; TDC; FPGA; TDC FPGA; DAQ.)

59 Digitization with Non-uniform Bin Widths. Non-uniform bin widths in a digitizer are sometimes called differential non-linearity, which sounds like a bad thing. In fact, digitizers with non-uniform bin widths make precise measurements as long as they are calibrated to the centers of the bins. An equivalent bin width can be defined as above. The calibration can be done offline and/or online.

60 Histogram Booking

61 Summing

