A Demo Prototype for Digitization at Feed Through Wu, Jinyuan, Scott Stackley, John Odeghe Fermilab Nov. 2010
Comments from Internal Review
The primary technical concerns are with reliability, noise, and the appropriateness of the design specs.
Background
There is a desire to digitize as close to the front end as possible, in order to improve noise performance and to permit longer cable runs. Digitization at the feed through has been considered before, but there are two primary concerns: (1) power consumption and cooling, and (2) the difficulty of servicing the cards at the feed through. In this document, the option of digitizing at the feed through is revisited with these concerns in mind.
Reducing Noise During Digitization
The goal is to digitize the analog signals before they are contaminated by noise. However, the digitization process itself creates noise that may contaminate the signals. It is therefore natural to minimize digital activity during digitization.
Q: How many bit transitions are considered minimal?
A: 1 bit transition per data sample.
The Single Slope ADC
[Figure: block diagrams with labels Shaper, Line Driver, Feed Through, ADC, FPGA, TDC and V_REF; timing sketch with labels T1, V1, T2, V2.]
The analog signal of each channel from the shaper is fed to a comparator and compared with a common ramping reference voltage V_REF. Pulses, rather than analog signals, are transmitted on the cable. The transition times, which represent the input voltage values, are digitized by TDC blocks inside the FPGA. This approach is sometimes (mistakenly) referred to as a “Wilkinson ADC”.
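The time-to-voltage relation described above can be sketched as follows. This is a minimal illustration that assumes an ideal linear ramp; the function name and parameters are hypothetical, and the example values (a -1 V to +1 V range and a 409.6 ns ramp) are borrowed from other slides in this deck rather than stated on this one.

```python
def crossing_time_to_voltage(t_cross_ns, ramp_start_v, ramp_stop_v, ramp_duration_ns):
    """Convert a comparator crossing time into the input voltage.

    Assumes an ideal linear reference ramp (an assumption for illustration;
    a real ramp may be nonlinear and would need calibration).
    """
    if not 0.0 <= t_cross_ns <= ramp_duration_ns:
        raise ValueError("crossing time lies outside the ramp window")
    slope_v_per_ns = (ramp_stop_v - ramp_start_v) / ramp_duration_ns
    return ramp_start_v + slope_v_per_ns * t_cross_ns


# Example: a crossing halfway through the ramp corresponds to mid-scale.
v_mid = crossing_time_to_voltage(204.8, -1.0, 1.0, 409.6)
```

The comparator output flips at the moment the ramp equals the input, so the only digital quantity that has to be measured is that single transition time.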
Single Slope ADC Test: Waveform Digitization
[Figure panels: Raw Data; Input Waveform, Overlap Trigger & Reference Voltage; Calibrated. Circuit labels: FPGA TDC, 100 pF, V_REF.]
Shown here is a demo of a 6-bit single slope TDC. The sampling rate in this test is 22 MHz. Both the leading and the trailing reference ramps are used in this example. A nonlinear reference ramp is acceptable because the measurement can be calibrated.
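One way to calibrate a nonlinear ramp is a lookup table of (time, voltage) points with interpolation between them. The sketch below uses piecewise-linear interpolation, which is a technique chosen here for illustration and not necessarily the method used in the test; the calibration points are made-up numbers.

```python
from bisect import bisect_right


def make_calibration(cal_times, cal_volts):
    """Return a function mapping a TDC time to a voltage.

    cal_times / cal_volts are hypothetical calibration points measured on
    the (possibly nonlinear) reference ramp; times must strictly increase.
    """
    if len(cal_times) != len(cal_volts) or len(cal_times) < 2:
        raise ValueError("need at least two matching calibration points")
    if any(b <= a for a, b in zip(cal_times, cal_times[1:])):
        raise ValueError("calibration times must be strictly increasing")

    def time_to_voltage(t):
        if not cal_times[0] <= t <= cal_times[-1]:
            raise ValueError("time outside the calibrated range")
        # Index of the segment containing t (the last point maps to the last segment).
        i = min(bisect_right(cal_times, t) - 1, len(cal_times) - 2)
        t0, t1 = cal_times[i], cal_times[i + 1]
        v0, v1 = cal_volts[i], cal_volts[i + 1]
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

    return time_to_voltage


# Made-up calibration of a ramp that flattens toward the end (RC-like shape).
time_to_voltage = make_calibration([0.0, 10.0, 20.0, 40.0], [0.0, 0.4, 0.7, 1.0])
```

With such a table, the ramp shape only has to be repeatable, not linear.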
Digital Noise During Digitization
Typical ADC devices create noise that may interfere with the analog circuits. In the single slope scheme, the interval during which the common reference voltage is reset may be noisy, but the analog signal is not sampled during it. There is no digital control activity while the common reference voltage ramps up.
[Figure: timing sketch with labels T1, V1, T2, V2 and regions marked Noisy and Clean.]
TDC Resolution Requirement
With a sampling rate of 2 MHz, the whole ramping cycle is 500 ns. Allocate 409.6 ns for the upward ramp. To achieve 12-bit ADC precision, the TDC LSB is (409.6 ns)/4096 = 100 ps. A TDC with a 100 ps LSB can be comfortably implemented in an FPGA today.
[Figure: timing sketch with labels T1, V1, T2, V2 and a 500 ns cycle.]
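The resolution arithmetic on this slide can be checked with a few lines. The variable names are illustrative; the time left for resetting the ramp is derived here and is not quoted on the slide.

```python
sampling_rate_hz = 2e6      # from the slide
ramp_up_ns = 409.6          # upward ramp duration used in the LSB formula
adc_bits = 12               # target ADC precision

cycle_ns = 1e9 / sampling_rate_hz              # full ramp cycle, 500 ns
tdc_lsb_ps = ramp_up_ns * 1000 / 2**adc_bits   # required TDC bin width, about 100 ps
reset_ns = cycle_ns - ramp_up_ns               # time left for resetting the ramp (derived)

print(f"cycle = {cycle_ns:.1f} ns, TDC LSB = {tdc_lsb_ps:.1f} ps, reset = {reset_ns:.1f} ns")
```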
The Wave Union TDC using FPGA
One possible choice for the TDC is a delay line based architecture called the Wave Union TDC, implemented in an FPGA. Shown here is an ASIC-like implementation in a 144-pin device.
Channels: 18 (16 regular channels + 2 timing reference channels).
Cost: this FPGA (EP2C8T144C6) costs $28, or $1.75/channel (AD9222: $5.06/channel).
LSB: ~60 ps. RMS resolution: < 25 ps.
Power consumption: 1.3 W, or 81 mW/channel (AD9222: 90 mW/channel).
[Figure: Wave Union Launcher A, with labels In and CLK.]
A Demo Prototype
The single slope ADC demo card uses the same connector and the same pinout as the intermediate amplifier card, and has the same width. Each card serves 32 channels (16 channels are laid out in the current version). With a CMOS cold ASIC, the analog section on this card can be simplified further. The FPGA TDC digitizes the transition times of the comparator outputs (16 channels in this version). The Data Handling FPGA performs wire-time conversion, Huffman coding, dynamic decimation, etc., and supports a 100 Mbit/s Ethernet interface (Fast Ethernet, or FE). Data are sent to the DAQ via the Fast Ethernet port.
The Option I
[Figure: block diagram with labels V_REF, Shaper, Feed Through, FPGA and TDC.]
Each line driver card at the feed through now hosts 32 shapers, 32 comparators and a common ramp reference voltage generation circuit. There is no control logic on board, so the amount of servicing is minimized. Cabling: 32 differential pairs/card, or 1 pair/channel, which is the same as in the baseline design. The number of connections to the digital section is unchanged.
The Option II
[Figure: block diagram with labels V_REF, Shaper, FPGA, TDC and Feed Through.]
Each line driver card at the feed through now hosts 32 shapers, the common ramp reference voltage generation circuit and 2 FPGA devices, each with 16 TDC channels. The comparators can be absorbed into the FPGA by using its differential input receivers. Each FPGA serves 16 channels and sends data out over 4 LVDS differential pairs in a Cat-5 RJ-45 cable. The data rate on each LVDS pair is 160 Mbit/s for uncompressed continuous waveforms. Cabling: 8 differential pairs/card, or (1 pair)/(4 channels), which is 1/4 of the baseline design. The number of connections to the digital board is reduced from 64 pairs to 16 pairs. If necessary, a digital board may now handle 256 channels.
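The 160 Mbit/s per-pair figure can be reproduced under two assumptions taken from the backup slides rather than from this slide: 16 bits per sample, and a 5/4 line overhead factor (the same factor that appears in the Ethernet bandwidth estimate). This is a plausible reconstruction, not a confirmed derivation.

```python
channels_per_fpga = 16
sampling_rate_hz = 2e6
bits_per_sample = 16    # assumption: word size quoted in the backup slides
lvds_pairs = 4
line_overhead = 5 / 4   # assumption: same 5/4 factor as in the Ethernet estimate

# Uncompressed payload produced by one FPGA, in Mbit/s.
payload_mbps = channels_per_fpga * sampling_rate_hz * bits_per_sample / 1e6
# Rate carried by each LVDS pair after applying the overhead factor.
per_pair_mbps = payload_mbps / lvds_pairs * line_overhead

print(f"payload = {payload_mbps:.0f} Mbit/s, per pair = {per_pair_mbps:.0f} Mbit/s")
```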
The Option III
[Figure: block diagram with labels V_REF, Shaper, FPGA, TDC, Feed Through, Data Handling FPGA and Ethernet Interface.]
The raw waveform data from the FPGA TDC are sent to the Data Handling FPGA, which is connected to an SDRAM buffer and performs the wire-time conversion of the raw data. Data for accelerator neutrino events and supernova data are compressed inside the Data Handling FPGA. The compressed data are sent out via Fast Ethernet. The occupancy of the Fast Ethernet link is about 30%, as calculated in the backup slides. In Options II and III, the power and ground planes are cut to confine digital noise.
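The slides do not describe the compression scheme in detail. As a generic illustration of why Huffman coding suits slowly varying waveforms, the sketch below builds a Huffman code for the differences between consecutive samples; coding differences is an assumption made here for illustration, and the sample values are made up.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code {symbol: bit string} from an iterable of symbols."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


# Made-up waveform: mostly pedestal with one small pulse.
samples = [100, 100, 101, 101, 101, 100, 100, 103, 108, 104, 101, 100, 100]
deltas = [b - a for a, b in zip(samples, samples[1:])]
code = huffman_code(deltas)
encoded_bits = sum(len(code[d]) for d in deltas)
raw_bits = 16 * len(deltas)  # 16-bit words, as quoted in the backup slides
```

Because most differences are zero or close to it, the frequent symbols receive short codes and the encoded size falls well below the raw 16 bits per sample.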
Oscilloscope Traces
The ramping references, in differential format, are shown on the left. The sampling rate is 2 M samples/s. The ramping references and the input signal are shown on the right. (The full chain test of the demo prototype is under way. The measurements shown on this and the following pages were made during the summer on boards built for other projects. The circuits used for these measurements were copied into the demo prototype.)
Waveform Digitization
A pulse that looks like a TPC wire signal is generated. The input signal is compared with the ramping references and the crossing time is digitized. The sampling rate is 2 M samples/s. The time is then converted back to amplitude.
Digitization Precision
The histogram bin width is (full scale)/4096. The histogram shows the deviation of the measurement points from the pedestal.
The 12-bit Performance
Each bin in the histogram is (full scale)/4096 wide. The measurement is taken with no input signal. There is no intrinsic noise that prevents the comparator + TDC structure from being used as an ADC at 12-bit precision.
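A histogram of this kind can be built as sketched below. The function name is hypothetical, the 2 V full scale is taken from the noise margin backup slide, and estimating the pedestal as the mean of the samples is an assumption made for illustration.

```python
from collections import Counter
from statistics import mean


def pedestal_histogram(samples_v, full_scale_v=2.0, bits=12):
    """Histogram of deviations from the pedestal, in units of one LSB.

    The pedestal is estimated as the sample mean (an assumption), and each
    bin is (full scale)/2**bits wide.
    """
    lsb_v = full_scale_v / 2**bits
    pedestal_v = mean(samples_v)
    return Counter(round((v - pedestal_v) / lsb_v) for v in samples_v)


# Made-up no-input samples scattering by at most one LSB around zero.
lsb = 2.0 / 4096
hist = pedestal_histogram([0.0, lsb, -lsb, 0.0])
```

A narrow histogram confined to a few bins around zero indicates that the noise of the chain is at the level of one 12-bit LSB.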
The Data From Ethernet
Data can be sent out over Ethernet by FPGA firmware alone; no microprocessors are needed. Data can be packed as a “pseudo bit map” (left). Test patterns can be generated in the FPGA for connectivity tests (center). Diagnostic histograms can be generated for online monitoring (right). (These data and patterns were taken on a board built for other projects, but the circuits have been copied into the Demo Prototype board. The FPGA is identical to the Data Handling FPGA.)
Resource, Power and Cost Saving
Digitization:
  Baseline: AD9222, $5.06/channel.
  FPGA TDC: EP2C8T144C6 ($28 Web), $1.75/channel.
Digital data handling:
  Baseline, 64 channels: EP3SL150F1152 ($2235 Web, $500 Contract).
  This demo, 32 channels: EP2C8Q208C7N ($24 Web).
The single card structure saves on crates etc. (to be calculated in detail). Power consumption in each stage is less than that of its counterpart in the baseline (details to be calculated).
Cost of this card:
  PCB fabrication: $81.5 *
  Components: $314 *
  Assembly: $30 **
  Total: $425
* Actual payments. ** Estimate.
The Data Handling FPGA
It is hard to believe that the FPGA in this demo (EP2C8Q208C7N: $24 Web) can handle the tasks of the FPGA in the baseline (EP3SL150F1152: $2235 Web, $500 Contract). I will explain briefly here; more details will be given in the backup slides. FPGA cost rises steeply with pin count, so reducing the pin count is the key to saving resources. The following are changed from the baseline: (1) An optimal data flow scheme is used so that only one RAM buffer is needed rather than three, which saves the pins for the other two RAM buffers. (2) A steady flow of output data (Ethernet) is used instead of the token passing scheme used in the baseline, which has a high instantaneous rate but a low average data rate. The steady data output allows more efficient use of the I/O pins.
Nov CD-1 Readiness Directors Review
DAQ Interface
The DAQ is beyond the scope of this document, but a possible scheme is shown here. About 10 Fast Ethernet links, one from each board, are merged into one Gigabit Ethernet data stream. Up to 32 GE streams are needed for 10K channels. (The occupancies of FE and GE are both about 30%.) The 32 GE streams are connected to 32 PCs via a multi-port switch. Each PC stores an event every 2 seconds at a 15 Hz beam rate.
[Figure: FE to GE merging switches feeding a 32x32 GE switch connected to the PCs.]
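The channel count and per-PC event interval can be cross-checked as follows. The sketch assumes 32 channels per board (from the demo card description) and that spills are distributed to the PCs in round-robin fashion; the latter is an assumption, since the slide does not say how events are assigned.

```python
channels_per_board = 32   # from the demo card description
boards_per_ge = 10        # "about 10" Fast Ethernet links per GE stream
ge_streams = 32
spill_rate_hz = 15
pcs = 32

total_channels = channels_per_board * boards_per_ge * ge_streams

# Each FE link is about 30% occupied, i.e. about 30 Mbit/s of 100 Mbit/s.
fe_occupancy = 0.30
ge_occupancy = boards_per_ge * fe_occupancy * 100 / 1000

# Assumption: spills are handed to the PCs in round-robin order.
seconds_per_event_per_pc = pcs / spill_rate_hz

print(total_channels, ge_occupancy, round(seconds_per_event_per_pc, 2))
```

Under these assumptions the merged GE stream stays at about the same 30% occupancy as each FE link, and each PC receives one spill roughly every 2 seconds.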
Reliability Issues
At least 4 layers of metal contacts (connectors) are removed from the signal chain compared with the baseline, and the failure probability due to loose connections is reduced accordingly. There are no microprocessors on the board; the Ethernet interface is handled by a sequencer inside the FPGA. There are no major electronics racks in the detector enclosures (except for power supplies and clock distribution, which can be controlled and monitored via Ethernet). Old-fashioned design approaches are used. All components are at least 8 years old. The FPGAs (Cyclone II) are two generations older than today’s technology. Moving the FPGAs up by one generation will be considered.
Acknowledgements
MicroBooNE Project, BNL: circuit diagram, component/connector part numbers, mechanical drawing.
Fermilab IPM Program: summer student Scott Stackley.
Fermilab SIST Program: summer student John Odeghe.
The End
Thanks
Noise Margin Comparison
[Figure: block diagrams with labels Shaper, Line Driver, Feed Through, ADC, FPGA, TDC and V_REF.]
The noise needed to cause an LSB error at the input of a 12-bit ADC is (full range)/4096, i.e., (1 V + 1 V)/4096 = 0.5 mV. A TDC timing error can be created by differential noise on the LVDS input wires. Recall that the voltage swing of LVDS is 350 mV, so the differential signal swings by 700 mV, and assume that the LVDS rise time is degraded to 10 ns by cable attenuation. The noise needed to cause an LSB error (100 ps) in the TDC is (700 mV) x (100 ps/10 ns) = 7 mV.
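The two noise margins can be compared numerically. The 700 mV differential swing used below is an assumption inferred from the 350 mV LVDS swing and the quoted 7 mV result; the variable names are illustrative.

```python
adc_bits = 12
adc_full_range_mv = 2000.0     # (1 V + 1 V)
lvds_diff_swing_mv = 700.0     # assumption: 2 x 350 mV differential swing
lvds_rise_time_ns = 10.0       # rise time after cable attenuation (slide assumption)
tdc_lsb_ns = 0.1               # 100 ps

# Noise amplitude equal to one LSB at an analog ADC input.
adc_margin_mv = adc_full_range_mv / 2**adc_bits

# Noise amplitude that shifts an LVDS edge by one TDC LSB: edge slope x time.
edge_slope_mv_per_ns = lvds_diff_swing_mv / lvds_rise_time_ns
tdc_margin_mv = edge_slope_mv_per_ns * tdc_lsb_ns

margin_ratio = tdc_margin_mv / adc_margin_mv
print(f"ADC: {adc_margin_mv:.2f} mV, TDC: {tdc_margin_mv:.1f} mV, ratio: {margin_ratio:.1f}")
```

Under these assumptions the pulse transmission tolerates roughly an order of magnitude more cable noise than an analog signal would for the same one-LSB error.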
Bandwidth in and out of RAM
The external RAM is 16M x 16 bits, running at 80 MHz (write or read). The internal Data Merging RAM organizes the data into time-ordered records of 64 samples each. During the 4.8 ms time window covering the beam spill, data records are written into the external RAM. The occupancy of the RAM port is (2 MHz x 32)/80 MHz = 80%. All time-ordered records are sent through Dynamic Decimation and Huffman Coding as Supernova Data and output via the Ethernet Interface. Outside the 4.8 ms time window, the potential Accelerator Neutrino Events stored in the external RAM are read out for Huffman coding and sent out via Ethernet. The occupancy during readout is (2 MHz x 32 x 4.8 ms)/((1/15 Hz - 4.8 ms) x 80 MHz) = 6%.
[Figure: data flow. Data From TDC (2 MHz x 32 x 16 bits) goes to the Data Merging RAM, then through Dynamic Decimation and Huffman Coding as Supernova Data, and through External Memory (4.8 ms x 15 Hz) and Huffman Coding as Accelerator Neutrino Events; both go to the Ethernet Interface.]
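The two occupancy figures can be reproduced directly from the numbers on this slide. The buffer size check at the end is an added sanity check that assumes "16M" means 16 x 2^20 words.

```python
sampling_rate_hz = 2e6
channels = 32
ram_rate_hz = 80e6            # one 16-bit word per cycle, write or read
spill_window_s = 4.8e-3
spill_rate_hz = 15

# Writing during the spill window: one word per sample per channel.
write_occupancy = sampling_rate_hz * channels / ram_rate_hz

# Reading the stored window back during the rest of the spill period.
words_per_spill = sampling_rate_hz * channels * spill_window_s
readout_time_s = 1 / spill_rate_hz - spill_window_s
read_occupancy = words_per_spill / (readout_time_s * ram_rate_hz)

# Sanity check (assumption: 16M = 16 * 2**20 words).
ram_words = 16 * 2**20
fits_in_ram = words_per_spill <= ram_words

print(f"write {write_occupancy:.0%}, read {read_occupancy:.1%}, fits: {fits_in_ram}")
```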
External RAM Write/Read Timing
The RAM parameters and occupancies are the same as on the previous slide: data records are written into the external RAM during the 4.8 ms window covering the beam spill (80% port occupancy) and read out between spills (6% occupancy).
[Figure: timing diagram with rows Spill, Write, Read, Ethernet (Neutrino Events) and Ethernet (Supernova Data). The diagram is not to scale.]
Bandwidth out of Ethernet
The raw data rate for 32 channels is 2 MHz x 16 bits x 32 = 1024 Mbit/s.
The total compression ratio of Dynamic Decimation + Huffman Coding is 1/60 to 1/100.
The supernova data rate output via Ethernet is 1024 Mbit/s x 1/60 x (5/4) = 21 Mbit/s, i.e., 21% of the Fast Ethernet capacity.
The data in the 4.8 ms window covering each 15 Hz accelerator spill amount to 1024 Mbit/s x 4.8 ms x 15 Hz = 73 Mbit/s.
The compression ratio of the Huffman Coding is 1/10.
The accelerator neutrino event data rate output via Ethernet is 73 Mbit/s x 1/10 x (5/4) = 9.2 Mbit/s, i.e., 9.2% of the Fast Ethernet capacity.
The total Ethernet occupancy, i.e., (supernova data) + (accelerator neutrino data), is 21% + 9% = 30%.
Fast Ethernet is sufficient to output the supernova data and the data from all spills. PMT triggers can be applied to further reduce the accelerator neutrino data. The PMT trigger may not significantly reduce the data rate on the FE link, but it will help the DAQ. A detailed discussion is beyond the scope of this document.
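The bandwidth budget above can be written as a small function so that the effect of the compression ratio is easy to explore. The function name and return layout are illustrative; the 5/4 factor is applied exactly as on the slide, without assuming what it represents.

```python
def ethernet_budget(supernova_compression=1 / 60, spill_compression=1 / 10):
    """Return (raw, supernova, accelerator, total) rates in Mbit/s for one 32-channel card."""
    raw_mbps = 2e6 * 16 * 32 / 1e6          # 2 MHz x 16 bits x 32 channels
    overhead = 5 / 4                        # factor used on the slide
    supernova_mbps = raw_mbps * supernova_compression * overhead
    spill_fraction = 4.8e-3 * 15            # 4.8 ms window at 15 Hz
    accelerator_mbps = raw_mbps * spill_fraction * spill_compression * overhead
    return raw_mbps, supernova_mbps, accelerator_mbps, supernova_mbps + accelerator_mbps


raw, supernova, accelerator, total = ethernet_budget()
occupancy = total / 100.0                   # fraction of the 100 Mbit/s Fast Ethernet link
print(f"supernova {supernova:.1f}, accelerator {accelerator:.1f}, occupancy {occupancy:.0%}")
```

Calling `ethernet_budget(supernova_compression=1 / 100)` gives the lower end of the quoted compression range, where the supernova stream drops to about 13 Mbit/s.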
Possible Output Buffer for Supernova
It is unlikely that an output buffer for supernova data is needed. If one is necessary, however, the external RAM can serve as the buffer, as shown by the dashed lines.
[Figure: data flow with labels Data From TDC (2 MHz x 32 x 16 bits), Data Merging RAM, Dynamic Decimation, External Memory (4.8 ms x 15 Hz), Huffman Coding, Accelerator Neutrino Events, Supernova Data and Ethernet Interface; timing rows Spill, Write, Read, Supernova data buffer Write and Supernova data buffer Read.]
Why do we want to take this risk?
Digitization at the feed through is a good way to study possibilities for improving noise performance, allowing longer cable runs and reducing cost for MicroBooNE. The cold ASIC for DUSEL or LBNE will host the ADC and data compression, and the R&D on digitization at the feed through can provide useful experience for the cold ASIC. For example, it is useful to know how to place functions that traditionally need computer access for register setting, monitoring and diagnosis in remote and inaccessible locations.