Digital Correlator Design Using Virtex-II FPGAs
  Zuo Yingxi, Yao Qijun, Lin Zhenhui
  Purple Mountain Observatory, Nanjing 210008, China
  Yx.zuo@mwlab.pmo.ac.cn
  Nov. 4, 2003
Outline
  Brief Review of Digital Correlators
  Principle & General Structure of a 2-Bit Digital Correlator
  Virtex-II Series FPGAs
  Design & Implementation
  Experiment
  Conclusion
A Brief Review of Digital Correlators
  Applications:
    Single-dish telescope: sample and quantize the IF signal, calculate Rxx, FFT → spectrum
    Array, VLBI: sample the IF outputs from 2 telescopes → calculate the cross-correlation Rxy
    Mm-wave VLBI correlator requirement: broad band
  2 types: XF and FX
  2-bit vs. 1-bit quantization efficiency (affecting SNR):
    1-bit (two-level) quantization: 0.64 when sampling at the Nyquist rate, 0.74 at 2× the Nyquist rate
    2-bit (four-level) quantization: 0.87 when sampling at the Nyquist rate
  So 2-bit quantization is more efficient, especially for VLBI
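The efficiency figures above can be approximated from the standard closed-form expressions for quantized correlators in the weak-correlation limit at Nyquist sampling. The sketch below is not taken from the slides: the threshold `v0 = 0.996` (in units of the input RMS) and the outer-level weight `n = 3` are assumed typical values, and the function names are hypothetical.

```python
import math

def efficiency_1bit():
    """Two-level quantization at Nyquist sampling: 2/pi."""
    return 2.0 / math.pi

def efficiency_2bit(v0=0.996, n=3.0):
    """Four-level quantization at Nyquist sampling.

    v0: decision threshold in units of the input RMS (assumed value).
    n:  weight of the outer levels relative to the inner ones (assumed value).
    """
    e = math.exp(-v0 * v0 / 2.0)
    phi = math.erf(v0 / math.sqrt(2.0))  # probability that |x| < v0
    return (2.0 / math.pi) * (1.0 + (n - 1.0) * e) ** 2 / (phi + n * n * (1.0 - phi))

print(round(efficiency_1bit(), 2))
print(round(efficiency_2bit(), 2))
```

Under these assumptions the two values come out at about 0.64 and 0.88, the latter close to the 0.87 quoted on the slide.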
Brief Review of Current Digital Correlators
  Haystack CMOS Correlator
  GBT 256K Spectrometer
  COBRA Correlator System
  Digital Spectrometer for Nobeyama 45-m Telescope
  S500C128A Digital Correlator Chip from Spaceborne Inc.
  UWBC (Ultra-Wide Band Correlator) for NMA
  ACSIS (Auto-Correlation Spectrometer and Image System) at JCMT
  ALMA Baseline Correlator
Haystack CMOS Correlator
  Complex or real cross/auto correlation supported
  Up to 64 MSPS sampling rate
  2-bit, 4-level arithmetic supported
  512 lags, with many configuration options
  24-bit accumulator stages
  Fully cascadable
  Integration can continue while data is output
  CMOS-compatible I/O
  4 input channels, can handle demultiplexed data
  Applications: the SMA, the Westerbork Array, the NASA/MIT MKIV VLBI processor, the Joint European VLBI processor
  Report date: Sep. 15, 1998
GBT 256K Spectrometer
  Up to 800 MHz bandwidth (1.6 GSPS), down to 12.5 MHz
  Up to 16384 lags, 49 kHz resolution at 800 MHz
  3-level up to 800 MHz, 9-level below 50 MHz BW
  Using “Quaint” correlator chips:
    Cross/auto correlation supported
    1024 lags
    100 MSPS sampling rate
    3-level or 2-level
    33-bit accumulation stage for 3-level operation
    32-bit 3-state asynchronous output port
    Data and control signals cascadable
    Integration can continue while data is output
    CMOS- and TTL-compatible inputs
  Report date: Feb. 13, 1995
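The quoted resolution follows from dividing the bandwidth by the number of lags. A minimal sketch of that arithmetic, assuming one spectral channel per lag across the full band (the function name is hypothetical):

```python
def channel_spacing_hz(bandwidth_hz, lags):
    """Spectral channel spacing, assuming one channel per lag across the band."""
    return bandwidth_hz / lags

bandwidth = 800e6                      # 800 MHz mode quoted on the slide
sampling_rate = 2 * bandwidth          # Nyquist sampling: 1.6 GSPS
spacing = channel_spacing_hz(bandwidth, 16384)

print(sampling_rate / 1e9, "GSPS")
print(round(spacing / 1e3, 1), "kHz")
```

The spacing comes out just under 49 kHz, consistent with the figure on the slide.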
COBRA Correlator System
  For the Millimeter-Wavelength Array (MWA) at Owens Valley Radio Observatory (OVRO)
  Digitizer:
    1 GSPS sampling rate
    2-bit
    Serial-to-parallel converter (1:8 @ 125 MHz, or 1:16 @ 62.5 MHz)
  Correlation:
    Using FPGAs (Altera 100KA or 100KE) for calculating the average cross-correlation
    62.5 MHz clock
    10 FPGAs on a correlator card
    DSP for real-time control and FFT
  Report date: 04/16/1996
Digital Spectrometer for Nobeyama 45-m Telescope
  For the 25-Beam Array Receiver System (BEARS)
  Using the A/D of a digital oscilloscope, 1 GSPS, 2-bit
  1024-lag autocorrelation (with 32 LSIs of 32 lags each, connected in cascade)
  Report date: 2000
S500C128A Digital Correlator Chip from Spaceborne Inc.
  500 MHz effective signal bandwidth
  500 MHz clock frequency
  2-bit, 4-level
  128 lags/chip
  22 s max. integration time
  100 µs min. readout time
  TTL level for computer interface
  Differential ECL I/O for clock and digital data
  Single 3.3 V power supply
  Fully cascadable
  S500S1024 available
  Application: JPL Digital Autocorrelator Spectrometer
  Report date: April 2000
UWBC (Ultra-Wide Band Correlator) for NMA
  2 GSPS sampling rate (1024-MHz BW)
  2-bit, 4-level A/D, ECL output
  With 1:64 demultiplexer, 2 GHz/64 = 32 MHz
  Using a simple “bit-match” method to calculate the 2-bit cross-correlation
  Special-purpose LSI (UWBC2) operating at 32 MHz
  Integration time from 0.1 s up to 1.0 s with 24-bit accumulator
  256 lags
  Application: for the Nobeyama Millimeter Array (NMA)
  Report date: 2000
ACSIS (Auto-Correlation Spectrometer and Image System) at JCMT
  A/D:
    2 GSPS
    2-bit, 3-level
  Correlator:
    Using “Quaint” correlator chips
    Consists of 32 correlation modules, each carrying 32 “Quaint” ICs
    Similar to the NRAO/GBT correlator
  Currently in the design and development stage (as of 3 Jan. 2002)
ALMA Baseline Correlator
  A/D:
    4-GHz sampling rate, 2-GHz BW
    3-bit, 8-level
    Transmitted over fiber-optic cables to the correlator
    FIR filter applied, then requantized to 2-bit, 4-level
  Correlator: 2-bit, 4-level
  Schedule:
    The minimally populated correlator will be complete by April 2002
    Will be working in the lab by the end of 2002
    Will be delivered to the VLA site in May 2003
Characteristics of the above correlators
  High sampling rate
  Using 2-bit ADC
  Ultra-high-speed sampling; serial-to-parallel conversion; relatively low correlation speed
  ASICs; general programmable logic devices
  More lags
  Hybrid design
  Cross- & auto-correlation
Principle & General Structure of a 2-Bit Digital Correlator
  [Block diagram; inputs labeled "From Rx.1 IF" and "From Rx.2 IF"]
Parallel correlation arithmetic and diagram (1)
  Sampling two input signals X(t) and Y(t), i = 0, 1, 2, …, 4N-1
  After 1:4 serial-to-parallel conversion, n = 0, 1, 2, …, N-1
Arithmetic of the parallel correlation and the diagram (2)
  j = 0, 1, 2, …, M-1
Parallel-correlation diagram
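A minimal sketch of the idea behind the parallel arithmetic, assuming the lag-j correlation is the plain sum of products x(i)·y(i+j) over i = 0 … 4N-1 and that, after 1:4 conversion, the four products of each word are summed and fed to a single accumulator per lag. The function names, the 4-level weights (-3, -1, +1, +3), and the test data are illustrative assumptions, not the authors' design.

```python
import random

WAYS = 4  # 1:4 serial-to-parallel conversion, as on the slides

def serial_correlation(x, y, lags):
    """R[j] = sum_i x[i] * y[i + j], one product per fast clock."""
    return [sum(x[i] * y[i + j] for i in range(len(x))) for j in range(lags)]

def parallel_correlation(x, y, lags, ways=WAYS):
    """Same sums, but `ways` products per slow clock feed one accumulator per lag."""
    n_words = len(x) // ways
    acc = [0] * lags
    for n in range(n_words):                  # one slow-clock cycle per word
        for j in range(lags):
            # adder tree: combine the products of the parallel samples
            partial = sum(x[ways * n + p] * y[ways * n + p + j] for p in range(ways))
            acc[j] += partial                 # single accumulator for this lag
    return acc

random.seed(0)
N, M = 16, 8                                  # N words of 4 samples, M lags
levels = (-3, -1, 1, 3)                       # assumed 4-level weights
x = [random.choice(levels) for _ in range(WAYS * N)]
y = [random.choice(levels) for _ in range(WAYS * N + M - 1)]

assert serial_correlation(x, y, M) == parallel_correlation(x, y, M)
```

The two forms give identical sums; the parallel one only needs to run at a quarter of the sample rate, which is the point of the scheme.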
Virtex-II series FPGA (Field Programmable Gate Array)
  A wide-band digital correlator needs high-speed, large-scale logic devices
  FPGA advantages:
    Very large-scale IC, up to 10 million gates in one chip
    Very high speed, up to 410 MHz
    Rapid time-to-production
      Design with an FPGA is just “configuration”
    High reliability
      Chip quality guaranteed by the manufacturer
      Verified by many other applications
Virtex-II series FPGA
Design & Implementation
  Using the “design entry” tool of Xilinx F4.1i software
  XC2V2000 FPGA chip as the target device (2 million gates)
  Goal: 64 lags, 250 MHz clock rate
  Schematic diagrams:
    Top-level schematic
    Control module
    4-lag correlation module for 4-parallel inputs
    Elementary 1-lag correlation
Top-level Schematic
  Using “design entry” of Xilinx F4.1i software
Control module
4-Lag correlation module for 4-parallel input
Elementary 1-Lag Correlation
Implementation
  Map report:
    Target Device : x2v2000
    Number of Slices:               5,505 out of 10,752   51%
    Number of Slice Flip Flops:     8,702 out of 21,504   40%
    Total Number 4 input LUTs:      4,402 out of 21,504   20%
    Number of bonded IOBs:            105 out of    408   25%
    Number of Tbufs:                2,176 out of  5,376   40%
  Timing report:
    NET "CLK" PERIOD = 4 nS (constraint);
    10923 items analyzed, 0 timing errors detected.
    Minimum period is 3.909ns.
Simulation (1)
Simulation (2)
Simulation Results
  Correct
  Meets the design goal:
    64 lags
    250 MHz clock rate
  Equivalent input signal bandwidth can be up to 500 MHz (due to the parallel correlation scheme)
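The link between the reported minimum period, the 250 MHz goal, and the 500 MHz equivalent bandwidth can be checked with a few lines of arithmetic. This is a sketch with hypothetical function names; it assumes Nyquist sampling, so that bandwidth is half the equivalent sampling rate.

```python
def max_clock_mhz(min_period_ns):
    """Highest clock rate allowed by a given minimum period."""
    return 1000.0 / min_period_ns

def equivalent_sampling_rate_msps(clock_mhz, ways=4):
    """Sample rate handled when `ways` samples are processed per clock."""
    return clock_mhz * ways

fmax = max_clock_mhz(3.909)                   # minimum period from the timing report
rate = equivalent_sampling_rate_msps(250.0)   # 4-parallel inputs at the 250 MHz goal
bandwidth = rate / 2                          # Nyquist sampling assumed

print(round(fmax, 1), "MHz maximum clock")
print(rate, "MSPS,", bandwidth, "MHz bandwidth")
```

The reported 3.909 ns minimum period corresponds to a maximum clock a little above the 250 MHz target, and four samples per clock at 250 MHz give 1 GSPS, that is, 500 MHz of bandwidth.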
Experiment
  Based on the V2LC1000 Demo Board
  The Demo Board:
    XC2V1000 FPGA chip
    PROM memory chip
    100 MHz/24 MHz clock
    JTAG download connector and cable
    DDR memory chip
    RS232 connector
    User I/O connectors
    Voltage generators
Experiment
  To achieve 32-lag auto-correlation
  Using the on-board 100 MHz clock
  Input signal generated by the same FPGA chip
  Controlled by a PC:
    Set the integration time
    Read the correlation results
    Via the printer port
Experiment --- the input signal
  Assuming an input signal as
  Quantized by a 2-bit ADC, giving a 2-bit series (3, 3, 0, 1, 0, 2, 0, 0, 3, 3, 1, 1, 2, 3, 2, 0, 2, 2, 1, 1), …
  Passed through a 1:4 demultiplexer, giving an 8-bit series (79, 8, 95, 46, 90), …
  This signal generator was implemented in the same FPGA
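The 8-bit series above is reproduced if each group of four consecutive 2-bit samples is packed into one byte with the earliest sample in the two least significant bits. That bit ordering is inferred from the numbers on the slide rather than stated there, and the function name is hypothetical.

```python
def demux_1_to_4(samples):
    """Pack each group of four 2-bit samples into one 8-bit word.

    The first sample of a group goes into bits 1..0 and the fourth into
    bits 7..6 (ordering inferred from the slide's example values).
    Trailing samples that do not fill a whole word are dropped.
    """
    words = []
    usable = len(samples) - len(samples) % 4
    for k in range(0, usable, 4):
        s0, s1, s2, s3 = samples[k:k + 4]
        words.append(s0 | (s1 << 2) | (s2 << 4) | (s3 << 6))
    return words

two_bit = [3, 3, 0, 1, 0, 2, 0, 0, 3, 3, 1, 1, 2, 3, 2, 0, 2, 2, 1, 1]
print(demux_1_to_4(two_bit))  # [79, 8, 95, 46, 90]
```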
Slightly revised Top-level Schematic
Experiment --- result
  [Comparison of results: SIM (simulation), CAL (calculation), MEA (measurement)]
  Simulation is performed with Xilinx F4.1i software
  Experimental results agree well with the simulation and calculation results
Experiment --- result
  Measured auto-correlation function
    Red line --- measured data
    Blue line --- calculated from the unquantized input signal
  Power spectrum obtained by FFT
    Red line --- from the measured data
    Blue line --- calculated from the unquantized input signal
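Going from a measured auto-correlation function to a power spectrum can be sketched as follows. This is an illustration, not the authors' processing: it uses a direct cosine sum in place of an FFT (the two agree for a real, symmetric auto-correlation function), applies no window and no quantization correction, and the test signal and names are assumptions.

```python
import math

def spectrum_from_acf(r):
    """Power spectrum from lags r[0..M-1] of a real auto-correlation function.

    Channel k corresponds to frequency k * fs / (2 * M), where fs is the
    sampling rate. Direct cosine transform; no windowing applied.
    """
    m = len(r)
    spec = []
    for k in range(m):
        s = r[0]
        for j in range(1, m):
            s += 2.0 * r[j] * math.cos(math.pi * j * k / m)
        spec.append(s)
    return spec

M = 32                                        # 32 lags, as in the experiment
k0 = 8                                        # assumed test tone, centered on channel 8
acf = [math.cos(math.pi * j * k0 / M) for j in range(M)]
spec = spectrum_from_acf(acf)
peak = max(range(M), key=lambda k: spec[k])
print(peak)  # 8
```

For a pure tone the spectrum peaks in the channel matching the tone frequency, which gives a quick sanity check before comparing measured and calculated curves.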
Conclusion
  A 64-lag correlator has been designed and simulated with the XC2V2000 (2M-gate FPGA chip); it is fully cascadable to achieve more lags.
  Speed meets the design requirement (250 MHz verified by timing simulation, 100 MHz verified by experiment). Results are correct.
  Using only 1 accumulator for the 4-parallel-input correlation is an efficient method (it saves circuit resources).
  Serial-to-parallel conversion lowers the speed required of the correlating device (equivalently, it increases the input bandwidth of the overall correlator).
  With this correlation module, the sampling rate can be up to 1 GSPS (input bandwidth up to 500 MHz).
  A 32-lag auto-correlator experiment has been performed.
Thanks