1 M. Kirchgessner TWEPP, Manfred Kirchgessner on behalf of the DSSC Collaboration DSSC = DEPFET Sensor with Signal Compression
2 Outline: The European XFEL; DSSC system overview; Readout chain implementation; Implementation details; Summary
3 The European XFEL
4 EuXFEL construction site at Hamburg
Three 2D detector developments at the European XFEL (coordinator: M. Kuster):
Adaptive Gain Integrating Pixel Detector Consortium (AGIPD), project leader: H. Graafsma
Large Pixel Detector Consortium (LPD), project leader: M. French
DEPFET Sensor with Signal Compression Consortium (DSSC), project leader: M. Porro
5 EuXFEL – bunch structure and readout
The EuXFEL runs in pulsed operation mode:
Bunch train (macro bunch) repetition rate of 10 Hz
Sequences of ~2700 pulses per train
Minimum pulse spacing of 220 ns (maximum frame rate 4.5 MHz)
~100 fs wide X-ray pulses (exposure time)
99.4 ms pause between macro bunches, available for data readout
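The timing figures above are mutually consistent, which a short calculation can show. The following is a minimal sketch using only the numbers quoted on the slide; the function and constant names are illustrative, and the pulse count is the approximate value of 2700.

```python
# Figures quoted on the slide; the pulse count is approximate.
TRAIN_RATE_HZ = 10            # bunch train repetition rate
PULSES_PER_TRAIN = 2700       # pulses per train (approximate)
MIN_PULSE_SPACING_S = 220e-9  # minimum distance between two pulses


def train_timing(train_rate_hz=TRAIN_RATE_HZ,
                 pulses=PULSES_PER_TRAIN,
                 spacing_s=MIN_PULSE_SPACING_S):
    """Derive frame rate, train length and readout pause from the slide values."""
    period_s = 1.0 / train_rate_hz        # 100 ms between train starts
    train_length_s = pulses * spacing_s   # time during which frames arrive
    return {
        "frame_rate_mhz": 1e-6 / spacing_s,
        "train_length_ms": train_length_s * 1e3,
        "pause_ms": (period_s - train_length_s) * 1e3,
    }


if __name__ == "__main__":
    for name, value in train_timing().items():
        print(f"{name}: {value:.3f}")
```

With these inputs the train lasts about 0.6 ms, which leaves the roughly 99.4 ms gap quoted on the slide.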
6 The DSSC Detector
7 Sensor and focal plane architecture
DEPFET with non-linear characteristic: silicon detector with internal gate; intrinsic low noise due to small internal gate capacitance; intrinsic signal compression
Focal plane composition: 1024 x 1024 pixels; 32 monolithic sensors; each sensor bump-bonded to 8 readout ASICs; dead area ~15%
Power cycling: 10.7 kW peak power; 240 W average power
Readout concept: fully parallel readout; analogue shaping using a trapezoidal filter; in-pixel 9 bit ADC; in-pixel SRAM memory (800 frames)
Focal plane: 248 x 240 mm (© image by Karsten Hansen)
8 DSSC – Design Parameters
General parameters:
Energy range: optimized for 0.5 … 6 keV
Number of pixels: 1024 x 1024
Sensor pixel shape: hexagonal
Sensor pixel pitch: ~204 x 236 µm²
Dynamic range / pixel / pulse: ~ keV > E≥1 keV
Resolution: single photon detection 0.25 keV
Frame rate: … MHz
Stored frames per macro bunch: 800
Operating temperature: -20 °C optimum, room temperature possible
9 The DSSC high-throughput readout chain
10 DSSC ladder components: monolithic DEPFET sensor; mainboard; power regulator board; patch panel; flex cable; readout ASIC; I/O Board (1st FPGA stage); module interconnection board
11 The DSSC system overview (© image by Karsten Hansen; diagram label: second FPGA stage)
1 MPixel x 800 images x 2 bytes per pixel = 1600 MByte per 0.1 seconds
Total data production rate of the detector: 128 Gbit/s, or 32 Gbit/s per PPT
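The data volume arithmetic on this slide can be reproduced in a few lines. This is a sketch under stated assumptions: the count of four PPTs is not given explicitly but is implied by the ratio 128 Gbit/s to 32 Gbit/s, and the slide rounds 1024 x 1024 pixels down to 1 MPixel. All names are illustrative.

```python
BYTES_PER_PIXEL = 2
FRAMES_PER_TRAIN = 800
TRAIN_PERIOD_S = 0.1
NUM_PPT = 4  # assumption: implied by 128 Gbit/s total vs. 32 Gbit/s per PPT


def data_rate_gbit_s(pixels,
                     frames=FRAMES_PER_TRAIN,
                     bytes_per_pixel=BYTES_PER_PIXEL,
                     period_s=TRAIN_PERIOD_S):
    """Average detector output rate in Gbit/s (1 Gbit = 1e9 bit)."""
    bits_per_train = pixels * frames * bytes_per_pixel * 8
    return bits_per_train / period_s / 1e9


rounded_total = data_rate_gbit_s(1_000_000)   # slide value, 1 MPixel rounded
exact_total = data_rate_gbit_s(1024 * 1024)   # using the exact pixel count

if __name__ == "__main__":
    print(f"rounded: {rounded_total:.1f} Gbit/s, per PPT {rounded_total / NUM_PPT:.1f}")
    print(f"exact:   {exact_total:.1f} Gbit/s, per PPT {exact_total / NUM_PPT:.1f}")
```

With the exact pixel count the total comes out slightly higher (about 134 Gbit/s), so the 128 Gbit/s on the slide is best read as a rounded figure.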
12 DSSC DAQ Architecture
13 DSSC DAQ Architecture – ASIC
Readout ASIC: IBM 130 nm technology
4096 pixels per ASIC
In-pixel SRAM cells for bit words
One 10 bit serializer running at 350 MHz (400 MHz also successfully tested)
9 bit data + 1 bit parity
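The 9 bit data plus 1 bit parity word format, and the time needed to shift one bunch train out of an ASIC, can be sketched as follows. Several details are assumptions, since the slide does not state them: even parity, the parity bit placed in the least significant position, and one serial bit per 350 MHz clock cycle. The helper names are hypothetical.

```python
PIXELS_PER_ASIC = 4096
FRAMES_PER_TRAIN = 800      # in-pixel memory depth quoted for the detector
BITS_PER_WORD = 10          # 9 data bits + 1 parity bit
SERIAL_RATE_BIT_S = 350e6   # assumption: one bit per 350 MHz clock cycle


def encode_word(sample):
    """Append an even-parity bit to a 9 bit sample (bit layout is assumed)."""
    if not 0 <= sample < 512:
        raise ValueError("sample must fit in 9 bits")
    parity = bin(sample).count("1") & 1   # makes the total number of ones even
    return (sample << 1) | parity


def check_word(word):
    """Return True if a 10 bit word has even parity."""
    return bin(word & 0x3FF).count("1") % 2 == 0


def decode_word(word):
    """Drop the parity bit and return the 9 bit sample."""
    return (word & 0x3FF) >> 1


def asic_readout_time_s():
    """Time to serialize all stored frames of one ASIC."""
    bits = PIXELS_PER_ASIC * FRAMES_PER_TRAIN * BITS_PER_WORD
    return bits / SERIAL_RATE_BIT_S


if __name__ == "__main__":
    word = encode_word(0b101100111)
    print(check_word(word), check_word(word ^ 0b100))  # second word has a flipped bit
    print(f"readout time: {asic_readout_time_s() * 1e3:.1f} ms")
```

Under these assumptions a full readout takes about 93.6 ms, which fits into the 99.4 ms pause between macro bunches. A single parity bit detects any odd number of flipped bits in a word but cannot correct them.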
14 DSSC DAQ Architecture – ASIC
16 ASICs are connected to the first FPGA readout board (I/O Board)
Differential 350 MHz LVDS signals
Connection via wire bonds and traces on the PCB
15 DSSC DAQ Architecture – I/O Board
The I/O Board implements the first FPGA stage
FPGA: Spartan 6 LX 45 (xc6slx45t-3-csg324)
Combines the data from 16 ASICs into one data stream
Implements 3 high-speed serial Xilinx Aurora links
Additional capacitors for pulsed sensor supply
Temperature sensor
16 DSSC DAQ Architecture – ASIC
Xilinx Aurora protocol: … GHz form one channel
8b10b encoding and 32 bit cyclic redundancy check (CRC)
Error detection: all single-bit and most multi-bit errors
Effective usable data rate per lane is 2.5 Gbit/s
Parallel user interface in FPGA is … MHz = 7.5 Gbit/s
Flexible cable connection to the Patch Panel Transceiver (PPT) outside of the vacuum vessel: rigid-flex circuit board, 320 mm length
Figure: Aurora eye diagram
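The relation between the usable Aurora rate and the raw serial rate follows from the 8b10b overhead. The sketch below is a derivation rather than a quoted figure: the line rate is computed from the 2.5 Gbit/s payload rate, and three lanes per channel is taken from the "3 Aurora links" and "4 x 3 lanes" statements elsewhere in the talk. The comparison against 16 ASIC links at 350 Mbit/s is an added sanity check.

```python
LANES_PER_CHANNEL = 3          # from "4 x 3 lanes" on the PPT slide
PAYLOAD_PER_LANE_GBPS = 2.5    # effective usable rate quoted above
ASICS_PER_IO_BOARD = 16
ASIC_LINK_GBPS = 0.35          # 350 Mbit/s per ASIC link


def line_rate_gbps(payload_gbps):
    """8b10b sends 10 line bits for every 8 payload bits."""
    return payload_gbps * 10 / 8


def channel_payload_gbps(lanes=LANES_PER_CHANNEL,
                         payload_gbps=PAYLOAD_PER_LANE_GBPS):
    """Usable bandwidth of one multi-lane Aurora channel."""
    return lanes * payload_gbps


def io_board_input_gbps():
    """Aggregate input of one I/O Board while the ASICs are read out."""
    return ASICS_PER_IO_BOARD * ASIC_LINK_GBPS


if __name__ == "__main__":
    print(f"derived line rate per lane: {line_rate_gbps(PAYLOAD_PER_LANE_GBPS)} Gbit/s")
    print(f"channel payload: {channel_payload_gbps()} Gbit/s")
    print(f"I/O Board input: {io_board_input_gbps():.1f} Gbit/s")
```

The 7.5 Gbit/s channel therefore leaves headroom over the 5.6 Gbit/s arriving from the 16 ASICs.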
17 DSSC DAQ Architecture – PPT
The Patch Panel Transceiver (PPT) implements the main FPGA stage
FPGA: Kintex 7 325T (xc7k325t-ffg900-2)
Receives data from 4 I/O Boards over 4 x 3 lanes = 12 Aurora lanes
1 GByte high-speed DDR data buffer
4 x 10 Gbit/s Ethernet links that connect to a 40 Gbit/s QSFP
MicroBlaze µC running embedded Linux for slow control via 1 Gbit/s Ethernet
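The readout hierarchy described so far can be cross-checked against the pixel count of the focal plane. This is a small consistency sketch; the number of PPTs (four) is an assumption derived from the quoted 128 Gbit/s total versus 32 Gbit/s per PPT, and the constant names are illustrative.

```python
NUM_PPT = 4              # assumption: 128 Gbit/s total / 32 Gbit/s per PPT
IO_BOARDS_PER_PPT = 4
ASICS_PER_IO_BOARD = 16
PIXELS_PER_ASIC = 4096
LANES_PER_IO_BOARD = 3

SENSORS = 32
ASICS_PER_SENSOR = 8

# Count the ASICs along the readout chain and along the sensor layout.
asics_by_readout = NUM_PPT * IO_BOARDS_PER_PPT * ASICS_PER_IO_BOARD
asics_by_sensor = SENSORS * ASICS_PER_SENSOR
total_pixels = asics_by_readout * PIXELS_PER_ASIC
aurora_lanes_per_ppt = IO_BOARDS_PER_PPT * LANES_PER_IO_BOARD

if __name__ == "__main__":
    print(f"ASICs (readout view): {asics_by_readout}")
    print(f"ASICs (sensor view):  {asics_by_sensor}")
    print(f"pixels: {total_pixels} (1024 x 1024 = {1024 * 1024})")
    print(f"Aurora lanes per PPT: {aurora_lanes_per_ppt}")
```

Both ways of counting give 256 ASICs, and 256 x 4096 pixels equals the 1024 x 1024 pixel focal plane.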
18 PPT Firmware Details
19 PPT FPGA Firmware – Data rates
Data input: 22.4 Gbit/s
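The 22.4 Gbit/s input figure matches the sum of all ASIC links feeding one PPT. The sketch below reproduces it and compares it with the nominal capacities of the Aurora input and the Ethernet output. It uses only rates quoted in the talk; the capacity comparison itself is an added illustration and ignores protocol overhead on the Ethernet side.

```python
IO_BOARDS_PER_PPT = 4
ASICS_PER_IO_BOARD = 16
ASIC_LINK_GBPS = 0.35        # 350 Mbit/s per ASIC
AURORA_LANES_PER_PPT = 12
AURORA_PAYLOAD_GBPS = 2.5    # usable rate per lane
ETHERNET_LINKS = 4
ETHERNET_LINK_GBPS = 10.0    # nominal rate, protocol overhead not included


def ppt_budget():
    """Return input rate and nominal link capacities of one PPT in Gbit/s."""
    return {
        "asic_input": IO_BOARDS_PER_PPT * ASICS_PER_IO_BOARD * ASIC_LINK_GBPS,
        "aurora_capacity": AURORA_LANES_PER_PPT * AURORA_PAYLOAD_GBPS,
        "ethernet_capacity": ETHERNET_LINKS * ETHERNET_LINK_GBPS,
    }


if __name__ == "__main__":
    for name, gbps in ppt_budget().items():
        print(f"{name}: {gbps:.1f} Gbit/s")
```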
20 Implementation Details
21 PPT – FPGA connections (Kintex 7 325T)
Diagram labels: 1 GB DDR; Rx-Aurora; QSFP; Detector Slow-Control; DDR3-800; µC
22 Aurora – implementation details
Aurora IP core details:
Simplex core implemented (no back channel)
Streaming interface for easy data transmission
Timer used for the initialization sequence
Only one differential wire pair per lane required between the FPGAs
License comes with ISE
MGT usage:
One input clock can be connected to 3 GTX Quads
Each GTX transceiver can be driven by its own PLL (CPLL) or by the Quad PLL (QPLL)
CPLL (included in each GTX channel) for line rates 1.6 – 3.3 Gbit/s (used for Aurora)
QPLL (one per Quad) for line rates 5.93 – 12.5 Gbit/s (required for 10GigE)
Each Aurora channel is distributed over 3 Quads, 1 lane per Quad
Signal quality can be improved by optimizing swing and pre-emphasis settings
Diagram: three Quads with four channels each (Chan1 to Chan4)
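The PLL choice described above depends only on the serial line rate, which can be expressed as a small lookup. This is a sketch using the two ranges quoted on the slide; the function name is hypothetical, the Aurora line rate of 3.125 Gbit/s is derived from the 2.5 Gbit/s payload rate under 8b10b encoding, and 10.3125 Gbit/s is the standard 10GBASE-R serial rate.

```python
# Line-rate ranges in Gbit/s as quoted on the slide.
CPLL_RANGE = (1.6, 3.3)
QPLL_RANGE = (5.93, 12.5)


def select_pll(line_rate_gbps):
    """Pick the GTX PLL type whose quoted range covers the requested line rate."""
    if CPLL_RANGE[0] <= line_rate_gbps <= CPLL_RANGE[1]:
        return "CPLL"
    if QPLL_RANGE[0] <= line_rate_gbps <= QPLL_RANGE[1]:
        return "QPLL"
    raise ValueError(f"{line_rate_gbps} Gbit/s is outside both quoted ranges")


if __name__ == "__main__":
    print("Aurora lane:", select_pll(3.125))    # derived rate, see lead-in
    print("10GigE lane:", select_pll(10.3125))  # 10GBASE-R serial rate
```

Because the Aurora lanes run from the per-channel CPLLs, the single QPLL of each Quad stays free for the 10GigE links.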
23 DDR implementation details
IP core version: Xilinx DRAM controller MIG 7 v1.9
License comes with ISE
Interface: 4 DDR3 modules with 16 bit width = 64 bit data at 800 MHz
On the firmware (user) side: 512 bit data at 200 MHz, single data rate
Running in burst mode of up to 256 words x 512 bits
Alternating read and write bursts to minimize latency
In alternating read/write mode the maximum bandwidth achieved is 88 Gbit/s
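The memory-side and user-side figures describe the same peak bandwidth, and the measured 88 Gbit/s can be put in relation to it. The sketch below assumes that the 800 MHz value is the DDR3 I/O clock with two transfers per cycle, which is what makes the two interfaces match; the names are illustrative.

```python
DDR_BUS_BITS = 64
DDR_CLOCK_HZ = 800e6
TRANSFERS_PER_CLOCK = 2   # assumption: double data rate on the memory bus
USER_BUS_BITS = 512
USER_CLOCK_HZ = 200e6
MAX_BURST_WORDS = 256
MEASURED_GBPS = 88.0      # alternating read/write bursts, from the slide


def peak_memory_gbps():
    """Theoretical peak of the DDR3 interface."""
    return DDR_BUS_BITS * DDR_CLOCK_HZ * TRANSFERS_PER_CLOCK / 1e9


def peak_user_gbps():
    """Theoretical peak of the 512 bit user interface."""
    return USER_BUS_BITS * USER_CLOCK_HZ / 1e9


def burst_bytes(words=MAX_BURST_WORDS):
    """Size of one burst on the user interface."""
    return words * USER_BUS_BITS // 8


if __name__ == "__main__":
    peak = peak_memory_gbps()
    print(f"peak memory side: {peak:.1f} Gbit/s, user side: {peak_user_gbps():.1f} Gbit/s")
    print(f"efficiency at 88 Gbit/s: {MEASURED_GBPS / peak:.1%}")
    print(f"largest burst: {burst_bytes()} bytes")
```

Under this assumption both sides peak at 102.4 Gbit/s, so the measured 88 Gbit/s corresponds to roughly 86% of the theoretical maximum.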
24 QSFP – implementation details
IP core: adaptations required to support four 10GigE channels; single links can be generated directly
License required
System tests:
System was tested using a standard desktop PC with a 1 x 10GigE PCI Express SFP receiver card (single link tested via breakout cable)
8 kB UDP packets can be received at ~10 Gbit/s without loss after some optimizations:
Linux driver adaptations of buffer sizes
Data receiving moved into a separate CPU thread
No data stored, just copied from the buffer and checked
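The receiver-side optimizations listed above (larger buffers, a dedicated receive thread, checking without storing) can be illustrated with a small UDP receiver. This is not the test software used by the authors, only a sketch of the same idea in Python; the function name, the buffer size and the size-only check are assumptions, and a Python loop will not reach 10 Gbit/s.

```python
import socket
import threading

PACKET_SIZE = 8192  # 8 kB UDP payload, as in the test described above


def start_receiver(host="127.0.0.1", port=0, rcvbuf_bytes=4 * 1024 * 1024):
    """Start a background thread that receives and checks UDP packets."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Request a larger kernel receive buffer; the OS may cap the value.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    sock.bind((host, port))
    sock.settimeout(0.2)  # lets the loop notice the stop flag
    stats = {"packets": 0, "bytes": 0, "bad": 0}
    stop = threading.Event()

    def loop():
        buf = bytearray(65536)  # reused for every packet, nothing is stored
        while not stop.is_set():
            try:
                n = sock.recv_into(buf)
            except socket.timeout:
                continue
            stats["packets"] += 1
            stats["bytes"] += n
            if n != PACKET_SIZE:  # minimal check: expected payload size
                stats["bad"] += 1

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return sock, thread, stop, stats
```

A sender on the same machine can be used to exercise the receiver: bind with `port=0`, read the chosen port from `sock.getsockname()[1]`, send a few 8192 byte datagrams, then set the stop event and join the thread before closing the socket.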
25 PPT – board details (figure: PPT top view)
Board details:
~5000 € per board
14 layers
Size: 80 x 160 mm
Supplied by 12 V / 17 W
9 different supply levels: 12 V + 2 x 1.0 V, 1.2 V, 1.5 V, 1.8 V, 2.0 V, 2.5 V, 3.3 V
Booting and update:
Boot chain for successive power-up of all required voltages
FPGA and Linux boot automatically from SPI-connected flash memory
Firmware and Linux flash can be reprogrammed from the MicroBlaze
Reboot can be triggered remotely
I/O Board FPGAs are programmed by the PPT
System is ready after ~5 min
Full system update possible remotely
26 PPT – board details (figure: PPT top view)
Debugging:
Xilinx JTAG programmer cable for early debugging
Xilinx virtual cable implemented: Xilinx ChipScope access to all I/O Board FPGAs and the PPT FPGA remotely via Ethernet
USB FTDI interface (Linux boot output)
Debug access is available even when installed in vacuum
27 Summary
Differential links at 350 Mbit/s
Aurora at … Gbit/s over 30 cm LVDS on flex cable runs reliably
Aurora lanes can be distributed to different GTX Quads
The 10 Gbit/s link works well at >90% of its nominal speed
DDR3 at 800 MHz works out of the box if the hardware timings are known
Outlook:
First X-ray beam: 2015
First DSSC ladder camera (65k pixels): 2015
Full DSSC 1M pixel camera: 2017
28 The DSSC Consortium
M. Porro 1, L. Andricek 2, S. Aschauer 3, M. Bayer 4, A. Castoldi 4,5, D. Comotti 6, M. Donato 7, F. Erdinger 8, C. Fiorini 4,5, P. Fischer 8, H. Graafsma 9, C. Guazzoni 4,5, K. Hansen 9, P. Kalavakuru 9, H. Klaer 9, M. Kirchgessner 8, A. Kugel 8, M. Kuster 7, P. Lechner 3, G. Lutz 3, P. Majewski 3, M. Manghisoni 6, D. Moch 1, B. Nasri 4, S. Nidhi 7, V. Re 6, C. Reckleben 9, R. Richter 2, S. Schlee 7, J. Soldat 8, L. Strueder 8, J. Szymanski 9, M. Turcato 7, G. Weidenspointner 7, C. Wunderer 9
1) Max Planck Institut fuer Extraterrestrische Physik, Garching, Germany
2) MPG Halbleiterlabor, Muenchen, Germany
3) PNSensor GmbH, Muenchen, Germany
4) Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milano, Italy
5) Sezione di Milano, Italian National Institute of Nuclear Physics (INFN), Milano, Italy
6) Dipartimento di ingegneria industriale, Università di Bergamo, Bergamo, Italy
7) European XFEL GmbH, Hamburg, Germany
8) Zentrales Institut für Technische Informatik, Universitaet Heidelberg, Heidelberg, Germany
9) Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany
30 PPT – FPGA utilization (FPGA: Kintex 7 325T, xc7k325t-ffg900-2)
Logic             Used   Available   Ratio
Slice Registers   …      …           … %
Slice LUTs        …      …           … %
Occupied slices   …      …           … %
RAMB36/FIFO       …      …           … %
RAMB18/FIFO       …      …           … %
GTXE2_CHANNELS    16     16          100%
31 PPT – FPGA logic distribution
32 Used IP cores
● Xilinx IP cores plus self-written wrapper code (Verilog):
Aurora 8B10B v8.3, Rx and Tx
FIFO Generator v9.3 (no wrapper required)
1 GB DDR memory controller, MIG 7 Series 1.9
Ethernet MAC + PHY (v PCS/PMA 2.6)
● Xilinx EDK was used to implement the MicroBlaze at 100 MHz:
Running a BusyBox Linux
Discrete 256 MB DDR3-800 DRAM controller
Gigabit Ethernet controller
Only for slow control
33 Kintex 7 GTX Quad – QPLL
The application requires a specific reference clock frequency
34 DSSC – implemented data rates