Large Pixel Detector LPD John Coughlan STFC Rutherford Appleton Laboratory on behalf of LPD collaboration
Project Team Technical Team: STFC Rutherford Appleton Laboratory Glasgow University Surrey University Science Team: UCL Daresbury Bath University others …
Work Packages
One overall project manager: Marcus French
- WP1: Sensors
- WP2: Front End ASIC (Stephen Thomas)
- WP3: Mechanical Design
- WP4: On Detector Electronics
- WP5: Front End Readout (John Coughlan)
- WP6: Software, Controls (Tim Nicholls)
- WP7: Flat Geometry Design (Glasgow)
Approved ISO9000 project management system
Project Schedule 1) Phase 1: Develop a digitising pipelined XFEL 2D pixel detector (1k by 1k pixels) 3Y Project started Feb 2008 2) Phase 2: Construct complete XFEL instruments (~16 Mpix) as required before XFEL operation 2013 Mapped to XFEL Science areas tbd
Phase 1 Prototype Detector 1M Pixel Hybrid Pixel Detector 0.5 x 0.5 mm pixels MSCs readout Cards on rear 4 x 4 Super Modules Silicon Sensor Array Tiled Geometry
Module Concept Sensor Tile with mounted ASICs 4 K pixels High resistivity Silicon 500 micron thickness 200 V bias 0.5 x 0.5 mm Pixels
XFEL ASIC New design matched to the XFEL: - Time Structure - Dynamic Range - Channel Count - System Interfaces etc. Implemented in 0.13 micron CMOS
Multi-gain concept Required dynamic range compression Experience with calorimetry at CERN Relaxes ADC requirements Fits with CMOS complexity
XFEL ASIC
- Store in pipeline during bunch train, read out during long gap
- 512 channels
- Analogue pipelines: 512 samples deep x 3 gain stages
- Deep pipeline memory (786,432 samples)
- ADC: 14 bits (12 effective) @ 10 MHz
- Sequencing and control
- Block diagram: dynamic range stages, ADC stages, overload control, IO to DAQ, power supplies and power supply conditioning, pixel resetting
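The pipeline memory figure can be cross-checked with a short calculation. The sketch below also estimates how long digitising a full pipeline would take; the number of ADCs working in parallel per ASIC (`n_adcs`) is not given on the slide and is a hypothetical parameter here, as is the assumption that every stored sample is digitised.

```python
# Figures taken from the slide.
CHANNELS = 512
PIPELINE_DEPTH = 512
GAIN_STAGES = 3
ADC_RATE_HZ = 10e6
GAP_S = 99.4e-3  # gap between bunch trains

samples_per_asic = CHANNELS * PIPELINE_DEPTH * GAIN_STAGES


def readout_time_s(n_samples, n_adcs=1, adc_rate_hz=ADC_RATE_HZ):
    """Time to digitise n_samples; n_adcs is an assumed parallelism, not a design value."""
    return n_samples / (n_adcs * adc_rate_hz)


print(samples_per_asic)  # 786432
print(readout_time_s(samples_per_asic) < GAP_S)  # does a single ADC finish within the gap?
```

Under these assumptions a single 10 MHz ADC needs about 78.6 ms, which fits inside the 99.4 ms gap.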
XFEL Structure
- Electron bunch trains: up to 3000 bunches in 600 µs (200 ns spacing), repeated 10 times per second (every 100 ms)
- FEL process: each bunch produces a 100 fs X-ray pulse (up to 30 000 bunches per second)
- XFEL ~ 30 000 bunches/s, but 99.4 ms of every 100 ms (99.4 %) is empty
- During the train: data sampling to memory
- During the gap: serialise and transmit to DAQ
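A quick check of the time structure, taking the 200 ns figure as the bunch spacing (an assumption; the slide does not label it):

```python
BUNCH_SPACING_S = 200e-9      # assumed to be the spacing between bunches
MAX_BUNCHES_PER_TRAIN = 3000
TRAINS_PER_SECOND = 10

train_period_s = 1.0 / TRAINS_PER_SECOND                  # 100 ms
train_length_s = MAX_BUNCHES_PER_TRAIN * BUNCH_SPACING_S  # length of one bunch train
gap_s = train_period_s - train_length_s                   # empty time per period
empty_fraction = gap_s / train_period_s
bunches_per_second = MAX_BUNCHES_PER_TRAIN * TRAINS_PER_SECOND

print(f"train length   : {train_length_s * 1e6:.0f} us")
print(f"gap            : {gap_s * 1e3:.1f} ms")
print(f"empty fraction : {empty_fraction * 100:.1f} %")
print(f"bunches/second : {bunches_per_second}")
```

The train length comes out at 600 microseconds, which is what leaves a 99.4 ms gap in each 100 ms period.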
Front End Readout MSCs readout Cards on rear 16 Super Modules
Front End Readout 1st Stage: On Detector Head
- Module Support Cards (MSC): 2 per Super Module -> 32 MSCs total
- Aim to keep MSC simple (need a lot for large detectors)
- FPGA interface to ASICs
- ASIC read out: select appropriate 1 of 3 gains for readout, keep 12 bits
- 320 MB/sec per MSC (~2.5 Gbps)
- ASIC input: clock and controls, bunch fill pattern
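The slide does not say how the MSC picks 1 of 3 gains. One plausible rule, shown purely as an illustration, is to keep the highest-gain sample that has not saturated the 12-bit range. The function name, the ordering of the samples, and the saturation threshold are all hypothetical.

```python
ADC_MAX = 4095  # full scale of a 12-bit value


def select_gain(samples, saturation=ADC_MAX):
    """Pick one sample per pixel from three gain stages.

    samples: ADC values ordered from highest gain to lowest gain (assumed ordering).
    Returns (gain_index, value) for the first stage below the saturation level,
    falling back to the lowest gain if every stage is saturated.
    """
    for gain_index, value in enumerate(samples):
        if value < saturation:
            return gain_index, value
    return len(samples) - 1, samples[-1]


print(select_gain([1200, 150, 15]))     # small signal: highest gain is usable
print(select_gain([4095, 2100, 210]))   # highest gain saturated: fall back to the middle gain
```

Keeping the gain index alongside the 12-bit value would be needed downstream to rescale the sample; whether it travels with the data is not stated in the source.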
Front End Readout 1st Stage Baseline LVDS off Detector Head Module Support Cards (MSC) 2 per SuperModule -> 32 total Data transmission off Detector Baseline Electrical Links Bi-dir E.g. LVDS Camera Link (NatSemi Channel Link) Advantage COTS receivers for prototype tests Issues Cable length, LSZH restrictions, Common mode, shielding Detector movements (every day?)
Front End Readout 2nd Stage: Close to Detector FEM
- Flexible layer: concentrator, buffering, DMA controller
- Calibration corrections (pedestals ...)
- XFEL formatting (headers, IDs)
- Serial optical output: Gigabit Ethernet
- Traffic shaping + reject bunches
- FPGA with interfaces to the MSCs, to the DAQ and to the CTR
- Fast Timing and Controls system: receive clocks, synch, event number, bunch fill information, vetoes?
- Configuration and user controls via DOOCS
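As an illustration of the FEM tasks listed here (pedestal correction followed by formatting with a header), the sketch below builds and parses a toy fragment. The header layout, field names and 16-bit sample packing are hypothetical; the actual XFEL data format is not defined in this talk.

```python
import struct

# Hypothetical header: train number (uint32), FEM id (uint16), sample count (uint16).
HEADER_FORMAT = "<IHH"


def build_fragment(train_number, fem_id, raw_samples, pedestals):
    """Subtract per-pixel pedestals (clamped at zero) and prepend a small header."""
    corrected = [max(raw - ped, 0) for raw, ped in zip(raw_samples, pedestals)]
    header = struct.pack(HEADER_FORMAT, train_number, fem_id, len(corrected))
    payload = struct.pack(f"<{len(corrected)}H", *corrected)
    return header + payload


def parse_fragment(fragment):
    """Inverse of build_fragment, as a receiver in the DAQ might apply it."""
    header_size = struct.calcsize(HEADER_FORMAT)
    train_number, fem_id, count = struct.unpack(HEADER_FORMAT, fragment[:header_size])
    samples = list(struct.unpack(f"<{count}H", fragment[header_size:]))
    return train_number, fem_id, samples


fragment = build_fragment(7, 3, [110, 95, 4000], [100, 100, 100])
print(parse_fragment(fragment))
```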
Requirements on the Machine Info to FEMs Fast Signals: Clock(s) @ multiple of bx? psec rms Serial stream with encoded... Bunch fill serial bits transmitted well before each pulse train, FEM transmits to FE ASIC. Bunch currents FEM appends to data stream Event Vetos Interlocks, safety Slow Info Run type, modes, bunch energies CTR AMC in each FE crate? Local distribution?
Front End Readout: Interface to common DAQ
[Diagram labels: MSCs, FPGA, Interface to DAQ, EoI Straw Man Event Builder & Processor]
Front End Readout Feasibility could we use Dev boards? MSC Cards x 32 COTS FEM : Virtex 5 dev board Buffering ~GB Xilinx MPMC Traffic Shaping GbE/UDP 1 GbE x 3-4 FPGA Virtex 5 Interface to DAQ LVDS links Run Linux on PPC in V5FX? Interface to Fast Timing and Ctrls System?
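A rough feel for the traffic shaping requirement: if one FEM has to ship 32 MB per train (the figure quoted on the modularity slide) evenly over the 100 ms train period, the packet spacing and link load follow directly. The 8192-byte payload is a hypothetical jumbo-frame choice, and UDP/IP/Ethernet overheads are ignored.

```python
import math


def shape_traffic(bytes_per_train, n_links, payload_bytes=8192,
                  train_period_s=0.1, link_rate_bps=1e9):
    """Spread one train's data evenly over the train period on each link.

    Returns (packets per link, interval between packets in s, link utilisation).
    """
    bytes_per_link = math.ceil(bytes_per_train / n_links)
    packets = math.ceil(bytes_per_link / payload_bytes)
    interval_s = train_period_s / packets
    utilisation = bytes_per_link * 8 / train_period_s / link_rate_bps
    return packets, interval_s, utilisation


TRAIN_BYTES = 32 * 2**20  # 32 MB per train per FEM

for links in (3, 4):
    packets, interval_s, utilisation = shape_traffic(TRAIN_BYTES, links)
    print(f"{links} links: {packets} packets/link, "
          f"{interval_s * 1e6:.1f} us apart, {utilisation * 100:.0f} % load")
```

With these assumptions four 1 GbE links run at roughly 67 % load and three at roughly 89 %, so three links leave little headroom once protocol overhead is added.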
Modularity: 1 x MPixel
- Nr Super Modules 16
- Nr Hybrids 256
- Nr Module Support Cards 32
- Nr Front End Modules (4 links)
- Nr Links (@1 GbE) 96
- Each FEM fragment = 64 KB per bunch (32 MB per train)
- 64 KB x 512 bunches x 10 trains/s = 320 MB/s per FEM
- 1 M pixels x 512 x ~2 bytes x 10 Hz ~ 10 GBytes/sec
- 10 GB/sec per MPix => 36 TeraBytes per hour
Modularity: 16 x MPixel
- Nr Super Modules 256
- Nr Hybrids 4,096
- Nr Module Support Cards 512
- Nr Front End Modules (4 links)
- Nr Links (@1 GbE) 1,536
- Each FEM fragment = 64 KB per bunch (32 MB per train)
- 64 KB x 512 bunches x 10 trains/s = 320 MB/s per FEM
- 160 GB/sec => ~600 TeraBytes per hour
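The rates on the two modularity slides can be reproduced with a few lines. The sketch assumes 1 MPixel means 1024 x 1024 pixels and 32 readout fragments per MPixel; the latter is inferred from the 512 FEMs quoted for the 16 MPixel system rather than stated directly. Like the slides, it uses binary KB/MB/GB but a factor of 1000 for GB to TB.

```python
PIXELS_PER_MPIX = 1024 * 1024
BYTES_PER_SAMPLE = 2          # ~2 bytes per pixel sample
BUNCHES_STORED = 512          # pipeline depth
TRAINS_PER_SECOND = 10
FRAGMENTS_PER_MPIX = 32       # inferred, see lead-in


def data_rates(n_mpix):
    """Return fragment size and aggregate rates for an n_mpix detector."""
    bytes_per_bunch = n_mpix * PIXELS_PER_MPIX * BYTES_PER_SAMPLE
    fragment_bytes = bytes_per_bunch // (n_mpix * FRAGMENTS_PER_MPIX)
    bytes_per_second = bytes_per_bunch * BUNCHES_STORED * TRAINS_PER_SECOND
    gb_per_s = bytes_per_second / 2**30
    return {
        "fragment_kb": fragment_bytes // 1024,
        "fragment_mb_per_train": fragment_bytes * BUNCHES_STORED // 2**20,
        "gb_per_s": gb_per_s,
        "tb_per_hour": gb_per_s * 3600 / 1000,  # slide-style rounding
    }


for size in (1, 16):
    print(size, "MPixel:", data_rates(size))
```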
C.f. LHC Event Builder
- CMS: 512 FEMs, each bx 2 kB @ 100 kHz; throughput ~ 100 GB/sec @ full lumi
- XFEL: 512 FEMs, each train 32 MB @ 10 Hz; throughput ~ 160 GB/sec
10 GbE Front End Readout
- Build 10 Gb custom FEM: today quite challenging
- 4 x 2.5 Gbps links (XAUI)
- Xilinx V6 with 10 Gb GTP?
[Diagram labels: MSCs, FPGA, 10 GbE, 10 GbE switch, xTCA]
Alternative Solutions for 10 GbE? Industry Processor Based iWARP = Internet Wide Area RDMA Protocol RDMA over TCP/IP Storage networking, HPC Major industry involvement Objective to exploit 10 GbE @ wire speed Standard to compete with proprietary solutions e.g. Myrinet COTS Multiple vendors NICs 1GbE, 10GbE OS agnostic PCIe form factor in xTCA one day? 10Gb iWARP NIC
Alternative Solutions for 10 GbE? iWARP = Internet Wide Area RDMA Protocol More than TCP Offload Engines Remote DMAs between User Space, eliminate copies (Infiniband) Offloads TCP/IP, avoids context switches PC offload factors 90% ? FEM FPGA functions in Processor. Adds protocol layers. Effort into programming. Requires iWARP NICs at PC farm end Worth investigating Combine Camera Link Receiver with NIC in PC? Full COTS solution
EoI Straw Man Event Builder 1 MPix Line Card Receiver 10 GbE Event builder and Processor. Assuming Data Processing with Data Reduction Need access to complete frames? Data Processing engines FPGA vs FPNA vs MicroProcessor? FPt ?? Implementation AMC Mezzanines on Commercial ATCA Carrier Board Or AMCs in MicroTCA Large scale data switching problem -> ATCA Serial mesh fabric sharing between 1 MPix cards
Software, Controls and Integration Delivery and Operation of Detector in beam-line environments. System Operation Timing and Controls. Software Development and Integration with Beam-line scientists. Real Time Data Monitoring and Analysis. Assume use common XFEL controls package DOOCS
Next Steps
FPGA Solutions
- MSC FPGA: LVDS Camera Link readout off detector head
- FEM firmware development on V5 dev boards: fragment building, memory controller, traffic shaping, UDP/GbE
- Profit from DESY developments
- Keep monitoring COTS dev boards: Virtex 6 45nm? 10 Gb?
- R&D FEM in AMC: FEM prototype with 1 Gb / 10 Gb link (SFP+?), V5, cross point switch (elements of an ATCA Line Card)
Alternative Solutions
- Investigate FEM processor based solution on iWARP
- Demonstrator: 1 GbE iWARP NICs or 10 GbE NICs
- Test FEM demonstrator PC with Camera Link Frame Grabber card and iWARP NIC
Spare Slides
Some Questions
- Assume the problem is to design a readout system that transmits raw data for all possible bunches/bunch trains continuously. Which scientific experiments require this?
- Assume the 1-10 GbE layer is the DAQ interface.
- Will science be done at XFEL with our 1 MPix prototype?
- How often do Large Detectors need to move?
- Is data reduction before the PC farm possible or tolerable to end users?
- Need clever ideas from knowledgeable people for data processing algorithms.
Phase 2 Final Detector Large Pixel Detector multi M Pixel …we were planning for this at XFEL start up…
Prototypes vs Final Systems Detector builders want some (basic) readout early Final systems design with full readout in mind Conflict of interest, resources Using standards COTS ready made solutions for prototyping
Camera Link Standard Industrial video NatSemi Channel Link chipset (implement in V5) LVDS cables (CERN SLINK) Base 24 bits @ 85 MHz ~ 2 Gbps (Medium ~ 4 Gbps High ~ 6 Gbps) Cables <10 m? LSZH Shielding, common mode
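To see which Camera Link configuration would carry one MSC's output, the payload rates can be compared with the 320 MB/sec per MSC quoted earlier. Base at 24 bits per clock is from the slide; Medium is taken as 48 bits per clock, consistent with the ~4 Gbps quoted. Framing and control overhead are ignored, and the helper name is hypothetical.

```python
PIXEL_CLOCK_HZ = 85e6
BITS_PER_CLOCK = {"Base": 24, "Medium": 48}  # data bits per clock, smallest first

MSC_RATE_GBPS = 320e6 * 8 / 1e9  # 320 MB/sec per MSC, ~2.5 Gbps


def link_rate_gbps(configuration):
    """Raw payload rate of a Camera Link configuration in Gbps."""
    return BITS_PER_CLOCK[configuration] * PIXEL_CLOCK_HZ / 1e9


def smallest_configuration(required_gbps):
    """Return the first configuration whose payload rate covers required_gbps, else None."""
    for name in BITS_PER_CLOCK:
        if link_rate_gbps(name) >= required_gbps:
            return name
    return None


print(link_rate_gbps("Base"), link_rate_gbps("Medium"))
print(smallest_configuration(MSC_RATE_GBPS))
```

On these numbers a single Base link falls short of one MSC's rate, so a Medium link (or two Base links) would be the minimum per MSC.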
Power and Cooling
- Detector: 1-2 mW per pixel (power management on ASIC?)
- Water cooling
- LV: 0.13 micron ASICs @ 1 V
- HV: 200 V bias
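The per-pixel figure translates into a total heat load as follows; this is a back-of-envelope sketch that assumes 1024 x 1024 pixels per MPixel and counts only the quoted per-pixel power, not readout cards or regulators.

```python
PIXELS_PER_MPIX = 1024 * 1024


def detector_power_kw(n_mpix, mw_per_pixel):
    """Total pixel power in kW for an n_mpix detector at mw_per_pixel milliwatts each."""
    return n_mpix * PIXELS_PER_MPIX * mw_per_pixel * 1e-3 / 1e3


for n_mpix in (1, 16):
    low = detector_power_kw(n_mpix, 1.0)
    high = detector_power_kw(n_mpix, 2.0)
    print(f"{n_mpix} MPixel: {low:.1f} to {high:.1f} kW")
```

That is about 1 to 2 kW for the 1 MPixel prototype and 17 to 34 kW for a 16 MPixel system, which is the scale that motivates water cooling.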