First Performance Results of the ALICE TPC RCU2 Chengxin Zhao TWEPP 2015 - 30 September - Lisbon, Portugal 1 Chengxin Zhao University of Oslo, Norway On.

Slides:



Advertisements
Similar presentations
TPC / PHOS / HLT Readout Electronics overview Annual Evaluation Meeting for CERN-related Research in Norway November, 2004 University of Oslo Kjetil.
Advertisements

Scrubbing Approaches for Kintex-7 FPGAs
ICAP CONTROLLER FOR HIGH-RELIABLE INTERNAL SCRUBBING Quinn Martin Steven Fingulin.
Normal text - click to edit Status Report TPC Electronics Meeting, CERN Johan Alme & Ketil Røed, UoB.
A Gigabit Ethernet Link Source Card Robert E. Blair, John W. Dawson, Gary Drake, David J. Francis*, William N. Haberichter, James L. Schlereth Argonne.
HEP2005, Lisboa July 05 Roberto Campagnolo - CERN 1 HEP2005 International Europhysics Conference on High Energy Physics ( Lisboa-Portugal,
Alice EMCAL Meeting, July 2nd EMCAL global trigger status: STU design progress Olivier BOURRION LPSC, Grenoble.
A.R. Hertneky J.W. O’Brien J.T. Shin C.S. Wessels Laser Controller One (LC1)
The RCU2 ALICE TPC readout electronics consolidation for Run 2 Johan Alme Bergen University College, Norway on behalf of the ALICE-TPC collaboration TWEPP.
UNIVERSITY OF BERGEN DEPARTMENT OF PHYSICS 1 UiB DR 2003 High Level API for the TPC-FEE control and configuration.
Normal text - click to edit RCU – DCS system in ALICE RCU design, prototyping and test results (TPC & PHOS) Johan Alme.
FPGA IRRADIATION and TESTING PLANS (Update) Ray Mountain, Marina Artuso, Bin Gui Syracuse University OUTLINE: 1.Core 2.Peripheral 3.Testing Procedures.
DLS Digital Controller Tony Dobbing Head of Power Supplies Group.
GBT Interface Card for a Linux Computer Carson Teale 1.
LECC2003 AmsterdamMatthias Müller A RobIn Prototype for a PCI-Bus based Atlas Readout-System B. Gorini, M. Joos, J. Petersen (CERN, Geneva) A. Kugel, R.
U N C L A S S I F I E D FVTX Detector Readout Concept S. Butsyk For LANL P-25 group.
Experience from using SRAM based FPGAs in the ALICE TPC Detector and Future Plans Johan Alme – for the ALICE TPC Collaboration FPGA.
Front-end Electronics for the Alice Detector Kjetil Ullaland Department of Physics and Technology, University of Bergen, Norway NFR meeting, University.
SALTRO TPC readout system Presented by Ulf Mjörnmark Lund University 1.
Evaluation of the Optical Link Card for the Phase II Upgrade of TileCal Detector F. Carrió 1, V. Castillo 2, A. Ferrer 2, V. González 1, E. Higón 2, C.
2/2/2009 Marina Artuso LHCb Electronics Upgrade Meeting1 Front-end FPGAs in the LHCb upgrade The issues What is known Work plan.
RCU Status 1.RCU design 2.RCU prototypes 3.RCU-SIU-RORC integration 4.RCU system for TPC test 2002 HiB, UiB, UiO.
Bernardo Mota (CERN PH/ED) 17/05/04ALICE TPC Meeting Progress on the RCU Prototyping Bernardo Mota CERN PH/ED Overview Architecture Trigger and Clock Distribution.
The ALICE Forward Multiplicity Detector Kristján Gulbrandsen Niels Bohr Institute for the ALICE Collaboration.
Upgrade of the CSC Endcap Muon Port Card Mikhail Matveev Rice University 1 November 2011.
Rome 4 Sep 04. Status of the Readout Electronics for the HMPID ALICE Jose C. DA SILVA ALICE.
Status report on the development of a readout system based on the SALTRO-16 chip Leif Jönsson Lund University LCTPC Collaboration Meeting
Features of the new Alibava firmware: 1. Universal for laboratory use (readout of stand-alone detector via USB interface) and for the telescope readout.
1 Dong Wang, Yaping Wang, Changzhou Xiang, Zhongbao Yin, Fan Zhang, Daicui Zhou (Huazhong Normal University, China) Status and planning on common readout.
ATtiny23131 A SEMINAR ON AVR MICROCONTROLLER ATtiny2313.
Lecture 12: Reconfigurable Systems II October 20, 2004 ECE 697F Reconfigurable Computing Lecture 12 Reconfigurable Systems II: Exploring Programmable Systems.
1 07/10/07 Forward Vertex Detector Technical Design – Electronics DAQ Readout electronics split into two parts – Near the detector (ROC) – Compresses and.
US Status of GbE Peripheral Crate Controller Ben Bylsma EMU meeting Fermilab, October 21, 2005 Section 1: Hardware Section 2: Firmware Development.
June 17th, 2002Gustaaf Brooijmans - All Experimenter's Meeting 1 DØ DAQ Status June 17th, 2002 S. Snyder (BNL), D. Chapin, M. Clements, D. Cutts, S. Mattingly.
KLM Trigger Status Barrel KLM RPC Front-End Brandon Kunkler, Gerard Visser Belle II Trigger and Data Acquistion Workshop January 17, 2012.
CERN, 18 december 2003Coincidence Matrix ASIC PRR Coincidence ASIC modifications E.Petrolo, R.Vari, S.Veneziano INFN-Rome.
DAQMB Status – Onward to Production! S. Durkin, J. Gu, B. Bylsma, J. Gilmore,T.Y. Ling DAQ Motherboard (DMB) Initiates FE digitization and readout Receives.
The Past... DDL in ALICE DAQ The DDL project ( )  Collaboration of CERN, Wigner RCP, and Cerntech Ltd.  The major Hungarian engineering contribution.
FPGAs in ATLAS Front-End Electronics Henrik Åkerstedt, Steffen Muschter and Christian Bohm Stockholm University.
ATLAS TDAQ RoI Builder and the Level 2 Supervisor system R. E. Blair, J. Dawson, G. Drake, W. Haberichter, J. Schlereth, M. Abolins, Y. Ermoline, B. G.
DDRIII BASED GENERAL PURPOSE FIFO ON VIRTEX-6 FPGA ML605 BOARD PART B PRESENTATION STUDENTS: OLEG KORENEV EUGENE REZNIK SUPERVISOR: ROLF HILGENDORF 1 Semester:
D. Attié, P. Baron, D. Calvet, P. Colas, C. Coquelet, E. Delagnes, R. Joannes, A. Le Coguie, S. Lhenoret, I. Mandjavidze, M. Riallot, E. Zonca TPC Electronics:
Status of Integration of Busy Box and D-RORC Csaba Soós ALICE week 3 July 2007.
Status report on the development of a TPC readout system based on the SALTRO-16 chip and some thoughts about the next step Leif Jönsson AIDA Meeting Vienna.
TPC CRU Jorge Mercado (Heidelberg) Ken Oyama (Nagasaki IAS) CRU Team Meeting, Jan. 26, 2016.
Readout Control Unit of the Time Projection Chamber in ALICE Presented by Jørgen Lien, Høgskolen i Bergen / Universitetet i Bergen / CERN Authors: Håvard.
17/02/06H-RORCKIP HeidelbergTorsten Alt The new H-RORC H-RORC.
Rutherford Appleton Laboratory September 1999Fifth Workshop on Electronics for LHC Presented by S. Quinton.
Readout controller Block Diagram S. Hansen - CD-1 Lehman Review1 VXO Ø Det Links to 24 SiPM Front End Boards Clock Event Data USB ARM uC A D Rd Wrt 100Mbit.
August 24, 2011IDAP Kick-off meeting - TileCal ATLAS TileCal Upgrade LHC and ATLAS current status LHC designed for cm -2 s 7+7 TeV Limited to.
Ketil Røed - LECC2005 Heidelberg Irradiation tests of the ALICE TPC Front-End Electronics chain Ketil Røed Faculty of Engineering, Bergen University.
The ALICE TPC Readout Control Unit 10th Workshop on Electronics for LHC and future Experiments 13 – 17 September 2004, BOSTON, USA Carmen González Gutierrez.
Radiation Tolerance Studies using Fault Injection on the Readout Control FPGA Design of the ALICE TPC Detector Johan Alme Bergen University College, Norway.
Sampa testing Arild Velure. FPGA testboard design SoC-Kit Board HPS Memory Command and Control Module Clock Manager Uart to bus MUX Data manager.
PC-based L0TP Status Report “on behalf of the Ferrara L0TP Group” Ilaria Neri University of Ferrara and INFN - Italy Ferrara, September 02, 2014.
The Data Handling Hybrid Igor Konorov TUM Physics Department E18.
DHH at DESY Test Beam 2016 Igor Konorov TUM Physics Department E18 19-th DEPFET workshop May Kloster Seeon Overview: DHH system overview DHE/DHC.
29/05/09A. Salamon – TDAQ WG - CERN1 LKr calorimeter L0 trigger V. Bonaiuto, L. Cesaroni, A. Fucci, A. Salamon, G. Salina, F. Sargeni.
The ALICE Data-Acquisition Read-out Receiver Card C. Soós et al. (for the ALICE collaboration) LECC September 2004, Boston.
Use of FPGA for dataflow Filippo Costa ALICE O2 CERN
Irradiation test results for SAMPA MPW1 and plans for MPW2 irradiation tests Sohail Musa Mahmood
Update on CSC Endcap Muon Port Card
CoBo - Different Boundaries & Different Options of
Status of the Front-End Electronics and DCS for PHOS and TPC
Radiation Tolerance of an Used in a Large Tracking Detector
A microTCA Based DAQ System for the CMS GEM Upgrade
PCI BASED READ-OUT RECEIVER CARD IN THE ALICE DAQ SYSTEM
Torsten Alt, Kjetil Ullaland, Matthias Richter, Ketil Røed, Johan Alme
Using FPGAs with Processors in YOUR Designs
Multi Chip Module (MCM) The ALICE Silicon Pixel Detector (SPD)
Presentation transcript:

First Performance Results of the ALICE TPC RCU2 Chengxin Zhao TWEPP September - Lisbon, Portugal 1 Chengxin Zhao University of Oslo, Norway On behalf of the ALICE TPC collaboration

The ALICE TPC Readout electronics Chengxin Zhao TWEPP September - Lisbon, Portugal 2 ALICE TPC Readout electronics – detector pads divided between the two end-plates. – 4356 Front-end cards (FEC) – 216 Readout Control Units (RCU) – Each RCU connects to 18 to 25 FECs In LHC RUN2 – Pb-Pb collision rate is increased from 3kHz to 20kHz  Higher Readout Speed needed! – Expected radiation flux increased from 0.8kHz/cm^2 to 3kHz/cm^2  Improved Radiation tolerance required!

RCU1 to RCU2 Two branches  Four branches Detector Data Link (DDL) with higher bandwidth: –  3 PCB boards  1 PCB board Four FPGAs -> One FPGA: – Two Flash based FPGAs and two SRAM based FPGAs in RCU1 – Microsemi Smartfusion2 Flash based FPGA in RCU2 Detector pad row based readout scheme – Utilize the parallelism of RCU2 hardware structure 3

System overview of RCU2 Chengxin Zhao TWEPP September - Lisbon, Portugal 4 S Monitoring & Safety SF2 Microcontroller Subsystem DDL2 Trigger receiver Data Assembler Data conditioner FIFO ABI Readout control Data Packager Readout control Readout control Readout control RCU bus master Readout Module Ethernet Readout System Trigger Handler Event Buffer Event Manager ALTRO Interface Module FECs Branch AO FECs Branch AI FECs Branch BI FECs Branch BO ALICE DAQ TTCrx Triggers Peripherals (RS232, DDR3,etc) Control & Monitor System ALICE DCS FECs X4 branch Internal Logic Analyzer

System Irradiation Test System Irradiation TSL Uppsala – 170 MeV Protons – 28 – 29 April 2015 – Irradiation test of the RCU2 with all surrounding systems (Trigger, DCS & DAQ) in place. The focus in this presentation are: – Readout stability – Linux system Stablity – Trigger Interface stability – Ethernet stability The test results are presented as follows: – Cross Section for the different tests have been calculated – Mean Time Between Failure (MTBF) for LHC RUN2 has been calculated, assuming (Worst Case): >20MeV Hadrons: 3.0 kHz/cm 2 * 216 RCUs 4356 FECs Chengxin Zhao TWEPP September - Lisbon, Portugal5 * Preliminary number for Run 2 inner partitions.

Irradiation Test Setup I Chengxin Zhao TWEPP September - Lisbon, Portugal 6 Trigger Crate PC with Uart Data Computer with CRORC PC2 (1) Monitor the status of data taking. (2) Configure the RCU2 and FECs (3) Control Trigger Crate PC1 (1) Linux status monitor (2) MSS status monitor PC3 Monitor SF2 core current FEC4 -AIFEC7-AOFEC4 -BIFEC7 -BO RCU2 DDL2 L0, L1 trigger & clock Serial communication Ethernet 10Mb/s SF2 starterkit I/V monitor LAN In radiation Shielded Control Room

Irradiation Test Setup II Chengxin Zhao TWEPP September - Lisbon, Portugal7

Readout Stability - Observations PLL lose lockFEC error RCU2 irradiated FEC error SF2 irradiated Data transmission error RCU2 irradiated Cross-section8.8E-11 ±38% (7 errors) 3.9E-11±71% (2 error) 3.6E-11±100% (1 error) 2.0E-11±100% (1 error) Mean Time Between Failure in RUN2 4.9 ±1.8 hours0.5 ±0.4 hours0.6 ±0.6 hours21.4 hours±21.4 hours Irradiated the whole RCU2 and only SF2 (using collimator) – FECs are always irradiated. Note: collimator shields also the FECs Data taking was monitored with trigger – Readout stopped several times. Three categories of errors lead to the stops of data taking: (1) Reset due to PLL lose lock (2) SEUs induced error on FECs. (3) Data transimission error * No scenario that can be interpreted as a FPGA fabric error has been seen. * Data transimission error was observed only when complete RCU2 is irradiated Chengxin Zhao TWEPP September - Lisbon, Portugal 8

Mitigation – Reset Strategy Chengxin Zhao TWEPP September - Lisbon, Portugal 9 In SF2, PLL lock has three options: (1)PLL hold reset before get lock (2)PLL output before lock and do NOT synchronize after getting lock (3)PLL output before lock and synchronize after getting lock PLL clock is not realiable when it loses lock -> Minimize the use of PLL. -Can not fully avoid the use of PLLs Reset strategy has been redesigned following irradiation test -Reset no longer depends on PLL lock signal after power on.

Mitigation - FEC, DDL & Fabric SEUs induced error on FECs -> The following error handling are implemented: ⁻Front-end bus on the FECs are continuously monitored. ⁻Communication between the RCU2 and FECs are monitored. ⁻Trailer word of each data package, which contains the infomation like channel address, length of data, etc, is verified. ⁻Purpose: Detect (& possibly correct) error situations early Chengxin Zhao TWEPP September - Lisbon, Portugal 10 Data Transmission error ⁻ALICE DAQ enters in Pause and Recover (PAR) state if there is some data transmission error so that the physics run does not need to stop. ⁻The PAR scheme is to be supported by the RCU2. ⁻PAR is also a solution to FEC induced erros. Fabric errors ⁻Critical locations in the design will be considered for TMR or hamming protection

Linux Stability - Observations Irradiated RCU2 and SF2 only -DDR3 memory protected by SECDED, which could enabled/disabled manually Two kind of errors observed: -Linux rebooting -Linux frozen Estimantion of Mean Time Between Failure (MTBF) in RUN2 is based on the WORST CASE environment SECDED on MDDR will most likely not help -Awaiting response from Microsemi DDR3 RAM 512 MB DDR-bridge DDR3 RAM 512 MB ARM-Cortex M3 Micro-controller Subsystem MSS DDR controller Smartfusion2 User IO DDR SECDED protected on DDR Instruction cache 11 RCU2 irradiated MDDR SECDED OFF ONLY SF2 irradiated MDDR SECDED OFF ONLY SF2 irradiated MDDR SECDEC ON Cross-section (reboot)5.04E-10±22% (20 errors)2.57E-10±38% (7 errors)2.48E-10±50% (4 errors) MTBF in RUN2 (reboot)0.9 ± 0.2 hours1.7 ± 0.6 hours1.7 ± 0.9 hours Cross-section (frozen)1.26E-10 ±45% (5 errors)1.10E-10±58% (3 errors)0.5E-10 ±100% (1 error) MTBF in RUN2 (frozen)3.4 ± 1.5 hours3.9 ± 2.3 hours8.6 ± 8.6 hours

Linux Stability - Mitigation Possible reasons of Linux reboots and Linux freeze: -Single-event upsets (SEUs) and multi-bit upsets (MBUs) in the DDR memories which leads to Linux Kernel panic -SEUs (and MBUs) in the ARM processor Actions taken to improve RCU2 stability: -Stand-alone module for DDL2 SERDES initalizing. -Default is initializion by MSS on System boot-up -Configuration of the FECs/ALTROs via the DDL2 -Ongoing: (1)Exploring Linux replacement -freeRTOS that ONLY resides on the internal eSRAM in the SF2 (2)Exploring the proper use of SECDED on MDDR Chengxin Zhao TWEPP September - Lisbon, Portugal 12

Some Other Observations Trigger Reception (TTCrx) is stable: no error was seen Monitoring and Safety Module is stable: no error seen on RCU2 side Ethernet is stable -Two errors observed  cross-section = 2.52E-11 ±71%. -Mean Time Between Failure (MTBF) in RUN2 is 17.0±12.1 hours Chengxin Zhao TWEPP September - Lisbon, Portugal13

Readout Speed 14 One full readout partition with 25 FECs used for test DDL – DDL link saturated from number of samples set to be ~50 DDL – Readout  Speed improved with a factor of ~2 – Readout  Speed wil be improved with a factor of ~ 2.6

RCU2 Commissioning Chengxin Zhao TWEPP September - Lisbon, Portugal RCU2 boards have been produced 6 boards have been installed and commissioned since Jan Trigger reception is stable Readout with is stable – No corrupt data observed. – No readout stop has been observed. No linux reboot/frozen has been observed. – ~10 reboots were observed on 210 RCU1s – Statistics too low to draw any conclusion Monitoring & Safety Module is stable Ethernet is working stable ISP programming works – Stops prematurely % of the times – Retry always works

RCU2 FPGA Design Ongoing Readout system plans: -Integration and verification of -Increase the sys clock frequency from 80MHz to 100MHz -External Oscillator is 100 MHz – avoid use of PLLs. -Implement data sorting algorithm (for High Level Trigger) -Implement handling of triggers for Multi-Event-Buffering. Test & benchmark: -Additional benchmark counters and pattern generators at various points in the firmware. Chengxin Zhao TWEPP September - Lisbon, Portugal16

Conclusions RCU2 System irradiation campaigns have been performed – It revealed some stability issues, especially regarding Linux and Readout – All the radiation related problems have so far been solved or planned with mitigation actions. With the and System the readout speed will be improved with a factor of ~2.6 compared to the RCU1. The RCU2 with has been verified to be working stable at TPC. – integration and verificaion of is onging FPGA Fabric Design is in finalizing phase Installation is scheduled for the Winter Break (Dec 2015 – March 2016) 17 Chengxin Zhao TWEPP September - Lisbon, Portugal

Thanks for Listening RCU2 team (no particular order) : Chengxin Zhao ) – University of Oslo, Johan Alme – Bergen University College, Norway Harald Appelshäuser, Torsten Alt, Stefan Kirsch – Goethe University Frankfurt, Germany Lars Bratrud, Jørgen Lien, Rune Langøy – Vestfold University College, Norway Fillipo Costa – CERN, Switzerland Taku Gunji, Yuko Sekiguchi – University of Tokyo, Japan Tivadar Kiss, Ernö David – Cerntech, Budapest, Hungary Christian Lippmann– GSI Darmstadt, Germany Anders Oskarsson, Lennart Osterman – University of Lund, Sweden Ketil Røed – University of Oslo, Norway Attiq Ur Rehman, Kjetil Ullaland, Dieter Röhrich, Shiming Yang, Arild Velure – University of Bergen, Norway Meghan Stuart, Andrew Castro – University of Tennesse 18Chengxin Zhao TWEPP September - Lisbon, Portugal

Backup Slides Chengxin Zhao TWEPP September - Lisbon, Portugal 19

Status of RCU2 20 Chengxin Zhao TWEPP September - Lisbon, Portugal 240 RCU2 boards have been produced – 216 boards for installation – 24 for testing and backup 6 boards have been installed and commissioned at one out of the 36 TPC Sectors since Jan System irradiation campaign – April 2015 at TSL laboratory Linux and Firmware are in the finalizing phase Remotely monitoring and upgrading of the RCU2 in place Installation planned in LHC winter break – Between Dec and March 2016

SF2 eSRAM Tests Smartfusion2 MSS has two eSRAM memories – size of each of is 256 Kbits – SECDED can be enabled/disabled manually Test were performed on eSRAM_1: – Write a pattern to all eSRAM_1 locations. – SECDED mechanism stores the number of 1- bit and 2-bit errors into the error counter registers. – Read the register of error counter every second. eSRAM SEU test from TSL Chengxin Zhao TWEPP September - Lisbon, Portugal SF2 eSRAM (single eSRAM per RCU2) Cross section [cm 2 /bit]2.70E-14 ±5% (352 errors) Mean Time Between SEUs in RUN2219±11 seconds 21